Genetic Genealogy - DNA Relative Matches

I have new DNA cousin "matches".  This is a very important avenue of DNA testing for genealogy and ancestry that I have simply missed until recently.  Up to now, I've concentrated on DNA testing for general ancestry (or ethnicity as some businesses will call it).  The problem was that I first tested with 23andMe, and with their heavily USA-based customer base and user-unfriendly "experiences", I simply couldn't find any DNA relatives that actually had paper trails that could correlate to my own.

One of the problems, I feel, is that an awful lot of Eastern English migration to the Atlantic Coast of North America occurred very early - late 16th to early 18th centuries AD.  As a result, although some generous matching systems (such as 23andMe's) suggest much more recent shared ancestry, in reality our links to our distant USA cousins are so old that all they do is reflect that my distant cousins have Puritan, New England, and Virginian ancestry from Eastern England.  Even for those that do claim to trace ancestry to those pilgrim fathers - I can't.  Certainly not for the thousands of my direct ancestors for Generations 11 - 14.  I don't think any of us can.  Chuck in a bit of genetic folding, and all that these distant relationships are really telling us is that we both have some ancestry from south east England between 300 and 600 years ago.

Then I tested with Ancestry.com, Ancestry.co.uk, AncestryDNA or whatever you want to call that genealogy mega-business.  Their matching system is dumbed down to the frustrating level.  No chromosome locations or chromosome browsers for painting.  Instead however, they have the fattest database of testers and customers - some of whom will, like myself, be subscription slaves to their family tree and documentary genealogical services.  Their matching systems may cut out chromosome data - but on the flip side, you can browse the trees, surnames, and ancestral locations of your DNA matches.  As a consequence, I've found 14 matches that share DNA, with predicted relationships that correlate to a paper trail relationship.

In addition I am now scouring GEDmatch, 23andme, and FT-DNA Family Finder for more relative DNA matches.  I'm recording everything (including chromosome locations when available) onto a spreadsheet.  The image at the top of this page demonstrates my DNA matches where they share ancestry so far.  The darker the shade, the stronger the verification.

I'm starting to see how this is a better tool for understanding, or verifying ancestry, than any stupid ethnicity / ancestry composition by DNA.  Family isn't always biological.  However, finding a genetic correlation is the ultimate evidence to strengthen a tree.  It's fascinating to see actual paper research turning up as segments of inherited DNA on matches.

Building bridges and walls through ancestry

Copied from openstreetmap.org and modified under the Open Data Commons Open Database License.  

Bridges and Walls, Snakes and Ladders

I've noticed two perspectives within the broad scope of genealogy where it ties to population genetics.

  • Some people, those with nationalistic, right wing political views, frequently look for what divides their ancestry from others.  What defines and ties them to a historical population, or even to a land.  They may well want to prove connection to a romanticised historical group within their part of the world.
  • Others - those of a more international, liberal persuasion, instead tend to look for what unites them with other peoples alive today - what connects them within the community of humanity.

I have to confess to being more of the latter.

On Paper

I started out with a pretty well researched paper genealogical record.  A family tree.  A family history.  Researched through oral history, interviews, parish records, state records, and then on to digitalised records in more recent years.  A genealogical database of 1,570 individuals for my kids, and 207 direct ancestors recorded for myself - going back to the 1680s.  My recorded ancestry was 100% English - dominated by the County of Norfolk.  The majority of present day English perhaps have some non-English ancestry, perhaps Irish or Scottish, or something a little further afield.  I didn't find any.  All English surnames, and English denominations.  Some of those surnames however, did echo rather more ancient immigration from across the North Sea.

Autosomal DNA Testing

Autosomal DNA testing for ancestry provided a bit of a surprise.  I took a 23andMe DNA test, along with my mother, whose results I phased with to provide more accuracy.  The 23andMe Ancestry Composition analysis in standard mode didn't simply see me as English, or even as British.  It did see me pretty much as 100% European.  Not a hint of Africa nor Asia within the past several hundred years.  It saw 86% of my autosomal DNA as definitively North-West European.  However, it could only see a mere 17% as distinctly British & Irish.  So, the ancestry test of my autosomal DNA certainly agreed that I was European, NW European even, but couldn't be sure how English or even British I was.

23andMe Ancestry Composition in the very unreliable speculative mode rated my British/Irishness at only 37%.  The highest percentage of focus - but it saw 22% of my autosomal DNA ancestry as French / German, 1% as Scandinavian, and 2% as South European.  So considering my 100% English ancestry on paper, autosomal DNA testing couldn't really be very sure about my ancestry.  Even in speculative mode, it had 34% of my DNA as "Broadly NW European", meaning that it couldn't be sure, but somewhere in that corner of that continent.

Fair enough I suppose.  I've lost a certain amount of faith in the ability of any autosomal DNA test for ancestry to pinpoint the English.  You see, even ignoring recent waves of immigration of Irish, Scottish, French, Germans, West Indians, South Asians, etc, the truth is that the English were already a very admixed population even 1,500 years before present.  Already a mixture of prehistoric populations, immigrants from across the Roman Empire, then from across the North Sea, from the Low Countries, Northern Germany, Denmark, Scandinavia, etc.  23andMe claim that their product reflects your ancestry 500 years ago.  No it does not.  It uses modern reference populations.  Genes have been circulating around the World for a long time.  Autosomal DNA tests for ancestry have really improved.  They are pretty good now for recognising a Continent - sometimes even a corner of a continent, as the source of some ancestry.  But they cannot pinpoint many populations with accuracy, and they cannot pinpoint the English.

So, my paper record said English.  My 23andMe autosomal DNA test said North-West European, but couldn't even pinpoint British.  It suggested admixture.  It did however - this is important - only see me as European.  Okay, in Standard Mode, it did have a tiny 0.3% that it failed to assign to Europe, or to anywhere else.  It did not see Asian.

Haplogroup DNA Testing

Haplogroups follow two narrow lines of ancestry.  The Y follows the direct paternal line, the MT follows the direct maternal.  They do not represent the bulk of your ancestry.  However, they can tell a more accurate, and longer term story.  Ancestry can be lost in Autosomal DNA within a few centuries.  In addition, it gets messed up through recombination.  Not so with the two haplogroups.  So where did mine come from?

My MT-DNA

There is an awful lot that we will know in future about our haplogroups, that we don't yet know - especially in the case of mt-DNA. However, we do know that my haplogroup, H6a1, did not originate in Europe.

H is common in Europe, and it most likely originated either there, or in South West Asia, during the Upper Palaeolithic. H6 did not originate in Europe.  It may be West or Central Asian in origin.  H6a1 has not been recovered in any ancient DNA within Western Europe.  However, it has been recovered in the DNA of the Yamnaya on the Eurasian Steppes.  For this reason, it is generally thought, based on evidence so far, to have been brought into Western Europe during the Early Bronze Age, by the expansion from the Eurasian Steppes at that time.

It isn't too fanciful, based on this evidence, to imagine that my distant grandmothers belonged to tribes of Early Bronze Age pastoralists, living on the Steppes of what is now Ukraine.

My Y-DNA

This one has been a cracker for me.  Anyone that has followed my blog might be getting bored with this.  I've thoroughly tested my Y-DNA.  It's not an exaggeration to suggest that it is quite likely Ancient Persian.  Based on current evidence, I believe that my Y-DNA arrived into England within the last millennium - probably between 350 and 800 years ago.  I'm still working on its most likely route here.  I do believe that it was most likely still located in the region of Iran circa 1,000 to 2,000 years ago.  My nearest 111 STR match is to a guy in Australia whose paternal line lived in Birjand, Eastern Iran.  We shared a common ancestor around 2,000 years ago.  My terminal SNP is shared on record with only one other man so far - in the world.  He is a Balochi speaker who lives in Makran, SW Pakistan - close to the border with Iran.  The Balochi are believed to have migrated from North Iran between the 5th and 14th centuries AD.

Nomad camp, at the Zagros Mountains, Iran.  By C Whitely on Flickr under Creative Commons License.

A bit more distant, I have a Y cousin in the USA that maybe I shared a common ancestor with 3,000 years ago.  He is of Azores Portuguese descent on his Y line, but he carries a distinct STR marker that has been associated with the Parsi, who migrated to India and Pakistan, but originated in Iran.

And going further back, the Y haplogroup L most likely originated within the area of Iran and Iraq, during the Ice Age.  It would have been carried by Upper Palaeolithic hunter-gatherers in that region.  13,000 years ago, I shared grandfathers with two Pontic Greek Y cousins, whose ancestors lived in Trabzon, Eastern Anatolia.  Maybe one Y ancestral son headed to the Black Sea, the other settled at the Caspian Sea?  The Ice Age was drawing to a close, but with a ferocity and climate instability that drove bands of people apart and into refuges at that time.

The Parsi connection keeps hinting.  They descend from Persians that followed the ancient religion of Zoroastrianism.  I've just seen a Y haplogroup study of men in Pakistan.  The background level of Y haplogroup L-M317 sat at 1.1%.  However, in the sample of Parsi men there, it spikes up to 13.3%.  That might not be the route, however, of my Y line.  The SK1414 SNP turned up in that same study, but it was found on the Makrani Baloch man that tested positive for L-M317, not in the 12 Parsi men that also tested positive for L-M317.

Conclusion

I prefer bridges to walls, and that is what I got.  My paper ancestry said 100% English - much of it East Anglian.  I'm quite proud of that, but I'm equally proud of my more distant ancestors that immigrated here.  I've found North Sea admixture, from places such as the Netherlands and southern Scandinavia.  I've found a grandmother in a Bronze Age tribe of pastoralists in Ukraine.  I've found ancient Persians, descending from hunters of Ibex in the Iran / Iraq region.  I've found distant cousins in the USA, Iran, Pakistan, Australia, and Turkey.

One species, one family.

Ancestry and DNA Tests

I'm writing this post in response to a number of comments that I see online with regards to using a commercial DNA test, in order to ascertain ancestry.  Quite often, when someone asks how to find out their family history or ancestry, someone will come back with an answer in the form of "just spit in a vial, send it to Ancestry.com, and they'll tell you".  It's not really that simple, so I'm making this post, to explain how an ancestry DNA test can help, or not help, you discover your ancestry.  Nicely dumbed down I hope, for the beginner.

Traditional Genealogy

Traditional genealogists usually set out to create a genealogy (family history and tree), using interview techniques, artefacts, and oral memories, recorded from older relatives.  Artefacts might for example, include old family medals, or photographs.  They then extend the research through documentary evidence, such as birth, death, and marriage certificates, church registers, census records, transcripts, electoral rolls, and military records. If they are interested in recording all ancestral information, and not merely a single line such as the surname line, then this research can go on for months, years, even decades.

What you cannot do, is to simply pay a small fee, and your entire family history drops through the letter box in a brown envelope.  It takes years of time to research, collate, and to verify a good family tree.  Most genealogy enthusiasts don't mind this, because they actually enjoy doing the research itself.  It becomes a hobby, even sometimes a passion.

However, a number of commercial DNA companies may give the general public the impression, that you now can simply pay a fee, spit or swab, and your ancestry magically appears for you on a website.  It's big business.  Does it work though?  Exactly what is genetic genealogy?

What is Ancestry and why do we care?

Ancestry can simply be defined as our descent from forebears.  Why do we care who they were? Which forebears or ancestors?  How many are there?  How far back?

Of course, not everyone does care.  Not everyone cares about history.  But for others, how we define ourselves, our communities, and families does matter.  It tells us who we are, where we came from.  It defines us, gives us grounding.  It gives us identity.  Wars have often been inspired by ancestry.  At the same time, a deeper appreciation of the human family, and its common ancestry, can be used to relate to those elsewhere.  One big family.  Discovering the immense poverty and hardships of our ancestors can help us to appreciate what we have, and to help others in need today.

So what ancestry can we discover?  For those few that merely concentrate on one patriarchal line, it's quite simple to define - the generations of a surname.  However, beyond that one narrow line of descent, few appreciate exactly how much total ancestry we have.  Let's look at our biological ancestors at each generation:

  • 2 parents
  • 4 grandparents
  • 8 great grandparents
  • 16 g.g grandparents
  • 32 g.g.g grandparents
  • 64 g.g.g.g grandparents
  • 128 g.g.g.g.g grandparents
  • 256 g.g.g.g.g.g grandparents.

These are only your 510 most recent direct ancestors, yet just those generations will take you back only around 250 years of family history.  Now add all of the recorded children of these direct ancestors - the great great uncles and aunts - to the theoretical family tree.  You're probably going to have a tree of around 1,300 individuals.  That is just for 250 years.  You have a big family.  Go back a few more generations, and it will explode before you reach far.  All of those direct ancestors though, are a part of your ancestry.  You'll most likely carry some DNA from most of them.  They are, from a biological perspective, who you are.
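
For anyone who wants to check the doubling for themselves, here is a minimal sketch in Python.  The function name is my own invention, and it assumes that no two ancestors were related to each other, so it gives the theoretical maximum rather than a real head count.

```python
def ancestors_per_generation(generations):
    """Return the theoretical number of direct ancestors at each generation back.

    Generation 1 is parents, 2 is grandparents, and so on.
    Assumption: no cousin marriages, so every ancestor is a distinct person.
    """
    return [2 ** n for n in range(1, generations + 1)]


counts = ancestors_per_generation(8)
total = sum(counts)

for generation, count in enumerate(counts, start=1):
    print(f"Generation {generation}: {count} ancestors")
print(f"Total direct ancestors: {total}")
```

Change the 8 to a larger number to see how quickly the theoretical total runs away.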

By the way, the number of biological ancestors will not continue to increase infinitely, because increasingly you will find couples within your tree that are distant biological cousins of each other.  As this accelerates through thousands of years, it explains how all modern people around the world descend from a very small population around 100,000 years ago.

So before considering what DNA can do for genealogy, we need to consider which ancestors matter to us.  Do you just want to know who your biological parents, or grandparents were?  Do you want to know the names, places and social positions of your ancestors over centuries?  Do you want to know which parts of the world your ancestors lived in 500 years ago?  Do you want to know how some of your prehistoric ancestors moved across the globe, thousands of years ago?  Maybe you want to know everything.

Let's now turn to genetics for genealogy, and how DNA tests can answer some of these questions.

There are two main types of DNA tests for ancestry, although they are often incorporated together by commercial companies:

  1. The haplogroups, the Y-DNA and mt-DNA
  2. Autosomal DNA

The Haplogroups

The haplogroups are chains, or markers, that are carried on one of only two strict lines of descent.  They do not apply to your entire ancestry - just two lines.  As we saw above, we have 256 direct ancestors eight generations back (unless any of their descendants reproduced together).  Our haplogroups came from only two of them.  Your haplogroup does not define you.  Yet, it's quite odd, because very quickly, many genetic genealogists do relate to them, rather like a hereditary football club.  They do become an identity, but only if you enthuse over them.

The Y or paternal haplogroup, follows the strict paternal line.  From father to son.  Women do not have a Y chromosome, so cannot pass it on.  It has to come from the biological father.  However, within this constraint, Y-DNA is particularly useful to genealogists.  It mutates often, both as STRs and less often, as SNPs (snips).  Because of these frequent mutations, it is useful for tracing shared descent with others.  It can also be aligned with surname studies.  The champion commercial DNA company for Y-DNA research, is Family Tree DNA.

The mt or mitochondrial (maternal) haplogroup follows the strict maternal line.  From mother to children.  Both sons and daughters inherit their mt-DNA haplogroup from their biological mother.  However, only the daughters can pass it down.  There are two drawbacks to mt-DNA for genealogy.  1) The surname frequently changes, traditionally nearly every generation through marriage. 2) It doesn't mutate as frequently as the STRs of Y-DNA. It is still a useful tool, and can prove descent through the maternal line.  It can also still be used for studies of much deeper, ancient ancestry.

Autosomal DNA

This is the bulk of your DNA.  All of the snips (SNPs), that make up who you are genetically.  You receive approximately 50% from each parent, 25% from each grandparent, 12.5% from each great grandparent.  This subdivision cannot go on forever, and indeed, once you go back much more than six generations, the approximations start to deviate, so that you may have no snips at all from a particular line that joined your family tree over 250 years ago.
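
The halving can be written out as a rough sketch in Python.  The function name is just for illustration, and the figures are averages only: because of random recombination, the real share from any one distant ancestor can be higher, lower, or nothing at all.

```python
def expected_share(generations_back):
    """Average fraction of autosomal DNA expected from one ancestor.

    Assumption: a simple halving per generation. This is only the
    expectation; the real amount inherited varies.
    """
    return 0.5 ** generations_back


for n in range(1, 9):
    print(f"{n} generations back: {expected_share(n) * 100:.2f}% on average")
```

By eight generations back, the average share from any single ancestor has fallen to well under half of one percent.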

The problem with autosomal DNA is that it can be such a mess.  It recombines randomly with every generation.  Therefore, it is much harder to track ancestry in the same way, that we can with the haplogroups.

So how can they be applied for genealogy?

Biological descent

Not everyone knows who their biological parents were, or where they came from.  This is the first use of DNA testing.  It can be used to find, test, or prove recent descent.  The first hurdle of genealogy.  Both haplogroup evidence, and autosomal evidence can be used to prove or determine relationship.

Cousins

Many genetic genealogists, use DNA to find distant, and sometimes not so distant cousins.  The hope is that they can link trees, share knowledge and research, perhaps copies of artefacts.  Therefore an awful lot of genetic genealogy is about tracing genetic relatives, and establishing common ancestry.

There are two main tools:

  • Haplogroup Projects.  The Y haplogroup is favoured for its frequently mutating STRs, and also for its link to surnames.  Family Tree DNA projects track the STR and SNP data of their members, tracking families, relationship, known mutations.  Project administrators at FTDNA can predict relationship to other members in the project.  Your Y cousins.
  • Shared segments.  Autosomal DNA can be used for finding distant cousins.  23andMe for example, have Relative Finder.  Alternatively customers of any commercial DNA company that gives them access to their raw data, can upload that data to GEDmatch.  At GEDmatch, they can search for other kits, looking for lengths of shared segments (measured in cM - centimorgans) on the autosomes or X chromosomes.  Longer, or more numerous, shared segments can be used to indicate closer shared ancestry.
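
As a rough illustration of the shared segment idea, here is a small Python sketch of the kind of tally that could be kept for each match.  The segment figures are made up, the 7 cM cut-off is my own assumption rather than a rule from any particular site, and the function name is hypothetical.

```python
# Hypothetical shared segments between two kits: (chromosome, length in cM).
# These numbers are invented for illustration only.
segments = [("1", 12.4), ("7", 8.1), ("15", 3.2), ("X", 9.0)]

# Assumed cut-off: very short segments are ignored here as unreliable.
MIN_CM = 7.0


def summarise(segments, min_cm=MIN_CM):
    """Return (total cM, longest cM, count) for segments at or above min_cm."""
    kept = [length for _, length in segments if length >= min_cm]
    if not kept:
        return 0.0, 0.0, 0
    return sum(kept), max(kept), len(kept)


total_cm, longest_cm, count = summarise(segments)
print(f"{count} segments, {total_cm:.1f} cM in total, longest {longest_cm:.1f} cM")
```

A bigger total and a longer longest segment both point towards a closer relationship, but the paper trail is still needed to say what that relationship actually is.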

It is important to understand, that this is not about directly tracing ancestry.  It is only about establishing shared biological ancestry, with other researchers, with which you may be able to share resources.  In the old days of genealogy, we would find distantly related researchers by browsing through annually printed surname interest directories.  Here, the same thing is happening, but we are finding people by comparing DNA.

Ancestry from Autosomes

Most commercial DNA companies providing ancestry information now use their own proprietary calculators to look at the autosomal DNA of their customers for patterns that they can relate to a number of reference populations.  23andMe for example, uses Ancestry Composition to determine what parts of the world the ancestors of their customers lived in 500 years ago.  They predict from this in percentages of ancestry.

However, it is very much a developing art.  The problem is that genes have been randomly mixing and moving around ever since prehistory.  The customers of these DNA companies want hard facts.  They want their ancestry accurately pinpointed down to modern or ancient nation-states, or to historical populations such as the Vikings or Huns.  Ancestral DNA companies are under pressure to provide this deep ancestry.  However, can they?  Ancestral analysis of DNA can be very enlightening.  It can provide some surprises within a family history.  However, its accuracy is exaggerated.  It is increasingly successful at predicting ancestry from a particular corner or end of a particular continent.  But it cannot for example, tell French, British, and German ancestry apart to any high accuracy.  It can recognise some populations better than others.  It cannot tell anyone if they had Viking ancestry.

Ancient Ancestry

This is a particular value of the haplogroups.  As we accumulate more and more data on more mutations, as we expand the recorded database, and as we relate that to more ancient DNA extracted from referenced and dated ancient human remains, so we will be able to better explore the population genetics not only in history, but deep into prehistory.

However, it is also becoming increasingly realised, that patterns of ancient admixture can also be detected within the autosomes.  Although Autosomal DNA ancestry calculators claim to reveal relatively recent admixtures over the past 500 years, it is becoming clear that these are being confused by much older patterns of admixture.  These patterns can now be explored and probed on a number of GEDmatch programs.  People can compare their DNA with the kits from ancient DNA, or predict just how much of their ancestry was likely "Western Hunter-Gatherer", or "Early Neolithic Farmer".

In addition, more DNA companies are now measuring for much more ancient admixture with archaic populations such as the Neanderthals.

Conclusion

Genetic Genealogy is fun, great fun.  It is not however, a quick and easy replacement for traditional genealogy.  Unless you get lucky with some comparative Y-DNA in a project, it is not going to directly tell you the names or social status of any ancestors.  It can give you a phylogenetic tree, but not any kind of family tree that you can bore other family members with.

Genetic genealogy can provide some tools to some researchers.  It can test biological relationship.  It can be used to predict some of your ancient history.  For most researchers, particularly those that are able to interview many local family members, search local graveyards, access archives and records - it has little or no value to the pursuit of collecting ancestors.

I personally love to explore my genetic genealogy. But it is documentary research that provides the names.  Genetic genealogy for myself, is more about the long and ancient journey.