Building bridges and walls through ancestry

Copied from openstreetmap.org and modified under the Open Data Commons Open Database License.  

Bridges and Walls, Snakes and Ladders

I've noticed two perspectives within the broad scope of genealogy where it ties to population genetics.

  • Some people, those with nationalistic, right wing political views, frequently look for what divides their ancestry from others.  What defines and ties them to a historical population, or even to a land.  They may well want to prove connection to a romanticised historical group within their part of the world.
  • Others - those of a more international, liberal persuasion, instead tend to look for what unites them with other peoples alive today - what connects them within the community of humanity.

I have to confess to being more of the latter.

On Paper

I started out with a pretty well-researched paper genealogical record.  A family tree.  A family history.  Researched through oral history, interviews, parish records, state records, and then on to digitised records in more recent years.  A genealogical database of 1,570 individuals for my kids, and 207 direct ancestors recorded for myself - going back to the 1680s.  My recorded ancestry was 100% English - dominated by the County of Norfolk.  The majority of present day English perhaps have some non-English ancestry, perhaps Irish or Scottish, or something a little further afield.  I didn't find any.  All English surnames, and English denominations.  Some of those surnames, however, did echo rather more ancient immigration from across the North Sea.

Autosomal DNA Testing

Autosomal DNA testing for ancestry provided a bit of a surprise.  I took a 23andMe DNA test, along with my mother, whose results I phased with to provide more accuracy.  The 23andMe Ancestry Composition analysis in standard mode didn't simply see me as English, or even as British.  It did see me pretty much as 100% European.  Not a hint of Africa nor Asia within the past several hundred years.  It saw 86% of my autosomal DNA as definitively North-West European.  However, it could only see a mere 17% as distinctly belonging to British & Irish.  So, the ancestry test of my autosomal DNA certainly agreed that I was European, NW European even, but couldn't be sure how English or even British I was.

23andMe Ancestry Composition in the very unreliable speculative mode rated my British/Irishness at only 37%.  That was the highest single percentage, but it also saw 22% of my autosomal DNA ancestry as French / German, 1% as Scandinavian, and 2% as South European.  So considering my 100% English ancestry on paper, autosomal DNA testing couldn't really be very sure about my ancestry.  Even in speculative mode, it had 34% of my DNA as "Broadly NW European", meaning that it couldn't be sure, but somewhere in that corner of that continent.

Fair enough I suppose.  I've lost a certain amount of faith in any autosomal DNA tests for ancestry to be able to pinpoint the English.  You see, even ignoring recent waves of immigration of Irish, Scottish, French, Germans, West Indians, South Asians, etc, the truth is that the English were already a very admixed population even 1,500 years before present.  Already a mixture of prehistoric populations, immigrants from across the Roman Empire, then from across the North Sea, from the Low Countries, Northern Germany, Denmark, Scandinavia, etc.  23andMe claim that their product reflects your ancestry 500 years ago.  No it does not.  It uses modern reference populations.  Genes have been circulating around the World for a long time.  Autosomal DNA tests for ancestry have really improved.  They are pretty good now for recognising a Continent - sometimes even a corner of a continent, as the source of some ancestry.  But they cannot pinpoint many populations with accuracy, and they cannot pinpoint the English.

So, my paper record said English.  My 23andMe autosomal DNA test said North-West European, but couldn't even pinpoint British.  It suggested admixture.  It did however - this is important - only see me as European.  Okay, in Standard Mode, it did have a tiny 0.3% that it failed to assign to Europe, or to anywhere else.  It did not see Asian.

Haplogroup DNA Testing

Haplogroups follow two narrow lines of ancestry.  The Y follows the direct paternal line, the MT follows the direct maternal.  They do not represent the bulk of your ancestry.  However, they can tell a more accurate, and longer term story.  Ancestry can be lost in Autosomal DNA within a few centuries.  In addition, it gets messed up through recombination.  Not so with the two haplogroups.  So where did mine come from?

My MT-DNA

There is an awful lot that we will know in future about our haplogroups, that we don't yet know - especially in the case of mt-DNA. However, we do know that my haplogroup, H6a1, did not originate in Europe.

H is common in Europe, and it most likely originated either there, or in South West Asia, during the Upper Palaeolithic. H6 did not originate in Europe.  It may be West or Central Asian in origin.  H6a1 has not been recovered in any ancient DNA within Western Europe.  However, it has been recovered in the DNA of the Yamnaya on the Eurasian Steppes.  For this reason, it is generally thought - based on evidence so far, to have been brought into Western Europe during the Early Bronze Age, by the expansion from the Eurasian Steppes at that time.

It isn't too fanciful - based on this evidence, to imagine that my distant grandmothers belonged to tribes of Early Bronze Age pastoralists, living on the Steppes of what is now the Ukraine.

My Y-DNA

This one has been a cracker for me.  Anyone that has followed my blog, might be getting bored with this.  I've thoroughly tested my Y-DNA.  It's not an exaggeration to suggest that it is quite likely Ancient Persian.  Based on current evidence, I believe that my Y-DNA arrived into England within the last millennium - probably between 350 and 800 years ago.  I'm still working on its most likely route here.  I do believe that it was most likely still located in the region of Iran circa 1,000 to 2,000 years ago.  My nearest 111 STR match is to a guy in Australia whose paternal line lived in Birjand, Eastern Iran.  We shared a common ancestor around 2,000 years ago.  My terminal SNP is shared on record with only one other man so far - in the world.  He is a Balochi speaker from Makran, SW Pakistan - close to the border with Iran.  The Balochi are believed to have migrated from North Iran between the 5th and 14th centuries AD.

Nomad camp, at the Zagros Mountains, Iran.  By C Whitely on Flickr under Creative Commons License.

A bit more distant, I have a Y cousin in the USA that maybe I shared a common ancestor with 3,000 years ago.  He is of Azores Portuguese descent on his Y line, but he carries a distinct STR marker that has been associated with the Parsi, who migrated to India and Pakistan, but originated in Iran.

And going further back, the Y haplogroup L most likely originated within the area of Iran and Iraq, during the Ice Age.  It would have been carried by Upper Paleolithic hunter-gatherers in that region.  13,000 years ago, I shared grandfathers with two Pontic Greek Y cousins, whose ancestors lived in Trabzon, Eastern Anatolia.  Maybe one Y ancestral son headed to the Black Sea, the other settled at the Caspian Sea?  The Ice Age was drawing to a close, but with a ferocity and climate instability that drove bands of people apart and into refuges at that time.

The Parsi connection keeps hinting.  They descend from Persians that worshiped the ancient religion of Zoroastrianism.  I've just seen a Y haplogroup study of men in Pakistan.  The background level of Y haplogroup L-M317 sat at 1.1%.  However, in the sample of Parsi men there - it spikes up to 13.3%.  That might not be the route however, of my Y line.  The SK1414 SNP turned up in that same study, but that was found on the Makrani Baloch man that was tested L-M317, not in the 12 Parsi men that also tested positive for L-M317.

Conclusion

I prefer bridges to walls, and that is what I got.  My paper ancestry said 100% English - much of it East Anglian.  I'm quite proud of that, but I'm equally proud of my more distant ancestors that emigrated here.  I've found North Sea admixture, from places such as the Netherlands and southern Scandinavia.  I've found a grandmother in a Bronze Age tribe of pastoralists in the Ukraine.  I've found ancient Persians, descending from hunters of Ibex in the Iran / Iraq region.  I've found distant cousins in the USA, Iran, Pakistan, Australia, and Turkey.

One species, one family.

The Other SK1414. My Cousin in Baluchistan

By Baluchistan on Flickr under a Creative Commons Licence. No, this young man is not the SK1414 tester, but the mandolinist in me found this photo kind of cool.  A young man from Makran.  The other SK1414 tester was also a male Makrani Baloch.

I'm hot on the trail of my Y or paternal line, following my FTDNA Y111 STR, then Big Y tests.  These tests analysed the DNA on my Y chromosome.  It is passed down strictly from father to biological sons.  The mutations (SNP and STR) that can be identified in the Y-DNA, can be used to assess relationship, and in some cases, to date the time of most recent common ancestry.  So, with the assistance of Gareth Henson, administrator of the FT-DNA Y haplogroup L Project, and with help from my new distant cousins, what have I learned over the past few weeks?

The Smoking Gun of Y-DNA

Between 45,000 and 13,000 years ago, my paternal ancestors most likely were hunter-gatherers, that lived in the region of what is now Iran and Iraq, during the last Ice Age.  Some sharp changes in glaciation, and cold extremes towards the end of that period, may have generated a number of adaptations, and subsequently, split new sub clades of my Y haplogroup L.

13,000 years ago (based on the Big Y test), I share a common paternal great x grandfather with a number of distant cousins, that descend from Pontic Greek families from the Trabzon region in Turkey.

Between 3,000 and 1,000 years ago (based on the less accurate STR evidence at 111 markers), I share a common paternal great x grandfather with another cousin, whose paternal line, Habibi, can be traced back to the 1850s in the town of Birjand, Southern Khorasan, Eastern Iran, close to the modern Afghanistan border.  This closer cousin now lives in Australia.

Human male karyotype high resolution - Y chromosome

My Big Y test produced no fewer than 90 previously unrecorded or unknown SNP (pronounced "snip") mutations.  That might be because my Y-DNA is rare, and / or because it is mainly found in parts of the World where very few people test at this level.  The last SNP on the roll that had been seen before, has been called SK1414.  Because now two of us have tested for this SNP, it is my terminal SNP, so at the moment (although it still has to be submitted to the YFull Tree), I can declare my Y haplogroup sub clade designation to be L-SK1414.  Only one of two so far recorded in the World.

So, who is this Y cousin that shares my SK1414 mutation?

My Baluchistan Cousin

By Baluchistan on Flickr under a Creative Commons Licence.  Another photo from Makran, Balochistan.

The other SK1414 turned up during an early survey, back in the early 2000s by the Human Genome Diversity Project.  It turned up in a sample of the Baluchi in Makran, South-west Pakistan.  Could this cousin be closer than the Habibi tester?  Could my Habibi cousin, from an eastern Iranian family also carry SK1414?

The Baluch are an Iranic people, that speak Baluchi, an Iranian language that belongs, as do most European languages, to the Indo-European linguistic family.  According to the Iran Chamber Society website, they moved to Makran during the 12th Century AD.  Traditionally the Baluch claim that they originated in Syria, but a linguistic study has instead suggested that they actually originated from the south east of the Caspian region, and that they moved eastwards between the 6th and 12th centuries AD in a series of waves.  No other examples of Y sub clade L1b (L-M317) have been found in Southern Asia outside of two samples of this survey, so perhaps the tester did have ancestry from Western Asia.

Iran regions map fr

It would seem likely that I do have a number of Y cousins, most likely in the region of Eastern Iran and South-Western Pakistan.  It doesn't necessarily follow though, that our most recent common Y ancestors lived there.  As I said above, the Baluch of Makran, Pakistan are said to have migrated from further north-west, from the Caspian Sea region.

There is a tentative suggestion of a link to the Parsi. A Portuguese STR tester with a genetic distance (based on 67 markers) of 22, has (thanks again Gareth) "a distinctive value of 10 at DYS393. In the Qamar paper this value is found in the Parsi population".  So there is just the possibility also, of the Parsi ethnicity carrying L1b from Western Asia into Southern Asia.  Perhaps this marker was picked up by a Portuguese seafarer link to Southern Asia.  It could even be the link to my English line, via the Anglo-Portuguese Alliance.  A lot of speculation.  I don't think that M317 has been found yet in India.

Into England

I have found STR links with four people that carry the surname Chandler.  They live in England, Australia, and the USA.  These cousins appear to descend from a Thomas Chandler, that lived in Basingstoke during the 1740s.  That is 32 miles away from my own contemporary surname ancestor, John Brooker, who lived at the same time at the village of Long Wittenham in the Thames Valley.

Unfortunately three of the Chandlers have only 12 markers tested, and the fourth has 37 markers.  Therefore the estimated time to our most recent common ancestor is not accurate, but it looks as though the Chandler and Brooker Y hg L testers of Southern England, most likely shared a common paternal great x grandfather sometime between 800 and 350 years ago.

That only these two lines have turned up, and that they are geographically and genetically so close, might suggest that our Y-DNA lineage arrived in Southern England around the late medieval, perhaps from between the 13th and 17th centuries AD.  It could just be through a Portuguese navigator link, or it could be through thousands of other routes.  More L-M20 testers could turn up in England in the future, that could push the arrival to an earlier date.

Today

I could have any number of cousins from south England.  The Brookers and Chandlers may well have other paternal line descendants living in the Thames Valley, Hampshire, London, or elsewhere.  I'd love to prove a Brooker from the Berkshire / Oxfordshire area, as sharing ancestry.  I believe for example, that the journalist Charlie Brooker descends from one of the Thames Valley families, although not necessarily from mine.  Do they carry the Y hg L?

My great great grandfather Henry Brooker, did not appear to have any more sons, other than my great grandfather John Henry Brooker - who in turn, only had one son, my grandfather Reginald John Brooker.

I have one Y haplogroup first cousin.  He has I believe, a son, and a grandson.

On the Trail of our Y Ancestor

Locator map Iran South Khorasan Province

Early examinations of the Chandler / Brooker Southern English L-M20 Y haplogroup samples, seem to be suggesting that they share a common ancestor quite recently, perhaps between 300 and 600 years ago.  That might mean that a Y ancestor carried the haplogroup into England, perhaps between the 13th and 17th centuries AD.

Where did that Y-DNA come from?  It could have been carried directly by one Y ancestor from a homeland, or it could have been transported to England gradually over many generations, from a homeland in Western Asia.

An early match has been forwarded by Caspian, forum user at Anthropogenica.  It is a 111 STR marker, from Birjand / Southern Khorasan, in Eastern Iran.

Could this be the home of our Brooker family Y ancestors?  That is to say, if I was to trace my father, back to his father, to his father - and to continue along this route, might I eventually find my ancestors on this paternal line, in Eastern Iran?  It's an early possibility.  More data, more tests, might eventually give me a better answer.

The STR evidence linked on a Google Sheet.


Edit. 25th May 2016

Early analysis by Gareth Henson, informally suggests a TMRCA (time to most recent common ancestor) between myself and the guy in Eastern Iran, of circa 3,000 years ago, or if you prefer, 1000 BC. That would mean that we shared a common lineage until around the time of the Later Bronze Age in British terms. Our common Y ancestor most likely lived nearer to his home in Western Asia than to mine in North West Europe.

That isn't long ago. It might suggest that our most recent common ancestor lived in Western Asia around about the time of a series of tensions and conflicts  between Greeks and Persians.  On the other hand, Anthropogenica user Anabasis, using the Clan McDonald TMRCA Calculator, suggests a more recent date, around 1,800 to 1,500 years ago.  That in his words puts it into a context of "In that times Roman - Sasanian wars happened along Eastern Anatolia. Greek- Persian wars were 1 millennium earlier.".  However, he warns, that STR data is not a trustworthy indicator of a TMRCA.

What I love though, is that it stirs the imagination.  Whether 1,500 years ago, or 3,000 years ago - I, an East Anglian, had a paternal ancestor somewhere most likely, between Eastern Anatolia, and Afghanistan.

The Chandler-Brooker Y haplogroup L1b (L-M317)

Link to STR data for Southern English L M20 (Brooker / Chandler)

My Family Tree DNA Y111 STR test results are in.  Only yesterday, I predicted that ftDNA kit number 26369 could be of particular interest.  That prediction has now been proven correct.  Here is what I have learned since yesterday.

The 12 marker STR kit belonged to a descendant of a Thomas Chandler, that lived 1728 to 1782 at Basingstoke, Hampshire.  Although only 12 markers - it proved a perfect match for my first 12.  100%.  Family Tree DNA rated its genetic distance as zero.

Basingstoke, Hampshire by modern road is only 32 miles (51 km) from Long Wittenham, Berkshire (now Oxon), where my surname ancestor, John Brooker lived, at the same time.  Based on the limitation of a 12 marker comparison, FTDNA give 71% confidence to the testers sharing a common Y ancestor within 12 generations, and 91% confidence of us sharing a common Y ancestor within 24 generations.  I'd say that suggests that myself and the present day descendant of Thomas Chandler, shared a common Y ancestral lineage until between circa 1500 and 1700.
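To show how a number of generations turns into a rough date range, here is a minimal sketch.  The generation lengths (25 to 33 years) and the reference year of 1960 are my own illustrative assumptions, not figures from FTDNA, and the result is only a ballpark.

```python
# Assumed values, not from FTDNA: generation lengths of 25 to 33 years,
# and a hypothetical reference year of 1960 standing in for a tester's birth.
SHORT_GENERATION = 25
LONG_GENERATION = 33
REFERENCE_YEAR = 1960


def year_range(generations, reference_year=REFERENCE_YEAR):
    """Return (earliest, latest) calendar year for a number of generations back."""
    earliest = reference_year - generations * LONG_GENERATION
    latest = reference_year - generations * SHORT_GENERATION
    return earliest, latest


for generations in (12, 24):
    print(generations, "generations back:", year_range(generations))
```

On those assumptions, 12 generations lands between about 1564 and 1660, which sits inside the circa 1500 to 1700 window suggested above.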

So most likely, between the 16th and 17th centuries inclusive, the Y chromosome moved between two surnames, what we call an NPE (non-paternity event).  Usually either illegitimacy, where the Y-DNA detached from the surname of the biological father, or simply, the biological father of an ancestor, was not the husband of their mother.  This event most probably occurred in England, somewhere in the Hampshire, Berkshire, Oxfordshire area.  Both my Brooker lineage record, and the Chandler record, merge somewhere in that area.

It gets better.  Searches on FT-DNA, ySearch, and an email trail, revealed more Chandler Y cousins with an L haplogroup. All together, I have today found two 12 marker STR tests, that match my first 12 markers perfectly, with a prediction of zero genetic distance.  I have found another 12 marker test with a slight difference, and a genetic distance of 1.  I have found a 37 marker test with some differences, but that still gives a genetic distance of only 3.  A comparison with the Y37 test result, predicts 78% confidence of sharing a most recent common ancestor with me within 12 generations, and a 99% confidence of us sharing common Y descent within 24 generations.  This correlates quite nicely with the two perfect 12 marker testers.  All four testers are descended on the paternal line from Chandlers in the Basingstoke area.
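For anyone wondering what a genetic distance actually counts, here is a minimal sketch.  It assumes a plain stepwise model, where every one-repeat difference at a marker adds one to the distance.  Family Tree DNA's own calculation treats some markers differently, so this is an illustration rather than their method.  The first haplotype is the 12 marker result quoted elsewhere on this blog for the Basingstoke kit; the second is a made-up variant with one marker changed by a single step.

```python
def parse_haplotype(text):
    """Split an STR result string into integer marker values.

    Multi-copy markers written like '11-17' become two separate values.
    """
    return [int(value) for value in text.replace("-", " ").split()]


def genetic_distance(a, b):
    """Sum of absolute repeat differences (simple stepwise model, an assumption)."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


kit = parse_haplotype("11 23 15 10 11-17 11 12 12 14 14 31")
# Hypothetical comparison haplotype: last marker differs by one repeat.
variant = parse_haplotype("11 23 15 10 11-17 11 12 12 14 14 30")

print(genetic_distance(kit, kit))      # identical haplotypes
print(genetic_distance(kit, variant))  # one step apart
```

A perfect 12 marker match gives a distance of zero under this model, and a single one-step mutation gives a distance of 1.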

The Chandlers of Basingstoke

The FT-DNA Chandler Surname Project is very well managed through the Chandler Family Association.  The three Y12 test kits, that hail from a Basingstoke ancestor, and have proven to belong to the exclusive L M20 Y-DNA haplogroup, have been clustered together as Chandler Group 10.  If our surname was Chandler, rather than Brooker, my Y111 results would fit perfectly into this cluster.  This is because we shared a common paternal lineage, until between 500 and 320 years ago.

Origins of the Chandler-Brooker L1b Y haplogroup

That one has still to be answered.  I'll be consulting others, the Y haplogroup L project administrator, and looking forward to my Big Y test, which is scheduled to take place soon.  However, judging by how very few L-M317 Y haplogroup carriers have so far been recorded in the British Isles, or in North West Europe; I'd dare to propose that the common paternal ancestor of both lines, most likely had not been in England for very long.  Perhaps they could, for example, have carried the Y-DNA here as a 16th or 17th century Protestant refugee?  Maybe not, they could have equally been a merchant, an artisan, a servant, a mercenary, or have arrived in another capacity - if indeed they did arrive here that recently.  There is no indication in either the Brooker or Chandler surname of anything but a medieval English origin (unless originally Bruche, or Chandelier?).

If the common ancestor did arrive that late, where did he come from?  What modern population elsewhere most resembles his Y-DNA?  Hopefully, the Big Y test will help to answer that.  Meanwhile my untrained eyes see correlations within many of the STR markers of people that descend from the Pontic Greek community, that once lived in Eastern Anatolia, and around the Black Sea area of Western Asia.  Of course, the Y-DNA might not have been carried to Southern England from such a homeland within one generation.  It could have been?  There is no sign of any West Asian, Balkan, or Caucasus ancestry within my autosomal DNA.  However, even six to eight generations ago - that could be washed out through recombination - leaving only the Y-DNA to tell what would have otherwise remained a lost untold story.  However, it could have moved across via a number of generations.  It will be worth looking out for any evidence of this on results across the Continent.

Kit Number 26369

Family Tree DNA (ftDNA) is a commercial genetics genealogy company, with a reputation for cornering the market in Y-DNA testing, and in accumulating references for haplogroups.

That map above, that is the sum total of Y haplogroup L submissions on their database for the UK.  All four of them.  The two to the east are L2 and L2a.  The one in Oxfordshire represents my own pending results (expected L1b or L-M317).  Just to the south of that, the SW representative, is kit number 26369.

The cluster in Central Europe, is the "Rhine Danube Cluster", but that is L1a (L M349).

So you see, except for kit 26369, my Y haplogroup is way out here, like a distant satellite on its own.  So what is Kit 26369?  Well, it is only a Y12 STR test result.  Predicted to M20, but it has been assigned to L-M317 un-clustered.  Up to now!  It's located only 32 miles south of my surname line during the 1740s.  Could it relate to our line?

STR:

11 23 15 10 11-17 11 12 12 14 14 31

Time will tell.  My Y111 test should take place within the next month.

Ancestry and DNA Tests

I'm writing this post in response to a number of comments that I see online with regards to using a commercial DNA test, in order to ascertain ancestry.  Quite often, when someone asks how to find out their family history or ancestry, someone will come back with an answer in the form of "just spit in a vial, send it to Ancestry.com, and they'll tell you".  It's not really that simple, so I'm making this post, to explain how an ancestry DNA test can help, or not help, you discover your ancestry.  Nicely dumbed down I hope, for the beginner.

Traditional Genealogy

Traditional genealogists usually set out to create a genealogy (family history and tree), using interview techniques, artefacts, and oral memories, recorded from older relatives.  Artefacts might for example, include old family medals, or photographs.  They then extend the research, through documentary evidences, such as birth, death, and marriage certificates, church registers, census records, transcripts, electoral rolls, and military records. If they are interested in recording all ancestral information, and not merely a single line such as the surname line, then this research can go on for months, years, even decades.

What you cannot do, is to simply pay a small fee, and your entire family history drops through the letter box in a brown envelope.  It takes years of time to research, collate, and to verify a good family tree.  Most genealogy enthusiasts don't mind this, because they actually enjoy doing the research itself.  It becomes a hobby, even sometimes a passion.

However, a number of commercial DNA companies may give the general public the impression, that you now can simply pay a fee, spit or swab, and your ancestry magically appears for you on a website.  It's big business.  Does it work though?  Exactly what is genetic genealogy?

What is Ancestry and why do we care.

Ancestry can simply be defined as our descent from forebears.  Why do we care who they were? Which forebears or ancestors?  How many are there?  How far back?

Of course, not everyone does care.  Not everyone cares about history.  But for others, it does matter to how we define ourselves, our communities, and families.  It tells us who we are, where we came from.  It defines us, gives us grounding.  It gives us identity.  Wars have often been inspired by ancestry.  At the same time, a deeper appreciation of the human family, and its common ancestry, can be used to relate to those elsewhere.  One big family.  Discovering the immense poverty and hardships of our ancestors can help us to appreciate what we have, and to help others in need today.

So what ancestry can we discover?  For those few that merely concentrate on one patriarchal line, it's quite simple to define - the generations of a surname.  However, beyond that one narrow line of descent, few appreciate exactly how much total ancestry we have.  Let's look at our biological ancestors at each generation:

  • 2 parents
  • 4 grandparents
  • 8 great grandparents
  • 16 g.g grandparents
  • 32 g.g.g grandparents
  • 64 g.g.g.g grandparents
  • 128 g.g.g.g.g grandparents
  • 256 g.g.g.g.g.g grandparents.

These are only your 510 most recent direct ancestors, yet just those generations will take you back only around 250 years of family history.  Now add all of the recorded children of these direct ancestors - the great great uncles and aunts to the theoretical family tree.  You're probably going to have a tree of around 1,300 individuals.  That is just for 250 years.  You have a big family.  Go back a few more generations, and it will explode before you reach far.  All of those direct ancestors though, are a part of your ancestry.  You'll most likely carry some DNA from most of them.  They are, from a biological perspective, who you are.
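The doubling is easy to check for yourself.  Here is a minimal sketch that counts the ancestors in each of the eight generations listed above and adds them up.  The function name is just my own label for illustration.

```python
def ancestors_per_generation(generations):
    """Number of biological ancestors in each generation, parents first."""
    return [2 ** n for n in range(1, generations + 1)]


counts = ancestors_per_generation(8)
total = sum(counts)

for generation, count in enumerate(counts, start=1):
    print(f"generation {generation}: {count} ancestors")
print("total direct ancestors:", total)
```

Eight generations of doubling ends at 256 in the last generation and 510 direct ancestors in total, before any cousin marriages reduce the count.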

By the way, the number of biological ancestors will not continue to increase infinitely.  Because increasingly, you will find couples within your tree that are distant biological cousins of each other.  As this accelerates through thousands of years, that explains how all modern people around the world, all descend from a very small population around 100,000 years ago.
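A quick sketch shows why the doubling has to break down.  It finds the first generation at which the theoretical number of ancestors would exceed a given population.  The figure of 8 billion is only a round assumption for the number of people alive today; past populations were far smaller, so in reality the overlap between ancestors starts much earlier.

```python
def generation_exceeding(population):
    """First generation whose theoretical ancestor count exceeds the population."""
    generation = 1
    while 2 ** generation <= population:
        generation += 1
    return generation


ASSUMED_POPULATION = 8_000_000_000  # round illustrative figure, an assumption
generation = generation_exceeding(ASSUMED_POPULATION)
print("doubling overtakes the assumed population at generation", generation)
```

At the pace used above (eight generations in about 250 years) that generation falls roughly a thousand years back, so the same people must appear many times over in anyone's tree.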

So before considering what DNA can do for genealogy, we need to consider which ancestors matter to us.  Do you just want to know who your biological parents, or grandparents were?  Do you want to know the names, places and social positions of your ancestors over centuries?  Do you want to know which parts of the world that your ancestors lived 500 years ago?  Do you want to know how some of your prehistoric ancestors moved across the globe, thousands of years ago?  Maybe you want to know everything.

Let's now turn to genetics for genealogy, and how DNA tests can answer some of these questions.

There are two main types of DNA tests for ancestry, although they are often incorporated together by commercial companies:

  1. The haplogroups, the Y-DNA and mt-DNA
  2. Autosomal DNA

The Haplogroups

The haplogroups are chains, or markers, that are carried on one of only two strict lines of descent.  They do not apply to your entire ancestry - just two lines.  As we saw above, we have 256 g.g.g.g.g.g grandparents (unless any of their descendants reproduced together).  Our haplogroups came from only two of them.  Your haplogroup does not define you.  Yet, it's quite odd, because very quickly, many genetic genealogists do relate to them, rather like a hereditary football club.  They do become an identity, only if you enthuse over them.

The Y or paternal haplogroup, follows the strict paternal line.  From father to son.  Women do not have a Y chromosome, so cannot pass it on.  It has to come from the biological father.  However, within this constraint, Y-DNA is particularly useful to genealogists.  It mutates often, both as STRs and less often, as SNPs (snips).  Because of these frequent mutations, it is useful for tracing shared descent with others.  It can also be aligned with surname studies.  The champion commercial DNA company for Y-DNA research, is Family Tree DNA.

The mt or mitochondrial (maternal) haplogroup, follows the strict maternal line.  From mother to children.  Both sons and daughters inherit their mt-DNA haplogroup from their biological mother.  However, only the daughters can pass it down.  There are two drawbacks to mt-DNA for genealogy.  1) The surname frequently changes, traditionally nearly every generation through marriage. 2) It doesn't mutate as frequently as the STRs of Y-DNA. It is still a useful tool, and can prove descent through the maternal line.  It can also still be used for studies of much deeper, ancient ancestry.

Autosomal DNA

This is the bulk of your DNA.  All of the snips (SNPs), that make up who you are genetically.  You receive approximately 50% from each parent, 25% from each grandparent, 12.5% from each great grandparent.  This subdivision cannot go on forever, and indeed, once you go back much more than six generations, the approximations start to deviate, so that you may have no snips at all from a particular line that joined your family tree over 250 years ago.
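The halving can be written down in a couple of lines.  This is a minimal sketch of the average expected share from a single ancestor at each generation.  It is only the average: because of random recombination, the real share from any one distant ancestor can be noticeably more, or nothing at all.

```python
def expected_share(generations_back):
    """Average fraction of autosomal DNA expected from one ancestor."""
    return 0.5 ** generations_back


for generations_back in range(1, 9):
    share = expected_share(generations_back)
    print(f"{generations_back} generations back: {share:.2%} per ancestor")
```

By eight generations back, the average share from each individual ancestor is under half of one percent.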

The problem with autosomal DNA is that it can be such a mess.  It recombines randomly with every generation.  Therefore, it is much harder to track ancestry in the same way, that we can with the haplogroups.

So how can they be applied for genealogy?

Biological descent

Not everyone knows who their biological parents were, or where they came from.  This is the first use of DNA testing.  It can be used to find, test, or prove recent descent.  The first hurdle of genealogy.  Both haplogroup evidence, and autosomal evidence can be used to prove or determine relationship.

Cousins

Many genetic genealogists, use DNA to find distant, and sometimes not so distant cousins.  The hope is that they can link trees, share knowledge and research, perhaps copies of artefacts.  Therefore an awful lot of genetic genealogy is about tracing genetic relatives, and establishing common ancestry.

There are two main tools:

  • Haplogroup Projects.  The Y haplogroup is favoured for its frequent STRs, and also for its link to surnames.  Family Tree DNA projects track the STR and SNP data of their members, tracking families, relationships, and known mutations.  Project administrators at FTDNA can predict relationship to other members in the project.  Your Y cousins.
  • Shared segments.  Autosomal DNA can be used for finding distant cousins.  23andMe for example, have Relative Finder.  Alternatively, customers of any commercial DNA company that gives them access to their raw data can upload that data to GEDmatch.  At GEDmatch, they can search for other kits, looking for lengths of shared segments (measured in cM - centimorgans) on the autosomes or X chromosomes.  Longer segments, or a greater number of them, can be used to indicate shared ancestry.
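
To give a feel for the shared segment approach, here is a small Python sketch that totals up segments between two kits.  Everything in it is an assumption for illustration: the segment values are invented, the function name is hypothetical, and the 7 cM cut-off is just an assumed default for ignoring short segments.  It is not how any particular site does its matching.

```python
from typing import Dict, List, Tuple

# One shared segment: (chromosome label, length in centimorgans).
Segment = Tuple[str, float]


def summarise_shared(segments: List[Segment], min_cm: float = 7.0) -> Dict[str, float]:
    """Ignore segments shorter than min_cm, then report how many are left,
    their combined length, and the longest one."""
    kept = [length for _, length in segments if length >= min_cm]
    return {
        "segments": len(kept),
        "total_cm": sum(kept),
        "longest_cm": max(kept, default=0.0),
    }


# Invented example data, including one short segment on the X chromosome.
example = [("1", 12.4), ("7", 8.1), ("X", 3.2), ("15", 22.0)]
print(summarise_shared(example))
```

A larger total and a longer longest segment both point towards a closer shared ancestor, though neither on its own says who that ancestor was.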

It is important to understand that this is not about directly tracing ancestry.  It is only about establishing shared biological ancestry with other researchers, with whom you may be able to share resources.  In the old days of genealogy, we would find distantly related researchers by browsing through annually printed surname interest directories.  Here, the same thing is happening, but we are finding people by comparing DNA.

Ancestry from Autosomes

Most commercial DNA companies providing ancestry information now use their own proprietary calculators to look at the autosomal DNA of their customers for patterns that they can relate to a number of reference populations.  23andMe for example, uses Ancestry Composition to estimate which parts of the world the ancestors of their customers lived in 500 years ago.  They express this prediction as percentages of ancestry.

However, it is very much a developing art.  The problem is that genes have been randomly mixing and moving around ever since prehistory.  The customers of these DNA companies want hard facts.  They want their ancestry accurately pinpointed down to modern or ancient nation-states, or to historical populations such as the Vikings or Huns.  Ancestral DNA companies are under pressure to provide this deep ancestry.  However, can they?  Ancestral analysis of DNA can be very enlightening.  It can provide some surprises within a family history.  However, its accuracy is exaggerated.  It is increasingly successful at predicting ancestry from a particular corner or end of a particular continent.  But it cannot, for example, tell French, British, and German ancestry apart with any high accuracy.  It can recognise some populations better than others.  It cannot tell anyone if they had Viking ancestry.

Ancient Ancestry

This is a particular value of the haplogroups.  As we accumulate more and more data on more mutations, as we expand the recorded database, and as we relate that to more ancient DNA extracted from referenced and dated ancient human remains, so we will be able to better explore the population genetics not only in history, but deep into prehistory.

However, it is also becoming increasingly realised that patterns of ancient admixture can also be detected within the autosomes.  Although Autosomal DNA ancestry calculators claim to reveal relatively recent admixtures over the past 500 years, it is becoming clear that these are being confused by much older patterns of admixture.  These patterns can now be explored and probed on a number of GEDmatch programs.  People can compare their DNA with the kits from ancient DNA, or predict just how much of their ancestry was likely "Western Hunter-Gatherer", or "Early Neolithic Farmer".

In addition, more DNA companies are now measuring for much more ancient admixture with archaic populations such as the Neanderthals.

Conclusion

Genetic Genealogy is fun, great fun.  It is not however, a quick and easy replacement for traditional genealogy.  Unless you get lucky with some comparative Y-DNA in a project, it is not going to directly tell you the names or social status of any ancestors.  It can give you a phylogenetic tree, but not any kind of family tree that you can bore other family members with.

Genetic genealogy can provide some tools to some researchers.  It can test biological relationship.  It can be used to predict some of your ancient history.  For most researchers, particularly those that are able to interview many local family members, search local graveyards, access archives and records - it has little or no value to the pursuit of collecting ancestors.

I personally love to explore my genetic genealogy. But it is documentary research that provides the names.  Genetic genealogy for myself, is more about the long and ancient journey.

Autosomal DNA Tests for Genealogy

First a disclaimer.  I'm very new to the whole world of genetic genealogy.  I'm not new however, to traditional genealogy, and I do have a pretty good amateur understanding of relevant archaeological and anthropological discussions over the past fifty years.  The following is not meant as a critique of genetic genealogy, so much as a review, or my experience, of ancestry composition based on autosomal DNA analysis.

Let's start with my paper trail.

Traditional Genealogy

I am English by ethnicity, British by nationality, and a subject of Queen Elizabeth II (often now referred to as a UK Citizen).

My paper recorded ancestry consists of the genealogical records of:

  • Generation 1 has 1 individual. (100.00%)
  • Generation 2 has 2 individuals. (100.00%)
  • Generation 3 has 4 individuals. (100.00%)
  • Generation 4 has 8 individuals. (100.00%)
  • Generation 5 has 16 individuals. (100.00%)
  • Generation 6 has 29 individuals. (90.62%)
  • Generation 7 has 49 individuals. (76.56%)
  • Generation 8 has 35 individuals. (27.34%)
  • Generation 9 has 24 individuals. (10.16%)
  • Generation 10 has 10 individuals. (2.34%)
  • Generation 11 has 4 individuals. (0.39%)
  • Total ancestors in generations 2 to 11 is 181. (9.04%)
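
The percentage beside each generation is the share of possible ancestors that have actually been found.  A minimal Python sketch of that arithmetic, assuming Generation 1 is yourself so that generation n has 2^(n-1) possible places (the function name is made up for this example):

```python
def completeness(generation: int, found: int) -> float:
    """Percentage of the possible ancestors in a generation that are recorded.
    Generation 1 is the starting person, so generation n has 2**(n - 1) places."""
    possible = 2 ** (generation - 1)
    return 100.0 * found / possible


# Two of the generations from the list above.
print(f"Generation 6: {completeness(6, 29):.2f}%")
print(f"Generation 7: {completeness(7, 49):.2f}%")
```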

All 181 ancestors, reaching back to the 1690s, appear to be English born, of English ethnicity, with English surnames.  The majority of them (100% on my mother's side, and 81% on my father's side) were East Anglian, with the vast majority of that percentage being born in the county of Norfolk.  Religions recorded or indicated were CofE Anglican or non-conformist Christian.  No sign of any Catholicism, Islam, or Judaism.

Therefore it would look pretty likely, that I can claim English heritage, wouldn't you agree?

Genetic Genealogy and Ancestry Prediction

There are three aspects or avenues of inquiry available for genetic genealogy.  First of all, the two sex haplogroups: the y-DNA, and the mt-DNA.  These two "signals" are referred to as haplogroups.

  1. The y-DNA.  This follows the Y chromosome.  It is only carried by men.  It is passed along the paternal line, and only by that line, from grandfather, down to father, down to son, until the line is broken.  What a lot of people misunderstand is that it does not represent 50% of your ancestry.  It does not represent all of your biological father's ancestry.  For example, his mother's father, and her brothers, although on your father's side, would most likely carry a different y-DNA haplogroup.  It only comes down an uninterrupted, strictly paternal line.  Even at Generation 7 (g.g.g.g grandparents) above, it would have been carried by one out of my sixty-four biological ancestors at that generation.  The other thirty-one g.g.g.g grandfathers for that generation may have carried different Y haplogroups.
  2. The mt-DNA.  Although a very different type of DNA, this one works as the opposite sex haplogroup.  It is a signal that is passed down the strictly maternal line, from grandmother, to mother, to her children.  Yes, we men do inherit our mother's mt-DNA, but we can't pass it down.  Only our sisters can.
  3. The au-DNA, better known as Autosomal DNA.  Whereas the former two sex haplogroups are handy, because we can measure their mutations, and track their formation and movement across thousands of years, au-DNA really is the stuff that we are made of - all of the SNPs on our chromosomes that personalise us within the human genome.  We inherit our au-DNA from all of our recent ancestors.  Roughly 50% from our biological mother, and 50% from our biological father.  Equally, we could say on average, 25% from each grandparent, or 12.5% from each great grandparent.  However, it is messy.  At every reproduction (meiosis), it gets messed up by recombination.  Not only that, but go back much more than six generations, and it becomes more and more likely that you can lose entire lineages.  You can have no surviving trace of any DNA from for example, a particular g.g.g.g.g grandparent.

Autosomal DNA is what makes us individuals, gives us our hereditary traits.  It is passed down from many ancestors, via our parents.  However, the sex haplogroups are of interest because they can be traced across the globe, and the millennia.  As we gain more and more data - both from living populations, and ancient DNA from archaeological finds, so we will be able to track the STR and SNP mutation data more precisely.

However, what about poor old messed up autosomal DNA?  It represents our entire biological heritage over many generations.  It is what we are.  However, making sense of it is less easy, less precise.  Genetic genealogists are making progress, but it is far less of a precise science than either of the haplogroups.  They use calculators that measure the segments of DNA across the chromosomes, looking for patterns that they recognise from a number of known reference populations.  From that, these calculators predict an ancestry.  Exactly what and when that ancestry refers to does seem to vary from one calculator to another.  There is an argument that the precision can be improved if you also test close known relatives including at least one parent.  The results can then be phased.  I'm actually waiting for the results for my mother, so that I can see my own au-DNA ancestry results phased and corrected.
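
The idea behind phasing with a parent can be sketched in a few lines of Python.  This is a deliberately simplified illustration, not how 23andMe or any other company actually phases results.  The function name and genotypes are invented, and it looks at one SNP at a time.  Where the child has two different letters at a SNP, the mother's genotype can sometimes show which letter came from her, leaving the other as the father's.

```python
from typing import Optional, Tuple


def phase_with_mother(child: str, mother: str) -> Optional[Tuple[str, str]]:
    """Return (maternal_allele, paternal_allele) for one SNP, or None when
    the mother's genotype cannot settle it.  Genotypes are two-letter
    strings such as 'AG'."""
    first, second = child
    if first == second:
        # Same letter from both parents, so nothing needs untangling.
        return (first, second)
    mother_alleles = set(mother)
    first_from_mother = first in mother_alleles
    second_from_mother = second in mother_alleles
    if first_from_mother and not second_from_mother:
        return (first, second)
    if second_from_mother and not first_from_mother:
        return (second, first)
    # Mother carries both letters (or neither, which suggests an error).
    return None


print(phase_with_mother("AG", "AA"))  # mother can only have given A
print(phase_with_mother("AG", "GG"))  # mother can only have given G
print(phase_with_mother("AG", "AG"))  # ambiguous at this SNP
```

Even in this toy version, some SNPs stay ambiguous when only one parent is tested.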

So let's have a bit of fun, and see what some of the calculators suggest for my autosomal DNA, at least before any phasing with my mother's DNA.  What do they make of my 100% English paper ancestry?

23andMe.com Ancestry Composition Standard Mode

99.9% European.

Broken into:

83% NW European

17% Broadly (unassigned) European

I think that's pretty cool.  As I get to know au-DNA predictions, I'm learning to appreciate it when they get the right continent, and the right corner of that continent.  That is more than they could do a decade or two ago.  The prediction is correct, I am a NW European.  I'm not a West African, a South Asian, or an East Siberian.

23andMe.com Ancestry Composition Speculative Mode

100% European

Broken into:

94% NW European

3% S European

3% Broadly (unassigned) European.

Whoa, where did that South European come from?  It could just be a stray incorrectly identified signal, or it could be telling me that one of my ancestors, maybe around Generation 6, was from down south!  Let's break down the prediction further.  First, the NW European:

32% British & Irish

27% French & German

7% Scandinavian

But surely I should be 100% British & Irish?  Not only 32%.  I have my own ideas about this.  I think that although 23andMe claims that Ancestry Composition only represents the ancestry of the past 300 to 500 years (the so-called migration period, as sold to USA customers), that it gets confused by earlier migrations across their reference populations, including those during the early medieval period, and perhaps even some of those during late prehistory.  I've noticed that across Ireland and Britain, the further to the east, the more diluted the 23andMe British & Irish assignment.  People of solid Irish ancestry get between 85% and 98% British & Irish.  My East Anglian results, mixed between British & Irish, French & German, and Scandinavian, are actually rather more like those received by Dutch customers of 23andMe.

As for that Southern European prediction, how does that break down?

0.5% Iberian

2.4% Broadly (unassigned) South European.

Which if taken seriously, might suggest that I have an unknown Spanish or Portuguese ancestor around Generation 6.  If I did take it seriously that is.  I wonder what my mother's test will reveal?

DNA.Land.com Ancestry Composition

This is a third party site that you can upload your 23andMe V4 raw data to, and see what their calculators predict for your ancestry.  It has recently had its ancestry composition revised.  What did that make of my 100% English au-DNA?

West Eurasian 100%.

I like that designation, the amateur anthropologist in me prefers that broad designation over "European".  Broken down:

77% North/Central European

19% South European

2.4% Finnish

1.3% unassigned.

What?  Why not 100% North/Central European?  Finnish?  Did some early medieval Scandinavian settlers of East Anglia bring it?  Or is it a false signal?  Misidentified au-DNA?

That darned South European kicked in again.  I'm here looking at a biological cuckoo NPE (non-parental event) at around Generation 5 or even more recent!  Did a great grandmother secretly have a South European lover?  But this South European breaks down further:

13% Balkan

6% Italian.

Oh my goodness, whereas 23andMe speculative mode suggested SW Europe - this one suggests SE Europe!  Do I have a secret Albanian great grandfather?  Or is it all nonsense?

WeGene.com

This is a cracking new third party DNA analyser.  It is based in China, and its predictors appear to calculate mainly for a Chinese market.  It not only predicts your ancestry composition, but also your two sex haplogroups, and lots of traits and health predictions to complement those of 23andMe.  It even tries to predict your genetic disposition to sexuality!

It will allow you to send your 23andMe V4 raw data direct to its own calculators.  However, at the moment the website is almost entirely in Chinese (Mandarin?).  There are two options.  1) At the bottom of the webpages is a hyperlink to English, which gives, in English, a basic ancestry composition, and your haplogroups.  It does not include English versions of the health and trait results.  2) Use an online translator, such as the one built into the Google Chrome browser.  It actually serves pretty well.

On sex haplogroups they give my Y-DNA as

L1.  Not bad, but they didn't make it to L1b or L-M317.

My mtDNA?

H6a1a8.  Very good.  Better than 23andMe's H6a1, and the same as the mthap program.

But this is about au-DNA, how did they do, what did they make of my 100% English ancestry?

81% French

19% English/Briton

Now, that sounds pretty awful, but on closer inspection, I'm impressed.  No South European great grandfather.  Okay, so most of my DNA has been placed on the wrong side of the Channel.  However, I know that French and English DNA is actually very close.  Recent surveys even suggest that the English have inherited a lot of common ancestry with the French during unknown migration late in prehistory.  So again - they very much got the right corner of the right Continent.  Well done WeGene.

GEDmatch.com Eurogenes K13

GEDmatch is a website that you can upload raw data not only from 23andMe, but from a range of testers, and from V3 chips as well as V4.  It hosts a number of tools and predictors - some Open Source.  Some of these predictors are for Admixture or ancestry composition.  They measure your ancestry in terms of distance from known reference populations.  The lower the number, the closer you are to their reference.  They use calculators known as oracles to predict ancestry, including mixed ancestry or admixture.
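
To make "distance from a reference population" a little more concrete, here is a minimal Python sketch.  It assumes the distance is a plain Euclidean distance between your component percentages and those of each reference, which is an assumption for illustration rather than a confirmed description of the GEDmatch oracles.  The three components, the reference names and all of the numbers are invented.

```python
import math
from typing import Dict, List, Tuple


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two lists of component percentages."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_references(sample: List[float],
                    references: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs, closest reference first."""
    scored = [(name, distance(sample, values)) for name, values in references.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Invented percentages over three made-up ancestral components.
references = {
    "Ref_A": [50.0, 30.0, 20.0],
    "Ref_B": [40.0, 35.0, 25.0],
    "Ref_C": [20.0, 30.0, 50.0],
}
sample = [44.0, 33.0, 23.0]

for rank, (name, dist) in enumerate(rank_references(sample, references), start=1):
    print(rank, name, round(dist, 2))
```

The smaller the number, the more your mix of components resembles that reference.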

The oracles on the Eurogenes K13 and K15 calculator models have a good reputation for working with West Eurasian ancestry.  So how does K13 first, score my 100% English ancestry?

On Single Population Sharing, it rates my DNA against the closest references.  In order of closest to not so close, the top five are:

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97

I think that's a cracking result.  Okay, it thinks that I'm closer to South Dutch, than I am to SE English, but so close - and my East Anglian ancestry most likely does include a lot of admixture from the Low Countries from the early medieval period.  I really like Eurogenes K13.

Okay, let's now use the Oracle 4 option, to suggest admixture.  First on three populations admixing to create my DNA, what comes closest?

50% Southeast_English +25% Spanish_Valencia +25% Swedish @ 2.087456

Well that's interesting!  The SE English hit the net.  The Swedish?  Could be ancient Scandinavian admixture - but the Iberian prediction has reemerged!

On four populations admixing?

1 Southeast_English + Southeast_English + Spanish_Valencia + Swedish @ 2.087456
2 Southeast_English + Southeast_English + Spanish_Murcia + Swedish @ 2.147237
3 Norwegian + Portuguese + Southeast_English + Southeast_English @ 2.216714
4 Danish + Portuguese + Southeast_English + Southeast_English @ 2.225334
5 Portuguese + Southeast_English + Southeast_English + Swedish @ 2.230991
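
Notice that the top four-population line is the same mix as the 50% + 25% + 25% result, with the same score, which suggests four equal quarter slots that a population can fill more than once.  The Python sketch below works on that assumption.  It is a guess at the general idea, not the actual Oracle 4 code, and the references, components and numbers are all invented.

```python
import math
from itertools import combinations_with_replacement
from typing import Dict, List, Tuple


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two lists of component percentages."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_mixes(sample: List[float], references: Dict[str, List[float]],
               slots: int = 4, top: int = 5) -> List[Tuple[float, Tuple[str, ...]]]:
    """Try every way of filling `slots` equal shares with reference
    populations (repeats allowed) and return the closest mixes."""
    results = []
    for combo in combinations_with_replacement(sorted(references), slots):
        # Average the chosen references, component by component.
        mix = [sum(references[name][i] for name in combo) / slots
               for i in range(len(sample))]
        results.append((distance(sample, mix), combo))
    results.sort()
    return results[:top]


# Invented references, and a sample built as 50% Ref_A + 25% Ref_B + 25% Ref_C.
references = {
    "Ref_A": [60.0, 20.0, 20.0],
    "Ref_B": [20.0, 60.0, 20.0],
    "Ref_C": [20.0, 20.0, 60.0],
}
sample = [40.0, 30.0, 30.0]

for dist, combo in best_mixes(sample, references):
    print(" + ".join(combo), "@", round(dist, 6))
```

A good score here only means that the mix reproduces your numbers well.  It does not prove that you actually had ancestors from each of those populations.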

Oh my goodness.  K13 agrees with 23andMe AC, that I have an Iberian link.  I'm now really starting to wonder.

Let's finish off by trying K15 on my 100% English ancestry:

GEDmatch.com Eurogenes EU test V2 K15


Using Oracle for single population first, the top five closest:

1 Southwest_English 2.7
2 South_Dutch 3.98
3 Southeast_English 4.33
4 Irish 6.23
5 West_German 6.25

Okay, I'm SE English, not SW English, but pretty impressive again.

Using the oracle 4 for three population admixture, what mix comes closest to my auDNA?

50% Southwest_English +25% Spanish_Castilla_Y_Leon +25% West_Norwegian @ 1.080952

That Iberian back again!

Top five mix ups of populations closest to me?

1 Southwest_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.080952
2 Irish + North_Dutch + Southwest_English + Spanish_Galicia @ 1.111268
3 North_Dutch + Southwest_English + Spanish_Galicia + West_Scottish @ 1.282744
4 Southeast_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.295819
5 North_Dutch + North_Dutch + Southwest_English + Spanish_Castilla_Y_Leon @ 1.304939

I can't help preferring the K13 results to the EU test V2 K15 - simply because it matches me to their SE English reference, rather than to their SW English one.

Conclusions

If anyone ever bothers reading this far too lengthy post, I hope that I have imparted the following lessons:

  • Don't expect DNA Ancestry tests to pinpoint an actual country of ancestry.  They're nowhere near that good yet.  The populations of West Eurasia, and elsewhere, are actually all mixed up, or share a lot of recent admixture.  In addition, many European nation-states are quite recent inventions.  I've seen the borders of Europe change in my short lifetime.
  • Don't expect precision.  If for example, you are an American, and a 23andMe AC test suggests only 32% British & Irish, then you could actually have 100% English ancestry over the past 300 years!  We're so mixed up, that these tests are struggling to part and identify us by nationality.
  • If you are willing to share your raw data (there are privacy issues), then have fun trying out all of these third party calculators.  It's a lot of fun as you can see.  They rarely agree.  There are other tools on GEDmatch for example, where you can compare DNA along with .gedcom genealogical files with other users - and look for shared segments on the chromosomes.  You can also compare your DNA to that of ancient populations.
  • Treat au-DNA differently to haplogroup results.  au-DNA is very interesting, and represents so much of our ancestry, if we could just sort some of the mess out.  You can partially do this by phasing your results with those of close relatives.  It is worthwhile phasing with at least one biological parent, if you can.  However, haplogroup results provide, through their mutations, incredible stories over much longer periods - thousands of years.  A different kind of genealogy.  As we gather more data, and reference it also to ancient-DNA, so it will tell us more and more about two lines of descent.  Perhaps even into historical times.