Thoughts on understanding ancestry DNA

Above image.  My Global 10 Genetic Map coordinates:  PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10 ,0.019,0.0272,0.0002,-0.0275,-0.0055,0.0242,0.0241,-0.0033,-0.0029,0.0015.  The cross marks my position on a genetic map by David Wesolowski, of the Eurogenes Blog

The above map shows genetic distances between different human populations around the planet.  Look how tightly the Europeans cluster.  Razib Khan recently blogged on just this subject.  The fact of the matter is that the greatest diversity exists between populations outside of Europe, particularly within Africa, and between African and non-African populations.  However, we obsess over tiny differences within European populations, when in truth, most Western Eurasians are very closely related.  We share ancient ancestry from slightly varied mixes of only three base ancestral groups, with the last layer arriving only 4,300 years ago.  This obsession in the market drives direct-to-consumer DNA businesses to largely ignore non-European diversity, and to focus too closely on differences that blur into each other.

The above image is from a 2016 CARTA lecture by Johannes Krause of the Max Planck Institute.  It shows the three currently known founder populations of Europeans and their average percentages.

However, at the same time the new Living DNA service seeks to zoom in closer on British populations, attempting to detect ancestry percentages from such tiny zones as "East Anglia".  They appear to be having a level of success with it as well, although that blurriness, that overlap and closeness of populations in Europe causes problems.  Germans are given false percentages of British, some Scottish appear as Northern Irish, and the Irish dilute into false British areas.  However, I've seen enough results now to suggest that it is far from genetic astrology.  They get it correct to a certain level, particularly for us with English ancestry.  Ancestry DNA customers expect perfection.  I don't think that we will ever get that from such closely related populations at this resolution, but it does provide a new genealogical tool that can point us into some revealing directions.

Above image.  My Living DNA Map.  Based on my recorded genealogy, I estimate 77% to 85% East Anglian ancestry over the past 250 years or so.  Living DNA at Standard Mode gave me 39%.  I'm impressed that a DNA test can recognise, even at roughly a 50% success rate, my recent ancestry in such a tiny zone of the planet.  I have doubts though that this sort of test will ever be free of errors.  The safest DNA test for ancestry is still one that is based on more distinct populations, and outside of Africa, that can be as wide as "European".  23andMe for example in their "Standard Mode" (75% confidence), assign me 97.3% European, and 0.3% Unassigned.  That is a pretty safe result.

Autosomal DNA tests for ancestry, particularly for West Eurasian (European and Western Asian) descendants, are not reliable at high resolution.  If you want to get really local, then sure - do it.  However, only use the results as an indication, not as a truth.  Populations in Western Eurasia are closely related, and share recent common descent.  There has been a high degree of mobility and admixture ever since.  Some modern populations tested do not have a high level of deep rooted local ancestry in that region.  They overlap with each other.  Keep researching and meander through different perspectives of what your older pre-recorded ancestry could have been.

Above image by Anthrogenica board member Tolan.  Based on 23andMe AC results.  My results skew away from British, and towards North French.  He generated this map, plotting myself (marked as Norfolk in red), and my Normand Ancestral DNA twin Helge in yellow.  My results fall in the overlap with French.  Helge is Normand but in AC appears more British than myself.  I am East Anglian yet in this test appear more French than he does.



My Global 10 Genetic Map - and Frenchness!

My Global 10 Genetic Map coordinates:  PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10 ,0.019,0.0272,0.0002,-0.0275,-0.0055,0.0242,0.0241,-0.0033,-0.0029,0.0015

This is my position on the latest genetic map by David Wesolowski, of the Eurogenes Blog.  One point of interest that has been picked up on the Anthrogenica Forums, is my consistent closeness in ancestral results, to a Normand member!  Our Basal-rich K7 results were almost identical.  On 23andMe Ancestry Composition (spec mode), I just get a bit more French & German, while he gets just a bit more British & Irish.  We are close!
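For anyone curious how "closeness" on a map like this could be put into a number, here is a minimal sketch that measures the straight-line (Euclidean) distance between two sets of Global 10 coordinates.  The first set is my own, as listed above.  The second set is entirely made up for illustration, and treating all ten components with equal weight is my assumption, not necessarily how the map's author compares samples.

```python
import math

# My Global 10 coordinates (PC1..PC10), copied from the post above.
MY_COORDS = [0.019, 0.0272, 0.0002, -0.0275, -0.0055,
             0.0242, 0.0241, -0.0033, -0.0029, 0.0015]

# HYPOTHETICAL coordinates for a second tester, invented for illustration only.
OTHER_COORDS = [0.020, 0.0265, 0.0005, -0.0270, -0.0050,
                0.0238, 0.0245, -0.0030, -0.0031, 0.0012]


def pc_distance(a, b):
    """Euclidean distance between two points in principal-component space."""
    if len(a) != len(b):
        raise ValueError("coordinate lists must have the same length")
    return math.dist(a, b)


# A smaller number means the two testers plot closer together.
print(f"{pc_distance(MY_COORDS, OTHER_COORDS):.4f}")
```

A distance like this only means something relative to other distances on the same map, for example when compared with the spread of a whole reference population.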

Another forum member argued though that it's my results that are skewed away from British, and towards North French.  He generated this map, plotting myself (marked as Norfolk in red), and my Norman Ancestral DNA twin Helge in yellow:

I had to point out though, that I've rarely seen other SE English with a record of local ancestry, test - and that the red circles representing British & Irish include many people with some Irish, Scottish, Welsh, Western, or Northern ancestry.  The map suggests a pull to Northern France, Belgium, and the Netherlands.

As I commented towards the end of my last post, I initially expected a pull to Denmark, Northern Germany, and perhaps to the Netherlands.  This is because so many of my 17th-20th century ancestors lived on what was the frontier of Anglo-Saxon and Danish immigration during the 4th to 11th centuries.

But instead, autosomal DNA tests for ancestry all seem to be suggesting more shared ancestry from a more southerly direction - Northern France and Belgium particularly.  Although there has so far been a dearth of local testers from local families, the POBI survey seems to find this common among the English.  We appear to be a halfway house between Old British, and the French, more than the ancestors of Anglo-Saxons and Danes.  This contradicts the historical and archaeological records.  POBI suggested that this was due to waves of unrecorded immigration from the South during late prehistory.  Others have pointed the finger at Norman and French admixture in medieval Southern Britain.  It could be both!

Can I apply for French citizenship?

The Other SK1414. My Cousin in Baluchistan

By Baluchistan on Flickr under a Creative Commons Licence. No, this young man is not the SK1414 tester, but the mandolinist in me found this photo kind of cool.  A young man from Makran.  The other SK1414 tester was also a male Makrani Baloch.

I'm hot on the trail of my Y or paternal line, following my FTDNA Y111 STR, then Big Y tests.  These tests analysed the DNA on my Y chromosome.  It is passed down strictly from father to biological sons.  The mutations (SNP and STR) that can be identified in the Y-DNA can be used to assess relationship, and in some cases, to date the time of most recent common ancestry.  So, with the assistance of Gareth Henson, administrator of the FT-DNA Y haplogroup L Project, and with help from my new distant cousins, what have I learned over the past few weeks?

The Smoking Gun of Y-DNA

Between 45,000 and 13,000 years ago, my paternal ancestors most likely were hunter-gatherers, that lived in the region of what is now Iran and Iraq, during the last Ice Age.  Some sharp changes in glaciation, and cold extremes towards the end of that period, may have generated a number of adaptations, and subsequently, split new sub clades of my Y haplogroup L.

13,000 years ago (based on the Big Y test), I share a common paternal great x grandfather with a number of distant cousins, that descend from Pontic Greek families from the Trabzon region in Turkey.

Between 3,000 and 1,000 years ago (based on the less accurate STR evidence at 111 markers), I share a common paternal great x grandfather with another cousin, whose paternal line, Habibi, can be traced back to the 1850s in the town of Birjand, Southern Khorasan, Eastern Iran, close to the modern Afghanistan border.  This closer cousin now lives in Australia.

Human male karyotype high resolution - Y chromosome

My Big Y test produced no fewer than 90 previously unrecorded or unknown SNP (pronounced "snip") mutations.  That might be because my Y-DNA is rare, and / or because it is mainly found in parts of the World where very few people test at this level.  The last SNP on the roll that had been seen before has been called SK1414.  Because two of us have now tested positive for this SNP, it is my terminal SNP, so at the moment (although it still has to be submitted to the YFull Tree), I can declare my Y haplogroup sub clade designation to be L-SK1414.  I am only one of two so far recorded in the World.

So, who is this Y cousin that shares my SK1414 mutation?

My Baluchistan Cousin

By Baluchistan on Flickr under a Creative Commons Licence.  Another photo from Makran, Balochistan.

The other SK1414 turned up during an early survey, back in the early 2000s by the Human Genome Diversity Project.  It turned up in a sample of the Baluchi in Makran, South-west Pakistan.  Could this cousin be closer than the Habibi tester?  Could my Habibi cousin, from an eastern Iranian family also carry SK1414?

The Baluch are an Iranic people that speak Baluchi, an Iranian language that belongs, as do most European languages, to the Indo-European linguistic family.  According to the Iran Chamber Society website, they moved to Makran during the 12th Century AD.  Traditionally the Baluch claim that they originated in Syria, but a linguistic study has instead suggested that they actually originated from the south east of the Caspian region, and that they moved south-eastwards between the 6th and 12th centuries AD in a series of waves.  No other examples of Y sub clade L1b (L-M317) have been found in Southern Asia outside of two samples in this survey, so perhaps the tester did have ancestry from Western Asia.

Iran regions map fr

It would seem likely that I do have a number of Y cousins, most likely in the region of Eastern Iran and South-Western Pakistan.  It doesn't necessarily follow though, that our most recent common Y ancestors lived there.  As I said above, the Baluch of Makran, Pakistan are said to have migrated from further north-west, from the Caspian Sea region.

There is a tentative suggestion of a link to the Parsi. A Portuguese STR tester with a genetic distance (based on 67 markers) of 22, has (thanks again Gareth) "a distinctive value of 10 at DYS393. In the Qamar paper this value is found in the Parsi population".  So there is just the possibility also, of the Parsi ethnicity carrying L1b from Western Asia into Southern Asia.  Perhaps this marker was picked up by a Portuguese seafarer link to Southern Asia.  It could even be the link to my English line, via the Anglo-Portuguese Alliance.  A lot of speculation.  I don't think that M317 has been found yet in India.
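For readers wondering what a "genetic distance" between two STR testers is, here is a rough sketch.  It uses a simple stepwise count: for every marker that both testers have, add up how many repeats apart they are.  The marker names are real Y-STR loci, but both sets of values are invented for illustration, and testing companies may apply their own rules to some markers, so treat this as an approximation rather than any company's actual method.

```python
# HYPOTHETICAL haplotypes: real marker names, invented repeat counts.
TESTER_A = {"DYS393": 10, "DYS390": 23, "DYS19": 14, "DYS391": 10, "DYS388": 12}
TESTER_B = {"DYS393": 12, "DYS390": 23, "DYS19": 15, "DYS391": 10, "DYS388": 12}


def genetic_distance(a, b):
    """Return (distance, markers_compared) using a simple stepwise count.

    Only markers present in both haplotypes are compared; each marker
    contributes the absolute difference in repeat counts.
    """
    shared = a.keys() & b.keys()
    distance = sum(abs(a[marker] - b[marker]) for marker in shared)
    return distance, len(shared)


distance, compared = genetic_distance(TESTER_A, TESTER_B)
print(f"genetic distance {distance} over {compared} markers")
```

The more markers compared, the more a given distance tells you, which is why a match on 12 markers says far less than one on 67 or 111.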

Into England

I have found STR links with four people that carry the surname Chandler.  They live in England, Australia, and the USA.  These cousins appear to descend from a Thomas Chandler, that lived in Basingstoke during the 1740s.  That is 32 miles away from my own contemporary surname ancestor, John Brooker, who lived at the same time at the village of Long Wittenham in the Thames Valley.

Unfortunately three of the Chandlers have only 12 markers tested, and the fourth has 37.  Therefore the estimated time to the most recent common ancestor is not accurate, but it looks as though the Chandler and Brooker Y hg L testers of Southern England most likely shared a common paternal great x grandfather sometime between 800 and 350 years ago.

That only these two lines have turned up, and that they are geographically and genetically so close, might suggest that our Y-DNA lineage arrived in Southern England around the late medieval period, perhaps between the 13th and 17th centuries AD.  It could just be through a Portuguese navigator link, or it could be through thousands of other routes.  More L-M20 testers could turn up in England in the future, which could push the arrival to an earlier date.

Today

I could have any number of cousins from south England.  The Brookers and Chandlers may well have other paternal line descendants living in the Thames Valley, Hampshire, London, or elsewhere.  I'd love to prove a Brooker from the Berkshire / Oxfordshire area, as sharing ancestry.  I believe for example, that the journalist Charlie Brooker descends from one of the Thames Valley families, although not necessarily from mine.  Do they carry the Y hg L?

My great great grandfather Henry Brooker, did not appear to have any more sons, other than my great grandfather John Henry Brooker - who in turn, only had one son, my grandfather Reginald John Brooker.

I have one Y haplogroup first cousin.  He has I believe, a son, and a grandson.

Ancestry and DNA Tests

I'm writing this post in response to a number of comments that I see online with regards to using a commercial DNA test, in order to ascertain ancestry.  Quite often, when someone asks how to find out their family history or ancestry, someone will come back with an answer in the form of "just spit in a vial, send it to Ancestry.com, and they'll tell you".  It's not really that simple, so I'm making this post, to explain how an ancestry DNA test can help, or not help, you discover your ancestry.  Nicely dumbed down I hope, for the beginner.

Traditional Genealogy

Traditional genealogists usually set out to create a genealogy (family history and tree), using interview techniques, artefacts, and oral memories, recorded from older relatives.  Artefacts might for example, include old family medals, or photographs.  They then extend the research, through documentary evidence, such as birth, death, and marriage certificates, church registers, census records, transcripts, electoral rolls, and military records.  If they are interested in recording all ancestral information, and not merely a single line such as the surname line, then this research can go on for months, years, even decades.

What you cannot do, is to simply pay a small fee, and your entire family history drops through the letter box in a brown envelope.  It takes years of time to research, collate, and to verify a good family tree.  Most genealogy enthusiasts don't mind this, because they actually enjoy doing the research itself.  It becomes a hobby, even sometimes a passion.

However, a number of commercial DNA companies may give the general public the impression, that you now can simply pay a fee, spit or swab, and your ancestry magically appears for you on a website.  It's big business.  Does it work though?  Exactly what is genetic genealogy?

What is Ancestry and why do we care.

Ancestry can simply be defined as our descent from forebears.  Why do we care who they were?  Which forebears or ancestors?  How many are there?  How far back?

Of course, not everyone does care.  Not everyone cares about history.  But for others, it does matter to how we define ourselves, our communities, and families.  It tells us who we are, where we came from.  It defines us, gives us grounding.  It gives us identity.  Wars have often been inspired by ancestry.  At the same time, a deeper appreciation of the human family, and its common ancestry, can be used to relate to those elsewhere.  One big family.  Discovering the immense poverty and hardships of our ancestors can help us to appreciate what we have, and to help others in need today.

So what ancestry can we discover?  For those few that merely concentrate on one patriarchal line, it's quite simple to define - the generations of a surname.  However, beyond that one narrow line of descent, few appreciate exactly how much total ancestry we have.  Let's look at our biological ancestors at each generation:

  • 2 parents
  • 4 grandparents
  • 8 great grandparents
  • 16 g.g grandparents
  • 32 g.g.g grandparents
  • 64 g.g.g.g grandparents
  • 128 g.g.g.g.g grandparents
  • 256 g.g.g.g.g.g grandparents.

These are only your 510 most recent direct ancestors, yet just those generations will take you back only around 250 years of family history.  Now add all of the recorded children of these direct ancestors - the great great uncles and aunts - to the theoretical family tree.  You're probably going to have a tree of around 1,300 individuals.  That is just for 250 years.  You have a big family.  Go back a few more generations, and it will explode before you reach far.  All of those direct ancestors though, are a part of your ancestry.  You'll most likely carry some DNA from most of them.  They are, from a biological perspective, who you are.
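If you would like to check the arithmetic, here is a small sketch that doubles the ancestors at each generation and adds them up.  The figure of 30 years per generation is my own round-number assumption, used only to show roughly how far back eight generations reaches.

```python
# Assumed average generation length in years (a round number, not a measured value).
YEARS_PER_GENERATION = 30


def ancestors_at(generation):
    """Number of ancestor slots at a given generation back (1 = parents)."""
    return 2 ** generation


def total_ancestors(generations):
    """Total ancestor slots from parents back to the given generation."""
    return sum(ancestors_at(g) for g in range(1, generations + 1))


for g in range(1, 9):
    print(f"generation {g}: {ancestors_at(g)} ancestors, "
          f"roughly {g * YEARS_PER_GENERATION} years back")

print("total:", total_ancestors(8))  # total: 510
```

These are slots in the tree rather than guaranteed distinct people, because cousins marrying each other will fill more than one slot with the same ancestor.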

By the way, the number of biological ancestors will not continue to increase infinitely, because increasingly, you will find couples within your tree that are distant biological cousins of each other.  As this accelerates through thousands of years, it explains how all modern people around the world descend from a very small population around 100,000 years ago.

So before considering what DNA can do for genealogy, we need to consider which ancestors matter to us.  Do you just want to know who your biological parents, or grandparents were?  Do you want to know the names, places and social positions of your ancestors over centuries?  Do you want to know which parts of the world that your ancestors lived 500 years ago?  Do you want to know how some of your prehistoric ancestors moved across the globe, thousands of years ago?  Maybe you want to know everything.

Let's now turn to genetics for genealogy, and how DNA tests can answer some of these questions.

There are two main types of DNA tests for ancestry, although they are often incorporated together by commercial companies:

  1. The haplogroups, the Y-DNA and mt-DNA
  2. Autosomal DNA

The Haplogroups

The haplogroups are chains, or markers, that are carried on one of only two strict lines of descent.  They do not apply to your entire ancestry - just two lines.  As we saw above, we have 256 ancestors at just the eighth generation back (unless any of their descendants reproduced together).  Our haplogroups came from only two of them.  Your haplogroup does not define you.  Yet, it's quite odd, because very quickly, many genetic genealogists do relate to them, rather like a hereditary football club.  They do become an identity, but only if you enthuse over them.

The Y or paternal haplogroup, follows the strict paternal line.  From father to son.  Women do not have a Y chromosome, so cannot pass it on.  It has to come from the biological father.  However, within this constraint, Y-DNA is particularly useful to genealogists.  It mutates often, both as STRs and less often, as SNPs (snips).  Because of these frequent mutations, it is useful for tracing shared descent with others.  It can also be aligned with surname studies.  The champion commercial DNA company for Y-DNA research, is Family Tree DNA.

The mt or mitochondrial (maternal) haplogroup, follows the strict maternal line.  From mother to children.  Both sons and daughters inherit their mt-DNA haplogroup from their biological mother.  However, only the daughters can pass it down.  There are two drawbacks to mt-DNA for genealogy.  1) The surname frequently changes, traditionally nearly every generation through marriage.  2) It doesn't mutate as frequently as the STRs of Y-DNA.  It is still a useful tool, and can prove descent through the maternal line.  It can also still be used for studies of much deeper, ancient ancestry.

Autosomal DNA

This is the bulk of your DNA.  All of the snips (SNPs), that make up who you are genetically.  You receive approximately 50% from each parent, 25% from each grandparent, 12.5% from each great grandparent.  This subdivision cannot go on forever, and indeed, once you go back much more than six generations, the approximations start to deviate, so that you may have no snips at all from a particular line that joined your family tree over 250 years ago.
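The halving is easy to sketch.  The figures below are only the expected averages; because of random recombination, the actual share from any one distant ancestor can be higher, lower, or nothing at all.

```python
def expected_share(generations_back):
    """Expected (average) fraction of autosomal DNA from one ancestor.

    generations_back: 1 for a parent, 2 for a grandparent, and so on.
    """
    return 0.5 ** generations_back


for g in range(1, 9):
    print(f"generation {g}: about {expected_share(g) * 100:.4f}% per ancestor")
```

By the eighth generation back, the expected share from each of the 256 ancestors is under half of one percent, which is why some of them can drop out of your DNA entirely.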

The problem with autosomal DNA is that it can be such a mess.  It recombines randomly with every generation.  Therefore, it is much harder to track ancestry in the same way, that we can with the haplogroups.

So how can they be applied to genealogy?

Biological descent

Not everyone knows who their biological parents were, or where they came from.  This is the first use of DNA testing.  It can be used to find, test, or prove recent descent.  The first hurdle of genealogy.  Both haplogroup evidence, and autosomal evidence can be used to prove or determine relationship.

Cousins

Many genetic genealogists, use DNA to find distant, and sometimes not so distant cousins.  The hope is that they can link trees, share knowledge and research, perhaps copies of artefacts.  Therefore an awful lot of genetic genealogy is about tracing genetic relatives, and establishing common ancestry.

There are two main tools:

  • Haplogroup Projects.  The Y haplogroup is favoured for its frequent STRs, and also for its link to surnames.  Family Tree DNA projects track the STR and SNP data of their members, tracking families, relationship, known mutations.  Project administrators at FTDNA can predict relationship to other members in the project.  Your Y cousins.
  • Shared segments.  Autosomal DNA can be used for finding distant cousins.  23andMe for example, have Relative Finder.  Alternatively customers of any commercial DNA company that gives them access to their raw data, can upload that data to GEDmatch.  At GEDmatch, they can search for other kits, looking for lengths of shared segments (measured in cM - centimorgans) on the autosomes or X chromosomes.  Longer or more numerous shared segments indicate closer shared ancestry.
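To give a feel for how shared segments are weighed up, here is a minimal sketch that takes a list of matching segments, ignores the very short ones, and reports the total and the longest.  The segment data is invented for illustration, and the 7 cM cut-off is my own assumption of a sensible threshold, not a rule taken from any particular service.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    chromosome: str   # "1" to "22", or "X"
    length_cm: float  # segment length in centimorgans


# Assumed cut-off: shorter segments are more likely to be chance matches.
MIN_SEGMENT_CM = 7.0

# HYPOTHETICAL match between two kits, invented for illustration only.
HYPOTHETICAL_MATCH = [
    Segment("3", 12.4),
    Segment("7", 8.1),
    Segment("11", 3.2),
    Segment("X", 9.5),
]


def summarise(segments, min_cm=MIN_SEGMENT_CM):
    """Summarise the shared segments at or above the minimum length."""
    kept = [s for s in segments if s.length_cm >= min_cm]
    return {
        "segments": len(kept),
        "total_cm": sum(s.length_cm for s in kept),
        "longest_cm": max((s.length_cm for s in kept), default=0.0),
    }


print(summarise(HYPOTHETICAL_MATCH))
```

A large total with one or two long segments points to a closer relative than the same total scattered across many short segments.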

It is important to understand, that this is not about directly tracing ancestry.  It is only about establishing shared biological ancestry, with other researchers, with which you may be able to share resources.  In the old days of genealogy, we would find distantly related researchers by browsing through annually printed surname interest directories.  Here, the same thing is happening, but we are finding people by comparing DNA.

Ancestry from Autosomes

Most commercial DNA companies providing ancestry information now use their own proprietary calculators to look at the autosomal DNA of their customers for patterns that they can relate to a number of reference populations.  23andMe for example, uses Ancestry Composition to estimate which parts of the world the ancestors of their customers lived in 500 years ago.  They express this prediction as percentages of ancestry.

However, it is very much a developing art.  The problem is that genes have been randomly mixing and moving around ever since prehistory.  The customers of these DNA companies want hard facts.  They want their ancestry accurately pinpointed down to modern or ancient nation-states, or to historical populations such as the Vikings or Huns.  Ancestral DNA companies are under pressure to provide this deep ancestry.  However, can they?  Ancestral analysis of DNA can be very enlightening.  It can provide some surprises within a family history.  However, its accuracy is exaggerated.  It is increasingly successful at predicting ancestry from a particular corner or end of a particular continent.  But it cannot for example, tell French, British, and German ancestry apart with any high accuracy.  It can recognise some populations better than others.  It cannot tell anyone if they had Viking ancestry.

Ancient Ancestry

This is a particular value of the haplogroups.  As we accumulate more and more data on more mutations, as we expand the recorded database, and as we relate that to more ancient DNA extracted from referenced and dated ancient human remains, so we will be able to better explore the population genetics not only in history, but deep into prehistory.

However, it is also becoming increasingly realised, that patterns of ancient admixture can also be detected within the autosomes.  Although autosomal DNA ancestry calculators claim to reveal relatively recent admixtures over the past 500 years, it is becoming clear that these are being confused by much older patterns of admixture.  These patterns can now be explored and probed on a number of GEDmatch programs.  People can compare their DNA with the kits from ancient DNA, or predict just how much of their ancestry was likely "Western Hunter-Gatherer", or "Early Neolithic Farmer".

In addition, more DNA companies are now measuring for much more ancient admixture with archaic populations such as the Neanderthals.

Conclusion

Genetic Genealogy is fun, great fun.  It is not however, a quick and easy replacement for traditional genealogy.  Unless you get lucky with some comparative Y-DNA in a project, it is not going to directly tell you the names or social status of any ancestors.  It can give you a phylogenetic tree, but not any kind of family tree that you can bore other family members with.

Genetic genealogy can provide some tools to some researchers.  It can test biological relationship.  It can be used to predict some of your ancient history.  For most researchers, particularly those that are able to interview many local family members, search local graveyards, access archives and records - it has no, or little value to the pursuit of collecting ancestors.

I personally love to explore my genetic genealogy. But it is documentary research that provides the names.  Genetic genealogy for myself, is more about the long and ancient journey.