Comparing results from actual recorded ancestry, to that predicted by Ancestry.com, 23andMe, My Heritage, Living DNA, FT-DNA and more.
I have taken several DNA tests for ancestry, including those provided by the FT-DNA, 23andMe, and Living DNA companies. Unusual for a tester, I am actually of a single population, very local, well documented ancestry here in East Anglia, South-East England. I'm not someone in the Americas or Australia, that might have very little clue what parts of the world that their ancestors lived in, previous to immigration. I know my roots, I'm lucky. I live them. You might ask, why did I feel the need to test DNA for ancestry? The answer is, curiosity, to test the documented evidence, fill the gaps, look for surprises, and in particular, to understand the longer term, to reach further back into my ancestry.
I have though, become a bit of a skeptic, even a critic, of autosomal DNA (auDNA) tests for ancestry. They are the tests presented by the businesses in results called something like Ancestry, Family Ancestry, Origins, Family, Composition, etc. Instead of testing the haplogroups on either the direct paternal (Y-DNA) or direct maternal (mtDNA) line, these tests scan the autosomal and X chromosomes. That's good, because that is where all of the real business is, what makes you an individual. However, it is subject to a phenomenon that we call genetic recombination (the X chromosome is a little more complicated). This means that in every generation, a child inherits circa 50% of each parent's DNA, and which 50% is random. I said random. Each generation, that randomness chops the inherited segments smaller, and moves them around. After about seven or eight generations, the chances of inheriting any DNA from any particular ancestral line quickly diminish. It becomes washed out by genetic recombination.
Therefore, not only are the autosomes subject to randomness and genetic recombination - they are useful for assessing family admixture only over the past three hundred years or so. There is arguably, DNA that has been shared between populations much further back, that we call background population admixture. It survived, because it entered many lines, for many families, following for example, a major ancient migration event. If this phenomenon is accepted - it can only cause more problems and confusion, because it can fool results into suggesting more recent family admixture - e.g. that a great grandparent in an American family must have been Scandinavian, when in fact many Scandinavians may have settled another part of Europe, and admixed with that ancestral population, more than one thousand years ago.
DNA businesses compare segments of auDNA, against those in a number of modern day reference populations or data sets from around the world. They look for what segments are similar to these World populations, and then try to project what percentage of your DNA is shared with, or similar to, these other populations. Therefore:
It is far truer to say that your auDNA test results reflect shared DNA with modern population data sets, rather than to claim descent from them. For example, 10% Finnish simply means that you appear to share similar DNA with a number of people that were hopefully sampled in Finland (and who hopefully did not just claim Finnish ancestry) - not that 10% of your ancestors came from Finland. That is, for the above reasons, presumptuous. It might indeed suggest some Finnish ancestry, but this is where many people go wrong: it does not prove ancestry from anywhere.
This is my main quibble. So many testers take their autosomal (for Family/Ancestry) DNA test results to be infallible truths. They are NOT. White papers do not make a test and analysis system perfect and proven as accurate. Regarding something as Science does not make it unquestionable - quite the opposite. The fact of the matter is, if you test with different companies, different siblings, add phasing, you receive different ancestry results. Therefore which result is true and unquestionable?
So what use is DNA testing for ancestry? Actually, I would say, lots of use. If you take the results with a pinch of salt, test with different companies, then it can help point you in a direction. Never however take autosomal results as infallible. Critical is to test with companies with well thought out, high quality reference data sets. Also to test with companies that intend to progress and improve their analysis and your results.
For DNA relative matching, then sure, go with the companies that have the best matching system, the largest match (contactable customer) databases, and custom in the regions of the world that you hope to match with. There is also GEDmatch. Personally, I find it thrilling when I match through DNA, but in truth, I had more genealogical success back in the days when genealogists posted their surname interests in printed magazines and directories.
The results of each ancestry test should be taken as a clue. Look at the results of testers with more proven documented and known genealogies. Learn to recognise what might be population background, as opposed to recent admixture in a family. Investigate haplogroup DNA - it has a relative truth, although over a much longer time, and wider area. Just be aware that your haplogroup/s represent only one or two lines of descent - your ancestry over the past few thousand years may not be well represented by a haplogroup. Investigate everything. Enjoy the journey. Explore World History.
First a disclaimer. I'm very new to the whole world of genetic genealogy. I'm not new however, to traditional genealogy, and I do have a pretty good amateur understanding of relevant archaeological and anthropological discussions over the past fifty years. The following is not meant as a critique of genetic genealogy, so much as a review, or my experience, of ancestry composition based on autosomal DNA analysis.
Let's start with my paper trail.
I am English by ethnicity, British by nationality, and a subject of Queen Elizabeth II (often now referred to as a UK Citizen).
My paper recorded ancestry consists of the genealogical records of:
All 181 ancestors, reaching back to the 1690's, appear to be English born, of English ethnicity, with English surnames. The majority of them (100% on my mother's side, and 81% on my father's side) were East Anglian, with the vast majority of that percentage being born in the county of Norfolk. Religions recorded or indicated were CofE Anglican or non-conformist Christian. No sign of any Catholicism, Islam, or Judaism.
Therefore it would look pretty likely, that I can claim English heritage, wouldn't you agree?
There are three aspects or avenues of inquiry, available for genetic genealogy. First of all, the two sex haplogroups; the y-DNA, and the mt-DNA. These two "signals" are referred to as haplogroups.
Autosomal DNA is what makes us individuals, gives us our hereditary traits. It is passed down from many ancestors, via our parents. However, the sex haplogroups are of interest because they can be traced across the globe, and the millennia. As we gain more and more data - both from living populations, and ancient DNA from archaeological finds, so we will be able to track the STR and SNP mutation data more precisely.
However, what about poor old messed up autosomal DNA? It represents our entire biological heritage over many generations. It is what we are. However, making sense of it is less easy, less precise. Genetic genealogists are making progress, but it is far less of a precise science than either of the haplogroups. They use calculators that measure the segments of DNA across the chromosomes, looking for patterns that they recognise from a number of known reference populations. From that, these calculators predict an ancestry. Exactly what and when that ancestry refers to, does seem to vary from one calculator to another. There is an argument that the precision can be improved if you also test close known relatives including at least one parent. The results can then be phased. I'm actually waiting for the results for my mother, so that I can see my own au-DNA ancestry results phased and corrected.
So let's have a bit of fun, and see what some of the calculators suggest for my autosomal DNA, at least before any phasing with my mother's DNA. What do they make of my 100% English paper ancestry?
99.9% European.
Broken into:
83% NW European
17% Broadly (unassigned) European
I think that's pretty cool. As I'm getting to know au-DNA predictions, I'm learning to appreciate it when they get the right continent, and the right corner of that continent. That is more than they could do a decade or two ago. The prediction is correct, I am a NW European. I'm not a West African, a South Asian, or an East Siberian.
100% European
Broken into:
94% NW European
3% S European
3% Broadly (unassigned) European.
Whoa, where did that South European come from? It could just be a stray incorrectly identified signal, or it could be telling me that one of my ancestors, maybe around Generation 6, was from down south! Let's break down the prediction further. First, the NW European:
32% British & Irish
27% French & German
7% Scandinavian
But surely I should be 100% British & Irish? Not only 32%. I have my own ideas about this. I think that although 23andMe claims that Ancestry Composition only represents the ancestry of the past 300 to 500 years (the so-called migration period, as sold to USA customers), that it gets confused by earlier migrations across their reference populations, including those during the early medieval period, and perhaps even some of those during late prehistory. I've noticed that across Ireland and Britain, the further to the east, the more diluted the 23andMe British & Irish assignment. People of solid Irish ancestry get between 85% and 98% British & Irish. My East Anglian results, mixed between British & Irish, French & German, and Scandinavian, are actually rather more like those received by Dutch customers of 23andMe.
As for that Southern European prediction, how does that break down?
0.5% Iberian
2.4% Broadly (unassigned) South European.
Which if taken seriously, might suggest that I have an unknown Spanish or Portuguese ancestor around Generation 6. If I did take it seriously that is. I wonder what my mother's test will reveal?
This is a third party site, that you can upload your 23andMe V4 raw data to, and see what their calculators predict for your ancestry. It has recently had its ancestry composition revised. What did that make of my 100% English au-DNA?
West Eurasian 100%.
I like that designation, the amateur anthropologist in me prefers that broad designation over "European". Broken down:
77% North/Central European
19% South European
2.4% Finnish
1.3% unassigned.
What? Why not 100% North/Central European? Finnish? Did some early medieval Scandinavian settlers of East Anglia bring it? Or is it a false signal? Misidentified au-DNA?
That darned South European kicked in again. I'm here looking at a biological cuckoo NPE (non-parental event) at around Generation 5 or even more recent! Did a great grandmother secretly have a South European lover? But this South European breaks down further:
13% Balkan
6% Italian.
Oh my goodness, whereas 23andMe speculative mode suggested SW Europe - this one suggests SE Europe! Do I have a secret Albanian great grandfather? Or is it all nonsense?
This is a cracking new third party DNA analyser. It is based in China, and its predictors appear to calculate mainly for a Chinese market. It not only predicts your ancestry composition, but also your two sex haplogroups, and lots of traits and health predictions to complement those of 23andMe. It even tries to predict your genetic disposition to sexuality!
It will allow you to send your 23andMe V4 raw data direct to its own calculators. However, at the moment the website is almost entirely in Chinese (Mandarin?). There are two options. 1) At the bottom of the webpages is a hyperlink to English, which gives, in English, a basic ancestry composition, and your haplogroups. It does not include English versions of the health and trait results. 2) Use an online translator, such as the one built into the Google Chrome browser. It actually serves pretty well.
On sex haplogroups they give my Y-DNA as
L1. Not bad, but they didn't make it to L1b or L-M317.
My mtDNA?
H6a1a8. Very good. Better than 23andMe's H6a1, and the same as the mthap program.
But this is about au-DNA, how did they do, what did they make of my 100% English ancestry?
81% French
19% English/Briton
Now, that sounds pretty awful, but on closer inspection, I'm impressed. No South European great grandfather. Okay, so most of my DNA has been placed on the wrong side of the Channel. However, I know that French and English DNA is actually very close. Recent surveys even suggest that the English have inherited a lot of common ancestry with the French during unknown migration late in prehistory. So again - they very much got the right corner of the right Continent. Well done WeGene.
GEDmatch is a website that you can upload raw data not only from 23andMe, but from a range of testers, and from V3 chips as well as V4. It hosts a number of tools and predictors - some Open Source. Some of these predictors are for Admixture or ancestry composition. They measure your ancestry in terms of distance from known reference populations. The lower the number, the closer you are to their reference. They use calculators known as oracles to predict ancestry, including mixed ancestry or admixture.
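To illustrate the "lower number means closer" idea, here is a minimal Python sketch. It assumes that the distance is a plain Euclidean distance between admixture percentages, which is my understanding of how such oracles are usually described, not a confirmed description of GEDmatch's code. The population names and every number below are made up for the example.

```python
import math

# Hypothetical reference populations, each described by percentages over
# three made-up admixture components. None of these values are real data.
REFERENCES = {
    "Population_A": [50.0, 30.0, 20.0],
    "Population_B": [45.0, 35.0, 20.0],
    "Population_C": [20.0, 30.0, 50.0],
}


def distance(a, b):
    """Euclidean distance between two admixture profiles (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_references(sample, references):
    """Return (name, distance) pairs sorted from closest to furthest."""
    scored = [(name, distance(sample, comps)) for name, comps in references.items()]
    return sorted(scored, key=lambda pair: pair[1])


# A hypothetical tester's profile over the same three components.
sample = [47.0, 33.0, 20.0]

if __name__ == "__main__":
    for rank, (name, dist) in enumerate(rank_references(sample, REFERENCES), start=1):
        print(f"{rank} | {name} | {dist:.2f} |")
```

The point of the sketch is only that a small distance means "resembles this reference set", which is not the same as "descends from this population".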
The oracles on the Eurogenes K13 and K15 calculator models have a good reputation at working with West Eurasian ancestry. So how does K13 first, score my 100% English ancestry?
On Single Population Sharing, it rates my DNA against the closest references. In order of closest to not so close, the top five are:
1 | South_Dutch | 3.89 |
2 | Southeast_English | 4.35 |
3 | West_German | 5.22 |
4 | Southwest_English | 6.24 |
5 | Orcadian | 6.97 |
1 | Southwest_English | 2.7 |
2 | South_Dutch | 3.98 |
3 | Southeast_English | 4.33 |
4 | Irish | 6.23 |
5 | West_German | 6.25 |
The above photo was taken at A Capela dos Ossos (the bone chapel) in Évora, Portugal. The entire chapel is covered with human bones. Every wall and pillar is decorated with skulls and bones. On another wall hang the mummified remains of a man and child, said to have been cursed. There is a sign at the entrance of the chapel which states "Nós ossos que aqui estamos, pelos vossos esperamos" (Our bones here, await yours).
I was a sceptic of genetic genealogy, I'll admit it. Now I'm hooked. Not because I feel that it has been a way of hooking up with distant cousins, that can help me extend my family tree. That's not the way that I've used it so far. Instead, it has provided a very different kind of information, that helps me understand who I am, and how I can link my ancestry to known heritage.
I might not have been so hooked, but I've had so many surprises with my 23andMe results. If my results had been perhaps, dire and boring, then maybe I would have retreated to traditional genealogy and regarded the technique as predictable and uninteresting. However, what ancestry related surprises did I have?
I captured the above photo at Cabo Espichel, Portugal.
There was a fourth, further surprise in my 23andMe results. It lay in the autosome. 23andMe AC (Ancestry Composition) on speculative mode, suggested 2.4% Southern Europe, including a prediction of 0.5% Iberian ancestry. On speculative mode again, it falls on five pairs of chromosomes - but never on both sides. On standard mode, 0.1% remains, just on one side of pair 21. This suggests that all of it comes from just one of my parents.
I might think that this was just "background noise", an error in AC. However, it keeps popping up. Indeed when I upload my raw data to the program at DNA.Land, they predict only 80% North/Central European, and a whopping 15% South European. It doesn't stop there. On GEDmatch, the Eurogenes calculators keep suggesting Iberian or South European admixture on their mixed population oracles. Eurogenes K9 for example, gives me 61% North European, 29% Mediterranean, and 6% Caucasus.
Let's just refer back to my recorded paper ancestry. I have 190 recorded ancestors, all in England, with English surnames. No sign of any Roman Catholicism. I have all sixteen of Generation 6 (G.G grandparents) named. All born and named English. No sign of any South European even in the 1,490 people on the entire family tree for my kids.
However, I think that all of the autosome ancestry calculators could be telling me a truth that I can't see in my known family tree. If I have a South European ancestor somewhere, whether Iberian or not, then either a) I have not yet found them, or b) they were the biological ancestor in an NPE (non-parental event), a cuckoo. I have 3 out of my 32 Generation 7 ancestors unnamed - all absent fathers. I have 15 missing ancestors in Generation 8. Above that, the representation really starts to decline, although I have some ancestors named up to Generation 11. Could a South European be in there? 23andMe in speculative mode suggested 2.4%. That would seem "average" for an ancestor in Generations 7 or 8 (3 to 4 x G grandparent level). Of course from around that point, "averages" become pointless, and subject to a randomness that can delete entire lineages further up from any surviving DNA. None-the-less, I could have a South European from around that period - either one of the 18 "missing" ancestors, or an NPE cuckoo.
I'm commissioning a 23andMe test for my mother. Three reasons. 1) She won't be here for ever. Recording her genome feels valuable and worthy. 2) I want to see how her very dense 100% recorded Norfolk ancestry projects on Ancestry Composition and on GEDmatch. 3) I want to phase her results against mine. It will tell me for example, where my "South European" DNA came from - which parent. It will help me further understand my own genetic ancestry.
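The arithmetic behind that "average" can be sketched in a few lines of Python. This is only a back-of-envelope illustration: it assumes the generation numbering used in this post (sixteen ancestors in Generation 6, thirty two in Generation 7), and that each ancestor in a generation contributes an equal share on average. The function names are my own.

```python
import math

# Assumption: labels follow this post, where Generation 6 holds 16 ancestors,
# so Generation N sits N - 2 generations back from me.
GENERATION_OFFSET = 2


def ancestors_in_generation(generation: int) -> int:
    """Number of ancestor slots in a given generation label."""
    return 2 ** (generation - GENERATION_OFFSET)


def average_share(generation: int) -> float:
    """Average autosomal fraction expected from one ancestor there."""
    return 1.0 / ancestors_in_generation(generation)


def bracket_generations(share: float) -> tuple:
    """Generation labels whose average shares bracket the observed share."""
    generations_back = math.log2(1.0 / share)
    lower = math.floor(generations_back) + GENERATION_OFFSET
    upper = math.ceil(generations_back) + GENERATION_OFFSET
    return lower, upper


if __name__ == "__main__":
    for gen in (6, 7, 8):
        print(f"Generation {gen}: {ancestors_in_generation(gen)} ancestors, "
              f"average share {average_share(gen):.2%}")
    print("2.4% falls between Generations", bracket_generations(0.024))
```

An average share of 1 in 32 for Generation 7 and 1 in 64 for Generation 8 puts 2.4% between the two, but as said above, real inheritance from that depth is so random that the average is only a loose guide.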
The results were uploaded to my 23andMe profile today. I posted/registered the sample from the UK, nine weeks ago. The sample traveled to the USA lab via a NL holding depot. It took six weeks to process the sample and results, from the time of being marked as arriving at the USA lab. I feel very fortunate, as many 23andMe customers are reporting a seasonal log-jam that is delaying the process. My results though were comfortably within the proposed time frame.
There were a number of pleasant surprises. The results were far from boring.
An amusing little trait, that IS identified by the DNA analysis, is on Asparagus Metabolite Detection. When I eat asparagus, my urine smells strongly. It confirms for me - that the system works! It also correctly identifies that I have a sweet tooth, that I have blue eyes, etc.
Now to the genetic genealogy goodies.
The genetic marker that I inherit from my strictly paternal lineage - father's father, father, and so on, going back. On paper, I've traced this back to a John Brooker, that lived in Oxfordshire, but was born outside of that county, perhaps in nearby Berkshire, circa 1785. Of course, that is if no-one ever lied in forms over who the father was.
This one was a shocker. A little background first. Although my paper ancestry over the past 350 years is overwhelmingly localised in parts of the county of Norfolk, in East Anglia, my paternal-line surname carrier, that should be the donor of my Y chromosome marker, or Y-DNA, can be traced to Oxfordshire, in Wessex. Out of my eight paper great grandparents, seven were Norfolk born and bred. However, the exception was my paternal great grandfather. Therefore I would not expect my Y-DNA to belong to any local Norfolk gene-pool. It is the least representative lineage for my heritage. This is why I feel that people can sometimes place too much value on their haplogroups. I did however, expect it to belong to a common English or British haplogroup such as the Y-DNA R1b group.
I was in for a surprise. It is exotic L2*.
From initial research including an Internet search, this haplogroup forms only a rare back scatter across Europe. It appears more commonly across Western Asia and the Sub-Continent, from Turkey to Southern India. It is most common in Pakistan, where it may originate, circa 30,000 years ago. It is not a common European Y-DNA haplogroup. I need to more carefully research this in the near future, but I'm in awe to find that I have an exotic Y-DNA. It does conjure up images of one of my paternal ancestors being a Syrian archer, or Persian mercenary in the Roman Army, fathering a child, while stationed in Britannia, or perhaps elsewhere in Roman Europe. But that might be too fanciful. Anyway, I'm having pheasant curry for dinner tonight.
This genetic marker should be shared with my son, and my brother. A few of my first cousins will also have it.
The genetic marker that I inherit from my strictly maternal lineage - mother's mother, mother and so on back. On paper, I've traced it back to a Mary Page, who was born in 1802, in Norfolk. I like the maternal line, as it is actually the most biologically secure. Few forms lie about who the mother is. I'd expect my mt-DNA to be a haplogroup firmly established in East Anglia.
A nice one to have. It is H6a1.
This haplogroup belongs to the Helena group. However, it is not ancient European. H6 is believed to have mutated from H around 30,000 years ago in Central Asia.
H6a1 has recently been associated with the Yamnaya migration into Western Europe, from the Eurasian Steppes to the north of the Black Sea, some 4,000 to 5,500 years ago. In Europe itself, it could be associated with a number of Early Bronze Age cultures, such as the Corded Ware culture. It has been linked with the R1b Y haplogroup, that dominates Western European countries such as Ireland, France, and the British Isles. Recent studies have indeed suggested a significant displacement of people in Western Europe, that occurred in late prehistory, with the arrival of pastoralists from Eurasia. This migration is also associated with the rise of the dominant Indo-European linguistic group of Europe. If H6a1 does indeed prove to be linked to the Indo-European explosion of the early Bronze Age, I'd be very happy. I like to imagine one of my maternal ancestors 5,500 years ago, accompanying a band of prehistoric pastoralists, that are heading westwards into Europe with their horses.
This genetic marker will be shared with my mother, my brother, my sisters, and their children. A few cousins will also share it.
This is an area that I've been trying to understand recently. It uses computer analysis, to compare my autosome DNA to a number of others in reference populations from around the World, which then composes suggested ancestry in percentages. This magic attempts to look not at a few genetic markers or haplogroups, but at all of the patterns in my autosomal DNA, to predict likely ancestry on any lineages that survive in my DNA.
Previous to receiving my results, I recently revised and bolstered up my paper genealogy based family tree. I now have 172 direct ancestors listed, going back to Generation 14 during the 17th Century. I noted that each and every one of my paper recorded ancestors was English. All of them. That includes all of my eight great grandparents, all of my sixteen great great grandparents, and thirty of my thirty two great great great grandparents. That is 100% English.
Now, I'm sure that you'd agree, I should be expecting my 23andMe ancestry composition to give 100% English, right? Well no. They can't presently identify an ethnic group like the English. Instead, I should expect my results to fall 100% into the British & Irish category.
100% British & Irish? No, I'll give this one early. It was 32% British & Irish on speculative mode. More on this further down.
My paper research before I received my results also revealed just how concentrated, most of my ancestry has been over the past 350 years. I compiled the below map of East Anglia. The BLUE marking the places of ancestral events from my family tree on my father's side; and the RED marking the places of ancestral events on my mother's side. The larger the marker, the more events recorded.
I also made a map based on East Norfolk during the 4th Century AD, before sea levels fell, and drainage changed the coastline. I then marked out the area of my mother's ancestry on that.
100% European
60% British & Irish
10% French & German
2% Scandinavian
25% unidentified broadly NW European
People of Irish heritage, or even Americans with either Irish or British ancestry, tend to score a higher percentage of British & Irish than do the present day ethnic English. 23andMe has a generous and growing reference population in its British & Irish database. However I hypothesised that 1) the 23andMe B&I reference is skewed to the Irish, and away from English. It is also possible that it is distorted by a case of genetic drift by testing Americans of British origin. 2) that the British & Irish designation may actually be inadvertently looking at DNA that arrived in the British Isles largely previous to the early medieval North Sea migrations: the British and Irish genes that have been here since late prehistory. On the other hand, the French & German, the Scandinavian, and perhaps some of the undesignated Broadly NW European percentages that are usually assigned to the ethnic English, may actually reflect early medieval migration from across the North Sea. The computer analysis is simply unable to distinguish some of the DNA from that of present day French, Germans, or Scandinavians, because of ancient admixture.
I'm told that this would not be the case, that 23andMe ancestral composition could not detect such deep, ancient admixture. However, what if I am correct about my own heritage - that I likely have enhanced levels of Anglo-Saxon and perhaps Norse heritage, because of the geographical location of so many of my ancestors? Should I not expect an even lower percentage of the 23andMe British & Irish category, and even higher percentages of other NW Europeans from across the North Sea? So what were my 23andMe ancestry composition percentages (speculative mode)?
100% European. Broken down into:
94% NW European.
3% South European.
I'll get to the South European later, but what about this North west European? Let's break it down into 23andMe's sub categories:
32% British & Irish
27% French & German
7% Scandinavian
29% undistinguished broadly NW European
Oh my goodness. It correctly fits my prediction. I have more than double the average percentage of F&G and Scand for English people. Despite having a paper researched genealogy that is 100% English, 23andMe's ancestry composition based on a generous reference sample size of 1251 sets, gives me 32% British & Irish.
So a predicted, but still incredibly exciting result. I'm chuffed to bits. It does in my eyes, blow 23andMe's British & Irish designation out of the water though. Their reference samples do not appear to match the East English. Instead, their software misreads some of the English DNA for French & German, or Scandinavian. I'm suggesting that this is because of ancient admixture, during the 4th to 11th centuries AD, with North Sea immigration. I invite others to knock my suggestion down.
One more surprise from my Ancestry Composition: A South European 2.7%. Broken down into 23andMe's sub categories:
0.5% Iberian
2.4% undistinguished broadly South European
This looks real. It appears that I have a small percentage of South European heritage. Most likely from Spain, Portugal, or the Basque Country. I probably have Iberian ancestry that I have not yet detected using paper genealogy. Either that, or it's an anomaly, an incorrect interpretation.
An estimated 2.9%.
That's just slightly above the average of 2.7% for modern Europeans. So I am not more Neanderthal than most others. Sorry to disappoint.
All in all, very happy that I spent the money.