Which is more accurate AncestryDNA or 23andMe?

I've been researching my ancestry, on and off, for about 30 years. That included interviewing elderly relatives, family histories, photographs, documents, etc. It progressed onto BMD certificates, Census, and many, many visits to local record offices, churches, and archives across Britain in order to examine parish registers, transcripts, minute books, etc.

These days I have the luxury of online genealogical resources, and the ability to search online databases. To cut a long story short, I have accumulated a family history that includes the names of 279 recorded, direct ancestors, 277 of whom lived in South East England, particularly in the East Anglian county of Norfolk (the other two were a Swiss 3 x great grandparent and his named father).

At Generation 6 (3 x great grandparent), I can say that on paper, I am 97% South East English (including 75% East Anglian), and 3% Swiss. In other words, pretty much of local East Anglian ancestry. Here is a map showing my recorded ancestors - blue via my father (minus the few in Switzerland), red via my mother:

Okay, I will still have some mistakes in my genealogical research, particularly on more distant lines, where records become scarcer and fewer survive. There will also be some NPEs (non-parental events). However, if I compare my pedigree with DNA matches / cousins that share common ancestors both in segments and on paper trails, I get this (shaded areas verified with DNA matches to paper trails):
So there is a reasonable verification there.

That is my background. The results? Remember, Generation 6:

97% English
3% Swiss

Well, before phasing, 23andMe gave me:

100% European: 94% NW European. 3% Southern European. 3% Broadly European. Broken down to:
32% British & Irish
27% French & German
7% Scandinavian
29% Broadly NW European
2% Broadly Southern European (including 0.5% Iberian)

After phasing with a surviving parent, it adjusted to:

100% European: 96% NW European. 2% Southern European. 2% Broadly European.
38% British & Irish (23% from father, 15% from mother)
24% French & German (13% from father, 11% from mother)
0.8% Scandinavian (from mother alone)
34% Broadly NW European (22% from father, 12% from mother)
2% Broadly Southern European (1% from father, 1% from mother)

Not very impressive, is it?

AncestryDNA gave me:



Still way off, but a lot closer and more precise than 23andme. They also assigned me both to the Southern England Genetic Community, and to the East Anglia & Essex Genetic Community, perfectly correct.

Therefore in summary - I've come to the conclusion that NO current autosomal DNA test for ancestry is capable of accurately predicting your ancestry below a very large region, such as NW Europe - unless your ancestors belong to a particularly well defined population that avoided medieval admixture. They are all inaccurate. More important to me now is that they have fat databases of testers, with a system of searching them alongside family trees, ancestor locations and surnames. However, side by side - for my results, and weighed simply against recorded family history, I have to pronounce AncestryDNA/.com to be more accurate than 23andme.

Thoughts in understanding ancestry DNA

Above image.  My Global 10 Genetic Map coordinates:  PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10 ,0.019,0.0272,0.0002,-0.0275,-0.0055,0.0242,0.0241,-0.0033,-0.0029,0.0015.  The cross marks my position on a genetic map by David Wesolowski, of the Eurogenes Blog

The above map shows genetic distances between different human populations around the planet.  Look how tightly the Europeans cluster.  Razib Khan recently blogged on just this subject.  The fact of the matter is that the greatest diversity exists between populations outside of Europe, particularly within Africa, and between African and non-African populations.  However, we obsess over tiny differences within European populations, when in truth, most Western Eurasians are very closely related.  We share ancient ancestry from slightly varied mixes of only three base ancestral groups, with the last layer arriving only 4,300 years ago.  This obsession in the market drives direct-to-consumer DNA businesses to largely ignore non-European diversity, and to focus too closely on differences that blur into each other.

The above image is from a 2016 CARTA lecture by Johannes Krause of the Max Planck Institute.  It shows the three currently known founder populations of Europeans and their average percentages.

However, at the same time the new Living DNA service seeks to zoom in closer on British populations, attempting to detect ancestry percentages from such tiny zones as "East Anglia".  They appear to be having a level of success with it as well, although that blurriness, that overlap and closeness of populations in Europe gives problems.  Germans are given false percentages of British, some Scottish appear as Northern Irish, and the Irish dilute into false British areas.  However, I've seen enough results now to suggest that it is far from genetic astrology.  They get it correct to a certain level, particularly for us with English ancestry.  Ancestry DNA customers expect perfection.  I don't think that we will ever get that from such closely related populations at this resolution, but it does provide a new genealogical tool that can point us into some revealing directions.

Above image.  My Living DNA Map.  Based on my recorded genealogy, I estimate 77% to 85% East Anglian ancestry over the past 250 years or so.  Living DNA at Standard Mode gave me 39%.  I'm impressed by that.  That a DNA test can recognise even at a 50% success, my recent ancestry in such a tiny zone of the planet.  I have doubts though that this sort of test will ever be free of errors, and mistakes.  The safest DNA test for ancestry is still one that is based on more distinct populations, and outside of Africa, that can be as wide as "European".  23andMe for example in their "Standard Mode" (75% confidence), assign me 97.3% European, and 0.3% Unassigned.  That is a pretty safe result.

Autosomal DNA tests for ancestry, particularly for West Eurasian (European and Western Asian) descendants, are not reliable at high resolution.  If you want to get really local, then sure - do it.  However, only use the results as an indication, not as a truth.  Populations in Western Eurasia are closely related, and share recent common descent.  There has been a high degree of mobility and admixture ever since.  Some modern populations tested do not have a high level of deep rooted local ancestry in that region.  They overlap with each other.  Keep researching and meander through different perspectives of what your older pre-recorded ancestry could have been.

Above image by Anthrogenica board member Tolan.  Based on 23andMe AC results.  My results skew away from British, and towards North French.  He generated this map, plotting myself (marked as Norfolk in red), and my Normand Ancestral DNA twin Helge in yellow.  My results fall in the overlap with French.  Helge is Normand but in AC appears more British than myself.  I am East Anglian yet in this test appear more French than he does.



Will ancestry DNA tests tell me my family origins?

I have taken several DNA tests for ancestry, including those provided by the FT-DNA, 23andMe, and Living DNA companies.  Unusual for a tester, I am actually of a single population, very local, well documented ancestry here in East Anglia, South-East England.  I'm not someone in the Americas or Australia, that might have very little clue what parts of the world that their ancestors lived in, previous to immigration.  I know my roots, I'm lucky.  I live them.  You might ask, why did I feel the need to test DNA for ancestry?  The answer is, curiosity, to test the documented evidence, fill the gaps, look for surprises, and in particular, to understand the longer term, to reach further back into my ancestry.

I have though, become a bit of a skeptic, even a critic, of autosomal DNA (auDNA) tests for ancestry.  They are the tests presented by the businesses in results called something like Ancestry, Family Ancestry, Origins, Family, Composition, etc.  Instead of testing the haplogroups on either the direct paternal (Y-DNA) or direct maternal (mtDNA) lines, these tests scan the autosomal and X chromosomes.  That's good, because that is where all of the real business is, what makes you an individual.  However, it is subject to a phenomenon that we call genetic recombination (the X chromosome is a little more complicated).  This means that in every generation, circa 50% of each parent's DNA is randomly inherited by the child.  I said randomly.  Each generation, that randomness chops up the inherited segments smaller, and moves them around.  After about seven or eight generations, the chances of inheriting any DNA from any particular ancestral line quickly diminish.  It becomes washed out by genetic recombination.
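To put rough numbers on that washing out, here is a minimal sketch.  It assumes the simple textbook expectation that the average autosomal share from any one ancestor halves with each generation, and that no ancestor appears twice in the pedigree.  Real inheritance is lumpier than this average, because recombination is random.  The function name is just an illustration.

```python
def expected_share(generations_back):
    """Number of ancestors and average % of autosomal DNA from each one.

    Assumes a pedigree with no repeated ancestors and a simple halving
    per generation; actual inherited shares vary around this average.
    """
    ancestors = 2 ** generations_back
    return ancestors, 100.0 / ancestors


for g in range(1, 11):
    ancestors, share = expected_share(g)
    print(f"{g:2d} generations back: {ancestors:5d} ancestors, "
          f"about {share:.2f}% expected from each")
```

Five generations back (the 3 x great grandparents) gives 32 ancestors at roughly 3% each, which is where a single Swiss 3 x great grandparent becomes "3% Swiss" on paper.  By eight generations back the average is under half a percent per ancestor, and some of those ancestors may have passed down nothing at all.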

Therefore, not only are the autosomes subject to a randomness, and genetic recombination - they are useful for assessing family admixture only over the past three hundred years or so.  There is arguably, DNA that has been shared between populations much further back, that we call background population admixture.  It survived, because it entered many lines, for many families, following for example, a major ancient migration event.  If this phenomenon is accepted - it can only cause more problems and confusion, because it can fool results into suggesting more recent family admixture - e.g. that a great grandparent in an American family must have been Scandinavian, when in fact many Scandinavians may have settled another part of Europe, and admixed with that ancestral population, more than one thousand years ago.

DNA businesses compare segments of auDNA, against those in a number of modern day reference populations or data sets from around the world.  They look for what segments are similar to these World populations, and then try to project, what percentages of your DNA is shared or similar to these other populations.  Therefore:

  1. Your results will depend on the quality and choice of geographic boundary allocated to any reference population data set.  A number of distinct populations of different ancestry and ethnicity may exist within them, and cross the boundaries into other data sets.  How well are the samples chosen?  Do they include urban people (who tend to have more admixture and mobility than many rural people)?  Do they include descendants of migrants that merely claim a certain ancestry previous to migration?  What were the criteria for sample selection?
  2. Your results might be confused by background population admixture.
  3. You are testing against modern day populations, not those of your ancestors 300 - 500 years ago.  People may well have moved around since then.  In some parts of the World, they certainly have!

It is far truer to say that your auDNA test results reflect shared DNA with modern population data sets, rather than to claim descent from them.  For example, 10% Finnish simply means that you appear to share similar DNA with a number of people that were hopefully sampled in Finland (and hopefully not just claim Finnish ancestry) - not that 10% of your ancestors came from Finland.  That is, for the above reasons, presumptuous.  It might indeed suggest some Finnish ancestry, but this is where many people go wrong, it does not prove ancestry from anywhere.
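As a toy illustration of why the choice of reference data set matters, the sketch below estimates a two-way mixture by brute force.  Everything in it is an assumption made for illustration: the allele frequencies are invented, the reference names are placeholders, and no testing company is claimed to work this way.

```python
def best_mixture(sample, ref_a, ref_b, steps=100):
    """Grid-search the share of ref_a (0..1) that best explains the sample.

    Each list holds one made-up allele frequency per marker.  The fit is a
    plain sum of squared differences, chosen for simplicity only.
    """
    best_share, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        share = i / steps
        err = sum((s - (share * a + (1 - share) * b)) ** 2
                  for s, a, b in zip(sample, ref_a, ref_b))
        if err < best_err:
            best_share, best_err = share, err
    return best_share


# Hypothetical reference populations (invented numbers).
ref_a = [0.10, 0.80, 0.30, 0.60]
ref_b = [0.90, 0.20, 0.70, 0.40]
ref_c = [0.50, 0.50, 0.50, 0.50]

# One hypothetical tester, unchanged between the two runs.
sample = [0.30, 0.65, 0.40, 0.55]

print("Share of A with panel (A, B):", best_mixture(sample, ref_a, ref_b))
print("Share of A with panel (A, C):", best_mixture(sample, ref_a, ref_c))
```

The same tester comes out as 75% "A" against one panel and 50% "A" against the other.  Nothing about the tester changed, only the populations on offer for comparison.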

Truth

This is my main quibble.  So many testers take their autosomal (for Family/Ancestry) DNA test results to be infallible truths.  They are NOT.  White papers do not make a test and analysis system perfect and proven as accurate.  Regarding something as Science does not make it unquestionable - quite the opposite.  The fact of the matter is, if you test with different companies, different siblings, add phasing, you receive different ancestry results.  Therefore which result is true and unquestionable?

A Tool for further investigation

So what use is DNA testing for ancestry?  Actually, I would say, lots of use.  If you take the results with a pinch of salt, test with different companies, then it can help point you in a direction.  Never however take autosomal results as infallible.  Critical is to test with companies with well thought out, high quality reference data sets.  Also to test with companies that intend to progress and improve their analysis and your results.

For DNA relative matching, then sure, go with the companies that have the best matching system, the largest match (contactable customer) databases, and custom in the regions of the world that you hope to match with.  There is also GEDmatch.  Personally, I find it thrilling when I match through DNA, but in truth, I had more genealogical success back in the days when genealogists posted their surname interests in printed magazines and directories.

The results of each ancestry test should be taken as a clue.  Look at the results of testers with more proven documented and known genealogies.  Learn to recognise what might be population background, as opposed to recent admixture in a family.  Investigate haplogroup DNA - it has a relative truth, although over a much longer time, and wider area.  Just be aware that your haplogroup/s represent only one or two lines of descent - your ancestry over the past few thousand years may not be well represented by a haplogroup.  Investigate everything.  Enjoy the journey.  Explore World History.

The most common misunderstanding - mtDNA

I just see so many misunderstandings on genetic genealogy and DNA test forums concerning mtDNA haplogroups, that I feel compelled to try to explain.

DNA testing businesses tend to dumb down a lot of information for their "audience".  I feel that this actually increases misunderstandings, and mtDNA haplogroups are a good example.  Rather than use the lengthy description mitochondrial DNA, or even its shortened form mtDNA, businesses describe it more frequently as Mother Line, or Maternal.  It misleads so many of their customers.  So let us put this straight:

  • A haplogroup is a  "combination of alleles at different chromosomes regions that are closely linked and that tend to be inherited together"  A series of mutations, that are inherited across generations.
  • An mtDNA haplogroup is defined by a series of mutations within the DNA of mitochondria.  Mitochondria exist outside of a cell nucleus.  They have their own independent DNA, apart from the nuclear chromosomal DNA that dictates how we develop, what we are.  We all have mitochondria, in most of our cells.  They actually serve a function by processing energy.
  • As humans, we use nomenclature to group those mutations within a family tree of humanity.  My mtDNA mutations fall within Haplogroup H.
  • mtDNA cannot be passed on to future generations by males.  It is passed down to the children from the mother only.  I inherit H6a1a8 (my haplotype) from my mother, as do my brother and our sisters.  Only my sisters though will reproduce that mtDNA in their children.  My own children inherited the mtDNA of their mother, not mine.

So what does this mean in practice?

  • A Maternal / Motherline / mtDNA Haplogroup does NOT represent your biological ancestry.
  • A Maternal / Motherline / mtDNA Haplogroup does NOT even represent your mother's "half" of your biological ancestry.
  • For example, your father's mother most likely carried a different mtDNA.  Your mother's father most likely had a different mtDNA haplotype.  Only one of your sixteen great great grandparents passed down their mtDNA to you.
  • Instead, it acts pretty much as a single line genetic "marker" that can be traced only along one very narrow, single line of ancestry.  Look at the image at the top of this post.  Do you see?  Just one line of descent.  It follows your mother's mother's mother's line, and so on, all of the way back to a hypothetical "Mitochondrial Eve" 100,000 to 200,000 years ago.
  • It is not a tribe, ethnicity, or identity.  It is just the mtDNA genetic marker (Haplotype) that you inherited from your mother.
  • It is no good going onto mtDNA genetic genealogy forums and giving the names and origins of ANY direct ancestor, other than a woman (or her children) on that maternal line (mother's mother's, mother, and so on).
  • Forget surname studies.  In most western societies, and in many others, the "family" name is inherited from the father - and follows a completely different course (Y-DNA).  Indeed, the surname of your true mtDNA ancestor changes most generations with marriage.  That is what makes this the most difficult line to trace with documentary methods.
  • Although difficult, it is the most true and secure.  Although secret or hidden adoptions can occur, the risk of non-parental events is much lower than for the strictly male line (Y-DNA).
  • Mitochondrial DNA mutates at a very slow rate.  This, along with the change in surnames most generations, can make it difficult to use successfully for genetic genealogy.  Many of the mutations are thousands of years old.  Alternatively, it makes it a valuable evidence for tracing ancient ancestry within a population.

That is all that I wanted to say.  It is a fascinating marker, but it is not representative of even 50% of your ancestry, it is not an identity, it is pretty irrelevant to surname (studies), it is inherited only down one narrow line - but all of the way back.

My earliest mtDNA ancestor with a surviving photograph.  My mother's mother's, mother's, mother (2xgreat grandmother), born Sarah Daynes in Norfolk, during 1845.  Her mtDNA would be H6a1a8.  Her mother was born Sarah Quantrill in Norfolk during 1827.  Her mother in turn was born Mary Page in Norfolk during 1791.  Her mother in turn was born Elizabeth Hardiment in Norfolk during 1751.  Her mother in turn (my 6xgreat grandmother) was Susannah Briting, who married John Hardyman in Norfolk during 1747.  If my documentary research along this line is correct, then Susannah inherited mtDNA haplotype H6a1a8 from her mother.
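The narrowness of that line can be sketched in a few lines of code.  The named women are the ones from my documentary research above; the first three labels are placeholders, and the function name is invented for this example.

```python
# Mother links along the line described above.  Only the named women come
# from the research; the first three labels are placeholders.
MOTHER_OF = {
    "me": "my mother",
    "my mother": "my grandmother",
    "my grandmother": "my great grandmother",
    "my great grandmother": "Sarah Daynes",
    "Sarah Daynes": "Sarah Quantrill",
    "Sarah Quantrill": "Mary Page",
    "Mary Page": "Elizabeth Hardiment",
    "Elizabeth Hardiment": "Susannah Briting",
}


def mtdna_line(person, mother_of):
    """Follow mother links only, returning the line from person upward."""
    line = [person]
    while line[-1] in mother_of:
        line.append(mother_of[line[-1]])
    return line


line = mtdna_line("me", MOTHER_OF)
for generations_back, name in enumerate(line):
    if generations_back == 0:
        continue  # skip the starting person
    print(f"{generations_back} back: {name} "
          f"(1 of {2 ** generations_back} ancestors in that generation)")
```

At Sarah Daynes the line is one of 16 ancestors in her generation; at Susannah Briting it is one of 256.  Every other ancestor in those generations is invisible to mtDNA.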

My Global 10 Genetic Map - and Frenchness!

My Global 10 Genetic Map coordinates:  PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10 ,0.019,0.0272,0.0002,-0.0275,-0.0055,0.0242,0.0241,-0.0033,-0.0029,0.0015
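For anyone wanting to work with coordinates like these, here is a small sketch that pairs each value with its PC label.  The distance function is an assumption on my part: it treats closeness on the map as plain Euclidean distance across the ten components, and it needs a second person's coordinates, which are not given here.

```python
import math

header = "PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10"
values = "0.019,0.0272,0.0002,-0.0275,-0.0055,0.0242,0.0241,-0.0033,-0.0029,0.0015"

# Pair each component label with its value.
coords = dict(zip(header.split(","), map(float, values.split(","))))


def genetic_distance(a, b):
    """Euclidean distance between two sets of PC coordinates (assumed metric)."""
    labels = sorted(a)
    return math.dist([a[k] for k in labels], [b[k] for k in labels])


for label in header.split(","):
    print(label, coords[label])
```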

This is my position on the latest genetic map by David Wesolowski, of the Eurogenes Blog.  One point of interest that has been picked up on the Anthrogenica Forums, is my consistent closeness in ancestral results, to a Normand member!  Our Basal-rich K7 results were almost identical.  On 23andMe Ancestry Composition (spec mode), I just get a bit more French & German, while he gets just a bit more British & Irish.  We are close!

Another forum member argued though that it's my results that are skewed away from British, and towards North French.  He generated this map, plotting myself (marked as Norfolk in red), and my Norman Ancestral DNA twin Helge in yellow:

I had to point out though, that I've rarely seen other SE English with a record of local ancestry, test - and that the red circles representing British & Irish include many people with some Irish, Scottish, Welsh, Western, or Northern ancestry.  The map suggests a pull to Northern France, Belgium, and the Netherlands.

As I commented towards the end of my last post, I initially expected a pull to Denmark, Northern Germany, and perhaps to the Netherlands.  This is because so many of my 17th-20th century ancestors lived on what was the frontier of Anglo-Saxon and Danish immigration during the 4th to 11th centuries.

But instead, autosomal DNA tests for ancestry all seem to be suggesting more shared ancestry from a more southerly direction - Northern France and Belgium particularly.  Although there has so far been a dearth of local testers from local families, the POBI survey seems to find this common among the English.  We appear to be a halfway house between Old British, and the French, more than the ancestors of Anglo-Saxons and Danes.  This contradicts the historical and archaeological records.  POBI suggested that this was due to waves of unrecorded immigration from the South during late prehistory.  Others have pointed the finger at Norman and French admixture in medieval Southern Britain.  It could be both!

Can I apply for French citizenship?

DNA.land - raw file comparison

Comparing the ancestry results of two raw files from the same tester (myself) uploaded to DNA.land.

Myself.
Paper trail and family history 100% SE English, mainly East Anglian. 249 direct ancestors named in documentary research.

23andMe result before phasing (spec mode):
100% European broken into
94% Northwestern Europe
3% Southern Europe
3% unassigned European

Broken down further to:
32% British & Irish
27% French & German
7% Scandinavian
29% Broadly NW European
0.5% Iberian
2.4% Broadly South European

23andMe result after phasing with one parent (spec mode):
100% European
96% Northwestern European
1.8% Southern European
2.2% Broadly European

Broken down further to:
37% British & Irish
22% French & German
1% Scandinavian
36% Broadly NW European
1.8% Broadly Southern European

FT-DNA Family Finder My Origins.
100% European

Broken down further to:
36% British Isles
32% Southern Europe
26% Scandinavia
6% Eastern Europe

Now I am comparing the two raw files for the same person, uploaded to, and analysed for ancestry, by DNA.land:

23andMe V4 raw file for myself on DNA.land:

100% West Eurasian.
77% North West European
19% South European (broken into 13% Balkan / 6.1% South/Central European)
2.4% Finnish
1.3% Ambiguous

FT-DNA FF raw file for myself on DNA.land:
100% West Eurasian
75% North West European
25% Balkan
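A rough way to put a number on how far the two DNA.land results differ is sketched below, using only the percentages listed above.  The measure (half the sum of the absolute differences) is my own choice for illustration, and the 23andMe figures sum to 99.8 because of rounding, so treat the total as approximate.

```python
# DNA.land percentages for the same person, from the two raw files above.
file_23andme = {
    "North West European": 77.0,
    "Balkan": 13.0,
    "South/Central European": 6.1,
    "Finnish": 2.4,
    "Ambiguous": 1.3,
}
file_ftdna = {
    "North West European": 75.0,
    "Balkan": 25.0,
}


def compare(a, b):
    """Percentage-point change per category going from result a to result b."""
    categories = sorted(set(a) | set(b))
    return {c: round(b.get(c, 0.0) - a.get(c, 0.0), 1) for c in categories}


diffs = compare(file_23andme, file_ftdna)
for category, change in diffs.items():
    print(f"{category}: {change:+.1f}")

# Roughly how many percentage points were reassigned between categories.
reassigned = sum(abs(change) for change in diffs.values()) / 2
print(f"About {reassigned:.1f} points reassigned")
```

On these figures about 12 percentage points move between categories, almost all of it into Balkan.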

Just for more information:

My mother's 23andMe raw file on DNA.land:
100% West Eurasian
80% North West European
10% South European (broken into 7.7% South/Central Europe / 2.4% Balkan)
6.4% Finnish
2.3% Sardinian
1.5% Ambiguous 

Conclusion

Phasing on 23andme suggested that I inherit (in spec mode) nearly 1% Southern European from each parent. That each of my very East Anglian parents had a Southern European ancestor within the past 300 - 500 years is highly unlikely, considering 1) the paper trail, and 2) local history in this rural area. Therefore I feel that this reflects much older background ancestry for the local SE English population. Ancient DNA calculators also predict that I have higher than average levels of ENF/EEF than other local populations such as the Irish and Scottish, and lower levels of ANE. This appears associated with my Southern European flavour that some tests suggest as a minority percentage. FT-DNA suggested 32% Southern European! Some commentators have suggested that this might indicate significant French admixture to the SE English population, perhaps during the Norman and Medieval periods, carrying a southern signal higher into lowland Britain. Earlier admixture into Lowland Britain from the south, is also possible during late prehistory and the Roman period.

DNA.land has been noted for a bias towards predicting both Balkan and Finnish ancestry for testers, and my results are no exception. I feel that, as with all current autosomal DNA tests/analyses for ancestry, DNA.land has a way to go. As with the other predictors, it is very successful at recognising me as 100% European (although ironically my Y-DNA is Western Asian). It is fair at spotting me as NW European, but NOT as successful as 23andMe. Below that level, once again it falls down - but I feel that this is understandable, as most predictors fall down for anciently admixed populations such as the English. They are far more successful at spotting for example, Irish/Scottish. For the English, we tend to be ripped across different European populations. The Southern European element is a particular surprise - but all of the testers so far have been confused by this background signal. Dienekes has himself suggested Southern European DNA coming into England with the Normans:

http://dienekes.blogspot.co.uk/2016/...-ancestry.html

I'm starting to settle with this hypothesis, although I still have some interest in possible Southern European admixture earlier.

Finally... The two raw files for one person, have produced slightly different results. The FT-DNA raw file has I believe, more tested (but different?) SNPs than the 23andMe file. It would be interesting to know the differences. DNA.land, using the FT-DNA FF file, does not see Finnish, or South/Central European, but enhances the Balkan.

Counting the SNPs - 23andMe V FT-DNA

Comparing 23andMe V4 kit raw file to FT-DNA raw file.

Both tests were taken by myself this year (2016).  I am here comparing the quality of two separate atDNA tests from the same person, by two different DNA for Ancestry companies.  As will be seen, the quality varies considerably, at least in terms of the number of SNPs that are tokenized once forwarded to GEDmatch.com.  This is NOT a test of how well both companies ascertain our DNA ancestry from these files.  Both use their own reference populations and analysis programs.  I've reviewed that elsewhere.  This test simply weighs how many SNPs are registered from the autosomes and X chromosome of one person.

Using the GEDmatch DNA file diagnostic utility, I received the following SNP counts:

Kit M551698 (23andMe V4)

Token File data:
Chr Token SNP Count
1 40974
2 42110
3 34199
4 31020
5 30421
6 36383
7 26352
8 27900
9 23644
10 27888
11 25363
12 25395
13 19880
14 15957
15 15529
16 16551
17 13745
18 16775
19 9006
20 13530
21 7324
22 7386
X 15359

Processed in batch 5355
Number of SNPs utilized by GEDmatch template = 523997
Number of regular SNPs = 517780
Heterozygosity index = 0.302721 (fraction of total SNPs that are heterozygous)
No-calls = 4911 = 0.93956084952678 percent.
Kit M551698 has approximately 19959 total matches with other kits. Of these matches there are 4982 >= 7cM and 14977 < 7cM.


Kit T444495 (FT-DNA file):

Chr Token SNP Count
1 57931
2 59602
3 47094
4 41772
5 39314
6 47546
7 36567
8 36753
9 30643
10 36889
11 35941
12 35850
13 26763
14 22650
15 20899
16 21935
17 18379
18 22586
19 12773
20 19587
21 10001
22 9750
X 19176

Processed in batch 5914
Number of SNPs utilized by GEDmatch template = 709242
Number of regular SNPs = 694324
Heterozygosity index = 0.281384 (fraction of total SNPs that are heterozygous)

No-calls = 16077 = 2.263088030563 percent.

Kit T444495 has approximately 48755 total matches with other kits. Of these matches there are 9351 >= 7cM and 39404 < 7cM.

Conclusion

If the quality of a raw atDNA file is merely down to the number of SNPs that are tested, then FT-DNA clearly wins hands down, when compared with the 23andMe file, following tokenization for GEDmatch use.  The FT-DNA file utilises 709,242 SNPs compared with 23andMe's 523,997 SNPs.
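The figures reported by the diagnostic utility can be cross-checked with a little arithmetic.  The sketch below uses only the counts listed above.  One inference of mine, not something stated in the GEDmatch output: the no-call percentages appear to be the no-call count divided by the sum of the per-chromosome token counts, rather than by the "SNPs utilized" figure.

```python
# Per-chromosome token SNP counts (chromosomes 1-22, then X), as listed above.
counts_23andme = [
    40974, 42110, 34199, 31020, 30421, 36383, 26352, 27900, 23644, 27888,
    25363, 25395, 19880, 15957, 15529, 16551, 13745, 16775, 9006, 13530,
    7324, 7386, 15359,
]
counts_ftdna = [
    57931, 59602, 47094, 41772, 39314, 47546, 36567, 36753, 30643, 36889,
    35941, 35850, 26763, 22650, 20899, 21935, 18379, 22586, 12773, 19587,
    10001, 9750, 19176,
]


def no_call_percent(no_calls, token_counts):
    """No-calls as a percentage of the summed token counts (assumed basis)."""
    return 100.0 * no_calls / sum(token_counts)


pct_23andme = no_call_percent(4911, counts_23andme)
pct_ftdna = no_call_percent(16077, counts_ftdna)

# Ratio of the "SNPs utilized by GEDmatch template" figures for the two kits.
ratio = 709242 / 523997

print(f"23andMe no-calls: {pct_23andme:.3f}%")
print(f"FT-DNA no-calls:  {pct_ftdna:.3f}%")
print(f"FT-DNA utilises {ratio:.2f} times as many SNPs")
```

So the FT-DNA file carries roughly a third more usable SNPs, but also a higher proportion of no-calls (about 2.3% against 0.9%), which is worth weighing alongside the raw count.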

I thought that it might be interesting to compare how these files, of the same person, might compare on the same GEDmatch heritage admixture program.

On Eurogenes K13 Oracle, my 23andMe kit gets as top ten closest GD's:

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97
6 French 7.63
7 North_Dutch 7.76
8 Danish 7.95
9 North_German 8.17
10 Irish 8.22

On the same, using my FT-DNA kit (with many more SNPs tested, as demonstrated above):

1 Southeast_English 3.75
2 South_Dutch 4.03
3 West_German 5.42
4 Southwest_English 5.68
5 Orcadian 6.33
6 North_Dutch 7.15
7 Danish 7.36
8 Irish 7.59
9 West_Scottish 7.62
10 North_German 7.7
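The two top-ten lists can be compared directly.  This is a minimal sketch using only the population names above; the helper name is invented for the example.

```python
# Eurogenes K13 Oracle top-ten lists for the same person, closest first.
kit_23andme = [
    "South_Dutch", "Southeast_English", "West_German", "Southwest_English",
    "Orcadian", "French", "North_Dutch", "Danish", "North_German", "Irish",
]
kit_ftdna = [
    "Southeast_English", "South_Dutch", "West_German", "Southwest_English",
    "Orcadian", "North_Dutch", "Danish", "Irish", "West_Scottish",
    "North_German",
]


def rank_changes(before, after):
    """Map each shared population to its 1-based rank in each list."""
    return {p: (before.index(p) + 1, after.index(p) + 1)
            for p in before if p in after}


only_23andme = set(kit_23andme) - set(kit_ftdna)
only_ftdna = set(kit_ftdna) - set(kit_23andme)
changes = rank_changes(kit_23andme, kit_ftdna)

print("Only in the 23andMe list:", sorted(only_23andme))
print("Only in the FT-DNA list: ", sorted(only_ftdna))
for population, (old, new) in changes.items():
    if old != new:
        print(f"{population}: {old} -> {new}")
```

With the larger FT-DNA file, Southeast_English moves to first place, French drops out of the top ten and West_Scottish enters it.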

Based on the numbers of SNPs tokenized, I will in future regard the FT-DNA (Family Tree DNA) file as superior in quality, over the 23andMe file, despite my disappointment in the FT-DNA My Origins ancestry analysis.