The Southern European DNA enigma. Option 3. Autosomal DNA Analysis does not work

Here I'm considering the third option to my enigma.  My known ancestry is 100% English.  However, autosomal DNA tests for ancestry, both by commercial companies and by third party analysis, suggest that I have a mixture of European ancestries, including varying percentages of Southern European.  I'm trying to best explain this phenomenon.  In previous posts, I considered 1) that my paper record is incomplete, or biologically incorrect, and 2) that something ancient is picked up in analysis of present day English testers - that the algorithms may be reflecting shared ancient admixture, perhaps prehistoric, or Roman.

Now in this post, I consider the third option.  That commercial DNA companies exaggerate their claims to be able to differentiate to any successful degree, between different regions of Europe in my ancestry.  If this is indeed the case, it has significant repercussions for testers for example, in the USA, Canada, Australia, etc.  If they have a poor paper trail, and poorly known ancestry, maybe it's all too easy for them to regard such DNA tests for ancestry, as indisputable and accurate truths.

Commercial DNA companies for ancestry are under pressure to supply to market demands.  Their markets have been dominated particularly by USA customers.  Some of them are seasoned genealogists with good quality paper trails.  Others are attracted by the easy option to know their ancestry from before what 23andMe calls the Age of Migration of the past few centuries.  Instead of spending a lifetime chasing documents, they can simply send a DNA sample to a company, and know their roots.  People trust the science of DNA testing for ancestry.  That is the demand that commercial companies can cater for.

But what if their ability to accurately detect ancestry from autosomal DNA is exaggerated?

Lack of agreement between analyses.

As one piece of evidence.  Test autosomal DNA with three different companies, and you will receive three different results.  That is well known in genetic genealogy circles.  Some apologists excuse it away by pointing to the different companies' claims to be focusing on different periods.  23andMe say that they zoom in on 500 years ago, by rejecting short chains.  Is it really, really possible yet, to be able to zoom in on one particular period?  I'm not convinced.  Is it even possible to securely locate all ancestry from the past 500 years?  I'd expect genetic recombination to wash away an awful lot of ancestral DNA long before that.  The truth is that beyond our great great grandparents' generation, there is less and less chance of us carrying any surviving DNA from any one particular ancestor! Especially from the autosomal DNA passed down on your father's side.  You might have a Balkan g.g.g.g grandfather, but chances are, there will be no evidence of his existence remaining in your autosomes.  His DNA, and all that belonged to his Balkan ancestry, will be lucky to survive the following 250 years, never mind 500 years.  My Y-DNA has strong evidence that I had an Asian ancestor on my paternal line, who arrived in Southern England between 1,800 and 500 years ago.  However, nothing remains in my autosomal DNA analysis that suggests Asia.  Washed away.

Getting back to those three companies giving three different ancestries. My South European percentages have varied from 2% (with a hint at Iberia), to 19% (with a hint at Balkans), to FT-DNA's claim of 32%!  Eurogenes K13 hints at Iberia in its admixture programs on GEDmatch.

Population References

One more thing.  Autosomal DNA tests for ancestry do not use ancient DNA references.  Not yet anyway.  They instead use present-day references, often from their own customer bases, based on what ancestry those customers claim.  This is not necessarily the DNA that existed in past populations.  Populations and genes shuffle, and genetic drift occurs.  I recently read a report that FT-DNA Y data for NW Europe heavily biases to Irish ancestry.  Therefore, references from Americans of Irish and / or British descent, will bias to the West.  The quality of a reference is critical.

Is it all Bunk?

Am I saying that autosomal DNA testing for ancestry is all a waste of time?  Actually no, not yet.  The tests DO find me to be pretty much 100% European.  That is a success.  Some tests even find me with a degree of confidence, to be NW European.  That is awesome.  However, beyond such regional level, should we be trusting such tests to be providing concrete results, infallible "truths" with a high degree of accuracy?  Shouldn't we be cautious, and regard such speculations as just that - speculations, to be assessed by other forms of evidence?  Some of my ancestors might have lived in Southern Europe.  Maybe Option 1 was correct - one of my Norfolk ancestors brought a Portuguese wife home from the Peninsular War.  Perhaps.  Maybe Option 2 was correct - the patterns that DNA companies pick up as Southern European, are ancient, related to Neolithic, Iron Age, or Roman admixture from the South, or sharing ancient ancestry with Southern Europeans.  Maybe.

I'm not at all disenchanted with DNA testing for ancestry though.  I've commissioned five so far this year, including three autosomal DNA tests.  This leads me to my most recent commission.  Perhaps this one will convince me more.  It's a very new test.  I'll post on that next.



Family Tree DNA Family Finder data V 23andMe raw data on GEDMATCH

Background

I'm South-east English in known paper ancestry, ethnicity, and heritage - mainly Norfolk East Anglian, where I still live, close to many known ancestors. I have 207 recorded ancestors on my tree, over the past 380 years. The majority lived in Norfolk, but some were Oxfordshire, Lincolnshire, Suffolk, and Berkshire. All appear to be English, with English surnames, English religions and denominations, overwhelmingly East Anglian:

Generation 1 has 1 individual. (100.00%)

Generation 2 has 2 individuals. (100.00%)

Generation 3 has 4 individuals. (100.00%)

Generation 4 has 8 individuals. (100.00%)

Generation 5 has 16 individuals. (100.00%)

Generation 6 has 29 individuals. (90.62%)

Generation 7 has 51 individuals. (79.69%)

Generation 8 has 47 individuals. (36.72%)

Generation 9 has 36 individuals. (14.84%)

Generation 10 has 10 individuals. (2.34%)

Generation 11 has 4 individuals. (0.39%)

Total ancestors in generations 2 to 11 is 207

I have previously tested 23andMe, FTDNA Y111, and FTDNA Big Y. My Y line is unusual, because it does originate in Western Asia, within the past few thousand years (L1b2c). However, there is no evidence of anything but European in any autosomal tests so far, so other than the Y, it appears to be washed out.

My 23andMe AC in spec mode (after phasing with one parent) is:

100% European

96% NW European

2% South European

2% broadly European


37% British & Irish

22% French & German

1% Scandinavian

36% broadly NW European

2% broadly South European


FTDNA Family Finder - My Origins


36% British Isles

32% Southern European

26% Scandinavian

6% Eastern European

I thought that it would be interesting to compare how a few important GEDMATCH calculators, see my raw data from Family Tree DNA, in comparison to the raw data from 23andMe:

GEDMATCH

23andMe raw data V ftDNA raw data

Eurogenes K13 Oracle

23andMe data

Admix Results (sorted):

# Population Percent

1 North_Atlantic 47.58

2 Baltic 22.36

3 West_Med 15.65

4 East_Med 8.03

5 West_Asian 3.05

6 Red_Sea 1.42

7 Amerindian 0.74

8 South_Asian 0.71

9 Oceanian 0.46


Single population Sharing:


# Population (source) Distance

1 South_Dutch 3.89

2 Southeast_English 4.35

3 West_German 5.22

4 Southwest_English 6.24

5 Orcadian 6.97

6 French 7.63

7 North_Dutch 7.76

8 Danish 7.95

9 North_German 8.17

10 Irish 8.22


Family Tree DNA data

Admix Results (sorted):

# Population Percent

1 North_Atlantic 47.89

2 Baltic 22.68

3 West_Med 15.45

4 East_Med 7.41

5 West_Asian 3.11

6 Red_Sea 1.38

7 South_Asian 0.84

8 Amerindian 0.72

9 Oceanian 0.52


Single Population Sharing:


# Population (source) Distance

1 Southeast_English 3.75

2 South_Dutch 4.03

3 West_German 5.42

4 Southwest_English 5.68

5 Orcadian 6.33

6 North_Dutch 7.15

7 Danish 7.36

8 Irish 7.59

9 West_Scottish 7.62

10 North_German 7.7
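To put a number on how closely the two raw data files agree on K13, here is a minimal Python sketch that compares the component percentages listed above. The variable names are my own, and the figures are simply copied from the two Admix Results lists. It is only a rough check, not part of any GEDmatch tool.

```python
# Eurogenes K13 component percentages, copied from the results above.
k13_23andme = {
    "North_Atlantic": 47.58, "Baltic": 22.36, "West_Med": 15.65,
    "East_Med": 8.03, "West_Asian": 3.05, "Red_Sea": 1.42,
    "Amerindian": 0.74, "South_Asian": 0.71, "Oceanian": 0.46,
}
k13_ftdna = {
    "North_Atlantic": 47.89, "Baltic": 22.68, "West_Med": 15.45,
    "East_Med": 7.41, "West_Asian": 3.11, "Red_Sea": 1.38,
    "Amerindian": 0.72, "South_Asian": 0.84, "Oceanian": 0.52,
}

# Absolute difference per component, in percentage points.
diffs = {
    pop: round(abs(k13_23andme[pop] - k13_ftdna[pop]), 2)
    for pop in k13_23andme
}

# The component on which the two files disagree the most.
largest = max(diffs, key=diffs.get)
print(largest, diffs[largest])  # East_Med 0.62
```

So on K13 no component moves by more than about two thirds of a percentage point between the two companies' raw data.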


Eurogenes EUtest V2 K15

23andMe data

Admix Results (sorted):


# Population Percent

1 North_Sea 33.42

2 Atlantic 27.98

3 West_Med 12.24

4 Baltic 10.42

5 Eastern_Euro 7.04

6 West_Asian 3.52

7 East_Med 3.14

8 Red_Sea 1.48

9 Amerindian 0.39

10 Oceanian 0.19

11 South_Asian 0.18


Single Population Sharing:


# Population (source) Distance

1 Southwest_English 2.7

2 South_Dutch 3.98

3 Southeast_English 4.33

4 Irish 6.23

5 West_German 6.25

6 North_Dutch 6.79

7 West_Scottish 6.84

8 French 6.85

9 North_German 6.89

10 Danish 7.26


Family Tree DNA data

Admix Results (sorted):


# Population Percent

1 North_Sea 33.81

2 Atlantic 28.23

3 West_Med 12.04

4 Baltic 10.59

5 Eastern_Euro 6.84

6 West_Asian 3.66

7 East_Med 2.47

8 Red_Sea 1.46

9 Amerindian 0.35

10 South_Asian 0.31

11 Oceanian 0.25


Single Population Sharing:


# Population (source) Distance

1 Southwest_English 2.29

2 Southeast_English 4.02

3 South_Dutch 4.48

4 Irish 5.78

5 West_Scottish 6.41

6 North_Dutch 6.43

7 West_German 6.63

8 North_German 6.73

9 Danish 7.01

10 Orcadian 7.19


Gedrosia Eurasia K6 Oracle

23andMe data

Admix Results (sorted):


# Population Percent

1 West_European_Hunter_Gartherer 39.18

2 Natufian 38.8

3 Ancestral_North_Eurasian 20.85

4 Ancestral_South_Eurasian 0.82

5 East_Asian 0.35


Family Tree DNA data

Admix Results (sorted):


# Population Percent

1 West_European_Hunter_Gartherer 39.57

2 Natufian 38.66

3 Ancestral_North_Eurasian 20.75

4 East_Asian 0.86

5 Ancestral_South_Eurasian 0.16

Ancestry and DNA Tests

I'm writing this post in response to a number of comments that I see online with regards to using a commercial DNA test, in order to ascertain ancestry.  Quite often, when someone asks how to find out their family history or ancestry, someone will come back with an answer in the form of "just spit in a vial, send it to Ancestry.com, and they'll tell you".  It's not really that simple, so I'm making this post, to explain how an ancestry DNA test can help, or not help, you discover your ancestry.  Nicely dumbed down I hope, for the beginner.

Traditional Genealogy

Traditional genealogists usually set out to create a genealogy (family history and tree), using interview techniques, artefacts, and oral memories, recorded from older relatives.  Artefacts might for example, include old family medals, or photographs.  They then extend the research, through documentary evidences, such as birth, death, and marriage certificates, church registers, census records, transcripts, electoral rolls, and military records. If they are interested in recording all ancestral information, and not merely a single line such as the surname line, then this research can go on for months, years, even decades.

What you cannot do, is to simply pay a small fee, and your entire family history drops through the letter box in a brown envelope.  It takes years of time to research, collate, and to verify a good family tree.  Most genealogy enthusiasts don't mind this, because they actually enjoy doing the research itself.  It becomes a hobby, even sometimes a passion.

However, a number of commercial DNA companies may give the general public the impression, that you now can simply pay a fee, spit or swab, and your ancestry magically appears for you on a website.  It's big business.  Does it work though?  Exactly what is genetic genealogy?

What is Ancestry and why do we care?

Ancestry can simply be defined as our descent from forebears.  Why do we care who they were? Which forebears or ancestors?  How many are there?  How far back?

Of course, not everyone does care.  Not everyone cares about history.  But for others, for how we define ourselves, our communities, and families, it does matter.  It tells us who we are, where we came from.  It defines us, gives us grounding.  It gives us identity.  Wars have often been inspired by ancestry.  At the same time, a deeper appreciation of the human family, and its common ancestry, can be used to relate to those elsewhere.  One big family.  Discovering the immense poverty and hardships of our ancestors can help us to appreciate what we have, and to help others in need today.

So what ancestry can we discover?  For those few that merely concentrate on one patriarchal line, it's quite simple to define - the generations of a surname.  However, beyond that one narrow line of descent, few appreciate exactly how much total ancestry we have.  Let's look at our biological ancestors at each generation:

  • 2 parents
  • 4 grandparents
  • 8 great grandparents
  • 16 g.g grandparents
  • 32 g.g.g grandparents
  • 64 g.g.g.g grandparents
  • 128 g.g.g.g.g grandparents
  • 256 g.g.g.g.g.g grandparents.

These are only your 510 most recent direct ancestors, yet just those generations will take you back only around 250 years of family history.  Now add all of the recorded children of these direct ancestors - the great great uncles and aunts - to the theoretical family tree.  You're probably going to have a tree of around 1,300 individuals.  That is just for 250 years.  You have a big family.  Go back a few more generations, and it will explode before you reach far.  All of those direct ancestors though, are a part of your ancestry.  You'll most likely carry some DNA from most of them.  They are, from a biological perspective, who you are.

By the way, the number of biological ancestors will not continue to increase infinitely, because increasingly, you will find couples within your tree that are distant biological cousins of each other.  As this accelerates through thousands of years, it explains how all modern people around the world descend from a very small population around 100,000 years ago.
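As a quick check on that arithmetic, here is a minimal Python sketch that doubles the ancestor count at each generation and adds them up. The function name is my own, made up for illustration, and it assumes that every ancestor is a different person (no cousin marriages anywhere in the tree).

```python
def ancestors_per_generation(generations_back):
    """Theoretical number of direct ancestors at each generation back.

    Assumes every ancestor is a distinct person (no cousin marriages).
    """
    return [2 ** n for n in range(1, generations_back + 1)]


counts = ancestors_per_generation(8)
total = sum(counts)

print(counts)  # [2, 4, 8, 16, 32, 64, 128, 256]
print(total)   # 510
```

Each extra generation doubles the last figure, which is why the tree explodes so quickly.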

So before considering what DNA can do for genealogy, we need to consider which ancestors matter to us.  Do you just want to know who your biological parents, or grandparents were?  Do you want to know the names, places and social positions of your ancestors over centuries?  Do you want to know which parts of the world that your ancestors lived 500 years ago?  Do you want to know how some of your prehistoric ancestors moved across the globe, thousands of years ago?  Maybe you want to know everything.

Let's now turn to genetics for genealogy, and how DNA tests can answer some of these questions.

There are two main types of DNA tests for ancestry, although they are often incorporated together by commercial companies:

  1. The haplogroups, the Y-DNA and mt-DNA
  2. Autosomal DNA

The Haplogroups

The haplogroups are chains, or markers, that are carried on one of only two strict lines of descent.  They do not apply to your entire ancestry - just two lines.  As we saw above, we have 256 direct ancestors in just the eighth generation back (unless any of their descendants reproduced together).  Our haplogroups came from only two of them.  Your haplogroup does not define you.  Yet, it's quite odd, because very quickly, many genetic genealogists do relate to them, rather like a hereditary football club.  They do become an identity, but only if you enthuse over them.

The Y or paternal haplogroup, follows the strict paternal line.  From father to son.  Women do not have a Y chromosome, so cannot pass it on.  It has to come from the biological father.  However, within this constraint, Y-DNA is particularly useful to genealogists.  It mutates often, both as STRs and less often, as SNPs (snips).  Because of these frequent mutations, it is useful for tracing shared descent with others.  It can also be aligned with surname studies.  The champion commercial DNA company for Y-DNA research, is Family Tree DNA.

The mt or mitochondrial (maternal) haplogroup, follows the strict maternal line.  From mother to children.  Both sons and daughters inherit their mt-DNA haplogroup from their biological mother.  However, only the daughters can pass it down.  There are two drawbacks to mt-DNA for genealogy.  1) The surname frequently changes, traditionally nearly every generation through marriage. 2) It doesn't mutate as frequently as the STRs of Y-DNA. It is still a useful tool, and can prove descent through the maternal line.  It can also still be used for studies of much deeper, ancient ancestry.

Autosomal DNA

This is the bulk of your DNA.  All of the snips (SNPs), that make up who you are genetically.  You receive approximately 50% from each parent, 25% from each grandparent, 12.5% from each great grandparent.  This subdivision cannot go on forever, and indeed, once you go back much more than six generations, the approximations start to deviate, so that you may have no snips at all from a particular line that joined your family tree over 250 years ago.
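The halving can be sketched in a few lines of Python. This is only the average expectation, and the function name is my own. Because of recombination, the real share from any one distant ancestor can be higher, lower, or nothing at all.

```python
def expected_share_percent(generations_back):
    """Average autosomal share (in %) from one ancestor n generations back.

    1 = a parent, 2 = a grandparent, 3 = a great grandparent, and so on.
    This is the simple halving average, not what any one person inherits.
    """
    return 100 / 2 ** generations_back


for n in range(1, 9):
    print(n, expected_share_percent(n))

# 1 50.0
# 2 25.0
# 3 12.5
# 4 6.25
# 5 3.125
# 6 1.5625
# 7 0.78125
# 8 0.390625
```

By six generations back the average share from one ancestor is already under 2%, which is why whole lines can drop out.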

The problem with autosomal DNA is that it can be such a mess.  It recombines randomly with every generation.  Therefore, it is much harder to track ancestry in the same way, that we can with the haplogroups.

So how can they be applied for genealogy?

Biological descent

Not everyone knows who their biological parents were, or where they came from.  This is the first use of DNA testing.  It can be used to find, test, or prove recent descent.  The first hurdle of genealogy.  Both haplogroup evidence, and autosomal evidence can be used to prove or determine relationship.

Cousins

Many genetic genealogists, use DNA to find distant, and sometimes not so distant cousins.  The hope is that they can link trees, share knowledge and research, perhaps copies of artefacts.  Therefore an awful lot of genetic genealogy is about tracing genetic relatives, and establishing common ancestry.

There are two main tools:

  • Haplogroup Projects.  The Y haplogroup is favoured for its frequent STR mutations, and also for its link to surnames.  Family Tree DNA projects track the STR and SNP data of their members, tracking families, relationships, and known mutations.  Project administrators at FTDNA can predict relationship to other members in the project.  Your Y cousins.
  • Shared segments.  Autosomal DNA can be used for finding distant cousins.  23andMe for example, have Relative Finder.  Alternatively customers of any commercial DNA company that gives them access to their raw data, can upload that data to GEDmatch.  At GEDmatch, they can search for other kits, looking for lengths of shared segments (measured in cM - centimorgans) on the autosomes or X chromosomes.  Longer or more numerous shared segments can be used to indicate shared ancestry.

It is important to understand, that this is not about directly tracing ancestry.  It is only about establishing shared biological ancestry, with other researchers, with which you may be able to share resources.  In the old days of genealogy, we would find distantly related researchers by browsing through annually printed surname interest directories.  Here, the same thing is happening, but we are finding people by comparing DNA.

Ancestry from Autosomes

Most commercial DNA companies providing ancestry information, now use their own proprietary calculators to look at the autosomal DNA of their customers for patterns that they can relate to a number of reference populations.  23andMe for example, uses Ancestry Composition to determine in what parts of the world the ancestors of their customers lived 500 years ago.  From this, they predict ancestry in percentages.

However, it is very much a developing art.  The problem is that genes have been randomly mixing and moving around ever since prehistory.  The customers of these DNA companies want hard facts.  They want their ancestry accurately pinpointed down to modern or ancient nation-states, or to historical populations such as the Vikings or Huns.  Ancestral DNA companies are under pressure to provide this deep ancestry.  However, can they?  Ancestral analysis of DNA can be very enlightening.  It can provide some surprises within a family history.  However, its accuracy is exaggerated.  It is increasingly successful at predicting ancestry from a particular corner or end of a particular continent.  But it cannot for example, tell French, British, and German ancestry apart with any high accuracy.  It can recognise some populations better than others.  It cannot tell anyone if they had Viking ancestry.

Ancient Ancestry

This is a particular value of the haplogroups.  As we accumulate more and more data on more mutations, as we expand the recorded database, and as we relate that to more ancient DNA extracted from referenced and dated ancient human remains, so we will be able to better explore the population genetics not only in history, but deep into prehistory.

However, it is also becoming increasingly realised, that patterns of ancient admixture can also be detected within the autosomes.  Although autosomal DNA ancestry calculators claim to reveal relatively recent admixtures over the past 500 years, it is becoming clear that these are being confused by much older patterns of admixture.  These patterns can now be explored and probed on a number of GEDmatch programs.  People can compare their DNA with the kits from ancient DNA, or predict just how much of their ancestry was likely "Western Hunter-Gatherer", or "Early Neolithic Farmer".

In addition, more DNA companies are now measuring for much more ancient admixture with archaic populations such as the Neanderthals.

Conclusion

Genetic Genealogy is fun, great fun.  It is not however, a quick and easy replacement for traditional genealogy.  Unless you get lucky with some comparative Y-DNA in a project, it is not going to directly tell you the names or social status of any ancestors.  It can give you a phylogenetic tree, but not any kind of family tree that you can bore other family members with.

Genetic genealogy can provide some tools to some researchers.  It can test biological relationship.  It can be used to predict some of your ancient history.  For most researchers, particularly those that are able to interview many local family members, search local graveyards, access archives and records - it has little or no value to the pursuit of collecting ancestors.

I personally love to explore my genetic genealogy. But it is documentary research that provides the names.  Genetic genealogy for myself, is more about the long and ancient journey.

Autosomal DNA Tests for Genealogy

First a disclaimer.  I'm very new to the whole world of genetic genealogy.  I'm not new however, to traditional genealogy, and I do have a pretty good amateur understanding of relevant archaeological and anthropological discussions over the past fifty years.  The following is not meant as a critique of genetic genealogy, so much as a review, or my experience, of ancestry composition based on autosomal DNA analysis.

Let's start with my paper trail.

Traditional Genealogy

I am English by ethnicity, British by nationality, and a subject of Queen Elizabeth II (often now referred to as a UK Citizen).

My paper recorded ancestry consists of the genealogical records of:

  • Generation 1 has 1 individual. (100.00%)
  • Generation 2 has 2 individuals. (100.00%)
  • Generation 3 has 4 individuals. (100.00%)
  • Generation 4 has 8 individuals. (100.00%)
  • Generation 5 has 16 individuals. (100.00%)
  • Generation 6 has 29 individuals. (90.62%)
  • Generation 7 has 49 individuals. (76.56%)
  • Generation 8 has 35 individuals. (27.34%)
  • Generation 9 has 24 individuals. (10.16%)
  • Generation 10 has 10 individuals. (2.34%)
  • Generation 11 has 4 individuals. (0.39%)
  • Total ancestors in generations 2 to 11 is 181. (9.04%)

All 181 ancestors, reaching back to the 1690s, appear to be English born, of English ethnicity, with English surnames.  The majority of them (100% on my mother's side, and 81% on my father's side) were East Anglian, with the vast majority of that percentage being born in the county of Norfolk.  Religions recorded or indicated were CofE Anglican or non-conformist Christian.  No sign of any Catholicism, Islam, or Judaism.

Therefore it would look pretty likely, that I can claim English heritage, wouldn't you agree?

Genetic Genealogy and Ancestry Prediction

There are three aspects or avenues of inquiry, available for genetic genealogy.  First of all, the two sex haplogroups; the y-DNA, and the mt-DNA. These two "signals" are referred to as haplogroups.

  1. The y-DNA.  This follows the Y chromosome.  It is only carried by men.  It is passed along the paternal line, and only by that line, from grandfather, down to father, down to son, until the line is broken.  What a lot of people do often misunderstand, is that it does not represent 50% of your ancestry.  It does not represent all of your biological father's ancestry.  For example, his mother's father, and her brothers, although on your father's side, would most likely carry a different y-DNA haplogroup.  It only comes down an uninterrupted strictly paternal line.  Even at Generation 7 (g.g.g.g grandparents) above, it would have been carried by one out of my sixty four biological ancestors at that generation.  The other thirty one g.g.g.g grandfathers for that generation may have carried different Y haplogroups.
  2. The mt-DNA.  Although a very different type of DNA, this one works as the opposite sex haplogroup.  It is a signal that is passed down the strictly maternal line, from grandmother, to mother, to her children.  Yes, we men do inherit our mother's mt-DNA, but we can't pass it down.  Only our sisters can.
  3. The au-DNA, better known as Autosomal DNA.  Whereas the former two sex haplogroups are handy, because we can measure their mutations, and track their formation and movement across thousands of years, au-DNA really is the stuff that we are made of - all of the SNPs on our chromosomes that personalise us within the human genome.  We inherit our au-DNA from all of our recent ancestors.  Roughly 50% from our biological mother, and 50% from our biological father.  Equally, we could say on average, 25% from each grandparent, or 12.5% from each great grandparent.  However, it is messy.  At every reproduction (meiosis), it gets messed up by recombination.  Not only that, but go back much more than six generations, and it becomes more and more likely that you can lose entire lineages.  You can have no surviving trace of any DNA from for example, a particular g.g.g.g.g grandparent.
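To show how narrow the Y line is, here is a small Python sketch using the generation numbering from my tree above (Generation 1 is me, Generation 2 my parents). The function name is my own, and it assumes a full tree with no duplicated ancestors.

```python
def ancestors_in_generation(generation):
    """Theoretical ancestors in a generation, where generation 1 is yourself."""
    return 2 ** (generation - 1)


gen7_ancestors = ancestors_in_generation(7)      # g.g.g.g grandparents
gen7_grandfathers = gen7_ancestors // 2          # half of them are male
other_grandfathers = gen7_grandfathers - 1       # all except the Y line one
y_line_fraction = 1 / gen7_ancestors             # share of that generation on the Y line

print(gen7_ancestors)      # 64
print(other_grandfathers)  # 31
print(y_line_fraction)     # 0.015625
```

The same sum applies to the mt-DNA line: one ancestor out of sixty four at Generation 7.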

Autosomal DNA is what makes us individuals, gives us our hereditary traits.  It is passed down from many ancestors, via our parents.  However, the sex haplogroups are of interest because they can be traced across the globe, and the millennia.  As we gain more and more data - both from living populations, and ancient DNA from archaeological finds, so we will be able to track the STR and SNP mutation data more precisely.

However, what about poor old messed up autosomal DNA?  It represents our entire biological heritage over many generations. It is what we are. However, making sense of it is less easy, less precise.  Genetic genealogists are making progress, but it is far less of a precise science than either of the haplogroups.  They use calculators, that measure the segments of DNA across the chromosomes, looking for patterns that they recognise from a number of known reference populations.  From that, these calculators predict an ancestry.  Exactly what and when that ancestry refers to, does seem to vary from one calculator to another.  There is an argument that the precision can be improved if you also test close known relatives including at least one parent.  The results can then be phased.  I'm actually waiting for the results for my mother, so that I can see my own au-DNA ancestry results phased and corrected.

So let's have a bit of fun, and see what some of the calculators suggest for my autosomal DNA, at least before any phasing with my mother's DNA.  What do they make of my 100% English paper ancestry?

23andMe.com Ancestry Composition Standard Mode

99.9% European.

Broken into:

83% NW European

17% Broadly (unassigned) European

I think that's pretty cool.  As I'm getting to know au-DNA predictions, I'm learning to appreciate it when they get the right continent, and the right corner of that continent.  That is more than they could do a decade or two ago.  The prediction is correct, I am a NW European.  I'm not a West African, a South Asian, or an East Siberian.

23andMe.com Ancestry Composition Speculative Mode

100% European

Broken into:

94% NW European

3% S European

3% Broadly (unassigned) European.

Whoa, where did that South European come from?  It could just be a stray incorrectly identified signal, or it could be telling me that one of my ancestors, maybe around Generation 6, was from down south!  Let's break down the prediction further.  First, the NW European:

32% British & Irish

27% French & German

7% Scandinavian

But surely I should be 100% British & Irish?  Not only 32%.  I have my own ideas about this.  I think that although 23andMe claims that Ancestry Composition only represents the ancestry of the past 300 to 500 years (the so-called migration period, as sold to USA customers), that it gets confused by earlier migrations across their reference populations, including those during the early medieval period, and perhaps even some of those during late prehistory.  I've noticed that across Ireland and Britain, the further to the east, the more diluted the 23andMe British & Irish assignment.  People of solid Irish ancestry get between 85% and 98% British & Irish.  My East Anglian results, mixed between British & Irish, French & German, and Scandinavian, are actually rather more like those received by Dutch customers of 23andMe.

As for that Southern European prediction, how does that break down?

0.5% Iberian

2.4% Broadly (unassigned) South European.

Which if taken seriously, might suggest that I have an unknown Spanish or Portuguese ancestor around Generation 6.  If I did take it seriously that is.  I wonder what my mother's test will reveal?
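Where does "around Generation 6" come from?  Here is a rough Python sketch of the reasoning.  It assumes one single ancestor who was entirely South European, and it assumes that I inherited exactly the average share from that ancestor, which real recombination does not guarantee.  The function name is my own, and the generation numbering is the one from my tree above (Generation 1 is me).

```python
import math


def nearest_generation(percent):
    """Generation whose single-ancestor average share is closest to percent.

    Generation 1 is yourself, generation 2 your parents, and so on.
    Closeness is judged on a log2 scale, because the average share
    halves with every generation.
    """
    generations_back = round(math.log2(100 / percent))
    return generations_back + 1


# 0.5% Iberian plus 2.4% broadly South European, from the result above.
south_european = 0.5 + 2.4

print(nearest_generation(south_european))  # 6
```

Generation 6 is my g.g.g grandparents, thirty two people who on average each contributed a little over 3%.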

DNA.Land.com Ancestry Composition

This is a third party site, that you can upload your 23andMe V4 raw data to, and see what their calculators predict for your ancestry.  It has recently had its ancestry composition revised.  What did that make of my 100% English au-DNA?

West Eurasian 100%.

I like that designation, the amateur anthropologist in me prefers that broad designation over "European".  Broken down:

77% North/Central European

19% South European

2.4% Finnish

1.3% unassigned.

What?  Why not 100% North/Central European?  Finnish?  Did some early medieval Scandinavian settlers of East Anglia bring it?  Or is it a false signal?  Misidentified au-DNA?

That darned South European kicked in again.  I'm here looking at a biological cuckoo NPE (non-parental event) at around Generation 5 or even more recent!  Did a great grandmother secretly have a South European lover?  But this South European breaks down further:

13% Balkan

6% Italian.

Oh my goodness, whereas 23andMe speculative mode suggested SW Europe - this one suggests SE Europe!  Do I have a secret Albanian great grandfather?  Or is it all nonsense?

WeGene.com

This is a cracking new third party DNA analyser.  It is based in China, and its predictors appear to calculate mainly for a Chinese market.  It not only predicts your ancestry composition, but also your two sex haplogroups, and lots of traits and health predictions to complement those of 23andMe.  It even tries to predict your genetic disposition to sexuality!

It will allow you to send your 23andMe V4 raw data direct to its own calculators.  However, at the moment the website is almost entirely in Chinese (Mandarin?).  There are two options.  1) At the bottom of the webpages is a hyperlink to English, which gives, in English, a basic ancestry composition, and your haplogroups.  It does not include English versions of the health and trait results.  2) Use an online translator, such as the one built into the Google Chrome browser.  It actually serves pretty well.

On sex haplogroups they give my Y-DNA as

L1.  Not bad, but they didn't make it to L1b or L-M317.

My mtDNA?

H6a1a8.  Very good.  Better than 23andMe's H6a1, and the same as the mthap program.

But this is about au-DNA, how did they do, what did they make of my 100% English ancestry?

81% French

19% English/Briton

Now, that sounds pretty awful, but on closer inspection, I'm impressed.  No South European great grandfather.  Okay, so most of my DNA has been placed on the wrong side of the Channel.  However, I know that French and English DNA is actually very close.  Recent surveys even suggest that the English have inherited a lot of common ancestry with the French during unknown migration late in prehistory.  So again - they very much got the right corner of the right Continent.  Well done WeGene.

GEDmatch.com Eurogenes K13

GEDmatch is a website that you can upload raw data not only from 23andMe, but from a range of testers, and from V3 chips as well as V4.  It hosts a number of tools and predictors - some Open Source.  Some of these predictors are for Admixture or ancestry composition.  They measure your ancestry in terms of distance from known reference populations.  The lower the number, the closer you are to their reference.  They use calculators known as oracles to predict ancestry, including mixed ancestry or admixture.
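As a rough illustration of what "distance from a reference population" can mean, here is a minimal Python sketch.  It assumes that the oracle distance is a plain Euclidean distance between a tester's admixture component percentages and the average percentages of each reference population.  GEDmatch does not spell this out in anything quoted here, so treat it as an assumption.  The population names and component values are made up for the example, and are not real Eurogenes figures.

```python
import math

# Hypothetical component percentages (NOT real Eurogenes values).
# Each list holds one population's average share of three made-up components.
REFERENCES = {
    "Pop_A": [50.0, 30.0, 20.0],
    "Pop_B": [40.0, 40.0, 20.0],
    "Pop_C": [20.0, 30.0, 50.0],
}


def distance(a, b):
    """Euclidean distance between two component vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_references(sample, references):
    """Return (name, distance) pairs, closest reference first."""
    scored = [(name, distance(sample, vec)) for name, vec in references.items()]
    return sorted(scored, key=lambda pair: pair[1])


# A hypothetical tester, expressed in the same three components.
sample = [47.0, 33.0, 20.0]
for rank, (name, d) in enumerate(rank_references(sample, REFERENCES), start=1):
    print(rank, name, round(d, 2))
```

The lower the number, the closer the fit, which is all that a single population list is reporting.  A small distance says that your component mix resembles a reference average.  It does not say that your ancestors came from that population.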

The oracles on the Eurogenes K13 and K15 calculator models have a good reputation at working with West Eurasian ancestry.  So how does K13 first, score my 100% English ancestry?

On Single Population Sharing, it rates my DNA against the closest references.  In order of closest to not so close, the top five are:

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97

I think that's a cracking result.  Okay, it thinks that I'm closer to South Dutch, than I am to SE English, but so close - and my East Anglian ancestry most likely does include a lot of admixture from the Low Countries from the early medieval period.  I really like Eurogenes K13.

Okay, let's now use the Oracle 4 option, to suggest admixture.  First on three populations admixing to create my DNA, what comes closest?

50% Southeast_English +25% Spanish_Valencia +25% Swedish @ 2.087456

Well that's interesting!  The SE English hit the net.  The Swedish?  Could be ancient Scandinavian admixture - but the Iberian prediction has reemerged!

On four populations admixing?

1 Southeast_English + Southeast_English + Spanish_Valencia + Swedish @ 2.087456
2 Southeast_English + Southeast_English + Spanish_Murcia + Swedish @ 2.147237
3 Norwegian + Portuguese + Southeast_English + Southeast_English @ 2.216714
4 Danish + Portuguese + Southeast_English + Southeast_English @ 2.225334
5 Portuguese + Southeast_English + Southeast_English + Swedish @ 2.230991

Oh my goodness.  K13 agrees with 23andMe AC, that I have an Iberian link.  I'm now really starting to wonder.
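Notice that the three population result and the top four population result share exactly the same distance (2.087456).  "50% Southeast_English + 25% + 25%" is simply Southeast_English counted twice among four equal 25% slots.  A minimal Python sketch of that kind of search follows.  It assumes the oracle averages four reference vectors (repeats allowed) and ranks the blends by Euclidean distance.  That is my guess at the method, not a documented one, and the populations and numbers are hypothetical.

```python
import math
from itertools import combinations_with_replacement

# Hypothetical component averages (NOT real Eurogenes values).
REFERENCES = {
    "Pop_A": [50.0, 30.0, 20.0],
    "Pop_B": [20.0, 60.0, 20.0],
    "Pop_C": [30.0, 20.0, 50.0],
}


def distance(a, b):
    """Euclidean distance between two component vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mix(names, references):
    """Average the vectors of the named populations (equal shares each)."""
    vectors = [references[name] for name in names]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def best_four_way(sample, references, top=5):
    """Try every set of four populations, repeats allowed, closest first."""
    results = []
    for combo in combinations_with_replacement(sorted(references), 4):
        results.append((distance(sample, mix(combo, references)), combo))
    return sorted(results)[:top]


# Hypothetical tester built as 50% Pop_A + 25% Pop_B + 25% Pop_C.
sample = [37.5, 35.0, 27.5]
for d, combo in best_four_way(sample, REFERENCES):
    print(" + ".join(combo), "@", round(d, 4))
```

Because each slot is worth 25%, a search like this can only ever describe a tester in quarters.  A small real signal, or plain noise, has to be rounded up to a whole 25% "grandparent" from somewhere.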

Let's finish off by trying K15 on my 100% English ancestry:

GEDmatch.com Eurogenes EU test V2 K15


Using Oracle for single population first, the top five closest:

1 Southwest_English 2.7
2 South_Dutch 3.98
3 Southeast_English 4.33
4 Irish 6.23
5 West_German 6.25

Okay, I'm SE English, not SW English, but pretty impressive again.

Using the oracle 4 for three population admixture, what mix comes closest to my auDNA?

50% Southwest_English +25% Spanish_Castilla_Y_Leon +25% West_Norwegian @ 1.080952

That Iberian back again!

Top five mix ups of populations closest to me?

1 Southwest_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.080952
2 Irish + North_Dutch + Southwest_English + Spanish_Galicia @ 1.111268
3 North_Dutch + Southwest_English + Spanish_Galicia + West_Scottish @ 1.282744
4 Southeast_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.295819
5 North_Dutch + North_Dutch + Southwest_English + Spanish_Castilla_Y_Leon @ 1.304939

I can't help preferring the K13 results to the EU test V2 K15 - simply because K13 places me closer to its SE English reference than to its SW English one.

Conclusions

If anyone ever bothers reading this far too lengthy post, I hope that I have imparted the following lessons:

  • Don't expect DNA Ancestry tests to pinpoint an actual country of ancestry.  They're nowhere near that good yet.  The populations of West Eurasia, and elsewhere, are actually all mixed up, or share a lot of recent admixture.  In addition, many European nation-states are quite recent inventions.  I've seen the borders of Europe change in my short lifetime.
  • Don't expect precision.  If for example, you are an American, and a 23andMe AC test suggests only 32% British & Irish, then you could actually have 100% English ancestry over the past 300 years!  We're so mixed up, that these tests are struggling to part and identify us by nationality.
  • If you are willing to share your raw data (there are privacy issues), then have fun trying out all of these third party calculators.  It's a lot of fun as you can see.  They rarely agree.  There are other tools on GEDmatch for example, where you can compare DNA along with .gedcom genealogical files with other users - and look for shared segments on the chromosomes.  You can also compare your DNA to that of ancient populations.
  • Treat au-DNA differently to haplogroup results.  au-DNA is very interesting, and represents so much of our ancestry, if we could just sort some of the mess out.  You can partially do this by phasing your results with those of close relatives.  It is worthwhile phasing with at least one biological parent, if you can.  However, haplogroup results, through their mutations, provide incredible stories over much longer periods - thousands of years.  A different kind of genealogy.  As we gather more data, and reference it also to ancient-DNA, so it will tell us more and more about two lines of descent.  Perhaps even into historical times.

The Thackers of Norfolk

The above photograph is of my Great Great Grandmother Sarah Anne Elizabeth Thacker (nee Daynes), of Rackheath, Norfolk.

I'm writing this for someone with shared chromosome segments that I've met on Gedmatch - my first potential relative, found through DNA comparisons.  She's in the USA, and has Thackers - which may be a link.  It's not that common a surname here, and very East English.  It has been suggested that the name is an Anglo-Danish variation of Thatcher.  I'm not so sure that it is of any Norse origin.  However, the strongest concentration of the surname in Britain today, is in the Danelaw county of Lincolnshire.  My Thackers though belong to a secondary cluster - also in the Danelaw, the region of East Anglia.

Thacker Family

I'm going to propose an estimated birth year of circa 1762 for my G.G.G.G.G grandfather John Thacker.  He first appears on my genealogical record on 2nd February 1787, when he married Mary Clarke at Rackheath in Norfolk.  However, tragedy struck.  Mary died a few years later in 1740, and was buried in the neighbouring parish of Salhouse.

John wasn't a widower for long.  He married my ancestor Ann Hewitt on the 26th April 1791 at Salhouse.

I have found records for three of their children.  Thomas Thacker was born in 1793, my G.G.G.G grandfather William Thacker was born 1796 at Salhouse, and his little sister Mary Thacker was born in 1801.

William Thacker married my ancestor Catharine Hagon at Rackheath on 19th May 1819.  Catharine gave birth to at least three children - William Thacker in 1820, my G.G.G grandmother Susan Thacker in 1823, and Thomas Thacker in 1825.  Their children were all later baptised at Salhouse Particular Baptist chapel.  I have very few Baptist ancestors.

William was an agricultural labourer, a farm worker by trade, as were the majority of my 18th and 19th century male ancestors.

Catharine must have died sometime between 1825 and 1833.

On 1st December 1833, William Thacker, now a widower, married Maria Cliborne at Rackheath.

Maria gave him at least three more children - George Thacker, born 1834, John Thacker, born 1837, and Ann Thacker, born 1841.

William Thacker died in 1874.

My G.G.G grandmother Susan Thacker, gave birth to my G.G grandfather, George Thacker in 1847 at Rackheath.  The father was not recorded.

George grew up to be an agricultural labourer and farm worker.  He married my ancestor Sarah Anne Elizabeth Daines (Photo above) in 1870.  Sarah (one of my mtDNA line) gave birth to at least ten children: George Thacker 1871, Ellen Thacker 1878, William Thacker 1876, my great grandmother Caroline Drusilla Thacker 1878, Catherine Thacker 1880, Thomas Thacker 1883, Rubin Thacker 1885, Walter Thacker 1887, Herbert Thacker 1891, and Rose Thacker 1893.

There is a family story about Sarah's strict parental behaviour.  She was known as "Thacker by name, thacker by nature" referring to "thacking" - to hit or punch hard.

My mt-DNA great grandmother Caroline married my great grandfather Samuel John Tammas-Tovell.

The Iberian Connection

The above photo was taken at A Capela dos Ossos (the bone chapel) in Évora, Portugal. The entire chapel is covered with human bones.  Every wall and pillar is decorated with skulls and bones.  On another wall hang the mummified remains of a man and child, said to have been cursed. There is a sign at the entrance of the chapel which states "Nós ossos que aqui estamos, pelos vossos esperamos" (Our bones here, await yours).

Genetic Genealogy

I was a sceptic of genetic genealogy, I'll admit it.  Now I'm hooked.  Not because I feel that it has been a way of hooking up with distant cousins, that can help me extend my family tree.  That's not the way that I've used it so far.  Instead, it has provided a very different kind of information, that helps me understand who I am, and how I can link my ancestry to known heritage.

I might not have been so hooked, but I've had so many surprises with my 23andMe results.  If my results had been perhaps, dire and boring, then maybe I would have retreated to traditional genealogy and regarded the technique as predictable and uninteresting.  However, what ancestry related surprises did I have?

  • I have a very rare Y haplogroup for NW Europe.  So far predicted to L1b M317.  It will be shared by my brother, my son, one cousin (and his son, and grandson).  Today I sent away a further FTDNA Y111 swab test.  The L haplogroup is mainly concentrated in Southern and Western Asia, from Afghanistan down to Southern India.  My L1b M317 sub clade is concentrated in Western Asia, including Eastern Turkey, Armenia, Georgia, Azerbaijan, the South Caucasus, and Western Iran. A faint trace of it along the length of the Med in Southern Europe, and across Italy, and a slight cluster in central Europe - which apparently, I don't belong to.
  • Autosome Ancestry composition by 23andMe, gave me a very low percentage of "British & Irish", and high percentages of "French & German" and "Scandinavian".  I've explored the possibility that this could reflect early medieval admixture from across the North Sea.  I've looked at the typical Ancestry Compositions of people with a strong recorded English ancestry, and compared them to the results from people with strong Irish ancestry.  That SE English people typically sit somewhere between the Irish, and typical Dutch in Ancestry Composition reinforces my view that this is the case.
  • My mtDNA was H6a1.  Not the most exciting haplogroup, but not the most boring either.  It allows me to relate to the latest evidence for Eurasian Steppe admixture into Western Europe during the Early Bronze Age.

A Southern European Enigma

I captured the above photo at Cabo Espichel, Portugal.

There was a fourth, further surprise in my 23andme results.  It lay in the autosome.  23andMe AC (Ancestry Composition) on speculative mode, suggested 2.4% Southern Europe, including a prediction of 0.5% Iberian ancestry.  On speculative mode again, it falls on five pairs of chromosomes - but never on both sides.  On standard mode, 0.1% remains, just on one side of pair 21.  This suggests that all of it comes from just one of my parents.

I might think that this was just "background noise", an error in AC.  However, it keeps popping up.  Indeed when I upload my raw data to the program at DNA.land, they predict only 80% North/Central European, and a whopping 15% South European.  It doesn't stop there.  On GEDMATCH, the Eurogenes calculators keep suggesting Iberian or South European admixture on their mixed population oracles.  Eurogenes K9 for example, gives me 61% North European, 29% Mediterranean, and 6% Caucasus.

Let's just refer back to my recorded paper ancestry.  I have 190 recorded ancestors, all in England, with English surnames.  No sign of any Roman Catholicism.  I have all sixteen of Generation 6 (G.G grandparents) named.  All born and named English.  No sign of any South European even in the 1,490 people on the entire family tree for my kids.

However, I think that all of the autosome ancestry calculators could be telling me a truth, that I can't see in my known family tree.  If I have a South European ancestor somewhere, whether Iberian or not, then either a) I have not yet found them, or b) they were the biological ancestor of an NPE (non-parental event), a cuckoo.  I have 3 out of my 32 Generation 7 ancestors unnamed - all absent fathers.  I have 15 missing ancestors in Generation 8.  Above that, the representation really starts to decline, although I have some ancestors named up to Generation 11.  Could a South European be in there?  23andMe in speculative mode suggested 2.4%.  That would seem "average" for an ancestor in Generations 7 or 8 (3 to 4 x G grandparent level): the expected share is about 3.1% from a 3 x G grandparent, and about 1.6% from a 4 x G grandparent.  Of course from around that point, "averages" become pointless, and subject to a randomness that can delete entire lineages further up from any surviving DNA.  Nonetheless, I could have a South European from around that period - either one of the 18 "missing" ancestors, or an NPE cuckoo.
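The arithmetic behind that "average" is just repeated halving: each step back up the tree halves the expected autosomal share from any one ancestor.  Here is a minimal Python sketch.  It assumes the generation numbering that the figures above imply (sixteen ancestors in Generation 6, thirty-two in Generation 7), which puts the tester at Generation 2.  The function names are only for illustration.  Real inheritance scatters around these expectations because of recombination, so the figures are averages, not promises.

```python
import math


def expected_share(steps_back):
    """Expected autosomal fraction from one ancestor, steps_back generations up."""
    return 0.5 ** steps_back


def bracketing_steps(percent):
    """Return the two whole-generation depths that an observed percentage sits between."""
    steps = -math.log2(percent / 100.0)
    return math.floor(steps), math.ceil(steps)


# Assumption: the tester is Generation 2, so Generation = steps_back + 2.
for steps_back in range(1, 7):
    generation = steps_back + 2
    ancestors = 2 ** steps_back
    share = expected_share(steps_back)
    print(f"Generation {generation}: {ancestors} ancestors, {share:.2%} expected from each")

# 2.4% sits between 5 and 6 steps back (Generations 7 and 8).
print(bracketing_steps(2.4))
# 19% sits between 2 and 3 steps back (grandparent and great grandparent).
print(bracketing_steps(19))
```

On these averages, a 2.4% signal points to a single ancestor at around the 3 to 4 x G grandparent level, while a 19% signal would need someone as close as a great grandparent or grandparent.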

I'm commissioning a 23andme test for my mother.  Three reasons.  1) She won't be here for ever.  Recording her genome feels valuable and worthy.  2) I want to see how her very dense 100% recorded Norfolk ancestry projects on Ancestry Composition and on GEDMATCH.  3) I want to phase her results against mine.  It will tell me for example, where my "South European" DNA came from - which parent.  It will help me further understand my own genetic ancestry.

Exploring Gedmatch Eurogenes

The above grave is of my great great grandparents Robert and Ann Smith at Attleborough, Norfolk.

L1b Y-DNA News

First of all, it's looking good on the Y-Front.  My Y111 sample kit has arrived from FTDNA.  I also sent my 23andme V4 raw data to the administrator of the FTDNA Y Haplogroup.  He replied the next day "the raw data confirms that you are positive for M317 and negative for downstream SNPs M349 and M274. A very rare result for a NW European. It will be interesting to see who are your closest matches at 67 and 111 markers.".

So it doesn't look as though my L1b has anything to do with the M349 Rhine-Danube cluster.  I wonder where it comes from, how and when it got into an English ancestry?  It's starting to dawn on me just how rare it is in NW Europe.  European Y-Haplogroup maps and tables simply don't display or list it, because Y-DNA Hg L is not even considered a European Haplogroup, never mind on British Haplo-maps.  All of those R1b's and I2's.  Not an L in sight.  I can see that having an unusual haplogroup is a mixed blessing.  Sure it's interesting, but no one knows much about it, because there is so little data on it in Europe, and so little research.

I had my first case of disbelief of my L1b Y-DNA on an FTDNA surname project group.  I reported my Y haplogroup as reported by 23andme (using ISOGG 2009) as L2*.  The administrator retorted "It is NOT the 'L' haplogroup, instead, it is 'I'."  So I linked her a copy of my 23andMe Paternal line report.  This time she replied "Goodness gracious Paul. I administer many, many projects and yours is the first 'L'."  You see, it has problems.

Wouldn't it be just great if I found someone else descended from the Berkshire Brookers by their Y line, that had the same haplogroup?

Gedmatch Eurogene admixture results for an Englishman

GEDMATCH offers free tools for analysing the autosome DNA of your raw data, from 23andme or Ancestry.com.  One suite of tools that is useful for analysing population admixture, is Eurogenes.  As an English person, with strong paper English ancestry - including almost certainly early medieval admixture, I thought that I'd get a comparison out of the way.  See which "works" best for my known ancestry and likely heritage.  I'm trying oracles on my 23andMe V4 raw data, for 1. EU Test, 2. K13, and 3. V2 K15.
1. Eurogenes EU Test
Oracle

1 Cornish 4.6
2 English 5.01
3 NL 6.26
4 West_&_Central_German  6.92
5 Orcadian 7.02
6 IE 7.33
7 FR 7.51
8 Scottish 7.95
9 DK 9.39
10 NO 11.57

A bit strange that it sees me as first "Cornish".  I don't know where it got that reference from.  I have no known Cornish ancestry.  However, 2 and 3 are likely.  As a whole it's not a bad prediction, just that the ball landed a bit to the West.
What about mixed populations?  What are its favourite admixtures between two populations for me?

1   83.7% English  +  16.3% French_Basque  @  3.11
2   79% English  +  21% ES  @  3.17
3   63.7% English  +  36.3% FR  @  3.18
4   80.2% English  +  19.8% PT  @  3.5
5   51.8% FR  +  48.2% Scottish  @  3.54

Okay, not bad - it's given up on the Cornish.  However, it seems to point to France, Spain, and Portugal as a secondary source.  That is eerie, because 23andme threw up a speculative 2.4% South European including 0.5% Iberian.  I do wonder if I actually do have some unrecorded South European ancestry, even Iberian.

2. Eurogenes K13
Oracle

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97
6 French 7.63
7 North_Dutch 7.76
8 Danish 7.95
9 North_German 8.17
10 Irish 8.22

I like K13.  The Dutch may be there in admixture, and I know that they do often share some common patterns with SE English.  So I can excuse it making it to position 1.  Then in second place, the ball scores a goal.  Yes, I am SE English.  Most of the other suggestions could represent ancient admixture.

How about two population proposals?

1   65.6% Southeast_English  +  34.4% French  @  2.03
2   84.9% Southeast_English  +  15.1% North_Italian  @  2.05
3   63.5% Norwegian  +  36.5% Spanish_Valencia  @  2.06
4   69.7% North_Dutch  +  30.3% Spanish_Valencia  @  2.08
5   87.5% Southeast_English  +  12.5% Tuscan  @  2.09

It's got the SE English spot on, but all of these Iberians again!  Is it trying to tell me something?
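The two population proposals give uneven splits such as 65.6% + 34.4%, so the mixing fraction is evidently fitted rather than fixed.  Below is a minimal Python sketch of one way that could work.  For each pair of references it finds the blend fraction that brings the mix closest to the tester, using a closed-form least-squares step.  Both the least-squares step and the Euclidean distance are my assumptions, not anything GEDmatch states, and the populations and numbers are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical component averages (NOT real Eurogenes values).
REFERENCES = {
    "Pop_A": [50.0, 30.0, 20.0],
    "Pop_B": [20.0, 60.0, 20.0],
    "Pop_C": [30.0, 20.0, 50.0],
}


def distance(a, b):
    """Euclidean distance between two component vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_fraction(sample, p, q):
    """Fraction w in [0, 1] that makes w*p + (1-w)*q closest to sample."""
    diff = [a - b for a, b in zip(p, q)]
    denom = sum(x * x for x in diff)
    if denom == 0:
        return 1.0  # identical references, any fraction fits equally
    w = sum((s - b) * x for s, b, x in zip(sample, q, diff)) / denom
    return min(1.0, max(0.0, w))


def two_way_fits(sample, references, top=5):
    """Best-fitting blend for every pair of references, closest first."""
    fits = []
    for name_p, name_q in combinations(sorted(references), 2):
        p, q = references[name_p], references[name_q]
        w = best_fraction(sample, p, q)
        blend = [w * a + (1 - w) * b for a, b in zip(p, q)]
        fits.append((distance(sample, blend), name_p, w, name_q))
    return sorted(fits)[:top]


# Hypothetical tester built as 75% Pop_A + 25% Pop_B.
sample = [42.5, 37.5, 20.0]
for d, name_p, w, name_q in two_way_fits(sample, REFERENCES):
    print(f"{w:.1%} {name_p} + {1 - w:.1%} {name_q} @ {d:.2f}")
```

A search like this will always return a "best" pair, whether or not the tester has any real ancestry from the second population.  The minority partner only has to nudge the blend a little closer than the majority population manages alone.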

3. Eurogenes EU Test V2 K15
Oracle

1 Southwest_English 2.7
2 South_Dutch 3.98
3 Southeast_English 4.33
4 Irish 6.23
5 West_German 6.25
6 North_Dutch 6.79
7 West_Scottish 6.84
8 French 6.85
9 North_German 6.89
10 Danish 7.26

Very good, except again, a bit skewed to SW England.  However, to be fair, I do have some slightly westward ancestors in the Oxfordshire area.  The rest is spot on.
What does it offer as a hybrid?

1   73.9% Southwest_English  +  26.1% French  @  1.27
2   71.8% North_Dutch  +  28.2% Spanish_Cantabria  @  1.3
3   89.7% Southwest_English  +  10.3% North_Italian  @  1.35
4   91.6% Southwest_English  +  8.4% Tuscan  @  1.4
5   86.4% Southwest_English  +  13.6% Spanish_Galicia  @  1.43

Those Spanish again!  Goes for SW English over SE English as the primary ancestral population.

Out of these predictions, my gut feeling is that they are all good for single population match.  On two population mix, they all suggest Iberian minorities.  Either I have an undiscovered South European ancestor, or something else is going on.  Do other English get this?  I can't really pick a winner.