Ancient ancestry - K11 Ancients Common and rarer Alleles, and a fresh assessment

Emmanuel Benner - Prehistoric Man Hunting Bears

Above image by Emmanuel Benner the Younger [Public domain], via Wikimedia Commons

The new K11 Ancients Common and Rarer Alleles tests are being run by Dilawer Khan, creator of the Gedrosia stable of admixture calculators available on GEDMatch.com, and of the EurasianDNA.com website.  These tests use a new set of principles, based on using ADMIXTURE to produce more reliable ancient results.  I commissioned him to run my own 23andMe file through the tests, to produce the following results and PCAs.

PCA for Common Alleles (my position "Norfolk"):

PCA for Rarer Alleles (my position "Norfolk"):

The K11 Ancients Common Alleles results should reflect the older ancestry most accurately.  In summary, that gave me:

  1. 48.6% Neolithic Farmer
  2. 26.5% Copper Age Steppe Pastoralist
  3. 24.9% Western Hunter-Gatherer

Thank you Dilawer.

How have other tests seen similar admixture?

I previously commissioned David Wesolowski (Eurogenes stable on GEDMatch and of Eurogenes Blog) to run my raw file through his K7 Basal-rich test.  He produced the following results:

  1. 57.1% Villabruna-related
  2. 28.8% Basal-rich
  3. 14% Ancient North Eurasian.

These are two very different tests, of admixture between different sets of populations, of different time periods.  What I do find interesting is that the 14% of ANE (Ancient North Eurasian) relates quite favourably to what I understand its share to be within Yamna or Steppe pastoralist ancestry.  Dilawer gives me 26.5% Steppe.  I have previously heard that the Yamna were circa 50% ANE, with the remainder a mixture of other Western Eurasian Hunter-Gatherer groups, including Caucasus Hunter-Gatherers.  Half of 26.5% is about 13%, which is close to that 14%.
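
That comparison is only rough arithmetic, and it can be sketched in a few lines of Python.  The 50% figure is the hearsay estimate mentioned above, not a measured value, and the function name is invented for this illustration.

```python
def implied_ane(steppe_pct, ane_share_of_steppe=0.5):
    """Rough ANE percentage implied by a Steppe percentage.

    ane_share_of_steppe is an assumption (Yamna as circa 50% ANE).
    """
    return steppe_pct * ane_share_of_steppe


k11_steppe = 26.5   # K11 Ancients Common Alleles result
k7_ane = 14.0       # K7 Basal-rich result

implied = implied_ane(k11_steppe)
print(f"ANE implied by the K11 Steppe figure: {implied:.2f}%")
print(f"Gap to the K7 ANE figure: {abs(k7_ane - implied):.2f} points")
```

A gap of under one percentage point is well inside the noise that I would expect between two different calculators, so it should not be read as anything more than a loose agreement.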

The K11 Ancients test does suggest that I have a surprisingly high amount of ancestry from the Neolithic Farmers, who were in Europe prior to the arrival of the Steppe migrants around 4,900 years ago.  This is actually consistent with my other Ancient admixture test results.  The K7 Basal-rich test, for example, had given me 28.8% Basal.  The Basal Eurasians are a hypothetical "ghost" population that was among the founding admixture of the Neolithic Farmers, in a similar way that the ANE were among the founding admixture of the Steppe Pastoralists.  Again then, the two tests do tally reasonably well in determining where my personal percentages of ancient DNA originate.

Why I have such high percentages of Neolithic Farmer and Basal Eurasian, I do not know.  My DNA flavour is slightly extreme, atypical even for an English person, and more so for a Briton.  My recorded genealogy is all SE English, mainly East Anglian.  I would love to see the results of other East Anglians, as I suspect that compared to them, I am not such an extreme.  However, even if this were the case, it wouldn't explain why modern East Anglians would have lower Steppe, and more Neolithic, than West British, Scandinavians, or even ancient DNA from Anglo-Saxons.  Higher percentages of Neolithic ancestry today are usually found to the South, peaking in Sardinia, then Iberia.  A favoured explanation is that the SE English could have had a lot of input from the South, via the French during Norman and Medieval periods.  I'm not totally convinced - yet.

A third new ancient admixture test that I might use here is the MDLP Project Modern K11.  On GEDMatch Oracle, it proposes a number of genetic distances to ancient DNA samples:

1 British_Celtic @ 6.948432
2 Bell_Beaker_Germany @ 8.143357
3 Alberstedt_LN @ 8.426399
4 British_IronAge @ 9.027687
5 Halberstadt_LBA @ 10.273615
6 Bell_Beaker_Czech @ 12.190828
7 Hungary_BA @ 12.297826
8 Nordic_MN_B @ 12.959966
9 British_AngloSaxon @ 12.993559
10 Nordic_BA @ 13.170285

Using 4 populations approximation:
1 Bell_Beaker_Germany + Bell_Beaker_Germany + Corded_Ware_Germany + Hungary_CA @ 1.085814
2 BenzigerodeHeimburg_LN + BenzigerodeHeimburg_LN + Corded_Ware_Estonia + Hungary_CA @ 1.089547
3 Alberstedt_LN + Bell_Beaker_Germany + Corded_Ware_Germany + Hungary_CA @ 1.117882
4 Bell_Beaker_Germany + BenzigerodeHeimburg_LN + Hungary_CA + Srubnaya_LBA @ 1.149613
5 Bell_Beaker_Germany + British_IronAge + Hungary_CA + Karsdorf_LN @ 1.185312
6 Alberstedt_LN + BenzigerodeHeimburg_LN + Hungary_CA + Sintashta_MBA @ 1.226794
7 Nordic_BattleAxe + Hungary_BA + Hungary_CA + Karsdorf_LN @ 1.234930
8 Nordic_BattleAxe + BenzigerodeHeimburg_LN + Hungary_CA + Unetice_EBA @ 1.238376
9 Alberstedt_LN + Hungary_BA + Hungary_CA + Yamnaya_Samara_EBA @ 1.247371
10 Bell_Beaker_Germany + Hungary_CA + Nordic_LN + Srubnaya_LBA @ 1.268124

If I look at four population distances, then based on the samples available in the test, I'm looking pretty European Bell Beaker, with Corded Ware and Yamna appearing. My closest single population in the samples is a surprising British Celtic!  More samples from the European Neolithic might turn those results around.

Autosomal DNA Tests for Genealogy

First a disclaimer.  I'm very new to the whole world of genetic genealogy.  I'm not new however, to traditional genealogy, and I do have a pretty good amateur understanding of the relevant archaeological and anthropological discussions over the past fifty years.  The following is not meant as a critique of genetic genealogy, so much as a review of my experience of ancestry composition based on autosomal DNA analysis.

Let's start with my paper trail.

Traditional Genealogy

I am English by ethnicity, British by nationality, and a subject of Queen Elizabeth II (often now referred to as a UK Citizen).

My paper recorded ancestry consists of the genealogical records of:

  • Generation 1 has 1 individual. (100.00%)
  • Generation 2 has 2 individuals. (100.00%)
  • Generation 3 has 4 individuals. (100.00%)
  • Generation 4 has 8 individuals. (100.00%)
  • Generation 5 has 16 individuals. (100.00%)
  • Generation 6 has 29 individuals. (90.62%)
  • Generation 7 has 49 individuals. (76.56%)
  • Generation 8 has 35 individuals. (27.34%)
  • Generation 9 has 24 individuals. (10.16%)
  • Generation 10 has 10 individuals. (2.34%)
  • Generation 11 has 4 individuals. (0.39%)
  • Total ancestors in generations 2 to 11 is 181. (9.04%)
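
The percentages in that list are completeness figures: the number of ancestors found, divided by the number possible at that generation, which doubles each time.  A minimal sketch of the sum, counting myself as Generation 1 (the function name is made up for the example):

```python
def completeness(found, generation):
    """Percentage of the possible ancestors at a generation that are recorded.

    Generation 1 is myself, Generation 2 my parents, and so on,
    so the number of possible ancestors is 2 ** (generation - 1).
    """
    possible = 2 ** (generation - 1)
    return 100 * found / possible


# A few of the generations from the list above: (generation, ancestors found)
for generation, found in [(6, 29), (7, 49), (8, 35), (11, 4)]:
    print(f"Generation {generation}: {completeness(found, generation):.2f}%")
```

For these generations the results agree with the list above to within rounding.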

All 181 ancestors, reaching back to the 1690s, appear to be English born, of English ethnicity, with English surnames.  The majority of them (100% on my mother's side, and 81% on my father's side) were East Anglian, with the vast majority of those being born in the county of Norfolk.  Religions recorded or indicated were CofE Anglican or non-conformist Christian.  No sign of any Catholicism, Islam, or Judaism.

Therefore it would look pretty likely, that I can claim English heritage, wouldn't you agree?

Genetic Genealogy and Ancestry Prediction

There are three avenues of inquiry available for genetic genealogy.  The first two are the sex-linked lines, the y-DNA and the mt-DNA, whose "signals" are referred to as haplogroups.  The third is the autosomal DNA.

  1. The y-DNA.  This follows the Y chromosome.  It is only carried by men.  It is passed along the paternal line, and only by that line, from grandfather, down to father, down to son, until the line is broken.  What a lot of people misunderstand is that it does not represent 50% of your ancestry.  It does not represent all of your biological father's ancestry.  For example, his mother's father, and her brothers, although on your father's side, would most likely carry a different y-DNA haplogroup.  It only comes down an uninterrupted, strictly paternal line.  Even at Generation 7 (g.g.g.g grandparents) above, it would have been carried by just one out of my sixty four biological ancestors at that generation.  The other thirty one g.g.g.g grandfathers for that generation may have carried different Y haplogroups.
  2. The mt-DNA.  Although a very different type of DNA, this one works as the opposite sex haplogroup.  It is a signal that is passed down the strictly maternal line, from grandmother, to mother, to her children.  Yes, we men do inherit our mother's mt-DNA, but we can't pass it down.  Only our sisters can.
  3. The au-DNA, better known as Autosomal DNA.  Whereas the former two sex haplogroups are handy, because we can measure their mutations, and track their formation and movement across thousands of years, au-DNA really is the stuff that we are made of - all of the SNPs on our chromosomes that personalise us within the human genome.  We inherit our au-DNA from all of our recent ancestors.  Roughly 50% from our biological mother, and 50% from our biological father.  Equally, we could say on average, 25% from each grandparent, or 12.5% from each great grandparent.  However, it is messy.  At every reproduction (meiosis), it gets messed up by recombination.  Not only that, but go back much more than six generations, and it becomes more and more likely that you can lose entire lineages.  You can have no surviving trace of any DNA from for example, a particular g.g.g.g.g grandparent.
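
The numbers in the list above all come from the same doubling.  As a rough sketch (averages only, since recombination means the real share from any one ancestor varies, and the function names are invented here):

```python
def ancestors_at(generation):
    """Possible biological ancestors at a generation (Generation 1 = myself)."""
    return 2 ** (generation - 1)


def expected_autosomal_share(generation):
    """Average percentage of au-DNA expected from ONE ancestor at that generation."""
    return 100 / ancestors_at(generation)


for generation in range(2, 8):
    total = ancestors_at(generation)
    print(
        f"Generation {generation}: {total} ancestors, "
        f"about {expected_autosomal_share(generation):.4g}% each, "
        f"1 on the Y line, 1 on the mt line"
    )

# At Generation 7, half of the 64 ancestors are male, and only one of
# those 32 men sits on the strictly paternal line.
other_grandfathers = ancestors_at(7) // 2 - 1
print(f"Other g.g.g.g grandfathers off the Y line: {other_grandfathers}")
```

By Generation 7 the average share is already down to about 1.6% per ancestor, which is why whole lineages can drop out of the au-DNA a few generations further back.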

Autosomal DNA is what makes us individuals, gives us our hereditary traits.  It is passed down from many ancestors, via our parents.  However, the sex haplogroups are of interest because they can be traced across the globe, and the millennia.  As we gain more and more data - both from living populations, and ancient DNA from archaeological finds, so we will be able to track the STR and SNP mutation data more precisely.

However, what about poor old messed up autosomal DNA?  It represents our entire biological heritage over many generations.  It is what we are.  However, making sense of it is less easy, less precise.  Genetic genealogists are making progress, but it is far less of a precise science than either of the haplogroups.  They use calculators that measure the segments of DNA across the chromosomes, looking for patterns that they recognise from a number of known reference populations.  From that, these calculators predict an ancestry.  Exactly what and when that ancestry refers to does seem to vary from one calculator to another.  There is an argument that the precision can be improved if you also test close known relatives, including at least one parent.  The results can then be phased.  I'm actually waiting for the results for my mother, so that I can see my own au-DNA ancestry results phased and corrected.
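
To give a flavour of what phasing with one parent means, here is a toy sketch for a single SNP.  It is not how 23andMe or any particular tool does it (real phasing has to cope with no-calls, genotyping errors, and statistical inference along the chromosome); the function name and the two-letter genotype strings are assumptions made for the example.

```python
def phase_with_mother(child, mother):
    """Split a child's genotype at one SNP into (maternal, paternal) alleles.

    child and mother are two-letter genotype strings such as "AG".
    Returns None when the mother's genotype alone cannot settle it.
    """
    first, second = child
    if first == second:
        # Homozygous child: one copy came from each parent.
        return (first, second)
    first_possible = first in mother
    second_possible = second in mother
    if first_possible and not second_possible:
        return (first, second)
    if second_possible and not first_possible:
        return (second, first)
    # Either both alleles could be maternal, or neither matches the mother.
    return None


print(phase_with_mother("AG", "AA"))  # mother can only have given A
print(phase_with_mother("AG", "AG"))  # ambiguous with the mother alone
```

The ambiguous cases are the reason that testing both parents, or using the neighbouring SNPs, does a better job than one parent on one SNP.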

So let's have a bit of fun, and see what some of the calculators suggest for my autosomal DNA, at least before any phasing with my mother's DNA.  What do they make of my 100% English paper ancestry?

23andMe.com Ancestry Composition Standard Mode

99.9% European.

Broken into:

83% NW European

17% Broadly (unassigned) European

I think that's pretty cool.  As I get to know au-DNA predictions, I'm learning to appreciate it when they get the right continent, and the right corner of that continent.  That is more than they could do a decade or two ago.  The prediction is correct, I am a NW European.  I'm not a West African, a South Asian, or an East Siberian.

23andMe.com Ancestry Composition Speculative Mode

100% European

Broken into:

94% NW European

3% S European

3% Broadly (unassigned) European.

Whoa, where did that South European come from?  It could just be a stray incorrectly identified signal, or it could be telling me that one of my ancestors, maybe around Generation 6, was from down south!  Let's break down the prediction further.  First, the NW European:

32% British & Irish

27% French & German

7% Scandinavian

But surely I should be 100% British & Irish?  Not only 32%.  I have my own ideas about this.  I think that although 23andMe claims that Ancestry Composition only represents the ancestry of the past 300 to 500 years (the so-called migration period, as sold to USA customers), that it gets confused by earlier migrations across their reference populations, including those during the early medieval period, and perhaps even some of those during late prehistory.  I've noticed that across Ireland and Britain, the further to the east, the more diluted the 23andMe British & Irish assignment.  People of solid Irish ancestry get between 85% and 98% British & Irish.  My East Anglian results, mixed between British & Irish, French & German, and Scandinavian, are actually rather more like those received by Dutch customers of 23andMe.

As for that Southern European prediction, how does that break down?

0.5% Iberian

2.4% Broadly (unassigned) South European.

Which if taken seriously, might suggest that I have an unknown Spanish or Portuguese ancestor around Generation 6.  If I did take it seriously that is.  I wonder what my mother's test will reveal?
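
The "around Generation 6" guess comes from the halving rule: a single ancestor at Generation 6 would contribute 1/32, or about 3.1%, on average, which is close to the 3% South European total.  A small sketch of that reasoning, counting myself as Generation 1 (the function name is invented, and it assumes the whole percentage comes from one ancestor, which is a big assumption):

```python
import math


def nearest_generation(percent):
    """Generation whose single-ancestor average share is closest to percent.

    Closeness is judged on a doubling (log2) scale, because the
    expected share halves with every generation back.
    """
    halvings = round(math.log2(100 / percent))
    return halvings + 1


print(nearest_generation(3.0))  # the speculative 3% South European
```

It says nothing about whether the signal is real.  A small percentage can just as easily be noise, or much older shared ancestry spread across many lines.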

DNA.Land Ancestry Composition

This is a third party site, that you can upload your 23andMe V4 raw data to, and see what their calculators predict for your ancestry.  It has recently had its ancestry composition revised.  What did that make of my 100% English au-DNA?

West Eurasian 100%.

I like that designation, the amateur anthropologist in me prefers that broad designation over "European".  Broken down:

77% North/Central European

19% South European

2.4% Finnish

1.3% unassigned.

What?  Why not 100% North/Central European?  Finnish?  Did some early medieval Scandinavian settlers of East Anglia bring it?  Or is it a false signal?  Misidentified au-DNA?

That darned South European kicked in again.  Here I'm looking at a biological cuckoo NPE (non-parental event) at around Generation 5 or even more recent!  Did a great grandmother secretly have a South European lover?  But this South European breaks down further:

13% Balkan

6% Italian.

Oh my goodness, whereas 23andMe speculative mode suggested SW Europe - this one suggests SE Europe!  Do I have a secret Albanian great grandfather?  Or is it all nonsense?

WeGene.com

This is a cracking new third party DNA analyser.  It is based in China, and its predictors appear to calculate mainly for a Chinese market.  It not only predicts your ancestry composition, but also your two sex haplogroups, and lots of traits and health predictions to complement those of 23andMe.  It even tries to predict your genetic disposition to sexuality!

It will allow you to send your 23andMe V4 raw data direct to its own calculators.  However, at the moment the website is almost entirely in Chinese (Mandarin?).  There are two options.  1) At the bottom of the webpages is a hyperlink to English, which gives, in English, a basic ancestry composition, and your haplogroups.  It does not include English versions of the health and trait results.  2) Use an online translator, such as the one built into the Google Chrome browser.  It actually serves pretty well.

On sex haplogroups they give my Y-DNA as

L1.  Not bad, but they didn't make it to L1b or L-M317.

My mtDNA?

H6a1a8.  Very good.  Better than 23andMe's H6a1, and the same as the mthap program.

But this is about au-DNA, how did they do, what did they make of my 100% English ancestry?

81% French

19% English/Briton

Now, that sounds pretty awful, but on closer inspection, I'm impressed.  No South European great grandfather.  Okay, so most of my DNA has been placed on the wrong side of the Channel.  However, I know that French and English DNA are actually very close.  Recent surveys even suggest that the English share a lot of common ancestry with the French, from an unknown migration late in prehistory.  So again - they very much got the right corner of the right Continent.  Well done WeGene.

GEDmatch.com Eurogenes K13

GEDmatch is a website that you can upload raw data not only from 23andMe, but from a range of testers, and from V3 chips as well as V4.  It hosts a number of tools and predictors - some Open Source.  Some of these predictors are for Admixture or ancestry composition.  They measure your ancestry in terms of distance from known reference populations.  The lower the number, the closer you are to their reference.  They use calculators known as oracles to predict ancestry, including mixed ancestry or admixture.
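
As an illustration of what "distance from a reference population" can mean, here is a minimal sketch that assumes the oracle compares admixture percentages using a plain Euclidean distance.  That is an assumption about how the tool works, and every population name and number below is invented for the example.

```python
import math


def distance(mine, reference):
    """Euclidean distance between two sets of admixture percentages."""
    return math.sqrt(sum((mine[k] - reference[k]) ** 2 for k in mine))


# Hypothetical component percentages, invented for illustration only.
me = {"comp_1": 50.0, "comp_2": 25.0, "comp_3": 15.0, "comp_4": 10.0}
references = {
    "Pop_A": {"comp_1": 52.0, "comp_2": 24.0, "comp_3": 14.0, "comp_4": 10.0},
    "Pop_B": {"comp_1": 45.0, "comp_2": 30.0, "comp_3": 15.0, "comp_4": 10.0},
    "Pop_C": {"comp_1": 40.0, "comp_2": 20.0, "comp_3": 25.0, "comp_4": 15.0},
}

# Closest reference first, like the single population lists.
ranked = sorted(references, key=lambda name: distance(me, references[name]))
for position, name in enumerate(ranked, start=1):
    print(position, name, round(distance(me, references[name]), 2))
```

The point to take away is that the number is a measure of similarity to a reference average, not a percentage of ancestry.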

The oracles on the Eurogenes K13 and K15 calculator models have a good reputation for working with West Eurasian ancestry.  So first, how does K13 score my 100% English ancestry?

On Single Population Sharing, it rates my DNA against the closest references.  In order of closest to not so close, the top five are:

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97

I think that's a cracking result.  Okay, it thinks that I'm closer to South Dutch, than I am to SE English, but so close - and my East Anglian ancestry most likely does include a lot of admixture from the Low Countries from the early medieval period.  I really like Eurogenes K13.

Okay, let's now use the Oracle 4 option, to suggest admixture.  First on three populations admixing to create my DNA, what comes closest?

50% Southeast_English +25% Spanish_Valencia +25% Swedish @ 2.087456

Well that's interesting!  The SE English hit the net.  The Swedish?  Could be ancient Scandinavian admixture - but the Iberian prediction has reemerged!

On four populations admixing?

1 Southeast_English + Southeast_English + Spanish_Valencia + Swedish @ 2.087456
2 Southeast_English + Southeast_English + Spanish_Murcia + Swedish @ 2.147237
3 Norwegian + Portuguese + Southeast_English + Southeast_English @ 2.216714
4 Danish + Portuguese + Southeast_English + Southeast_English @ 2.225334
5 Portuguese + Southeast_English + Southeast_English + Swedish @ 2.230991
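
Notice that the top four population row has exactly the same distance (2.087456) as the 50% + 25% + 25% result above.  That is what you would expect if the four population option simply blends four equal quarters, with a repeated population counting twice.  A sketch of that idea, assuming equal quarter weights and a Euclidean distance (both assumptions), with invented populations and numbers:

```python
import math


def blend(populations, weights):
    """Weighted average of several populations' admixture percentages."""
    total = sum(weights)
    size = len(populations[0])
    return [
        sum(w * pop[i] for pop, w in zip(populations, weights)) / total
        for i in range(size)
    ]


def distance(a, b):
    """Euclidean distance between two lists of percentages."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical reference populations and sample, for illustration only.
pop_x = [60.0, 20.0, 20.0]
pop_y = [30.0, 50.0, 20.0]
pop_z = [40.0, 10.0, 50.0]
me = [48.0, 26.0, 26.0]

four_way = blend([pop_x, pop_x, pop_y, pop_z], [1, 1, 1, 1])
three_way = blend([pop_x, pop_y, pop_z], [2, 1, 1])  # 50% + 25% + 25%

print(four_way, three_way)
print(distance(me, four_way), distance(me, three_way))
```

Both blends come out identical, so both give the same distance.  It also means the four population lists can only ever propose mixes in steps of 25%.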

Oh my goodness.  K13 agrees with 23andMe AC, that I have an Iberian link.  I'm now really starting to wonder.

Let's finish off by trying K15 on my 100% English ancestry:

GEDmatch.com Eurogenes EU test V2 K15


Using Oracle for single population first, the top five closest:

1 Southwest_English 2.7
2 South_Dutch 3.98
3 Southeast_English 4.33
4 Irish 6.23
5 West_German 6.25

Okay, I'm SE English, not SW English, but pretty impressive again.

Using the oracle 4 for three population admixture, what mix comes closest to my auDNA?

50% Southwest_English +25% Spanish_Castilla_Y_Leon +25% West_Norwegian @ 1.080952

That Iberian back again!

Top five mix ups of populations closest to me?

1 Southwest_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.080952
2 Irish + North_Dutch + Southwest_English + Spanish_Galicia @ 1.111268
3 North_Dutch + Southwest_English + Spanish_Galicia + West_Scottish @ 1.282744
4 Southeast_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.295819
5 North_Dutch + North_Dutch + Southwest_English + Spanish_Castilla_Y_Leon @ 1.304939

I can't help preferring the K13 results to the EU test V2 K15 - simply because it recognises me better as SE English, rather than matching me to the SW English reference.

Conclusions

If anyone ever bothers reading this far too lengthy post, I hope that I have imparted the following lessons:

  • Don't expect DNA Ancestry tests to pinpoint an actual country of ancestry.  They're nowhere near that good yet.  The populations of West Eurasia, and elsewhere, are actually all mixed up, or share a lot of recent admixture.  In addition, many European nation-states are quite recent inventions.  I've seen the borders of Europe change in my short lifetime.
  • Don't expect precision.  If for example, you are an American, and a 23andMe AC test suggests only 32% British & Irish, then you could actually have 100% English ancestry over the past 300 years!  We're so mixed up, that these tests are struggling to part and identify us by nationality.
  • If you are willing to share your raw data (there are privacy issues), then have fun trying out all of these third party calculators.  It's a lot of fun as you can see.  They rarely agree.  There are other tools on GEDmatch for example, where you can compare DNA along with .gedcom genealogical files with other users - and look for shared segments on the chromosomes.  You can also compare your DNA to that of ancient populations.
  • Treat au-DNA differently to haplogroup results.  au-DNA is very interesting, and represents so much of our ancestry, if we could just sort some of the mess out.  You can partially do this by phasing your results with those of close relatives.  It is worthwhile phasing with at least one biological parent, if you can.  However, haplogroup results tell, through their mutations, incredible stories over much longer periods - thousands of years.  A different kind of genealogy.  As we gather more data, and reference it also to ancient-DNA, so it will tell us more and more about two lines of descent.  Perhaps even into historical times.

Exploring Gedmatch Eurogenes

The above grave is of my great great grandparents Robert and Ann Smith at Attleborough, Norfolk.

L1b Y-DNA News

First of all, it's looking good on the Y-Front.  My Y111 sample kit has arrived from FTDNA.  I also sent my 23andme V4 raw data to the administrator of the FTDNA Y Haplogroup.  He replied the next day "the raw data confirms that you are positive for M317 and negative for downstream SNPs M349 and M274. A very rare result for a NW European. It will be interesting to see who are your closest matches at 67 and 111 markers.".

So it doesn't look as though my L1b has anything to do with the M349 Rhine-Danube cluster.  I wonder where it comes from, and how and when it got into an English ancestry?  It's starting to dawn on me just how rare it is in NW Europe.  European Y-Haplogroup maps and tables simply don't display or list it, because Y-DNA Hg L is not even considered a European Haplogroup, never mind on British Haplo-maps.  All of those R1b's and I2's.  Not an L in sight.  I can see that having an unusual haplogroup is a mixed blessing.  Sure it's interesting, but no one knows much about it, because there is so little data on it in Europe, and so little research.

I had my first case of disbelief of my L1b Y-DNA on an FTDNA surname project group.  I reported my Y haplogroup as reported by 23andme (using ISOGG 2009) as L2*.  The administrator retorted "It is NOT the 'L' haplogroup, instead, it is 'I'".  So I linked her a copy of my 23andMe Paternal line report.  This time she replied "Goodness gracious Paul. I administer many, many projects and yours is the first 'L'".  You see, it has problems.

Wouldn't it be just great if I found someone else descended from the Berkshire Brookers by their Y line, that had the same haplogroup?

Gedmatch Eurogenes admixture results for an Englishman

GEDMATCH offers free tools for analysing the autosomal DNA in your raw data, from 23andme or Ancestry.com.  One suite of tools that is useful for analysing population admixture is Eurogenes.  As an English person, with strong paper English ancestry - including almost certainly early medieval admixture, I thought that I'd get a comparison out of the way.  See which "works" best for my known ancestry and likely heritage.  I'm trying oracles on my 23andMe V4 raw data, for 1. EU Test, 2. K13, and 3. V2 K15.
1. Eurogenes EU Test
Oracle

1 Cornish 4.6
2 English 5.01
3 NL 6.26
4 West_&_Central_German  6.92
5 Orcadian 7.02
6 IE 7.33
7 FR 7.51
8 Scottish 7.95
9 DK 9.39
10 NO 11.57

A bit strange that it sees me as first "Cornish".  I don't know where it got that reference from.  I have no known Cornish ancestry.  However, 2 and 3 are likely.  As a whole it's not a bad prediction, just that the ball landed a bit to the West.
What about mixed populations?  What are its favourite admixtures between two populations for me?

1   83.7% English  +  16.3% French_Basque  @  3.11
2   79% English  +  21% ES  @  3.17
3   63.7% English  +  36.3% FR  @  3.18
4   80.2% English  +  19.8% PT  @  3.5
5   51.8% FR  +  48.2% Scottish  @  3.54
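
Unlike the four population lists, these two population results come with odd percentages such as 83.7% + 16.3%, so the oracle must be searching for the best split between each pair.  One way that could be done (an assumption, not the documented method) is to find the mixing fraction that brings the blend closest to my own percentages.  For a Euclidean distance there is a simple closed form, sketched here with invented populations and numbers:

```python
def best_two_way_mix(me, pop_a, pop_b):
    """Fraction of pop_a (0 to 1) whose blend with pop_b lies closest to me.

    Minimises the Euclidean distance between me and
    fraction * pop_a + (1 - fraction) * pop_b.
    """
    diff = [a - b for a, b in zip(pop_a, pop_b)]
    denominator = sum(d * d for d in diff)
    if denominator == 0:
        return 1.0  # identical populations: any split gives the same blend
    fraction = sum((m - b) * d for m, b, d in zip(me, pop_b, diff)) / denominator
    # A population cannot contribute less than 0% or more than 100%.
    return min(1.0, max(0.0, fraction))


# Hypothetical component percentages, for illustration only.
pop_a = [60.0, 20.0, 20.0]
pop_b = [20.0, 60.0, 20.0]
me = [50.0, 30.0, 20.0]

fraction = best_two_way_mix(me, pop_a, pop_b)
print(f"{100 * fraction:.1f}% pop_a + {100 * (1 - fraction):.1f}% pop_b")
```

A good fit from a pair like this only shows that the blend resembles my percentages.  It does not prove that either population is a real ancestral source.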

Okay, not bad - it's given up on the Cornish.  However, it seems to point to France, Spain, and Portugal as a secondary source.  That is eerie, because 23andme threw up a speculative 2.4% South European including 0.5% Iberian.  I do wonder if I actually do have some unrecorded South European ancestry, even Iberian.

2. Eurogenes K13
Oracle

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97
6 French 7.63
7 North_Dutch 7.76
8 Danish 7.95
9 North_German 8.17
10 Irish 8.22

I like K13.  The Dutch may be there in admixture, and I know that they do often share some common patterns with SE English.  So I can excuse it making it to position 1.  Then in second place, the ball scores a goal.  Yes, I am SE English.  Most of the other suggestions could represent ancient admixture.

How about two population proposals?

1   65.6% Southeast_English  +  34.4% French  @  2.03
2   84.9% Southeast_English  +  15.1% North_Italian  @  2.05
3   63.5% Norwegian  +  36.5% Spanish_Valencia  @  2.06
4   69.7% North_Dutch  +  30.3% Spanish_Valencia  @  2.08
5   87.5% Southeast_English  +  12.5% Tuscan  @  2.09

It's got the SE English spot on, but all of these Iberians again!  Is it trying to tell me something?

3. Eurogenes EU Test V2 K15
Oracle

1 Southwest_English 2.7
2 South_Dutch 3.98
3 Southeast_English 4.33
4 Irish 6.23
5 West_German 6.25
6 North_Dutch 6.79
7 West_Scottish 6.84
8 French 6.85
9 North_German 6.89
10 Danish 7.26

Very good, except again, a bit skewed to SW England.  However, to be fair, I do have some slightly westward ancestors in the Oxfordshire area.  The rest is spot on.
What does it offer as a hybrid?

1   73.9% Southwest_English  +  26.1% French  @  1.27
2   71.8% North_Dutch  +  28.2% Spanish_Cantabria  @  1.3
3   89.7% Southwest_English  +  10.3% North_Italian  @  1.35
4   91.6% Southwest_English  +  8.4% Tuscan  @  1.4
5   86.4% Southwest_English  +  13.6% Spanish_Galicia  @  1.43

Those Spanish again!  Goes for SW English over SE English as the primary ancestral population.

Out of these predictions, my gut feeling is that they are all good for single population match.  On two population mix, they all suggest Iberian minorities.  Either I have an undiscovered South European ancestor, or something else is going on.  Do other English get this?  I can't really pick a winner.