DNA.land - raw file comparison

Comparing the ancestry results of two raw files from the same tester (myself) uploaded to DNA.land.

Myself.
Paper trail and family history 100% SE English, mainly East Anglian. 249 direct ancestors named in documentary research.

23andMe result before phasing (spec mode):
100% European broken into
94% Northwestern Europe
3% Southern Europe
3% unassigned European

Broken down further to:
32% British & Irish
27% French & German
7% Scandinavian
29% Broadly NW European
0.5% Iberian
2.4% Broadly South European

23andMe result after phasing with one parent (spec mode):
100% European
96% Northwestern European
1.8% Southern European
2.2% Broadly European

Broken down further to:
37% British & Irish
22% French & German
1% Scandinavian
36% Broadly NW European
1.8% Broadly Southern European

FT-DNA Family Finder My Origins.
100% European

Broken down further to:
36% British Isles
32% Southern Europe
26% Scandinavia
6% Eastern Europe

Now I am comparing the two raw files for the same person, uploaded to, and analysed for ancestry by, DNA.land:

23andMe V4 raw file for myself on DNA.land:

100% West Eurasian.
77% North West European
19% South European (broken into 13% Balkan / 6.1% South/Central European)
2.4% Finnish
1.3% Ambiguous

FT-DNA FF raw file for myself on DNA.land:
100% West Eurasian
75% North West European
25% Balkan
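
To make the disagreement between the two files easier to see, here is a minimal Python sketch that lines up the DNA.land percentages quoted above and prints the difference per category. The dictionary names and the `compare_reports` helper are my own, made up for illustration, and I am assuming that a category missing from a report counts as 0%.

```python
# Percentages copied from the two DNA.land reports quoted above.
from_23andme = {
    "North West European": 77.0,
    "Balkan": 13.0,
    "South/Central European": 6.1,
    "Finnish": 2.4,
    "Ambiguous": 1.3,
}
from_ftdna = {
    "North West European": 75.0,
    "Balkan": 25.0,
}


def compare_reports(a, b):
    """Return {category: b - a}, treating a missing category as 0% (an assumption)."""
    categories = sorted(set(a) | set(b))
    return {c: round(b.get(c, 0.0) - a.get(c, 0.0), 1) for c in categories}


differences = compare_reports(from_23andme, from_ftdna)
for category, delta in differences.items():
    # A positive number means the FT-DNA file scored higher for that category.
    print(f"{category:24s} {delta:+.1f}")
```

The biggest swing is the Balkan category, which is what the conclusion below picks up on.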

Just for more information:

My mother's 23andMe raw file on DNA.land:
100% West Eurasian
80% North West European
10% South European (broken into 7.7% South/Central Europe / 2.4% Balkan)
6.4% Finnish
2.3% Sardinian
1.5% Ambiguous 

Conclusion

Phasing on 23andMe suggested that I inherit (in spec mode) nearly 1% Southern European from each parent. That each of my very East Anglian parents had a Southern European ancestor within the past 300 to 500 years is highly unlikely, considering 1) the paper trail, and 2) local history in this rural area. Therefore I feel that this reflects much older background ancestry for the local SE English population. Ancient DNA calculators also predict that I have higher levels of ENF/EEF than other local populations such as the Irish and Scottish, and lower levels of ANE. This appears to be associated with the Southern European flavour that some tests report for me as a minority percentage. FT-DNA suggested 32% Southern European! Some commentators have suggested that this might indicate significant French admixture into the SE English population, perhaps during the Norman and Medieval periods, carrying a southern signal higher into lowland Britain. Earlier admixture into lowland Britain from the south is also possible, during late prehistory and the Roman period.

DNA.land has been noted for a bias towards predicting both Balkan and Finnish ancestry for testers, and my results are no exception. I feel that, as with all current autosomal DNA tests and analyses for ancestry, DNA.land has a way to go. As with the other predictors, it is very successful at recognising me as 100% European (although ironically my Y-DNA is Western Asian). It is fair at spotting me as NW European, but NOT as successful as 23andMe. Below that level, once again it falls down - but I feel that this is understandable, as most predictors fall down for anciently admixed populations such as the English. They are far more successful at spotting, for example, Irish/Scottish. We English tend to be ripped across different European populations. The Southern European element is a particular surprise - but all of the tests so far have been confused by this background signal. Dienekes has himself suggested Southern European DNA coming into England with the Normans:

http://dienekes.blogspot.co.uk/2016/...-ancestry.html

I'm starting to settle with this hypothesis, although I still have some interest in possible Southern European admixture earlier.

Finally... The two raw files for one person have produced slightly different results. The FT-DNA raw file has, I believe, more tested (but different?) SNPs than the 23andMe file. It would be interesting to know the differences. DNA.land, using the FT-DNA FF file, does not see Finnish or South/Central European, but enhances the Balkan.
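
One way to find out how the SNP sets differ would be to compare the marker identifiers in the two downloads. The following is only a sketch in Python, under assumptions: that the 23andMe file is tab-separated with `#` comment lines and the rsid in the first column, and that the FT-DNA file is a CSV with a `RSID,CHROMOSOME,POSITION,RESULT` header row. The function names are my own and the sample rows are invented stand-ins, not my real data.

```python
import csv
import io


def read_23andme_rsids(handle):
    """Collect SNP identifiers from a 23andMe-style tab-separated raw file (assumed layout)."""
    rsids = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip the comment header and blank lines
        rsids.add(line.split("\t")[0].strip())
    return rsids


def read_ftdna_rsids(handle):
    """Collect SNP identifiers from an FT-DNA-style CSV raw file (assumed layout)."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    return {row[0].strip() for row in reader if row}


def overlap_summary(first, second):
    """Count markers shared by both files and markers unique to each."""
    return {
        "shared": len(first & second),
        "only_first": len(first - second),
        "only_second": len(second - first),
    }


# Tiny invented samples standing in for the real downloads.
sample_23andme = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs1\t1\t100\tAA\n"
    "rs2\t1\t200\tCT\n"
)
sample_ftdna = io.StringIO(
    "RSID,CHROMOSOME,POSITION,RESULT\n"
    '"rs2","1","200","CT"\n'
    '"rs3","1","300","GG"\n'
)

summary = overlap_summary(
    read_23andme_rsids(sample_23andme), read_ftdna_rsids(sample_ftdna)
)
print(summary)
```

With real files, the `io.StringIO` samples would be swapped for `open(...)` on the two downloads.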

FT-DNA My Ancient Origins

Family Tree DNA (FTDNA) have released a new, unexpected feature for their autosomal DNA Family Finder package.  It is clearly aimed at their customers (both new and existing) of mainly European heritage.  It uses ancient DNA references to plot our ancient ancestry.  It breaks Europeans' ancient Eurasian ancestry down into four groups:

  • Hunter-Gatherer (Western Hunter-Gatherer)
  • Farmer (Early Neolithic Farmer)
  • Metal Age Invader (Yamnaya / Bronze Age Steppe immigration)
  • Non European (Other)

First of all, I welcome this new analysis.  Combined with the latest cutting edge research into the origin of the Eurasians, and with other open source calculators of ancient origin available via GedMatch - I feel that it can help us get personal with our ancient Eurasian roots.

However... unfortunately it has faults, as the online community quickly picked up, in particular with the Metal Age Invader component.  FT-DNA suggests that it represents the Yamnaya admixture event - where Copper or Early Bronze Age pastoralists, mounted on their horses, expanded from the Pontic and Caspian Steppes of Eurasia into Europe around 5,000 years ago.  But 1) it doesn't include any ANE (Ancient North Eurasian) component from the Mal'ta-Buret reference, and 2) it of course cannot distinguish its Western Hunter-Gatherer reference from that inherited directly within Europe or elsewhere.

All that the FT-DNA Metal Age Invader reference appears to represent is the population known as Caucasus Hunter-Gatherer, a minority component of Yamnaya DNA as we currently see it.

For the record, as the screendump above shows, my FT-DNA Ancient Origins are:

9% Metal Age Invader

47% Farmer

44% Hunter-Gatherer

0% Non European

Now that I've got that covered, I can move onto my next blog post, which I find more interesting - how I use My Ancient Origins to try to reconstruct my ancestry from 11,000 to 4,000 years ago.


Y Haplotype L1b2c

By Hellerick (Own work) [CC BY-SA 4.0], via Wikimedia Commons.  Modified by Paul Brooker.

I've created this distribution map of known Y haplogroup L, L1b2c or L-SK1414. This is my Y-DNA haplotype.  Not a lot of dots there, are there?  That is how rare this clade is.  L1a and L1b most likely (in my opinion) originated during the last Ice Age circa 18,000 years ago, south of the Caucasus, and west of the Caspian Sea in Western Asia.  In other words, in the area of present day Armenia, Azerbaijan, and North-west Iran.  Again, I emphasise, that is just my opinion, looking at present-time evidence.

Y haplogroup L itself may have diverged between L1 and L2, not so much earlier, or so far away from this region.  Again, just my present opinion.

My sub clade of L1b is so rare that it is impossible to say, as can be seen from the map.  However, this is my blog, so I'm going to push out on this one.  My very best guess would be further to the East than its parent.  I suspect South East of the Caspian Sea, in what is now Eastern Iran.  I could well be wrong.  We have so few tests from nearby Afghanistan, for example.  So far, the SNP SK1414 has only been reported twice.  1) in Makran, SW Pakistan, in a Balochi speaking man.  Balochi is an Iranian language, closely related to North-West Iranian languages.  Researchers suggest that the Balochi people of Makran largely migrated from south west of the Caspian.

The only other guy in the world so far confirmed is little old me, an Englishman.  I trace my surname (direct paternal) line back to the Thames Valley of Oxfordshire / Berkshire 270 years ago, if my biological line follows that.  A number of testers of English descent appear connected to me by STR analysis.  They all descend from Thomas Chandler, who lived around the same time as my earliest recorded ancestor - only 32 miles away at Basingstoke.

From all of the evidence, I conclude that my Y ancestral line moved, probably in one generation, from Western Asia, perhaps from the edge of Persia, to Southern England conservatively between 2,000 and 400 years ago.  Although I would speculate between 1,600 and 600 years ago - during the Medieval or close by.

A new test - LivingDNA test for Ancestry

You might think, following my recent posts, that I've lost all faith in DNA testing for Ancestry.  Not at all.  I just object when people take the analysis results of autosomal DNA tests for ancestry as infallible truths.  They are clearly not.

So far this year, I have commissioned two 23andMe tests, and three FT-DNA tests, a FullGenomes analysis, and a YFull analysis.  I have also used free analysis at WeGene and DNA.land, and have run three raw files on GEDmatch calculators.  You might also think that I've done enough testing for one year!  I thought that as well.  Then a new service just entered the market.

Living DNA Ancestry attracted my commission on two particular points.  1) it has an incredible British reference, that promises to break ancestry composition into 30 British regions - in addition to global analysis.  If it works, then this is a must for people with significant British ancestry.  2) it uses the latest cutting edge test chip: the latest Illumina chip, based on the Global Screening Array (GSA).  In addition, it uses a European based lab (Denmark), and it tests Y-DNA, mtDNA, and autosomes.  It tests more SNPs on all three counts than other current chips used by competitors offering autosomal plus tests.  Raw files for the test results will be available for download.

The British Reference

Living DNA will be using a British reference broken down into an incredible 30 regions, across England, Scotland, Wales, Orkney, and Northern Ireland.  The reference uses the much heralded POBI (Peopling of the British Isles 2015) data set.  This project collected 4,500 blood samples from people that could claim four grandparents in the same area, from across the regions of Britain.


The British reference does not include the Republic of Ireland.  However, LivingDNA are confident that they have collected a good global reference, and I understand, that they are seeking a similar quality Irish data-set for the future.  

In comparison, other providers of DNA tests for ancestry, only reference to Britain, or the British Isles & Ireland, as a single reference point.  And as can be seen by my previous posts, with limited success.

They also hope to support imports of raw file formats from other test companies in the future.  LivingDNA do not themselves currently offer relative matching, or health information.  Their service is for now, primarily for ancestry.

The Chip

They will be using a custom version of the latest Illumina chip technology, the Global Screening Array (GSA).  It is encoded with:

650,000 autosomal DNA SNPs

20,000 Y-DNA SNPs

4,000 MT-DNA SNPs.

In comparison for example, the 23andMe V4 chip scans for:

577,000 atDNA SNPs

2,329 Y-DNA SNPs

3,100 MT-DNA SNPs.
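
As a quick check on how big the jump is, this small Python sketch divides the Living DNA counts by the 23andMe V4 counts quoted above. The figures are simply the ones given in this post, and the variable names are my own.

```python
# SNP counts as quoted in this post for each chip.
living_dna = {"autosomal": 650_000, "Y-DNA": 20_000, "mtDNA": 4_000}
v4_23andme = {"autosomal": 577_000, "Y-DNA": 2_329, "mtDNA": 3_100}

# How many times more SNPs the Living DNA chip tests, per category.
ratios = {kind: living_dna[kind] / v4_23andme[kind] for kind in living_dna}

for kind, ratio in ratios.items():
    print(f"{kind}: {ratio:.2f}x")
```

On these figures the autosomal and mtDNA gains are modest, and the real difference is on the Y.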

I hope that LivingDNA will also use up-to-date haplogroup nomenclature and information.  23andMe with their V4 chip still use very dated 2009 nomenclature.

So, let's see if this new service is any improvement to my results, compared with the hit and miss of 23andMe, and Family Tree DNA (FT-DNA).  Will they be able to identify and locate my English roots successfully?  What will the improved chip make of my haplogroups?

The Southern European DNA enigma. Option 1. The DNA Analysis is true

My great grandfather Fred Smith, and my great Uncle Lenny.

Option 1.  The DNA Analysis for Ancestry is true

This option supports the claim of the commercial DNA for ancestry companies that I have Southern European ancestry.  For this to be the case, my Southern European ancestors must either a) be hidden in the gaps, among the missing ancestors, b) be NPEs (non parental events - biological ancestors that are contrary to recorded ancestors, usually male), c) predate my genealogical record over the past 360 years, or d) have been missed because my recorded genealogy is faulty - I have badly researched my ancestry and have made mistakes.

What gaps are there?  All of my generations are complete to and including my Generation 5.  I have all of the names of my 16 direct ancestors at that generation (great great grandparents).  All appear totally English, of English religious denominations.  Their surnames and location were: Brooker of London (previously Oxfordshire), Shawers of London, Baxter of Norfolk, Barber of Norfolk, Smith of Norfolk, Peach of Norfolk, Barber (again) of Norfolk, Ellis of Norfolk, Curtis of Norfolk, Rose of Norfolk, Key of Norfolk, Goffen of Norfolk, Tammas-Tovell of Norfolk, Lawn of Norfolk, Thacker of Norfolk, and Daynes of Norfolk.

I have photographs of three of them.

Everything looks utterly English - the majority East Anglian.

The gaps start to appear at Generation 6 (G.G.G Grandparents).  Three missing male ancestors - all missing fathers of illegitimate births.  29 out of 32 direct ancestors recorded though.  All appear English again:

Brooker of Oxfordshire, Edney of Oxfordshire, Shawers of London, Durran of London (previously Oxfordshire), Baxter of Norfolk, Barber of Norfolk, Smith of Norfolk, Hewitt of Norfolk, Peach of Lincolnshire, Riches of Norfolk, Barber of Norfolk, Ellis of Norfolk, Goodram of Norfolk, Curtis of Norfolk, Larke of Norfolk, Rose of Norfolk, Barker of Norfolk, Key of Norfolk, Waters of Norfolk, Goffen of Norfolk, Nichols of Norfolk, Tovell of Norfolk, Tammas of Norfolk, Lawn of Norfolk, Springall of Norfolk, Thacker of Norfolk, Daynes of Norfolk, Quantrell of Norfolk.  Oh, and a "Mary Ann" of Norfolk.

Again, all English, English religious denominations.  Mainly rural working class East Anglian.  No sign of any foreigners.

The record does start to really fall away at Generation 8.  From then on, it's a minority of lines recorded, stretching back to the 1680s.  However, nowhere on my record of 207 direct ancestors do I see anything that looks remotely non-English, never mind Southern European.  No sign of any Catholicism anywhere.

Let's just consider percentages of DNA though.

Each grandparent gives me 25% on average.

Generation 3 (grandparent) 25%

Generation 4 (great grandparent) 12.5%

Generation 5 (great great grandparent) 6.25%

Generation 6 (G.G.G grandparent) 3.1%

Beyond then genetic recombination starts to really kick in, and you may have zero DNA from any particular ancestral lineage.  It gets washed out.  Only if it comes down from a number of lines is admixture from further back highly likely to survive.
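
The halving rule behind those averages can be sketched in a few lines of Python. This assumes my own generation numbering (Generation 1 is me, Generation 2 my parents, and so on) and gives only the average share from ONE ancestor in that generation; real inheritance drifts away from the average because of recombination. The function name is made up for illustration.

```python
def expected_share(generation):
    """Average autosomal share from one ancestor; Generation 1 is the tester."""
    return 0.5 ** (generation - 1)


labels = [
    (2, "parent"),
    (3, "grandparent"),
    (4, "great grandparent"),
    (5, "great great grandparent"),
    (6, "G.G.G grandparent"),
    (7, "G.G.G.G grandparent"),
]

for generation, label in labels:
    share = expected_share(generation) * 100
    print(f"Generation {generation} ({label}): {share:.2f}%")
```

Each step back halves the average share, so a single ancestor's contribution drops below 2% within seven generations.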

23andMe (V4 AC in spec after phasing with one parent) claims that both of my parents had 2% Southern European DNA.  That takes it back to around MY Generation 6 or 7.  Sure, I'm missing 9% of Generation 6, and 20% of Generation 7.  My Southern European ancestors could have admixed then.  But what are the chances of it happening on both sides?  Possible, yes.  I think unlikely though.  No Southern European names or religions passed down.  When was this? Around 1780 to 1820.  Okay, if I want to piece national history into it, how about The Peninsular Wars (1807-1814)?  The Royal Norfolk Regiment took an active part in that campaign.  Could I have (presumably male) ancestors through both of my parents, that brought back Portuguese wives?  It is a possibility.  I'll acknowledge that.  But am I weaving history in order to make it fit the DNA analysis?

FT-DNA (FF My Origins) claims that I have 32% "Southern European" ancestry.  No sign of it in family history or photography.  Too much likeness of recorded fathers.  Okay, maybe it goes further back, but on multiple lines?  I think that we are pushing this one.  What is the chance of so many Southern Europeans, given my above recorded or known English ancestry?  It couldn't have happened.

DNA.land gives me 19% Southern European, including 13% Balkan.  The same problem as the FT-DNA analysis.  It just doesn't wash.  It cannot fit.

Therefore I conclude:

  1. FT-DNA and DNA.land claims of my Southern European percentages cannot realistically be explained by gaps or missing ancestors.
  2. 23andMe claims of 2% Southern European could be explained by the missing gaps - just!  But would need quite a coincidence to be on both sides, just in those gaps.

That pretty much covers it for gaps, NPEs, etc.  If any Southern European ancestry, on the other hand, predates my genealogical record, then it would need to be on multiple lines, and to have lost all sign of Southern European surnames, religions, and traditions.  I haven't seen any history of a mass Southern European migration to England 600 - 400 years ago.

Family Tree DNA Family Finder data V 23andMe raw data on GEDMATCH

Background

I'm South-east English in known paper ancestry, ethnicity, and heritage - mainly Norfolk East Anglian, where I still live, close to many known ancestors. I have 207 recorded ancestors on my tree, over the past 380 years. The majority lived in Norfolk, but some were Oxfordshire, Lincolnshire, Suffolk, and Berkshire. All appear to be English, with English surnames, English religions and denominations, overwhelmingly East Anglian:

Generation 1 has 1 individual. (100.00%)

Generation 2 has 2 individuals. (100.00%)

Generation 3 has 4 individuals. (100.00%)

Generation 4 has 8 individuals. (100.00%)

Generation 5 has 16 individuals. (100.00%)

Generation 6 has 29 individuals. (90.62%)

Generation 7 has 51 individuals. (79.69%)

Generation 8 has 47 individuals. (36.72%)

Generation 9 has 36 individuals. (14.84%)

Generation 10 has 10 individuals. (2.34%)

Generation 11 has 4 individuals. (0.39%)

Total ancestors in generations 2 to 11 is 207

I have previously tested 23andMe, FTDNA Y111, and FTDNA Big Y. My Y line is unusual, because it does originate in Western Asia, within the past few thousand years (L1b2c). However, there is no evidence of anything but European in any autosomal tests so far, so other than the Y, it appears to be washed out.

My 23andMe AC in spec mode (after phasing with one parent) is:

100% European

96% NW European

2% South European

2% broadly European


37% British & Irish

22% French & German

1% Scandinavian

36% broadly NW European

2% broadly South European


FTDNA Family Finder - My Origins


36% British Isles

32% Southern European

26% Scandinavian

6% Eastern European

I thought that it would be interesting to compare how a few important GEDMATCH calculators, see my raw data from Family Tree DNA, in comparison to the raw data from 23andMe:

GEDMATCH

23andMe raw data V ftDNA raw data

Eurogenes K13 Oracle

23andMe data

Admix Results (sorted):

#   Population       Percent
1   North_Atlantic   47.58
2   Baltic           22.36
3   West_Med         15.65
4   East_Med         8.03
5   West_Asian       3.05
6   Red_Sea          1.42
7   Amerindian       0.74
8   South_Asian      0.71
9   Oceanian         0.46

Single Population Sharing:

#   Population (source)   Distance
1   South_Dutch           3.89
2   Southeast_English     4.35
3   West_German           5.22
4   Southwest_English     6.24
5   Orcadian              6.97
6   French                7.63
7   North_Dutch           7.76
8   Danish                7.95
9   North_German          8.17
10  Irish                 8.22


Family Tree DNA data

Admix Results (sorted):

#   Population       Percent
1   North_Atlantic   47.89
2   Baltic           22.68
3   West_Med         15.45
4   East_Med         7.41
5   West_Asian       3.11
6   Red_Sea          1.38
7   South_Asian      0.84
8   Amerindian       0.72
9   Oceanian         0.52

Single Population Sharing:

#   Population (source)   Distance
1   Southeast_English     3.75
2   South_Dutch           4.03
3   West_German           5.42
4   Southwest_English     5.68
5   Orcadian              6.33
6   North_Dutch           7.15
7   Danish                7.36
8   Irish                 7.59
9   West_Scottish         7.62
10  North_German          7.7
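
To put a number on how closely the two raw files agree on K13, here is a minimal Python sketch that compares the two sets of admixture percentages above. I am assuming a plain Euclidean distance between the two sets of percentages as the yardstick; that is my own choice for illustration and may not be exactly how the Oracle calculates its "Distance" column.

```python
import math

# K13 admixture percentages copied from the two GEDmatch runs above.
k13_23andme = {
    "North_Atlantic": 47.58, "Baltic": 22.36, "West_Med": 15.65,
    "East_Med": 8.03, "West_Asian": 3.05, "Red_Sea": 1.42,
    "Amerindian": 0.74, "South_Asian": 0.71, "Oceanian": 0.46,
}
k13_ftdna = {
    "North_Atlantic": 47.89, "Baltic": 22.68, "West_Med": 15.45,
    "East_Med": 7.41, "West_Asian": 3.11, "Red_Sea": 1.38,
    "South_Asian": 0.84, "Amerindian": 0.72, "Oceanian": 0.52,
}


def euclidean_distance(a, b):
    """Straight-line distance between two admixture results (assumed metric)."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


# The component on which the two raw files disagree most.
largest_gap = max(k13_23andme, key=lambda k: abs(k13_23andme[k] - k13_ftdna[k]))
distance = euclidean_distance(k13_23andme, k13_ftdna)

print(f"Largest single difference: {largest_gap}")
print(f"Distance between the two kits: {distance:.2f}")
```

No single component moves by more than about 0.6 of a percentage point between the two files, which is small next to the distances to the reference populations in the tables.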


Eurogenes EUtest V2 K15

23andMe data

Admix Results (sorted):

#   Population     Percent
1   North_Sea      33.42
2   Atlantic       27.98
3   West_Med       12.24
4   Baltic         10.42
5   Eastern_Euro   7.04
6   West_Asian     3.52
7   East_Med       3.14
8   Red_Sea        1.48
9   Amerindian     0.39
10  Oceanian       0.19
11  South_Asian    0.18

Single Population Sharing:

#   Population (source)   Distance
1   Southwest_English     2.7
2   South_Dutch           3.98
3   Southeast_English     4.33
4   Irish                 6.23
5   West_German           6.25
6   North_Dutch           6.79
7   West_Scottish         6.84
8   French                6.85
9   North_German          6.89
10  Danish                7.26


Family Tree DNA data

Admix Results (sorted):

#   Population     Percent
1   North_Sea      33.81
2   Atlantic       28.23
3   West_Med       12.04
4   Baltic         10.59
5   Eastern_Euro   6.84
6   West_Asian     3.66
7   East_Med       2.47
8   Red_Sea        1.46
9   Amerindian     0.35
10  South_Asian    0.31
11  Oceanian       0.25

Single Population Sharing:

#   Population (source)   Distance
1   Southwest_English     2.29
2   Southeast_English     4.02
3   South_Dutch           4.48
4   Irish                 5.78
5   West_Scottish         6.41
6   North_Dutch           6.43
7   West_German           6.63
8   North_German          6.73
9   Danish                7.01
10  Orcadian              7.19


Gedrosia Eurasia K6 Oracle

23andMe data

Admix Results (sorted):

#   Population                       Percent
1   West_European_Hunter_Gartherer   39.18
2   Natufian                         38.8
3   Ancestral_North_Eurasian         20.85
4   Ancestral_South_Eurasian         0.82
5   East_Asian                       0.35


Family Tree DNA data

Admix Results (sorted):

#   Population                       Percent
1   West_European_Hunter_Gartherer   39.57
2   Natufian                         38.66
3   Ancestral_North_Eurasian         20.75
4   East_Asian                       0.86
5   Ancestral_South_Eurasian         0.16

Y Haplogroup L Resource page

Distribution Haplogroup L Y-DNA

By Crates (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC BY-SA 4.0-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/4.0-3.0-2.5-2.0-1.0)], via Wikimedia Commons.  Unmodified.

Introduction - Y-DNA, Haplogroups, SNPs, Haplotypes

The Y chromosome, and its Y-DNA, are copied from father to son, down a strictly paternal lineage.  If I were to trace my entire direct ancestry back, I have two parents, four grandparents, eight great grandparents, sixteen great great grandparents.  Yet out of those sixteen great great grandparents (generation 5), who were born only circa 160 years ago, only one carried the Y-DNA that was passed down to me.  My eight great great grandmothers did not inherit a Y chromosome from their fathers.  Most likely, my other seven great great grandfathers carried distinctive, differently marked Y-DNA.  Yet all sixteen biological great great grandparents have contributed to my overall atDNA (autosomal DNA).  Only one gave me my Y-DNA.  So you can see that Y-DNA represents only one narrow lineage.

Y-DNA may, on the face of it, appear to offer a limited understanding of total biological ancestry.  All sixteen of my great great grandparents were direct ancestors, not just the Y great great grandfather.  However, this lineage offers us evidence that can be genetically tracked, then mapped into relationship.  It can be used to ascertain parental, or non parental events.  It can be used to check the biological validity of relationship to cousins.  As more people investigate and record their haplogroups, haplotypes, STR markers, and SNPs, so we can for example, start to use them to map biological relationship further back.  Y-DNA is particularly useful, not only because of its markers, but also because it can be plotted to surname studies.  In Western societies, the surname often follows the Y lineage for several generations.

However, Y-DNA evidence (like that of the maternal mtDNA) doesn't just stop there.  As more people investigate, submit, and record their data from around the World - and as anthropologists and archaeologists add ancient DNA data from ancient and provenanced human remains to that record - so we can build and plot a world map of the human family, how it relates, and how it was distributed globally throughout prehistory.

Both Y and mt DNA carry mutation markers that define a Haplogroup.  A haplogroup is a family of shared descent.  These haplogroups are ancient.  The paternal Y-DNA haplogroup that this resource page is dedicated to has been designated as L.

However, mutations do not stop with the formation of a new haplogroup.  They continue through the generations.  As lineages divide between different sons, across many generations, so these mutations in the Y-DNA for example, continue to accumulate down the diverging lineages that once shared common descent.  We are all unique.  The sub clade of L that this page focuses on is L1b.  All male carriers of L1b will carry a SNP (Single Nucleotide Polymorphism) on their Y-DNA that has been designated as M317. This SNP will be downstream of another SNP that has been designated as M22.  Finally, a Y-DNA can be said to have a terminal SNP.  A terminal, refers not to the Haplogroup (in this case L), but can be used to define right down to the last SNP on the Y-DNA, that is shared with others on a record.  If someone for example, carries Y-DNA that is proven (or predicted by comparison) to be Y Haplogroup L, and to carry M317, then their Y terminal could be designated as L-M317, or alternatively, as L1b.  This is also sometimes referred to as a haplotype.  However, a haplotype can also refer to a particular STR.

Y Haplogroup L M20

The above image illustrates a modern day distribution of Y Haplogroup L (M20) as proposed and created by Anthrogenica user Passa.

Y Haplogroup K formed from Y Haplogroup IJK in the Y-DNA of hunter-gatherer fathers and sons, that share a MRCA (most recent common ancestor) during the Upper Palaeolithic, circa 45,400 years ago.  Where did these Y ancestors live at that time?  We think that they lived in Western or Southern Asia.  Iran is a favourite proposal. Earlier Y ancestors had most likely exited Africa 20,000 years earlier, and were well established in Asia.  They had most likely met and confronted another archaic human species, The Neanderthal. This was however, a time of great expansion by humans.  The first anatomically modern humans had recently entered Europe, while other modern humans were arriving in Australia.  The Ice Age was in a flux, but glaciation was advancing.

The most recent common Y ancestor to carry Y Haplogroup LT lived circa 42,600 years ago.  Then a mutation in the Y-DNA led to the formation of Y Haplogroup L, with a most recent common ancestor 23,200 years ago, close to the time of the Last Glacial Maximum, when ice sheets were reaching their maximum positions.  K, LT, and early L, most likely all originated in Upper Palaeolithic hunter-gatherer populations living during the last Ice Age, in the area of modern day Syria, Iraq, Iran or Pakistan.  It was a time of increased stress on human populations, that were having to adapt to some severe environmental challenges, and may have at times faced isolation into a number of Ice Age Refuges.  Some of these Upper Palaeolithic, Ice Age hunter-gatherer refuges may have been close to the Black Sea, others close to the Caspian Sea, but they were most likely located somewhere between Eastern Anatolia, and Eastern Iran, south of the Caucasus.

L1 / L2 Divergence - the Odd L2's

This is the oldest divergence within Y Haplogroup L.  L1, as characterised by the SNP M22, diverged from L2, as characterised by the SNP L595.  L2 was only recently discovered, and forced an ISOGG revision of Y Haplogroup L and its nomenclature that is still causing problems.  In this article, unless stated otherwise, I am using 2017 Nomenclature.  L2 or L-L595 is very rare, but has so far cropped up sporadically across Western Eurasia, including in Azerbaijan, Turkey, Sardinia, England, and Tatarstan.

That is L2 dealt with.  However, most Y Haplogroup L falls into L1. Let us start to look at the main branches of L1.  Remember, L1 is defined by the SNP M22:

Unofficial proposed tree for L1 (L-M22) 2016.  By Gökhan Zuzigo, modified by Paul Brooker.

Proposed Migration Map of L-M22 (L1) by Phylogeographer at https://phylographer.com/mygrations/?

The Big L1 Split - L1a and L1b

As can be seen above, this split occurred around 18,400 years ago, possibly somewhere between what is now Iran and Pakistan.  The L1a branches inherited the SNP M2481, and the L1b branches inherited M317.

First of all, let's look at L1a, because although it is not my sub clade, in terms of modern day population size, it appears to greatly outnumber any other L sub clade.

Pakistan and India - Present Day Home of L1a1 and L1a2

L1a splits again into two sub clades, L1a1 (L-M27) and L1a2 (L-M357).  The split occurred around 17,400 years ago.

L1a1 (L-M27)

L1a1, defined by SNP M27 (on older nomenclature as still used by 23andMe, this was formerly L1*), is mainly found in India, particularly South West India, and in Sri Lanka. This is perhaps the most populous modern day L sub clade, found in 15% of Indian males.  However, it is not restricted to India: it has also been found in 20% of Balochi in Pakistan, and has been reported in Kirghiz, Pashtun, Tajik, Uzbek, and Turkmen males across Central Asia.

L1a and L1a1 (L-M27) at Birds Eye Cave, Armenia 6161 years before present.

Ancient Y DNA from the Copper Age has emerged from this location in Armenia, and included L1a, and L1a1.  This might suggest that, although very successful today in India and Pakistan, it has a Western Asian origin.

L1a2 (L-M357)

L1a2 is defined by SNP M357 (on older nomenclature as still used by 23andMe, this was formerly L3*).  This sub clade is mainly found in Pakistan, but also in Saudi Arabia, Kuwait, The Chechen Republic, Tajikistan, India, and Afghanistan.  It has been found at 15% in Burusho populations, and at 25% in Kalash populations.  It is much more common in Pakistan than in India.

So, the L1a sub clades - spreading down into Southern Asia, and accounting for potentially millions of Y Men there.  Far more than any other branches of Y Haplogroup L.  However, Southern Asia is unlikely to be the origin of L.  That origin is more likely, as stated earlier, to be the place with the most diversity in branches.  That points again towards Western Asia.  It's just that ancient carriers of L appear to have been particularly successful in Southern Asia, and to have fathered more sons there.

L-M317 or L1b of Western Asia

We now move onto the branches of particular interest to myself, because I carry a Y Haplotype that belongs here.  L1b is defined by the SNP M317, that formed circa 18,400 years ago, most likely in the area of modern day Iran, or elsewhere in Western Asia.

Phylogenetic tree of L1b by Anthrogenica user Caspian (with permission):


L1b is mainly distributed across Western Asia, from modern day Turkey, across to Pakistan.  However, as we will see, it also spreads in low densities across parts of Europe.  It is very much the "Western L".

The Next split - L1b1 or L-M349.  The Levant, and Europe!

Around 14,000 years ago, another split occurred in the L1b (M317) branch. A new SNP, M349, defined L1b1.  Today, L1b1, or L-M349, is found in Western Asia, in Lebanon, Syria, Turkey, Armenia, etc.  However, it is also found scattered in low densities through parts of Europe.  It crops up in South Europe, often close to the Mediterranean Sea, including particularly in parts of Italy.  It also forms a light cluster in Central Europe.

A working map of Y haplogroup L sub clades by Edward Chernoff.  This map is incomplete, but is published here with permission of Edward Chernoff.  Copyrights applied.

Branching away from a common Y ancestor with L1b1 (M349), is another 14,000 year old line defined by SNP SK1412, L1b2.

L1b2 (L-SK1412) splits - Pontic Greeks, and the others...

13,000 years ago, during a cold stage towards the end of the last Ice Age, the L1b2 (SK1412) Y branch divides again.  Very recent research suggests that it split into three lines: L-SK1415 (L1b2a), L-PH8 (L1b2b), and L-SK1414 (L1b2c).

L1b2a (L-SK1415), has as far as I know, only been detected in a Makrani Balochi survey in SW Pakistan.

L1b2b (PH8), is found in Turkey, Greece, Armenia, Chechen Republic, Iraq, etc.  It is associated particularly with the Pontic Greek ethnicity from Eastern Anatolia, and around the Black Sea.  A further division within PH8 has been detected at around 3,000 years ago.

Finally ... mine:

L1b2c (L-SK1414, FGC51074) has so far been detected by SNP testing only in Makrani Balochi, in SW Pakistan, Gujarat, India, Turkey, Cyprus, Saudi Arabia, Lebanon (Druze), and in England. STR predictions for L-SK1414 have also been found in Goa, Syria, Iraq, Kuwait, UAE, Tatarstan, France, Italy, Iran, and the Azores.  In addition to SK1414, with the assistance of Gareth Henson, an FT-DNA Big Y test, and further analysis of its raw data by Yfull and FullGenomes, I have ascertained 117 novel SNPs that are still looking for first time matches.  As can be imagined, I'm very keen that further L Y-Men should test.
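For readers who find the branching easier to follow as a structure, here is a minimal sketch in Python of the splits described in this section. The branch names and SNPs come from the text above. The dictionary layout and the function name `lineage` are only my own illustration, not part of any testing company's tools.

```python
# Branch names and defining SNPs are taken from the text of this post.
# The layout (each branch pointing to its parent) is illustrative only.
TREE = {
    "L1b": {"snp": "M317", "parent": None},
    "L1b1": {"snp": "M349", "parent": "L1b"},
    "L1b2": {"snp": "SK1412", "parent": "L1b"},
    "L1b2a": {"snp": "SK1415", "parent": "L1b2"},
    "L1b2b": {"snp": "PH8", "parent": "L1b2"},
    "L1b2c": {"snp": "SK1414", "parent": "L1b2"},
}


def lineage(branch):
    """Return the path from L1b down to the given branch as one string."""
    path = []
    while branch is not None:
        path.append(f"{branch} ({TREE[branch]['snp']})")
        branch = TREE[branch]["parent"]
    return " > ".join(reversed(path))


print(lineage("L1b2c"))  # L1b (M317) > L1b2 (SK1412) > L1b2c (SK1414)
```

The sketch only covers the branches discussed in this section. A real tree, such as the one on Yfull, carries many more intermediate SNPs.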

Distribution may be connected to the dispersal of the recently identified group known as the Iranian Neolithic Farmers after 10,000 BCE:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113750/

In addition there would appear to be a potential link between this group, and the inhabitants of the Harappan or Indus Valley civilization. Below is my proposed distribution of L SK1414 / FGC51074:

Those tentative European Y haplogroup L links

We have seen above, again and again, that Y haplogroup L (M20) and several of its sub clades appear to have Western Asian origins, despite the success of some of those sub clades today in India and Pakistan.  Y haplogroup L has not been linked to the Yamna hypothesis, which has taken credit for the origin of many haplogroups that are successful today in Europe.  Y-DNA L was located on the southern side of the Caucasus, between present day Turkey and Pakistan.  However, two particular Y-DNA L sub clades do make mysterious appearances across Europe.

1) L-L595 (L2) has only recently been discovered and, so far, has been found exclusively in Europe, in very low numbers.

2) L-M349 (L1b1), downstream of M317, also spreads across South Europe, and clusters at the Rhine-Danube.  On the 23andMe forums, I have seen a number of testers claim Ashkenazi paternal ancestry, although unfortunately they have not tested their Y elsewhere.  This is far from common to all European L-M349 samples.  Although L-M349 rarely forms much more than 1% of all Y along the Mediterranean coast of Southern Europe, this percentage does occasionally rise higher, for example, in parts of Italy.

When did L2 or even L1b1 enter Europe?  L2 has only so far been found in Europe.  There are some suggestions that some European L could be survivors from the Eurasian Neolithic.  However, ancient DNA has not yet been found to support this hypothesis. 

Prime resources

L Yfull Tree

https://www.yfull.com/tree/L/

Wikipedia entry for Y Haplogroup L-M20

https://en.wikipedia.org/wiki/Haplogroup_L-M20

FTDNA Y Haplogroup L Project

https://www.familytreedna.com/public/Y-Haplogroup-L/

Marco Cagetti's Y Haplogroup L

http://www.cagetti.com/Genetics/L-haplogroup.html

Anthrogenica Y Haplogroup L Forum Board

http://www.anthrogenica.com/forumdisplay.php?37-L

ISOGG 2009 Y Haplogroup L (Useful for understanding 23andMe Y haplogroup result of L2*)

http://isogg.org/tree/2009/ISOGG_HapgrpL09.html

ISOGG 2017 Y Haplogroup L

http://isogg.org/tree/ISOGG_HapgrpL.html

Other resources

Eupedia Y-DNA Haplogroup L

http://www.eupedia.com/europe/origins_haplogroups_europe.shtml#L

23andMe users should note that in 2016 the company still used a very outdated ISOGG nomenclature system.  My 23andMe reported haplotype was L2*.  However, using ISOGG 2016, this is now L1b (L-M317), NOT to be confused with modern day L2 (L-L595).

Facebook Y Haplogroup L Group

https://www.facebook.com/groups/773887796013634/

L-M317 STR Alpine cluster article

https://figshare.com/articles/L_M317_STR_marker_likelihood_tree_focuing_Alpine_cluster/105684

Familypedia Wiki for Haplogroup L

http://familypedia.wikia.com/wiki/Haplogroup_L_(Y-DNA)

Personal note, as of 2018-08-28.

My Y Haplogroup L Designation:

L +M20 +M22 +M317 +SK1412 +SK1414 (or FGC51074) +FGC51041 (or Y31947) +FGC51036

L-SK1414 = L1b2c

Current age estimate for SK1414/FGC51074: 9,300 years BP.

L-FGC51041 is a verified terminal

Current age estimate for FGC51041/Y31947: 6,000 years ago, but based on only 2 samples on Ytree.

L-FGC51036 is terminal on FT-DNA

L-Y31947 is terminal on yFull

115 novel SNPs
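The "+SNP (or alias)" designation string in the note above can be split into its individual markers with a few lines of Python. This is only a sketch: the function name `parse_designation` is my own, and it assumes the string keeps exactly the format I have written it in here.

```python
import re

# The designation exactly as written in the personal note above.
DESIGNATION = (
    "L +M20 +M22 +M317 +SK1412 +SK1414 (or FGC51074) "
    "+FGC51041 (or Y31947) +FGC51036"
)


def parse_designation(text):
    """Split the string into the haplogroup letter and a list of
    (snp, alias) pairs, where alias is None if no '(or ...)' follows."""
    haplogroup, _, rest = text.partition(" ")
    # Each marker is '+NAME', optionally followed by ' (or ALIAS)'.
    found = re.findall(r"\+(\w+)(?: \(or (\w+)\))?", rest)
    return haplogroup, [(snp, alias or None) for snp, alias in found]


haplogroup, markers = parse_designation(DESIGNATION)
for snp, alias in markers:
    print(snp if alias is None else f"{snp} (also known as {alias})")
```

Keeping the alias next to each SNP is handy because different companies report the same position under different names, as with SK1414 and FGC51074.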