A new test - LivingDNA test for Ancestry

You might think, following my recent posts, that I've lost all faith in DNA testing for ancestry.  Not at all.  I just object when people take the analysis results of autosomal DNA tests for ancestry as infallible truths.  They are clearly not.

So far this year, I have commissioned two 23andMe tests, three FT-DNA tests, a FullGenomes analysis, and a YFull analysis.  I have also used free analysis at WeGene and DNA.land, and have run three raw files on GEDmatch calculators.  You might also think that I've done enough testing for one year!  I thought that as well.  Then a new service entered the market.

Living DNA Ancestry attracted my commission on two particular points.  1) It has an incredible British reference, that promises to break ancestry composition into 30 British regions - in addition to global analysis.  If it works, then this is a must for people with significant British ancestry.  2) It uses the latest cutting edge test chip, a custom Illumina chip based on the Global Screening Array (GSA).  In addition, it uses a European based lab (Denmark), and it tests Y-DNA, mtDNA, and autosomes.  It tests more SNPs on all three counts than the chips currently used by competitors offering autosomal-plus tests.  Raw files for the test results will be available for download.

The British Reference

Living DNA will be using a British reference broken down into an incredible 30 regions, across England, Scotland, Wales, Orkney, and Northern Ireland.  The reference uses the much heralded POBI (People of the British Isles 2015) data set.  This project collected 4,500 blood samples from people who could claim four grandparents in the same area, from across the regions of Britain.

A little about the POBI project below:

The British reference does not include the Republic of Ireland.  However, LivingDNA are confident that they have collected a good global reference, and I understand, that they are seeking a similar quality Irish data-set for the future.  

In comparison, other providers of DNA tests for ancestry treat Britain, or the British Isles & Ireland, as a single reference population.  And, as can be seen in my previous posts, with limited success.

They also hope to accept uploads of raw files from other test companies in the future.  LivingDNA do not themselves currently offer relative matching, or health information.  Their service is, for now, primarily for ancestry.

The Chip

They will be using a custom version of the latest Illumina chip technology, the Global Screening Array (GSA).  It is encoded with:

650,000 autosomal DNA SNPs

20,000 Y-DNA SNPs

4,000 MT-DNA SNPs.

In comparison for example, the 23andMe V4 chip scans for:

577,000 atDNA SNPs

2,329 Y-DNA SNPs

3,100 MT-DNA SNPs.

I hope that LivingDNA will also use up-to-date haplogroup nomenclature and information.  23andMe with their V4 chip still use very dated 2009 nomenclature.

So, let's see if this new service is any improvement to my results, compared with the hit and miss of 23andMe, and Family Tree DNA (FT-DNA).  Will they be able to identify and locate my English roots successfully?  What will the improved chip make of my haplogroups?

The Southern European DNA enigma. Option 3. Autosomal DNA Analysis does not work

Here I'm considering the third option to my enigma.  My known ancestry is 100% English.  However, autosomal DNA tests for ancestry, by commercial companies, and by third party analysis, suggest that I have a mixture of European ancestries, including varying percentages of Southern European.  I'm trying to best explain this phenomenon.  In previous posts, I considered 1) that my paper record is incomplete, or biologically incorrect, and 2) that something ancient is picked up in analysis of present day English testers - that the algorithms may be reflecting shared ancient admixture, perhaps prehistoric, or Roman.

Now in this post, I consider the third option.  That commercial DNA companies exaggerate their claims to be able to differentiate to any successful degree, between different regions of Europe in my ancestry.  If this is indeed the case, it has significant repercussions for testers for example, in the USA, Canada, Australia, etc.  If they have a poor paper trail, and poorly known ancestry, maybe it's all too easy for them to regard such DNA tests for ancestry, as indisputable and accurate truths.

Commercial DNA companies for ancestry are under pressure to supply to market demands.  Their markets have been dominated particularly by USA customers.  Some of them are seasoned genealogists with good quality paper trails.  Others are attracted by the easy option of knowing their ancestry from before, as 23andMe puts it, the Age of Migration - before the past few centuries.  Instead of spending a lifetime chasing documents, they can simply send a DNA sample to a company, and know their roots.  People trust the science of DNA testing for ancestry.  That is the demand that commercial companies can cater for.

But what if their ability to accurately detect ancestry from autosomal DNA is exaggerated?

Lack of agreement between analysis.

As one piece of evidence: test autosomal DNA with three different companies, and you will receive three different results.  That is well known in genetic genealogy circles.  Some apologists excuse it away by pointing to the different companies' claims to be focusing on different periods.  23andMe say that they zoom in on 500 years ago, by rejecting short segments.  Is it really possible yet to zoom in on one particular period?  I'm not convinced.  Is it even possible to securely locate all ancestry from the past 500 years?  I'd expect genetic recombination to wash away an awful lot of ancestral DNA long before that.  The truth is that beyond our great great grandparents' generation, there is less and less chance of us carrying any surviving DNA from any one particular ancestor!  Especially from the autosomal DNA passed down on your father's side.  You might have a Balkan g.g.g.g grandfather, but chances are, there will be little or no evidence of his existence remaining in your autosomes.  His DNA, and all that belonged to his Balkan ancestry, will be lucky to survive the following 250 years, never mind 500 years.  My Y-DNA has strong evidence that I had an Asian ancestor on my paternal line, who arrived in Southern England between 1,800 and 500 years ago.  However, nothing remains in my autosomal DNA analysis that suggests Asia.  Washed away.
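The dilution described above can be put into rough numbers.  The sketch below assumes the textbook expectation that an ancestor n generations back contributes on average 1/2^n of your autosomal DNA.  The real share is lumpier than that, and can fall to zero.  The function name and the 28 year generation length are my own illustrative assumptions, not figures from any testing company.

```python
def expected_autosomal_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA expected from one ancestor.

    Assumes simple halving per generation (1 / 2**n). Real inheritance is
    lumpier: a distant ancestor can contribute much less, or nothing at all.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 1.0 / (2 ** generations_back)


# A g.g.g.g grandfather is six generations back (a parent counts as 1).
GGGG_GRANDFATHER = 6

# Hypothetical average generation length in years (an assumption).
YEARS_PER_GENERATION = 28

if __name__ == "__main__":
    share = expected_autosomal_share(GGGG_GRANDFATHER)
    print(f"g.g.g.g grandfather: {share:.2%} expected")

    for years in (250, 500):
        n = round(years / YEARS_PER_GENERATION)
        print(f"{years} years (~{n} generations): "
              f"{expected_autosomal_share(n):.4%} expected per ancestor")
```

Even on this generous averaging assumption, a single ancestor from 500 years ago accounts for a vanishingly small slice of the autosomes.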

Getting back to those three companies giving three different ancestries.  My South European percentages have varied from 2% (with a hint at Iberia), to 19% (with a hint at Balkans), to FT-DNA's claim of 32%!  Eurogenes K13 hints at Iberia in its admixture programs on GEDmatch.

Population References

One more thing.  Autosomal DNA tests for ancestry do not use ancient DNA references.  Not yet anyway.  They instead use present-day references, often from their own customer bases, based on what ancestry those customers claim.  This is not necessarily the DNA that existed in past populations.  Populations and genes shuffle, and genetic drift occurs.  I recently read a report that FT-DNA Y data for NW Europe heavily biases to Irish ancestry.  Therefore, references from Americans of Irish and / or British descent will bias to the West.  The quality of a reference is critical.

Is it all Bunk?

Am I saying that autosomal DNA testing for ancestry is all a waste of time?  Actually no, not yet.  The tests DO find me to be pretty much 100% European.  That is a success.  Some tests even find me, with a degree of confidence, to be NW European.  That is awesome.  However, beyond such a regional level, should we be trusting such tests to be providing concrete results, infallible "truths" with a high degree of accuracy?  Shouldn't we be cautious, and regard such speculations as just that - speculations, to be assessed by other forms of evidence?  Some of my ancestors might have lived in Southern Europe.  Maybe Option 1 was correct - one of my Norfolk ancestors brought a Portuguese wife home from the Peninsular War.  Perhaps.  Maybe Option 2 was correct - the patterns that DNA companies pick up as Southern European are ancient, related to Neolithic, Iron Age, or Roman admixture from the South, or to sharing ancient ancestry with Southern Europeans.  Maybe.

I'm not at all disenchanted with DNA testing for ancestry though.  I've commissioned five so far this year, including three autosomal DNA tests.  This leads me to my most recent commission.  Perhaps this one will convince me more.  It's a very new test.  I'll post on that next.



The Southern European DNA enigma. Option 2. The DNA is Ancient.

The above photograph taken by me, of Neolithic skulls from the Tomb of the Sea Eagles, Orkney.

I'm not the only English person reporting "Southern European" on their autosomal DNA for ancestry test results.  I've noticed on 23andMe, for example, that English testers often report these strange low percentages of "Southern European" in their ancestry composition results.

There may be something odd about the ancient ancestry of the English, that we do not yet know.  Others have also pointed out that in ancient admixture calculators, the English receive lower percentages of ANE (Ancient North Eurasian) than do the Irish, Scottish, or other nearby neighbours.  POBI (People of the British Isles 2015) suggested an unknown immigration into Southern Britain during Late Prehistory, perhaps from the area that is now France.

Some point to perhaps, more Neolithic survival in lowland Britain, relating perhaps to Sardinian patterns.  Others suggest immigration from Southern Europe and elsewhere during 360 years of Roman occupation.

Option 2 is a possibility - perhaps there is something about English ancestry that we do not yet know about, that confuses the algorithms of commercial DNA companies, when trying to identify our more recent ancestry.

Revisiting Southern European for Ancestry

This photo of A Capela dos Ossos (the bone chapel) in Évora, Portugal.  Taken by myself.

First, a recap

I'm English by ethnicity, birth, upbringing, known family history, and by record.  That record, I've researched on and off for more than 25 years, primarily in record offices, but in more recent years also online.  On my personal database I presently have 207 direct ancestors recorded.  All lived in Southern England, with the majority in East Anglia.  All appear to have English surnames.  All recorded religious denominations, English.  The majority were rural working class.  I have a typical English ethnicity and phenotype.  My recorded genealogy stretches back at the furthest to the 1680's.

I'd expect some admixture in there.  I know from my Y-DNA that I have Asian admixture from between 500 and 1,800 years ago on my paternal lineage.  Surely some Huguenots, Strangers, Romany, or others at some point.  However, a rare and single event on one line of ancestry doesn't hang around very long in autosomal DNA.  It can be washed out very quickly by genetic recombination - as my Asian, as detected by my Y-DNA, has been.  You should only really see significant traces of admixture when it is either recent (within the past few hundred years at most), or entered on multiple lines of ancestry.

Therefore, I'd have expected a commercial Autosomal DNA test for ancestry to come fairly close to 100% for British, or even English.  But instead, so far, I've received:

From 23andMe Ancestry Composition in Speculative mode, before any phasing with mother alone:

32% British & Irish
27% French & German
7% Scandinavian
29% Broadly NW European
2% Broadly Southern European (including 0.5% Iberian)

and after phasing with one parent:

37% British & Irish  (23% from father, 14% from mother)
22% French & German  (12% from father, 10% from mother)
1% Scandinavian  (from mother alone)
36% Broadly NW European  (23% from father, 13% from mother)
2% Broadly Southern European (1% from father, 1% from mother)

From FTDNA Family Finder My Origins, I recently received:

36% British Isles
32% Southern European
26% Scandinavia
6% Eastern Europe

WeGene using my 23andMe raw data gives me:

81% French
19% British

DNA.land using my 23andMe raw data gives me:

77% Northwest European
19% South European broken into 13% Balkan and 6% Central/South European
2% Finnish
1% ambiguous West Eurasian.

GEDmatch Eurogenes K13 on Oracle using my FT-DNA raw data gives me as my nearest Genetic Distance:

Southeast English 3.75 GD

On Oracle 4 I get as my nearest single population Genetic Distance:

Southeast English 4.28 GD

Best three way on K13 Oracle 4 mix is:

50% Southeast_English +25% Spanish_Valencia +25% Swedish @ 1.86 GD

Eurogenes K13 does often suggest Iberian references for admixtures on my results further down the proposal list.  Still, thumbs up for Eurogenes K13!  It gets me as Southeast English correctly!

So... 23andMe claims that I have 2% Southern European and that it comes from both parents, although before phasing, it hinted at Iberian.  FT-DNA claims that I have a whopping 32% Southern European!  DNA.land claims that I'm 19% South European, but Balkan with some Italian, rather than Iberian!  Eurogenes K13 Oracle 4 suggests that if I do have admixture, that it most likely includes Iberian.  My family tree has no evidence of any Southern European people, names, or any Catholicism, etc.  Confusing or what?


FTDNA (Family Tree DNA) My Origins Autosome Test for Ancestry

I know I should have smiled!  Me, sitting outside the archaeology museum in Sofia, Bulgaria, South-East Europe, earlier this year.

FT-DNA Family Finder My Origins

I haven't posted much that's coherent lately, because, well, my life changed, and consequently I've been pretty busy, in a very good way.  However, my exploration into genetic genealogy hasn't ceased at all.  Indeed, I took advantage of the Family Tree DNA (FTDNA) Summer sale, and bought the USD $79 Family Finder test.

No need to send a fresh sample, this was the third test from the sample that I sent to FT-DNA's US lab earlier this year.  FTDNA Family Finder is an autosomal DNA test only, without haplogroup results - but I've tested my Y-DNA to death already, and I know my mtDNA haplotype.  The services supplied include relationship matching, raw file download, and an Ancestry analysis named My Origins.  Hey, for that price, in GBP £, that is el cheapo good value.  And it's a good test, with about 690,000 SNPs tested, against 23andMe's current 577,382.  Smoking.

My prime interests were in 1) comparing the raw data with 23andMe on GEDmatch, and 2) seeing what FTDNA My Origins has to say about my autosomal DNA for ancestry.

So what did I find?

The former, comparing raw data files, I've done.  But briefly, the calculators DO vary for the two files, but not by very much - except maybe, that on Eurogenes K13, the nearest GD on my FT-DNA file is closer to correct - putting SE English closer this time than South Dutch.

The latter?

Family Tree DNA reported My Origins as:

100% European.

Broken into:

36% British Isles
32% Southern Europe
26% Scandinavia
6% Eastern Europe

This is a pretty bizarre result.  36% almost hits dead on my 23andMe Ancestry Composition (spec) result for British Isles (32% before phasing, 37% after phasing with one parent).  Perhaps they are using a similarly biased reference?  I'll blog on that soon as well.

26% Scandinavian is massive.  23andMe AC spec reported 7% before phasing, and 1% after phasing with one parent.

That's pretty much my first report for Eastern European, except for DNA.land's claim of some Balkans (hence my excuse for the above photograph).

but .... 32% Southern Europe, really?  Let's go there next, off to Southern Europe now:



Family Tree DNA Family Finder data V 23andMe raw data on GEDMATCH

Background

I'm South-east English in known paper ancestry, ethnicity, and heritage - mainly Norfolk East Anglian, where I still live, close to many known ancestors. I have 207 recorded ancestors on my tree, over the past 380 years. The majority lived in Norfolk, but some were Oxfordshire, Lincolnshire, Suffolk, and Berkshire. All appear to be English, with English surnames, English religions and denominations, overwhelmingly East Anglian:

Generation 1 has 1 individual. (100.00%)

Generation 2 has 2 individuals. (100.00%)

Generation 3 has 4 individuals. (100.00%)

Generation 4 has 8 individuals. (100.00%)

Generation 5 has 16 individuals. (100.00%)

Generation 6 has 29 individuals. (90.62%)

Generation 7 has 51 individuals. (79.69%)

Generation 8 has 47 individuals. (36.72%)

Generation 9 has 36 individuals. (14.84%)

Generation 10 has 10 individuals. (2.34%)

Generation 11 has 4 individuals. (0.39%)

Total ancestors in generations 2 to 11 is 207

I have previously tested 23andMe, FTDNA Y111, and FTDNA Big Y. My Y line is unusual, because it does originate in Western Asia, within the past few thousand years (L1b2c). However, there is no evidence of anything but European in any autosomal tests so far, so other than the Y, it appears to be washed out.

My 23andMe AC in spec mode (after phasing with one parent) is:

100% European

96% NW European

2% South European

2% broadly European


37% British & Irish

22% French & German

1% Scandinavian

36% broadly NW European

2% broadly South European


FTDNA Family Finder - My Origins


36% British Isles

32% Southern European

26% Scandinavian

6% Eastern European

I thought that it would be interesting to compare how a few important GEDMATCH calculators see my raw data from Family Tree DNA, in comparison to the raw data from 23andMe:

GEDMATCH

23andMe raw data V ftDNA raw data

Eurogenes K13 Oracle

23andMe data

Admix Results (sorted):

# Population Percent

1 North_Atlantic 47.58

2 Baltic 22.36

3 West_Med 15.65

4 East_Med 8.03

5 West_Asian 3.05

6 Red_Sea 1.42

7 Amerindian 0.74

8 South_Asian 0.71

9 Oceanian 0.46


Single population Sharing:


# Population (source) Distance

1 South_Dutch 3.89

2 Southeast_English 4.35

3 West_German 5.22

4 Southwest_English 6.24

5 Orcadian 6.97

6 French 7.63

7 North_Dutch 7.76

8 Danish 7.95

9 North_German 8.17

10 Irish 8.22


Family Tree DNA data

Admix Results (sorted):

# Population Percent

1 North_Atlantic 47.89

2 Baltic 22.68

3 West_Med 15.45

4 East_Med 7.41

5 West_Asian 3.11

6 Red_Sea 1.38

7 South_Asian 0.84

8 Amerindian 0.72

9 Oceanian 0.52


Single Population Sharing:


# Population (source) Distance

1 Southeast_English 3.75

2 South_Dutch 4.03

3 West_German 5.42

4 Southwest_English 5.68

5 Orcadian 6.33

6 North_Dutch 7.15

7 Danish 7.36

8 Irish 7.59

9 West_Scottish 7.62

10 North_German 7.7
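
To put a number on how far apart the two raw files land on Eurogenes K13, here is a minimal sketch using the admixture percentages from the two tables above.  It reports the per-component differences and a plain Euclidean distance between the two results.  That distance is my own choice of summary for illustration; I am not claiming that it is how GEDmatch's Oracle arrives at its GD figures.  The function names are hypothetical.

```python
import math

# Eurogenes K13 admixture percentages, copied from the two tables above.
K13_23ANDME = {
    "North_Atlantic": 47.58, "Baltic": 22.36, "West_Med": 15.65,
    "East_Med": 8.03, "West_Asian": 3.05, "Red_Sea": 1.42,
    "Amerindian": 0.74, "South_Asian": 0.71, "Oceanian": 0.46,
}
K13_FTDNA = {
    "North_Atlantic": 47.89, "Baltic": 22.68, "West_Med": 15.45,
    "East_Med": 7.41, "West_Asian": 3.11, "Red_Sea": 1.38,
    "Amerindian": 0.72, "South_Asian": 0.84, "Oceanian": 0.52,
}


def component_differences(a: dict, b: dict) -> dict:
    """Return b minus a for every admixture component (percentage points)."""
    if a.keys() != b.keys():
        raise ValueError("both results must list the same components")
    return {name: b[name] - a[name] for name in a}


def euclidean_distance(a: dict, b: dict) -> float:
    """Straight-line distance between two admixture results."""
    diffs = component_differences(a, b)
    return math.sqrt(sum(d * d for d in diffs.values()))


if __name__ == "__main__":
    diffs = component_differences(K13_23ANDME, K13_FTDNA)
    # List the components that moved the most first.
    for name, diff in sorted(diffs.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name:15s} {diff:+.2f}")
    print(f"Distance between kits: "
          f"{euclidean_distance(K13_23ANDME, K13_FTDNA):.2f}")
```

No component moves by as much as one percentage point between the two kits, which fits with the calculators varying for the two files, but not by very much.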


Eurogenes EUtest V2 K15

23andMe data

Admix Results (sorted):


# Population Percent

1 North_Sea 33.42

2 Atlantic 27.98

3 West_Med 12.24

4 Baltic 10.42

5 Eastern_Euro 7.04

6 West_Asian 3.52

7 East_Med 3.14

8 Red_Sea 1.48

9 Amerindian 0.39

10 Oceanian 0.19

11 South_Asian 0.18


Single Population Sharing:


# Population (source) Distance

1 Southwest_English 2.7

2 South_Dutch 3.98

3 Southeast_English 4.33

4 Irish 6.23

5 West_German 6.25

6 North_Dutch 6.79

7 West_Scottish 6.84

8 French 6.85

9 North_German 6.89

10 Danish 7.26


Family Tree DNA data

Admix Results (sorted):


# Population Percent

1 North_Sea 33.81

2 Atlantic 28.23

3 West_Med 12.04

4 Baltic 10.59

5 Eastern_Euro 6.84

6 West_Asian 3.66

7 East_Med 2.47

8 Red_Sea 1.46

9 Amerindian 0.35

10 South_Asian 0.31

11 Oceanian 0.25


Single Population Sharing:


# Population (source) Distance

1 Southwest_English 2.29

2 Southeast_English 4.02

3 South_Dutch 4.48

4 Irish 5.78

5 West_Scottish 6.41

6 North_Dutch 6.43

7 West_German 6.63

8 North_German 6.73

9 Danish 7.01

10 Orcadian 7.19


Gedrosia Eurasia K6 Oracle

23andMe data

Admix Results (sorted):


# Population Percent

1 West_European_Hunter_Gartherer 39.18

2 Natufian 38.8

3 Ancestral_North_Eurasian 20.85

4 Ancestral_South_Eurasian 0.82

5 East_Asian 0.35


Family Tree DNA data

Admix Results (sorted):


# Population Percent

1 West_European_Hunter_Gartherer 39.57

2 Natufian 38.66

3 Ancestral_North_Eurasian 20.75

4 East_Asian 0.86

5 Ancestral_South_Eurasian 0.16

Y Haplogroup L Resource page

Distribution Haplogroup L Y-DNA

By Crates (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC BY-SA 4.0-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/4.0-3.0-2.5-2.0-1.0)], via Wikimedia Commons.  Unmodified.

Introduction - Y-DNA, Haplogroups, SNPs, Haplotypes

The Y chromosome, and its Y-DNA, are copied from father to son, down a strictly paternal lineage.  If I were to trace my entire direct ancestry back, I have two parents, four grandparents, eight great grandparents, sixteen great great grandparents.  Yet out of those sixteen great great grandparents (generation 5), who were born only circa 160 years ago, only one carried the Y-DNA that was passed down to me.  My eight great great grandmothers did not inherit a Y chromosome from their fathers.  Most likely, my other seven great great grandfathers carried distinctive, differently marked Y-DNA.  Yet all sixteen biological great great grandparents have contributed to my overall atDNA (autosomal DNA).  Only one gave me my Y-DNA.  So you can see that Y-DNA represents only one narrow lineage.
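
The arithmetic behind that narrowing can be sketched in a few lines.  This is only an illustration, using the same generation numbering as above (generation 1 is the tester, generation 5 the great great grandparents) and ignoring pedigree collapse, where the same person fills more than one slot.  The function names are hypothetical.

```python
def ancestors_in_generation(generation: int) -> int:
    """Number of ancestor slots in a generation, with generation 1 = tester."""
    if generation < 1:
        raise ValueError("generation must be at least 1")
    return 2 ** (generation - 1)


def y_line_fraction(generation: int) -> float:
    """Share of that generation's ancestors who sit on the Y (paternal) line.

    There is exactly one Y-line ancestor per generation, however many
    ancestors that generation holds.
    """
    return 1 / ancestors_in_generation(generation)


if __name__ == "__main__":
    for gen in range(2, 12):
        total = ancestors_in_generation(gen)
        print(f"Generation {gen:2d}: {total:4d} ancestors, "
              f"1 on the Y line ({y_line_fraction(gen):.2%})")
```

By generation 11 the Y line is one ancestor in 1,024, which is why Y-DNA can say a great deal about one lineage and very little about overall ancestry.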

Y-DNA may, on the face of it, appear to offer a limited understanding of total biological ancestry.  All sixteen of my great great grandparents were direct ancestors, not just the Y great great grandfather.  However, this lineage offers us evidence that can be genetically tracked, then mapped into relationship.  It can be used to ascertain parental, or non-parental events.  It can be used to check the biological validity of relationship to cousins.  As more people investigate and record their haplogroups, haplotypes, STR markers, and SNPs, so we can, for example, start to use them to map biological relationship further back.  Y-DNA is particularly useful, not only because of its markers, but also because it can be plotted to surname studies.  In Western societies, the surname often follows the Y lineage for several generations.

However, the evidence from Y-DNA (and from the maternal mtDNA) doesn't just stop there.  As more people investigate, submit, and record their data from around the World - and as anthropologists and archaeologists add ancient DNA data from ancient and provenanced human remains to that record - so we can build and plot a world map of the human family, how it relates, and how it was distributed globally throughout prehistory.

Both Y-DNA and mtDNA carry mutation markers that define a haplogroup.  A haplogroup is a family of shared descent.  These haplogroups are ancient.  The paternal Y-DNA haplogroup that this resource page is dedicated to has been designated as L.

However, mutations do not stop with the formation of a new haplogroup.  They continue through the generations.  As lineages divide between different sons, across many generations, so these mutations in the Y-DNA continue to accumulate down the diverging lineages that once shared common descent.  We are all unique.  The sub clade of L that this page focuses on is L1b.  All male carriers of L1b will carry a SNP (Single Nucleotide Polymorphism) on their Y-DNA that has been designated as M317.  This SNP will be downstream of another SNP that has been designated as M22.  Finally, a Y-DNA can be said to have a terminal SNP.  A terminal refers not to the Haplogroup (in this case L), but can be used to define right down to the last SNP on the Y-DNA that is shared with others on a record.  If someone, for example, carries Y-DNA that is proven (or predicted by comparison) to be Y Haplogroup L, and to carry M317, then their Y terminal could be designated as L-M317, or alternatively, as L1b.  This is also sometimes referred to as a haplotype.  However, a haplotype can also refer to a particular set of STR values.

Y Haplogroup L M20

The above image illustrates a modern day distribution of Y Haplogroup L (M20) as proposed and created by Anthrogenica user Passa.

Y Haplogroup K formed from Y Haplogroup IJK in the Y-DNA of hunter-gatherer fathers and sons, who shared an MRCA (most recent common ancestor) during the Upper Palaeolithic, circa 45,400 years ago.  Where did these Y ancestors live at that time?  We think that they lived in Western or Southern Asia.  Iran is a favourite proposal.  Earlier Y ancestors had most likely exited Africa 20,000 years earlier, and were well established in Asia.  They had most likely met and confronted another archaic human species, the Neanderthal.  This was however, a time of great expansion by humans.  The first anatomically modern humans had recently entered Europe, while other modern humans were arriving in Australia.  The Ice Age was in a flux, but glaciation was advancing.

The most recent common Y ancestor to carry Y Haplogroup LT lived circa 42,600 years ago.  Then a mutation in the Y-DNA led to the formation of Y Haplogroup L, with a most recent common ancestor 23,200 years ago, close to the time of the Last Glacial Maximum, when ice sheets were reaching their maximum positions.  K, LT, and early L most likely all originated in Upper Palaeolithic hunter-gatherer populations living during the last Ice Age, in the area of modern day Syria, Iraq, Iran or Pakistan.  It was a time of increased stress on human populations, that were having to adapt to some severe environmental challenges, and may have at times faced isolation into a number of Ice Age Refuges.  Some of these Upper Palaeolithic, Ice Age hunter-gatherer refuges may have been close to the Black Sea, others close to the Caspian Sea, but they were most likely located somewhere between Eastern Anatolia, and Eastern Iran, south of the Caucasus.

L1 / L2 Divergence - the Odd L2's

The oldest divergence within Y Haplogroup L is that of L1, characterised by the SNP M22, from L2, characterised by the SNP L595.  L2 was only recently discovered, and forced an ISOGG revision of Y Haplogroup L and its nomenclature that is still causing problems.  In this article, unless stated otherwise, I am using 2017 Nomenclature.  L2 or L-L595 is very rare, but has so far cropped up sporadically across Western Eurasia, including in Azerbaijan, Turkey, Sardinia, England, and Tatarstan.

That is L2 dealt with.  However, most Y Haplogroup L falls into L1. Let us start to look at the main branches of L1.  Remember, L1 is defined by the SNP M22:

Unofficial proposed tree for L1 (L-M22) 2016.  By Gökhan Zuzigo, modified by Paul Brooker.

Proposed Migration Map of L-M22 (L1) by Phylogeographer at https://phylographer.com/mygrations/?

The Big L1 Split - L1a and L1b

As can be seen above, this split occurred around 18,400 years ago, possibly somewhere between what is now Iran and Pakistan.  The L1a branches inherited the SNP M2481, and the L1b branches inherited M317.

First of all, let's look at L1a, because although it is not my sub clade, in terms of modern day population size, it appears to greatly outnumber any other L sub clade.

Pakistan and India - Present Day Home of L1a1 and L1a2

L1a splits again into two sub clades.  The split occurred around 17,400 years ago.  L1a1 (L-M27) and L1a2 (L-M357)

L1a1 (L-M27)

L1a1, defined by SNP M27 (on older nomenclature as still used by 23andMe, this was formerly L1*), is mainly found in India, particularly South West India, and in Sri Lanka.  This is perhaps the most populous modern day L sub clade, found in 15% of Indian males.  However, it is not restricted to India, and has also been found in 20% of Balochi in Pakistan, and has also been reported in Kirghiz, Pashtun, Tajik, Uzbek, and Turkmen males across Central Asia.

L1a and L1a1 (L-M27) at Birds Eye Cave, Armenia 6161 years before present.

Ancient Y DNA from the Copper Age has emerged from this location in Armenia, and included L1a, and L1a1.  This might suggest, that although very successful today in India and Pakistan, that it has a Western Asian origin.

L1a2 (L-M357)

Defined by SNP M357 (on older nomenclature as still used by 23andMe, this was formerly L3*).  This sub clade is mainly found in Pakistan, but also Saudi Arabia, Kuwait, The Chechen Republic, Tajikistan, India, and Afghanistan.  It has been found at 15% in Burusho populations, and at 25% in Kalash populations.  It is much more common in Pakistan than in India.

So, the L1a sub clades - spreading down into Southern Asia, and accounting for potentially millions of Y Men there.  Far more than any other branches of Y Haplogroup L.  However, Southern Asia is unlikely to be the origin of L.  That origin is more likely, as stated earlier, to be the place with the most diversity in branches.  That points again towards Western Asia.  It's just that ancient carriers of L appear to have been particularly successful in Southern Asia, and to have fathered more sons there.

L-M317 or L1b of Western Asia

We now move onto the branches of particular interest to myself, because I carry a Y Haplotype that belongs here.  L1b is defined by the SNP M317, that formed circa 18,400 years ago, most likely in the area of modern day Iran, or elsewhere in Western Asia.

Phylogenetic tree of L1b by Anthrogenica user Caspian (with permission):

Click on above hyperlink for full sized image

L1b is mainly distributed across Western Asia, from modern day Turkey, across to Pakistan.  However, as we will see, it also spreads in low densities across parts of Europe.  It is very much the "Western L".

The Next split - L1b1 or L-M349.  The Levant, and Europe!

Around 14,000 years ago, another split occurred in the L1b (M317) branch.  A new SNP, M349, defined L1b1.  Today, L1b1, or L-M349, is found in Western Asia, in Lebanon, Syria, Turkey, Armenia, etc.  However, it is also found scattered in low densities through parts of Europe.  It crops up in South Europe, often close to the Mediterranean Sea, including particularly in parts of Italy.  It also forms a light cluster in Central Europe.

A working map of Y haplogroup L sub clades by Edward Chernoff.  This map is incomplete, but is published here with permission of Edward Chernoff.  Copyrights applied.

Branching away from a common Y ancestor with L1b1 (M349), is another 14,000 year old line defined by SNP SK1412, L1b2.

L1b2 (L-SK1412) splits - Pontic Greeks, and the others...

13,000 years ago, during a cold stage towards the end of the last Ice Age, the L1b2 (SK1412) Y branch divided again.  Very recent research suggests that it split into three lines: L-SK1415 (L1b2a), L-PH8 (L1b2b), and L-SK1414 (L1b2c).

L1b2a (L-SK1415), has as far as I know, only been detected in a Makrani Balochi survey in SW Pakistan.

L1b2b (PH8), is found in Turkey, Greece, Armenia, Chechen Republic, Iraq, etc.  It is associated particularly with the Pontic Greek ethnicity from Eastern Anatolia, and around the Black Sea.  A further division within PH8 has been detected at around 3,000 years ago.

Finally ... mine:

L1b2c (L-SK1414, FGC51074), has so far been SNP detected only in Makrani Balochi, in SW Pakistan, Gujarat, India, Turkey, Cyprus, Saudi Arabia, Lebanon (Druze), and in England.  STR predictions for L-SK1414 have also been found in Goa, Syria, Iraq, Kuwait, UAE, Tatarstan, France, Italy, Iran, and Azores.  In addition to SK1414, with the assistance of Gareth Henson, and through a FT-DNA Big Y test accompanied by further analysis of the raw data by YFull and FullGenomes, I have ascertained 117 novel SNPs that are looking for first time matches.  As can be imagined, I'm very keen that further L Y-Men should test.
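
The branching described in the sections above can be sketched as a small lookup table.  This is only an illustration of the structure as I have laid it out on this page, with the 2017 nomenclature and the defining SNP that I have quoted for each branch.  It is not an official ISOGG or YFull tree, and the variable and function names are hypothetical.

```python
# Each entry maps a clade name to (defining SNP, parent clade or None).
# Only the branches discussed on this page are included.
L_TREE = {
    "L": ("M20", None),
    "L1": ("M22", "L"),
    "L2": ("L595", "L"),
    "L1a": ("M2481", "L1"),
    "L1b": ("M317", "L1"),
    "L1a1": ("M27", "L1a"),
    "L1a2": ("M357", "L1a"),
    "L1b1": ("M349", "L1b"),
    "L1b2": ("SK1412", "L1b"),
    "L1b2a": ("SK1415", "L1b2"),
    "L1b2b": ("PH8", "L1b2"),
    "L1b2c": ("SK1414", "L1b2"),
}


def snp_path(clade: str) -> list:
    """Return the defining SNPs from the root (M20) down to the given clade."""
    path = []
    current = clade
    while current is not None:
        snp, parent = L_TREE[current]
        path.append(snp)
        current = parent
    return list(reversed(path))


def children(clade: str) -> list:
    """Return the clades that branch directly from the given clade."""
    return sorted(name for name, (_, parent) in L_TREE.items()
                  if parent == clade)


if __name__ == "__main__":
    print("L1b2c:", " > ".join(snp_path("L1b2c")))
    print("Branches of L1b2:", ", ".join(children("L1b2")))
```

Walking the table upwards from L1b2c gives the same chain of SNPs as the first part of my own designation, listed at the end of this page.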

Distribution may be connected to the dispersal of the recently identified group known as the Iranian Neolithic Farmers after 10,000 BCE:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113750/

In addition there would appear to be a potential link between this group, and the inhabitants of the Harappan or Indus Valley civilization. Below is my proposed distribution of L SK1414 / FGC51074:

Those tentative European Y haplogroup L links

We have seen above that, again and again, Y haplogroup L (M20) and several of its sub clades appear to have Western Asian origins, despite the success of some of those sub clades today in India and Pakistan.  Y haplogroup L has not been linked to the Yamna hypothesis, that has taken credit for the origin of many haplogroups that are successful today in Europe.  Y-DNA L was located to the southern side of the Caucasus, between present day Turkey and Pakistan.  However, two particular Y-DNA L sub clades do make mysterious appearances across Europe.

1) L-L595 (L2) has only recently been discovered, so far, exclusively across Europe, in very low numbers.

2) L-M349 (L1b1), downstream of M317, also spreads across South Europe, and clusters at the Rhine-Danube.  I have on 23andMe forums, seen a number of testers that unfortunately have not tested their Y elsewhere, claim Ashkenazi paternal ancestry, but this is far from common to all European L-M349 samples. Although rarely forming much more than 1% of all Y along the Mediterranean coast of Southern Europe, this percentage does occasionally rise higher, for example, in parts of Italy.

When did L2 or even L1b1 enter Europe?  L2 has only so far been found in Europe.  There are some suggestions that some European L could be survivors from the Eurasian Neolithic.  However, ancient DNA has not yet been found to support this hypothesis. 

Prime resources

L Yfull Tree

https://www.yfull.com/tree/L/

Wikipedia entry for Y Haplogroup L-M20

https://en.wikipedia.org/wiki/Haplogroup_L-M20

FTDNA L The Y Haplogroup L Project

https://www.familytreedna.com/public/Y-Haplogroup-L/

Marco Cagetti's Y Haplogroup L

http://www.cagetti.com/Genetics/L-haplogroup.html

Anthrogenica Y Haplogroup L Forum Board

http://www.anthrogenica.com/forumdisplay.php?37-L

ISOGG 2009 Y Haplogroup L (Useful for understanding 23andMe Y haplogroup result of L2*)

http://isogg.org/tree/2009/ISOGG_HapgrpL09.html

ISOGG 2017 Y Haplogroup L

http://isogg.org/tree/ISOGG_HapgrpL.html

Other resources

Eupedia Y-DNA Haplogroup L

http://www.eupedia.com/europe/origins_haplogroups_europe.shtml#L

23andMe users should note that the company in 2016, still used a very outdated ISOGG nomenclature system.  My 23andMe reported haplotype was L2*.  However, using ISOGG 2016, this is now L1b (L-M317).  NOT to be confused with modern day L2 (L-L595).

Facebook Y Haplogroup L Group

https://www.facebook.com/groups/773887796013634/

L-M317 STR Alpine cluster article

https://figshare.com/articles/L_M317_STR_marker_likelihood_tree_focuing_Alpine_cluster/105684

Familypedia Wiki for Haplogroup L

http://familypedia.wikia.com/wiki/Haplogroup_L_(Y-DNA)

For personal note  as of 2018-08-28.

My Y Haplogroup L Designation:

L +M20 +M22 +M317 +SK1412 +SK1414 (or FGC51074) +FGC51041 (or Y31947) +FGC51036

L-SK1414 = L1b2c

SK1414/FGC51074 age estimate current 9,300 years bp.

L-FGC51041 is a verified terminal

FGC51041/Y31947 age estimate current 6,000 years ago but only 2 samples on Ytree.

L-FGC51036 is terminal on FT-DNA

L-Y31947 is terminal on yFull

115 novel SNPs