Counting the SNPs - 23andMe V FT-DNA

Comparing 23andMe V4 kit raw file to FT-DNA raw file.

Both tests were taken by me this year (2016).  Here I am comparing the quality of two separate atDNA tests from the same person, by two different DNA for Ancestry companies.  As will be seen, the quality varies considerably, at least in terms of the number of SNPs that are tokenized once forwarded to GEDmatch.com.  This is NOT a test of how well both companies ascertain our DNA ancestry from these files.  Both use their own reference populations and analysis programs.  I've reviewed that elsewhere.  This test simply weighs how many SNPs are registered from the autosomes and X chromosome of one person.

Using the GEDmatch DNA file diagnostic utility, I received the following SNP counts:

Kit M551698 (23andMe V4)

Token File data:
Chr Token SNP Count
1 40974
2 42110
3 34199
4 31020
5 30421
6 36383
7 26352
8 27900
9 23644
10 27888
11 25363
12 25395
13 19880
14 15957
15 15529
16 16551
17 13745
18 16775
19 9006
20 13530
21 7324
22 7386
X 15359

Processed in batch 5355
Number of SNPs utilized by GEDmatch template = 523997
Number of regular SNPs = 517780
Heterozygosity index = 0.302721 (fraction of total SNPs that are heterozygous)
No-calls = 4911 = 0.93956084952678 percent.
Kit M551698 has approximately 19959 total matches with other kits. Of these matches there are 4982 >= 7cM and 14977 < 7cM.


Kit T444495 (FT-DNA file):

Chr Token SNP Count
1 57931
2 59602
3 47094
4 41772
5 39314
6 47546
7 36567
8 36753
9 30643
10 36889
11 35941
12 35850
13 26763
14 22650
15 20899
16 21935
17 18379
18 22586
19 12773
20 19587
21 10001
22 9750
X 19176

Processed in batch 5914
Number of SNPs utilized by GEDmatch template = 709242
Number of regular SNPs = 694324
Heterozygosity index = 0.281384 (fraction of total SNPs that are heterozygous)

No-calls = 16077 = 2.263088030563 percent.

Kit T444495 has approximately 48755 total matches with other kits. Of these matches there are 9351 >= 7cM and 39404 < 7cM.

Conclusion

If the quality of a raw atDNA file is merely down to the number of SNPs that are tested, then FT-DNA clearly wins hands down, when compared with the 23andMe file, following tokenization for GEDmatch use.  The FT-DNA file utilises 709,242 SNPs compared with 23andMe's 523,997 SNPs, roughly 35% more.
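As a rough check on the figures above, the per-chromosome token counts can be summed and the no-call percentages recomputed.  This is only a sketch: the numbers are copied from the two GEDmatch reports, the variable and function names are my own, and the idea that GEDmatch divides no-calls by the token total is an inference from the numbers, not something GEDmatch documents here.

```python
# Per-chromosome token SNP counts copied from the GEDmatch diagnostic output
# above, in the order chromosomes 1-22 followed by X.
TOKENS_23ANDME = [
    40974, 42110, 34199, 31020, 30421, 36383, 26352, 27900, 23644, 27888,
    25363, 25395, 19880, 15957, 15529, 16551, 13745, 16775, 9006, 13530,
    7324, 7386, 15359,
]
TOKENS_FTDNA = [
    57931, 59602, 47094, 41772, 39314, 47546, 36567, 36753, 30643, 36889,
    35941, 35850, 26763, 22650, 20899, 21935, 18379, 22586, 12773, 19587,
    10001, 9750, 19176,
]


def summarise(tokens, no_calls):
    """Return (total token SNPs, no-calls as a percentage of that total)."""
    total = sum(tokens)
    return total, 100.0 * no_calls / total


# No-call counts are the ones reported for each kit above.
total_23, pct_23 = summarise(TOKENS_23ANDME, 4911)
total_ft, pct_ft = summarise(TOKENS_FTDNA, 16077)

print(f"23andMe V4: {total_23} token SNPs, {pct_23:.3f}% no-calls")
print(f"FT-DNA:     {total_ft} token SNPs, {pct_ft:.3f}% no-calls")
print(f"FT-DNA / 23andMe token ratio: {total_ft / total_23:.2f}")
```

The token totals come out slightly different from the "SNPs utilized by GEDmatch template" lines, but the recomputed no-call percentages agree with the ones GEDmatch printed, which is why I assume the token total is the denominator.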

I thought that it might be interesting to see how these two files, from the same person, compare on the same GEDmatch heritage admixture program.

On Eurogenes K13 Oracle, my 23andMe kit gets the following top ten closest GDs (genetic distances):

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97
6 French 7.63
7 North_Dutch 7.76
8 Danish 7.95
9 North_German 8.17
10 Irish 8.22

On the same, using my FT-DNA kit (with many more SNPs tested, as demonstrated above):

1 Southeast_English 3.75
2 South_Dutch 4.03
3 West_German 5.42
4 Southwest_English 5.68
5 Orcadian 6.33
6 North_Dutch 7.15
7 Danish 7.36
8 Irish 7.59
9 West_Scottish 7.62
10 North_German 7.7
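For anyone wanting to compare the two Oracle lists without squinting, here is a small sketch that reports which populations appear in both top tens and how far each moved in rank.  The population names are copied from the two lists above; the variable names are my own, and this says nothing about how Eurogenes K13 itself calculates the distances.

```python
# Top ten Oracle populations for each kit, in rank order, copied from above.
TOP10_23ANDME = [
    "South_Dutch", "Southeast_English", "West_German", "Southwest_English",
    "Orcadian", "French", "North_Dutch", "Danish", "North_German", "Irish",
]
TOP10_FTDNA = [
    "Southeast_English", "South_Dutch", "West_German", "Southwest_English",
    "Orcadian", "North_Dutch", "Danish", "Irish", "West_Scottish",
    "North_German",
]

# Populations present in both lists, and those unique to each.
shared = [p for p in TOP10_23ANDME if p in TOP10_FTDNA]
only_23andme = [p for p in TOP10_23ANDME if p not in TOP10_FTDNA]
only_ftdna = [p for p in TOP10_FTDNA if p not in TOP10_23ANDME]

# Positive shift = the population ranks higher (closer) on the FT-DNA kit.
rank_shift = {
    p: TOP10_23ANDME.index(p) - TOP10_FTDNA.index(p) for p in shared
}

print("Shared:", len(shared), "of 10")
print("Only in 23andMe list:", only_23andme)
print("Only in FT-DNA list:", only_ftdna)
for population, shift in rank_shift.items():
    print(f"{population}: {shift:+d}")
```

The two kits agree on nine of the ten populations, so the extra SNPs shuffle the order a little but do not change the overall picture.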

Based on the numbers of SNPs tokenized, I will in future regard the FT-DNA (Family Tree DNA) file as superior in quality, over the 23andMe file, despite my disappointment in the FT-DNA My Origins ancestry analysis.

A new test - LivingDNA test for Ancestry

You might think, following my recent posts, that I've lost all faith in DNA testing for Ancestry.  Not at all.  I just object when people take the analysis results of autosomal DNA tests for ancestry as infallible truths.  They are clearly not.

So far this year, I have commissioned two 23andMe tests, three FT-DNA tests, a FullGenomes analysis, and a YFull analysis.  I have also used free analysis at WeGene and DNA.land, and have run three raw files on GEDmatch calculators.  You might also think that I've done enough testing for one year!  I thought that as well.  Then a new service entered the market.

Living DNA Ancestry attracted my commission on two particular points.  1) It has an incredible British reference, that promises to break ancestry composition into 30 British regions - in addition to global analysis.  If it works, then this is a must for people with significant British ancestry.  2) It uses the latest cutting edge test chip, a new Illumina chip based on the Global Screening Array (GSA).  In addition, it uses a European based lab (Denmark), and it tests Y-DNA, mtDNA, and autosomes.  It tests more SNPs on all three counts than other current chips used by competitors offering autosomal plus tests.  Raw files for the test results will be available for download.

The British Reference

Living DNA will be using a British reference broken down into an incredible 30 regions, across England, Scotland, Wales, Orkney, and Northern Ireland.  The reference uses the much heralded POBI (People of the British Isles, 2015) data set.  This project collected 4,500 blood samples from people that could claim four grandparents in the same area, from across the regions of Britain.

The British reference does not include the Republic of Ireland.  However, LivingDNA are confident that they have collected a good global reference, and I understand, that they are seeking a similar quality Irish data-set for the future.  

In comparison, other providers of DNA tests for ancestry refer only to Britain, or the British Isles & Ireland, as a single reference point.  And as can be seen from my previous posts, with limited success.

They also hope, in the future, to accept raw files in the formats used by other test companies.  LivingDNA do not themselves currently offer relative matching, or health information.  Their service is for now, primarily for ancestry.

The Chip

They will be using a custom version of the latest Illumina chip technology, the Global Screening Array (GSA).  It is encoded with:

650,000 autosomal DNA SNPs

20,000 Y-DNA SNPs

4,000 MT-DNA SNPs.

In comparison for example, the 23andMe V4 chip scans for:

577,000 atDNA SNPs

2,329 Y-DNA SNPs

3,100 MT-DNA SNPs.

I hope that LivingDNA will also use up-to-date haplogroup nomenclature and information.  23andMe with their V4 chip still use very dated 2009 nomenclature.

So, let's see if this new service is any improvement to my results, compared with the hit and miss of 23andMe, and Family Tree DNA (FT-DNA).  Will they be able to identify and locate my English roots successfully?  What will the improved chip make of my haplogroups?

Y Haplogroup L Resource page

Distribution Haplogroup L Y-DNA

By Crates (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC BY-SA 4.0-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/4.0-3.0-2.5-2.0-1.0)], via Wikimedia Commons.  Unmodified.

Introduction - Y-DNA, Haplogroups, SNPs, Haplotypes

The Y chromosome, and its Y-DNA, are copied from father to son, down a strictly paternal lineage.  If I were to trace my entire direct ancestry back, I have two parents, four grandparents, eight great grandparents, sixteen great great grandparents.  Yet out of those sixteen great great grandparents (generation 5), who were born only circa 160 years ago, only one carried the Y-DNA that was passed down to me.  My eight great great grandmothers did not inherit a Y chromosome from their fathers.  Most likely, my other seven great great grandfathers carried distinctive, differently marked Y-DNA.  Yet all sixteen biological great great grandparents have contributed to my overall atDNA (autosomal DNA).  Only one gave me my Y-DNA.  So you can see that Y-DNA represents only one narrow lineage.
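The doubling in that paragraph is easy to put into numbers.  The sketch below counts myself as generation 1, as the paragraph does, and assumes no pedigree collapse (no cousin marriages reducing the number of distinct ancestors); the function names are my own.

```python
def ancestors_in_generation(generation: int) -> int:
    """Direct ancestors in a given generation, counting myself as generation 1.

    Assumes every ancestor is a distinct person (no pedigree collapse).
    """
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    return 2 ** (generation - 1)


def y_line_share(generation: int) -> float:
    """Fraction of that generation's ancestors who sit on the Y-DNA line.

    Exactly one ancestor per generation is on the strict paternal line.
    """
    return 1 / ancestors_in_generation(generation)


for generation in range(1, 6):
    print(
        f"generation {generation}: "
        f"{ancestors_in_generation(generation)} ancestors, "
        f"Y line share {y_line_share(generation):.4f}"
    )
```

At generation 5 that gives sixteen ancestors and a Y line share of one in sixteen, and the share keeps halving with every generation further back.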

Y-DNA may, on the face of it, appear to offer a limited understanding of total biological ancestry.  All sixteen of my great great grandparents were direct ancestors, not just the Y great great grandfather.  However, this lineage offers us evidence that can be genetically tracked, then mapped into relationship.  It can be used to ascertain parental, or non parental events.  It can be used to check the biological validity of relationship to cousins.  As more people investigate and record their haplogroups, haplotypes, STR markers, and SNPs, so we can for example, start to use them to map biological relationship further back.  Y-DNA is particularly useful, not only because of its markers, but also because it can be plotted to surname studies.  In Western societies, the surname often follows the Y lineage for several generations.

However, the evidence from Y-DNA (and from the maternal mtDNA) doesn't just stop there.  As more people investigate, submit, and record their data from around the World - and as anthropologists and archaeologists add ancient DNA data from provenanced ancient human remains to that record - so we can build and plot a world map of the human family, how it relates, and how it was distributed globally throughout prehistory.

Both Y-DNA and mtDNA carry mutation markers that define a Haplogroup.  A haplogroup is a family of shared descent.  These haplogroups are ancient.  The paternal Y-DNA haplogroup that this resource page is dedicated to has been designated as L.

However, mutations do not stop with the formation of a new haplogroup.  They continue through the generations.  As lineages divide between different sons, across many generations, so these mutations in the Y-DNA for example, continue to accumulate down the diverging lineages that once shared common descent.  We are all unique.  The sub clade of L that this page focuses on is L1b.  All male carriers of L1b will carry a SNP (Single Nucleotide Polymorphism) on their Y-DNA that has been designated as M317. This SNP will be downstream of another SNP that has been designated as M22.  Finally, a Y-DNA can be said to have a Haplotype.  A Haplotype, refers not only to the Haplogroup (in this case L), but can be used to define right down to the last SNP on the Y-DNA, that is shared with others on a record.  If someone for example, carries Y-DNA that is proven (or predicted by comparison) to be Y Haplogroup L, and to carry M317, then their Y Haplotype could be designated as L-M317, or alternatively, as L1b.

Y Haplogroup L M20

The above image illustrates a modern day distribution of Y Haplogroup L (M20) as proposed and created by Anthrogenica user Passa.

Y Haplogroup K formed from Y Haplogroup IJK in the Y-DNA of hunter-gatherer fathers and sons, that share a MRCA (most recent common ancestor) during the Upper Palaeolithic, circa 45,400 years ago.  Where did these Y ancestors live at that time?  We think that they lived in Western or Southern Asia.  Iran is a favourite proposal.  Earlier Y ancestors had most likely exited Africa 20,000 years earlier, and were well established in Asia.  They had most likely met and confronted another archaic human species, the Neanderthals.  This was however, a time of great expansion by humans.  The first anatomically modern humans had recently entered Europe, while other modern humans were arriving in Australia.  The Ice Age was in flux, but glaciation was advancing.

The most recent common Y ancestor to carry Y Haplogroup LT lived circa 42,600 years ago.  Then a mutation in the Y-DNA led to the formation of Y Haplogroup L, with a most recent common ancestor 23,200 years ago, close to the time of the Last Glacial Maximum, when ice sheets were reaching their maximum positions.  K, LT, and early L, most likely all originated in Upper Palaeolithic hunter-gatherer populations living during the last Ice Age, in the area of modern day Iran and Iraq.  It was a time of increased stress on human populations, that were having to adapt to some severe environmental challenges, and may have at times faced isolation into a number of Ice Age Refuges.  Some of these Upper Palaeolithic, Ice Age hunter-gatherer refuges may have been close to the Black Sea, others close to the Caspian Sea, but they were most likely located somewhere between Eastern Anatolia, and Eastern Iran, south of the Caucasus.

L1 / L2 Divergence - the Odd L2's

This is the oldest divergence within Y Haplogroup L.  L1, as characterised by the SNP M22, diverged from L2, as characterised by the SNP L595.  L2 was only recently discovered, and forced an ISOGG revision of Y Haplogroup L and its nomenclature that is still causing problems.  In this article, unless stated otherwise, I am using 2016 Nomenclature.  L2 or L-L595 is very rare, but has so far cropped up sporadically across Europe, including in Sardinia.

That is L2 dealt with.  However, most Y Haplogroup L falls into L1. Let us start to look at the main branches of L1.  Remember, L1 is defined by the SNP M22:

Unofficial proposed tree for L1 (L-M22) 2016.  By Gökhan Zuzigo, modified by Paul Brooker.

The Big L1 Split - L1a and L1b

As can be seen above, this split occurred around 18,400 years ago, possibly somewhere between what is now Iran and Pakistan.  The L1a branches inherited the SNP M2481, and the L1b branches inherited M317.

First of all, let's look at L1a, because although it is not my sub clade, in terms of modern day population size, it appears to greatly outnumber any other L sub clade.

Pakistan and India - Present Day Home of L1a1 and L1a2

L1a splits again into two sub clades.  The split occurred around 17,400 years ago.  L1a1, as defined by SNP M27 (on older nomenclature as still used by 23andMe, this was formerly L1*) is mainly found in India, particularly South West India, and in Sri Lanka, where it has been projected onto 15% of men.  It is however, also found outside of India, in Saudi Arabia, Iraq, Pakistan, and Iran.  This is perhaps the most populous modern day L sub clade, found in 14.5% of Indian males.

L1a and L1a1 (L-M27) at Birds Eye Cave, Armenia 6161 years before present.

Ancient Y DNA from the Copper Age has recently emerged from this location, and included L1a, and L1a1.  This might suggest that, although very successful today in India and Pakistan, it has a Western Asian origin.

L1a2 is defined by SNP M357 (on older nomenclature as still used by 23andMe, this was formerly L3*).  This sub clade is mainly found in Pakistan, but also Saudi Arabia, Kuwait, The Chechen Republic, Tajikistan, India, and Afghanistan.

So, the L1a sub clades - spreading down into Southern Asia, and accounting for potentially millions of Y Men there.  Far more than any other branches of Y Haplogroup L.  However, Southern Asia is unlikely to be the origin of L.  That origin is more likely, as stated earlier, to be the place with the most diversity in branches.  That points again towards Western Asia.  It's just that ancient carriers of L appear to have been particularly successful in Southern Asia, and to have fathered more sons there.

L-M317 or L1b of Western Asia

We now move onto the branches of particular interest to myself, because I carry a Y Haplotype that belongs here.  L1b is defined by the SNP M317, that formed circa 18,400 years ago, most likely in the area of modern day Iran, or elsewhere in Western Asia.

Phylogenetic tree of L1b by Anthrogenica user Caspian (with permission):

L1b is mainly distributed across Western Asia, from modern day Turkey, across to Pakistan.  However, as we will see, it also spreads in low densities across parts of Europe.  It is very much the "Western L".

The Next split - L1b1 or L-M349.  The Levant, and Europe!

Around 14,000 years ago, another split occurred in the L1b (M317) branch.  A new SNP, M349, defined L1b1.  Today, L1b1, or L-M349, is found in Western Asia, in Lebanon, Syria, Turkey, Armenia, etc.  However, it is also found scattered in low densities through parts of Europe.  It crops up in South Europe, often close to the Mediterranean Sea, including particularly in parts of Italy.  It also forms a light cluster in Central Europe.

A working map of Y haplogroup L sub clades by Edward Chernoff.  This map is incomplete, but is published here with permission of Edward Chernoff.  Copyrights applied.

Branching away from a common Y ancestor with L1b1 (M349), is another 14,000 year old line defined by SNP SK1412, L1b2.

L1b2 (L-SK1412) splits - Pontic Greeks, and the others...

13,000 years ago, during a cold stage towards the end of the last Ice Age, the L1b2 (SK1412) Y branch divided again.  Very recent research suggests that it split into three lines: L-SK1415 (L1b2a), L-PH8 (L1b2b), and L-SK1414 (L1b2c).

L1b2a (L-SK1415), has as far as I know, only been detected in a Makrani Balochi survey in SW Pakistan.

L1b2b (PH8), is found in Turkey, Greece, Armenia, Chechen Republic, Iraq, etc.  It is associated particularly with the Pontic Greek ethnicity from Eastern Anatolia, and around the Black Sea.  A further division within PH8 has been detected at around 3,000 years ago.

Finally ... mine:

L1b2c (L-SK1414), has so far been detected only in Makrani Balochi, in SW Pakistan, and in England!  It has also been predicted for STR testers from Eastern Iran, and Saudi Arabia.  In addition to SK1414, with the assistance of Gareth Henson, an FT-DNA Big Y test, and further analysis of the raw data by YFull and FullGenomes, I have ascertained 117 novel SNPs that are still looking for first time matches.  As can be imagined, I'm very keen that further L Y-Men should test.
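The branching described in the sections above can be written down as a simple parent table, which makes it easy to trace any sub clade back to L-M20.  This is only a sketch of the tree as I have described it on this page (2016 nomenclature), not the full ISOGG or YFull tree, and the names TREE and lineage are my own.

```python
# Each sub clade maps to (parent sub clade, defining SNP), as described in
# this article using 2016 nomenclature.  Only branches named above appear.
TREE = {
    "L": (None, "M20"),
    "L1": ("L", "M22"),
    "L2": ("L", "L595"),
    "L1a": ("L1", "M2481"),
    "L1b": ("L1", "M317"),
    "L1a1": ("L1a", "M27"),
    "L1a2": ("L1a", "M357"),
    "L1b1": ("L1b", "M349"),
    "L1b2": ("L1b", "SK1412"),
    "L1b2a": ("L1b2", "SK1415"),
    "L1b2b": ("L1b2", "PH8"),
    "L1b2c": ("L1b2", "SK1414"),
}


def lineage(clade):
    """Return the path from the root (L) down to the given sub clade."""
    path = []
    while clade is not None:
        parent, snp = TREE[clade]
        path.append(f"{clade} ({snp})")
        clade = parent
    return list(reversed(path))


print(" > ".join(lineage("L1b2c")))
print(" > ".join(lineage("L1a2")))
```

Tracing my own L1b2c this way passes through M20, M22, M317 and SK1412 before reaching SK1414, while the Pontic Greek L1b2b line shares everything down to SK1412 and then ends in PH8.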

Those tentative European Y haplogroup L links

We have seen above that, again and again, Y haplogroup L (M20) and several of its sub clades appear to have Western Asian origins, despite the success of some of those sub clades today in India and Pakistan.  Y haplogroup L has not been linked to the Yamna hypothesis, that has taken credit for the origin of many haplogroups that are successful today in Europe.  Y-DNA L was located to the southern side of the Caucasus, between present day Turkey and Pakistan.  However, two particular Y-DNA L sub clades do make mysterious appearances across Europe.

1) L-L595 (L2) has only recently been discovered, so far, exclusively across Europe, in very low numbers.

2) L-M349 (L1b1), downstream of M317, also spreads across South Europe, and clusters at the Rhine-Danube.  I have on 23andMe forums, seen a number of testers that unfortunately have not tested their Y elsewhere, claim Ashkenazi paternal ancestry, but this is far from common to all European L-M349 samples. Although rarely forming much more than 1% of all Y along the Mediterranean coast of Southern Europe, this percentage does occasionally rise higher, for example, in parts of Italy.

When did L2 or even L1b1 enter Europe?  L2 has only so far been found in Europe.  There are some suggestions that some European L could be survivors from the Eurasian Neolithic.  However, ancient DNA has not yet been found to support this hypothesis. 

Prime resources

L Yfull Tree

https://www.yfull.com/tree/L/

Wikipedia entry for Y Haplogroup L-M20

https://en.wikipedia.org/wiki/Haplogroup_L-M20

FTDNA Y Haplogroup L Project

https://www.familytreedna.com/public/Y-Haplogroup-L/

Marco Cagetti's Y Haplogroup L

http://www.cagetti.com/Genetics/L-haplogroup.html

Anthrogenica Y Haplogroup L Forum Board

http://www.anthrogenica.com/forumdisplay.php?37-L

ISOGG 2009 Y Haplogroup L (Useful for understanding 23andMe Y haplogroup result of L2*)

http://isogg.org/tree/2009/ISOGG_HapgrpL09.html

ISOGG 2016 Y Haplogroup L

http://isogg.org/tree/ISOGG_HapgrpL.html

Other resources

Eupedia Y-DNA Haplogroup L

http://www.eupedia.com/europe/origins_haplogroups_europe.shtml#L

Cropped image of the top of my 23andme Haplogroup Mutation Tree Mapper:

23andMe users should note that the company in 2016, still uses a very outdated ISOGG nomenclature system.  My 23andMe reported haplotype was L2*.  However, using ISOGG 2016, this is now L1b (L-M317).  NOT to be confused with modern day L2 (L-L595).

Facebook Y Haplogroup L Group

https://www.facebook.com/groups/773887796013634/

L-M317 STR Alpine cluster article

https://figshare.com/articles/L_M317_STR_marker_likelihood_tree_focuing_Alpine_cluster/105684

Familypedia Wiki for Haplogroup L

http://familypedia.wikia.com/wiki/Haplogroup_L_(Y-DNA)


L1b2b (L-PH8) homelands with Roman political boundaries circa 50AD

This map illustrates political zones at that time, across the area which today includes some of the highest percentages of Y-DNA L1b2b.

By Cplakidas - Based on Image:Arshakuni Armenia 150-en.svg. Province & client state outlines based on: Atlas of Classical History, Routledge 1985, pp. 160-162; History Map of Europe, Year 1 from Euratlas, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=6431799

The Other SK1414. My Cousin in Baluchistan

By Baluchistan on Flickr under a Creative Commons Licence. No, this young man is not the SK1414 tester, but the mandolinist in me found this photo kind of cool.  A young man from Makran.  The other SK1414 tester was also a male Makrani Baloch.

I'm hot on the trail of my Y or paternal line, following my FTDNA Y111 STR, then Big Y tests.  These tests analysed the DNA on my Y chromosome.  It is passed down strictly from father to biological sons.  The mutations (SNP and STR) that can be identified in the Y-DNA can be used to assess relationship, and in some cases, to date the time of most recent common ancestry.  So, with the assistance of Gareth Henson, administrator of the FT-DNA Y haplogroup L Project, and with help from my new distant cousins, what have I learned over the past few weeks?

The Smoking Gun of Y-DNA

Between 45,000 and 13,000 years ago, my paternal ancestors most likely were hunter-gatherers, that lived in the region of what is now Iran and Iraq, during the last Ice Age.  Some sharp changes in glaciation, and cold extremes towards the end of that period, may have generated a number of adaptations, and subsequently, split new sub clades of my Y haplogroup L.

13,000 years ago (based on the Big Y test), I share a common paternal great x grandfather with a number of distant cousins, that descend from Pontic Greek families from the Trabzon region in Turkey.

Between 3,000 and 1,000 years ago (based on the less accurate STR evidence at 111 markers), I share a common paternal great x grandfather with another cousin, whose paternal line, Habibi, can be traced back to the 1850's in the town of Birjand, Southern Khorasan, Eastern Iran, close to the modern Afghanistan border.  This closer cousin now lives in Australia.

Human male karyotype, high resolution - Y chromosome

My Big Y test produced no fewer than 90 previously unrecorded or unknown SNP (pronounced "snip") mutations.  That might be because my Y-DNA is rare, and / or because it is mainly found in parts of the World where very few people test at this level.  The last SNP on the roll that had been seen before has been called SK1414.  Because two of us have now tested positive for this SNP, it is my terminal SNP, so at the moment (although it still has to be submitted to the YFull Tree), I can declare my Y haplogroup sub clade designation to be L-SK1414.  Only one of two so far recorded in the World.

So, who is this Y cousin that shares my SK1414 mutation?

My Baluchistan Cousin

By Baluchistan on Flickr under a Creative Commons Licence.  Another photo from Makran, Balochistan.

The other SK1414 turned up during an early survey, back in the early 2000s by the Human Genome Diversity Project.  It turned up in a sample of the Baluchi in Makran, South-west Pakistan.  Could this cousin be closer than the Habibi tester?  Could my Habibi cousin, from an eastern Iranian family also carry SK1414?

The Baluch are an Iranic people that speak Baluchi, an Iranian language that belongs, as do most European languages, to the Indo-European linguistic family.  According to the Iran Chamber Society website, they moved to Makran during the 12th Century AD.  Traditionally the Baluch claim that they originated in Syria, but a linguistic study has instead suggested that they actually originated from the south east of the Caspian region, and that they moved eastwards between the 6th and 12th centuries AD in a series of waves.  No other examples of Y sub clade L1b (L-M317) have been found in Southern Asia outside of two samples of this survey, so perhaps the tester did have ancestry from Western Asia.

Map of the regions of Iran.

It would seem likely that I do have a number of Y cousins, most likely in the region of Eastern Iran and South-Western Pakistan.  It doesn't necessarily follow though, that our most recent common Y ancestors lived there.  As I said above, the Baluch of Makran, Pakistan are said to have migrated from further north-west, from the Caspian Sea region.

There is a tentative suggestion of a link to the Parsi. A Portuguese STR tester with a genetic distance (based on 67 markers) of 22, has (thanks again Gareth) "a distinctive value of 10 at DYS393. In the Qamar paper this value is found in the Parsi population".  So there is just the possibility also, of the Parsi ethnicity carrying L1b from Western Asia into Southern Asia.  Perhaps this marker was picked up by a Portuguese seafarer link to Southern Asia.  It could even be the link to my English line, via the Anglo-Portuguese Alliance.  A lot of speculation.  I don't think that M317 has been found yet in India.

Into England

I have found STR links with four people that carry the surname Chandler.  They live in England, Australia, and the USA.  These cousins appear to descend from a Thomas Chandler, that lived in Basingstoke during the 1740s.  That is 32 miles away from my own contemporary surname ancestor, John Brooker, who lived at the same time at the village of Long Wittenham in the Thames Valley.

Unfortunately three of the Chandlers have only 12 markers tested, and the fourth has 37 markers.  Therefore the time of the most recent common ancestor is not accurate, but it looks as though the Chandler and Brooker Y hg L testers of Southern England most likely shared a common paternal great x grandfather sometime between 800 and 350 years ago.

That only these two lines have turned up, and that they are geographically and genetically so close, might suggest that our Y-DNA lineage arrived in Southern England around the late medieval period, perhaps between the 13th and 17th centuries AD.  It could just be through a Portuguese navigator link, or it could be through thousands of other routes.  More L-M20 testers could turn up in England in the future, that could push the arrival to an earlier date.
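The step from "between 800 and 350 years ago" to "between the 13th and 17th centuries AD" is just subtraction from the year of writing.  A minimal sketch, assuming 2016 as the reference year; the function names are my own.

```python
REFERENCE_YEAR = 2016  # the year this article was written


def years_ago_to_year(years_ago, reference_year=REFERENCE_YEAR):
    """Convert 'N years ago' into a calendar year AD."""
    return reference_year - years_ago


def century_of(year):
    """Century AD containing a calendar year (1201-1300 is the 13th)."""
    return (year - 1) // 100 + 1


# The STR-based window for the Chandler / Brooker common ancestor.
earliest = years_ago_to_year(800)
latest = years_ago_to_year(350)

print(f"{earliest} AD falls in century {century_of(earliest)}")
print(f"{latest} AD falls in century {century_of(latest)}")
```

With only 12 and 37 markers behind the estimate, the window itself is rough, so the century labels should be read as loosely as the years they come from.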

Today

I could have any number of cousins from south England.  The Brookers and Chandlers may well have other paternal line descendants living in the Thames Valley, Hampshire, London, or elsewhere.  I'd love to prove a Brooker from the Berkshire / Oxfordshire area, as sharing ancestry.  I believe for example, that the journalist Charlie Brooker descends from one of the Thames Valley families, although not necessarily from mine.  Do they carry the Y hg L?

My great great grandfather Henry Brooker, did not appear to have any more sons, other than my great grandfather John Henry Brooker - who in turn, only had one son, my grandfather Reginald John Brooker.

I have one Y haplogroup first cousin.  He has I believe, a son, and a grandson.

Story of L. My Big Y Test Results

The above Photograph of the Sumela Monastery, Trabzon Mountains, former Pontus, by reibai of Flickr under Creative Commons Licence.  Close to the home of my nearest recorded Big Y cousins today.

The Big Y Test

The FTDNA results came back.  As with the Y111 test results, they were three weeks earlier than scheduled.  So what has this test told me about the story of my Y-DNA, and its exotic L-M20 genetic marker?  It was not a disappointment.

Warning

Remember, I am only telling the story of one single line of descent.  Y-DNA merely provides a convenient genetic marker of mutation, that can be compared and traced with others.  It does not define anyone.  From an anthropological perspective, haplogroups are of value in a collective sense - to a population.  I no doubt share the story of my Y with many more people alive today.  I may be a carrier of it, but it is also your story, just as the haplogroups that you carry, are also my story - through our mothers and shared descent.  Y-DNA passes strictly on only one line of descent - from father to son.  It is not inherited nor passed down by women.  Only on that one strict paternal line of descent. The Y haplogroup is only a convenient marker of one line.

The Y Haplogroup L

Y Haplogroup K formed in a paternal lineage of hunter-gatherer fathers and sons, that share a MRCA (most recent common ancestor) during the Upper Palaeolithic, circa 45,400 years ago.  Where did my Y ancestors live at that time?  We think that they lived in Western or Southern Asia.  Iran is a favourite proposal.  My earlier Y ancestors had most likely exited Africa 20,000 years earlier, and were well established in Asia.  They had most likely met and confronted another archaic human species, the Neanderthals.  This was however, a time of great expansion by humans.  The first anatomically modern humans had recently entered Europe, while other modern humans were arriving in Australia.  The Ice Age was in flux, but glaciation was advancing.

Our most recent common Y ancestor to carry Y Haplogroup LT lived circa 42,600 years ago.  Then a mutation in the Y-DNA led to the formation of Y Haplogroup L, with a most recent common ancestor 23,200 years ago, close to the time of the Last Glacial Maximum, when ice sheets were reaching their maximum positions.  K, LT, and early L, most likely all originated in Upper Palaeolithic hunter-gatherer populations living during the last Ice Age, in the area of modern day Iran and Iraq.  It was a time of increased stress on human populations, that were having to adapt to some severe environmental challenges, and may have at times faced isolation into a number of Ice Age Refuges.

Around 18,400 years ago, M317 appeared on their Y-DNA, then circa 14,000 years ago, my line (L-SK1414) diverged away from L-M349.  L1b today occurs mainly in Western Asia, from Anatolia to Afghanistan.  L1a occurs mainly in India, Sri Lanka, and in Pakistan.  Where did all of this occur?  We don't know yet.  There is so little data.  Some other divergences popped up in Southern and Central Asia.  Some of these sub clades in India and Pakistan are the most numerous of L today.  However, the finger keeps pointing at Western Asia as the source of much of L divergence, particularly in L1b sub clades such as M317 and M349.  But we don't yet know what part Europe played, if any.  Both M317 and M349 crop up at low frequencies across Europe, particularly along the south coast, and in Italy.  L2 (L595) crops up at low frequency almost exclusively in Europe.  Altogether, L forms only around 0.3% across Europe as a whole, yet this diversity sits at low frequencies scattered across the continent.

Iran may equally be a key.  We believe that it could have been home to L for a very long time, but we have very little data from that part of the world.  L is also missing from ancient DNA.  A hypothesis has been proposed that some early Neolithic farmers from Anatolia, may have carried L, and may have carried it into Europe for example.  All speculation, but it could explain some of these old divisions of L that we are starting to see across Europe and Western Asia.  Some of the earliest Eurasian L Y-DNA extracted so far has only very recently been reported - in populations of Iron Age Huns, that had migrated westwards into Europe.

My Big Y Results

So what did the test tell me about my line?  Was I descended from a recent immigrant from India or Pakistan?  An Iron Age Hun?  An Italian?  How about a Pontic Greek, or a Persian?  Where do I fit in?

The answers provided by the Big Y were a bit of a shock.  I had 90 novel SNPs in my Y-DNA, that have not been seen before in any other Big Y Test, not even in any of the other 23 Big Y test results within the FTDNA Y Haplogroup L project.  The last SNP to terminate, that has already been reported, was SK1414.  The administrator has not yet found its non-FTDNA origin, but believes that it came from a test in Iran.  Therefore, my sub clade can now be declared as L-SK1414.

My nearest FTDNA Big Y matches were two testers of Pontic Greek ancestry.  However, here is the crunch.  The project administrator calculates that even these testers, my closest known Y cousins that have so far tested to Big Y level, last shared a common Y ancestor with me 13,000 years ago.

When I have my BAM file, and submit it to the Yfull tree, it should make a significant alteration to the branches, as my SK1414 lineage appears to branch off from L1b perhaps only 1000 years after L1b appeared, and before the PH8 lineage associated with my Pontic Greek cousins formed.

L-SK1414 (L1b2c)

So my new terminal SNP SK1414 separated from the Pontic Greek PH8 lineage around 13,000 years ago.  What was happening in Western Asia then?  This was towards the end of the last Cold Stage.  There were some cold fluctuations in the Ice Age climate, with some advances in glaciation, before the ice finally started to melt back for the present interglacial period.  Perhaps some of these climatic stresses were involved?  A severe freeze took place around 12,700 years ago.

My most recent common ancestors with any other Big Y testers - the Pontic Greek samples - lived somewhere in Western Asia around 13,000 years ago.  They were most likely Western Asian ibex hunter-gatherers.  The earliest sign of agriculture in the region, the Pre Pottery Neolithic A, doesn't take off until around 10,300 years ago.

Where have my Y ancestors been over the past 13,000 years?  That is the big question that I am unlikely to answer within my lifetime.  More testing by more L testers in the future may reveal more, as would the results of more ancient DNA from excavations.  If I had to bank money on it, I'd say that my Y ancestors most likely originated in the Fertile Crescent of the Neolithic Revolution.  Perhaps in the river valleys of Iraq / Iran.  They may have gone on to take part in the Pre Pottery Neolithic A Culture there.  That might account for their existence over the next few thousand years.  However, when did my lineage enter Europe?  Did it arrive with Anatolian Early Neolithic farmers?  Or did it arrive later?  Perhaps even much later?  I just cannot answer that.  Suggestions are most welcome.


The above photograph is of the excavation of Jarmo, an Early Neolithic village in Iraqi Kurdistan, dated to 9,100 years before present.  From Wikimedia Commons by user Emrad284.

The STR testing, and the matching with the Chandler family, might suggest that my Y line arrived in Southern England quite recently, perhaps during the Medieval period.  However, I am acutely aware of how very few English have yet tested, and that more L could turn up that would rewrite that arrival date.
Unofficial proposed tree by Gökhan Zuzigo

Conclusion

It seems that I have 12,700 years of unwritten and undetected family history to research on my paternal line.  The Big Y test told me that I have a hunter-gatherer ancestor, somewhere in Western Asia, most likely Iraq / Iran, perhaps 13,000 years ago.  Then there is a rather long gap, until the Brooker surname appears on parish registers in the Thames Valley of Southern England, leading down to myself, and on to my son.

The Chandler family, judging by the comparative STR evidence, are Y cousins, with a shared Y ancestry until circa 330 - 700 years ago.

That's it.  We were missing for a long time.  I'm looking forward to trying to work out where my missing ancestors were for thousands of years.  I'm looking forward to seeing more L1b tests appear on Yfull and on the Y haplogroup L Project.  Please test.

The above photograph of rock art in Iran was taken by dynamosquito on Flickr, and is used here under a Creative Commons Licence.  The ibex seems to feature frequently in prehistoric rock art in the region, and perhaps was a primary prey of our ancestors.

Preserving our genetic heritage

The above portrait is of my great uncle Leonard Smith, with my grandmother, Doris Smith of Norwich.  Taken circa 1904.

I've ordered a genetic profiling kit to test my mother.  I want the results 1) for phasing with my own results, in order to better understand which parent the different segments on my chromosomes come from; 2) because I feel that my mother has a particularly rich, documented, and very localised Norfolk ancestry; and finally 3) because I feel almost duty bound to do so, while I can.  I've lost my father.  My mother will not always be here, and neither will I.  I won't always have the chance to do this.  By examining Mum's SNPs, I'll be able to find out which SNPs my late father gave me, at least wherever the two sets of results are not ambiguous.  I think that I've seen programs that try to rebuild the DNA of a missing parent, by combining the results of their children and / or other relatives.
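To show what I mean by working out my father's contribution from Mum's results, here is a rough sketch in Python.  It is only my own illustration of the logic, not how GEDmatch or any other phasing tool actually does it.  The function name paternal_allele is made up, and I am assuming that each genotype is a two letter string such as "AG", as in the raw data files, with "-" marking a no-call.  It only applies to the autosomes.

```python
def paternal_allele(child, mother):
    """Return the allele the child must have received from the father.

    Returns None when the site is a no-call, is ambiguous (child and
    mother are both heterozygous for the same alleles), or when the two
    genotypes are not consistent with mother-to-child inheritance.
    Hypothetical helper for illustration only; autosomal SNPs assumed.
    """
    # Skip no-calls and anything that is not a two-allele genotype.
    if any(len(g) != 2 or "-" in g for g in (child, mother)):
        return None

    first, second = child

    # Homozygous child: both parents gave the same allele,
    # provided the mother actually carries it.
    if first == second:
        return first if first in mother else None

    # Heterozygous child: find which of the two alleles the mother could
    # have supplied. Exactly one candidate means the other is paternal.
    from_mother = [allele for allele in child if allele in mother]
    if len(from_mother) != 1:
        return None  # ambiguous (two candidates) or inconsistent (none)
    return second if from_mother[0] == first else first


# A few made-up genotypes, not taken from my real files.
for child, mother in [("AG", "AA"), ("AG", "AG"), ("CC", "CT")]:
    print(child, mother, paternal_allele(child, mother))
```

Where Mum and I are both heterozygous for the same two alleles, her results alone cannot say which one came from my father, so the sketch simply leaves those sites unresolved.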

This has led me to ponder over the future.  Will we want to preserve the genetic scans of our parents and grandparents?  Will the desire to capture photographic images of our elders, then to preserve them long after they've gone, transform itself into a desire to preserve genetic profiles?  Will we value the raw data of their SNPs?  Will great granny's genome be handed down in the form of binary data from chip to chip?  Will families pride themselves on the ownership of SNP scan data from a great great grandparent?