Gedrosia and our DNA

Attribution: Fielding Lucas, Jr. [Public domain], via Wikimedia Commons

This post is partly an excuse to upload and store the above creative commons image.  My Y-DNA terminal SNP (L-SK1414) twin was a Balochi speaker in Makran, SW Pakistan.  In classical times, Makran was located in the Kingdom of Gedrosia.  It's almost ironic that a range of open source autosomal DNA calculators on GEDmatch has been named after Gedrosia.

Gedrosia was a dry, mountainous country along the northwestern shores of the Indian Ocean.  The indigenous name for Gedrosia is thought to have been Gwadar.  It was conquered by the Persian king Cyrus the Great (559-530 BC). The capital of Gedrosia was Pura, which may survive today as modern Bampûr.  In 326 BC, the Macedonian king Alexander the Great disastrously crossed the Gedrosian Desert on the return from his campaign in India, and lost 12,000 of his men to the savage conditions.

Image of Gwadar Bay by wetlandsofpakistan (Gwadar - West Bay) [CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons.

So, although the GedrosiaDNA GEDmatch heritage calculators may have little to do with the legendary land that may have been host to my Y ancestors, out of interest, how do our atDNA test results tally with the GedrosiaDNA calculators?  These calculators are designed to measure Ancient Eurasian Admixture.

My results (using an FT-DNA raw file) against those of my mother (23andMe raw file).

My Eurasia K9 ASI Oracle:

  • 39% Western Hunter-Gatherer
  • 27% Early Neolithic Farmer
  • 15% Eastern Hunter-Gatherer
  • 12% Caucasus Hunter-Gatherer
  • 7% SW Asian
  • 1% Siberian East Asian

Mother's Eurasia K9 ASI Oracle:

  • 40% Western Hunter-Gatherer
  • 26% Early Neolithic Farmer
  • 14% Eastern Hunter-Gatherer
  • 12% Caucasus Hunter-Gatherer
  • 6% SW Asian
  • 1% Siberian East Asian

My Gedrosia K3 Oracle:

  • 97.5% West Eurasian
  • 2.5% East Eurasian

Mother's Gedrosia K3 Oracle:

  • 96% West Eurasian
  • 4% East Eurasian

My Gedrosia K15 Oracle:

  • 40% Western Hunter-Gatherer
  • 25% Early European Farmer
  • 21% Caucasus
  • 5% Burusho
  • 5% SW Asian
  • 3% Balochi
  • 1% Siberian

Mother's Gedrosia K15 Oracle:

  • 40% Western Hunter-Gatherer
  • 24% Early European Farmer
  • 18% Caucasus
  • 4% Burusho
  • 3% Kalash
  • 2% Siberian
  • 1% Balochi

My Ancient Eurasia K6 Oracle:

  • 40% West European Hunter-Gatherer
  • 39% Natufian
  • 21% Ancient North Eurasian
  • 1% East Asian

Mother's Ancient Eurasia K6 Oracle:

  • 41% West European Hunter-Gatherer
  • 38% Natufian
  • 19% Ancient North Eurasian
  • 2% Ancient South Eurasian
  • 1% East Asian

Conclusions

We both appear to have inherited around 40% of our DNA from ancient Western (European) Hunter-Gatherer populations, nothing unexpected there.  Western Hunter-Gatherers not only lived in Europe, but appear to have contributed to some later Eurasian populations such as the Yamnaya and Early European Farmers.

We both have low counts of Ancient North Eurasian (ANE) - particularly my mother, who scores only 19%.  The ANE reference is based on Upper-Paleolithic genomes from the Lake Baikal region of Siberia, identified as Mal'ta, Afontova Gora 2, and Afontova Gora 3, dated to 17 to 24 kya.  It has been noted during online discussions that the English appear to have slightly lower percentages of ANE than do their close neighbours.  ANE is sometimes used to indicate Yamnaya ancestry (of which ANE was a component), which spread from the Eurasian Steppes into Western Europe during the Early Bronze Age.

I have 3% Balochi compared to 1% Balochi for my mother.  It may mean nothing, but it could just perhaps indicate something in the autosomes that associates on my paternal side with my Y-DNA story.  The indicators (particularly K15) suggest that I have more SW Asian ancestry, presumably from my paternal side - again, it just could associate with what we know about my Y haplogroup L-SK1414.
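As a rough illustration, the two Gedrosia K15 results above can be lined up component by component.  This is only a sketch: the percentages are the rounded figures listed in this post, any component that a list does not mention is assumed to be 0, and the function name is just something made up for the example.  Note that my mother's listed K15 figures add up to 92 rather than 100, so small differences should not be over-read.

```python
# Gedrosia K15 percentages as listed in this post (rounded values).
# Components missing from a list are treated as 0, which is an assumption.
mine = {
    "Western Hunter-Gatherer": 40,
    "Early European Farmer": 25,
    "Caucasus": 21,
    "Burusho": 5,
    "SW Asian": 5,
    "Balochi": 3,
    "Siberian": 1,
}
mother = {
    "Western Hunter-Gatherer": 40,
    "Early European Farmer": 24,
    "Caucasus": 18,
    "Burusho": 4,
    "Kalash": 3,
    "Siberian": 2,
    "Balochi": 1,
}


def component_differences(first, second):
    """Return first minus second, in percentage points, for every component."""
    names = sorted(set(first) | set(second))
    return {name: first.get(name, 0) - second.get(name, 0) for name in names}


differences = component_differences(mine, mother)

# Show the largest gaps first.
for name, delta in sorted(differences.items(), key=lambda item: -abs(item[1])):
    print(f"{name:25s} {delta:+d}")
```

A gap of a couple of percentage points between two different raw file formats may be no more than noise, so this is a way of tabulating the results rather than evidence in itself.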

We have around 24-25% Early European Farmer ancestry, representing early Neolithic descendants of an admix of WHG and "Basal Eurasians".  This signal apparently peaks around 80% in modern Sardinians.  However, the "Natufian" percentages are higher - 38-39%.  According to GEDmatch: "Natufian was an Epipaleolithic culture that existed from 12,500 to 9,500 BC in the area of Israel. They were derived about 50% from an original Out-of-Africa population, referred to as Basal Eurasians. If you are a European and show Natufian admixture, this does not imply that Natufians interacted with your ancestors. All it means is that Natufian like admixture was mediated to you via intermediaries, such as the early European Farmers from the Near East".  I'm not sure what to make of that.



Counting the SNPs - 23andMe V FT-DNA

Comparing 23andMe V4 kit raw file to FT-DNA raw file.

Both tests were taken by myself this year (2016).  I am here comparing the quality of two separate atDNA tests from the same person, by two different DNA for Ancestry companies.  As will be seen, the quality varies considerably, at least in terms of the number of SNPs that are tokenized once forwarded to GEDmatch.com.  This is NOT a test of how well both companies ascertain our DNA ancestry from these files.  Both use their own reference populations and analysis programs.  I've reviewed that elsewhere.  This test simply weighs how many SNPs are registered from the autosomes and X chromosome of one person.

Using the GEDmatch DNA file diagnostic utility, I received the following SNP counts:

Kit M551698 (23andMe V4)

Token File data:
Chr Token SNP Count
1 40974
2 42110
3 34199
4 31020
5 30421
6 36383
7 26352
8 27900
9 23644
10 27888
11 25363
12 25395
13 19880
14 15957
15 15529
16 16551
17 13745
18 16775
19 9006
20 13530
21 7324
22 7386
X 15359

Processed in batch 5355
Number of SNPs utilized by GEDmatch template = 523997
Number of regular SNPs = 517780
Heterozygosity index = 0.302721 (fraction of total SNPs that are heterozygous)
No-calls = 4911 = 0.93956084952678 percent.
Kit M551698 has approximately 19959 total matches with other kits. Of these matches there are 4982 >= 7cM and 14977 < 7cM.


Kit T444495 (FT-DNA file):

Chr Token SNP Count
1 57931
2 59602
3 47094
4 41772
5 39314
6 47546
7 36567
8 36753
9 30643
10 36889
11 35941
12 35850
13 26763
14 22650
15 20899
16 21935
17 18379
18 22586
19 12773
20 19587
21 10001
22 9750
X 19176

Processed in batch 5914
Number of SNPs utilized by GEDmatch template = 709242
Number of regular SNPs = 694324
Heterozygosity index = 0.281384 (fraction of total SNPs that are heterozygous)

No-calls = 16077 = 2.263088030563 percent.

Kit T444495 has approximately 48755 total matches with other kits. Of these matches there are 9351 >= 7cM and 39404 < 7cM.

Conclusion

If the quality of a raw atDNA file is merely down to the number of SNPs that are tested, then FT-DNA clearly wins hands down, when compared with the 23andMe file, following tokenization for GEDmatch use.  The FT-DNA file utilises 709,242 SNPs compared with 23andMe's 523,997 SNPs.
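For anyone who wants to check the arithmetic, here is a small sketch using only the figures in the two GEDmatch diagnostic reports above.  It works out the ratio of the two "SNPs utilized by GEDmatch template" totals and recomputes the no-call percentages.  One assumption to flag: the reported no-call percentages seem to be relative to the sum of the per-chromosome token counts rather than to the template total.  That is my inference from the numbers, not something stated in the GEDmatch output.

```python
# Per-chromosome token SNP counts (chromosomes 1-22, then X), copied from the
# GEDmatch diagnostic output quoted in this post.
TOKENS_23ANDME = [
    40974, 42110, 34199, 31020, 30421, 36383, 26352, 27900, 23644, 27888,
    25363, 25395, 19880, 15957, 15529, 16551, 13745, 16775, 9006, 13530,
    7324, 7386, 15359,
]
TOKENS_FTDNA = [
    57931, 59602, 47094, 41772, 39314, 47546, 36567, 36753, 30643, 36889,
    35941, 35850, 26763, 22650, 20899, 21935, 18379, 22586, 12773, 19587,
    10001, 9750, 19176,
]

# "SNPs utilized by GEDmatch template" and no-call counts, as reported.
TEMPLATE_SNPS = {"23andMe": 523_997, "FT-DNA": 709_242}
NO_CALLS = {"23andMe": 4_911, "FT-DNA": 16_077}


def no_call_percent(no_calls, token_counts):
    """No-calls as a percentage of the summed per-chromosome token counts.

    Using the token sum as the denominator is an assumption inferred from
    the reported percentages.
    """
    return 100 * no_calls / sum(token_counts)


snp_ratio = TEMPLATE_SNPS["FT-DNA"] / TEMPLATE_SNPS["23andMe"]
pct_23andme = no_call_percent(NO_CALLS["23andMe"], TOKENS_23ANDME)
pct_ftdna = no_call_percent(NO_CALLS["FT-DNA"], TOKENS_FTDNA)

print(f"FT-DNA / 23andMe template SNP ratio: {snp_ratio:.2f}")
print(f"23andMe no-call rate: {pct_23andme:.3f} percent")
print(f"FT-DNA no-call rate:  {pct_ftdna:.3f} percent")
```

So the FT-DNA file brings roughly a third more SNPs to GEDmatch, but it also carries a higher proportion of no-calls, which is worth keeping in mind when treating SNP count alone as "quality".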

I thought that it might be interesting to compare how these files, of the same person, might compare on the same GEDmatch heritage admixture program.

On Eurogenes K13 Oracle, my 23andMe kit gets the following top ten closest populations by genetic distance (GD):

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97
6 French 7.63
7 North_Dutch 7.76
8 Danish 7.95
9 North_German 8.17
10 Irish 8.22

On the same, using my FT-DNA kit (with many more SNPs tested, as demonstrated above):

1 Southeast_English 3.75
2 South_Dutch 4.03
3 West_German 5.42
4 Southwest_English 5.68
5 Orcadian 6.33
6 North_Dutch 7.15
7 Danish 7.36
8 Irish 7.59
9 West_Scottish 7.62
10 North_German 7.7
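The two top ten lists can be compared mechanically.  The sketch below simply records each population's rank in both lists and flags those that appear in only one.  The data are the two Eurogenes K13 Oracle lists above; the function and variable names are hypothetical, made up for this example.

```python
# Eurogenes K13 Oracle top ten (population, distance) for each kit,
# copied from the two lists in this post.
ORACLE_23ANDME = [
    ("South_Dutch", 3.89), ("Southeast_English", 4.35), ("West_German", 5.22),
    ("Southwest_English", 6.24), ("Orcadian", 6.97), ("French", 7.63),
    ("North_Dutch", 7.76), ("Danish", 7.95), ("North_German", 8.17),
    ("Irish", 8.22),
]
ORACLE_FTDNA = [
    ("Southeast_English", 3.75), ("South_Dutch", 4.03), ("West_German", 5.42),
    ("Southwest_English", 5.68), ("Orcadian", 6.33), ("North_Dutch", 7.15),
    ("Danish", 7.36), ("Irish", 7.59), ("West_Scottish", 7.62),
    ("North_German", 7.7),
]


def rank_changes(first, second):
    """Map each population to (rank in first, rank in second).

    A rank of None means the population is not in that top ten.
    """
    first_rank = {name: i for i, (name, _) in enumerate(first, start=1)}
    second_rank = {name: i for i, (name, _) in enumerate(second, start=1)}
    names = list(first_rank) + [n for n in second_rank if n not in first_rank]
    return {n: (first_rank.get(n), second_rank.get(n)) for n in names}


changes = rank_changes(ORACLE_23ANDME, ORACLE_FTDNA)
shared = [n for n, (a, b) in changes.items() if a is not None and b is not None]

for name, (rank_23andme, rank_ftdna) in changes.items():
    print(f"{name:20s} 23andMe: {rank_23andme}  FT-DNA: {rank_ftdna}")
print(f"Populations in both top tens: {len(shared)}")
```

Nine of the ten populations are shared, with the top two simply swapping places, so the two raw files tell much the same story on this calculator.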

Based on the numbers of SNPs tokenized, I will in future regard the FT-DNA (Family Tree DNA) file as superior in quality, over the 23andMe file, despite my disappointment in the FT-DNA My Origins ancestry analysis.

Y Haplotype L1b2c

By Hellerick (Own work) [CC BY-SA 4.0], via Wikimedia Commons.  Modified by Paul Brooker.

I've created this distribution map of known Y haplogroup L, L1b2c or L-SK1414. This is my Y-DNA haplotype.  Not a lot of dots there, are there?  This is how rare this clade is.  L1a and L1b most likely (in my opinion) originated during the last Ice Age circa 18,000 years ago, south of the Caucasus, and west of the Caspian Sea in Western Asia.  In other words, in the area of present day Armenia, Azerbaijan, and North-west Iran.  Again, I emphasise, that is just my opinion, looking at present-time evidence.

Y haplogroup L itself may have diverged between L1 and L2, not so much earlier, or so far away from this region.  Again, just my present opinion.

My sub clade of L1b is so rare that it is impossible to say, as can be seen from the map.  However, this is my blog, so I'm going to push out on this one.  My very best guess would be further to the East than its parent.  I suspect South East of the Caspian Sea, in what is now Eastern Iran.  I could well be wrong.  We have so few tests from nearby Afghanistan for example.  So far, the SNP SK1414 has only been reported twice.  1) in Makran, SW Pakistan, in a Balochi speaking man.  Balochi is an Iranian language, closely related to North-West Iranian languages.  Researchers suggest that the Balochi people of Makran largely migrated from south west of the Caspian.

The only other guy in the world so far confirmed is little old me, an Englishman.  I trace my surname (direct paternal) line back to the Thames Valley of Oxfordshire / Berkshire 270 years ago.  If my biological line follows that.  A number of STR testers of English descent appear connected to me by STR analysis.  They all descend from Thomas Chandler, who lived around the same time as my earliest recorded ancestor - only 32 miles away at Basingstoke.

From all of the evidence, I conclude that my Y ancestral line moved, probably in one generation, from Western Asia, perhaps from the edge of Persia, to Southern England conservatively between 2,000 and 400 years ago.  Although I would speculate between 1,600 and 600 years ago - during the Medieval or close by.

Bunwell, Norfolk - ancestral parish

I took a little local bicycle ride today, along the extremes of my recorded maternal line - that which should carry my mt-DNA.  23andMe tested it as H6a1.  WeGene and mthap analyser both suggest H6a1a8.  I'm looking forward to seeing what Living DNA make of it.  The haplogroup, based on current evidence, most likely originated on the Pontic and Caspian Steppes, before spreading into Western Europe during the Early Bronze Age.  However, on documentary record, I've traced it back to Generation 9 - my maternal G.G.G.G.G.G Grandmother, Susannah Briting (Brighten, Brighton) who married my ancestor John Hardyman (Hardiment, Hardimend, Hardiman), in the Norfolk parish of Bunwell in 1747.  According to the Bunwell Parish Registers, between 1748 and 1754, John and Susannah had four children baptised at the local church of Bunwell St Michael & All Angels: John, Martha, Elizabeth, and Thomas Hardyman.  My G.G.G.G.G Grandmother Elizabeth Hardyman went on to marry my ancestor Robert Page at nearby Wymondham in 1779, and continued my maternal line down to my Mother.

Bunwell, Norfolk, East Anglia

Bunwell is a parish of scattered settlement and hamlets, located above the Tas Valley, on the high boulder-clay soils of South Norfolk.  These heavy soils encouraged a pattern of dispersed settlement during the Late Medieval, with occupation often taking place along the edges of common land.  This could suggest limited manorial control.

I took this photo of the local landscape.  Large medieval open fields were divided into smaller enclosed fields during the 17th to 19th centuries.  These small parceled and enclosed fields were then opened up again into larger fields, with the removal of many hedgerows, during the 20th century.  Main land uses today are arable agriculture - modern crops include sugar beet, wheat, oil seed rape, etc.

Counting the number of plant species within a designated length of hedgerow has been used as a dating process, on the basis that the number of species increases across the centuries.
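The best known version of this idea is often called Hooper's rule, usually summarised as roughly one woody species per century of age in a 30 yard stretch of hedge.  The sketch below uses that simple rule of thumb as an assumption; it ignores the refinements and the many local exceptions (planted mixed hedges, soil, management), so treat any answer as a very rough guide only.  The function name and the example count are hypothetical.

```python
def estimate_hedge_age(species_count, years_per_species=100):
    """Rough hedge age in years from the woody species in a 30 yard stretch.

    Assumes the simplified rule of thumb of about one species per century.
    """
    if species_count < 0:
        raise ValueError("species_count cannot be negative")
    return species_count * years_per_species


# Hypothetical example: five woody species counted in one 30 yard stretch.
example_age = estimate_hedge_age(5)
print(f"Estimated age: about {example_age} years")
```

In practice several stretches would be counted and averaged, and the result checked against maps and documents.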

Vernacular tradition includes many classic South Norfolk farmhouses, of which the following is a striking example:

The owner has been renovating it for many years, having inherited it from his father.  Aside from the chimney (which was replaced following a lightning strike), the newest sections of the house date to the 1740's.  The oldest have not been dated, but perhaps extend to the Late Medieval.

The Church building is of the Perpendicular tradition, and dates to circa the 1450's, although it was most likely built on the site of an earlier church.  It is dedicated to St Michael and All Angels.

It is very much still an active church.  A knitting club were busy at work on my visit.  The church warden also called in.  The locals told me that there were, or still are, Hardiments and Britings living in the parish.  I had a look around the surrounding graves.

This headstone is a good example of 18th century headstones in East Anglia.  Sadly, not one of my known ancestors.  Note the extra information: "Late of Starston".

I didn't spot any Britings, but I did find a cluster of Hardyman graves, including this example:

Not one of my direct ancestors, but most likely, a cousin.

My mtDNA ancestor Elizabeth moved away from Bunwell, marrying nearby at Wymondham. From there, my mtDNA line moved through other nearby villages, including Besthorpe.  Another generation on, it made an unusual leap (my Norfolk ancestors rarely moved far) to the opposite side of Norwich, to the parish of Rackheath in Broadland.  It then moved further East, to the parishes of Tunstall and Reedham.  On to Hassingham, and my mother carried it back west to the Norwich area. We've ironically both carried it back to the Wymondham area.


Living DNA Kit

Well, the kit arrived in good time!  Living DNA only just announced yesterday afternoon, on their Facebook Page, that the first batch of kits were ready for despatch - and it was here in Norfolk this morning!

Unfortunately, I can't activate it until the service goes live on Friday midday.  So I've got to wait a few days.  Ugh.  I also notice that the return address for the completed sample, is back to their base in Frome, Somerset.  So I'm guessing that they'll pile them up until they ship them to the lab in Denmark.  Must be patient!

A new test - LivingDNA test for Ancestry

You might think, following my recent posts, that I've lost all faith in DNA testing for Ancestry.  Not at all.  I just object when people take the analysis results of autosomal DNA tests for ancestry as infallible truths.  They are clearly not.

So far this year, I have commissioned two 23andMe tests, and three FT-DNA tests, a FullGenomes analysis, and a YFull analysis.  I have also used free analysis at WeGene and DNA.land, and have run three raw files on GEDmatch calculators.  You might also think that I've done enough testing for one year!  I thought that as well.  Then a new service just entered the market.

Living DNA Ancestry attracted my commission on two particular points.  1) it has an incredible British reference, that promises to break ancestry composition into 30 British regions - in addition to global analysis.  If it works, then this is a must for people with significant British ancestry.  2) it uses the latest cutting edge test chip: the latest Illumina chip, based on the Global Screening Array (GSA).  In addition, it uses a European based lab (Denmark), and it tests Y-DNA, mtDNA, and autosomes.  It tests more SNPs on all three counts than other current chips used by competitors offering autosomal plus tests.  Raw files for the test results will be available for download.

The British Reference

Living DNA will be using a British reference broken down into an incredible 30 regions, across England, Scotland, Wales, Orkney, and Northern Ireland.  The reference uses the much heralded POBI (People of the British Isles, 2015) data set.  This project collected 4,500 blood samples from people that could claim four grandparents in the same area, from across the regions of Britain.

A little about the POBI project below:

The British reference does not include the Republic of Ireland.  However, LivingDNA are confident that they have collected a good global reference, and I understand, that they are seeking a similar quality Irish data-set for the future.  

In comparison, other providers of DNA tests for ancestry, only reference to Britain, or the British Isles & Ireland, as a single reference point.  And as can be seen by my previous posts, with limited success.

They also hope to provide imports for formats of raw file from other test companies in the future.  LivingDNA do not themselves currently offer relative matching, or health information.  Their service is for now, primarily for ancestry.

The Chip

They will be using a custom version of the latest Illumina chip technology, the Global Screening Array (GSA).  It is encoded with:

650,000 autosomal DNA SNPs

20,000 Y-DNA SNPs

4,000 MT-DNA SNPs.

In comparison for example, the 23andMe V4 chip scans for:

577,000 atDNA SNPs

2,329 Y-DNA SNPs

3,100 MT-DNA SNPs.
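To put those two sets of figures side by side, here is a quick sketch that works out how many times more SNPs the Living DNA chip is said to carry in each category.  It uses only the numbers quoted above, which are advertised chip contents rather than anything I have measured, and the variable names are just for illustration.

```python
# SNP counts per category as quoted in this post.
LIVING_DNA = {"atDNA": 650_000, "Y-DNA": 20_000, "mtDNA": 4_000}
TWENTYTHREE_V4 = {"atDNA": 577_000, "Y-DNA": 2_329, "mtDNA": 3_100}

# How many times more SNPs the Living DNA chip lists in each category.
ratios = {
    category: LIVING_DNA[category] / TWENTYTHREE_V4[category]
    for category in LIVING_DNA
}

for category, ratio in ratios.items():
    print(f"{category}: {ratio:.2f}x")
```

The autosomal difference is modest; the big jump is in Y-DNA coverage.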

I hope that LivingDNA will also use up-to-date haplogroup nomenclature and information.  23andMe with their V4 chip still use very dated 2009 nomenclature.

So, let's see if this new service is any improvement to my results, compared with the hit and miss of 23andMe, and Family Tree DNA (FT-DNA).  Will they be able to identify and locate my English roots successfully?  What will the improved chip make of my haplogroups?

The Southern European DNA enigma. Option 3. Autosomal DNA Analysis does not work

Here I'm considering the third option to my enigma.  My known ancestry is 100% English.  However, autosomal DNA tests for Ancestry, by commercial companies, and by third party analysis, suggest that I have a mixture of European ancestries, including varying percentages of Southern European.  I'm trying to best explain this phenomenon.  In previous posts, I considered 1) that my paper record is incomplete, or biologically incorrect.  2) that something ancient is picked up in analysis of present day English testers, which may reflect ancient admixture, perhaps prehistoric, or Roman.

Now in this post, I consider the third option.  That commercial DNA companies exaggerate their claims to be able to differentiate to any successful degree, between different regions of Europe in my ancestry.  If this is indeed the case, it has significant repercussions for testers for example, in the USA, Canada, Australia, etc.  If they have a poor paper trail, and poorly known ancestry, maybe it's all too easy for them to regard such DNA tests for ancestry, as indisputable and accurate truths.

Commercial DNA companies for Ancestry are under pressure to supply to market demands.  Their markets have been dominated particularly by USA customers.  Some of them are seasoned genealogists with good quality paper trails.  Others are attracted by the easy option of knowing their ancestry from before what 23andMe calls the Age of Migration, before the past few centuries.  Instead of spending a lifetime chasing documents, they can simply send a DNA sample to a company, and know their roots.  People trust the science of DNA testing for ancestry.  That is the demand that commercial companies can cater for.

But what if their abilities to accurately detect ancestry from Autosomal DNA is exaggerated?

Lack of agreement between analysis.

As one piece of evidence: test autosomal DNA with three different companies, and you will receive three different results.  That is well known in genetic genealogy circles.  Some apologists excuse it away by pointing to the different companies' claims to be focusing on different periods.  23andMe say that they zoom in on 500 years ago, by rejecting short chains.  Is it really, really possible yet, to be able to zoom in on one particular period?  I'm not convinced.  Is it even possible to securely locate all ancestry from the past 500 years?  I'd expect genetic recombination to wash away an awful lot of ancestral DNA long before that.  The truth is that beyond our great great grandparents' generation, there is less and less chance of us carrying any surviving DNA from any one particular ancestor! Especially from the autosomal DNA passed down on your father's side.  You might have a Balkan g.g.g.g grandfather, but chances are, there will be no evidence of his existence remaining in your autosomes.  His DNA, and all that belonged to his Balkan ancestry, will be lucky to survive the following 250 years, never mind 500 years.  My Y-DNA has strong evidence that I had an Asian ancestor on my paternal line, who arrived in Southern England between 1,800 and 500 years ago.  However, nothing remains in my autosomal DNA analysis that suggests Asia.  Washed away.
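The dilution is easy to put into numbers.  On average, each generation halves the share of autosomal DNA inherited from any one ancestor, so the expected share from an ancestor n generations back is 1 in 2 to the power n.  The sketch below assumes that simple halving.  It gives an average only: real inheritance comes in chunks shuffled by recombination, so the actual share from a distant ancestor can be higher, lower, or nothing at all.  The function names are made up for the example.

```python
from fractions import Fraction


def expected_autosomal_share(generations_back):
    """Average share of autosomal DNA from one ancestor n generations back.

    Assumes a simple halving per generation; the real share varies.
    """
    if generations_back < 0:
        raise ValueError("generations_back cannot be negative")
    return Fraction(1, 2 ** generations_back)


def ancestors_in_generation(generations_back):
    """Number of ancestor slots in that generation (ignoring pedigree collapse)."""
    return 2 ** generations_back


# Parents are 1 generation back, so a g.g.g.g grandfather is 6 back.
for generations in (1, 2, 4, 6, 10):
    share = expected_autosomal_share(generations)
    slots = ancestors_in_generation(generations)
    print(f"{generations:2d} generations back: {slots:5d} ancestors, "
          f"expected share each {float(share):.4%}")
```

A g.g.g.g grandfather sits six generations back, one of 64 ancestors in that generation, with an expected share of about 1.6%, which is already within the noise of most ancestry calculators.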

Getting back to those three companies giving three different ancestries. My South European percentages have varied from 2% (with a hint at Iberia), to 19% (with a hint at Balkans), to FT-DNA's claim of 32%!  Eurogenes K13 hints at Iberia in its admixture programs on GEDmatch.

Population References

One more thing.  Autosomal DNA tests for ancestry do not use ancient DNA references.  Not yet anyway.  They instead use present-day references, often from their own customer bases, based on what ancestry they claim.  This is not necessarily the DNA that existed in past populations.  Populations and genes shuffle, and genetic drift occurs.  I recently read a report that FT-DNA Y data for NW Europe heavily biases to Irish ancestry.  Therefore, references from Americans of Irish and / or British descent will bias to the West.  The quality of a reference is critical.

Is it all Bunk?

Am I saying that autosomal DNA testing for Ancestry is all a waste of time?  Actually no, not yet.  The tests DO find me to be pretty much 100% European.  That is a success.  Some tests even find me, with a degree of confidence, to be NW European.  That is awesome.  However, beyond such a regional level, should we be trusting such tests to be providing concrete results, infallible "truths" with a high degree of accuracy?  Shouldn't we be cautious, and regard such speculations as just that - speculations, to be assessed by other forms of evidence?  Some of my ancestors might have lived in Southern Europe.  Maybe Option 1 was correct - one of my Norfolk ancestors brought a Portuguese wife home from the Peninsular War.  Perhaps.  Maybe Option 2 was correct - the patterns that DNA companies pick up as Southern European are ancient, related to Neolithic, Iron Age, or Roman admixture from the South, or sharing ancient ancestry with Southern Europeans.  Maybe.

I'm not at all disenchanted with DNA testing for ancestry though.  I've commissioned five so far this year, including three autosomal DNA tests.  This leads me to my most recent commission.  Perhaps this one will convince me more.  It's a very new test.  I'll post on that next.