Counting the SNPs - 23andMe V FT-DNA

Comparing 23andMe V4 kit raw file to FT-DNA raw file.

Both tests were taken by myself this year (2016).  I am here comparing the quality of two separate atDNA tests of the same person, by two different DNA-for-ancestry companies.  As will be seen, the quality varies considerably, at least in terms of the number of SNPs that are tokenized once the raw files are uploaded to GEDmatch.com.  This is NOT a test of how well the two companies ascertain our DNA ancestry from these files.  Both use their own reference populations and analysis programs.  I've reviewed that elsewhere.  This test simply weighs how many SNPs are registered from the autosomes and X chromosome of one person.

Using the GEDmatch DNA file diagnostic utility, I received the following SNP counts:

Kit M551698 (23andMe V4)

Token File data:
Chr Token SNP Count
1 40974
2 42110
3 34199
4 31020
5 30421
6 36383
7 26352
8 27900
9 23644
10 27888
11 25363
12 25395
13 19880
14 15957
15 15529
16 16551
17 13745
18 16775
19 9006
20 13530
21 7324
22 7386
X 15359

Processed in batch 5355
Number of SNPs utilized by GEDmatch template = 523997
Number of regular SNPs = 517780
Heterozygosity index = 0.302721 (fraction of total SNPs that are heterozygous)
No-calls = 4911 = 0.93956084952678 percent.
Kit M551698 has approximately 19959 total matches with other kits. Of these matches there are 4982 >= 7cM and 14977 < 7cM.


Kit T444495 (FT-DNA file):

Chr Token SNP Count
1 57931
2 59602
3 47094
4 41772
5 39314
6 47546
7 36567
8 36753
9 30643
10 36889
11 35941
12 35850
13 26763
14 22650
15 20899
16 21935
17 18379
18 22586
19 12773
20 19587
21 10001
22 9750
X 19176

Processed in batch 5914
Number of SNPs utilized by GEDmatch template = 709242
Number of regular SNPs = 694324
Heterozygosity index = 0.281384 (fraction of total SNPs that are heterozygous)

No-calls = 16077 = 2.263088030563 percent.

Kit T444495 has approximately 48755 total matches with other kits. Of these matches there are 9351 >= 7cM and 39404 < 7cM.
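
As a cross-check on the two diagnostic reports above, the per-chromosome token counts can be totalled and the no-call percentages recomputed.  The short Python sketch below is only an illustration, not a GEDmatch tool.  It assumes that the reported no-call percentage is simply the no-calls divided by the sum of the per-chromosome token counts; that assumption does reproduce the percentages quoted above.  The variable and function names are made up for the example.

```python
# Per-chromosome token SNP counts copied from the two GEDmatch reports above
# (chromosomes 1 to 22, then X).
TOKENS_23ANDME = [
    40974, 42110, 34199, 31020, 30421, 36383, 26352, 27900, 23644, 27888,
    25363, 25395, 19880, 15957, 15529, 16551, 13745, 16775, 9006, 13530,
    7324, 7386, 15359,
]
TOKENS_FTDNA = [
    57931, 59602, 47094, 41772, 39314, 47546, 36567, 36753, 30643, 36889,
    35941, 35850, 26763, 22650, 20899, 21935, 18379, 22586, 12773, 19587,
    10001, 9750, 19176,
]


def no_call_percent(no_calls, token_counts):
    """Assumed formula: no-calls as a percentage of all tokenized SNPs."""
    return 100.0 * no_calls / sum(token_counts)


total_23andme = sum(TOKENS_23ANDME)
total_ftdna = sum(TOKENS_FTDNA)
pct_23andme = no_call_percent(4911, TOKENS_23ANDME)
pct_ftdna = no_call_percent(16077, TOKENS_FTDNA)

# Ratio of the "SNPs utilized by GEDmatch template" figures from the reports.
template_ratio = 709242 / 523997

print("23andMe token total:", total_23andme, "no-calls %:", round(pct_23andme, 4))
print("FT-DNA token total:", total_ftdna, "no-calls %:", round(pct_ftdna, 4))
print("FT-DNA / 23andMe template SNPs:", round(template_ratio, 2))
```

Note that the token totals are close to, but not the same as, the "SNPs utilized by GEDmatch template" figures, so the two should not be treated as interchangeable.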

Conclusion

If the quality of a raw atDNA file is merely down to the number of SNPs that are tested, then FT-DNA clearly wins hands down when compared with the 23andMe file, following tokenization for GEDmatch use.  The FT-DNA file utilises 709,242 SNPs compared with 23andMe's 523,997 SNPs, roughly 35% more.

I thought that it might be interesting to see how these two files, from the same person, compare on the same GEDmatch heritage admixture program.

On Eurogenes K13 Oracle, my 23andMe kit gets these as the top ten closest GDs (genetic distances):

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97
6 French 7.63
7 North_Dutch 7.76
8 Danish 7.95
9 North_German 8.17
10 Irish 8.22

On the same, using my FT-DNA kit (with many more SNPs tested, as demonstrated above):

1 Southeast_English 3.75
2 South_Dutch 4.03
3 West_German 5.42
4 Southwest_English 5.68
5 Orcadian 6.33
6 North_Dutch 7.15
7 Danish 7.36
8 Irish 7.59
9 West_Scottish 7.62
10 North_German 7.7
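
To see how far the two kits really disagree, the two top-ten lists can be compared population by population.  The Python sketch below is just an illustration using the distances listed above; the dictionary and variable names are invented for the example.  It shows that nine of the ten populations appear in both lists, and how much each shared distance shifts between the kits.

```python
# Eurogenes K13 Oracle top-ten distances copied from the two lists above.
K13_23ANDME = {
    "South_Dutch": 3.89, "Southeast_English": 4.35, "West_German": 5.22,
    "Southwest_English": 6.24, "Orcadian": 6.97, "French": 7.63,
    "North_Dutch": 7.76, "Danish": 7.95, "North_German": 8.17, "Irish": 8.22,
}
K13_FTDNA = {
    "Southeast_English": 3.75, "South_Dutch": 4.03, "West_German": 5.42,
    "Southwest_English": 5.68, "Orcadian": 6.33, "North_Dutch": 7.15,
    "Danish": 7.36, "Irish": 7.59, "West_Scottish": 7.62, "North_German": 7.7,
}

# Populations that appear in both top tens, and those unique to one kit.
shared = sorted(set(K13_23ANDME) & set(K13_FTDNA))
only_23andme = sorted(set(K13_23ANDME) - set(K13_FTDNA))
only_ftdna = sorted(set(K13_FTDNA) - set(K13_23ANDME))

# Change in distance (FT-DNA minus 23andMe) for each shared population.
shifts = {pop: round(K13_FTDNA[pop] - K13_23ANDME[pop], 2) for pop in shared}

print("Shared:", len(shared))
print("Only in 23andMe list:", only_23andme)
print("Only in FT-DNA list:", only_ftdna)
for pop, shift in sorted(shifts.items(), key=lambda item: item[1]):
    print(f"{pop:20s} {shift:+.2f}")
```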

Based on the numbers of SNPs tokenized, I will in future regard the FT-DNA (Family Tree DNA) file as superior in quality, over the 23andMe file, despite my disappointment in the FT-DNA My Origins ancestry analysis.

Autosomal DNA Tests for Genealogy

First a disclaimer.  I'm very new to the whole world of genetic genealogy.  I'm not new, however, to traditional genealogy, and I do have a pretty good amateur understanding of the relevant archaeological and anthropological discussions over the past fifty years.  The following is not meant as a critique of genetic genealogy, so much as a review of my experience of ancestry composition based on autosomal DNA analysis.

Let's start with my paper trail.

Traditional Genealogy

I am English by ethnicity, British by nationality, and a subject of Queen Elizabeth II (often now referred to as a UK Citizen).

My paper recorded ancestry consists of the genealogical records of:

  • Generation 1 has 1 individual. (100.00%)
  • Generation 2 has 2 individuals. (100.00%)
  • Generation 3 has 4 individuals. (100.00%)
  • Generation 4 has 8 individuals. (100.00%)
  • Generation 5 has 16 individuals. (100.00%)
  • Generation 6 has 29 individuals. (90.62%)
  • Generation 7 has 49 individuals. (76.56%)
  • Generation 8 has 35 individuals. (27.34%)
  • Generation 9 has 24 individuals. (10.16%)
  • Generation 10 has 10 individuals. (2.34%)
  • Generation 11 has 4 individuals. (0.39%)
  • Total ancestors in generations 2 to 11 is 181. (9.04%)
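
The percentages in this list are a measure of how complete each generation is: the number of ancestors found, against the number that must have existed (two parents, four grandparents, and so on, doubling each time).  A minimal Python sketch of that calculation follows, assuming the simple formula "found divided by 2 to the power of (generation minus 1)".  The function names are made up for the example.  This formula reproduces the listed percentages for generations 1 to 8 and 11; for generations 9 and 10 and for the total it gives slightly lower figures than those listed (for example 24 out of 256 is about 9.4%), so the software that produced the list presumably counts something a little differently.

```python
# Recorded ancestors per generation, copied from the list above.
# Generation 1 is the person themselves.
recorded = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16, 6: 29, 7: 49, 8: 35, 9: 24, 10: 10, 11: 4}


def expected_ancestors(generation):
    """Biological ancestors that must exist in a generation (ignoring pedigree collapse)."""
    return 2 ** (generation - 1)


def completeness(generation, found):
    """Percentage of the expected ancestors that have actually been found."""
    return 100.0 * found / expected_ancestors(generation)


for generation, found in recorded.items():
    print(f"Generation {generation}: {found} of {expected_ancestors(generation)} "
          f"({completeness(generation, found):.2f}%)")

# Totals for generations 2 to 11 only, as in the last line of the list.
total_found = sum(found for g, found in recorded.items() if g >= 2)
total_expected = sum(expected_ancestors(g) for g in recorded if g >= 2)
print(f"Total: {total_found} of {total_expected}")
```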

All 181 ancestors, reaching back to the 1690s, appear to be English born, of English ethnicity, with English surnames.  The majority of them (100% on my mother's side, and 81% on my father's side) were East Anglian, with the vast majority of that percentage being born in the county of Norfolk.  Religions recorded or indicated were CofE Anglican or non-conformist Christian.  No sign of any Catholicism, Islam, or Judaism.

Therefore it would look pretty likely, that I can claim English heritage, wouldn't you agree?

Genetic Genealogy and Ancestry Prediction

There are three aspects or avenues of inquiry, available for genetic genealogy.  First of all, the two sex haplogroups; the y-DNA, and the mt-DNA. These two "signals" are referred to as haplogroups.

  1. The y-DNA.  This follows the Y chromosome.  It is only carried by men.  It is passed along the paternal line, and only by that line, from grandfather, down to father, down to son, until the line is broken.  What a lot of people misunderstand is that it does not represent 50% of your ancestry.  It does not even represent all of your biological father's ancestry.  For example, his mother's father, and her brothers, although on your father's side, would most likely carry a different y-DNA haplogroup.  It only comes down an uninterrupted, strictly paternal line.  Even at Generation 7 (g.g.g.g grandparents) above, it would have been carried by just one out of my sixty four biological ancestors at that generation.  The other thirty one g.g.g.g grandfathers for that generation may have carried different Y haplogroups.
  2. The mt-DNA.  Although a very different type of DNA, this one works as the opposite sex haplogroup.  It is a signal that is passed down the strictly maternal line, from grandmother, to mother, to her children.  Yes, we men do inherit our mother's mt-DNA, but we can't pass it down.  Only our sisters can.
  3. The au-DNA, better known as Autosomal DNA.  Whereas the former two sex haplogroups are handy, because we can measure their mutations, and track their formation and movement across thousands of years, au-DNA really is the stuff that we are made of - all of the SNPs on our chromosomes that personalise us within the human genome.  We inherit our au-DNA from all of our recent ancestors.  Roughly 50% from our biological mother, and 50% from our biological father.  Equally, we could say on average, 25% from each grandparent, or 12.5% from each great grandparent.  However, it is messy.  At every reproduction (meiosis), it gets messed up by recombination.  Not only that, but go back much more than six generations, and it becomes more and more likely that you can lose entire lineages.  You can have no surviving trace of any DNA from for example, a particular g.g.g.g.g grandparent.
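
The arithmetic behind those averages is simple halving.  The Python sketch below is only an illustration, using the same generation numbering as the paper-trail list earlier (Generation 1 is yourself, Generation 2 your parents); the function names are invented for the example.  The figures are expected averages only: because of recombination, the real share from any one distant ancestor can be higher, lower, or nothing at all.

```python
def ancestors_in_generation(generation):
    """Generation 1 is yourself, 2 your parents, 3 your grandparents, and so on."""
    return 2 ** (generation - 1)


def expected_autosomal_share(generation):
    """Average expected share, in percent, from ONE ancestor in that generation."""
    return 100.0 / ancestors_in_generation(generation)


for generation in range(2, 9):
    print(f"Generation {generation}: {ancestors_in_generation(generation)} ancestors, "
          f"about {expected_autosomal_share(generation):.4g}% each on average")
```

By contrast, the y-DNA and mt-DNA each come from exactly one of those ancestors in every generation, however many ancestors that generation holds.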

Autosomal DNA is what makes us individuals, gives us our hereditary traits.  It is passed down from many ancestors, via our parents.  However, the sex haplogroups are of interest because they can be traced across the globe, and the millennia.  As we gain more and more data - both from living populations, and ancient DNA from archaeological finds, so we will be able to track the STR and SNP mutation data more precisely.

However, what about poor old messed up autosomal DNA?  It represents our entire biological heritage over many generations. It is what we are. However, making sense of it is less easy, less precise.  Genetic genealogists are making progress, but it is far less of a precise science than either of the haplogroups.  They use calculators that measure the segments of DNA across the chromosomes, looking for patterns that they recognise from a number of known reference populations.  From that, these calculators predict an ancestry.  Exactly what and when that ancestry refers to does seem to vary from one calculator to another.  There is an argument that the precision can be improved if you also test close known relatives, including at least one parent.  The results can then be phased.  I'm actually waiting for the results for my mother, so that I can see my own au-DNA ancestry results phased and corrected.

So let's have a bit of fun, and see what some of the calculators suggest for my autosomal DNA, at least before any phasing with my mother's DNA.  What do they make of my 100% English paper ancestry?

23andMe.com Ancestry Composition Standard Mode

99.9% European.

Broken into:

83% NW European

17% Broadly (unassigned) European

I think that's pretty cool.  As I'm getting to know au-DNA predictions, I'm learning to appreciate it when they get the right continent, and the right corner of that continent.  That is more than they could do a decade or two ago.  The prediction is correct, I am a NW European.  I'm not a West African, a South Asian, or an East Siberian.

23andMe.com Ancestry Composition Speculative Mode

100% European

Broken into:

94% NW European

3% S European

3% Broadly (unassigned) European.

Whoa, where did that South European come from?  It could just be a stray incorrectly identified signal, or it could be telling me that one of my ancestors, maybe around Generation 6, was from down south!  Let's break down the prediction further.  First, the NW European:

32% British & Irish

27% French & German

7% Scandinavian

But surely I should be 100% British & Irish?  Not only 32%.  I have my own ideas about this.  I think that although 23andMe claims that Ancestry Composition only represents the ancestry of the past 300 to 500 years (the so-called migration period, as sold to USA customers), that it gets confused by earlier migrations across their reference populations, including those during the early medieval period, and perhaps even some of those during late prehistory.  I've noticed that across Ireland and Britain, the further to the east, the more diluted the 23andMe British & Irish assignment.  People of solid Irish ancestry get between 85% and 98% British & Irish.  My East Anglian results, mixed between British & Irish, French & German, and Scandinavian, are actually rather more like those received by Dutch customers of 23andMe.

As for that Southern European prediction, how does that break down?

0.5% Iberian

2.4% Broadly (unassigned) South European.

Which if taken seriously, might suggest that I have an unknown Spanish or Portuguese ancestor around Generation 6.  If I did take it seriously that is.  I wonder what my mother's test will reveal?

DNA.Land.com Ancestry Composition

This is a third party site that you can upload your 23andMe V4 raw data to, to see what their calculators predict for your ancestry.  Its ancestry composition has recently been revised.  What did that make of my 100% English au-DNA?

West Eurasian 100%.

I like that designation, the amateur anthropologist in me prefers that broad designation over "European".  Broken down:

77% North/Central European

19% South European

2.4% Finnish

1.3% unassigned.

What?  Why not 100% North/Central European?  Finnish?  Did some early medieval Scandinavian settlers of East Anglia bring it?  Or is it a false signal?  Misidentified au-DNA?

That darned South European kicked in again.  I'm here looking at a biological cuckoo NPE (non-parental event) at around Generation 5 or even more recent!  Did a great grandmother secretly have a South European lover?  But this South European breaks down further:

13% Balkan

6% Italian.

Oh my goodness, whereas 23andMe speculative mode suggested SW Europe - this one suggests SE Europe!  Do I have a secret Albanian great grandfather?  Or is it all nonsense?

WeGene.com

This is a cracking new third party DNA analyser.  It is based in China, and its predictors appear to calculate mainly for a Chinese market.  It not only predicts your ancestry composition, but also your two sex haplogroups, and lots of traits and health predictions to complement those of 23andMe.  It even tries to predict your genetic disposition to sexuality!

It will allow you to send your 23andMe V4 raw data direct to its own calculators.  However, at the moment the website is almost entirely in Chinese (Mandarin?).  There are two options.  1) At the bottom of the webpages is a hyperlink to English, which gives, in English, a basic ancestry composition, and your haplogroups.  It does not include English versions of the health and trait results.  2) Use an online translator, such as the one built into the Google Chrome browser.  It actually serves pretty well.

On sex haplogroups they give my Y-DNA as

L1.  Not bad, but they didn't make it to L1b or L-M317.

My mtDNA?

H6a1a8.  Very good.  Better than 23andMe's H6a1, and the same as the mthap program.

But this is about au-DNA, how did they do, what did they make of my 100% English ancestry?

81% French

19% English/Briton

Now, that sounds pretty awful, but on closer inspection, I'm impressed.  No South European great grandfather.  Okay, so most of my DNA has been placed on the wrong side of the Channel.  However, I know that French and English DNA are actually very close.  Recent surveys even suggest that the English share a lot of common ancestry with the French, from unknown migrations late in prehistory.  So again - they very much got the right corner of the right Continent.  Well done WeGene.

GEDmatch.com Eurogenes K13

GEDmatch is a website that you can upload raw data not only from 23andMe, but from a range of testers, and from V3 chips as well as V4.  It hosts a number of tools and predictors - some Open Source.  Some of these predictors are for Admixture or ancestry composition.  They measure your ancestry in terms of distance from known reference populations.  The lower the number, the closer you are to their reference.  They use calculators known as oracles to predict ancestry, including mixed ancestry or admixture.
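
To give a feel for what "distance from a reference population" can mean, here is a minimal Python sketch.  It assumes that the distance is a plain Euclidean distance between a person's admixture percentages and a population's average percentages, and that a mixed result is scored against a weighted blend of reference averages; that is an assumption for illustration, not a description of GEDmatch's actual code.  The populations and every number in it are hypothetical, invented purely for the example, and use only three components rather than K13's thirteen.

```python
import math

# HYPOTHETICAL component percentages, invented purely for illustration.
# These are NOT real Eurogenes values.
references = {
    "Pop_A": [50.0, 30.0, 20.0],
    "Pop_B": [40.0, 40.0, 20.0],
    "Pop_C": [20.0, 30.0, 50.0],
}
sample = [46.0, 34.0, 20.0]  # hypothetical test taker


def distance(a, b):
    """Euclidean distance between two lists of component percentages."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def blend(weighted_pops):
    """Weighted average of reference populations, e.g. [(0.5, "Pop_A"), (0.5, "Pop_B")]."""
    size = len(sample)
    return [sum(w * references[name][i] for w, name in weighted_pops) for i in range(size)]


# Single-population ranking: the smaller the distance, the closer the fit.
ranked = sorted(references, key=lambda name: distance(sample, references[name]))
for name in ranked:
    print(name, round(distance(sample, references[name]), 2))

# A two-way mix can sit closer to the sample than any single population does.
mix = blend([(0.5, "Pop_A"), (0.5, "Pop_B")])
print("50% Pop_A + 50% Pop_B", round(distance(sample, mix), 2))
```

In this made-up case the 50/50 mix scores a lower distance than either population alone, which is the same sort of effect seen when a mixed-population result reports a smaller number than the single-population list.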

The oracles on the Eurogenes K13 and K15 calculator models have a good reputation for working with West Eurasian ancestry.  So first, how does K13 score my 100% English ancestry?

On Single Population Sharing, it rates my DNA against the closest references.  In order of closest to not so close, the top five are:

1 South_Dutch 3.89
2 Southeast_English 4.35
3 West_German 5.22
4 Southwest_English 6.24
5 Orcadian 6.97

I think that's a cracking result.  Okay, it thinks that I'm closer to South Dutch, than I am to SE English, but so close - and my East Anglian ancestry most likely does include a lot of admixture from the Low Countries from the early medieval period.  I really like Eurogenes K13.

Okay, let's now use the Oracle 4 option, to suggest admixture.  First on three populations admixing to create my DNA, what comes closest?

50% Southeast_English +25% Spanish_Valencia +25% Swedish @ 2.087456

Well that's interesting!  The SE English hit the net.  The Swedish?  Could be ancient Scandinavian admixture - but the Iberian prediction has reemerged!

On four populations admixing?

1 Southeast_English + Southeast_English + Spanish_Valencia + Swedish @ 2.087456
2 Southeast_English + Southeast_English + Spanish_Murcia + Swedish @ 2.147237
3 Norwegian + Portuguese + Southeast_English + Southeast_English @ 2.216714
4 Danish + Portuguese + Southeast_English + Southeast_English @ 2.225334
5 Portuguese + Southeast_English + Southeast_English + Swedish @ 2.230991

Oh my goodness.  K13 agrees with 23andMe AC, that I have an Iberian link.  I'm now really starting to wonder.

Let's finish off by trying K15 on my 100% English ancestry:

GEDmatch.com Eurogenes EU test V2 K15


Using Oracle for single population first, the top five closest:

1 Southwest_English 2.7
2 South_Dutch 3.98
3 Southeast_English 4.33
4 Irish 6.23
5 West_German 6.25

Okay, I'm SE English, not SW English, but pretty impressive again.

Using the oracle 4 for three population admixture, what mix comes closest to my auDNA?

50% Southwest_English +25% Spanish_Castilla_Y_Leon +25% West_Norwegian @ 1.080952

That Iberian back again!

Top five mix ups of populations closest to me?

1 Southwest_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.080952
2 Irish + North_Dutch + Southwest_English + Spanish_Galicia @ 1.111268
3 North_Dutch + Southwest_English + Spanish_Galicia + West_Scottish @ 1.282744
4 Southeast_English + Southwest_English + Spanish_Castilla_Y_Leon + West_Norwegian @ 1.295819
5 North_Dutch + North_Dutch + Southwest_English + Spanish_Castilla_Y_Leon @ 1.304939

I can't help preferring the K13 results to the EU test V2 K15 - simply because it recognises me better as SE English, rather than matching me to their SW English reference.

Conclusions

If anyone ever bothers reading this far too lengthy post, I hope that I have imparted the following lessons:

  • Don't expect DNA Ancestry tests to pinpoint an actual country of ancestry.  They're nowhere near that good yet.  The populations of West Eurasia, and elsewhere, are actually all mixed up, or share a lot of recent admixture.  In addition, many European nation-states are quite recent inventions.  I've seen the borders of Europe change in my short lifetime.
  • Don't expect precision.  If for example, you are an American, and a 23andMe AC test suggests only 32% British & Irish, then you could actually have 100% English ancestry over the past 300 years!  We're so mixed up, that these tests are struggling to part and identify us by nationality.
  • If you are willing to share your raw data (there are privacy issues), then have fun trying out all of these third party calculators.  It's a lot of fun as you can see.  They rarely agree.  There are other tools on GEDmatch for example, where you can compare DNA along with .gedcom genealogical files with other users - and look for shared segments on the chromosomes.  You can also compare your DNA to that of ancient populations.
  • Treat au-DNA differently to haplogroup results.  au-DNA is very interesting, and represents so much of our ancestry, if we could just sort some of the mess out.  You can partially do this by phasing your results with those of close relatives.  It is worthwhile phasing with at least one biological parent, if you can.  However, haplogroup results provide, through their mutations, incredible stories over much longer periods - thousands of years.  A different kind of genealogy.  As we gather more data, and reference it also to ancient-DNA, so it will tell us more and more about two lines of descent.  Perhaps even into historical times.

The 23andMe DNA results are in!

The results were uploaded to my 23andMe profile today.  I posted/registered the sample from the UK, nine weeks ago.  The sample traveled to the USA lab via a NL holding depot.  It took six weeks to process the sample and results, from the time of being marked as arriving at the USA lab.  I feel very fortunate, as many 23andMe customers are reporting a seasonal log-jam that is delaying the process.  My results though were comfortably within the proposed time frame.

There were a number of pleasant surprises.  The results were far from boring.  

Genetic Risk Factors

On the health side that we UK customers can presently still enjoy - there was only good news.  Although I have a family history of Alzheimer's that is strong on my father's side, there was no identification of any genes in my DNA, that have so far been associated with increased risk of the illness.  If my father did have these genes, I didn't receive them.  It does not mean that I will never be at risk to the illness, but it gives me some comfort.  Indeed, all of my 23andMe genetic risk factors were good.  There was no bad news.

Traits

An amusing little trait, that IS identified by the DNA analysis, is on Asparagus Metabolite Detection.  When I eat asparagus, my urine smells strongly.  It confirms for me - that the system works!  It also correctly identifies that I have a sweet tooth, that I have blue eyes, etc.

Now to the genetic genealogy goodies.

Ancestry

Y-DNA

The genetic marker that I inherit from my strictly paternal lineage - father, father's father, and so on, going back.  On paper, I've traced this back to a John Brooker, who lived in Oxfordshire, but was born outside of that county, perhaps in nearby Berkshire, circa 1785.  Of course, that is if no-one ever lied on forms about who the father was.

This one was a shocker.  A little background first.  Although my paper ancestry over the past 350 years is overwhelmingly localised in parts of the county of Norfolk, in East Anglia, my paternal-line surname carrier, that should be the donor of my Y chromosome marker, or Y-DNA, can be traced to Oxfordshire, in Wessex.  Out of my eight paper great grandparents, seven were Norfolk born and bred.  However, the exception was my paternal great grandfather.  Therefore I would not expect my Y-DNA to belong to any local Norfolk gene-pool.  It is the least representative lineage for my heritage.  This is why I feel that people can sometimes place too much value on their haplogroups.  I did however, expect it to belong to a common English or British haplogroup such as the Y-DNA R1b group.

I was in for a surprise.  It is exotic L2*.

From initial research including an Internet search, this haplogroup forms only a rare back scatter across Europe.  It appears more commonly across Western Asia and the Sub-Continent, from Turkey to Southern India.  It is most common in Pakistan, where it may originate, circa 30,000 years ago.  It is not a common European Y-DNA haplogroup.  I need to more carefully research this in the near future, but I'm in awe to find that I have an exotic Y-DNA.  It does conjure up images of one of my paternal ancestors being a Syrian archer, or Persian mercenary in the Roman Army, fathering a child, while stationed in Britannia, or perhaps elsewhere in Roman Europe.  But that might be too fanciful.  Anyway, I'm having pheasant curry for dinner tonight.

This genetic marker should be shared with my son, and my brother.  A few of my first cousins will also have it.

mt-DNA

The genetic marker that I inherit from my strictly maternal lineage - mother's mother, mother and so on back.  On paper, I've traced it back to a Mary Page, who was born in 1802, in Norfolk.  I like the maternal line, as it is actually the most biologically secure.  Few forms lie about who the mother is.  I'd expect my mt-DNA to be a haplogroup firmly established in East Anglia.

A nice one to have.  It is H6a1.

This haplogroup belongs to the Helena group.  However, it is not ancient European.  H6 is believed to have mutated from H around 30,000 years ago in Central Asia.

H6a1 has recently been associated with the Yamnaya migration into Western Europe, from the Eurasian Steppes to the north of the Black Sea, some 4,000 to 5,500 years ago.  In Europe itself, it could be associated with a number of Early Bronze Age cultures, including the Corded Ware culture.  It has been linked with the R1b Y haplogroup, that dominates Western European countries such as Ireland, France, and Britain.  Recent studies have indeed suggested a significant displacement of people in Western Europe, that occurred in late prehistory, with the arrival of pastoralists from Eurasia.  This migration is also associated with the rise of the dominant Indo-European linguistic group of Europe.  If H6a1 does indeed prove to be linked to the Indo-European explosion of the early Bronze Age, I'd be very happy.  I like to imagine one of my maternal ancestors 5,500 years ago, accompanying a band of prehistoric pastoralists, that are heading westwards into Europe with their horses.

This genetic marker will be shared with my mother, my brother, my sisters, and their children.  A few cousins will also share it.

Ancestral Composition

This is an area that I've been trying to understand recently.  It uses computer analysis to compare my autosomal DNA to a number of others in reference populations from around the World, and then composes a suggested ancestry in percentages.  This magic attempts to look not at a few genetic markers or haplogroups, but at all of the patterns in my autosomal DNA, to predict likely ancestry on any lineages that survive in my DNA.

Previous to receiving my results, I recently revised and bolstered up my paper genealogy based family tree.  I now have 172 direct ancestors listed, going back to Generation 14 during the 17th Century.  I noted that each and every one of my paper recorded ancestors was English.  All of them.  That includes all of my eight great grandparents, all of my sixteen great great grandparents, and thirty of my thirty two great great great grandparents.  That is 100% English.

Now, I'm sure that you'd agree, I should be expecting my 23andMe ancestry composition to give 100% English, right? Well no.  They can't presently identify an ethnic group like the English.  Instead, I should expect my results to fall 100% into the British & Irish category.

100% British & Irish?  No, I'll give this one away early.  It was 32% British & Irish on speculative mode.  More on this further down.

My paper research before I received my results also revealed just how concentrated, most of my ancestry has been over the past 350 years.  I compiled the below map of East Anglia.   The BLUE marking the places of ancestral events from my family tree on my father's side; and the RED marking the places of ancestral events on my mother's side.  The larger the marker, the more events recorded.

I also made a map based on East Norfolk during the 4th Century AD, before sea levels fell, and drainage changed the coastline.  I then marked out the area of my mother's ancestry on that.

The point that I was trying to make was that I believe that my ancestry may have been more exposed to the North Sea immigration waves of the 4th to 11th centuries AD.  More exposed than your average person of British & Irish heritage.  I also suggested that East Anglia, very much a part of the North Sea World, was particularly attractive to Early Medieval migrants from Frisia, Schleswig-Holstein / Angeln, North Saxony, and from Denmark.

On reviewing the 23andMe DNA Ancestry Composition of an admittedly small sample of other users with strong English heritage, I concluded that the average ethnic English person receives the results:

100% European

60% British & Irish

10% French & German

2% Scandinavian

25% unidentified broadly NW European

People of Irish heritage, or even Americans with either Irish or British ancestry, tend to score a higher percentage of British & Irish than do the present day ethnic English.  23andMe has a generous and growing reference population in its British & Irish database.  However I hypothesised that 1) the 23andMe B&I reference is skewed to the Irish, and away from English.  It is also possible that it is distorted by a case of genetic drift by testing Americans of British origin.  2) that the British & Irish designation may actually be inadvertently looking at DNA that arrived in the British Isles largely previous to the early medieval North Sea migrations.  To the British and Irish genes that have been here since late prehistory.  On the other hand, the French & German, the Scandinavian, and perhaps some of the undesignated Broadly NW European percentages that are usually assigned to the ethnic English, may actually reflect early medieval migration from across the North Sea.  The computer analysis is simply unable to distinguish some of the DNA from that of present day French, Germans, or Scandinavians, because of ancient admixture.

I'm told that this would not be the case, that 23andMe ancestral composition could not detect such deep, ancient admixture.  However, what if I am correct about my own heritage - that I likely have enhanced levels of Anglo-Saxon and perhaps Norse heritage, because of the geographical location of so many of my ancestors?  Should I not expect an even lower percentage of the 23andMe British & Irish category, and even higher percentages of other NW Europeans from across the North Sea?  So what were my 23andMe ancestry composition percentages (speculative mode)?

100% European.  Broken down into:

94% NW European.

3% South European.

I'll get to the South European later, but what about this North west European?  Let's break it down into 23andMe's sub categories:

32% British & Irish

27% French & German

7% Scandinavian

29% undistinguished broadly NW European

Oh my goodness.  It correctly fits my prediction.  I have more than double the average percentage of F&G and Scand for English people.  Despite having a paper researched genealogy that is 100% English, 23andMe's ancestry composition based on a generous reference sample size of 1251 sets, gives me 32% British & Irish.

So a predicted, but still incredibly exciting result.  I'm chuffed to bits.  It does in my eyes, blow 23andMe's British & Irish designation out of the water though.  Their reference samples do not appear to match the East English.  Instead, their software misreads some of the English DNA for French & German, or Scandinavian.  I'm suggesting that this is because of ancient admixture, during the 4th to 11th centuries AD, with North Sea immigration.  I invite others to knock my suggestion down.

One more surprise from my Ancestry Composition:  A South European 2.7%.  Broken down into 23andMe's sub categories:

0.5% Iberian

2.4% undistinguished broadly South European

This looks real.  It appears that I have a small percentage of South European heritage.  Most likely from Spain, Portugal, or the Basque Country.  I probably have Iberian ancestry that I have not yet detected using paper genealogy.  Either that, or it's an anomaly, an incorrect interpretation.

Neanderthal Ancestry

Finally, how much Neanderthal DNA do I have?  How much of my DNA was shared by the archaic humans that lived across parts of Eurasia, between 350,000 and 30,000 years ago?  Evidence of early admixture events between Neanderthal and anatomically modern human populations?

An estimated 2.9%.

That's just slightly above the average of 2.7% for modern Europeans.  So I am not more Neanderthal than most others.  Sorry to disappoint.

All in all, very happy that I spent the money.

Software for Genealogy

My experience

I first took an interest in genealogy around 1989.  My interest grew, as did my very badly kept scraps of notes.  During the following decade, a number of computer databases designed for genealogy came onto the market, but I could only afford a used 8 bit Amstrad computer, and had to make do with typing it all up on the Locoscript word processor.  In 1998, I finally moved onto a Windows 3.1 PC.  I soon came across an early version of Family Tree Maker - perhaps version 3 or 4.  I had to install it using a pile of floppy disks!

I then used FTM to compile my family tree, complete with scanned photos.  I could use it to print trees, albeit only on A4 sheets.  I do wish that I had been more careful to conserve and record all of my sources back then.  I wasn't a very methodical researcher.  The program also allowed me to produce a GEDCOM file, which I could use on other genealogical databases, or, it promised, to share online, as soon as I had the ability to connect to this new Internet thingy.

I indeed upgraded to a Windows 98SE PC, and even, with a 56k modem, connected to the Internet.  I think that I bought a newer CD-ROM version of FTM.  One of my first actions on the Internet was to upload my .ged (GEDCOM) file.  I'm not too sure to whom, to what server, or what they did with my data.  I wasn't too wise about the commercialisation of the information super highway at that time.  It may have ended up with the Ancestry.com enterprise, who no doubt have sold and resold it on.

Eventually my interest in genealogy drifted away.  Quake 3 Arena was much more fun than typing ancestors into a database.  I did, at some point around 2006, upload my GEDCOM to a web server, so that it could be downloaded from my own ancestry pages.  I eventually gave up that subscription.  Life moved on.  I went through a family break up.  I lost all of my old poorly kept notes, my hard drives, everything.

Then a few months ago, I randomly decided to take a DNA test with 23andMe.  My interest in my roots rekindled by the prospect of genetic ancestral profiling, I dug up my old ancestry pages from a web archive service, and was surprised to find that they had also archived my GEDCOM file.  I downloaded it.  Now I needed some software in order to open it.  I always look first for Open Source software.  I found Gramps 4.2.

Gramps 4.2

Gramps is free and Open Source, and is available on both Windows and Linux.  It may be available for other platforms as well, I don't know.  I've installed Gramps 4.2 onto my Windows 7 64-bit PC.  I've also installed Gramps 4.0 onto my Lubuntu 14.04 (Linux) netbook.  The screenshots that I've uploaded to this post were taken on the Windows PC.  There are differences between my two versions on two platforms, and I won't go into all of those.  I'm mainly interested in updating and generating .ged GEDCOM files, bearing in mind my past experiences of losing data over the years.  As long as I don't faff around too much with attributes carried by Gramps native features that are not carried over by the GEDCOM format, the programs both run great.  I haven't yet played too much with images.  My understanding is that you need to host the binary files in folders on your hard drive - then Gramps merely points at them.  That isn't something (as I presently understand) that is supported under GEDCOM.

Gramps is a database.  Like any database, it revolves around objects, attributes, and tags.  Some glossy family history software might dumb all of this down a bit.  They want to catch the mass market.  Gramps does not shy away though.  Its real magic is that it offers so many different ways of entering data, in ways that can be tracked to resources, citations, notes, places, even coordinates.  It's a piece of free software that the geek genealogist should love.  Typical of Open Source, it is more functional than pretty.  It's a piece of software that can be daunting at first.  However, if you are methodical, and reasonably pooter-literate, give it a little perseverance, and you soon start to love it.  Its features will encourage tidiness and well documented research.  Why spend out on EULA licensed software?

This week, I've been investigating the Places objects.  I've discovered that you can geo-tag your places, which can then be referenced by events such as baptisms, deaths, census records, etc.  I must have hundreds of places for my family database of 1,435 ancestors / relations.  Nonetheless, I've been spending too much time on the pooter, tidying up my place data, and, by referring to OpenStreetMap.org, copying and pasting longitudes and latitudes into all of them, along with place-type, alternative names, etc.  It's all about making a better GEDCOM, a better family history database.
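For anyone curious about what that geo-tagging looks like inside the file, here is a minimal Python sketch.  It assumes the GEDCOM 5.5.1 convention of a MAP block under each PLAC line, with LATI and LONG values prefixed by a hemisphere letter (N, S, E or W).  The sample record, its name and its coordinates are invented for illustration, and I have not checked this against exactly what Gramps writes out.

```python
# A hypothetical GEDCOM fragment; the person and coordinates are made up.
SAMPLE = """\
0 @I1@ INDI
1 NAME Sam /Example/
1 BIRT
2 DATE 1 JAN 1880
2 PLAC Wisbech, Cambridgeshire, England
3 MAP
4 LATI N52.6667
4 LONG E0.1600
0 TRLR
"""


def parse_coord(value):
    """Turn a GEDCOM coordinate such as 'N52.6667' into a signed float."""
    value = value.strip()
    hemisphere = value[0].upper()
    if hemisphere in ("N", "E"):
        return float(value[1:])
    if hemisphere in ("S", "W"):
        return -float(value[1:])
    # Assumption: anything else is already a signed decimal number.
    return float(value)


def extract_places(gedcom_text):
    """Return {place name: (latitude, longitude)} for every geo-tagged place."""
    found = {}
    current = None
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if tag == "PLAC":
            current = value
            found.setdefault(current, {})
        elif tag in ("LATI", "LONG") and current is not None:
            found[current][tag] = parse_coord(value)
        elif level == "0":
            current = None  # a new record starts, so forget the last place
    return {
        name: (coords["LATI"], coords["LONG"])
        for name, coords in found.items()
        if "LATI" in coords and "LONG" in coords
    }


print(extract_places(SAMPLE))
```

Places without both a latitude and a longitude are simply left out, which makes it easy to see which ones still need tidying.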

So why bother?  Well, for one, it's going to be a vastly improved record of my family.  A healthy database.  Not only that, but Gramps allows me to plot my ancestors' locations - or rather, the locations of "events":

I can see at least one error there - in the sea off the Kent coast.  Some more tidying to do.  By the way - the mapped events include the paper ancestors on my kid's family tree, including those of my Ex.  Alternatively, I can browse the places, hit an option, and in a browser, up pops the location on OpenStreetMap!

There are many more features to explore on Gramps.  I'll get to them in time.  I've uploaded several of the fancharts that it can generate already on this blog.  There are a range of other reports that it can produce, and web pages.  The generated website is incredibly functional.  It took a couple of minutes to generate pages for every one of my 1,435 family tree individuals.  All with trees galore.  All that I would need is a web host.

As for stability.  I've seen someone complain that it slows down.  Nonsense.  It's fine even with my extensive database.  My Windows version is very slow to launch though - not so the Linux version.  However, when it's up, even on Windows, it is perfectly functional and very fast.

Some people are also confused about how to load a GEDCOM file at the start.  I was.  It's simple.  You need to first create a blank family tree file from the manager.  Then you can import your GEDCOM into it.  You don't see the Import/Export functions until you have first loaded a family tree - just make it a new blank file.  Once you have created a family tree, and imported a GEDCOM - be careful to use that file next time, and not do as I did - import the GEDCOM again.  You'll end up with two of each individual.  Always back up before and after making any edits.  I like to mail a backup to myself on webmail, so that the GEDCOM is also backed up on two webmail servers.
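If you do import twice by mistake, the doubles can be spotted in an exported GEDCOM.  The following is only a rough Python sketch of one way to do it, not anything built into Gramps: it assumes that two individuals with the same NAME line and the same birth DATE are the same person, which will occasionally be wrong.  The sample data is invented.

```python
from collections import Counter

# Hypothetical export in which individual I1 has been imported twice.
SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME Sam /Example/
1 BIRT
2 DATE 1 JAN 1880
0 @I2@ INDI
1 NAME Sam /Example/
1 BIRT
2 DATE 1 JAN 1880
0 @I3@ INDI
1 NAME Ann /Example/
1 BIRT
2 DATE 5 MAY 1882
0 TRLR
"""


def individuals(gedcom_text):
    """Yield a (name, birth date) pair for each INDI record."""
    name = birth = None
    in_indi = in_birt = False
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == "0":
            if in_indi:
                yield (name, birth)
            # In "0 @I1@ INDI" the record type is the third field.
            in_indi = value == "INDI"
            name = birth = None
            in_birt = False
        elif in_indi and level == "1":
            in_birt = tag == "BIRT"
            if tag == "NAME" and name is None:
                name = value
        elif in_indi and in_birt and level == "2" and tag == "DATE":
            birth = value
    if in_indi:
        yield (name, birth)


def find_duplicates(gedcom_text):
    """Return the (name, birth date) pairs that appear more than once."""
    counts = Counter(individuals(gedcom_text))
    return [key for key, n in counts.items() if n > 1]


print(find_duplicates(SAMPLE))
```

Anything that this flags still needs checking by eye before merging or deleting, and it is no substitute for taking a backup first.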

GEDexplorer

That geo data that I produced on Gramps is carried over by GEDCOM to other databases and platforms.  I use the GEDexplorer v1.24 app on my Android smartphone.  This app allows me to view my GEDCOM files on my phone!  It cost me a couple of quid from the Play Store, but it was money well spent.

The above screenshot shows a view of one part of my tree - the ancestors of my great grandfather Sam.  It's a really nice feature of GEDCOM files and this software, that you can open up your database, look through trees, fan charts, or just the data itself, browsing through ancestors.  Handy if you just get a spare hour here or there to research with - but no laptop!  Easy quick reference of your entire database from a phone.

You see, it's all there.  The beauty of GEDCOM format has reached from my Windows 3.1 machine, to my Sony Z3 phone.  That's a rugged file format.

Now to the point of this entire post really.  Those hours that I've wasted away on giving geo-tag provenances to all of my ancestral places?  The GEDCOM picks up the latitudes and longitudes, and GEDexplorer displays them.  Just click on any hyperlinked ancestral event place, such as my great grandmother's place of birth above, and ....

and I can hit the link and look at it in more detail using Google Maps.  Hell, I could even navigate to the actual place.
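I don't know how GEDexplorer builds its links internally, but the idea is simple enough to sketch in Python.  The two functions below are my own illustration: they turn a latitude and longitude into a marker link on OpenStreetMap and a search link on Google Maps.  The coordinates in the example are made up, and the zoom level of 15 is an arbitrary choice.

```python
def osm_url(lat, lon, zoom=15):
    """Build an OpenStreetMap link that drops a marker on the given point."""
    return (
        "https://www.openstreetmap.org/"
        f"?mlat={lat:.5f}&mlon={lon:.5f}#map={zoom}/{lat:.5f}/{lon:.5f}"
    )


def google_maps_url(lat, lon):
    """Build a Google Maps search link for the given point."""
    return f"https://www.google.com/maps/search/?api=1&query={lat:.5f},{lon:.5f}"


# Hypothetical coordinates for an ancestral place.
print(osm_url(52.6667, 0.16))
print(google_maps_url(52.6667, 0.16))
```

Five decimal places is roughly metre-level precision, which is far more than a parish or village ever needs.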

The Paleo Diet. A critique

The above photograph seems to illustrate how many modern people eat.  I took it in a Wisbech back street, using the 50p camera, Olympus XA2, loaded with Ilford HP5+ film, which I developed in Kodak D76.

Eat like a Caveman?

What prompted this post?  I was shopping in a local discount store today, and I spotted their range of Paleo and Atkins diet aids - ketosis pills, high protein this, high protein that, and a ... Paleo Protein Bar.  I just cannot imagine a palaeolithic hunter-forager unwrapping and then biting into a factory-produced "protein bar".  The ketosis pills were ridiculous enough.  Still, it reminded me of why I turned my back on the Paleo diet crowd years ago.

The Paleo-Diet is based on the assumption that humans have not had time to adapt to a modern diet.  This, they might argue, is why we grow overweight, unfit, and suffer many illnesses.  They suggest that humans evolved to a hunter-gatherer diet over many thousands of years.  In order to replicate some of our "natural" food groups, Paleo-dieters do not eat: junk food, fast food, bread, cereals, any wheat or flour products, refined sugar, beans, legumes (including peanuts), potatoes, processed vegetable oils, or any dairy produce.  Some Paleo-dieters with European heritage also avoid certain foods that originated in the New World, including, for example, tomatoes, avocados, and peppers.  Although this proscription does not appear in the mainstream, there are some Paleo-dieters who believe that people without a long New World heritage have a genetically based conflict with such food groups.  On my recent excursion into genetic profiling, I've seen posts from some individuals actively looking for New World food intolerance genes.  I know where they are coming from.

Before I launch my critique, I should just start with what I do agree with in the Paleo-Diet.

  • I fully agree that we have not had time to adapt to the modern diet.  However, I refer to the massive changes to our diet over the past 100 years, not over the past 10,000 years.  I agree that we should avoid junk food, fast food, processed meat, sugary foods, and processed oils.  I feel that we should also reduce our consumption of refined white wheat flour products.
  • I do like how the more sensible paleo-dieters seek out and eat a variety of vegetables, fruits, and nuts.  By avoiding starchy foods (rice, pasta, and potatoes), they sometimes consume more portions of vegetables.
  • It might encourage people to think like a hunter-forager.  Foraging at the local farm shops, markets and stores for more variety of natural wholesome foods.  To consume mindfully.  Avoid cheap meats.

Now what I disagree with:

  • It is based on bad science, bad anthropology.  10,000 years represents lots of generations for us to adapt to an agricultural diet.  Lactose tolerance is evidence of that evolution.  The book The 10,000 Year Explosion: How Civilization Accelerated Human Evolution (Cochran and Harpending, 2009) explores these issues, and argues that human evolution, in terms of changes in allele frequencies within our populations, has actually accelerated.  One of the pressures behind this acceleration has been identified as the agricultural diet.  Anthropologists can also point to farming populations, for example, some dairy farmers in Africa, that are particularly tall and strong.  The past 100 years, yes, I can agree, we have not had time to biologically adapt to the profusion of refined sugars and processed fats that surround us daily.  However, you cannot tell me that agricultural foods are all bad.  What are Paleo-dieters actually eating?  Those fruits, vegetables, grass-fed beef, and even nuts, are all agriculturally produced.  I don't see many Paleo-dieters living on whatever they forage or hunt from the wild alone.
  • The truth is that hunter-foragers were highly adaptable to different diets.  As they spread across the planet, so they encountered different resources.  They adapted partly by human culture, by lifestyle, but humans are also great omnivores and opportunists.  I once saw a British TV documentary about a woman with an eating disorder that restricted her diet to one flavour and brand of corn snacks.  Okay, her skin looked a bit pasty, but she was not noticeably overweight, and her disorder had not prevented her from surviving to adulthood, and from raising her own children.  What I hate about the Paleo-diet, particularly about some of its more extreme schools, is that it is restrictive.  It prohibits the consumption at least of legumes, potatoes, even whole grain cereals, and beans.  Some of its followers also avoid tomatoes!  Come on people.
  • Again, I'm not criticising the mainstream Paleo.  But some of its followers really go for the meat, even processed meats!  They treat it as a high protein diet.  In their eyes, hunter-gatherers ate largely what they hunted.  Although some hunter-gatherer communities did eat a lot of fish and whale meat for example, most of them in reality more likely sourced most of their calorific requirements from foraged foods.  For example, humans produce plenty of amylase, an enzyme that converts starches from plant roots and tubers into useful sugars.  Other species of apes produce far less of it.  This would suggest that at some point in our ancestry, the ability to eat and digest roots was pretty essential for survival.  The prized tool of many bush women in SW Africa until recently (it's now most likely a smartphone) was the digging stick.  Expertly used to dig up edible roots for the pot.  Hunted meat was culturally more valued - but foraged foods provided most of the calories.  I've seen Paleo-dieters praising American bacon and tinned ham as a good food source.  I've seen the same people wince at the idea of eating fresh liver or sprats.  I'm pretty sure that fresh tomatoes, rolled oats, and even local potatoes (in the right proportion), are healthier than fried processed bacon.  But that's just my opinion.
  • Any focus on diet is only half the equation when it comes to living well.  The other half is activity.  Do we exercise?  How often do we get out of breath?  Are we really happy?  Do we push our muscles to the limit?  Do we take time out to stroll through green, clean air areas?  Do we relax properly?

That's today's sermon.  Eat more vegetables and fruit, and you don't need to avoid oats or tomatoes.  Consume mindfully. Get moving.  Enjoy life.