How not to use online genealogy

I recently decided to invest in an annual subscription to Ancestry.co.uk.  I therefore intend to use it extensively over the next year in order to bolster my tree and to add leaves through their very fat database of resources.

A little background.  I've researched my family tree since at least 1988, but not continuously.  Back in the day, there were no online resources.  The most modern things were censuses on microfilm and the Mormon IGI (International Genealogical Index - the ancestor of FamilySearch.org) available in the Local Studies Library.  My tree started, as it should, through interviewing elderly relatives, looking through their photos, the few birth and marriage certificates, and any other artefacts.  Those elderly relatives have all passed on now.  If you are just starting with genealogy - do it now.  I then moved on to the English & Welsh County record offices.  White gloves and pencils, in order to peruse the original parish registers and other documents - no digitalisation, or even microfilming of them then.  Very little indexing as well.

Then I was ordering GRO certificates from London, paying professional researchers to collect them for me, as it worked out cheaper than having them mailed to me by the GRO!  Then rather than looking for DNA matches, it was searching through surname interests or through the annually published GRD (Genealogical Research Directory) for shared ancestry.  The good old days.

I said it wasn't continuously.  Interests changed, I lived out life recklessly, and moved on a few times, leaving all behind.  I lost pretty much all of my genealogy.  Meanwhile, digitalisation was coming in fast, indexing increasing, and the Internet was giving birth to online genealogy.  During this birth, I had used an early version of Broderbund Family Tree Maker (it installed on several floppy disks) on a personal computer, and even managed to upload data and a GEDCOM file to a few places.

Then maybe 16 months ago, after ordering a 23andMe test, I picked it up again.  I found my old GEDCOM file on a web archive.  Downloaded it, opened it with open source Gramps software.  It worked!  Since then, I've gathered surviving notes (so many lost), photos, and certificates.  I then discovered a remarkable resource.  Online Genealogy.

Online Genealogy

There are many online resources.  The big providers include Ancestry.com (Ancestry.co.uk), FindMyPast.co.uk, MyHeritage.com, and FamilySearch.org.  All but the last of these are subscription fee based.  Aside from these providers, there are many other services for genealogy online.  Of the above, I have heavily used FindMyPast, FamilySearch, and Ancestry.

Online Genealogy using Ancestry.com

The big advantage of Online Genealogy is indexing and the database.  Over the past 25 years or so, armies of volunteers and paid researchers have been reading through microfilmed, microfiched, or digitalised images of masses of parish registers, parish records, wills, criminal registers, state records, military records, Bishops' transcripts, headstone surveys, and more - not only from England & Wales but from all over the world, wherever they are available.  They read the names of those recorded, and add them to computer files with references.  Businesses such as Ancestry.com buy access to these indexes, and often to the original digitalised images if they exist.  These are all added to their own database.  Their customers search, and find ancestors.

A Few Problems

  1. I can report this for English records, of which I have a lot of experience.  The record is still very incomplete.  You might see a Joe Bloggs, but is it your ancestor Joe Bloggs?  Many of the parish records are missing, or damaged.  Parish chests in cold churches can be damp places, and the registers were pulled out for every baptism, marriage, or burial, and thumbed through by all.  Paper was valuable when the older records were made, and the priests and clerks crammed their little scribbled lines into them.  There are stories of vicars' wives using old registers to kindle the fire in the vicarage.  In addition, not ALL parish registers are online at any one repository.  I've noticed that Ancestry.com is very good for Norfolk registers, but abysmal for Suffolk.  FindMyPast is good for Berkshire records.  They are far from complete records.  In addition, some ancestors were not in any parish records.  They were rogues on the run, vagabonds, or even more often ... non-conformists.  Some priests were lazy.  All of this on top of those many missing or damaged records.
  2. The indexers were human beings.  Sometimes volunteers, sometimes, more recently I suspect, poorly paid human beings outside of Europe (is this the case?)  They vary in skill at reading 18th, 17th, even 16th century handwriting that has been scribbled down in often damaged records.  The database searches for names that sound similar (to a computer program), but it misses so many that are incorrectly transcribed.  Try to read through the original images if you can.
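To show why a "sounds similar" search can still miss a badly transcribed name, here is a minimal sketch in Python.  It uses Soundex, a classic phonetic code long used in genealogy indexes.  I do not know what matching the online services actually use, so treat Soundex here as a stand-in assumption, and the mis-transcribed spellings as invented examples.

```python
# Build the standard Soundex letter-to-digit table.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the four character Soundex code for a surname."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0]                   # the first letter is always kept
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not break a run of equal codes
        code = CODES.get(ch, "")       # vowels (and Y) have no code
        if code and code != prev:
            out += code
        prev = code                    # a vowel resets the run
    return (out + "000")[:4]           # pad or cut to four characters


# Spelling variants share a code, so a phonetic search finds them.
print(soundex("Bloggs"), soundex("Blogs"), soundex("Blaggs"))
# Misread letters change the code, so the same search misses them.
print(soundex("Bioggs"), soundex("Rloggs"))
```

A spelling variant such as "Blogs" lands on the same code as "Bloggs", but an indexer who reads the "l" as an "i", or the capital "B" as an "R", produces a different code, and that entry will not come up.  That is the case for reading the original images yourself.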

So the record is far from complete.  The online record is even less complete.  A brilliant tool, but it's not going to hand you your family tree all perfect and true.  If you are more concerned about truth and quality than about quickly producing a family tree back to Queen Boadicea (I have seen people claim such things!), then you are already aware of this.  The problem is this: you know that an ancestor was called Joe Bloggs.  Online, you find a Joe Bloggs, living 100 miles away, born about the right time.  With a click, you "add" him to the tree, then resume climbing up from him.  What you may not realise is that there were maybe 20 Joe Bloggs born at about the right time within a 100 mile radius of where the next generation lived.  You just picked the one that your online ancestry service flashed up to you.  He is quite probably not close family, never mind your ancestor.  All above him are not your ancestors.

Truth and quality in a family tree

Do you care?  Is it possible to trace back more than several generations, and to preserve that quality? The 20th and 19th centuries in England & Wales are great.  We have records from a national census every 10 years between 1841 and 1911.  They can be searched with your online service.  We have them as correlations for parish records.  We also have state records to correlate with from 1837!  Before that though, it gets a bit scratchy.  Particularly if your ancestors were not titled - as most of them were not!  Then we are down to scribbles in parish registers, a few tax books, tithes, military rolls.  Great stuff, but increasingly - we lose correlations.  We lose certainty.

When we lose certainty, we have to start to make judgments.  Do we add an ancestor based on a thin record?  We have to make that judgment ourselves.  We should add the source, name it, perhaps publish our uncertainty.  We should be ready to remove the ancestor if doubt grows rather than certainty.

I've not mentioned biological certainty here.  Haplogroup DNA can challenge some very old trees.  Things happen in biology.  We call them NPEs (Non Parental Events).  Spouses cheat, lie, prostitute themselves, are raped, commit bigamy or incest, or are simply confused.  People secretly adopt, particularly during a crisis.  I have seen a claim that an NPE happens on average once in every ten generations.  I don't think that we can truly measure this.  Anyway, I'm of the school that although DNA genealogy is interesting in the pursuit of the past, family is not always just about biology.  Who reared them?  Who gave them their name?  If that is family, it's also ancestry.


But the ultimate mistake with using online genealogy

This one is easy.  It is that companies such as Ancestry.com and MyHeritage.com allow, and sometimes encourage, the sourcing of ancestors from other members' family trees.  It has nothing to do with rights or property.  It has to do with the reproduction of mistakes and bad quality research.  It indeed gives genealogy at online sites like these a pretty bad name.

Many users of these sites are casual.  They have only used the online resources available through the quick click and collect ancestry of these services.  They are only trying to get as far back as possible, in as short a time as possible.  Truth and quality are of very much secondary value.  It's the consumer society.  They leave their disjointed trees of fiction all over these web services.  Then Ancestry / MyHeritage invites you to add them to your own.  Very much internet viral in form - the errors replicate like mutations in a strand of DNA, only with lightning speed.  It's so easy to add new layers of ancestry.  But they are fiction.  I've seen people marrying before they are born, dying before they give birth.  I've seen people marry their parents or uncles.  I myself recently tried it en masse on a tree, as an experiment.  It was incredible.  The discrepancies and errors.  Ugly.
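Some of these errors are simple enough that a computer could flag them before they spread.  Below is a rough sketch in Python of the kind of date sanity check I mean.  The Person record, its field names, and the example family are all made up for illustration, and are not taken from any particular genealogy program or website.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """A hypothetical, much simplified tree record (years only)."""
    name: str
    born: Optional[int] = None
    died: Optional[int] = None
    married: Optional[int] = None
    children: List["Person"] = field(default_factory=list)


def find_problems(person):
    """Return a list of impossible date combinations for one person."""
    problems = []
    if person.born is not None:
        if person.married is not None and person.married < person.born:
            problems.append(f"{person.name}: married before being born")
        if person.died is not None and person.died < person.born:
            problems.append(f"{person.name}: died before being born")
    for child in person.children:
        if child.born is None:
            continue
        if person.born is not None and child.born <= person.born:
            problems.append(f"{child.name}: born before parent {person.name}")
        # Allow one year of slack: a father can die some months before the birth.
        if person.died is not None and child.born > person.died + 1:
            problems.append(f"{child.name}: born after parent {person.name} died")
    return problems


# An invented "click and collect" branch with the sort of errors described above.
mary = Person("Mary Bloggs", born=1802)
joe = Person("Joe Bloggs", born=1810, died=1799, married=1795, children=[mary])
for problem in find_problems(joe):
    print(problem)
```

A check like this only catches the impossible.  It cannot tell you whether a plausible looking Joe Bloggs is the right Joe Bloggs.  That still takes judgment and sources.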

So, if you have to, look at other trees. I strongly recommend that you avoid that temptation to simply click and collect ancestry.  Most of the genuine ancestry on these trees is available to be quickly found with your own use of the services on that site.  Do that, but make your own judgments.  Don't add to the virus trees.  Genealogy is for the long haul.

Ancestry and DNA Tests

I'm writing this post in response to a number of comments that I see online with regards to using a commercial DNA test, in order to ascertain ancestry.  Quite often, when someone asks how to find out their family history or ancestry, someone will come back with an answer in the form of "just spit in a vial, send it to Ancestry.com, and they'll tell you".  It's not really that simple, so I'm making this post, to explain how an ancestry DNA test can help, or not help, you discover your ancestry.  Nicely dumbed down I hope, for the beginner.

Traditional Genealogy

Traditional genealogists usually set out to create a genealogy (family history and tree), using interview techniques, artefacts, and oral memories, recorded from older relatives.  Artefacts might, for example, include old family medals, or photographs.  They then extend the research through documentary evidence, such as birth, death, and marriage certificates, church registers, census records, transcripts, electoral rolls, and military records.  If they are interested in recording all ancestral information, and not merely a single line such as the surname line, then this research can go on for months, years, even decades.

What you cannot do, is to simply pay a small fee, and your entire family history drops through the letter box in a brown envelope.  It takes years of time to research, collate, and to verify a good family tree.  Most genealogy enthusiasts don't mind this, because they actually enjoy doing the research itself.  It becomes a hobby, even sometimes a passion.

However, a number of commercial DNA companies may give the general public the impression, that you now can simply pay a fee, spit or swab, and your ancestry magically appears for you on a website.  It's big business.  Does it work though?  Exactly what is genetic genealogy?

What is Ancestry and why do we care?

Ancestry can simply be defined as our descent from forebears.  Why do we care who they were?  Which forebears or ancestors?  How many are there?  How far back?

Of course, not everyone does care.  Not everyone cares about history.  But for others, in how we define ourselves, our communities, and families, it does matter.  It tells us who we are, where we came from.  It defines us, gives us grounding.  It gives us identity.  Wars have often been inspired by ancestry.  At the same time, a deeper appreciation of the human family, and its common ancestry, can be used to relate to those elsewhere.  One big family.  Discovering the immense poverty and hardships of our ancestors can help us to appreciate what we have, and to help others in need today.

So what ancestry can we discover?  For those few that merely concentrate on one patriarchal line, it's quite simple to define - the generations of a surname.  However, beyond that one narrow line of descent, few appreciate exactly how much total ancestry we have.  Let's look at our biological ancestors at each generation:

  • 2 parents
  • 4 grandparents
  • 8 great grandparents
  • 16 g.g grandparents
  • 32 g.g.g grandparents
  • 64 g.g.g.g grandparents
  • 128 g.g.g.g.g grandparents
  • 256 g.g.g.g.g.g grandparents.

These are only your 510 most recent direct ancestors, yet those generations will take you back only around 250 years of family history.  Now add all of the recorded children of these direct ancestors - the great great uncles and aunts - to the theoretical family tree.  You're probably going to have a tree of around 1,300 individuals.  That is just for 250 years.  You have a big family.  Go back a few more generations, and it will explode before you reach far.  All of those direct ancestors, though, are a part of your ancestry.  You'll most likely carry some DNA from most of them.  They are, from a biological perspective, who you are.
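The arithmetic above is easy to check.  Here is a short Python sketch that doubles the ancestors at each generation and keeps a running total.  The figure of 30 years per generation is my own rough assumption for the sake of the example, and it ignores cousins marrying within the tree.

```python
YEARS_PER_GENERATION = 30  # assumption: a rough average, used only for illustration


def ancestor_table(generations):
    """Return (generation, ancestors in it, running total, approx. years back)."""
    rows = []
    total = 0
    for n in range(1, generations + 1):
        count = 2 ** n            # every ancestor has two parents
        total += count
        rows.append((n, count, total, n * YEARS_PER_GENERATION))
    return rows


for n, count, total, years in ancestor_table(8):
    print(f"generation {n}: {count} ancestors, {total} in total, about {years} years back")
```

Eight generations gives 256 ancestors in the last row and 510 in total, which at 30 years a generation is roughly 240 years back.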

By the way, the number of biological ancestors will not continue to increase infinitely.  Because increasingly, you will find couples within your tree that are distant biological cousins of each other.  As this accelerates through thousands of years, that explains how all modern people around the world, all descend from a very small population around 100,000 years ago.

So before considering what DNA can do for genealogy, we need to consider which ancestors matter to us.  Do you just want to know who your biological parents, or grandparents were?  Do you want to know the names, places and social positions of your ancestors over centuries?  Do you want to know which parts of the world that your ancestors lived 500 years ago?  Do you want to know how some of your prehistoric ancestors moved across the globe, thousands of years ago?  Maybe you want to know everything.

Let's now turn to genetics for genealogy, and how DNA tests can answer some of these questions.

There are two main types of DNA tests for ancestry, although they are often incorporated together by commercial companies:

  1. The haplogroups, the Y-DNA and mt-DNA
  2. Autosomal DNA

The Haplogroups

The haplogroups are chains, or markers, that are carried on one of only two strict lines of descent.  They do not apply to your entire ancestry - just two lines.  As we saw above, we have 256 ancestors eight generations back (fewer if cousins married within the tree).  Our haplogroups came from only two of them.  Your haplogroup does not define you.  Yet, it's quite odd, because very quickly, many genetic genealogists do relate to them, rather like a hereditary football club.  They do become an identity, but only if you enthuse over them.

The Y or paternal haplogroup, follows the strict paternal line.  From father to son.  Women do not have a Y chromosome, so cannot pass it on.  It has to come from the biological father.  However, within this constraint, Y-DNA is particularly useful to genealogists.  It mutates often, both as STRs and less often, as SNPs (snips).  Because of these frequent mutations, it is useful for tracing shared descent with others.  It can also be aligned with surname studies.  The champion commercial DNA company for Y-DNA research, is Family Tree DNA.

The mt or mitochondrial (maternal) haplogroup follows the strict maternal line.  From mother to children.  Both sons and daughters inherit their mt-DNA haplogroup from their biological mother.  However, only the daughters can pass it down.  There are two drawbacks to mt-DNA for genealogy.  1) The surname frequently changes, traditionally nearly every generation through marriage.  2) It doesn't mutate as frequently as the STRs of Y-DNA.  It is still a useful tool, and can prove descent through the maternal line.  It can also still be used for studies of much deeper, ancient ancestry.

Autosomal DNA

This is the bulk of your DNA.  All of the snips (SNPs) that make up who you are genetically.  You receive approximately 50% from each parent, 25% from each grandparent, 12.5% from each great grandparent.  This subdivision cannot go on forever, and indeed, once you go back much more than six generations, the approximations start to deviate, so that you may have no snips at all from a particular line that joined your family tree over 250 years ago.
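The halving is simple to put into numbers.  The Python sketch below prints the expected average share from one ancestor at each generation back.  These are averages only.  Apart from the 50% from each parent, the real share from any one ancestor varies, because of the random recombination described next.

```python
def expected_share(generations_back):
    """Average fraction of autosomal DNA from ONE ancestor that many generations back."""
    return 0.5 ** generations_back


for n in range(1, 9):
    print(f"{n} generations back: about {expected_share(n) * 100:.2f}% from each ancestor")
```

By eight generations back the average share from each ancestor is under half of one percent, which gives a feel for how a particular line can drop out of your autosomal DNA altogether.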

The problem with autosomal DNA is that it can be such a mess.  It recombines randomly with every generation.  Therefore, it is much harder to track ancestry in the same way, that we can with the haplogroups.

So how can they be applied for genealogy:

Biological descent

Not everyone knows who their biological parents were, or where they came from.  This is the first use of DNA testing.  It can be used to find, test, or prove recent descent.  The first hurdle of genealogy.  Both haplogroup evidence, and autosomal evidence can be used to prove or determine relationship.

Cousins

Many genetic genealogists, use DNA to find distant, and sometimes not so distant cousins.  The hope is that they can link trees, share knowledge and research, perhaps copies of artefacts.  Therefore an awful lot of genetic genealogy is about tracing genetic relatives, and establishing common ancestry.

There are two main tools:

  • Haplogroup Projects.  The Y haplogroup is favoured for its frequent STR mutations, and also for its link to surnames.  Family Tree DNA projects track the STR and SNP data of their members, tracking families, relationships, and known mutations.  Project administrators at FTDNA can predict relationship to other members in the project.  Your Y cousins.
  • Shared segments.  Autosomal DNA can be used for finding distant cousins.  23andMe, for example, have Relative Finder.  Alternatively, customers of any commercial DNA company that gives them access to their raw data can upload that data to GEDmatch.  At GEDmatch, they can search for other kits, looking for lengths of shared segments (measured in cM - centimorgans) on the autosomes or X chromosomes.  Longer, or more numerous, shared segments indicate closer shared ancestry.
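To make the shared segment idea a little more concrete, here is a rough Python sketch that totals up the segments shared between two kits.  The segment list is invented, the record layout is my own, and the 7 cM cut-off is only an assumed threshold for ignoring tiny segments that are often noise.  It is not the output format or the method of GEDmatch or of any testing company.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One stretch of DNA shared by two kits (hypothetical layout)."""
    chromosome: str
    start: int       # start position on the chromosome
    end: int         # end position on the chromosome
    cm: float        # genetic length in centimorgans


def summarise_match(segments: List[Segment], min_cm: float = 7.0):
    """Count, total, and find the longest of the segments at or above min_cm."""
    kept = [s for s in segments if s.cm >= min_cm]
    return {
        "segments": len(kept),
        "total_cm": sum(s.cm for s in kept),
        "longest_cm": max((s.cm for s in kept), default=0.0),
    }


# Invented example data for one pair of kits.
shared = [
    Segment("3", 10_000_000, 22_000_000, 12.5),
    Segment("11", 5_000_000, 13_000_000, 8.0),
    Segment("X", 1_000_000, 3_000_000, 3.2),   # below the cut-off, ignored
]
print(summarise_match(shared))
```

The larger the total and the longest segment, the closer the likely relationship, but turning those numbers into "third cousin" or "fifth cousin" is a matter of probability rather than proof.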

It is important to understand, that this is not about directly tracing ancestry.  It is only about establishing shared biological ancestry, with other researchers, with which you may be able to share resources.  In the old days of genealogy, we would find distantly related researchers by browsing through annually printed surname interest directories.  Here, the same thing is happening, but we are finding people by comparing DNA.

Ancestry from Autosomes

Most commercial DNA companies providing ancestry information now use their own proprietary calculators to look at the autosomal DNA of their customers for patterns that they can relate to a number of reference populations.  23andMe, for example, uses Ancestry Composition to estimate which parts of the world the ancestors of their customers lived in 500 years ago.  They express this prediction as percentages of ancestry.

However, it is very much a developing art.  The problem is that genes have been randomly mixing and moving around ever since prehistory.  The customers of these DNA companies want hard facts.  They want their ancestry accurately pinpointed down to modern or ancient nation-states, or to historical populations such as the Vikings or Huns.  Ancestral DNA companies are under pressure to provide this deep ancestry.  However, can they?  Ancestral analysis of DNA can be very enlightening.  It can provide some surprises within a family history.  However, its accuracy is exaggerated.  It is increasingly successful at predicting ancestry from a particular corner or end of a particular continent.  But it cannot, for example, tell French, British, and German ancestry apart with any high accuracy.  It can recognise some populations better than others.  It cannot tell anyone if they had Viking ancestry.

Ancient Ancestry

This is a particular value of the haplogroups.  As we accumulate more and more data on more mutations, as we expand the recorded database, and as we relate that to more ancient DNA extracted from referenced and dated ancient human remains, so we will be able to better explore the population genetics not only in history, but deep into prehistory.

However, it is also becoming increasingly realised that patterns of ancient admixture can also be detected within the autosomes.  Although autosomal DNA ancestry calculators claim to reveal relatively recent admixtures over the past 500 years, it is becoming clear that these are being confused by much older patterns of admixture.  These patterns can now be explored and probed on a number of GEDmatch programs.  People can compare their DNA with the kits from ancient DNA, or predict just how much of their ancestry was likely "Western Hunter-Gatherer", or "Early Neolithic Farmer".

In addition, more DNA companies are now measuring for much more ancient admixture with archaic populations such as the Neanderthals.

Conclusion

Genetic Genealogy is fun, great fun.  It is not however, a quick and easy replacement for traditional genealogy.  Unless you get lucky with some comparative Y-DNA in a project, it is not going to directly tell you the names or social status of any ancestors.  It can give you a phylogenetic tree, but not any kind of family tree that you can bore other family members with.

Genetic genealogy can provide some tools to some researchers.  It can test biological relationship.  It can be used to predict some of your ancient history.  For most researchers, particularly those that are able to interview many local family members, search local graveyards, and access archives and records - it has little or no value in the pursuit of collecting ancestors.

I personally love to explore my genetic genealogy. But it is documentary research that provides the names.  Genetic genealogy for myself, is more about the long and ancient journey.