How to make the most of your AncestryDNA matches: Part 1 – Getting started

As we approach Christmas 2018, and given the massive push to have cheap DNA tests given out as gifts this season, it seems natural to finally write a series on how to make genealogical use of a DNA test you, or your loved one, may have just taken.

We’re going to start with the very basics of how DNA testing works, and walk through both how to leverage AncestryDNA to track down ancestors and how to use GEDmatch and other advanced tools to go even deeper.

Assuming you have a few weeks before the test results are in, here are a couple of things to learn and prepare before you dive into the matches.

  • First, understand that while the commercials like to highlight the joys of learning your ethnicity, DNA testing raises serious issues that will likely come up as your journey progresses. You may uncover family relationships, both inside and outside of your family, that could have serious negative impacts on people. We’ve uncovered children born outside of marriages that were never known to the family, and we know of adopted children who were outed by tests where their parents had never told them. We wrote about an example of this last year (Dangers of DNA Testing).
  • Second, the key to effectively making matches will be a good, solid family tree through the test subject’s 4x Great Grandparents. Most of your matches made will be through 3x or 4x GGP, and in a perfect world the match will also have a good tree so the link will be obvious. We can’t overstate this, or stress it enough: your success or failure in matching DNA tests from unknown relatives will rely on the quality and depth of your tree. We’ve walked through how to build a good “quick and dirty” Public tree on Ancestry (Building a good Public Ancestry.com tree – Part One: sources, citations, facts, and proof), and the process would be about the same on other sites, many of which are free.
  • It’s also important that you have the tree available publicly…many of your interactions are going to be about exchanging trees to build a match. It’s ok if you have just a skeleton tree with basic information (names, dates of birth/death, locations, children, etc.), but it will be key that you have something available publicly.

Basics of DNA

The main new term/concept you’ll need for effective Genealogical DNA research is a measure of distance: centimorgan (cM). Now, it’s not technically distance…but for all intents and purposes, it’s used as a measure of distance.

What does cM measure?

The centimorgan measures the length of DNA strands. More specifically, it will be used to measure the length of matching DNA segments between your test and a test that is a genetic match. For example, you would have roughly 6800 cM if you took all 22 pairs of autosomal chromosomes and strung them out end-to-end, and your matches will have varying levels of matching DNA, measured in centimorgans.

How do we use centimorgans to identify matches?

Since you get about 50% of your DNA from each parent, your DNA test will match a test from either of your parents at about 3400 cM. You will match a Grandparent at about 1700 cM (50% of your parent’s 50%). The more cM you share with someone, the closer a relative they are, and the more likely it is that you will confirm a match with them.
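The halving described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the 6800 cM total is the rough figure used in this post, the function name `expected_shared_cm` is made up for the example, and real matches beyond a parent will vary around these averages.

```python
# Rough total used in this post (an approximation, not an exact constant).
TOTAL_CM = 6800


def expected_shared_cm(generations_back: int, total_cm: float = TOTAL_CM) -> float:
    """Average cM shared with a direct ancestor N generations back.

    Each generation halves the expected amount: 1 = parent, 2 = grandparent, etc.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or more")
    return total_cm * 0.5 ** generations_back


for label, generations in [("Parent", 1), ("Grandparent", 2), ("Great Grandparent", 3)]:
    print(f"{label}: about {expected_shared_cm(generations):.0f} cM")
```

Run as-is, this prints about 3400 cM for a parent, 1700 cM for a grandparent, and 850 cM for a great grandparent. For cousins and other indirect relatives, use the published ranges rather than this simple halving.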

We’ll use both the chart from ISOGG (The Shared cM Project table) and an interactive version of that chart from the DNA Painter site (Shared cM Interactive Tool). Both break down the average cM to expect with various relatives, and help us identify where to look to establish a match. For example, if a match is 311cM then we can guess they match the person with the DNA test at around a 1st or 2nd cousin…which means our common ancestor is likely a Grandparent or Great Grandparent, which narrows down our search!

What’s next?

So, there’s the first part of this DNA journey. There’s a little homework while you wait for the test results, a basic understanding of how we’ll actually leverage the DNA to make matches, and why your basic genealogy and a solid family tree will be key to this process. Next week, we’ll go over what to do when you first get your DNA results!

Next installment: How to make the most of your AncestryDNA matches: Part 2 – Leveraging your strongest matches to make quicker work of your more challenging matches!

It’s time to stop giving attention to “Ethnicity” and genetic admixture

[One quick note: As always, we receive no financial benefit or consideration for any product or service we review/recommend/discuss here. Everything we discuss is our opinion alone, and we talk about it because we use it.]

Ancestry made a lot of noise recently when they updated their Ethnicity estimates, and the now intensified debate about the “accuracy of DNA tests” and the confusion among the general public make it clear: as a community of serious researchers, we need to be the voice of reason when it comes to genetic admixture and call it out for the dubiously valuable, largely inaccurate parlor trick that it is. Here’s why:

Ethnicity cannot be tested for. Ever.

Ethnicity is a social construct. Period. If we look at any test, any genealogical tree or other determination, it will not build a social link to one’s ancestral background. Michael hasn’t been to Ireland, but I have, and despite being able to trace 12.5% of my 3x great grandparents to Ireland, and Ancestry’s admixture pointing to an Irish background, I am not Irish. I visited Ireland as an American…a very obvious American. As will Michael when he visits. Nor will he be mistaken for Beninese when we visit Benin. We are Americans, some with European ancestors, some with African ancestors as well, but even with a perfect admixture that could pinpoint our ethnic ancestors exactly…we’re still not German, or Cameroonian, or English/Irish, etc. You can’t test for it, and DNA gives you no indication of how someone identifies ethnically. And that’s important, because Ethnicity is only about how someone identifies themselves and/or how others identify them…it’s not based on a gene. Neither is race, but that’s another rant for another day.

It’s not honest

All DNA testing companies, especially 23andMe and Ancestry, are for-profit enterprises that have a strong incentive to grow their number of DNA tests. The larger the test database, the more money the companies charge to sell access to your data. This isn’t to say they are selling personally identifiable data, the data is largely de-identified and aggregated, but it’s YOUR data…and it’s very, very valuable. 23andMe survives almost entirely on the revenue generated from your data, and it’s likely Ancestry is generating a large amount of their revenue from your DNA data as well. And no one’s advertising “come test with us, we are selling to great causes like Michael J. Fox Foundation” [23andMe], they are basing their sales pitch on the shiny bauble that gets the tests in the door: Ethnicity and pretty graphs. The more we play into the Ethnicity debate

It’s not our tool

Ethnicity (as determined by genetic admixture), has almost no genealogical or family history value, and the results will never break a brick wall or significantly add to your family’s stories. First, all of the major providers target who your genetic ancestors were 800-1000 years ago. Even those of us with great trees rarely go back to 1000-1200 AD…and we doubt there would be much value in anyone researching our 28th great grandparents. We have over 1 million 18th GGP’s. Admixture doesn’t rank even among the top 20 tools we use to build our trees, and it doesn’t deliver us any value.
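The “over 1 million 18th GGP’s” figure follows from simple doubling, and a short Python sketch makes the arithmetic easy to check. The function name `ancestor_slots` is ours, for illustration only. It counts positions in the tree, not distinct people, since the same ancestor can fill more than one position when cousins marry.

```python
def ancestor_slots(great_count: int) -> int:
    """Number of tree positions for your Nth great grandparents.

    Parents are 1 generation back, grandparents 2, and Nth great
    grandparents N + 2, with the count doubling each generation.
    """
    if great_count < 0:
        raise ValueError("great_count must be 0 or more")
    generations_back = great_count + 2
    return 2 ** generations_back


print(f"18th great grandparents: {ancestor_slots(18):,}")  # 1,048,576
print(f"28th great grandparents: {ancestor_slots(28):,}")  # 1,073,741,824
```

The 18th great grandparents sit 20 generations back (2 to the 20th power), and the 28th sit 30 generations back (2 to the 30th power).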

It’s not accurate, and it’s not scientific

The biggest red flag from Ancestry’s last update was this: they increased the reference samples from 3,000 tests to 16,000. They have literally spent the last 4 years selling “Ethnicity” to the general public as a great reason to build Ancestry’s test database, even though the entire house of cards was built on 3,000 reference samples. There is no statistically valid data that can be gleaned from 3,000 total samples as they relate to our genetic ancestors 1000 years ago. Again, we each had MILLIONS of ancestors 30 generations ago…and to use 3,000 for all genetic admixture just demonstrates the shoddy science that underpins this process. Even 16,000 is a ridiculously small sample…even if they were each perfectly tied to a region 1000 years ago. “Ethnicity” is just enough science to seem valid enough to be scientific…and just scientific enough to justify the pretty graphs that facilitate the selling of more tests.

It’s hurting genealogy, and it will ultimately turn the public off of genetic DNA testing

YouTube is rife with videos of the general public discussing their “inaccurate” DNA tests, with the testee going into great detail about how they know their Ethnicity and, because they see something they don’t expect, the test must be wrong. There are now new discussions everywhere with people questioning the entire testing process when the “results” can be changed so dramatically by an update from Ancestry. Ancestry is aware of the strain this update is putting on the general public, and we can see the efforts they’re making to try and calm people as they go through the update. There are explanations, surveys, etc. to try and make sure the public doesn’t freak out about this change. It’s all just adding more weight to the idea that these tests aren’t accurate/reliable. Since the entire business case for the public taking these tests has been “Ethnicity”, once that’s exposed as the subjective “art” that it is, the only reason for people to test is being questioned. We will hit a tipping point where our relatives are going to think of DNA testing as a “scam” that’s of no value/dangerous, and it’s going to make the process of getting tests that much harder.

So, what can we do? What impact can we have? Honestly, not much…at least not immediately. But, as the people serious about genealogy we can start being the voice of reason and begin to lay out a better justification for why the public should test, even if the focus of the commercial testing companies is only on adding more samples to their databases. If the thought-leaders and respected voices in the communities turn their back on genetic admixture, that will eventually drive the discussion.

To that end, here are our suggestions:

  • Stop discussing “Ethnicity” as a testable value – Push back on this basic premise and start to educate the public on why DNA tests have no value as it relates to how they identify ethnically.
  • Don’t give genetic admixture a place at the table – We should no more engage with admixture as a point of genealogical value than we would with phrenology. They both sound scientific, and their proponents would like them to be seen as science, but neither is science. Even making an anti-admixture argument elevates it to a “con” in a pro vs. con debate. We need to stop engaging in a debate of equal positions with admixture.
  • Develop other reasons the general public, and our relatives, should submit tests – The tens of millions of tests in various databases have a HUGE value to the genealogical community, and we all benefit as more tests are added. We need to voice a supportable, honest, accurate narrative to drive continued testing…one that will continue after the “Ethnicity” emperor is shown to have no clothes.
  • Be honest with our relatives as they test and help them, and the general public, understand how these tests play into the for-profit world – Those who take tests aren’t purchasing a product, they are the product. 23andMe and Ancestry need those tests to make a profit, and it’s the only reason why they offer these tests. Let’s discuss that, and what we get in return, to level set everyone’s expectations. If we don’t set these expectations, some scandal will do it for us, and when negative public opinion sets in, we likely will lose the value of having non-experts testing.

Bottom line is that we can see how the reality of DNA testing doesn’t match the perception of the testing public, and all eggs are in the “Ethnicity” basket. As that basket starts to fray, we can either be a part of the rational message that keeps this testing world moving forward, or we can be reactive and wish we could go back to the “good old days” when people were testing without us having to fight for each one.

Ancestry.com takes another step away from its genealogical roots…

We could see it coming…back in March of 2017, one of our first blog posts was about Ancestry.com’s new tool “We’re Related” (We’re Related app is a lot less frivolous than it first appears). It was a bit of a “hot take” about how it was less silly than it seemed and how it could be very powerful if it’s expanded to a tool that is predictive of your matches.

We’re Related is making suppositions based (apparently) on an algorithm that can draw the line between what you know, and what it guesses is true, to build a potential line for you. If this technology is ever leveraged against some of my brick walls instead of a gimmick like linking me to Blake Shelton, Ancestry might really be on to something.

Before we take any victory laps…we have to admit, we were incredibly naive. We never guessed that Ancestry would take this powerful technology and use it to take its worst, most frustrating feature, and make it much more dangerous.

The new feature is the “Potential Father/Mother” suggestion, and I’m going to let Carolynn ni Lochlainn detail all the challenges of this new tool, and the risks, in her SPOT ON “From Paper to People” Podcast #27 (From Paper to People: What I Hate About New). Please listen, but her upshot is that this feature is an easy way for those new to genealogy to quickly build out their trees, and the tool forces you to create the ancestor without any sources attached.

One of the biggest drawbacks of Ancestry is the Public Trees that are so often inaccurate, and are often built solely on other people’s unsourced trees. Now, it’s a certainty that these trees are going to start to mushroom, and by design have NO citations attached to the new ancestor.

The good news is that we as serious users can avoid the downfalls, and use the predictive part of this feature to do the research for us, but we must immediately attach the citations to any newly added ancestor. We, as a community, can also make sure we NEVER use a Member Tree to support a fact. You can link the Member Tree ancestor to yours, but make sure all facts are unselected before you link them. They will see your additional work, and you them, but you will not perpetuate their unsourced facts.

But Ancestry.com isn’t packed full of serious hobbyists/professionals, and “Potential Parent” is going to take the problem of Member Trees and make it explode beyond what we could have imagined. At some point, the tree feature in Ancestry is going to be unusable. Ancestry.com will continue to be a great source of primary research, but it will be nothing more than a data repository for those of us who are serious about this work.

And, back to our naivety…the most frustrating thing is that we should have known better. Again, going back to our vaults, we saw right away that AncestryDNA is here to support genealogy ONLY because it’s a good way to gather DNA tests (Dancing with the Devil: The Tradeoffs of Modern Genealogical Research). Once Ancestry realized that pushing pretty graphs and “ethnicity” was the best way to sell more tests, they pivoted and met their true goal with these tests: the largest DNA database, one that will generate a tremendous amount of revenue from drug companies, etc. who can leverage your tests to understand how their drugs might work. Ancestry now makes (or soon will make) more money from monetizing your DNA than it does from supporting our genealogical work.

[Screenshot: How did the public records “Reclaim the Records” paid to get show up here, for paid members only?]

Ever wonder why Ancestry has delivered even more accurate admixture and even prettier graphs, but none of the tools needed to do serious genealogical research? It’s because there’s no additional revenue from genealogical tools, but putting more effort into the graphs will drive more people to test, which will grow the database, and grow the revenue stream.

As a community we have to get ready to accept that Ancestry is not a partner in our work, and is not in business to support us or our needs. They exist to generate revenue, and as long as that interest and ours intersect, we’re good, but as they make more money from other streams they are going to sacrifice our needs to focus on revenue. You’re already seeing that with things like “Potential Parents”, more admixture, and new collections that take public records gathered at great expense by groups like Reclaim The Records and put them behind the paywall.

The genealogy features of Ancestry are still there, for now, but the bad Member Trees we suffer through today are likely going to be remembered as the golden age of online genealogy research.

 

Matching unmatched DNA matches by Casting a Wide Net, Part 6 – Our crazy attempt to leverage 288 DNA matches to expand our tree comes to its conclusion

In the five previous parts of this series: We identified a plan to tackle what looked like a large group of DNA matches (Part 1), we went through and tagged all 288 of our Ancestry DNA results that were related to a group of matches that had Woodley/Woodson surnames in their attached trees (Part 2), we then built out a common tree for as many of the matches as we could, to nail down common ancestors, and to gain clues on where these matches link up with our tree (Part 3), we used GEDmatch and DNApainter to target the most likely line of “Mary’s” that leads from her to the group of 12 DNA matches (Part 4), and last week we broke through a brick wall with some old fashioned genealogy (Part 5). In this installment, we wrap up the story of this journey and the lessons we’ve learned. 

As we ended our last installment, we’d identified Sam Caswell’s wife as Annie (Moore) Caswell, daughter of Robert Moore and Henrietta (Bradford) Moore. We were able to quickly identify Henrietta’s mother, Sallie Bradford, and five of Henrietta’s siblings. It was amazing: the links came easy, and the tree fell into place just how you’d hope. The only problem was…we weren’t getting any closer to linking Roman and Mary Jones to “Mary”.

Going back to our work with the “What Are the Odds?” tool (Part 4), it’s 48 times more likely that “Mary” and Roman/Mary’s Most Recent Common Ancestor was our “Mary’s” 3x Great Grandparents than her 2xGGP, and 77 times more likely that it was her 3xGGP than her 4xGGP. That means Annie (Moore) Caswell’s parents all but needed to be the MRCA. One thing became increasingly clear as we fleshed out our tree with the new information: Sam and Annie weren’t a link to Roman and Mary Jones.

Roman Jones was born around 1840, and his wife Mary was born around 1838. Annie (Moore) Caswell’s parents were both born around 1880, and for them to share parents with Roman or Mary would be…incredible. We looked back a generation (hoping to defy the 48 times odds!), and the lines still didn’t match. We had good info on “Mary’s” 4xGGM Henrietta Bradford and her siblings…and while we couldn’t rule it out completely, it was very likely she wasn’t a link to the Joneses either.

We went back to review everything we had on Annie Caswell, and in the 1910 U.S. Census it jumped out at us: Sam and Annie listed themselves as having no children, despite the fact that Mattie would have been 7 years old. She also indicated that she never had children. 

[Image: Sam and Annie in the 1910 U.S. Census]

When we looked at our notes and research, we realized we had fallen into the most basic trap in genealogy research: we had accepted family lore as fact, and built around that “fact”. We had an uncle who had done some basic Ancestry-based research, and when we first built out a skeleton tree, we’d used his info as the bones of the Caswell line. We had all the right facts on Mattie Caswell, we had all the right facts on Sam Caswell and Annie (Moore) Caswell…but we’d never proven their link. We went back and reviewed the transcripts of other family interviews we’d done with Mattie’s granddaughter (and others) about 4 years ago and there it was. They described that Mattie’s mother had died soon after Mattie’s birth, and her father died soon after. Mattie had been raised by others, her parents weren’t Sam and Annie, and the brick wall we’d broken through wasn’t ours…in fact it wasn’t anyone’s, since they never had children who would be researching their ancestors.

So what did we learn in all of this?

  • The crazy strategy of casting a wide net across 288 DNA matches worked…even though it was a LOT of work.
  • We identified a key ancestor, and we know where we can expect the MRCA to fall in our line once we know more about our line.
  • In the end, no matter how high-tech genealogy research becomes with DNA, it still comes back to the basics of a solid tree, with strong evidence, supported by old fashioned family history research. Without a solid tree, we can’t take full advantage of DNA links. 

This journey also highlights the paradox of genealogical DNA: Your matches will come easiest on lines where you have a complete and accurate tree, but you’ll struggle to match those that are on the lines where you really need the help of DNA…because you don’t have a complete and accurate tree.

For us, it’s back to the drawing board. We’re spinning off the branch of the Caswell tree for Sam and Annie that we’ve documented so well, and making it Public so others can benefit from our work. We’re attempting to identify more information from family on where/when George Barnes and Mattie (Caswell) Barnes died, so we can get their Death Certificates and begin working backwards again!

Matching unmatched DNA matches by Casting a Wide Net, Part 4 – Proving the matches, and establishing a theory of connection

In the first three parts of this series (Part 1, Part 2, Part 3), we went through and tagged all 288 of our Ancestry DNA results that were related to a group of matches which had Woodley/Woodson surnames in their attached trees. We then built out a common tree for as many of the matches as we could, to nail down common ancestors, and to gain clues on where these matches link up with our tree. In this installment, we leverage GEDmatch, and deductive reasoning, to identify where we think our tree will link up with their trees.

The largest DNA matches (by centimorgans) we identified in Ancestry had also uploaded their DNA results to GEDmatch, so we were able to do tests to confirm they all truly matched. The “One-to-One” matches for each of them confirmed they were all related to “Mary”. It’s not scientific to say that all 288 were actual DNA matches, but we know the core group of matches are and that a good number of the matches-of-matches are likely also legit.

We were also able to use GEDmatch to identify the “true” cM match amounts between various matches, and from there we leveraged the International Society of Genetic Genealogy’s (ISOGG) table showing cM ranges and averages between various family relationships (Shared cM Project – V3). The closest match for Mary was “W.W.”, and we settled on 133cM as their match level. The most likely relationship for that level of match was with a shared Great Grandparent, with W.W. likely being a 2nd Cousin, or a 2nd Cousin Once Removed. When we fleshed out the other 12 matches on paper, they all roughly fit this notion that they matched either Mary’s Great Grandparent or Great-Great Grandparent.

[Chart: Inheritance pattern for females (X DNA)]

The other thing that jumped out at us, unexpectedly, from GEDmatch was that some of the 5 matches there had X DNA matches as well. X DNA is tricky, but its important use is ruling out relationships that CAN’T be right if you share X DNA. For example, a male only inherits X DNA from his mother (a female inherits it from both parents), so if you have an X DNA match that you’re theorizing is related to someone, but there are two male relatives in a row between the two matches, that isn’t possible.
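The “two males in a row” rule can be written as a quick check. This is a minimal Python sketch, assuming you describe one line of descent as a list of sexes ("M" or "F") running from the ancestor down to the person tested. The function name `x_dna_can_pass` is hypothetical, not something from GEDmatch or any other tool. It relies only on the fact that a father passes an X chromosome to his daughters but never to his sons.

```python
def x_dna_can_pass(line_of_descent: list[str]) -> bool:
    """Return True if X DNA could travel down this line of descent.

    line_of_descent lists the sex of each person from the ancestor to the
    tester, e.g. ["F", "M", "F"]. A father-to-son step (two males in a row)
    breaks the chain.
    """
    for sex in line_of_descent:
        if sex not in ("M", "F"):
            raise ValueError(f"expected 'M' or 'F', got {sex!r}")
    for parent, child in zip(line_of_descent, line_of_descent[1:]):
        if parent == "M" and child == "M":
            return False
    return True


print(x_dna_can_pass(["F", "M", "F"]))  # True: mother to son to daughter
print(x_dna_can_pass(["M", "M", "F"]))  # False: father to son blocks X DNA
```

For an X DNA match through a common ancestor, both lines of descent (the one down to you and the one down to your match) would each need to pass this check.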

Once we built out the theoretical map between all the matches and Mary, it all fit that her GGP’s could be Roman and Mary Jones, and with the DNA levels and inheritance pattern of X DNA it’s likely that Mary’s relative was a daughter of Roman and Mary. It also pointed strongly to the match being on her mother’s side.

 

[Screenshot: “What are the Odds?” gives you the chances of various hypotheses]

The day after we did our work on paper with the ISOGG chart as our guide, DNA Painter introduced a new tool called “What are the Odds?” that does the same work we just did on paper! It’s easy, it’s awesome, and we’ll cover it in more detail in a future post. But, most importantly, it showed us that it was 77 times MORE likely that Roman or Mary Jones’s parents are our Most Recent Common Ancestor than anyone else. It’s technically possible that our Mary is directly descended from Roman and Mary Jones, or that they are connected by 4xGGP’s…but it’s much, much more likely we’re looking for Mary’s 3xGGP’s, the parents of Roman and Mary.

Looking at Mary’s tree, she of course has 2 maternal GGM’s. For one, we have some documentation (mostly Census info with Ancestry Member Trees), but for the other we had almost no information. We’re guessing the one we have little information on, Annie Caswell, might be the best lead, so we’re going to dig into her.

Samuel and Annie Caswell were born, married, and died in and around Crowder, MS. Family lore has Sam and Annie as Mary’s grandparents, but we only have Annie’s Census birth date, and no maiden name for her. About the only piece of hard information we had was that Samuel might have died in July 1974 (based on the SSDI).

Time for some old-school genealogy, to hopefully prove out the high-tech theory that points to Annie Caswell being on the Jones line.

Next in the Series: Matching unmatched DNA matches by Casting a Wide Net, Part 5 – Rolling up our sleeves and doing some genealogy