Leeds Method

DNA Color Clustering: Does It Work with 4th Cousins?

When I first developed DNA Color Clustering, I thought Ancestry.com’s “4th cousin” matches would not share enough DNA to be helpful. I was afraid the resulting chart would be too messy.

I was wrong!

The key? Create the clusters based on 2nd and 3rd cousins (who share between 90 and 400 cM), and then add high “4th cousins” into these already created clusters. 

Step 1: Create a DNA Color Cluster chart using the Leeds Method.

REMINDER: Use AncestryDNA’s 2nd & 3rd cousin matches who share less than 400 cM with the test taker to create a color cluster chart in Excel. (See “The Leeds Method” for more details.)
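
If you keep your match list in a script instead of a spreadsheet, the selection in this reminder could be sketched as below. This is only an illustration: the names and cM values are made up, the function name `select_for_clustering` is hypothetical, and treating 90 cM as included and 400 cM as excluded is an assumption about the boundaries.

```python
# Hypothetical match list: (name, shared cM). Names and numbers are invented.
matches = [
    ("Alice", 1750.0),  # too close for the chart (400 cM or more)
    ("Ben", 310.0),
    ("Cara", 95.0),
    ("Dev", 62.0),      # a "4th cousin" level match, added later in Step 3
]


def select_for_clustering(matches, low=90.0, high=400.0):
    """Keep matches sharing at least `low` cM and less than `high` cM.

    The 90 and 400 cM defaults come from the post; whether each
    boundary is inclusive is an assumption made for this sketch.
    """
    return [(name, cm) for name, cm in matches if low <= cm < high]


selected = select_for_clustering(matches)
print(selected)
```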

Example of Color Clustering in which the test taker’s DNA matches sorted into four clusters plus one unclustered purple match, Drew

If a test taker’s 4 grandparents are not closely related and descendants from all 4 sets of great grandparents have tested at AncestryDNA, DNA Color Clustering should result in 4 columns which are related to the 4 sets of great grandparents.

Note: In this example using real data, Mona (in red print) sorted into TWO columns. She is likely related to the test taker through BOTH the yellow and orange families.
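
For readers who think in code, here is a rough sketch of how the color columns come about. It assumes the usual chart-building routine (give the first uncolored match a new color, then give that same color to everyone in that match's Shared Matches, and repeat). The function name `leeds_clusters`, the data layout, and every name except Mona and Drew are hypothetical, and the shared-match lists are invented rather than taken from the real chart.

```python
def leeds_clusters(ordered_matches, shared):
    """Assign cluster numbers to 2nd/3rd cousin matches.

    ordered_matches: names, highest shared cM first.
    shared: dict mapping a name to the set of names in their Shared Matches.
    Returns a dict mapping each name to the set of cluster numbers it got.
    """
    clusters = {name: set() for name in ordered_matches}
    next_id = 1
    for name in ordered_matches:
        if clusters[name]:
            continue  # already colored by an earlier row
        clusters[name].add(next_id)
        for other in shared.get(name, set()):
            if other in clusters:  # only people who are on the chart
                clusters[other].add(next_id)
        next_id += 1
    return clusters


# Invented example: Mona shares matches with two different families.
ordered = ["Ann", "Bob", "Mona", "Drew"]
shared = {
    "Ann": {"Mona"},
    "Bob": {"Mona"},
    "Mona": {"Ann", "Bob"},
    "Drew": set(),
}
result = leeds_clusters(ordered, shared)
for name in ordered:
    print(name, sorted(result[name]))
```

In this made-up data, Mona lands in two columns, much like the real Mona above, and Drew ends up alone in a column of his own.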

Step 2: Identify the Color Clusters, if possible.

The “C” stands for “cluster”

In this case, we were able to determine the relationship of the test taker to the 4 clusters (C1 through C4). We did not identify the unclustered purple match, Drew.

Note: If you cannot identify some (or any) of these groups, you can skip this step for now.

Step 3: Add some 4th cousin matches and sort those 4th cousins into the already created Color Clusters.

The 10 names in gray are listed as 4th cousins by AncestryDNA. I sorted these into the already existing Color Clusters.

Below the original Color Clustering, I wrote the names of the test taker’s first ten “4th cousin” matches (in gray boxes). For each person, I opened the Shared Matches and looked to see which 2nd and 3rd cousin names they matched with and assigned them into that Color Cluster.

Teresa, in red, did not have any 2nd or 3rd cousins in her shared matches. She has not been assigned into a Color Cluster at this point.

Note: This is not proof that they are related to that branch of your family, but it is a strong clue.
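
The sorting in Step 3 can also be sketched in a few lines of code. This is a minimal illustration, not a finished tool: the cluster labels, the function name `assign_fourth_cousin`, and all names other than Mona and Teresa are hypothetical, and the shared-match lists are invented.

```python
# Clusters already built from the 2nd and 3rd cousins (invented data).
clusters = {
    "Ann": {"C1"},
    "Bob": {"C2"},
    "Mona": {"C1", "C2"},
}


def assign_fourth_cousin(shared_matches, clusters):
    """Return every cluster that appears among a 4th cousin's Shared Matches.

    shared_matches: set of names shown as Shared Matches for the 4th cousin.
    An empty result means no 2nd or 3rd cousin was found there.
    """
    assigned = set()
    for name in shared_matches:
        assigned |= clusters.get(name, set())
    return assigned


# Invented Shared Matches for two 4th cousins.
fourth_cousins = {
    "Gail": {"Ann", "Zed"},  # Zed is not on the chart, so he adds nothing
    "Teresa": {"Zed"},       # no 2nd or 3rd cousins among her Shared Matches
}
assignments = {
    name: assign_fourth_cousin(sm, clusters)
    for name, sm in fourth_cousins.items()
}
for name, found in assignments.items():
    print(name, sorted(found) if found else "unassigned")
```

As the note says, an assignment made this way is a clue about which branch a match belongs to, not proof.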

Step 4: Sort 4th cousins who do not match a 2nd or 3rd cousin by looking at the Shared Matches of the 4th cousin’s closest match for possible 2nd or 3rd cousin matches.

One 4th cousin match, Teresa (in red print), did not have any 2nd or 3rd cousins among her shared matches. But when I opened the shared matches of her closest match, Mona was among them. Since Mona is in the Orange & Yellow Clusters, Teresa was assigned to both clusters.
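
Step 4 adds one more hop, which could be sketched like this. Everything here is an assumption made for illustration: the function name `assign_via_closest_match`, the data layout, the people named Uma and Vic, the cM values, and the reading of "closest match" as the Shared Match with the highest shared cM.

```python
def clusters_among(names, clusters):
    """Union of the clusters of every charted person in `names`."""
    found = set()
    for name in names:
        found |= clusters.get(name, set())
    return found


def assign_via_closest_match(person, shared, cm, clusters):
    """Try the person's own Shared Matches first, then one hop further.

    shared: dict of name -> set of names in their Shared Matches.
    cm: dict of name -> shared cM, used to pick the closest Shared Match.
    """
    direct = clusters_among(shared.get(person, set()), clusters)
    if direct:
        return direct
    candidates = shared.get(person, set())
    if not candidates:
        return set()
    closest = max(candidates, key=lambda n: cm.get(n, 0.0))
    return clusters_among(shared.get(closest, set()), clusters)


# Invented data: Teresa has no charted cousins among her own Shared Matches,
# but her closest Shared Match (Uma) shares a match with Mona.
clusters = {"Mona": {"orange", "yellow"}}
shared = {
    "Teresa": {"Uma", "Vic"},
    "Uma": {"Mona", "Teresa"},
    "Vic": {"Teresa"},
}
cm = {"Uma": 60.0, "Vic": 35.0}
teresa = assign_via_closest_match("Teresa", shared, cm, clusters)
print(sorted(teresa))
```

Because this link is one step further removed than in Step 3, it is reasonable to treat the result as a weaker clue.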

NOTE: One of the best things about this method? Your matches do NOT have to have FAMILY TREES, yet this method STILL WORKS!

41 thoughts on “DNA Color Clustering: Does It Work with 4th Cousins?”

  • Barbara

    I’m working on color coding to solve the unknown parents of my ancestor born in 1864, and abandoned to the Newark foster home by the age of 6. As well as the DNA dad of an adopted person born in 1959, who is of course related to him (and not his wife) on her unknown father’s side. So far my chart resembles a falling waterfall, with many steps but not a lot of overlap. I mostly eliminated my paternal matches from the chart as this is a maternal issue from my side. So far no aha moments but I will keep working on it.

    • Hi, Barbara. I have two presentations this weekend, but early next week I’d be happy to look at your chart and see if I can tell what’s going on!

      • Jamie Crawley

        When you say you are concerned about adding 4th cousins with only small amounts of DNA, what would you class as a “small” amount?

        • Hi, Jamie. That post was written shortly after I introduced the method. I just updated it and removed that sentence. However, it is a good genealogy practice not to rely on matches who share 7 cM or less. Hope this helps!

          • Jamie Crawley

            Thank you very much.

            I’ve asked you this because I have been using AncestryDNA’s coloured dot system. I have now created 8 groups for my mother using this system; there are 23 at my disposal.

            Is it worth using them all?

          • I have never used all 24 colors in any of the kits I administer, but I would use them if I needed to! It’s a great way to organize your matches. So, I would use as many as you need!

    • I’m just beginning the Leeds layout. Your tree resembles mine.

  • Ashleigh Watson

    When I did this, I had a 3rd cousin that did not match any of my first 10 fourth cousin matches. When I do my 2nd/3rd cousins, I get 8 different colors.

    • Do you have any overlap with the 8 different colors? (See my most recent post.) If your 2nd/3rd cousin matches are actually all 3rd cousins, you might have sorted yours into 8 sets of great, great grandparents… which is great!

      • Ashleigh Watson

        So I actually made a mistake: I have 7 different colors. 4 yellow, 4 red, 4 blue, 4 green, 2 blue/green, 2 dark blue, 1 purple, 1 peach. I only have 3 2nd cousin matches; the rest are 3rd cousins. I do not think I will get a 2nd or 3rd cousin match for my grandfather on my mom’s side since he immigrated to the USA in the 1960s. When I do the first 10 fourth cousin matches of mine I get 1 yellow, 2 red, 5 green, 1 red/yellow and 1 purple.

  • Pamela Willis

    I’m so happy to have found your method, but I have a few questions and I’m in need of advice. I’m using my paternal aunt’s DNA, as my dad has passed away, so that it will be one generation earlier. We have a brick wall for her maternal great grandfather that we have been working on for several years. He is a Sexton, but to complicate matters even more, my paternal uncle’s YDNA came back Harris. My aunt paid for several of their male cousins to test, descending from her “Sexton” grandfather and his brother. They all came back with Harris DNA; however, we have many, many 4th cousin Sexton matches to a large line that we have traced back to the 1700s. We have not found where Polly’s g grandfather fits in. We are assuming that perhaps her gg grandmother was a Sexton & her gg grandfather was a Sexton descendant, because there are so many DNA matches, all leading back to the same Sexton couple.
    The first spreadsheet I did came back with 10 columns of colors. Here are my questions:
    1. Polly’s maternal grandfather was first married to her great aunt. They had a lot of kids, then he married and had many kids with the great aunt’s niece (Polly’s grandmother). I’m thinking that the below 400 cM rule won’t work for our chart, as Polly shares a bloodline with all of the children from his first marriage, too, from both parents. Many of these show less than 400 cM. Should I eliminate any cousins, even if they are well below 400 cM, if I know they descend from her grandfather and his first wife?
    2. Polly is in her 80s, so some of her cousin matches are from younger generations that descend from her maternal grandfather and grandmother. They are less than 400 cM, as well. Should I eliminate them?
    3. Help! Any suggestions on sorting this mess out?
    Thanking you in advance for your advice.

    • Hi! When you say they all came back as “Harris DNA”, what do you mean? And at what level did you test?

      If you know some matches descend from the grandfather and his first wife, yes, I would leave those off of your chart.

      Dana

      • Pamela Willis

        Thank you, I will do that and try it again.
        I’m sorry, I should have been more clear. I was trying to get it in without writing a book. My uncle is a Willis, but Sexton is his maternal line. After Harris matches showed up as his maternal line, Polly had every “Sexton” male descendant of her maternal grandfather and his brother that she could find, and who agreed, take YDNA tests. They all are part of “Harris Group 8”. I’m pretty new to DNA research, but not genealogy research. I’m not sure what level they tested at. I can ask my aunt if you need more specific answers.
        https://www.familytreedna.com/groups/harris-ydna/about/results

        • I was just wanting to make sure you didn’t see Harris as the closest match and decided it must be a Harris Y chromosome instead of Sexton. It sounds like you’ve been a lot more thorough!

          In this case, I would work with those you know and see how others “cluster” around them. Then work with surnames. If you aren’t finding surnames in common at the 4th generation level (great grandparents), then try the 5th generation (2x great grandparents).

          • Pamela Willis

            Thank you so very much. I will work on it & let you know how it goes.
            Truly appreciate it.

  • Susan Mullikin

    What happens if you try to extend beyond the 4th cousin matches? I am going to work on a couple of brick walls using the 2nd-3rd plus 4th cousins. I will let you know the results!
    Thanks

    • Susan, When working with 4th cousins, I think the key is to sort them into the Color Clusters you’ve already created. Please let us know how your experience goes!

  • I found creating a spreadsheet with 2nd and 3rd cousins using your method so helpful that, before I read this, I expanded it to 4th cousins. My cutoff is an arbitrary 40 cM. I am working on a project to identify the 3rd great grandfather of an ancestor of my husband, and 4th cousin “leads” are important to me. I have 10 clusters. Two columns are distinct from the others. Three columns appear connected and are shaded in different shades of blue. Two appear related and are different shades of green. The last 3 appear connected and are different shades of purple. Adding the 4th cousins is giving me multiple interesting matches to a line that I already knew matched on Y-DNA (as high as 109 of 111 markers). What is interesting about this is that, as I am examining matches’ trees and creating “quick and dirty” trees for those that have none; the atDNA is pointing me to a match with descendants of a particular ancestral couple. Thanks so much for creating this tool!

    • Cheryl, That’s wonderful!!! Please let me know if you are able to positively identify this 3rd great grandfather. 🙂
      Dana

  • Pingback: Find Unknown Biological Parents Using Ancestry DNA - Shannon in the Country

  • Wendy W

    I am working on adding in my 4th cousins but have a question. When adding a 4th cousin, what if one of the shared matches is in more than one color group? For example, one of my 4th cousin matches is to a 2nd cousin who is in three color groups. Do I add that 4th cousin to all three color groups? I wasn’t sure if that was a safe assumption to make…

    • Hi, Wendy. First of all, if a 4th cousin matches more than one color group then you should add that 4th cousin to all of the groups. However, if you have a lot of groups it is likely that you need to consolidate your groups. If you want me to take a quick look at your clusters, I can let you know how I would consolidate them. Just email me: drleeds@sbcglobal.net

  • I know my father; he passed away when I was 9. I never met any of my kin on my father’s side. I think he came from Kentucky: Ray Shipley. He had a brother named Raymond.

    • Best wishes on your research!

  • Kate Zumpano

    Hi Dana. FIRST of all, you’re brilliant. As a visual learner, this method using Excel (as an accountant LOL) is A. MAZING.

    2nd, I’m wondering what you would call 2nd-3rd cousins, then 4th. What cM ranges are you using? I know below 7 is not recommended. Using this “Second Leeds” Sort, am I going from say, 150-399, and then say, 20-149? Not taking half-relatives and endogamy into consideration (I’ve got the former, like probably all of us, but not the latter thank the LORD), would this be a good breakdown? Or what would you recommend based on your experience?

    Thank you!

    • Kate, Thank you so much! As you could probably guess, I’m a visual learner, too. And my degree was in biology education, but I worked as an accounting assistant. 🙂

      Although the range is greater, I have found for *most people* that the range of 400 down to 90 cM fairly safely includes 2nd and 3rd cousins. Ancestry used to cut off their 3rd cousin predictions at 90 cM which is how I chose that number. The 400 cM was from trial and error.

      My suggestion would be to start with the 2nd & 3rd cousins (or 400 to 90 cM if you’re unsure). Then I’d add in all higher matches. And then I’d start slowly adding in anything less than 90 cM. OR I’d work with a specific cluster – for example, one you’ve identified as your mom’s mom’s or one you’ve identified as “unknown” – and add in the Shared Matches for the people in that cluster. Then start fitting the pieces together and see how everyone connects.

      Hope this helps!
      Dana

  • William G Gessner, Jr

    What is the lowest number of cM you would use as a cutoff for the Ancestry “Distant Family” fourth cousins?

    • Hi, William. Ancestry “4th cousins or closer,” which are also called “close family,” is cut off at 20 cM. And they now cut off their “distant family” at 8 cM, though those who saved their smaller matches go down to 6 cM.

  • Hi,
    I’m just starting this process and I have a 4th cousin who matches with 40 cM, but I don’t have any 2nd-3rd cousins in common with them. I’ve also checked the shared matches of their closest matches, and again no 2nd-3rd cousin matches. Any advice on how to figure out which line this match may be from? Thank you.

    • Hi, Amy. I would suggest you just work with that 40 cM match and whatever Shared Matches they have. Work on identifying common places, surnames, or people on their trees. You will likely have to build trees out also.

      Hope this helps!
      Dana

  • Keri Janssen

    Hi Dana! I have already separated all of my Ancestry DNA matches into groups. My 2nd great-grandfather (b.1823) was an orphan and there are a bunch of 4-6th cousin matches that don’t descend from him that match all of us who do (I have separated those that match his wife’s line). I have been building trees for these “unplaced” matches and have placed them on a basic family chart. Most connect to (and overlap) one of the 3 surnames that have appeared. Would doing a Leeds Method chart help me at this range to sort these matches further? Where would I start? Thank you so much!

  • Clinton Mann

    Hello Dana. I have created a Leeds Chart using my Ancestry.co.uk test data but only have six matches with 2nd or 3rd cousin status sharing between 90 and 150 cM. But I do have other matches in this range with people who feature on other sites where my data is inputted (e.g., MyHeritage, LivingDNA, FamilyTreeDNA, etc.). Can I input their data onto my Leeds chart? And if so, how? Thanks. Clinton Mann

    • Hi, Clinton. Hi, Judith. First of all, my sincere apologies! I haven’t blogged (before today) in quite a while and somehow stopped receiving comment notifications and didn’t notice. I’m catching up tonight!

      Unfortunately, there’s not a great way to combine the matches from multiple sites into one spreadsheet. One thing I sometimes recommend, though, is figuring out which grandparent your matches are related to. Then you could make one combined spreadsheet showing your DNA matches at the various sites and which grandparent they are related to.

      Best wishes! I know it’s often harder to work with DNA matches outside the U.S.
      Dana

  • Debby McKenna

    Hi Dana,

    You recommend high “4th cousins”; what number of cM would work? My husband has nearly 21,000 DNA matches! I am trying to figure out who his 3x Great Grandfather is. Thank you.

    Cheers, Debby.

    • Hi, Debby. Somehow I stopped getting notifications of comments so will have to check that out. There isn’t a magic number. Go as low as necessary! You will just create more and more clusters.

      To identify a 3x great grandfather, you want to find people who are descended from his child that is the 2x great grandparent. Then look at the Shared Matches of those identified matches.

      Best wishes!
      Dana

  • Ellie

    Hi Dana, I tried your method on lower cM matches and it had some interesting results. I had used it previously to identify a lot of my tree, but still had a brick wall with a recent known ancestor – felt like I was chasing a ghost!

    So, I thought I would try to sort my clusters more widely.

    It took me a few tries, but the sweet spot is matches from just under 100 cM down to 40 cM. When I did this, I ended up with MANY clusters and it became a big mess. I thought it was a mistake but I sorted it and realised… when I added my nearer matches in to the already existing clusters, I was able to begin to predict where my further matches came into my tree because I could see who else was sharing a common ancestor aside from the common ancestor of my nearer matches (hope that wording makes sense). In the end I had 28 clusters and I used them to figure out how a lot of my matches were related to me. Some of them didn’t even have trees.

    I think having a large tree (one that you know reasonably well) to start with is key for this to work; it would’ve simply been too overwhelming otherwise. Likewise, working with sub-40 cM matches was hit and miss. I had a few in the 30 cM range who I know for certain are closer to me, and some in the 200 cM range who I can’t pinpoint to a family. DNA distribution is pretty uneven sometimes. I recommend checking lower matches’ trees for recurring names, even some 20 cM matches. It is a bit like a needle in the haystack, but if you have a particularly elusive brick wall, it might help. It might not.

    But I had fun testing the limits of it!

    • Ellie, Thanks for sharing! That’s a great idea! And I think 40 cM is likely a great point to work towards.

      Do you have the enhanced shared matches (Shared Matches Pro tool on Ancestry)? It has helped me identify a lot of my matches I hadn’t been able to before. Here’s a link to a video I did: https://youtu.be/IP6X1Fg7GcE?si=Y6NoR0ROilmRD5v5

      Dana

  • Marcus

    Dana, your method is incredibly helpful. As a visual learner, I wonder whether you have considered demonstrating this enhanced process in a YouTube presentation? I, like most of us, am breaking brick walls. I’m using my paternal grandmother’s DNA to discover her 2nd great grandparents.
    Thanks!

