Published by the Students of Johns Hopkins since 1896

Genetic ancestry sites may help solve crimes

By LAURA WADSTEN | October 25, 2018

Your DNA report could help put your delinquent brother behind bars. Using data in genetic ancestry databases to identify criminals is no longer the stuff of science fiction. Investigators recently used DNA from a free online ancestry database to track down the infamous Golden State Killer, the man who killed 12 people and raped 45 women across California between 1976 and 1986.

To solve the case, police compared DNA samples from crime scenes to the profiles on GEDmatch, one of the genealogy sites that have become popular.

The suspected Golden State Killer was not in this database, but DNA that partially matched the evidence was; this similar DNA likely belonged to a relative of the serial killer. This finding helped police narrow the scope of their search down to the members of one family, allowing detectives to use conventional techniques to identify a suspect: Joseph James DeAngelo, a 72-year-old man who lived relatively close to many of the attacks.

If DeAngelo is convicted of being one of the most famous serial killers in American history, his arrest will be a testament to the power of these forensic familial DNA searches. Yet this is only one instance in which the technique proved helpful. A study published in October 2018 in Cell sought to determine how well this type of comparative forensic DNA investigation would work.

The paper presents a computational method for linking individuals in ancestry databases to those in law enforcement databases. The two kinds of databases use different systems of genetic markers (genes or DNA sequences with a known location on a chromosome) to identify individuals or species.

The researchers reported that, in a feasibility test, over 30 percent of people could be accurately matched with a close relative (a sibling, parent or child) whose profile was in the other type of database.

Noah Rosenberg, the paper's senior author and a biology professor at Stanford University, spoke about the study in a press release.

“In this study, we were trying to pose the question of whether a newer, more modern system of genetic markers could be tested against the old system and still get matches and find relatives,” Rosenberg said.

The database used by the FBI and other law enforcement agencies is known as the Combined DNA Index System (CODIS). It relies on short tandem repeat (STR) markers, which are segments of DNA where a nucleotide pattern is repeated. Ancestry databases instead look for differences in single-nucleotide polymorphisms (SNPs), which occur when one nucleotide is substituted for another. The unique combinations and locations in which STRs and SNPs appear in the genome allow individuals to be identified.

Linking individuals between the two databases rests on the idea that each STR marker is surrounded by SNPs, and the two are usually inherited together. This means a person’s SNPs can somewhat predict the neighboring STR and vice versa. When many of these small correlations are observed across the genome, it becomes possible to match an SNP profile with an STR profile.
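To make the idea concrete, here is a toy sketch in Python. It is not the researchers' method. It assumes a hypothetical reference table giving the probability of each STR allele given the neighboring SNP pattern, treats each person as having a single allele per marker, and uses invented marker names, pattern labels and probabilities throughout. It scores each STR profile by how well it fits one SNP profile and picks the best fit.

```python
import math

# Hypothetical reference table: P(STR allele | neighboring SNP pattern).
# Marker names, pattern labels and probabilities are invented for illustration.
REFERENCE = {
    "marker_A": {
        "patternX": {10: 0.7, 11: 0.2, 12: 0.1},
        "patternY": {10: 0.1, 11: 0.3, 12: 0.6},
    },
    "marker_B": {
        "patternX": {7: 0.6, 8: 0.4},
        "patternY": {7: 0.2, 8: 0.8},
    },
}


def match_score(snp_profile, str_profile):
    """Sum log P(observed STR allele | SNP pattern) over all markers."""
    score = 0.0
    for marker, pattern in snp_profile.items():
        allele = str_profile[marker]
        # Alleles missing from the table get a small floor probability.
        probability = REFERENCE[marker][pattern].get(allele, 1e-6)
        score += math.log(probability)
    return score


def best_match(snp_profile, str_profiles):
    """Return the name of the STR profile that best fits the SNP profile."""
    return max(
        str_profiles,
        key=lambda name: match_score(snp_profile, str_profiles[name]),
    )


# One SNP profile (ancestry-style data) and two STR profiles (CODIS-style data).
snp_profile = {"marker_A": "patternX", "marker_B": "patternY"}
str_profiles = {
    "person_1": {"marker_A": 10, "marker_B": 8},
    "person_2": {"marker_A": 12, "marker_B": 7},
}

print(best_match(snp_profile, str_profiles))
```

Each marker on its own is weak evidence, which is why the scores are summed: the many small correlations add up to a usable signal only when taken together.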

The study conducted by Rosenberg and his colleagues sought to extend this concept by determining whether the same method could connect close family members.

They found that when one individual had been analyzed for STR markers and the other for SNP markers, about 30 to 32 percent of parent-child pairs and 35 to 36 percent of sibling pairs could be linked. This research was funded by the National Institutes of Health and the National Institute of Justice. 

The authors explained that the study was intended to provide data to inform deliberation and discussion on forensic genetics and genomic privacy.

“We wanted to examine to what extent these different types of databases can communicate with each other,” Rosenberg said. “It’s important for the public to be aware that information between these two types of genetic data can be connected.”

The researchers highlighted the potential negative implications of this expanded capability. 

For example, populations overrepresented in law enforcement databases due to disproportionate representation in the criminal justice system are likely to produce more false identifications, potentially contributing to further overrepresentation in the system. Additionally, false-positive identifications of relatives might disproportionately affect members of populations with lower genetic diversity.

Determining the relationship between DNA samples from different databases could influence fields outside of law enforcement. Scientists could use the model developed by Rosenberg and his colleagues to compare old DNA samples with new samples of a different type. For example, ecologists studying organisms could use this approach to determine whether animals living in an area descended from animals whose DNA had been collected in the past, even if only STR data is available from the older samples. 
