UW researchers warn online DNA database is vulnerable to genetic data breaches – KING5.com


SEATTLE — DNA testing services are an increasingly popular way for people to learn more about their health history and ethnic heritage. But when people upload those results to third-party sites, their genetic data can be exposed to security risks.

University of Washington researchers found that GEDmatch, a third-party site that attempts to connect potential relatives based on the DNA sequences they upload, is vulnerable to multiple types of security risks.

A malicious user could use a small number of DNA comparisons to extract someone’s sensitive genetic markers, and could also create fake genetic profiles to impersonate someone’s relative, according to UW researchers.

“People think of genetic data as being personal — and it is. It’s literally part of their physical identity,” said lead author Peter Ney, a postdoctoral researcher in the UW Paul G. Allen School of Computer Science & Engineering. “This makes the privacy of genetic data particularly important. You can change your credit card number but you can’t change your DNA.”


To look for security issues, UW researchers created an account on GEDmatch.

The researchers uploaded experimental genetic profiles that they created by mixing and matching genetic data from multiple databases of anonymous profiles. 

GEDmatch assigned these profiles an ID that people can use to do one-to-one comparisons with their own profiles.

For the one-to-one comparisons, GEDmatch produces graphics that show how much of the two profiles match. One graphic shows a bar for each of the 22 non-sex chromosomes.

Each bar changes length depending on how similar the two profiles are for that chromosome. A longer bar shows that there are more matching regions, while a series of shorter bars means that there are short regions of similarity interspersed with areas that are different.
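
As a rough illustration of what such a bar summarizes, the sketch below finds the runs of consecutive matching positions between two tiny made-up profiles. It is not GEDmatch's method: the profiles, the function names and the rule that two positions "match" when they share at least one allele are all assumptions chosen for the example.

```python
from itertools import groupby

def shares_allele(a, b):
    """True if two genotypes (e.g. "AG" and "GG") have at least one allele in common."""
    return bool(set(a) & set(b))

def matching_runs(profile_a, profile_b):
    """Return (start, length) for each run of consecutive matching positions."""
    flags = [shares_allele(a, b) for a, b in zip(profile_a, profile_b)]
    runs, pos = [], 0
    for matched, group in groupby(flags):
        n = len(list(group))
        if matched:
            runs.append((pos, n))
        pos += n
    return runs

# Made-up genotypes at six positions on one hypothetical chromosome.
profile_a = ["AA", "AG", "CC", "TT", "GG", "AC"]
profile_b = ["AG", "GG", "TT", "TT", "GT", "CC"]

runs = matching_runs(profile_a, profile_b)
print(runs)  # [(0, 2), (3, 3)]
```

Here the two profiles match over one run of two positions and another of three, separated by a single mismatch. In the article's terms, that would appear as two shorter bars rather than one long one.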


Then the team created 20 extraction profiles that they used for one-to-one comparisons against a target profile that they had also created. Based on how the pixel colors in the resulting graphics changed, they were able to pull out information about the target sequence.

For five test profiles, the researchers extracted about 92% of each test profile’s unique sequences with about 98% accuracy.

“So basically, all the adversary needs to do is upload these 20 profiles and then make 20 one-to-one comparisons to the target,” Ney said. “They could write a program that automatically makes these comparisons, downloads the data and returns the result. That would take 10 seconds.”

Once someone’s profile is exposed, an attacker can use that information to create a fake profile that appears to belong to a relative.

The team tested this by creating a fake child for one of their experimental profiles. Because children receive half their DNA from each parent, the fake child’s profile had their DNA sequences half matching the parent profile. When the researchers did a one-to-one comparison of the two profiles, GEDmatch estimated a parent-child relationship.
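
The inheritance logic behind that test can be sketched with toy data. The example below is a simplified illustration, not the researchers' procedure: the `make_child` function, the six-position profile and the choice to fill the second allele at random are all assumptions, and real genotype files hold hundreds of thousands of markers.

```python
import random

def make_child(parent, rng, alleles="ACGT"):
    """Build a toy child profile: one allele copied from the parent at each
    position, the other drawn at random to stand in for the second parent."""
    child = []
    for genotype in parent:
        inherited = rng.choice(genotype)   # allele passed down from this parent
        other = rng.choice(alleles)        # placeholder for the other parent's allele
        child.append("".join(sorted(inherited + other)))
    return child

rng = random.Random(0)  # fixed seed so the example is repeatable
parent = ["AA", "AG", "CC", "TT", "GG", "AC"]
child = make_child(parent, rng)

# Every position in the child shares at least one allele with the parent,
# which is the pattern a parent-child comparison would be expected to show.
half_matches = all(set(c) & set(p) for c, p in zip(child, parent))
print(child, half_matches)
```

Because the child always carries one of the parent's two alleles at every position, the profiles half match along their whole length, consistent with the parent-child relationship the article describes.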

GEDmatch users do have the option to delete their DNA data from the site so their information isn’t stored.

Before UW researchers published their results, they shared their findings with GEDmatch. GEDmatch told the researchers that it has been working to resolve these issues.

The UW researchers’ work has been accepted at the Network and Distributed System Security Symposium. The findings will be presented in San Diego in February. 

