Disease Prediction Scores Based on DNA Have a Racial Bias, Study Finds
Scores for predicting disease risk based on genetics are on average almost five times less accurate for people of African ancestry, two times less accurate for East Asians, and 1.6 times less accurate for Hispanic or Latino Americans, compared to people of European descent, a paper published Friday in Nature Genetics has found.
The paper highlights the pitfalls of a potentially powerful genetic tool, polygenic scores, for predicting a person's risk of diseases such as coronary artery disease or type 2 diabetes.
"This isn't a few percent here and there; this is severalfold difference in prediction accuracy so it's a pretty big issue," the paper's lead author Alicia Martin told me over the phone.
Accuracy in disease prediction is important, Martin said, because these scores may be used to guide clinicians' decisions about which patients ultimately receive a particular treatment, such as cholesterol-absorbing medications for heart disease.
Previous studies have identified many genetic variants that are linked to diseases. Polygenic scores combine this information into a single number that estimates a patient's genetic risk of developing a disease, or their predisposition to a particular trait such as weight. The higher the polygenic score, the more at-risk the person is.
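To make the idea of combining many variants into one number concrete, here is a minimal sketch of a score computed as a weighted sum. It is an illustration only, not the method used in the study: the function name `polygenic_score`, the three variants, and their effect sizes are all hypothetical, and real scores draw on thousands to millions of variants with weights estimated from large genetic studies.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Return the weighted sum of risk-allele counts.

    allele_counts: copies of the risk allele a person carries at each
                   variant (0, 1, or 2).
    effect_sizes:  one weight per variant; in practice these are estimated
                   from earlier genetic studies (hypothetical values here).
    """
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("need exactly one effect size per variant")
    return sum(count * weight for count, weight in zip(allele_counts, effect_sizes))


# Hypothetical weights for three made-up variants.
effect_sizes = [0.5, 0.25, -0.125]

# Two hypothetical people with different genotypes at those variants.
person_a = [2, 1, 0]
person_b = [0, 1, 2]

score_a = polygenic_score(person_a, effect_sizes)
score_b = polygenic_score(person_b, effect_sizes)
print(score_a, score_b)
```

In this toy setup, person A scores higher than person B and would be flagged as more at-risk. Because the weights are learned from whichever population was studied, a score built this way may transfer poorly to people whose ancestry is underrepresented in that data.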
The study used genetic data from around 500,000 individuals stored in the UK Biobank to calculate prediction scores for 17 traits, including height, body mass index, and type 2 diabetes. Across all traits, the scores were more accurate for Europeans because the Biobank database is almost exclusively made up of genetic data from people of European ancestry, Martin told me.
Fewer than 10 percent of people represented in the database are of African, South Asian, East Asian, and Hispanic or Latino ancestry, according to the study.
Martin said the findings reinforce the need for more diversity in genetic studies. She noted that 79 percent of participants in genetic studies since 2008 were of European descent. "This is far out of line with the global population where European descent individuals make up about 16 percent," she said.
The authors say diversity is lacking across most genetic studies worldwide and the percentage of non-European study participants overall has declined since 2014.
Underrepresentation at the study stage could widen the gap in healthcare between white and Black populations, the study warns.
When the authors re-ran their calculations using biobank data from Japan, the accuracy of the scores jumped by 50 percent.
Several projects are currently underway to diversify genetic databases. The National Institutes of Health's All of Us project, for example, seeks to gather data from one million people in the United States from different lifestyle, environmental, and biological backgrounds.
However, the project has been criticized by Indigenous leaders for bypassing tribal consultation. There have also been concerns that genetic information could be made public and that individuals may lose their privacy protection under the Health Insurance Portability and Accountability Act (HIPAA).
Creating diverse genetic datasets will require a large systemic change, said Martin. "There's going to need to be some cultural reckoning with some of the earned mistrust that's happened in the biomedical establishment and in law enforcement," she said.
"These are not easy issues to change, but in general we need much more diverse representation to have equitable use of this type of genetic technology."