U.S. precision medicine research program releases genomic data
by Paul Govern
Earlier this month, the All of Us Research Program released an initial large batch of genomic data on its cloud-based research platform, the Researcher Workbench, including whole genome sequences from 98,600 research participants and genotype data from 165,200 participants.
“Thanks to its many research participants from all 50 states, All of Us has reached a significant program milestone with this initial infusion of genomic data,” said Paul Harris, professor of Biomedical Informatics and Biostatistics at Vanderbilt University Medical Center, professor of Biomedical Engineering at the Vanderbilt School of Engineering, and principal investigator for the All of Us Data and Research Center (DRC), where the Researcher Workbench was created. “We look forward to supporting and learning from the many researchers around the country who will draw on this unique resource to conduct biomedical research and advance the understanding of human health.”
All of Us is a National Institutes of Health (NIH) precision medicine research initiative announced in 2015 by President Barack Obama. In 2016, Congress authorized $1.5 billion for the program over 10 years. With data from 329,000 participants so far, the program aims to gather data from 1 million or more people living in the U.S.
The DRC is led by VUMC, working with the Broad Institute of MIT and Harvard and Verily Life Sciences (a subsidiary of Alphabet Inc.). So far, of the initial set of 98,600 whole genome sequences, 77,000 have been paired with electronic health record (EHR) data, physical measurements and survey responses.
“We’ve developed a cloud-based platform that is bringing researchers to the data and analysis tools, while ensuring security and appropriate data use,” said another DRC principal investigator, Dan Roden, MD, professor of Medicine, Pharmacology and Biomedical Informatics at VUMC. “With this unique access model, this varied and growing data set promises to become a major resource for biomedical research in this country. Thanks are due to All of Us participants, to program leadership at the national level, and to our team here at VUMC and at our partner institutions. These genomes represent the first step to fulfilling the program’s commitment to deliver genomic data to the participants.”
In all, along with the genomic data, the Workbench currently contains survey responses from 329,000 participants, physical measurements from 267,600 participants, EHR information from 214,200 participants and data from the electronic physical activity trackers of 11,600 participants. The platform also links to data from the Census Bureau’s American Community Survey, providing details about the communities where participants live.
According to a March 17 press release from the NIH, about half of the whole genome sequences are from individuals who identify with racial or ethnic groups that have historically been underrepresented in research.
“Until now, over 90% of participants from large genomics studies have been of European descent. The lack of diversity in research has hindered scientific discovery,” said Joshua Denny, MD, MS, chief executive officer of All of Us and adjunct professor of Biomedical Informatics at VUMC. “All of Us participants are leading the way toward more equitable representation in medical research through their involvement. And this is just the beginning. Over time, as we expand our data and add new tools, this dataset will become an indispensable resource for health research.”
The DRC is supported by the NIH (5U2COD023196).