The All of Us research project was launched by the National Institutes of Health in 2015 with a lofty goal: to build a database of the fully sequenced genomes of at least 1 million Americans of diverse backgrounds that can then be used by scientists to improve diagnostic and drug development, clinical trial recruitment and our overall understanding of human disease.
To date, nearly 660,000 people have registered to participate, and about half as many have completed the initial steps of giving researchers access to their electronic health records, completing health surveys and donating at least one blood, saliva or urine sample to the project’s biobank at the Mayo Clinic.
Half a decade after recruitment began, researchers can begin reaping the fruits of that work, as the program this week unveiled a database representing about 10% of its ultimate goal.
Scientists across the U.S. are now able to access that data, which amounts to nearly 100,000 whole genome sequences, about half of which come from people representing racial and ethnic groups that have been historically underrepresented in medical research.
Researchers interested in using the All of Us data for their own work can register online. From there, once they’ve completed mandatory training on ethical use of the de-identified information, they’ll have access to the program’s Researcher Workbench.
Within the cloud-based platform, they can analyze participants’ whole-genome sequencing data, as well as information from their electronic health records, Fitbit devices and survey responses, plus socioeconomic factors collected from census data integrated into the system. The database also includes the genotyping arrays of more than 165,000 individuals.
The project’s researchers are hoping that this combination of demographic, socioeconomic and health data will help scientists better understand the impact of both genetics and the environment on disease, and therefore improve the development of more precise treatments for those diseases.
“There is a unique depth and dimensionality to the All of Us platform that sets it apart from other resources in the field. It’s also designed with team science in mind, allowing researchers to explore topics in an open and collaborative way,” said Gail Jarvik, M.D., Ph.D., a principal investigator at one of the program’s sequencing centers at the University of Washington.
“As the Researcher Workbench matures, it will create nearly endless possibilities for discovery to understand the role of genes and variants, as well as many other factors that combine to affect health and disease,” Jarvik said.
In addition to the University of Washington location, the project has sequencing centers at Baylor College of Medicine, Johns Hopkins University and the Broad Institute of MIT and Harvard. Altogether, they process about 5,000 genomic samples per week.
In return for submitting their genomic data and vast swaths of health information, participants in the project receive a free report about their ancestry and genetic traits, courtesy of project partner Color. By the end of this year, they’ll also have access to findings about their hereditary disease risk and medication-gene interactions.