Researchers at the University of California, San Diego School of Medicine have been awarded a $9.2 million grant to help modernize and transform how researchers share, use, find and cite biomedical datasets.
The 3-year project, in collaboration with The University of Texas Health Science Center at Houston, is part of a federal initiative to increase the utility of biomedical research data, launched this week by the National Institutes of Health (NIH) through its Big Data to Knowledge (BD2K) program.
The agency plans to invest nearly $656 million through 2020 to encourage biomedical data sharing and re-use, accelerating the pace of new medical discoveries at lower cost to U.S. citizens who fund basic research.
“Data creation in today’s research is exponentially more rapid than anything we anticipated even a decade ago,” said NIH Director Francis S. Collins, MD, PhD. “Mammoth data sets are emerging at an accelerated pace in today’s biomedical research and these funds will help us overcome the obstacles to maximizing their utility. The potential of these data, when used effectively, is quite astounding.”
Data sharing and the ability to include multiple “big data” datasets in research studies could help scientists find patterns among diseases, genes and lifestyle that might easily go unnoticed in smaller datasets. These patterns could have virtually endless applications in advancing health, for example, by helping to identify those at higher risk for breast cancer, heart attack or other diseases and conditions. Researchers might also more rapidly identify rare side effects of certain medications or the benefits of new drugs to small subsets of individuals.
As part of this vision, the UC San Diego-led team will develop a strategy for cataloging and indexing biomedical datasets, dubbed “big data” because of the volume, variety and speed at which information – anything from whole genome sequencing to social media tweets – is being collected in the digital era.
Dataset indexing is considered a vital step toward being able to build a searchable online digital library, much like the highly successful online PubMed directory, but for health-related datasets.
“You can’t go online right now and search for datasets on, say, a particular type of brain tumor,” said Lucila Ohno-Machado, MD, PhD, professor of medicine and lead investigator on the Biomedical and healthCAre Data Discovery and Indexing Ecosystem (BioCADDIE). “These specialized search engines don’t exist. We are starting almost from scratch. I think this might surprise people.”...