UPDATE: NIH has released the Strategic Plan for Data Science. The plan was informed by community and public input, and NIH will continue to seek input as it implements the plan, many elements of which are well underway. Maximizing the value of data generated through NIH-funded efforts can substantially accelerate the pace of biomedical discoveries and medical breakthroughs for better health outcomes.
To capitalize on the opportunities presented by advances in data science, the National Institutes of Health (NIH) is developing a Strategic Plan for Data Science. This plan describes NIH’s overarching goals, strategic objectives, and implementation tactics for promoting the modernization of the NIH-funded biomedical data science ecosystem. As part of the planning process, NIH has published a draft of the strategic plan [PDF 490KB], along with a Request for Information to seek input from stakeholders, including members of the scientific community, academic institutions, the private sector, health professionals, professional societies, advocacy groups, patient communities, and other interested members of the public.
As co-chair of the NIH Scientific Data Council, which is overseeing development of the Strategic Plan for Data Science, I encourage your comments and suggestions. Responses should be submitted via an online form by April 2, 2018.
Our experiences with XCMS Online have given us unique insight into the value of cloud computing in metabolomics, and NIGMS is willing to support such efforts. The most significant value of the universal access options provided by cloud computing is the fast and simple sharing of resources. In the case of XCMS Online, uploaded liquid chromatography-mass spectrometry (LC-MS) raw data files, as well as the results of any specific data processing job, including the experimental parameters and settings, can be shared either privately with any collaborator or publicly with the entire community. Both options require only a few mouse clicks. Thus, this type of cloud-based platform also increases the field’s ability to compare results with those of publicly shared jobs. Alongside XCMS Online, similar initiatives underpin the expanding role of cloud-based workflows in metabolomics and beyond. For example, MetaboAnalyst is a valuable collection of online tools for data analysis and interpretation, and Galaxy, a platform with roots in genomics and transcriptomics, is currently diversifying into metabolomics.
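To make the private/public sharing workflow concrete, here is a minimal Python sketch of what programmatic job sharing could look like on a cloud platform of this kind. The host, endpoints, token, and payload fields are hypothetical placeholders for illustration only, not XCMS Online’s actual interface, which handles sharing through the web front end as described above.

```python
# A minimal sketch of how programmatic sharing of a processed LC-MS job might
# look on a cloud metabolomics platform. The endpoint paths, payload fields,
# and authentication scheme are hypothetical illustrations, not the actual
# XCMS Online API.
import requests

BASE_URL = "https://example-metabolomics-cloud.org/api"  # hypothetical host
API_TOKEN = "user-api-token"                             # hypothetical credential


def share_job(job_id, collaborator_email=None):
    """Share a processing job privately (with one collaborator) or publicly."""
    payload = {
        "job_id": job_id,
        # Mirrors the private/public options described in the text: if no
        # collaborator is named, the job is shared with the whole community.
        "visibility": "private" if collaborator_email else "public",
    }
    if collaborator_email:
        payload["share_with"] = collaborator_email

    resp = requests.post(
        f"{BASE_URL}/jobs/{job_id}/share",
        json=payload,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


# Example usage: share job "12345" with a collaborator, then make it public.
# share_job("12345", "collaborator@university.edu")
# share_job("12345")
```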
Common alternatives to cloud computing are downloadable, vendor-specific software tools for processing and evaluating metabolomic data sets. While these software solutions offer advantages, they are typically very costly and sometimes incompatible with data acquired on competitors’ instruments. With an XCMS Online user population of over 22,000 scientists in 120+ countries, we estimate the cost savings to the scientific community at over $70 million USD since its launch in 2011, while the cost of developing and maintaining XCMS Online has been below $2 million USD, resulting in a community return on investment of approximately 97%. Extrapolating from the current user base and rate of growth, these savings are expected to exceed $250 million USD by 2021. These calculations are based on the average cost from four different vendors over the last five years and the assumption that there would be one license for every three XCMS Online users. Importantly, these estimates are conservative, as additional costs for software updates or the need to acquire more than one program were not taken into account.
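For readers who want to trace the arithmetic, the short sketch below shows one way the quoted figures can be reproduced from the stated assumptions (22,000 users, one commercial license avoided per three users, and under $2 million in development and maintenance costs). The per-license price is not given in the text and is simply backed out of the published totals, so treat it as an illustrative assumption; the 97% figure is read here as the share of gross savings retained after platform costs.

```python
# One way to reproduce the savings and return-on-investment figures from the
# assumptions stated in the text. The per-license cost is an illustrative
# value backed out of the published totals, not a quoted vendor price.
users = 22_000                   # XCMS Online user base
licenses_avoided = users / 3     # assumption: one vendor license per three users
avg_license_cost = 9_500         # USD; assumed average commercial license cost
development_cost = 2_000_000     # USD; stated upper bound for XCMS Online

gross_savings = licenses_avoided * avg_license_cost            # ~ $69.7 million
net_benefit_share = (gross_savings - development_cost) / gross_savings

print(f"Gross community savings: ${gross_savings / 1e6:.1f} million")
print(f"Share of savings retained after platform costs: {net_benefit_share:.0%}")
# Under these assumptions the gross savings are on the order of $70 million,
# and roughly 97% of that value is retained by the community, matching the
# figures quoted above.
```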
Overall, the value of cloud-based bioinformatic platforms goes beyond convenience: it addresses the fundamental demands of putting analytical raw data into a biological context and of creating a more open scientific environment for rapidly translating science. The future success of cloud computing in the biomedical and life sciences will depend on the ability of investigators to properly interpret the large-scale, multi-dimensional data sets generated by high-throughput technologies. From this perspective, freely accessible, cloud-based platforms are invaluable for facilitating appropriate data processing and handling. Making more metabolomic data accessible on the cloud may trigger integration with other high-content datasets, such as those in genomics and proteomics. One could imagine the creation of software to promote the integration and analysis of these datasets to generate deeper insights. Having a central “hub” for these large datasets, and a way to integrate the information they contain, would be an invaluable asset to the scientific community.
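As a toy illustration of the kind of cross-omics analysis such a hub could enable, the Python sketch below merges hypothetical metabolomic and proteomic feature tables on a shared sample identifier and computes simple pairwise correlations. The file names, column layout, and analysis are invented for illustration; real integration pipelines also involve normalization, batch correction, and identifier mapping.

```python
# A toy sketch of cross-omics integration of the kind a central data "hub"
# could enable. File names, column names, and the correlation analysis are
# invented for illustration only.
import pandas as pd

# Hypothetical feature tables, one row per sample, indexed by a shared sample ID.
metabolites = pd.read_csv("metabolite_intensities.csv", index_col="sample_id")
proteins = pd.read_csv("protein_abundances.csv", index_col="sample_id")

# Keep only samples measured on both platforms.
shared_samples = metabolites.index.intersection(proteins.index)
met = metabolites.loc[shared_samples]
prot = proteins.loc[shared_samples]

# Simple cross-platform association: correlate each metabolite with each protein.
correlations = pd.DataFrame({p: met.corrwith(prot[p]) for p in prot.columns})

print(f"{len(shared_samples)} samples shared across both platforms")
print(correlations.round(2).head())
```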
Successful metabolomic platforms will likely evolve and expand into other fields, such as environmental and exposure sciences, microbial research, clinical diagnostics, pharmacokinetics, molecular modeling, education, and the social sciences, to name only a few. However, challenges in data protection will remain to some extent. Because cloud-based tools are inherently easier to access, newly launched platforms should also stimulate synergies by intelligently connecting resources.
Government by experts leads to government for experts, according to Harold Laski. The NIH strategic data plan is expert-centric and represents an escalation of commitment to the experts in the iron triangle of government, academia, and industry. Since it fails to align itself with a goal of lowering the prevalence of illness, the cost of illness, treatment burden, or adjustment burden, or of returning people to work, it appears to me to be improved means to unimproved ends. I did notice a goal of “improve health,” but you failed to operationalize that, and I think it’s a safe guess that, with or without your strategic plan, that will be happening anyway in the iron triangle, as deans of medical schools often talk of “improving health.”
I find your example of citizen science not just demeaning but also reflective of the fact that NIH engages in moral isolationism, as evidenced by protests in the last few years by groups like MeAction and recent reports (https://www.lupusresearch.org/lupus-patient-voices-report-released-today/), not to mention the fact that advisory committees at NIH are overrepresented by academic interests and underrepresent the largest stakeholder: the American public.
Shall I also mention a very troubled Office of Research Integrity and ongoing issues with research reproducibility? I’m sure the names Chalmers, Goldacre, and Ioannidis ring a bell.
How about NIH’s resistance to monies spent on translational research? So it’s clear that, like much research, this is a closed system that invites the cobra effect, escalation of commitment, and more taxpayer dollars, and that is immune to public consultation because researchers believe the public isn’t qualified to have an opinion about research. But frankly, as much as NIH likes to grade its own homework and engage in the narcissism of small differences, no organization can be a judge in its own defense. As a taxpayer I don’t see a vision or real value in NIH’s effort. My healthcare costs are not being lowered, treatments are not more accessible, and where are the cures for diabetes, migraines, etc.? In fact, as evidenced by Dr. Robert Dworkin’s statement during an FPRS meeting last year, NIH has failed to consider cures for pain. And pain is the major reason why people seek medical attention.
And no price tag was provided for your data strategy, so American taxpayers are expected to buy a pig in a poke and write a blank check to the order of expert-centrism. And that is why I suspect the data strategy is a “bad check” for America.
I wonder how many members of advisory committees can tell me the difference between neural networks and genetic algorithms, or, for that matter, between ACO (ant colony optimization), simulated annealing, and Q-learning. I can only wonder why NIH advisory committee members haven’t kept pace with developments in soft computing; I’m no scientist, yet I have taken the time to learn of such things. And so, despite claims to the contrary, scientists and government officials have been slow to make much-needed changes, and I fear that, like the poor progress of EHRs, the NIH data strategy will be very costly and of little direct value to the public.
I understand there is no course given in having a vision in medical school, and only recently have schools like Johns Hopkins called for critical thinking skills, so I understand why there really is no vision that the NIH data strategy is aligned to and why NIH is highly acritical of its own efforts. We saw this with the failed first Decade of the Brain, when then-NIMH Director Lewis Judd said it would conquer mental illness by the year 2000, and with the failure of NIH to take pain care seriously, which has now led to disaster capitalism in pain care and a mad rush to develop alternatives to opioids. But of course, as Dr. Collins indicated, these may not be more robust than opioids, and they certainly are not part of an effort to cure pain; they reflect pain managerialism at NIH, a.k.a. paradigm paralysis.
NIH continues to be an autopoietic closed system that fails to develop clearly auditable goals that would make health care much different or better for individuals. I will forward my collection of citations on the great many problems in medical research, pain research, and care that will not be improved, by design, by the NIH strategic data plan. No doubt the iron triangle will embrace the data plan, as it serves their interests, but as with most activities at NIH, it is too far removed from the public good.