Dr. Bourne gave an excellent talk, laying out a vision for how data science will improve health. He also reiterated his view of "Big Data," published recently in the Journal of the American Medical Informatics Association (JAMIA), which focused less on the quantity of data and more on clinical, research, and other health-related organizations making maximal use of all of their data assets [1]. This contrasts with the vagueness that, as another commentary notes, arises when definitions of Big Data focus on the word "big" [2]. Dr. Bourne's utilitarian view makes more sense to me, since there are many "small" data issues around clinical data, such as quality, completeness, and provenance, that must be solved before we can trust and apply the output of Big Data systems [3].
Nonetheless, what I believe was under-appreciated in Dr. Bourne's talk, as is common among those coming from the bioinformatics world where data is more regular and complete, were the scientific issues underlying the challenges of clinical data. Yes, we are (finally!) entering an era when patient data is increasingly captured in electronic form. But just because clinical data is plentiful does not mean it is good data, and there is no evidence, as is sometimes asserted, that larger quantities of data will overcome its quality problems. I certainly agree that clinical trials as we now perform them are small, expensive, and may not be generalizable. But that does not prove that observational data in quantities multiple orders of magnitude larger will be better.
I certainly have enthusiasm for using data in our clinical systems. I believe there will be tremendous opportunities for leveraging the value of data, especially when it is of high quality. We will, for example, be able to validate the results of experimental studies on a much larger scale. We will also be able to find many uses for predictive analytics, such as identifying patients where we can intervene to ward off poor outcomes or find ways to deliver healthcare services more efficiently. There is no end to the possible value of Big Data in healthcare and biomedicine.
But the fruits of more data will not be realized just by accumulating more of it in digital systems. One of the big challenges was eloquently stated by another attendee of the talk, Dr. Justin Starren of Northwestern University, who noted that while data science deals with important problems, it takes place outside of the workflows addressed by clinical informatics. On the front end, data science says very little about data entry, workflow, usability of EHRs, and other factors that have, according to a recent survey by Medical Economics magazine, made EHRs the bane of many clinicians [4]. On the back end, there are challenges too, such as whether the output of data analytical algorithms can be applied in ways that measurably benefit clinical outcomes [5].
These challenges are important as criticism of currently used EHRs grows among clinicians. We also know that while a good deal of research shows benefits of IT [6], other research raises concerns about its safety [7]. Clearly we have a ways to go before we reach the end-to-end goal of electronic record-keeping leading to improved health or healthcare delivery.
To this end, we need a research agenda for clinical informatics. The problem is that we do not have a well-funded federal agency devoted to research in this area. The National Library of Medicine (NLM) is an obvious home for such research, especially as many of us have careers that have been propelled by NLM funding. However, many people don't immediately think of a "library" for this kind of work. In addition, the NLM's research budget is small; for example, only 13 research grants were awarded last year. Another government agency that funds this kind of work might be the Agency for Healthcare Research & Quality (AHRQ), which has a rich health information technology (HIT) portfolio. However, as important as AHRQ studies are, they mostly focus on applications of HIT and do not get down to the core scientific issues addressed above. Some of the other institutes of the NIH fund informatics research, but it is usually applied in disease-specific ways (e.g., the National Cancer Institute and the National Institute of Diabetes and Digestive and Kidney Diseases). Other government agencies, such as the National Science Foundation (NSF), fund some general types of informatics research, although NSF eschews disease-specific research.
I recognize we are in an era of tight federal research funding, with few dollars for investing in new programs. I am hopeful that the investments being made in data science will take a broad focus and include investigation into better ways to produce high-quality clinical data as well as optimally use it to improve health, clinical outcomes, and healthcare delivery. In the long run, however, our healthcare system really needs a research agenda and program for clinical informatics.
References
1. Bourne PE, What Big Data means to me. Journal of the American Medical Informatics Association, 2014. 21: 194-195.
2. Ward JS and Barker A, Undefined by data: a survey of big data definitions. Databases (cs.DB), 2014. http://arxiv.org/abs/1309.5821.
3. Hersh WR, Weiner MG, Embi PJ, Logan JR, Payne PR, Bernstam EV, et al., Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care, 2013. 51(Suppl 3): S30-S37.
4. Verdon DR, Physician outcry on EHR functionality, cost will shake the health information technology sector, Medical Economics. February 10, 2014. http://medicaleconomics.modernmedicine.com/medical-economics/news/physician-outcry-ehr-functionality-cost-will-shake-health-information-technol.
5. Amarasingham R, Patel PC, Toto K, Nelson LL, Swanson TS, Moore BJ, et al., Allocating scarce resources in real-time to reduce heart failure readmissions: a prospective, controlled study. BMJ Quality & Safety, 2013. 22: 998-1005.
6. Jones SS, Rudin RS, Perry T, and Shekelle PG, Health information technology: an updated systematic review with a focus on meaningful use. Annals of Internal Medicine, 2014. 160: 48-54.
7. Anonymous, Health IT and Patient Safety: Building Safer Systems for Better Care. 2012, Washington, DC: National Academies Press.