Monday, November 20, 2017

From Predictive to Prescriptive Analytics: Response to NLM RFI

The National Library of Medicine (NLM) recently posted a Request for Information (RFI) asking for comment on promising directions and opportunities for next-generation data science challenges in health and biomedicine. This blog posting lists the questions posed and my responses to them. A main focus of my input centers on the need for transition from predictive to prescriptive analytics, i.e., going beyond being able to predict with data and moving toward applying it to improve patient diagnoses and outcomes.

1. Promising directions for new data science research in the context of health and biomedicine.  Input might address such topics as Data Driven Discovery and Data Driven Health Improvement.

The scientific literature is increasingly filled with papers describing novel and exciting applications of data science, such as improving clinical diagnosis and determining safer and more efficient healthcare. But there is more to impactful data science than the data and tools. We need studies that demonstrate real impact in improving patient and system outcomes. We also need to assess the impact of efforts to improve data standards and data quality.

One way to look at this is to consider the growing area of data analytics, which may be thought of as applied data science. Analytics is commonly classified into three levels [1]:
  • Descriptive - describing what the data say about what has happened
  • Predictive - using the data to predict what might happen going forward
  • Prescriptive - deciding on actions based on the data to improve outcomes
Of course, there is science behind each level. We are seeing a steady stream of scientific papers on the application of predictive analytics. One of the earliest foci was the use of clinical data to predict hospital readmission, especially as a result of the Centers for Medicare & Medicaid Services (CMS) penalizing US hospitals for excessive rates of readmission. This has led to dozens of papers being published over the last decade assessing various models and approaches for prediction of hospital readmission, e.g., [2,3]. Another focus that has recently attracted attention has been the use of deep learning for medical diagnoses through processing of radiology [4,5], pathology [6], and photographic images [7]. Even patient monitoring and health behaviors have shown potential for improvement via Big Data [8]. Likewise, as the database for precision medicine emerges, we will understand increasingly data-driven ways to treat different diseases, sometimes by therapies we never hypothesized for a given condition [9].

These predictive analytics applications are important, but equally important is research into how they will be best applied. Attention to hospital readmissions has somewhat lowered their rate, but the problem is far from solved. We need not only to predict who these patients will be, but also to devise programs that will enable action on that data.
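To make the three levels concrete for the readmission case, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the four discharge records are made up, the two-factor risk score and its weights are a toy rather than a validated readmission model, and the 0.5 threshold and the suggested actions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Discharge:
    patient_id: str
    prior_admissions: int  # admissions in the previous year (made-up feature)
    lives_alone: bool      # made-up social risk factor
    readmitted: bool       # observed 30-day readmission (historical data)


# Hypothetical historical data, for illustration only.
HISTORY = [
    Discharge("a", 0, False, False),
    Discharge("b", 3, True, True),
    Discharge("c", 1, False, False),
    Discharge("d", 4, False, True),
]


def descriptive(history):
    """Descriptive: what happened? The observed readmission rate."""
    return sum(d.readmitted for d in history) / len(history)


def predictive(prior_admissions, lives_alone):
    """Predictive: what might happen? A toy risk score capped at 1.0."""
    score = 0.2 * prior_admissions + (0.2 if lives_alone else 0.0)
    return min(score, 1.0)


def prescriptive(risk, threshold=0.5):
    """Prescriptive: what action should follow from the prediction?"""
    if risk >= threshold:
        return "schedule follow-up call and home visit"
    return "standard discharge instructions"


rate = descriptive(HISTORY)
risk = predictive(prior_admissions=3, lives_alone=True)
action = prescriptive(risk)
print(rate, round(risk, 2), action)
```

The point of the sketch is the last function: a prediction only changes outcomes once some program turns it into an action, and deciding which action is worthwhile is itself a research question.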

Likewise, as we learn to improve diagnosis and treatment of disease through predictive analytics, we will need to determine ways to make actions on those predictions possible, both for clinical researchers who discover new possible diagnostic tests and treatments for disease as well as clinicians who apply the new complex information in patient care. This will require both clinical decision support from machines and new organizational structures to conduct research and apply its results optimally in clinical care.

As such, a new thread of research in prescriptive analytics, i.e., applying the outcomes of data science research, is critical for realizing the value of biomedical science. The NLM should be at the forefront of thought leadership and funding of that research. Such research can build on its unique strong portfolio of existing research in biomedical informatics (which some of us consider data science to be a part of).

2. Promising directions for new initiatives relating to open science and research reproducibility. Input might address such topics as Advanced Data Management and Intelligent and Learning Systems for Health.

Open science and reproducibility of research are critical for the transition of data science from predictive to prescriptive analytics. Since the value of data science comes from understanding large populations of patients, it is only fair that all who contribute their data benefit from research using it. Therefore, we must devise methods to allow appropriate access to that data while still protecting the privacy of individuals who have contributed their data. We also need to devise approaches to give appropriate scientific credit to those who collect the data, and a short time-limited window for them to achieve the first publication of results from it.

Open science should not, however, just be thought of as open data. The models and algorithms that process such data are also increasingly complex. We need more research into understanding how such systems work, how different methods compare with each other, and where biases and other problems may be introduced. As such, the algorithms used must be open so they can be understood and improved.

3. Promising directions for workforce development and new partnerships. Input might address such topics as Workforce Development and Diversity and New Stakeholder Partnerships.

New directions in data science must take into account the human workforce needed to lead discovery as well as apply it to achieve value. The best known data analytics workforce analyses from McKinsey [10] and IDC [11] are a few years old now, but both make a consistent point that we not only need a focused cadre of quantitative experts, but also 5-10 fold more professionals who can contribute to the design of analyses and apply their results in ways that improve patient and system outcomes. In other words, we need not only individuals who know the optimal methods for predictive uses, but also domain experts and applications specialists who can collaborate with the quantitative experts to achieve the best outcomes of data science.

In conclusion, there are many opportunities to put data science and data analytics to work for advancing health and healthcare. This work must not only build on past work done in biomedical informatics and other disciplines but also look to the future to best apply prediction in ways that improve maintenance of health and treatment of disease.


1. Davenport, TH (2015). Big Data at Work: Dispelling the Myths, Uncovering the Opportunities. Cambridge, MA, Harvard Business Review.
2. Amarasingham, R, Moore, BJ, et al. (2010). An automated model to identify heart failure patients at risk for 30-day readmission or death using electronic medical record data. Medical Care. 48: 981-988.
3. Futoma, J, Morris, J, et al. (2015). A comparison of models for predicting early hospital readmissions. Journal of Biomedical Informatics. 56: 229-238.
4. Oakden-Rayner, L, Carneiro, G, et al. (2017). Precision radiology: predicting longevity using feature engineering and deep learning methods in a radiomics framework. Scientific Reports. 7: 1648.
5. Rajpurkar, P, Irvin, J, et al. (2017). CheXNet: radiologist-level pneumonia detection on chest x-rays with deep learning.
6. Liu, Y, Gadepalli, K, et al. (2017). Detecting cancer metastases on gigapixel pathology images.
7. Esteva, A, Kuprel, B, et al. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature. 542: 115-118.
8. Price, ND, Magis, AT, et al. (2017). A wellness study of 108 individuals using personal, dense, dynamic data clouds. Nature Biotechnology. 35: 747-756.
9. Collins, FS and Varmus, H (2015). A new initiative on precision medicine. New England Journal of Medicine. 372: 793-795.
10. Manyika, J, Chui, M, et al. (2011). Big data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute.
11. Anonymous (2014). IDC Reveals Worldwide Big Data and Analytics Predictions for 2015. Framingham, MA, International Data Corporation.

Wednesday, November 8, 2017

End of an Era For Academic Informatics: Demise of the Home-Grown EHR

Pick your cliche to describe a major event this past week: Another domino is falling. The dawn of a new era. The news that Vanderbilt University Medical Center, home to one of the most esteemed academic informatics programs in the country, is replacing its collection of home-grown and commercial electronic health record (EHR) systems with Epic shows that the era of the home-grown academic EHR is coming to a close.

Whatever cliche we wish to use, the change is real for academic informatics. One by one, many of the major academic informatics programs have sunset their home-grown EHRs in favor of commercial systems, including Partners Healthcare, Mayo Clinic, Intermountain Healthcare, and the Veterans Administration.

The enterprise EHR has become too complex for a single academic program to maintain. Academic informatics programs are great at fostering innovation in areas such as clinical decision support and re-use of clinical data. But they are less adept at managing the more mundane yet increasingly complex operations of hospitals and healthcare systems, such as the transmission of orders from the hospital ward to departmental (e.g., radiology or pathology) systems, the delivery of results back to clinicians, and the generation of bills for services. When compliance and security issues are added on top, it becomes untenable for academic programs to maintain.

Some in academic informatics lament this closing of an era. But ever the glass-half-full optimist, I do not necessarily view it as a bad thing. Now that EHR systems are mission-critical to healthcare delivery organizations and must be integrated with their myriad of other information systems, it is probably inappropriate for academic groups to develop and maintain them.

Fortunately, there are emerging tools for innovation on top of the mundane “plumbing” of the EHR. Probably the leading candidate to serve as such a platform is SMART on FHIR. A growing number of academic programs are using SMART on FHIR to innovate on top of commercial EHRs. Granted, some of the commercial EHR systems (e.g., Epic) currently support the Fast Healthcare Interoperability Resources (FHIR) standard incompletely, but we can remember another cliche, which is the famous Wayne Gretzky quote of skating not to where the puck is, but where it will be going. As SMART on FHIR matures, I can envision it as a great platform for apps that read and write data from the EHR.
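As a rough illustration of what "reading data from the EHR" looks like to such an app, here is a minimal sketch in Python. It relies on two things from the FHIR standard: the read interaction has the form GET [base]/[type]/[id], and a Patient resource carries a name list with family and given parts. Everything else is an assumption for illustration: the base URL is hypothetical, the sample patient is made up, and no network call is made. A real SMART on FHIR app would also need to obtain authorization from the EHR before requesting anything, which is omitted here.

```python
import json


def patient_read_url(base_url, patient_id):
    """Build the URL for a FHIR 'read' interaction: GET [base]/Patient/[id]."""
    return f"{base_url.rstrip('/')}/Patient/{patient_id}"


def display_name(patient):
    """Build a display name from the first entry in Patient.name."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("not a Patient resource")
    names = patient.get("name") or [{}]
    first = names[0]
    given = " ".join(first.get("given", []))
    return f"{given} {first.get('family', '')}".strip()


# A made-up Patient resource, shaped like what a FHIR server returns as JSON.
SAMPLE = json.loads("""
{
  "resourceType": "Patient",
  "id": "example-123",
  "name": [{"family": "Chalmers", "given": ["Peter", "James"]}],
  "birthDate": "1974-12-25"
}
""")

# Hypothetical server address; substitute the EHR's actual FHIR base URL.
url = patient_read_url("https://fhir.example.org/base/", SAMPLE["id"])
name = display_name(SAMPLE)
print(url)
print(name, SAMPLE["birthDate"])
```

Because the resource shape is standardized, the same parsing code should work against any EHR that supports the Patient resource, which is what makes the platform attractive for academic innovation.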

In some ways I liken the situation to the relationship between computer operating systems and academic computer science departments. Very few academic computer scientists do research on operating systems these days. Most academic computer scientists, just like the rest of us, use Windows, MacOS, Linux, iOS, and/or Android. Today’s modern operating systems are complex and require large companies to maintain. Most academic computer science research now occurs on top of those operating systems. There, academics can carry out their innovation knowing that the operating systems (to the best of their capabilities) can manage the data in files, connect to networks, and keep information secure.

This new environment should lead to new types of innovations in informatics, which take place on top of commercial EHRs, which may now be better viewed as the “operating system” that provides the foundational functionality upon which academic informatics innovators can build. This could be a boon to places like my institution, which never even had a home-grown EHR. We are certainly pursuing SMART on FHIR development with rigor going forward.

Friday, November 3, 2017

Why Pursue a Career in Biomedical and Health Informatics?

There is an ever-growing number of career opportunities in the field of biomedical and health informatics for those who enjoy working with data, information, and knowledge to improve the health of individuals and the population. This field develops solutions to improve the health of individuals, the delivery of healthcare, and the advancement of research in health-related areas. Jobs in informatics are highly diverse, running the spectrum from the highly technical to those that are very interpersonal. All are driven, however, by the goal of using data, information, and knowledge to improve all aspects of human health [1, 2].

Within biomedical and health informatics are a myriad of sub-disciplines, all of which apply the same fundamental science and methods but are focused on particular (and increasingly overlapping) subject domains. Informatics can be viewed as proceeding along a continuum from the cellular level (bioinformatics) to the person (medical or clinical informatics) to the population (public health informatics). Within clinical informatics may be a focus on specific healthcare disciplines, such as nursing (nursing informatics), pharmacy (pharmacy informatics), and radiology (radiology informatics) as well as on consumers and patients (consumer health informatics). There are also disciplines in informatics that apply across the cell-person-population spectrum:
  • Imaging informatics – informatics with a focus on the storage, retrieval, and processing of images
  • Research informatics – the use of informatics to facilitate biomedical and health research, including a focus on clinical and translational research that aims to accelerate research findings into healthcare practice
Another emerging discipline that has substantial overlap with informatics is data science (or data analytics in its more applied form). The growth in use of electronic health records, gene sequencing, and new modalities of imaging, combined with advances in machine learning, natural language understanding, and other areas of artificial intelligence, provides a wealth of data and tools for use to improve health. But informatics is not just about processing the data; the range of activity runs from ensuring the usability of systems for entering and working with high-quality data to applying the results of data analysis to improve the health of individuals and the population as well as the safety and quality of healthcare delivery.

The variety of jobs in biomedical and health informatics means that there is a diversity in the education of those holding the jobs. Informatics has a body of knowledge and a way of thinking that advance the field. It is also an interdisciplinary field, existing at the interface of a number of other disciplines. For this reason, education has historically been at the graduate level, where individuals combine their initial education in one of the core disciplines (e.g., health or life sciences, computing or information sciences, etc.) with others as well as the core of informatics. An example of such a program is ours at Oregon Health & Science University (OHSU).

A variety of data show that professionals from this discipline are in high demand. Job sites show a wide variety of well-paying jobs. A previous analysis of online job postings found 226,356 positions advertised [3]. More recently, a survey of healthcare IT leaders shows continued demand for professionals in this area [4]. For physicians working in the field, there is now a new medical subspecialty [5]. The nursing profession has had a specialization in nursing informatics for over a decade, and we are likely to see more certifications. For example, the American Medical Informatics Association (AMIA) is developing an Advanced Health Informatics Certification that will apply to all informatics professionals, not just those who are physicians and nurses.

Does one need to be a clinician to be trained and effective in a job in clinical informatics? Must one know computer programming to work in any area of informatics? The answers are no and no. Informatics is a very heterogeneous field, and there are opportunities for individuals from all types of backgrounds. One thing that is clear, however, is that the type of informatics job you assume will be somewhat dependent on your background. Those with healthcare backgrounds, particularly medicine or nursing, are likely to draw on that expertise for their informatics work in roles such as a Chief Medical or Nursing Informatics Officer. Those with other backgrounds still have plenty of opportunities in the field, with a wide variety of jobs and careers that are available.

Informatics is a career for the 21st century. There are a wide variety of jobs for people with diverse backgrounds, interests, and talents, all of whom can serve the health of society through effective use of information and associated technologies.


1. Hersh, W (2009). A stimulus to define informatics and health information technology. BMC Medical Informatics & Decision Making. 9: 24.
2. Hersh, W and Ehrenfeld, J (2017). Clinical Informatics. In Health Systems Science. S. Skochelak, R. Hawkins, L. Lawson et al. New York, NY, Elsevier: 105-116.
3. Schwartz, A, Magoulas, R, et al. (2013). Tracking labor demand with online job postings: the case of health IT workers and the HITECH Act. Industrial Relations: A Journal of Economy and Society. 52: 941–968.
4. Anonymous (2017). 2017 HIMSS Leadership and Workforce Survey. Chicago, IL, Healthcare Information Management Systems Society.
5. Detmer, DE and Shortliffe, EH (2014). Clinical informatics: prospects for a new medical subspecialty. Journal of the American Medical Association. 311: 2067-2068.

Wednesday, November 1, 2017

From Vendor-Centric to Patient-Centric Data Stores

There is growing consensus that patients should be owners and stewards of their personal health and healthcare data. They should also have the right to grant access to healthcare professionals, institutions, and researchers of their choosing. Current information systems in the healthcare system do not facilitate this point of view, as data is for the most part stored in the siloed systems of the places where patients obtain care.

If we accept the view that patients own their data and can control access to it, how do we facilitate the transition from provider-centric to patient-centric data storage? Such an ecosystem will require new models for data storage and its access. Existing business models for clinical systems will need to adapt to this new approach, although new business opportunities will emerge for companies and others that can succeed in this new environment.

My own view is that every patient should have a cloud-based data store to which they (or a designated surrogate for minors or those unable to give consent for access) allow access to designated healthcare providers or others. A new business model will emerge for companies that facilitate connection of authorized systems to a patient’s data. Even existing electronic health record (EHR) vendors could participate, especially as many of them are building large data centers and cloud-based solutions (although this will require changes in their current business models away from keeping the data in their silos).

The market for this approach will necessarily have some regulation, most likely from the government. Those participating will need to adhere to a common set of standards. Systems will also need to maintain the integrity of data deposited by clinicians. Patients should be allowed to annotate data, and even challenge it, but not modify it (unless the clinician amends it).
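The rules in the last few paragraphs (the patient controls access, clinicians deposit data, patients may annotate but not modify, and only the clinician may amend) can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not a design for a real system: the class and method names are hypothetical, parties are identified by plain strings, and there is no authentication, persistence, or audit trail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A clinician-authored record; frozen so it cannot be edited in place."""
    author: str
    text: str


class PatientStore:
    """Toy patient-controlled data store (hypothetical, in-memory only)."""

    def __init__(self, owner):
        self.owner = owner
        self._entries = []
        self._annotations = {}
        self._grants = set()

    def _require_access(self, actor):
        if actor != self.owner and actor not in self._grants:
            raise PermissionError(f"{actor} has no access")

    def grant(self, actor, party):
        """Only the patient (owner) decides who gets access."""
        if actor != self.owner:
            raise PermissionError("only the owner may grant access")
        self._grants.add(party)

    def deposit(self, clinician, text):
        """An authorized clinician adds an entry; returns its index."""
        self._require_access(clinician)
        self._entries.append(Entry(clinician, text))
        return len(self._entries) - 1

    def annotate(self, actor, index, note):
        """The patient may comment on or challenge an entry, not change it."""
        if actor != self.owner:
            raise PermissionError("only the owner may annotate")
        self._annotations.setdefault(index, []).append(note)

    def amend(self, clinician, index, new_text):
        """Only the authoring clinician may amend an entry."""
        if self._entries[index].author != clinician:
            raise PermissionError("only the authoring clinician may amend")
        self._entries[index] = Entry(clinician, new_text)

    def read(self, actor, index):
        self._require_access(actor)
        return self._entries[index], tuple(self._annotations.get(index, ()))


store = PatientStore("patient")
store.grant("patient", "dr_a")
idx = store.deposit("dr_a", "BP 120/80")
store.annotate("patient", idx, "Measured right after coffee.")
entry, notes = store.read("dr_a", idx)
print(entry, notes)
```

Keeping annotations separate from entries is one simple way to let patients challenge data while the clinician's original input stays intact.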

This has implications for EHR systems of the future. The current large monolithic systems will need to give way to those that access data in a standardized way. The new EHR “system” may not look much different from current systems (although hopefully will), but instead of accessing data from within its own stores, it will instead pull and push back data from the patient’s designated store.

A recent Perspective in JAMA lays out three necessary components for this vision to succeed [1]. The first is standard data elements. Among the approaches likely to achieve this are initiatives such as SMART on FHIR [2] and the Clinical Information Modeling Initiative (CIMI) [3, 4].

The JAMA piece posits a second required component, a standard data receipt for each clinical encounter, with push of the encounter into the patient’s data store. Methods such as blockchain may facilitate the integrity needed to maintain the sanctity of the clinician's input.
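One way to picture how a blockchain-style method could protect the integrity of encounter receipts is a hash chain, in which each receipt stores a hash that covers both its own content and the previous receipt's hash. The Python sketch below shows only that core idea and is not a full blockchain (there is no distribution or consensus). The receipt fields, the function names, and the sample encounters are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first receipt


def make_receipt(prev_hash, encounter):
    """Create a receipt whose hash covers the encounter and the prior hash."""
    payload = json.dumps(encounter, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    return {"prev": prev_hash, "encounter": encounter, "hash": digest}


def append(chain, encounter):
    """Push a new encounter receipt onto the patient's chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append(make_receipt(prev, encounter))


def verify(chain):
    """Recompute every hash; any altered receipt breaks the chain."""
    prev = GENESIS
    for receipt in chain:
        expected = make_receipt(prev, receipt["encounter"])["hash"]
        if receipt["prev"] != prev or receipt["hash"] != expected:
            return False
        prev = receipt["hash"]
    return True


# Made-up encounters, for illustration only.
chain = []
append(chain, {"date": "2017-10-02", "clinician": "dr_a", "note": "BP 120/80"})
append(chain, {"date": "2017-10-30", "clinician": "dr_b", "note": "flu shot"})

ok_before = verify(chain)
chain[0]["encounter"]["note"] = "BP 180/110"  # simulate tampering
ok_after = verify(chain)
print(ok_before, ok_after)
```

Because each hash depends on everything before it, quietly changing an old receipt would require recomputing every later hash as well, which is what makes tampering detectable.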

Finally, the third is a contract (I may have preferred calling it a compact) that sets the rules for access and control for such a system.

One question the JAMA piece did not address was: who pays? This is never an easy question in healthcare, since patients do not pay directly for many things. Instead, their insurance pays. So a conversation will be necessary to determine how such a system is financed.

As with many aspects of informatics, the technology to implement all of this currently exists, and the real challenges are how to create the market and the regulations for this major transition in how patient data is stored and accessed. As with all developments in informatics, there will be “unintended consequences” along the way that will need thoughtful discussion among all stakeholders in this endeavor.


1. Mikk, KA, Sleeper, HA, et al. (2017). The pathway to patient data ownership and better health. Journal of the American Medical Association. 318: 1433-1434.
2. Mandel, JC, Kreda, DA, et al. (2016). SMART on FHIR: a standards-based, interoperable apps platform for electronic health records. Journal of the American Medical Informatics Association. 23: 899-908.
3. Oniki, TA, Zhuo, N, et al. (2016). Clinical element models in the SHARPn consortium. Journal of the American Medical Informatics Association. 23: 248-256.
4. Moreno-Conde, A, Moner, D, et al. (2015). Clinical information modeling processes for semantic interoperability of electronic health records: systematic review and inductive analysis. Journal of the American Medical Informatics Association. 22: 925-934.