Tuesday, July 28, 2015

The ONC Health IT Curriculum Returns to Life

Long-time readers of this blog know that a substantial part of my work life around 2010-2013 involved developing the health information technology curriculum for the Office of the National Coordinator for Health Information Technology (ONC). I posted in this blog about the project when it was funded as well as when it came to an end. As the project was winding down, one of my laments was that there was no further funding to maintain the curriculum. This did not mean it was no longer a valuable resource, as many educators were continuing to use it and enhance it locally. We were fortunately able to find a home for the materials in the American Medical Informatics Association (AMIA) Knowledge Center.

I was pleased earlier this year when ONC announced a funding opportunity to update the materials and add four new areas of content relevant for improved healthcare delivery: population health, care coordination, new care delivery and payment models, and value-based care. I am even more thrilled to report that OHSU was one of seven institutions awarded nearly a million dollars in funding to carry out this update and enhancement.

The funding is for more than just updating the curriculum and adding the new topic areas. After the curriculum revision is complete, ONC will work with the awardees to establish a program to train incumbent healthcare employees whose roles, duties, or functions involve health IT. The training will be completed in five days or less to accommodate professionals with restricted schedules and will be offered in various settings, such as online, in-person, or train-the-trainer programs. In total, awardees will collectively train about 6000 incumbent healthcare workers (about 1000 per grantee) in team-based care environments, such as long-term care facilities, patient-centered medical homes, accountable care organizations, hospitals, safety net clinics, rural health, and other settings.

I am certain that I will have more to say periodically about the project and its progress. I am also confident that it will help expand capacity of health IT across the country.

Friday, July 10, 2015

What is the Difference (If Any) Between Informatics and Data Science?

I am increasingly asked to describe the difference between data science and biomedical informatics. Distinguishing these disciplines takes on added importance with the recent publication of the report on the future of the National Library of Medicine (NLM) by the NLM Working Group of the NIH Advisory Committee to the Director, which calls for NLM to become a leader in data science at NIH. NLM has of course historically been a leader in research and training in biomedical informatics.

What, if any, is the difference between informatics and data science? Let me start with definitions. I have written my own definitions of biomedical informatics [1] but for the sake of the community, let me quote the latest consensus definition from our professional association, the American Medical Informatics Association (AMIA) [2]: "Biomedical informatics (BMI) is the interdisciplinary field that studies and pursues the effective uses of biomedical data, information, and knowledge for scientific inquiry, problem solving and decision making, motivated by efforts to improve human health."

How is data science defined? It is not as easy to find an "official" definition of data science, but a good starting point might be the definition from Wikipedia, which is the "extraction of knowledge from large volumes of data that are structured or unstructured." The Wikipedia article attributes that definition to a paper by Vasant Dhar [3] and a blog posting by Jeff Leek. A Google search also points out some highly-cited sources from O’Reilly & Associates Media and Forbes Magazine. The Forbes article quotes the famous information scientist Hal Varian, who has noted, "The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that’s going to be a hugely important skill in the next decades." I myself have written that the core competencies of data science are statistics, especially machine learning; data-oriented computer programming, especially querying databases; domain understanding where the analysis and interpretation will be applied; business processes; and communications.

I believe that whether data science is distinct from, partially overlapping with, or a subset of informatics is defined by how broadly one defines informatics. I tend to take a very broad definition of informatics, because I understand that its "sociotechnical" nature [4] covers many facets of data, information, and knowledge, including their technological as well as social context. Informatics recognizes that many aspects of data influence its use, aggregation, and interpretation. I have expressed concern in the past that data scientists need to understand research methodology, in particular how we distinguish cause-and-effect from correlation.

One possible way to answer the question of distinction between data science and informatics is to think about areas of informatics that are not ordinarily considered part of data science. I can think of (at least) several. One is usability. We know through the recent massive adoption of the electronic health record (EHR) that there are significant usability challenges with current EHRs [5]. These not only adversely impact workflow, another important informatics topic, but may compromise safety. Another area of informatics we have come to recognize as critical is adherence to standards so that we may achieve better system interoperability. Finally, we also know that informatics is riddled with challenging "people and organizational issues" as information systems profoundly impact healthcare and individual health in many ways [6].

There is no question that what we can do with data is important for informatics, healthcare more broadly, and society as a whole. Informatics has recognized this for decades, but it also knows that there is much context beyond the data itself, and to this end, we are best served by viewing data science as a proper subset of informatics, certainly in the biomedical and health domain.


1. Hersh, W (2009). A stimulus to define informatics and health information technology. BMC Medical Informatics & Decision Making. 9: 24. http://www.biomedcentral.com/1472-6947/9/24/.
2. Kulikowski, CA, Shortliffe, EH, et al. (2012). AMIA Board white paper: definition of biomedical informatics and specification of core competencies for graduate education in the discipline. Journal of the American Medical Informatics Association. 19: 931-938.
3. Dhar, V (2013). Data science and prediction. Communications of the ACM. 56(12): 64-73.
4. Coiera, E (2007). Putting the technical back into socio-technical systems research. International Journal of Medical Informatics. 76(Supp 1): 98-103.
5. Zhang, J and Walji, M, Eds. (2014). Better EHR - Usability, workflow & cognitive support in electronic health records. Houston, TX, National Center for Cognitive Informatics & Decision Making in Healthcare.
6. Ash, JS, Berg, M, et al. (2004). Some unintended consequences of information technology in health care: the nature of patient care information system related errors. Journal of the American Medical Informatics Association. 11: 104-112.