Wednesday, July 17, 2013

Data Scientists Must Also Be Research Methodology Scientists

I had the chance last week to attend a conference in Singapore, Big Data and Analytics in Health Care. It was an interesting blend of academics, operational health information technology professionals, and data scientists from companies in the emerging analytics market. I was also in Singapore for the final in-person session of the 10x10 ("ten by ten") introductory informatics course we offer there.

The talks were all interesting, but I was struck by the difference in content and tone between the academic and clinical operations speakers and those from analytics companies who called themselves "data scientists." Whereas the academic and clinical operational types were cautious in their methods and results, the data scientists implied their techniques would revolutionize healthcare and threw around terms like "big data" and "analytics" at every turn. One of the latter types showed a "model" of the pathways leading to good (conservative) and bad (surgery) outcomes in back pain, with the intermediate nodes representing actions along the path, such as medication use, physical therapy, and chiropractic care. It was not clear to me how this model could be used to improve care, and I am not sure the speaker really understood that correlations do not prove causality. A second such speaker showed some interesting correlations between words and phrases that occur in clinical narratives of patients with diabetes and aspects of their care. I understand machine learning and how it might be used to "learn" things about patients with diabetes, but I did not see any evidence that this work would lead to any kind of improved patient outcomes.

Another concern I have about proponents of clinical data analytics is their presumption that their algorithms can somehow take all of the growing amount of operational electronic health record (EHR) data and automatically turn it into medical knowledge, as if they could turn a crank with data going in and knowledge emerging. I do have great enthusiasm for some of what can be done with this data, but I also have concerns about the quality and completeness of this data as well as the causality issues that arise without controlling observations in experimental ways.
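To make the causality concern concrete, here is a minimal simulation sketch. Everything in it is hypothetical and is not drawn from any talk at the conference or from real patient data: an unobserved "severity" drives both the chance of surgery and the pain outcome, while surgery itself is given no effect at all. In the observational version, surgery still correlates with worse outcomes; when treatment is assigned by coin flip, the association should largely vanish.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n, randomized, seed=0):
    """Correlation between surgery and pain score in a toy cohort.

    Assumption (hypothetical): surgery has NO causal effect on the outcome.
    Severity is a hidden confounder that raises both the chance of surgery
    (in the observational setting) and the pain score.
    """
    rng = random.Random(seed)
    treated, outcomes = [], []
    for _ in range(n):
        severity = rng.gauss(0, 1)
        if randomized:
            # Experimental setting: a coin flip decides treatment.
            surgery = 1 if rng.random() < 0.5 else 0
        else:
            # Observational setting: sicker patients tend to get surgery.
            surgery = 1 if severity + rng.gauss(0, 1) > 0 else 0
        # The outcome depends only on severity plus noise, never on surgery.
        pain_score = severity + rng.gauss(0, 1)
        treated.append(surgery)
        outcomes.append(pain_score)
    return pearson(treated, outcomes)


observational = simulate(5000, randomized=False)
experimental = simulate(5000, randomized=True)
print(f"observational r = {observational:.2f}")
print(f"randomized r = {experimental:.2f}")
```

A pathway "model" built from the observational cohort would flag surgery as leading to bad outcomes, even though in this toy setup the only driver is how sick the patient was to begin with.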

I had the opportunity to speak at the conference as well, and gave a talk pulling together my cautious enthusiasm for using operational clinical data for research and other analytical purposes. This was the first public talk I have given on this topic since publication of a paper with ten other colleagues on caveats for the use of operational electronic health record data in comparative effectiveness research in the journal Medical Care [1]. The paper was commissioned by AcademyHealth and is part of a special supplement of the journal devoted to electronic data methods.

Our paper notes that while there are many opportunities for using clinical data for research and analytics, we also must remember the limitations of such data. In particular, EHR and other clinical data may be:
  • Inaccurate - data entry is not always a top priority for clinicians, and they may take shortcuts, such as copy-and-paste
  • Incomplete - patients do not get all of their care in one setting
  • Transformed in ways that undermine meaning - coding for billing is the best known example of this
  • Unrecoverable for research - data may be in clinical narratives or other less accessible places
  • Of unknown provenance - we need to know where data comes from and how likely it is to be accurate
  • Of inappropriate granularity - data too coarse for research purposes
  • Incompatible with research protocols - patients are not always diagnosed and treated consistently with best practices
Despite these caveats, I am optimistic that there will be uses for this data, especially if we can generate it in a standards-based way and otherwise improve its quality. Hopefully clinicians, researchers, patients, public health authorities, quality improvement leaders, and others who might benefit from the data will have incentive to improve it through more meticulous entry as well as use of standards, such as those prescribed by Stage 2 of the meaningful use program [2]. For many clinicians, especially these days, the EHR can be a data sinkhole into which they enter data, spending a great deal of time but getting little in return.

The bottom line is that while data scientists may be able to generate interesting and important results with their methods, they must also understand basic principles of research science, such as inferential statistics, clinical significance, and cause and effect. In addition, they must demonstrate their methods lead to improvements in health and/or healthcare, and are not just generating interesting associations. In other words, they must show evidence that their methods add value, just as medical care and informatics are required to do.


1. Hersh, WR, Weiner, MG, et al. (2013). Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care. 51(Suppl 3): S30-S37.
2. Metzger, J and Rhoads, J (2012). Summary of Key Provisions in Final Rule for Stage 2 HITECH Meaningful Use. Falls Church, VA, Computer Sciences Corp.

Thursday, July 4, 2013

OHSU Informatics Graduation 2013: UBT Winding Down, Other Growth Steady, and Future Opportunities Abound

One of my favorite activities each year at Oregon Health & Science University (OHSU) is graduation. It is always a joy to march in the faculty procession and see our graduates who attend receive their degrees and certificates. Since we first started having graduates in 1998, I have only missed one commencement ceremony. Below is a picture of the happy group that attended this year's ceremony.
With this year's graduates, our biomedical informatics graduate program has now awarded 537 degrees and certificates to 509 people. We have 181 people who are still enrolled in our various programs. A total of 1558 people have been enrolled in our program at one time or another dating back to its launch in 1996. The figure below shows the distribution in the program tracks - bioinformatics and computational biology (BCB), clinical informatics (CI), and health information management (HIM) - for each degree or certificate.
This graduation also marks the end of studies for almost all students funded by the University-Based Training (UBT) program, the initiative of the Office of the National Coordinator for Health IT (ONC) to vastly expand the health IT workforce. Funded under the Health Information Technology for Economic and Clinical Health (HITECH) Act, this program began in April 2010 and in essence served as a scholarship program to pay tuition for students selected for funding.

The figure below shows the flowchart for applicants through graduates of the OHSU UBT program. Although we will fall slightly short of our goal to graduate 148 students (135 Graduate Certificate and 13 Master's), we have launched many new informatics careers. We accepted more people than our goal (176), but the attrition rate turned out to be higher than we anticipated (24.4%, although about one-third continued as non-UBT-funded students). In addition, a number of those to whom we were not able to award funding still enrolled in the program (65). In the meantime, a substantial number of people who were not candidates for UBT funding or otherwise did not apply still enrolled and/or graduated during the time period of the UBT grant (272 enrolled, 136 graduated).
The UBT funding will formally end on September 30, 2013. After that time, we will have three additional months to allow the remaining UBT-funded students to complete the program. Given how much of my professional effort has revolved around two ONC initiatives, the UBT program and the Health IT Curriculum Development Project, it is amazing to think of my life devoid of them.

The good news is that there is not only plenty of continued demand for informatics education, but new opportunities are emerging. Interest and enrollment in our program have not abated. As for the opportunities, one will be training programs for the new clinical informatics subspecialty for physicians. In addition, there will likely be further opportunities when new certifications emerge from the Advanced Interprofessional Informatics Certification Task Force of the American Medical Informatics Association (AMIA) as well as others from groups such as the Commission on Accreditation for Health Informatics and Information Management Education (CAHIIM). There are also opportunities for developing informatics education for medical students, enabled by a recent grant awarded to OHSU by the American Medical Association (AMA).

Tuesday, July 2, 2013

Changes in the Meaning of Birthdays As We Age

Today is my 55th birthday. I am halfway through my fifties. Overall, I cannot complain about my life. I have a wonderful family, great friends and colleagues, and a rewarding career. I also have my health, which I work at to maintain via healthy living, although I acknowledge that some health problems occur that are unrelated to diet and exercise, i.e., over which we have very little control.

Earlier in my life, my birthday represented a next step in my life, such as to achieve a new age or take a new job (one day after the usual July 1st starting date of the new year in academic medicine). Now, however, birthdays are more introspective, giving me pause to think and reflect about where my life has gone and what lies ahead.

Over the course of my career, I have mostly been an overachiever, accomplishing more than many people relative to my age. I reached a significant leadership role by launching the forerunner of my academic department, and then became Chair when we became a true department. My colleagues used to kid me at one point when I was the youngest member of the Board of Directors elected by the American Medical Informatics Association. My overachievement in this part of my life made up for my being a relative underachiever earlier in my life, especially in elementary and early high school.

As I have reflected on my birthdays in recent years, one thing is clear to me: While I still have a number of pathways open to me in life, my finite time left closes some options. I certainly have plenty of time to make adjustments in my career, but there are few careers that I could undertake completely de novo at this time.

Not that I have any desire to drastically change my career. I have been fortunate to be a part of something special, from the days of informatics being an obscure academic discipline to the modern days of HITECH, smartphones, and the ubiquitous Internet. My students laugh at me when I sometimes marvel at the new technology and reminisce about the limits of what we had to deal with in the old days. However, I will keep looking forward in the informatics field and in life, and wonder about what amazements I will come upon at future birthdays.

Monday, July 1, 2013

Would You Get a $485 MRI Exam?

Last month, I was traveling to my daughter's college commencement in Corvallis, OR, and saw an interesting billboard along the way, advertising a $485 magnetic resonance imaging (MRI) exam. We are so unaccustomed to seeing healthcare businesses try to compete on price that this billboard took me by surprise.

And then I thought to myself, would I use this service if I needed an MRI? Fortunately, I have health insurance through my employer, so I am able to get exams like an MRI without having to shop based on price. Of course, I am not sure I would want to shop for something that might be serious for my health on the basis of price alone, even though I have to admit that I have no idea (nor does anyone, really) whether the quality of this $485 MRI is better or worse than that of the one costing fourfold more that most other local healthcare institutions offer. Still, I feel some comfort in knowing I do not have to choose based on cost, especially on cost alone.

This dilemma goes to the heart of our current conundrum of healthcare. We know that healthcare costs too much, especially in the United States. Yet I suspect that few of us, especially when a serious diagnosis is being contemplated, would want any care less than the best possible care. This is especially the case when one needs care acutely; the last thing we do when suffering from severe abdominal or chest pain is to go shopping for the best price.

As I have written before and even before that, while there may be things we can do to reduce the cost of healthcare without sacrificing quality, unleashing the free market, which works so well in some industries like computers and agriculture, is unlikely to be the solution to our healthcare cost problems. While our healthcare system can do a much better job providing information about quality and safety, I just do not see most people choosing bargain-basement prices for healthcare when confronted with serious illness.