Tuesday, August 26, 2014

Beyond Prediction: Data Analytics/Data Science/Big Data Must Demonstrate Value

One of my ongoing concerns for data analytics/data science/Big Data in biomedicine and health is that despite the growth of articles and other writing, the accomplishments of using these tools, especially as would be documented in peer-reviewed journals, continue to be few. I am as enthusiastic as anyone about the prospects for harnessing the growing quantity of data in our operational electronic health record (EHR) and other systems for improving health, healthcare, and research. Yet I also believe that we need to be careful that our enthusiasm does not lead to overselling or outright hype, and that we must demonstrate the value of using data just as we would for any other clinical process or tool.

There have been some news reports of the value of using Big Data. However, it would be better to see such results published in peer-reviewed journals. According to these news reports, two states, Wyoming and Washington, have shown reduced emergency department visits using data-based methods, while Beth Israel Deaconess Hospital has used data as part of an effort that has helped reduce hospital readmissions by 25%. Another earlier news article reported that IBM Watson has learned from data how to diagnose cancer more accurately than physicians, although when I emailed the physician to whom that quote of its success was attributed, he replied that he had never said it (Samuel Nussbaum, email communication, July 28, 2014).

There also continues to be a spate of well-done research demonstrating the predictive value of data. Just this past week, as I was preparing this post, two interesting and informative studies of prediction came across the wire, one looking at risk for metabolic syndrome in a database of 36,944 individuals maintained by a large health insurer [1] and another looking at prediction of hospital readmission [2]. These studies are important, but all of this must be followed by implementation of approaches that make use of data to show real benefit, such as improved patient outcomes, improved health, or even cost efficiency. I am aware of only one study that showed benefit from the use of data analytic techniques, using a heart failure prediction algorithm [3]. Maybe I am wrong and other studies applying Big Data techniques have shown benefit (or have been done at all), and I will certainly stand corrected if so.

Despite the lack of studies demonstrating benefit, there have been plenty of interesting writings about Big Data. Some publications have even devoted issues or volumes to the topic. One of these was the July issue of the health policy journal Health Affairs. There were a number of interesting articles in the issue, although none reported any research results demonstrating the value of Big Data. Among the interesting papers were:
  • Bates et al. detailing what they consider the six most important use cases for Big Data: high-cost patients, readmissions, triage, patient decompensation, adverse events, and treatment optimization for diseases affecting multiple organ systems [4]
  • Krumholz describing the need for new thinking and training (including informatics) in the application of Big Data [6]
  • Curtis et al. discussing four large national multi-purpose data networks that could have substantial impact [5]
  • Longhurst et al. presenting the concept of the "Green Button," a tool in the EHR that would aggregate data in an attempt to answer clinical questions for which no prior evidence existed [7]
Also appearing recently was the 2014 Yearbook of Medical Informatics, which is now available via open-access publishing and was devoted this year to the topic of Big Data. Similar to the Health Affairs issue, there were several interesting papers (including one of which I was a co-author that focused on how informatics education must adapt to Big Data [8]) but none reporting patient or organizational benefits of Big Data.

There also continues to be a steady stream of other papers related to re-use of clinical data that provide insights or demonstrate the challenges of working with it. Two of these papers come from a recent special issue of Journal of the American Medical Informatics Association (JAMIA) devoted to "high-throughput phenotyping." A paper by Richesson et al. documents the challenges in something so seemingly simple as definitively identifying patients diagnosed with diabetes mellitus [9]. Another paper by Pathak et al. documents the detailed work required to standardize and normalize data in the EHR for a single quality measure assessing serum cholesterol levels below 100 mg/dL for patients with diabetes mellitus [10]. Other recent papers in JAMIA have documented the challenges with the quality of diabetes-related data used for quality indicators in primary care [11] and the significant quantity of non-conformance with the details of the Consolidated Clinical Document Architecture (C-CDA) that undermines interoperability [12].

Despite the slow progress, I am still confident that we will see scientific advances around data analytics/data science/Big Data in biomedicine and health. I agree with Cathy O'Neil, who writes that we should be "skeptics, not cynics" about Big Data [13]. In other words, we should approach data, and the results obtained from it, with informed skepticism. I reiterate what I have written in the past, that we must put data to use in ways that demonstrate benefit, apply a research mentality, and take into account the "provocations" of danah boyd, the most important of which is that we must not let the data define the questions we ask of it, and should instead seek data that will best answer our questions [14].


1. Steinberg, GB, Church, BW, et al. (2014). Novel predictive models for metabolic syndrome risk: a “big data” analytic approach. American Journal of Managed Care. 20: e221-e228.
2. Hebert, C, Shivade, C, et al. (2014). Diagnosis-specific readmission risk prediction using electronic health data: a retrospective cohort study. BMC Medical Informatics & Decision Making. 14: 65. http://www.biomedcentral.com/1472-6947/14/65.
3. Amarasingham, R, Patel, PC, et al. (2013). Allocating scarce resources in real-time to reduce heart failure readmissions: a prospective, controlled study. BMJ Quality & Safety. 22: 998-1005.
4. Bates, DW, Saria, S, et al. (2014). Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Affairs. 33: 1123-1131.
5. Curtis, LH, Brown, J, et al. (2014). Four health data networks illustrate the potential for a shared national multipurpose big-data network. Health Affairs. 33: 1178-1186.
6. Krumholz, HM (2014). Big data and new knowledge in medicine: the thinking, training, and tools needed for a learning health system. Health Affairs. 33: 1163-1170.
7. Longhurst, CA, Harrington, RA, et al. (2014). A 'green button' for using aggregate patient data at the point of care. Health Affairs. 33: 1229-1235.
8. Otero, P, Hersh, W, et al. (2014). Big Data: Are Biomedical and Health Informatics Training Programs Ready? Yearbook of Medical Informatics 2014. C. Lehmann, B. Séroussi and M. Jaulent: 177-181.
9. Richesson, RL, Rusincovitch, SA, et al. (2013). A comparison of phenotype definitions for diabetes mellitus. Journal of the American Medical Informatics Association. 20: e319-e326.
10. Pathak, J, Bailey, KR, et al. (2013). Normalization and standardization of electronic health records for high-throughput phenotyping: the SHARPn consortium. Journal of the American Medical Informatics Association. 20: e341-e348.
11. Barkhuysen, P, deGrauw, W, et al. (2014). Is the quality of data in an electronic medical record sufficient for assessing the quality of primary care? Journal of the American Medical Informatics Association. 21: 692-698.
12. D'Amore, JD, Mandel, JC, et al. (2014). Are Meaningful Use Stage 2 certified EHRs ready for interoperability? Findings from the SMART C-CDA Collaborative. Journal of the American Medical Informatics Association. Epub ahead of print.
13. O'Neil, C (2013). On Being a Data Skeptic. Sebastopol, CA, O'Reilly. http://www.oreilly.com/data/free/being-a-data-skeptic.csp.
14. Boyd, D and Crawford, K (2011). Six Provocations for Big Data. Cambridge, MA, Microsoft Research. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1926431.

Monday, August 25, 2014

Healthy Living is Not Alternative Medicine, and Vice Versa

Part of my original interest in a medical career emanated from my interest in personal health. Starting with being a distance runner in high school, developing an interest in nutrition in college, and taking charge of my middle-age weight gain a decade ago, I have always been interested in healthy living.

My early interest in health also led me to develop an interest in complementary and alternative medicine (CAM). In fact, CAM was part of what led me to a medical career, as my initial interest in computers starting in high school waned while the anti-establishment appeal of CAM attracted me as a college student in the late 1970s. Even my attraction to evidence-based medicine (EBM) comes from an adage that appealed to me from the CAM world, which was, "Let truth be your authority, not authority be your truth."

I had a resurgence of interest in the 1990s when the National Institutes of Health (NIH) ramped up its National Center for Complementary and Alternative Medicine (NCCAM) in an effort to bring some scientific rigor and objectivity to the study of CAM. I became involved in some CAM research and education activities at Oregon Health & Science University (OHSU).

Alas, I think it is fair to say that the evidence for CAM interventions still seems to be wanting. I recognize there are limits to EBM and its main tool, the randomized controlled trial (RCT), in assessing CAM therapies. But there is no reason why some CAM therapies should not show some success in RCTs. However, when put to objective evidence-based testing, most of the major CAM therapies do not hold up, including homeopathy [1], acupuncture [2], and antioxidant supplements [3]. While some may argue that taking vitamin supplements is not CAM, they too show no benefit in primary prevention of disease [4]. A number of science-based books have also reviewed the evidence base for CAM and explained research findings in lay terms [5, 6]. One of the most prolific science-based reviewers of CAM studies is Edzard Ernst MD, PhD, a physician who formerly practiced homeopathy and whose Web site is a running commentary on CAM studies and their interpretation.

There is also increasing criticism of the research funding allocated to NCCAM. There is concern not only that NCCAM studies are not justified by the underlying science, but also that few of the studies, especially RCTs, actually have their results published [7, 8]. This has led some prominent researchers to argue that we need science-based evidence more than evidence-based medicine, i.e., RCTs and other evaluative studies that are based on sound scientific underpinnings [9]. (Biological plausibility is one of the tenets of evidence-based medicine, but seems to get lost in the desire to satisfy the clamoring for studies of CAM.)

To their credit, advocates of CAM have always been among the loudest proponents of healthy living, although I have seen my share of exceptions in CAM practitioners who eat poor diets, smoke, or otherwise live unhealthfully. I see even more of the disconnection between use of CAM and healthy living in individuals, who somehow view CAM as insurance against disease from poor health habits.

Unlike most CAM, however, there is evidence to support the health benefits of true healthy living. This is very distinct from any health benefits of CAM. I do not view healthy living as some form of "alternative medicine." I also recognize that not all health problems are due to poor lifestyle choices. While I am confident that my healthy lifestyle will likely contribute to my better health and longevity, I know there are plenty of medical conditions that have little to do with lifestyle.

So in contrast to CAM, there is very good evidence on a number of fronts that healthy living is associated with better health and longevity. Last year the American College of Cardiology/American Heart Association published a comprehensive guideline to lifestyle interventions for reducing risk of cardiovascular disease [10]. The underlying systematic review exhaustively identifies the evidence supporting diets emphasizing fruits, vegetables, whole grains, low-fat dairy products, lean meats, nontropical vegetable oils and nuts, and legumes, while limiting sweets, sugary beverages, and red meat [11]. The systematic review also finds evidence for limiting saturated and trans fats as well as sodium. It also finds benefit for moderate-to-vigorous intensity exercise 3-4 times per week lasting 40 minutes per session.

Although the evidence for healthy living in enhancing longevity and reducing disease is strong, it is not ironclad, as it is very difficult to perform controlled trials of healthy living. Therefore a good deal of the evidence comes from large-scale observational studies. But this evidence is solid, including an aptly titled study from Europe, Healthy Living is the Best Revenge [12], and a recent analysis showing that running, an activity I enjoy, reduces all-cause and cardiovascular mortality risk [13]. The latter reaffirms the U-shaped curve showing the most benefit for moderate amounts of running akin to the level I do, i.e., between 9 and 19 miles per week.

Another adage that resonates with me is that while a simple healthy lifestyle is beneficial, there is little evidence-based (i.e., coming from RCTs and other strong evidence) data for many foods and supplements that are dubbed "miracles." I agree with Katz, who has written a book extolling the virtues of simple healthy living via the diet and exercise regimens supported by the evidence, along with avoidance of smoking [14]. (His advice makes me think, without evidence to support it, that there are diminishing returns from more and more devotion to minutiae of good diet, and that the simple basics probably get you most of the way toward the best health returns.) In addition, like many, I got a kick out of the recent grilling of Dr. Mehmet Oz in the US Senate [15].

The scientific evidence clearly supports the benefits of healthy living, while such evidence is for the most part lacking for alternative medicine. It is important to distinguish these two, and also to remember that even the healthiest lifestyle will not prevent all disease. For these reasons, there is still a role for conventional medicine and the research underlying it. I will personally continue to live healthfully, even though I know this will not provide immunity from all illness.


1. Ernst, E (2010). Homeopathy: what does the “best” evidence tell us? Medical Journal of Australia. 192: 458-460.
2. Madsen, MV, Gøtzsche, PC, et al. (2009). Acupuncture treatment for pain: systematic review of randomised clinical trials with acupuncture, placebo acupuncture, and no acupuncture groups. British Medical Journal. 338: a3115. http://www.bmj.com/content/338/bmj.a3115.
3. Bjelakovic, G, Nikolova, D, et al. (2013). Antioxidant supplements to prevent mortality. Journal of the American Medical Association. 310: 1178-1179.
4. Fortmann, SP, Burda, BU, et al. (2013). Vitamin and mineral supplements in the primary prevention of cardiovascular disease and cancer: An updated systematic evidence review for the U.S. Preventive Services Task Force. Annals of Internal Medicine. 159: 824-834.
5. Offit, PA (2013). Do You Believe in Magic?: The Sense and Nonsense of Alternative Medicine. New York, NY, Harper.
6. Singh, S and Ernst, E (2009). Trick or Treatment?: Alternative Medicine on Trial. London, England, Corgi.
7. Atwood, KC (2013). The Ongoing Problem with the National Center for Complementary and Alternative Medicine. Skeptical Inquirer, September / October 2013. http://www.csicop.org/si/show/ongoing_problem_with_the_national_center.
8. Mielczarek, EV and Engler, BD (2014). Selling Pseudoscience: A Rent in the Fabric of American Medicine: A Study of Federal Funding Advancing Naturopathy, Acupuncture, Chiropractic, and Energy Healing as Acceptable Medical Protocols Finds Troubling Misuse of Taxpayer Dollars. Skeptical Inquirer, May/June, 2014.
9. Gorski, DH and Novella, SP (2014). Clinical trials of integrative medicine: testing whether magic works? Trends in Molecular Medicine. Epub ahead of print.
10. Eckel, RH, Jakicic, JM, et al. (2014). 2013 AHA/ACC Guideline on Lifestyle Management to Reduce Cardiovascular Risk. Circulation. 129: S76-S99.
11. Eckel, RH, Jakicic, JM, et al. (2013). 2013 Report on Lifestyle Management to Reduce Cardiovascular Risk: Full Work Group Report Supplement. Circulation. 129: Supplement. http://circ.ahajournals.org/content/suppl/2013/11/07/01.cir.0000437740.48606.d1.DC1/Lifestyle_Full_Work_Group_Report.docx.
12. Ford, ES, Bergmann, MM, et al. (2009). Healthy living is the best revenge: findings from the European Prospective Investigation Into Cancer and Nutrition-Potsdam study. Archives of Internal Medicine. 169: 1355-1362.
13. Lee, DC, Pate, RR, et al. (2014). Leisure-time running reduces all-cause and cardiovascular mortality risk. Journal of the American College of Cardiology. 64: 472-481.
14. Katz, DL (2013). Disease-Proof: The Remarkable Truth About What Makes Us Well. New York, NY, Hudson Street Press.
15. Haiken, M (2014). Dr. Oz's 10 Most Controversial Weight Loss Supplements. Forbes Magazine, June 18, 2014. http://www.forbes.com/sites/melaniehaiken/2014/06/18/dr-oz-senate-scolding-his-10-most-controversial-weight-loss-supplements/.