Thursday, January 12, 2017

What is the Right Approach to Sharing Clinical Research Data?

While many people and organizations have long called for data from randomized clinical trials (RCTs) and other clinical research to be shared with other researchers for re-analysis and other re-use, the impetus for it accelerated about a year ago with two publications. One was a call by the International Committee of Medical Journal Editors (ICMJE) for de-identified data from RCTs to be shared as a condition of publication [1]. The other was the publication of an editorial in the New England Journal of Medicine wondering whether those who do secondary analysis of such data were “research parasites” [2]. The latter set off a flurry of debate across the spectrum, e.g., [3], from those who argued that primary researchers labored hard to devise experiments and collect their data, thus having claim to control over it, to those who argued that since most research is government-funded, the taxpayers deserve to have access to that data. (Some of those in the latter group proudly adopted the “research parasite” tag.)

Many groups and initiatives have advocated for the potential value of wider re-use of data from clinical research. The cancer genomics community has long seen the value of a data commons to facilitate sharing among researchers [4]. Recent US federal research initiatives, such as the Precision Medicine Initiative [5] and the 21st Century Cures program [6], envision an important role for large repositories of data to accompany patients in cutting-edge research. There are a number of large-scale efforts in clinical data collection that are beginning to accumulate substantial amounts of data, such as the National Patient-Centered Clinical Research Network (PCORNet) and the Observational Health Data Sciences and Informatics (OHDSI) initiative.

As with many contentious debates, there are valid points on both sides. The case for requiring publication of data is strong. As most research is taxpayer-funded, it only seems fair that those who paid are entitled to all the data for which they paid. Likewise, all of the subjects were real people who potentially took risks to participate in the research, and their data should be used for discovery of knowledge to the fullest extent possible. And finally, new discoveries may emerge from re-analysis of data. This was actually the case that prompted the Longo “research parasites” editorial, which was praising the “right way” to do secondary analysis, including working with the original researchers. The paper that the editorial described had discovered that the lack of expression of a gene (CDX2) was associated with benefit from adjuvant chemotherapy [7].

Some researchers, however, are pushing back. They argue that those who carry out the work of designing, implementing, and evaluating experiments certainly have some exclusive rights to the data generated by their work. Some also question whether the cost is a good expenditure of limited research dollars, especially since the demand for such data sets may be modest and the benefit is not clear. One group of 282 researchers in 33 countries, the International Consortium of Investigators for Fairness in Trial Data Sharing, notes that there are risks, such as misleading or inaccurate analyses as well as efforts aimed at discrediting or undermining the original research [8]. They also express concern about the costs, given that there are over 27,000 RCTs performed each year. As such, this group calls for an embargo on reuse of data for two years plus another half-year for each year of the length of the RCT. Even those who support data sharing point out the requirement for proper curation, wide availability to all researchers, and appropriate credit to and involvement of those who originally obtained the data [9].
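The embargo rule the consortium proposes is simple arithmetic, and a few lines of Python make it concrete. This is only a sketch of the formula as described above (two years plus half a year per year of trial length); the function name is my own, and I am assuming the trial length is measured in years and that the embargo clock starts when the data would otherwise become available.

```python
def embargo_years(trial_length_years: float) -> float:
    """Proposed embargo: 2 years plus 0.5 year for each year the RCT ran.

    The function name and units are illustrative assumptions, not part of
    the consortium's proposal.
    """
    if trial_length_years < 0:
        raise ValueError("trial length cannot be negative")
    return 2.0 + 0.5 * trial_length_years


# Show the proposed embargo for a few hypothetical trial lengths.
for length in (1, 3, 5):
    print(f"{length}-year trial: {embargo_years(length)} years of embargo")
```

Under this rule, a hypothetical five-year trial would have its data embargoed for four and a half years, which gives a sense of how long secondary analysts might wait.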

There are a number of challenges to more widespread dissemination of RCT data for re-use. A number of pharmaceutical companies have begun making such data available over the last few years. Their experience has shown that the costs are not insignificant (estimated to be about $30,000-$50,000 per RCT) and a scientific review process is essential [10]. Another analysis found that the time to re-analyze data sets can be long, and so far the number of publications has been small [11]. An additional study found that identifiable data sets were only explicitly visible from 12% of all clinical research funded by the National Institutes of Health in 2011 [12]. This means that from 2011 alone, there are possibly more than 200,000 data sets that could be made publicly available, indicating some type of prioritization might be required.

There are also a number of informatics-related issues to be addressed. These not only include adherence to standards and interoperability [13], but also attention to workflows, integration with other data, such as that from electronic health records (EHRs), and consumer/patient engagement [14]. Clearly the trialists who generate the data must be given incentives for their data to be re-used [15]. My own work assessing the caveats of re-using EHR data is somewhat applicable here too, in that even RCT data may not have the breadth of data or cover sufficient periods of time for additional analyses [16].

There is definitely great potential for re-use of RCT and other clinical research data to advance research and ultimately health and clinical care for the population. However, it must be done in ways that represent an appropriate use of resources and result in data that truly advances research, clinical care, and ultimately individual health.

References
1. Taichman, DB, Backus, J, et al. (2016). Sharing clinical trial data: a proposal from the International Committee of Medical Journal Editors. New England Journal of Medicine. 374: 384-386.
2. Longo, DL and Drazen, JM (2016). Data sharing. New England Journal of Medicine. 374: 276-277.
3. Berger, B, Gaasterland, T, et al. (2016). ISCB’s initial reaction to The New England Journal of Medicine Editorial on data sharing. PLoS Computational Biology. 12(3): e1004816. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004816.
4. Grossman, RL, Heath, AP, et al. (2016). Toward a shared vision for cancer genomic data. New England Journal of Medicine. 375: 1109-1112.
5. Collins, FS and Varmus, H (2015). A new initiative on precision medicine. New England Journal of Medicine. 372: 793-795.
6. Kesselheim, AS and Avorn, J (2017). New "21st Century Cures" legislation: speed and ease vs science. Journal of the American Medical Association. Epub ahead of print.
7. Dalerba, P, Sahoo, D, et al. (2016). CDX2 as a prognostic biomarker in stage II and stage III colon cancer. New England Journal of Medicine. 374: 211-222.
8. Anonymous (2016). Toward fairness in data sharing. New England Journal of Medicine. 375: 405-407.
9. Merson, L, Gaye, O, et al. (2016). Avoiding data dumpsters — toward equitable and useful data sharing. New England Journal of Medicine. 374: 2414-2415.
10. Rockhold, F, Nisen, P, et al. (2016). Data sharing at a crossroads. New England Journal of Medicine. 375: 1115-1117.
11. Strom, BL, Buyse, ME, et al. (2016). Data sharing — is the juice worth the squeeze? New England Journal of Medicine. 375: 1608-1609.
12. Read, KB, Sheehan, JR, et al. (2015). Sizing the problem of improving discovery and access to NIH-funded data: a preliminary study. PLoS ONE. 10(7): e0132735. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0132735.
13. Kush, R and Goldman, M (2014). Fostering responsible data sharing through standards. New England Journal of Medicine. 370: 2163-2165.
14. Tenenbaum, JD, Avillach, P, et al. (2016). An informatics research agenda to support precision medicine: seven key areas. Journal of the American Medical Informatics Association. 23: 791-795.
15. Lo, B and DeMets, DL (2016). Incentives for clinical trialists to share data. New England Journal of Medicine. 375: 1112-1115.
16. Hersh, WR, Weiner, MG, et al. (2013). Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care. 51(Suppl 3): S30-S37.

Friday, December 30, 2016

A Different Annual Reflection For This Past Year

Every year since the inception of this blog, my last posting of the year has been a reflection looking back over the year that is ending. This year’s reflection marks the completion of eight years of this blog, and writing this year’s posting feels different. This is no doubt because this blog has been very much tied into the key events of informatics over the last decade, in particular the Health Information Technology for Economic and Clinical Health (HITECH) Act and other actions emanating from the Presidency of Barack Obama. This has been a period of activist government with respect to our field, and now the US electorate (at least according to the rules of the Electoral College) has chosen a different path going forward.

Fortunately, the need for informatics is not going away. Even if the Affordable Care Act is repealed, the underlying problems in healthcare that led to its passage are still a challenge. Healthcare in the US is still the most fragmented, expensive, and inefficient of any country in the world. This does not mean I would want to get seriously ill anywhere else in the world, but I still believe there is also an ethical imperative to provide basic healthcare to all citizens in the least costly manner. Medicine is supposed to be a calling for physicians, and not just a job. Although I no longer care for patients directly, I view my work as a physician-informatician to support the delivery of more universal and efficient care by supporting the data, information, and knowledge needs of healthcare delivery and patients.

Informatics also supports other aspects of health that will also continue to be important even if reform of the US healthcare delivery system takes different directions. Informatics should support the health of the population through public health. It can support expansion of our knowledge and best practices by enhancing basic, clinical, and translational research. It can extend the reach of healthcare through telehealth and telemedicine. And because the US is still a prosperous nation to whom many still look for leadership, we can share our knowledge and tools for better health and healthcare with our fellow planetary citizens around the world, especially clinical and informatics professionals.

As for the blog itself, it continues to thrive. I am always gratified when people tell me they find it a valuable source of information, especially for key topics in the application of informatics as well as for issues for people seeking to start or advance careers in the field. The number of page views continues to increase, and in this last month, the total barreled through the 400,000 mark for the 267 posts (including this one) I have made over the eight years. I have no plans to change anything with my approach to the blog any time soon.

There is no question that for people who work in academia, in research, and in health IT, there is uncertainty as to the future. Nonetheless, I am grateful that I have a loving family, wonderful colleagues, and a great many other friends who bring happiness and stability to my life.

Wednesday, December 28, 2016

Benchmarks to Assess the New President

According to the rules of US elections, Donald Trump won the Presidency and Republicans control the Senate and House of Representatives. I respect that.

This does not mean, however, that Trump and his party have any sort of mandate. Not only did Trump lose the popular vote by about 2.8 million votes (48.2%-46.2%), but he also won three large states (Pennsylvania, Michigan, and Wisconsin) by a combined total of only 77,000 votes; had those states gone the other way, the election would have swung to Hillary Clinton. And while there was a narrow overall majority of votes for House Republican candidates this year, in recent years there has been a majority of votes for Democratic candidates despite a hefty majority of seats filled by Republicans, owing to gerrymandering. There were also more popular votes for Democratic Senate candidates this year, although this was somewhat anomalous due to the California Senate race being a run-off between two Democratic candidates.

There is no question that this election was tilted to Mr. Trump by the growing number of mostly white working class people who have been left behind by both economic and social changes in our society. His election was also aided by an unknown amount from Russian hacking, fake news, and the questionable decision by the FBI to raise its investigation of the email issue in the last weeks of the campaign. This was a candidate who set new records for fact-checkers disputing his statements and had a large following of people who believed those falsehoods.

As such, the outcome of this election is anything but a mandate for Donald Trump. Yes, he did obtain more electoral votes than Hillary Clinton, but his victory was extremely narrow, and he and other Republicans need to be careful of overreach. This is especially true since it is not clear that Mr. Trump really stands for the kind of people and their views that he is installing in his political leadership. (It is often not clear what he stands for at all, since his governing philosophy is not very detailed or consistent.)

But the new Republican majority may find it hard to improve upon the economic situation they have been handed. The US economy certainly still has a number of problems, especially income inequality and technology that is changing the nature of work, especially manual work. By most measures, however, the US economy is actually doing well. We finish the year, and President Obama’s second term, with strong economic growth (Gross Domestic Product [GDP] up at a 3.5% annual rate last quarter and being positive most of his second term), low unemployment (currently 4.6%, nearly full employment), low inflation, and a booming stock market (Dow Jones Industrial Average closing in on 20,000). Gas prices are low and the proportion of people lacking health insurance is lower than it has been in decades.

I believe an important task is to hold President Trump accountable. We will want to see how he adheres to his conflicting campaign pledges and the results of those policies when they are implemented. This includes promises to massively slash taxes, increase defense and infrastructure spending, make no cuts to Medicare or Social Security, build a border wall and deport 11 million people, renegotiate trade deals and implement tariffs if necessary, and come up with "something better" as the Affordable Care Act is repealed. While I disagree with many of these actions, it will be important to see whether Mr. Trump carries them out, and if he does, what their impact is.

Even though a good deal of what Mr. Trump says bothers many of us, I believe it will be more important to look at his actions. I hope he will especially be held accountable by those who are not conservative ideologues, such as workers who have been displaced from coal-mining and manufacturing jobs and those who don’t believe that the new health insurance they have received through the Affordable Care Act will be taken away. I also hope the impact of his policies on the environment, including climate change, will be objectively measured. And, of course, we will need an objective assessment of a foreign policy administered via Twitter.

While I believe Mr. Trump should be judged more for his actions and their outcomes, I don't think he should be let off the hook for his words either. This includes all the vitriol he spread through the years of President Obama, from stoking the fires of the birther movement to making false statements on the economy. Despite attempts to "unify" the electorate after a divisive election, we cannot forget Mr. Trump's insults and lies about individuals and groups, from women to Muslims to Mexicans. I still shake my head in amazement when people are asked to not take everything Trump said during the campaign literally, or are told that it is legitimate to enter some sort of "post-truth" era, or that a good proportion of his

In the end, a President is not responsible for everything that happens on his or her watch. But for a narcissistic individual who takes credit for things that go right, even when that credit is not deserved, we should also hold him or her to objective measures of performance. While Mr. Trump has mastered the neutering of the press through social media and other means, I hope that responsible journalism will rise to the task and objectively report the impact of the words and policies that emanate from his Presidency and his political party.

Thursday, December 8, 2016

Coping With Adversarial Information Retrieval in Modern Times

When I first chose my area of research focus in my postdoctoral fellowship in biomedical informatics in the late 1980s, I was intrigued by information retrieval (IR; also known as search). While most in informatics were still focused on artificial intelligence and expert systems, I was fascinated by the notion that computers could provide information in response to users entering text. At that time, of course, there were only modest amounts of information to retrieve. The main source was bibliographic databases such as MEDLINE. While the full text of journals and even some textbooks was starting to become available, it was mostly text and not figures or images.

The world of search started to change with the advent of the World Wide Web in the early 1990s. I had actually been skeptical that the Web could even deliver more than text in real-time, given how slow the Internet was at that time. This was also a time when my colleagues at Oregon Health & Science University (OHSU) started putting on continuing medical education (CME) courses for physicians about the growing amount of information available (including via CD-ROM drives). But when we taught about searching the Web, we presented many caveats, especially because there was no control over the quality of information [1].

A related happening about this same time was the growth of spam email [2]. In the 1980s and even into the early 1990s, the only real users of Internet email were academics and techies. But as the Web and underlying Internet spread to broader populations, so did spam email, especially because it was so easy to reach massive numbers of people.

These developments all gave rise to the notion of “adversarial” IR, something that was initially difficult to fathom when we were trying to develop the most effective methods to provide access to the highest quality information available [3]. But as content emerged that we hoped users would not retrieve, there started an additional focus in IR that considered ways to avoid providing users the worst information.

One advance that improved the ability of Web searching to retrieve high-quality material was Google and its PageRank algorithm [4]. A major change pioneered by Google was to rank results based not on measures of similarity between words in the query and page, at the time considered to be our best approach, but instead by how many other pages pointed to them. While not perfect, the number of links to a page is indeed associated with its quality, e.g., more pages will point to those from the National Library of Medicine or Mayo Clinic than to a less credible site.
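For readers who want to see the link-based idea in action, here is a minimal sketch of a PageRank-style calculation in Python. It is a simplified textbook power iteration, not Google's production algorithm: the toy link graph, the page names, and the function name are all hypothetical, and the damping factor of 0.85 is the value suggested in the original Brin and Page paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to. This is an
    illustrative sketch, not the algorithm as deployed by any search engine.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy Web: most pages point to "nlm"; nothing points to "dubious".
toy_web = {
    "nlm": ["mayo"],
    "mayo": ["nlm"],
    "blog_a": ["nlm", "mayo"],
    "blog_b": ["nlm"],
    "dubious": ["blog_a"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: {ranks[page]:.3f}")
```

In this toy example the heavily linked page ends up ranked highest and the page nobody links to ends up near the bottom, even though no words on any page were ever compared with a query.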

Of course, this situation resulted in a number of other consequences, not the least of which was the emergence of search engine optimization (SEO), enabling people to fight against PageRank and related algorithms [5]. It also set off a tit-for-tat battle of search engine sites hiring armies of engineers to figure out how people were trying to game their systems [6]. In more recent years, the emergence of new information streams, most notably the Facebook newsfeed, has provided new opportunities and led to the proliferation of “fake news,” which some credit with affecting the recent US presidential election [7].

While technology will play some role in solving the adversarial IR problem, it will not succeed by itself. Clever programmers and others will likely always find ways to exploit approaches to limiting the spread of false or incorrect information. The sheer volume of such information makes human intervention an unlikely solution, and of course one person’s high quality information is another person’s trash heap.

The main way to solve the problem, however, is through education. It is all part of the basic modern information literacy everyone must have in the 21st century. Just as I have argued that statistics should be a topic taught in high school if not earlier, so should modern information literacy, including that related to health. While there will always be shades of gray in terms of information quality, people can and should be taught how to recognize that which is flagrantly false.

I hope we will learn from fake news, newer variants of spam email such as phishing, and other risks of the Internet era that we must train society to better understand our new information ecosystem, and how we can benefit from its value while minimizing its risk.

References

1. Hersh, WR, Gorman, PN, et al. (1998). Applicability and quality of information for answering clinical questions on the Web. Journal of the American Medical Association. 280: 1307-1308.
2. Goodman, J, Cormack, GV, et al. (2007). Spam and the ongoing battle for the inbox. Communications of the ACM. 50(2): 25-33.
3. Castillo, C and Davison, BD (2011). Adversarial Web Search. Delft, Netherlands, now Publishers.
4. Brin, S and Page, L (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems. 30: 107-117. http://infolab.stanford.edu/pub/papers/google.pdf.
5. Anonymous (2015). The Beginner's Guide to SEO. Seattle, WA, SEOmoz. http://moz.com/beginners-guide-to-seo.
6. Singhal, A (2004). Challenges in Running a Commercial Web Search Engine. Mountain View, CA, Google. http://www.research.ibm.com/haifa/Workshops/searchandcollaboration2004/papers/haifa.pdf.
7. Davis, W (2016). Fake Or Real? How To Self-Check The News And Get The Facts. Washington, DC, National Public Radio. http://www.npr.org/sections/alltechconsidered/2016/12/05/503581220/fake-or-real-how-to-self-check-the-news-and-get-the-facts.

Tuesday, November 29, 2016

Kudos for the Informatics Professor - Fall 2016 Edition

The year 2016 has been a busy but fun year of personal achievements. Many of the notable accomplishments involved giving talks, both in person and online, and around the country and the world. However, I also had a number of other achievements.

A few months ago I posted about talks during the summer of 2016. The fall of 2016 was equally busy. As I noted at the end of the summer posting, I was slated to give two talks in September. The first was the opening talk at the National Library of Medicine (NLM) Georgia Biomedical Informatics Course entitled, What is Biomedical Informatics? This talk was an updated version from the previous offering in this course delivered in April, 2016. In September, I also provided an online lecture in the National Institutes of Health BD2K Guide to the Fundamentals of Data Science Series entitled, Data Indexing and Retrieval.

In October, I had the opportunity to visit the world-renowned Geisinger Health System, where I met with a number of individuals who have taken courses of mine, both my 10x10 ("ten by ten") course as well as physicians in the new Clinical Informatics Fellowship who are taking online courses in the OHSU Biomedical Informatics Graduate Program. I also presented Grand Rounds on the topic of competencies in clinical informatics required of 21st-century clinicians and informaticians.

Also in October was the 25th Anniversary Celebration of the Biomedical Information Communication Center (BICC) at OHSU. The speakers at the event included the current and long-time former Directors of the NLM. I provided an overview talk about the OHSU Department of Medical Informatics & Clinical Epidemiology (DMICE) and presented a poster on all of the collaboration that DMICE does at OHSU.

I started November with a talk at the OHSU Informatics Research Conference on Challenge Evaluations in Biomedical Information Retrieval, which was a preparation talk for another 25th anniversary talk to be mentioned in a moment.

In mid-November I was busy at the AMIA Annual Symposium, first leading a workshop on Evidence-Based Informatics at the Clinical Informatics Fellows’ Retreat that took place at my alma mater, the University of Illinois College of Medicine. Next I provided a talk at the AMIA Annual Symposium Learning Showcase entitled, The Full Spectrum Biomedical and Health Informatics Education at Oregon Health & Science University.

My final talk of the fall was at the Celebrating 25 Years of TREC Conference at the National Institute of Standards and Technology (NIST) in Gaithersburg, MD. My talk, The TREC Bio/Medical Tracks, described the various tracks in the biomedical domain at TREC over the years. A video of the talk is in Part 3 (starting around the 50-minute mark) of the Webcast archive page for the meeting.

What other accomplishments did I have this past fall? One was teaching my introductory biomedical and health informatics course to a group of clinical and IT leaders from Bangkok Dusit Medical Services (BDMS), a network of hospitals in Thailand and a few in nearby countries. OHSU has an ongoing collaboration with BDMS in many areas, including informatics. This offering of the course had the usual recorded lectures and discussion forums, but added other activities, including interactive videoconferences and in-person sessions in both Bangkok and Portland. One of the participants in the course, Dr. Somsak Wankijcharoen, created a video of the experience.

Saturday, November 12, 2016

ABPM Extends “Grandfathering” Period for Clinical Informatics Physician Subspecialty Through 2022

The single most-viewed entry in the history of this blog is a posting from 2013 describing eligibility for the clinical informatics subspecialty for physicians. This was partly due to my wanting to have a standard reply for the frequent emails I received at the time from individuals asking if they would be eligible to sit for the board exam during the "grandfathering" period. I would also mention to them my single most important piece of advice, which was to try, if possible, to get certified before 2018, after which they would need to complete an Accreditation Council for Graduate Medical Education (ACGME)-accredited fellowship.

Earlier this month, however, the American Board of Preventive Medicine (ABPM) extended the period that allows physicians to be eligible for board certification in the clinical informatics subspecialty by five years, through 2022. This means that the grandfathering (and "grandmothering" for my female colleagues!) period can be used to achieve board eligibility through 2022.

One new issue is how this will impact the growing number of ACGME-accredited fellowships, such as the one we offer at Oregon Health & Science University (OHSU). I still believe those fellowships will be the gold standard for early-career physicians to receive the best training in clinical informatics. But other physicians wanting to enter the field who cannot relocate jobs or families will still be able to pursue other options, one of which is master's degree programs such as our program at OHSU.

The official eligibility statement for the subspecialty is otherwise unchanged from the beginning of the grandfathering period and is documented on the ABPM Web site. The first three eligibility requirements are:
  1. Primary certification by one of the 23 member boards of the American Board of Medical Specialties (ABMS)
  2. Graduation from a US, Canadian, or other medical school deemed acceptable by the ABPM
  3. Unrestricted license to practice medicine in the US or Canada
The fourth requirement is the "pathway" by which one is eligible during the grandfathering era. There are two pathways for eligibility, one of which must be completed to be eligible to take the certification exam under the grandfathering criteria.

The first of the two pathways is the "practice pathway." Those who have been working in informatics professionally for at least 25% time during any three of the previous five years, and can have a supervisory individual attest to it, are eligible for this pathway. "Working" in informatics not only includes "practice" (i.e., being a Chief Medical Information Officer or other clinical informatics professional or leader), but also teaching and research.

The second pathway is the "non-traditional fellowship," which is any informatics fellowship of 24 or more months duration deemed acceptable by ABPM. At a 2012 panel at the American Medical Informatics Association (AMIA) Annual Symposium, Dr. William Greaves of ABPM stated this would be composed of informatics educational programs that were listed in the proposal submitted to ABPM by AMIA in 2009. This list, which has never been made public by ABPM, included programs that were funded by training grants from the National Library of Medicine (NLM) or were members of the AMIA Academic Forum at the time the proposal was submitted by AMIA to ABMS in 2009. (I can say that OHSU was definitely on the list, since we were both NLM-funded and a member of the Academic Forum at that time and still are both.) Dr. Greaves also said that ABPM would review applicants trained in other fellowships for eligibility on a case-by-case basis.

The ABPM eligibility criteria also state that time spent in training in informatics can be applied to the practice pathway at one-half the value of practice time. In other words, someone in an educational program for at least 50% time during the previous five years would be eligible to take the certification exam. My interpretation of this is that someone in a master's degree program that involves the equivalent of one and a half years of full-time study would thus be eligible. This has indeed been the case, i.e., those completing the Master of Biomedical Informatics (MBI) Program at OHSU have been deemed eligible, presumably since it requires six academic quarters of full-time study. The OHSU Graduate Certificate Program, on the other hand, which is a subset of the MBI requiring about nine months of study if done full-time, has not on its own been enough. Some applicants have been able to mix and match to achieve eligibility, i.e., with some practice time combined with some education.
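The arithmetic behind my interpretation can be sketched in a few lines of Python. To be clear, this is my own simplification and not an ABPM tool or an official rule: I am assuming the practice pathway threshold works out to 0.75 full-time-equivalent (FTE) years (25% time for three years), that training counts at half value, and that the "any three of the previous five years" window and the supervisor attestation are ignored. The function and constant names are hypothetical.

```python
# Assumed threshold: 25% time for three years = 0.75 FTE-years of practice.
REQUIRED_PRACTICE_FTE_YEARS = 0.25 * 3
# Training is credited at one-half the value of practice time.
TRAINING_CREDIT = 0.5


def practice_equivalent(practice_fte_years: float, training_fte_years: float) -> float:
    """Practice-equivalent FTE-years under the simplifying assumptions above."""
    return practice_fte_years + TRAINING_CREDIT * training_fte_years


def meets_threshold(practice_fte_years: float, training_fte_years: float) -> bool:
    """True if combined practice and training reach the assumed threshold."""
    return practice_equivalent(practice_fte_years, training_fte_years) >= REQUIRED_PRACTICE_FTE_YEARS


# Master's program: about 1.5 years of full-time study, no practice time.
print("Master's alone:", meets_threshold(0.0, 1.5))
# Graduate certificate: about nine months (0.75 year) of full-time study.
print("Certificate alone:", meets_threshold(0.0, 0.75))
# Mix and match: certificate plus some practice time (0.4 FTE-years, hypothetical).
print("Certificate plus practice:", meets_threshold(0.4, 0.75))
```

Under these assumptions the master's degree just reaches the threshold, the certificate alone falls short, and the certificate combined with enough practice time gets over the line, which matches what applicants have experienced. Anyone with a borderline case should of course check with ABPM directly.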

It should be noted that another option for physicians who are not eligible for the board exam will be the Advanced Health Informatics Certification being developed by AMIA. This certification will be available to all clinician practitioners of informatics trained at the master's level and higher. It will also provide a pathway for physicians who are not eligible for the board certification pathway.

Overall, I am pleased with this development, although it still presents problems for physicians in the future who will want to transition into informatics in the middle of their careers. But since that day of reckoning has now been put off another five years, I guess we can cross that proverbial bridge when we come to it in the early part of the next decade.

Sunday, October 23, 2016

Biomedical Big Data Science Open Educational Resources (OERs) Released; Feedback Sought

For the last couple of years, faculty from the Oregon Health & Science University (OHSU) Department of Medical Informatics & Clinical Epidemiology (DMICE) and Library have been developing open educational resources (OERs) in the area of Biomedical Big Data Science. Funded by a grant from the National Institutes of Health (NIH) Big Data to Knowledge (BD2K) Program, OERs have been produced that can be downloaded, used, and repurposed for a variety of educational audiences by both learners and educators.

Development of the OERs is an ongoing process, but we have reached the point where a critical mass of the content is being made available for use and to obtain feedback. The image below shows the home page for the Web site.


The OERs are intended to be flexible and customizable and we encourage others to use or repurpose these materials for training, workshops and professional development or for dissemination to instructors in various fields. They can be used as "out of the box" courses for students, or as materials for educators to use in courses, training programs, and other learning activities. We ultimately aim to create 32 modules on the following topics:
  1. Biomedical Big Data Science
  2. Introduction to Big Data in Biology and Medicine
  3. Ethical Issues in Use of Big Data
  4. Clinical Standards Related to Big Data
  5. Basic Research Data Standards
  6. Public Health and Big Data
  7. Team Science
  8. Secondary Use (Reuse) of Clinical Data
  9. Publication and Peer Review
  10. Information Retrieval
  11. Version Control and Identifiers
  12. Data Annotation and Curation
  13. Data Tools and Landscape
  14. Ontologies 101
  15. Data Metadata and Provenance
  16. Semantic Data Interoperability
  17. Choice of Algorithms and Algorithm Dynamics
  18. Visualization and Interpretation
  19. Replication, Validation and the Spectrum of Reproducibility
  20. Regulatory Issues in Big Data for Genomics and Health
  21. Hosting Data Dissemination and Data Stewardship Workshops
  22. Guidelines for Reporting, Publications, and Data Sharing
  23. Terminology of Biomedical, Clinical, and Translational Research
  24. Computing Concepts for Big Data
  25. Data Modeling
  26. Semantic Web Data
  27. Context-based Selection of Data
  28. Translating the Question
  29. Implications of Provenance and Pre-processing
  30. Data Tells a Story
  31. Statistical Significance, P-hacking and Multiple-testing
  32. Displaying Confidence and Uncertainty
At the present time, 20 of the above modules are available for download and use. We are encouraging their use and seeking feedback from those who make use of them. The feedback will be used to improve the available modules and guide development of those not yet released.

We have also been developing mappings to research competencies in other areas, such as for the NIH Clinical and Translational Science Award (CTSA) consortium research competency requirements and the Medical Library Association professional competencies for health sciences librarians. To this end, we have been able to link these materials to existing efforts, and provide training opportunities for learners and educators working in these areas. We ultimately aim to complete this mapping across all of the BD2K training offerings, to align with other groups, avoid redundancy and to ensure we are meeting the needs of these various groups.

This project is actually one of several projects that have been funded by grants to develop and provide education in biomedical informatics and data science. The other projects include:
We hope that all of these materials are useful for many audiences and look forward to feedback enabling their improvement.

Thursday, October 13, 2016

What Should Be The Spectrum of Career Opportunities for Clinical Informatics Subspecialists?

A common reason given for the establishment of clinical informatics as a physician subspecialty is the recognition of the growing role of physicians who work in informatics professionally, particularly in operational clinical settings. Sometimes the subspecialty is viewed as almost synonymous with the Chief Medical Informatics Officer (CMIO) and related roles in healthcare provider organizations.

However, I prefer to think of the subspecialty more broadly. Even if the CMIO is the most common or most aspired-to position for clinical informatics subspecialists, we should still consider other career paths, especially for those who will increasingly be trained in formal fellowships. Just as physicians of other specialties may enter private practice, managed care settings, academia, and even industry, so should we view the breadth of options for those trained in clinical informatics. I certainly hope there will be pathways from clinical fellowships into academic careers for these physicians.

I was recently involved in a discussion on an email list where many CMIOs lamented that many of the questions on the clinical informatics subspecialty board exam did not seem pertinent to their day-to-day role as CMIOs. That led me to raise the question: do we view this subspecialty as primarily focused on the CMIO role, or should it cover broader aspects of clinical informatics? Not being a CMIO, and being in academia, my sentiments are with the broader view. But on the other hand, as the CMIO is a prominent position for those working in this field, and perhaps the most common one, it does deserve important consideration.

This discussion is highly relevant to those of us standing up ACGME-accredited clinical informatics fellowships. We certainly want our fellows to gain substantial operational experience. But I would advocate that they also learn the fundamentals of the informatics field, and believe that although a little dated since its creation in 2009, the Core Content outline covers it pretty well.

Most physicians in a specialty (e.g., internal medicine) do not use the entire spectrum of knowledge in their fields on a daily basis, yet they are still expected to learn it. I believe our clinical informatics fellowships should take the same approach and that the board exam should reflect comparable breadth. I do not believe there is anything in the Core Content outline that is completely superfluous to the practice of being a CMIO or other jobs applying clinical informatics.

The challenge, then, is how to create a fellowship program and board exam to reflect the broader field. Informatics has always had (and I am a product of) the research-oriented NLM fellowships. Even though focused on research, these fellowships have produced diverse outcomes, including some CMIOs. While the focus on clinical fellowships is somewhat different, there should be no reason why graduates of these fellowships should not be able to pursue careers in academia, research, industry, and other settings.

Monday, October 10, 2016

Apple Watch Series 2: Great Hardware, Software Needs Work

When the original Apple Watch came out, it was a non-starter for me. As one of my main uses of a smartwatch is for running, i.e., to track my runs and view them on a map, the lack of on-board GPS meant that the watch had to be tethered to an iPhone. While I do sometimes run with my iPhone, I might as well carry just my iPhone. In addition, I sometimes run in places where I am not able to use my iPhone, such as countries that do not have international data plans with my carrier, Verizon (admittedly increasingly rare).

I was therefore thrilled to read the announcement of the Apple Watch Series 2, which would have standalone GPS and enable me to track runs without a phone.

I have been using my Apple Watch Series 2 for about a month now, and have some observations and hopes for improvement. If those improvements are made soon, I will add a postscript to this posting.

From a hardware standpoint, the watch is excellent. It is comfortable to wear and works seamlessly with my iPhone 6 (soon to be replaced with a 7 Plus). I have always been able to make it through a day (even with a run that consumes 20-30% of the battery life per hour of activity) without having to recharge it.

In addition to capturing my runs, I want to be able to view them on any device, including on a computer via a Web site, and also share them on Facebook. I want to be able to view all of the data, including the map of where I have run, as well as export it via standard formats such as GPX and TCX. For years I have used various Garmin fitness watches, and I appreciated the ease by which I could capture my run, display its data and map on the Garmin Connect Web site, and share it to Facebook and other digital places.

In terms of capturing the run, the Apple Watch Series 2 does great. I am actually impressed at how quickly it locks on the GPS satellites, and the accuracy seems to be equal to my previous Garmin watches.


But I am disappointed in its ability to export or display data. While the watch’s Workout app is simple and easy to use, and the Activity app on my iPhone makes it easy to display results, the data cannot be exported to other apps. I am also disappointed that the Activity app only runs on the iPhone, and therefore cannot be accessed by other hardware, including the iPad or a computer accessing a Web site.

I also dislike the sharing capabilities of the Activity app. When one tries to share the entire exercise activity to Facebook, all that is uploaded is an image from the app, and not the details of distance run, time, map, etc. One can share the map of the Activity, but that is not uploaded with any other data about the run, e.g., distance, time, etc.


Another disappointment is that other fitness apps do not (yet) allow capture of the watch GPS data. For example, while RunKeeper and MapMyRun have Apple Watch apps, they presently do not capture the watch’s GPS data when not tethered to the phone. The SpectraRun workouts app can access and export the run data but it presently does not export the GPS data into the TCX file it generates. I assume that updates to these non-Apple apps will eventually be able to access the GPS, and this might also solve the problem of Activity app data not being exportable.

Fortunately, all of these disappointments should be easily fixable in software, and I am hopeful that Apple and other developers will remedy them quickly. I have had some online dialog with one of the running app developers, and they assured me (and others) that they are trying to quickly update their watch apps to capture the GPS directly from the watch.

Tuesday, October 4, 2016

Update for a Standard Occupational Classification (SOC) Code for Informatics: Likely to Happen But Needing Revision

For years, many in the informatics field have lamented our invisibility when it comes to US government labor statistics. As I and others have been writing for years, there is no Standard Occupational Classification (SOC) code for those who work professionally in informatics [1]. As the SOC is updated by the Bureau of Labor Statistics about once a decade, I was pleased to be appointed to a group led by the Office of the National Coordinator for Health IT (ONC) to submit a proposed revision to the 2018 SOC to include a code for health informatics in July 2014.

Like many classifications, the SOC is organized hierarchically. Its hierarchy goes to a depth of four levels, with the levels called Major Group, Minor Group, Broad Group, and Detailed Occupation. Most healthcare occupations are in the Major Group 29-0000, which is subdivided into three Minor Groups, which are in turn broken down into Broad Groups and Detailed Occupations for many health professionals from physicians to phlebotomists. In the last (2010) SOC, there was only one Broad Group and Detailed Occupation pertaining to health IT, namely 29-2070 Medical Records and Health Information Technicians, which mainly referred to those with the Registered Health Information Technician (RHIT) certification from the health information management (HIM) field. The list below shows the three Minor Groups in the health professions and then more detail for the 29-2070/29-2071 code:
29-0000 Healthcare Practitioners and Technical Occupations
  29-1000 Health Diagnosing and Treating Practitioners
  29-2000 Health Technologists and Technicians
    29-2070 Medical Records and Health Information Technicians
      29-2071 Medical Records and Health Information Technicians
  29-9000 Other Healthcare Practitioners and Technical Occupations
Those from HIM with the Registered Health Information Administrator (RHIA) certification were among those included in the 11-9111 Medical and Health Services Managers category. (The Major Group 11-0000 covers Management Occupations.)
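The four-level hierarchy can be read off a code's trailing zeros, as the codes listed above suggest. The sketch below is a simplification under that assumption: the function names are hypothetical, the level names follow this post's wording, and a small number of SOC minor groups do not follow the three-zero pattern, so it should not be treated as a full implementation of the SOC coding rules.

```python
import re


def soc_level(code):
    """Classify a SOC code such as '29-2071' by its trailing zeros.

    Simplifying assumption: Major Groups end in 0000, Minor Groups in 000,
    Broad Groups in 0, and everything else is a Detailed Occupation.
    """
    if not re.fullmatch(r"\d{2}-\d{4}", code):
        raise ValueError(f"not a SOC-style code: {code!r}")
    tail = code[3:]
    if tail == "0000":
        return "Major Group"
    if tail.endswith("000"):
        return "Minor Group"
    if tail.endswith("0"):
        return "Broad Group"
    return "Detailed Occupation"


def soc_path(code):
    """Return the chain of codes from the Major Group down to `code`."""
    soc_level(code)  # validates the format
    candidates = [code[:2] + "-0000", code[:4] + "000", code[:6] + "0", code]
    path = []
    for candidate in candidates:
        if candidate not in path:  # skip repeats when code is not a leaf
            path.append(candidate)
    return path


for soc_code in soc_path("29-2071"):
    print(soc_code, soc_level(soc_code))
```

Run on 29-2071, the loop walks the same path shown in the list above, from 29-0000 down through 29-2000 and 29-2070.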

Earlier this year, the BLS released its first proposed revisions for the 2018 SOC for public comment. In particular, they released Docket Number 1-0148 -- Health Informatics Practitioners (Multiple), which included the following:
Multiple dockets requested new detailed occupations and improved coverage of occupations related to Health Information Technology such as Health Informatics Practitioners, Medical Records Specialists, and Medical Registrars. The SOCPC partially accepted these recommendations and proposed revising the title for 29-2071 Medical Records and Health Information Technicians to 29-2071 Medical Registrar and Records Specialists, adding Medical Bill Coder as an illustrative example, and adding "Includes medical coders" to the definition. The SOCPC also proposes a new broad and detailed occupations (29-9020 and 29-9021) for Health Information Technology, Health Information Management, and Health Informatics Specialists and Analysts. Finally, the SOCPC proposes adding illustrative examples to the existing 11-9111 Medical and Health Services Managers to include: Clinical Informatics Director, Health Information Services Manager, and Chief Medical Information Officer.

While I was pleased to see that our recommendation for the addition of a code for health informatics practitioners was accepted, others and I were disappointed that the code lumped together three distinct groups who work professionally with IT in healthcare, namely health informatics, health information management, and health IT. A number of leading health IT organizations support the view that these are distinct. I was pleased to have the opportunity to work with my colleagues from the American Medical Informatics Association (AMIA) and a number of other organizations to draft a letter endorsing the view that there should be three Detailed Occupation codes for these three areas.

In particular, the letter led by AMIA advocates modification to the final 2018 SOC that will be released in 2017 that will split the new 29-9021 code into three new Detailed Occupations defined as follows:
  • Health Informatics professionals: Design, develop, select, test, implement, and evaluate new or modified informatics solutions, data structures, and clinical decision support mechanisms to support patients, healthcare professionals, and improved usability of such systems for patient safety within healthcare contexts.
  • HIM professionals: Acquire, analyze, and protect digital and traditional medical information vital to the daily operations management of health information and electronic health records (EHRs).
  • Health IT professionals: Apply knowledge of healthcare and information systems to assist in the design, development, and continued modification of computerized health care systems.
The letter also addresses the illustrative examples that the SOCPC proposes adding to 11-9111 Medical and Health Services Managers: Clinical Informatics Director, Health Information Services Manager, and Chief Medical Information Officer. We suggest the addition of “Chief Nursing Informatics Officer” to this list to add further clarity. Experience among our constituencies indicates a proliferation of senior executives and other management-level job titles within and across these distinct occupations, all of which need to be captured under this detailed code.

I agree with AMIA and others that the occupations of health informatics, health information management, and health IT are each important yet unique within healthcare. Having them represented in the SOC separately will hopefully allow further delineation of the contributions each makes to advancing the use of information and technology in healthcare.

References

1. Hersh, W (2010). The health information technology workforce: estimations of demands and a framework for requirements. Applied Clinical Informatics. 1: 197-212.

Wednesday, September 7, 2016

Free Course in Healthcare Data Analytics Offered by OHSU

I am pleased to announce that the Department of Medical Informatics & Clinical Epidemiology (DMICE) of Oregon Health & Science University (OHSU) is offering a free continuing education course, Update in Health Information Technology: Healthcare Data Analytics, to physicians, nurses, other healthcare professionals, and health informatics/IT professionals. Registration is available at https://www.surveymonkey.com/r/onc-course.

This course is made freely available via a grant from the Office of the National Coordinator for Health IT (ONC) that I described in a previous posting last year. The grant requires us to have 1000 individuals complete the course by June 2017. The full updated ONC Health IT curriculum will also be made freely available in 2017.

Although the course is open to all healthcare professionals and health informatics/IT professionals, physicians will additionally be able to obtain continuing medical education (CME) credit through OHSU. For physicians certified in the new Clinical Informatics Subspecialty, Lifelong Learning and Self-Assessment (LLSA) credits towards American Board of Preventive Medicine (ABPM) Maintenance of Certification Part II (MOC-II) requirements for the subspecialty are also available.

The course consists of 14 modules that are estimated to take about 18 hours to complete. The course is completely online, and consists of lectures and self-assessment quizzes. References to further information are also provided. Those completing the entire course (viewing all of the lectures and completing the self-assessment quizzes) and evaluation form will receive a Certificate of Completion from OHSU. Physicians will be able to claim 18 credits of CME or (for those certified in Clinical Informatics) MOC-II. (We are not able to offer OHSU academic credit for the course.)

The course will be offered 6 times in overlapping two-month blocks starting in October 2016. Because of the anticipated large enrollment, the entire course will need to be completed during one block in order to receive the Certificate of Completion and CME/MOC-II credit. If the course is not completed during the block, participants can re-enroll in a later block. The course will only be offered for free through May 2017.

The first step in taking the course is registering at https://www.surveymonkey.com/r/onc-course. Each participant will be asked to provide some basic information, including name, employer, and email address. (All data will be kept confidential by OHSU, with the exception of confidential reporting to ONC.) After registration, participants will be sent login information to OHSU's Sakai Learning Management System. After completing all of the modules and the self-assessment quizzes, each participant will need to complete the evaluation form. He or she will then be sent via email a PDF Certificate of Completion. (Physicians will additionally be sent certifications for CME or MOC-II credit after completing additional evaluation information.)

Within the Sakai system, each module will provide an overview of learning objectives, one or more lecture segments (in MP4 format, viewable on both computers and mobile devices), optional additional materials, and a self-assessment quiz of 5-10 multiple-choice questions. (Those seeking CME or MOC-II credit must answer at least 70% of the questions correctly to pass; each quiz may be taken up to 5 times.) Sakai will also provide an interactive forum for those having questions or comments about the materials. Due to the anticipated large enrollment, we will encourage participants to interact and answer questions among themselves, with OHSU teaching assistants bringing in course faculty as needed.
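The quiz rule for CME or MOC-II credit (70% correct, up to 5 attempts) can be expressed as a small check. This is a hypothetical sketch of the rule as stated in this post, not how Sakai itself is configured; the function name and inputs are assumptions made for illustration.

```python
PASS_PERCENT = 70   # minimum percent correct, as stated in the post
MAX_ATTEMPTS = 5    # each quiz may be taken up to 5 times


def quiz_passed(correct_counts, num_questions):
    """Return True if any of the first MAX_ATTEMPTS attempts reaches 70%.

    correct_counts lists the number of correct answers per attempt, in the
    order the attempts were taken. Integer arithmetic avoids rounding issues.
    """
    for correct in correct_counts[:MAX_ATTEMPTS]:
        if correct * 100 >= PASS_PERCENT * num_questions:
            return True
    return False


# A 10-question quiz passed on the third try.
third_try = quiz_passed([5, 6, 7], num_questions=10)
# A 5-question quiz needs 4 correct, since 3 of 5 is only 60%.
three_of_five = quiz_passed([3], num_questions=5)
```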

The 14 modules of the course include the following:

  • General Health Care Data Analytics
  • Extracting and Working with Data
  • Population Health and the Application of Health IT
  • Applying Health IT to Improve Population Health at the Community Level
  • Identifying Risk and Segmenting Populations: Predictive Analytics for Population Health
  • Big Data, Interoperability, and Analytics for Population Health
  • Data Analytics in Clinical Settings
  • Risk Adjustment and Predictive Modeling
  • Overview of Interoperable Health IT
  • Standards for Interoperable Health IT
  • Implementing Health Interoperability
  • Ensuring the Security and Privacy of Information Shared
  • Secondary Use of Clinical Data
  • Machine Learning and Natural Language Processing

The OHSU course faculty include:

  • William Hersh, MD, Department of Medical Informatics & Clinical Epidemiology
  • Vishnu Mohan, MD, MBI, Department of Medical Informatics & Clinical Epidemiology
  • David Dorr, MD, MS, Department of Medical Informatics & Clinical Epidemiology
  • Peter Graven, PhD, Department of Emergency Medicine
  • Karen Eden, PhD, Department of Medical Informatics & Clinical Epidemiology

The MOC-II credit is important for the new subspecialty, with those who are board-certified needing to obtain a certain amount to re-certify in 10 years. The American Medical Informatics Association (AMIA) has already developed MOC-II activities, largely through its meetings, but will also have online offerings as it implements its learning management system. They will also offer MOC-IV credits in the future.

Sunday, September 4, 2016

Kudos for the Informatics Professor - Summer 2016 Edition

It has been a busy but enjoyable summer for me, with the opportunity to give invited talks at a number of international locations as well as at some international conferences closer to home. I also had some publications released and carried out a number of teaching activities.

My talks began with leading a roundtable discussion at the Society for Imaging Informatics in Medicine 2016 Conference in Portland, OR. The title of the roundtable was, Clinical Informatics Certification for Physicians & Non-Physicians, and I provided a history and overview, and led a discussion of future directions, for the new clinical informatics subspecialty for physicians.

Later in July, I ventured to Pisa, Italy, where I gave the Keynote Talk at the Medical Information Retrieval (MedIR) Workshop, which was part of the ACM SIGIR 2016 meeting. Entitled, Challenges for Information Retrieval and Text Mining in Biomedicine: Imperatives for Systems and Their Evaluation, my talk described the challenges for search and text processing systems in the biomedical domain for computer science researchers.


In early August, back in Oregon, I delivered the Keynote Talk at the Joint International Conference on Biological Ontology and BioCreative at Oregon State University in Corvallis, OR. My talk, Information Retrieval and Text Mining Evaluation Must Go Beyond “Users”: Incorporating Real-World Context and Outcomes, discussed the challenges of evaluating search and text processing systems in the biomedical domain for bioinformatics researchers.


Later in August I was in a different part of the world, Thailand. Oregon Health & Science University (OHSU) has a growing international collaboration there in partnership with Bangkok Dusit Medical Services (BDMS). I delivered Grand Rounds at their flagship Bangkok Hospital. The title of my talk was, Overview of Clinical Informatics Activities in the US. I covered adoption of electronic health records and the new clinical informatics subspecialty for physicians.

Also on that trip I was one of the keynote speakers at the HIMSS AsiaPAC 16 Conference in Bangkok. My talk was entitled, Advancing Digital and Patient-Centered Care Requires Competent Clinicians and Informatics Professionals, and I described the knowledge and training needed for optimal use of digital health systems for patients by clinicians and informatics professionals.


Finally on that trip I spent a day leading a workshop on various clinical informatics topics at Phuket International Hospital. Even better was getting to spend a weekend in that lovely beach city (see below)!



I also had some papers published this summer. One was a Technical Brief (hardly brief at over 60 pages!) prepared for the Agency for Healthcare Research & Quality (AHRQ) Effective Health Care Program on Telehealth: Mapping the Evidence for Patient Outcomes From Systematic Reviews. Another was a publication describing early experiences with clinical informatics fellowships for physicians in Journal of the American Medical Informatics Association.

I also carried out a substantial amount of teaching this summer. As I have every summer, I directed and taught in the AMIA Clinical Informatics Board Review Course. Next year is the last year of the “grandfathering” period that allows physicians to become board-certified without formal clinical informatics fellowship training, although a proposal has been put forth to the American Board of Preventive Medicine to extend that period for another five years. We will see what their decision is in November.

I also brought to a close the four-month long introductory online course I had been teaching to clinical informatics leaders at BDMS (see above) in Thailand. We spent a couple days at Bangkok Hospital reviewing course content, presenting papers, and preparing for course projects that will be presented when this group visits OHSU in November.


That trip also took me briefly to Singapore, where I led the in-person session at the end of the i10x10 course under the rubric of the Gateway to Health Informatics Course. This was the 15th offering of the course dating back to 2009.

Upon returning from Thailand and Singapore, I gave a lecture to new first-year OHSU medical students, as I did last year, entitled, Information is Different Now That You’re a Doctor. I enjoy giving this lecture to new medical students and describing the many ways that information is different now that they are becoming professionals, everything from seeking best evidence to maintaining professional behavior with highly private information, especially on social media.

I will also be doing some teaching in the next couple of weeks for federal organizations, namely the National Library of Medicine (NLM) and the National Institutes of Health (NIH) Big Data to Knowledge (BD2K) Program. The NLM teaching involves giving the introductory lecture that kicks off their week-long in-residence biomedical informatics course. The BD2K teaching will involve giving a webinar in the year-long BD2K Guide to the Fundamentals of Data Science Series. My overview lecture will focus on data management, indexing, and retrieval.

There will be more talks, publishing, and teaching this fall, so stay tuned!

Sunday, July 31, 2016

AMIA Unveils Advanced Health Informatics Certification (AHIC) for Broader Health Professions

While the clinical informatics physician subspecialty has been an excellent way to recognize the value of the informatics profession [1], there are clearly many other important professionals in the informatics field who deserve the same professional recognition for their knowledge and skills in using data and information to improve health and healthcare. A further step in that evolution took place recently with the unveiling of the Advanced Health Informatics Certification (AHIC) by the American Medical Informatics Association (AMIA).

More details about the process can be found in three papers published in Journal of the American Medical Informatics Association and made freely available on the AMIA Web site. These papers describe the rationale and process for developing the certification [2], the eligibility requirements for it [3], and an explanation of how it fits in the larger perspective of the field [4]. The AHIC is viewed as a specialization in informatics beyond one’s initial health professional training. The latter training must be at the master’s or professional doctorate level, such as an MD, PharmD, Master of Nursing, etc. This pathway will also provide an alternative for physicians who are not eligible for the medical subspecialty, i.e., who do not have an active primary specialty certification. This includes those who never attained a formal specialty in their medical training as well as those who discontinued the practice of medicine and allowed their primary board certification to lapse. It also includes osteopathic physicians (DOs), although these physicians will eventually be eligible for the physician subspecialty as the merger between the Accreditation Council for Graduate Medical Education (ACGME) and the American Osteopathic Association (AOA) is implemented and AOA programs achieve ACGME accreditation.

To be eligible for AHIC, an individual with a health professional master’s or higher must also have a master’s degree or higher from an accredited informatics program and professional experience applying informatics to healthcare. The accreditation of informatics educational programs will be based on AMIA’s recently becoming a member of the Commission on Accreditation for Health Informatics and Information Management (CAHIIM), which is in the process of revising its health informatics accreditation standards. Similar to the physician subspecialty, a “grandfathering” period will allow individuals to achieve certification from educational programs deemed “acceptable” that are not yet accredited. There will also be a temporary pathway for those with no formal informatics education at all who are long-time practitioners, i.e., have 36 months of informatics experience over a five-year period that has been completed within the past 10 years. Those who obtain a PhD in an informatics-related field will also be eligible, even if they do not have formal health professional training.

AMIA hopes the process will be viewed as analogous to the physician certification. In particular, those certified by AHIC are expected to have advanced training in a healthcare profession in addition to formal training and experience in informatics.

AMIA will also establish an entity to develop the certification exam, which will likely be aligned with the physician subspecialty certifying exam. The content for the certification exam will be based on an update of the core content for the physician subspecialty, which itself needs updating since it has not been revised since it was published in 2009 [5]. (I have always believed that there is little in the core content of the physician subspecialty that is truly specific to physicians. There really need not be any, since informatics is agnostic and complementary to one's healthcare field.)

As with the physician subspecialty, I am highly supportive of the new certification and its professional recognition of all who work in the field professionally. Our Biomedical Informatics Graduate Program at Oregon Health & Science University (OHSU) will certainly aim to align with it. (We are currently accredited under the original CAHIIM health informatics process but will transition to the new one when we are able to do so.)

Despite my optimism and support for AHIC, I do have one concern, which is the requirement to have a master’s degree in a health profession. I understand the rationale for aligning the AHIC with the physician subspecialty in requiring advanced training in both informatics and a health profession. At some level, this makes sense, and I have long advocated that informatics is primarily a health profession. However, this leaves out those with master’s degrees in applied informatics who do not also have a master’s degree in a health profession. That excludes those with pre-informatics training in non-health professional fields, such as computer science, life sciences, and health administration. It also leaves out health professionals whose health field has a bachelor’s degree as a terminal degree. I am not aware of any evidence that shows those with a healthcare master’s are any better operational informaticians than those without such a degree, nor do I know that potential employers share the view that they are different. Interestingly, those who obtain a PhD in informatics are not subject to this requirement, even though their degree is a research degree, i.e., less applied and less likely to have courses about the healthcare system.

There is no question that informaticians working in healthcare settings need to have a solid understanding of the healthcare system. Indeed, most applied informatics master’s programs (including ours at OHSU) have courses about the healthcare system, and their students are encouraged to obtain practical experience in healthcare settings. I worry that this process may cleave professional informatics master’s programs in two, with those who are eligible for AHIC and those who are not, despite having nearly the exact same training. Many current and former OHSU master’s graduates without formal clinical backgrounds have developed successful careers in applied health or clinical informatics. The AMIA leadership has vowed to consider other certifications for these types of individuals.

Nonetheless, the AHIC will be a great accomplishment for the field when it is fully implemented over the next year or two. The recognition brought on by certification of individuals will advance the profession as a whole and consolidate the important contributions that informaticians bring to 21st century healthcare and other health-related activities.

References

1. Detmer, DE and Shortliffe, EH (2014). Clinical informatics: prospects for a new medical subspecialty. Journal of the American Medical Association. 311: 2067-2068.
2. Gadd, CS, Williamson, JJ, et al. (2016). Creating advanced health informatics certification. Journal of the American Medical Informatics Association. 23: 848-850.
3. Gadd, CS, Williamson, JJ, et al. (2016). Eligibility requirements for advanced health informatics certification. Journal of the American Medical Informatics Association. 23: 851-854.
4. Fridsma, DB (2016). The scope of health informatics and the Advanced Health Informatics Certification. Journal of the American Medical Informatics Association. 23: 855-856.
5. Gardner, RM, Overhage, JM, et al. (2009). Core content for the subspecialty of clinical informatics. Journal of the American Medical Informatics Association. 16: 153-157.

Sunday, June 26, 2016

20 Years of Biomedical Informatics Graduate Education at OHSU

Earlier this month was graduation at Oregon Health & Science University (OHSU), and I was proud to see 41 individuals listed in the program receiving Graduate Certificates as well as master’s and PhD degrees in biomedical informatics. This year also marks the 20th year of the program, dating back to the first group of master’s degree students matriculating in the fall of 1996. (OHSU already had been funded as part of the National Library of Medicine [NLM] training grant program since 1992, but initially only accepted non-degree postdoc trainees. I also launched my introductory course prior to 1996 as an elective in the OHSU Master of Public Health program.)

The program has achieved many milestones, but clearly the most important is the number of people who have launched careers in the field by graduating from the program. As shown in the table below, the OHSU Biomedical Informatics Graduate Program now has awarded 716 degrees and certificates to 653 people. (Some of the latter have more than one of the former.)


Later this year we will mark another milestone for informatics at OHSU: 25 years of the Biomedical Information Communication Center (BICC) Building. I have worked in this building since the day it opened and look forward to celebrating the sustained success of its occupants, especially the Department of Medical Informatics & Clinical Epidemiology and the OHSU Library.

Monday, June 6, 2016

Generalizability and Reproducibility of Scientific Literature and the Limits to Machine Learning

A couple years ago, some colleagues and I wrote a paper raising a number of caveats about the enthusiasm for leveraging the growing volume of patient data in electronic health records and other clinical information systems for so-called re-use or secondary use, such as clinical research, quality improvement, and public health [1]. While we shared the enthusiasm for that type of use, we also recognized some major challenges in trying to extract knowledge from these sources of data, and advocated a disciplined approach [2].

Now that the world’s knowledge is increasingly available in electronic form in online scientific papers and other resources, a growing number of researchers, companies, and others are calling for the same type of approach that will allow computers to process the world’s scientific literature to answer questions, give advice, and perform other tasks. Extracting knowledge from scientific literature may be easier than from medical records. After all, scientific literature is written in a way to report findings and conclusions in a relatively unambiguous manner. In addition, scientific writing is usually subject to copy-editing that decreases the likelihood of grammatical or spelling errors, both of which often make processing medical records more difficult.

There is no question that machine processing of literature can help answer many questions we have [3]. Google does an excellent job of answering questions I have about the time in various geographic locations, the status of current airplane flights, and calories or fat in a given food. But for more complex questions, such as the best treatment for a complex patient, we still have a ways to go.

Perhaps the system with the most hype around this sort of functionality is IBM’s Watson. Recently, one of the early leaders of artificial intelligence research, Dr. Roger Schank, took IBM to task for its excessive claims (really marketing hype) around Watson [4]. Among the concerns Schank raised were IBM claims that Watson can “out-think” cancer. I too have written about Watson, in a posting to this blog now four years ago, in which I lamented the lack of published research describing its benefits (as opposed to hype pieces extolling its “graduation” from medical school) [5]. While there have been some conference abstracts presented on Watson’s work, we have yet to see any major contributions toward improving the diagnosis or treatment of cancer [6]. Like Schank, I find Watson’s technology interesting, but claims of its value in helping clinicians to treat cancer or other diseases need scientific verification as much as the underlying treatments being used.

We also, however, need to do some thought experiments as to how likely it is that computers can carry out machine learning in this manner. In fact, there are many reasons why the published scientific literature must be approached with care. It has become clear in recent years that what is reported in the scientific literature may not reflect the totality of knowledge, but may instead represent the “winner’s curse” of results that have been positive and thus more likely to be published [7,8]. In reality, “publication bias” pervades all of science [9].
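To make the “winner’s curse” concrete, here is a minimal simulation sketch in Python. It is purely illustrative and not drawn from any of the cited studies: the function name, the true effect of 0.1, the sample size of 50 per study, and the rule that only studies with a test statistic above 1.96 get “published” are all assumptions chosen for the example.

```python
import random
import statistics


def simulate_publication_bias(true_effect=0.1, n_studies=2000,
                              n_per_study=50, seed=0):
    """Hypothetical illustration: simulate many small studies of the same
    true effect and "publish" only those with a significantly positive result.

    Returns (mean estimate across all studies,
             mean estimate across published studies,
             number of published studies).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    all_estimates = []
    published_estimates = []

    for _ in range(n_studies):
        # Each study draws noisy observations around the same true effect.
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n_per_study)]
        estimate = statistics.fmean(sample)
        std_error = statistics.stdev(sample) / n_per_study ** 0.5
        all_estimates.append(estimate)

        # Assumed publication filter: only clearly positive results appear.
        if estimate / std_error > 1.96:
            published_estimates.append(estimate)

    published_mean = (statistics.fmean(published_estimates)
                      if published_estimates else float("nan"))
    return statistics.fmean(all_estimates), published_mean, len(published_estimates)


if __name__ == "__main__":
    all_mean, published_mean, n_published = simulate_publication_bias()
    print(f"Mean estimate, all studies:       {all_mean:.3f}")
    print(f"Mean estimate, published studies: {published_mean:.3f}")
    print(f"Studies published:                {n_published}")
```

Under these assumptions, the average effect among the “published” studies should come out well above the true effect, even though the average across all studies stays close to it. A system that learns only from the published subset would inherit that inflation.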

In addition, further problems plague the scientific literature. It has been discovered in recent years that a good many scientific experiments are not reproducible. This was found to be quite prevalent in preclinical studies analyzed by pharmaceutical companies looking for promising drugs that might be candidates for commercial development [10]. It has also been demonstrated in psychology [11]. In a recent survey of scientists, over half agreed with the statement that there is a “reproducibility crisis” in science, with 50-80% (depending on the field) unable to reproduce an experiment yet very few trying or able to publish about it [12].

Even on the clinical side we know there are many problems with randomized controlled trials (RCTs). Some recent analyses have documented that RCTs do not always reflect the larger population that the sample is intended to represent [13,14], something we can now document with the growing quantity of EHR data [15]. Additional recent work has questioned the use of surrogate outcomes for cancer drugs, casting doubt on their validity as indicators of the drugs’ efficacy [16,17]. Indeed, it has been shown in many areas of medicine that initial studies are overturned by later, usually larger, studies [18-20].

In addition, the problem is not limited to published literature. A recent study documented significant amounts of inaccuracy in drug compendia that are commonly used by clinicians [21].

My argument has always been that informatics interventions must prove their scientific mettle no differently than other interventions that claim to improve clinical practice, patient health, or other tasks for which we develop them. A few years back some colleagues and I raised some caveats about clinical data. For different reasons, there are challenges with scientific literature as well. Thus we should be wary of systems that claim to “ingest” scientific literature and perform machine learning from it. While it is important to continue this area of research, we must resist efforts to over-hype it and also must carry out research to validate its success.

References

1. Hersh, WR, Weiner, MG, et al. (2013). Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care. 51(Suppl 3): S30-S37.
2. Hersh, WR, Cimino, JJ, et al. (2013). Recommendations for the use of operational electronic health record data in comparative effectiveness research. eGEMs (Generating Evidence & Methods to improve patient outcomes). 1: 14. http://repository.academyhealth.org/egems/vol1/iss1/14/.
3. Wright, A (2016). Reimagining search. Communications of the ACM. 59(6): 17-19.
4. Schank, R (2016). The fraudulent claims made by IBM about Watson and AI. They are not doing "cognitive computing" no matter how many times they say they are. Roger Schank. http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI.
5. Hersh, W (2013). What is a Thinking Informatician to Think of IBM's Watson? Informatics Professor. http://informaticsprofessor.blogspot.com/2013/06/what-is-thinking-informatician-to-think.html.
6. Kim, C (2015). How much has IBM’s Watson improved? Abstracts at 2015 ASCO. Health + Digital. http://healthplusdigital.chiweon.com/?p=83.
7. Ioannidis, JP (2005). Why most published research findings are false. PLoS Medicine. 2(8): e124. http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124.
8. Young, NS, Ioannidis, JP, et al. (2008). Why current publication practices may distort science. PLoS Medicine. 5(10): e201. http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.0050201.
9. Dwan, K, Gamble, C, et al. (2013). Systematic review of the empirical evidence of study publication bias and outcome reporting bias - an updated review. PLoS ONE. 8(7): e66844. http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0066844.
10. Begley, CG and Ellis, LM (2012). Raise standards for preclinical cancer research. Nature. 483: 531-533.
11. Anonymous (2015). Estimating the reproducibility of psychological science. Science. 349: aac4716. http://science.sciencemag.org/content/349/6251/aac4716.
12. Baker, M (2016). 1,500 scientists lift the lid on reproducibility. Nature. 533: 452-454.
13. Prieto-Centurion, V, Rolle, AJ, et al. (2014). Multicenter study comparing case definitions used to identify patients with chronic obstructive pulmonary disease. American Journal of Respiratory and Critical Care Medicine. 190: 989-995.
14. Geifman, N and Butte, AJ (2016). Do cancer clinical trial populations truly represent cancer patients? A comparison of open clinical trials to the Cancer Genome Atlas. Pacific Symposium on Biocomputing, 309-320. http://www.worldscientific.com/doi/10.1142/9789814749411_0029.
15. Weng, C, Li, Y, et al. (2014). A distribution-based method for assessing the differences between clinical trial target populations and patient populations in electronic health records. Applied Clinical Informatics. 5: 463-479.
16. Prasad, V, Kim, C, et al. (2015). The strength of association between surrogate end points and survival in oncology: a systematic review of trial-level meta-analyses. JAMA Internal Medicine. 175: 1389-1398.
17. Kim, C and Prasad, V (2015). Strength of validation for surrogate end points used in the US Food and Drug Administration's approval of oncology drugs. Mayo Clinic Proceedings. Epub ahead of print.
18. Ioannidis, JP (2005). Contradicted and initially stronger effects in highly cited clinical research. Journal of the American Medical Association. 294: 218-228.
19. Prasad, V, Vandross, A, et al. (2013). A decade of reversal: an analysis of 146 contradicted medical practices. Mayo Clinic Proceedings. 88: 790-798.
20. Prasad, VK and Cifu, AS (2015). Ending Medical Reversal: Improving Outcomes, Saving Lives. Baltimore, MD, Johns Hopkins University Press.
21. Randhawa, AS, Babalola, O, et al. (2016). A collaborative assessment among 11 pharmaceutical companies of misinformation in commonly used online drug information compendia. Annals of Pharmacotherapy. 50: 352-359.
22. Malin, JL (2013). Envisioning Watson as a rapid-learning system for oncology. Journal of Oncology Practice. 9: 155-157.

Thursday, April 28, 2016

Earthquake Preparedness in the Pacific Northwest, 21st Century Style: Don't Forget the Data

Most of us who live in the Pacific Northwest have known for a couple decades of the earthquake risk sitting 90 miles off the Pacific coast, the Cascadia subduction zone. Concern reached a fever pitch last year with the publication of an article in The New Yorker magazine authored by Kathryn Schulz, which was recently awarded a Pulitzer Prize. A follow-on article to the original piece provided good advice concerning planning.

At my family's household, we have always had earthquake insurance, although we had never really taken planning seriously. As with many in Portland, the New Yorker article spurred us into action. A first step was to educate ourselves, i.e., what might happen and how can we best prepare. We also wanted to take into account our 21st century lifestyles that include work-related travel as well as a good portion of our lives being digitized.

There are many resources available to get one started thinking and acting on the planning process. The first step is to understand the risk and what may happen. The State of Oregon Department of Geology and Mineral Industries, the Cascadia Region Earthquake Workgroup, and others have laid out the risk and likely consequences. While the most devastation will occur along the Oregon coast, the damage in Portland, 50 miles inland from the coast, will still be substantial. The figure below, from the Oregon Resilience Plan, shows the likely effects based on location.


One can quickly find a great deal of information. The State of Oregon also maintains a Web site about earthquakes. The state has also carried out some detailed analyses, The Earthquake Risk Study for Oregon's Critical Energy Infrastructure Hub, and The Oregon Resilience Plan – Cascadia: Oregon’s Greatest Natural Threat. The Cascadia Region Earthquake Workgroup has also published a report, Cascadia Subduction Zone Earthquakes: A magnitude 9.0 earthquake scenario.

So how does our family begin planning? I have found the most useful publication to be on the State Web site, Living on Shaky Ground: How to Survive Earthquakes and Tsunamis in Oregon. This report details the planning process, which our family has started to implement. Another report, Cascadia Subduction Zone Catastrophic Operations Plan, gives us a sense of planning for the region.

Our first step has been to start stockpiling water, food, and other supplies. Of course, it is impossible to know how much of what we will need, what the public emergency response will be, and where we will even be when an earthquake happens. But we have started stockpiling water, canned food, and medical supplies. Our supplies are in plastic bins on the side of our house, some of which can be seen in the picture below. We will be adding more over time.


Another concern is how we will communicate and know we are all safe. It is likely that all telecommunications - cell phones, land lines, and Internet - will initially be disabled. Our plan is to migrate toward Oregon Health & Science University (OHSU), which is not only where three of the four of us work and attend school, but is itself likely to be an epicenter of recovery activity.

Another important action is to retrofit our house to improve its chances of surviving an earthquake and our chances of being able to escape once one starts. Many homes in the Pacific Northwest consist of a concrete slab foundation with a wood-based frame sitting atop it. Most experts recommend a retrofit process that bolts the frame to the foundation. Ironically, we were “grandfathered” in on earthquake insurance back in the 1990s and never had to do this to obtain insurance. Many people now have their houses bolted in order to obtain insurance, but we are doing things backward in a sense, bolting the house even though we already have insurance.

There is uncertainty in the value of this procedure, since we will never know up until the onset of an earthquake whether it was a good investment. But it does bring some peace of mind, and even if the house does not remain livable after an earthquake, the added reinforcement will provide a greater chance of being able to escape once an earthquake starts. We recently had this process completed, with about two dozen plates installed that bolt the wood frame of the house to the concrete foundation or walls emanating from it. We also had an emergency shut-off switch installed for our gas line and bolted down our hot-water heater.

Below are some pictures of the bolting process. The first shows the east side of our house, where the upper level sits atop the garage. The garage walls are concrete, and this picture shows the plates above the garage level and before the siding was reinstalled.


The next picture shows the plates on the rear of the house, after the siding was reinstalled. There are no plates over the windows, since this part of the wall lacks strength.


On the west side of our house is a deck, which needed to be partially removed to install plates in that location.


In the area of the garage doors, the plates needed to be installed on the inside.


My work life also impacts my planning. One concern is my travel schedule, which finds me on the road once or twice per month. It is entirely possible an earthquake could happen while I am away from Oregon, which would complicate getting back after it happens. Unfortunately there is little I can do in advance for this possibility.

A final critical activity in this day and age is to preserve my data. I have always been meticulous about backing up my data, especially work data and my large personal photo and video archive. But most of the backup has been local, in particular to external hard disks at home and in my office. These may survive an earthquake, but an added layer of safety is provided by the technology of our time, namely the cloud.

As we use Box at OHSU for cloud-based storage, I have uploaded all of my work-related data to my account there. This includes archives of data, documents, and teaching materials. My work life will obviously be interrupted in a major way by an earthquake, but preserving my data will at least give me a chance to get restarted at some point.

I also want to preserve what I can of my personal life in the form of photos, videos, and other data. These days I rarely print pictures, preferring to view them on my computer, tablet, or phone. I have purchased a 1-terabyte Dropbox account to handle archiving of all of my personal data.

Clearly a major earthquake will be devastating to my personal and professional life. But by being prepared, I am improving my chances of survival as well as returning to a somewhat normal life after it happens.