Tuesday, November 29, 2016

Kudos for the Informatics Professor - Fall 2016 Edition

The year 2016 has been a busy but fun year of personal achievements. Many of the notable accomplishments involved giving talks, both in person and online, and around the country and the world. However, I also had a number of other achievements.

A few months ago I posted about talks during the summer of 2016. The fall of 2016 was equally busy. As I noted at the end of the summer posting, I was slated to give two talks in September. The first was the opening talk at the National Library of Medicine (NLM) Georgia Biomedical Informatics Course entitled, What is Biomedical Informatics? This talk was an updated version from the previous offering in this course delivered in April, 2016. In September, I also provided an online lecture in the National Institutes of Health BD2K Guide to the Fundamentals of Data Science Series entitled, Data Indexing and Retrieval.

In October, I had the opportunity to visit the world-renowned Geisinger Health System, where I met with a number of individuals who have taken courses of mine, including alumni of my 10x10 ("ten by ten") course as well as physicians in the new Clinical Informatics Fellowship who are taking online courses in the OHSU Biomedical Informatics Graduate Program. I also presented Grand Rounds on the topic of competencies in clinical informatics required of 21st-century clinicians and informaticians.

Also in October was the 25th Anniversary Celebration of the Biomedical Information Communication Center (BICC) at OHSU. The speakers at the event included the current and long-time former Directors of the NLM. I provided an overview talk about the OHSU Department of Medical Informatics & Clinical Epidemiology (DMICE) and presented a poster on all of the collaboration that DMICE does at OHSU.

I started November with a talk at the OHSU Informatics Research Conference on Challenge Evaluations in Biomedical Information Retrieval, which was a preparation talk for another 25th anniversary talk to be mentioned in a moment.

In mid-November I was busy at the AMIA Annual Symposium, first leading a workshop on Evidence-Based Informatics at the Clinical Informatics Fellows’ Retreat that took place at my alma mater, the University of Illinois College of Medicine. Next I provided a talk at the AMIA Annual Symposium Learning Showcase entitled, The Full Spectrum Biomedical and Health Informatics Education at Oregon Health & Science University.

My final talk of the fall was at the Celebrating 25 Years of TREC Conference at the National Institute of Standards and Technology (NIST) in Gaithersburg, MD. My talk, The TREC Bio/Medical Tracks, described the various tracks in the biomedical domain at TREC over the years. A video of the talk is in Part 3 (starting around the 50-minute mark) of the Webcast archive page for the meeting.

What other accomplishments did I have this past fall? One was teaching my introductory biomedical and health informatics course to a group of clinical and IT leaders from Bangkok Dusit Medical Services (BDMS), a network of hospitals in Thailand and a few in nearby countries. OHSU has an ongoing collaboration with BDMS in many areas, including informatics. This offering of the course had the usual recorded lectures and discussion forums, but added other activities, including interactive videoconferences and in-person sessions in both Bangkok and Portland. One of the participants in the course, Dr. Somsak Wankijcharoen, created a video of the experience.

Saturday, November 12, 2016

ABPM Extends “Grandfathering” Period for Clinical Informatics Physician Subspecialty Through 2022

The single most-viewed entry in the history of this blog is a posting from 2013 describing eligibility for the clinical informatics subspecialty for physicians. This was partly due to my wanting to have a standard reply for the frequent emails I received at the time from individuals asking if they would be eligible to sit for the board exam during the "grandfathering" period. I would also mention to them my single most important piece of advice, which was to try, if possible, to get certified before 2018, after which they would need to complete an Accreditation Council for Graduate Medical Education (ACGME)-accredited fellowship.

Earlier this month, however, the American Board of Preventive Medicine (ABPM) extended the period that allows physicians to be eligible for board certification in the clinical informatics subspecialty by five years, through 2022. This means that the grandfathering (and "grandmothering" for my female colleagues!) period can be used to achieve board eligibility through 2022.

One new issue is how this will impact the growing number of ACGME-accredited fellowships, such as the one we offer at Oregon Health & Science University (OHSU). I still believe those fellowships will be the gold standard for early-career physicians to receive the best training in clinical informatics. But other physicians wanting to enter the field who cannot relocate jobs or families will still be able to pursue other options, one of which is master's degree programs such as our program at OHSU.

The official eligibility statement for the subspecialty is otherwise unchanged from the beginning of the grandfathering period and is documented on the ABPM Web site. The first three eligibility requirements are:
  1. Primary certification by one of the 23 member boards of the American Board of Medical Specialties (ABMS)
  2. Graduation from a US, Canadian, or other medical school deemed acceptable by the ABPM
  3. Unrestricted license to practice medicine in the US or Canada
The fourth requirement is the "pathway" by which one is eligible during the grandfathering era. There are two pathways for eligibility, one of which must be completed to be eligible to take the certification exam under the grandfathering criteria.

The first of the two pathways is the "practice pathway." Those who have been working in informatics professionally for at least 25% time during any three of the previous five years, and can have a supervisory individual attest to it, are eligible for this pathway. "Working" in informatics not only includes "practice" (i.e., being a Chief Medical Information Officer or other clinical informatics professional or leader), but also teaching and research.

The second pathway is the "non-traditional fellowship," which is any informatics fellowship of 24 or more months duration deemed acceptable by ABPM. At a 2012 panel at the American Medical Informatics Association (AMIA) Annual Symposium, Dr. William Greaves of ABPM stated this would be composed of informatics educational programs that were listed in the proposal submitted to ABPM by AMIA in 2009. This list, which has never been made public by ABPM, included programs that were funded by training grants from the National Library of Medicine (NLM) or were members of the AMIA Academic Forum at the time the proposal was submitted by AMIA to ABMS in 2009. (I can say that OHSU was definitely on the list, since we were both NLM-funded and a member of the Academic Forum at that time, and still are both.) Dr. Greaves also said that ABPM would review applicants trained in other fellowships for eligibility on a case-by-case basis.

The ABPM eligibility criteria also state that time spent in training in informatics can be applied to the practice pathway at one-half the value of practice time. In other words, someone in an educational program for at least 50% time during any three of the previous five years would be eligible to take the certification exam. My interpretation of this is that someone in a master's degree program that involves the equivalent of one and a half years of full-time study would thus be eligible. This has indeed been the case, i.e., those completing the Master of Biomedical Informatics (MBI) Program at OHSU have been deemed eligible, presumably since it requires six academic quarters of full-time study. The OHSU Graduate Certificate Program, on the other hand, which is a subset of the MBI requiring about nine months of study if done full-time, has not on its own been enough. Some applicants have been able to mix and match to achieve eligibility, i.e., with some practice time combined with some education.
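
The arithmetic behind this interpretation can be sketched in a few lines of Python. This is only an illustration and not an official ABPM calculation: it assumes the practice-pathway requirement can be summarized as 0.75 "practice-equivalent" years (25% time for three years) and that training counts at one-half. The function names and the single aggregate threshold are my own assumptions, the sketch ignores the five-year window, and the real determination is made by ABPM.

```python
# Illustrative sketch only: names and the aggregate threshold are assumptions,
# not ABPM's actual method.
REQUIRED_PRACTICE_YEARS = 0.25 * 3   # 25% time during three years
TRAINING_CREDIT = 0.5                # training counts at one-half of practice time


def practice_equivalent_years(practice_fte_years, training_fte_years):
    """Convert practice and training time into practice-equivalent FTE-years."""
    return practice_fte_years + TRAINING_CREDIT * training_fte_years


def meets_practice_pathway(practice_fte_years, training_fte_years):
    """True if combined time reaches the 0.75 practice-equivalent years assumed above."""
    total = practice_equivalent_years(practice_fte_years, training_fte_years)
    return total >= REQUIRED_PRACTICE_YEARS


# Master's program: about 1.5 years of full-time study
print(meets_practice_pathway(0.0, 1.5))      # True
# Graduate certificate alone: about 9 months (0.75 years) of full-time study
print(meets_practice_pathway(0.0, 0.75))     # False
# Certificate plus 25% time in practice for 1.5 years (0.375 FTE-years)
print(meets_practice_pathway(0.375, 0.75))   # True
```

The third case is the "mix and match" situation: neither the certificate nor the practice time is enough on its own, but together they reach the assumed threshold.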

It should be noted that another option for physicians who are not eligible for the board exam will be the Advanced Health Informatics Certification being developed by AMIA. This certification will be available to all clinician practitioners of informatics trained at the master's level and higher.

Overall, I am pleased with this development, although it still presents problems for physicians in the future who will want to transition into informatics mid-career. But since that day of reckoning has now been put off another five years, I guess we can cross that proverbial bridge when we come to it in the early part of the next decade.

Sunday, October 23, 2016

Biomedical Big Data Science Open Educational Resources (OERs) Released; Feedback Sought

For the last couple of years, faculty from the Oregon Health & Science University (OHSU) Department of Medical Informatics & Clinical Epidemiology (DMICE) and Library have been developing open educational resources (OERs) in the area of Biomedical Big Data Science. Funded by a grant from the National Institutes of Health (NIH) Big Data to Knowledge (BD2K) Program, OERs have been produced that can be downloaded, used, and repurposed for a variety of educational audiences by both learners and educators.

Development of the OERs is an ongoing process, but we have reached the point where a critical mass of the content is being made available for use and to obtain feedback. The image below shows the home page for the Web site.

The OERs are intended to be flexible and customizable and we encourage others to use or repurpose these materials for training, workshops and professional development or for dissemination to instructors in various fields. They can be used as "out of the box" courses for students, or as materials for educators to use in courses, training programs, and other learning activities. We ultimately aim to create 32 modules on the following topics:
  1. Biomedical Big Data Science
  2. Introduction to Big Data in Biology and Medicine
  3. Ethical Issues in Use of Big Data
  4. Clinical Standards Related to Big Data
  5. Basic Research Data Standards
  6. Public Health and Big Data
  7. Team Science
  8. Secondary Use (Reuse) of Clinical Data
  9. Publication and Peer Review
  10. Information Retrieval
  11. Version Control and Identifiers
  12. Data Annotation and Curation
  13. Data Tools and Landscape
  14. Ontologies 101
  15. Data Metadata and Provenance
  16. Semantic Data Interoperability
  17. Choice of Algorithms and Algorithm Dynamics
  18. Visualization and Interpretation
  19. Replication, Validation and the Spectrum of Reproducibility
  20. Regulatory Issues in Big Data for Genomics and Health
  21. Hosting Data Dissemination and Data Stewardship Workshops
  22. Guidelines for Reporting, Publications, and Data Sharing
  23. Terminology of Biomedical, Clinical, and Translational Research
  24. Computing Concepts for Big Data
  25. Data Modeling
  26. Semantic Web Data
  27. Context-based Selection of Data
  28. Translating the Question
  29. Implications of Provenance and Pre-processing
  30. Data Tells a Story
  31. Statistical Significance, P-hacking and Multiple-testing
  32. Displaying Confidence and Uncertainty
At the present time, 20 of the above modules are available for download and use. We are encouraging their use and seeking feedback from those who make use of them. The feedback will be used to improve the available modules and guide development of those not yet released.

We have also been developing mappings to research competencies in other areas, such as for the NIH Clinical and Translational Science Award (CTSA) consortium research competency requirements and the Medical Library Association professional competencies for health sciences librarians. To this end, we have been able to link these materials to existing efforts, and provide training opportunities for learners and educators working in these areas. We ultimately aim to complete this mapping across all of the BD2K training offerings, to align with other groups, avoid redundancy and to ensure we are meeting the needs of these various groups.

This project is actually one of several projects that have been funded by grants to develop and provide education in biomedical informatics and data science. The other projects include:
We hope that all of these materials are useful for many audiences and look forward to feedback enabling their improvement.

Thursday, October 13, 2016

What Should Be The Spectrum of Career Opportunities for Clinical Informatics Subspecialists?

A common reason given for the establishment of clinical informatics as a physician subspecialty is the recognition of the growing role of physicians who work in informatics professionally, particularly in operational clinical settings. Sometimes this is viewed as almost synonymous with the Chief Medical Informatics Officer (CMIO) and related roles in healthcare provider organizations.

However, I prefer to think of the subspecialty more broadly. Even if the CMIO is the most common or most aspired-to position for clinical informatics subspecialists, we should still consider other career paths, especially for those who will increasingly be trained in formal fellowships. Just as physicians of other specialties may enter private practice, managed care settings, academia, and even industry, so should we view the breadth of options for those trained in clinical informatics. I certainly hope there will be pathways from clinical fellowships into academic careers for these physicians.

I was recently involved in a discussion on an email list where many CMIOs lamented that many of the questions on the clinical informatics subspecialty board exam did not seem pertinent to their day-to-day role as CMIOs. That led me to raise the question: do we view this subspecialty as primarily focused on the CMIO role, or should it cover broader aspects of clinical informatics? Not being a CMIO, and being in academia, my sentiments are with the broader view. But on the other hand, as the CMIO is a prominent position for those working in this field, and perhaps the most common one, it does deserve important consideration.

This discussion is highly relevant to those of us standing up ACGME-accredited clinical informatics fellowships. We certainly want our fellows to gain substantial operational experience. But I would advocate that they also learn the fundamentals of the informatics field, and believe that although a little dated since its creation in 2009, the Core Content outline covers it pretty well.

Most physicians in a specialty (e.g., internal medicine) do not use the entire spectrum of knowledge in their field on a daily basis, yet they are still trained across it. I believe our clinical informatics fellowships should take the same approach and that the board exam should reflect comparable breadth. I do not believe there is anything in the Core Content outline that is completely superfluous to the practice of being a CMIO or other jobs applying clinical informatics.

The challenge, then, is how to create a fellowship program and board exam to reflect the broader field. Informatics has always had (and I am a product of) the research-oriented NLM fellowships. Even though focused on research, these fellowships have produced diverse outcomes, including some CMIOs. While the focus on clinical fellowships is somewhat different, there should be no reason why graduates of these fellowships should not be able to pursue careers in academia, research, industry, and other settings.

Monday, October 10, 2016

Apple Watch Series 2: Great Hardware, Software Needs Work

When the original Apple Watch came out, it was a non-starter for me. As one of my main uses of a smartwatch is for running, i.e., to track my runs and view them on a map, the lack of on-board GPS meant that the watch had to be tethered to an iPhone. While I do sometimes run with my iPhone, if the watch requires the phone I might as well carry just my iPhone. In addition, I sometimes run in places where I am not able to use my iPhone, such as countries where my carrier, Verizon, does not offer international data plans (admittedly increasingly rare).

I was therefore thrilled to read the announcement of the Apple Watch Series 2, which would have standalone GPS and enable me to track runs without a phone.

I have been using my Apple Watch Series 2 for about a month now, and have some observations and hopes for improvement. If those improvements are made soon, I will add a postscript to this posting.

From a hardware standpoint, the watch is excellent. It is comfortable to wear and works seamlessly with my iPhone 6 (soon to be replaced with a 7 Plus). I have always been able to make it through a day (even with a run that consumes 20-30% of the battery life per hour of activity) without having to recharge it.

In addition to capturing my runs, I want to be able to view them on any device, including on a computer via a Web site, and also share them on Facebook. I want to be able to view all of the data, including the map of where I have run, as well as export it via standard formats such as GPX and TCX. For years I have used various Garmin fitness watches, and I appreciated the ease with which I could capture my run, display its data and map on the Garmin Connect Web site, and share it to Facebook and other digital places.

In terms of capturing the run, the Apple Watch Series 2 does great. I am actually impressed at how quickly it locks on the GPS satellites, and the accuracy seems to be equal to my previous Garmin watches.

But I am disappointed in its ability to export or display data. While the watch’s Workout app is simple and easy to use, and the Activity app on my iPhone makes it easy to display results, the data cannot be exported to other apps. I am also disappointed that the Activity app only runs on the iPhone, and therefore cannot be accessed by other hardware, including the iPad or a computer accessing a Web site.

I also dislike the sharing capabilities of the Activity app. When one tries to share the entire exercise activity to Facebook, all that is uploaded is an image from the app, and not the details of distance run, time, map, etc. One can share the map of the Activity, but that is not uploaded with any other data about the run, e.g., distance, time, etc.

Another disappointment is that other fitness apps do not (yet) allow capture of the watch GPS data. For example, while RunKeeper and MapMyRun have Apple Watch apps, they presently do not capture the watch’s GPS data when not tethered to the phone. The SpectraRun workouts app can access and export the run data but it presently does not export the GPS data into the TCX file it generates. I assume that updates to these non-Apple apps will eventually be able to access the GPS, and this might also solve the problem of Activity app data not being exportable.

Fortunately, all of these disappointments should be easily fixable in software, and I am hopeful that Apple and other developers will remedy them quickly. I have had some online dialog with one of the running app developers, and they assured me (and others) that they are trying to quickly update their watch apps to capture the GPS directly from the watch.

Tuesday, October 4, 2016

Update for a Standard Occupational Classification (SOC) Code for Informatics: Likely to Happen But Needing Revision

For years, many in the informatics field have lamented our invisibility when it comes to US government labor statistics. As I and others have been writing for years, there is no Standard Occupational Classification (SOC) code for those who work professionally in informatics [1]. As the SOC is updated by the Bureau of Labor Statistics (BLS) about once a decade, I was pleased to be appointed to a group led by the Office of the National Coordinator for Health IT (ONC) to submit, in July 2014, a proposed revision to the 2018 SOC to include a code for health informatics.

Like many classifications, the SOC is organized hierarchically. Its hierarchy goes to a depth of four levels, with the levels called Major Group, Minor Group, Broad Group, and Detailed Occupation. Most healthcare occupations are in the Major Group 29-0000, which is subdivided into three Minor Groups, which are in turn broken down into Broad Groups and Detailed Occupations for many health professionals from physicians to phlebotomists. In the last (2010) SOC, there was only one Broad Group and Detailed Occupation pertaining to health IT, namely 29-2070 Medical Records and Health Information Technicians, which mainly referred to those with the Registered Health Information Technician (RHIT) certification from the health information management (HIM) field. The list below shows the three Minor Groups in the health professions and then more detail for the 29-2070/29-2071 code:
29-0000 Healthcare Practitioners and Technical Occupations
  29-1000 Health Diagnosing and Treating Practitioners
  29-2000 Health Technologists and Technicians
    29-2070 Medical Records and Health Information Technicians
      29-2071 Medical Records and Health Information Technicians
  29-9000 Other Healthcare Practitioners and Technical Occupations
Those from HIM with the Registered Health Information Administrator (RHIA) certification were among those included in the 11-9111 Medical and Health Services Managers category. (The Major Group 11-0000 covers Management Occupations.)
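
To make the hierarchy concrete, here is a minimal Python sketch that infers a code's level, and its chain of parent codes, from the pattern of trailing zeros seen in the codes listed above. The function names are hypothetical, and the trailing-zero rule is an assumption drawn from these examples; the official SOC has some exceptions that this sketch does not handle.

```python
import re

# Two digits, a hyphen, then four digits (e.g., 29-2071)
SOC_PATTERN = re.compile(r"^(\d{2})-(\d{4})$")


def soc_level(code):
    """Infer the hierarchy level of a SOC code from its trailing zeros."""
    match = SOC_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a SOC code: {code!r}")
    detail = match.group(2)
    if detail == "0000":
        return "Major Group"
    if detail.endswith("000"):
        return "Minor Group"
    if detail.endswith("0"):
        return "Broad Group"
    return "Detailed Occupation"


def soc_ancestors(code):
    """Return the chain of codes from the Major Group down to the given code."""
    match = SOC_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a SOC code: {code!r}")
    major, detail = match.groups()
    candidates = [
        f"{major}-0000",            # Major Group
        f"{major}-{detail[0]}000",  # Minor Group
        f"{major}-{detail[:3]}0",   # Broad Group
        code,                       # the code itself
    ]
    chain = []
    for candidate in candidates:
        if candidate not in chain:  # drop repeats when the code is itself a group
            chain.append(candidate)
    return chain


for soc_code in soc_ancestors("29-2071"):
    print(soc_code, soc_level(soc_code))
# 29-0000 Major Group
# 29-2000 Minor Group
# 29-2070 Broad Group
# 29-2071 Detailed Occupation
```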

Earlier this year, the BLS released its first proposed revisions for the 2018 SOC for public comment. In particular, they released Docket Number 1-0148 -- Health Informatics Practitioners (Multiple), which included the following:
Multiple dockets requested new detailed occupations and improved coverage of occupations related to Health Information Technology such as Health Informatics Practitioners, Medical Records Specialists, and Medical Registrars. The SOCPC partially accepted these recommendations and proposed revising the title for 29-2071 Medical Records and Health Information Technicians to 29-2071 Medical Registrar and Records Specialists, adding Medical Bill Coder as an illustrative example, and adding "Includes medical coders" to the definition. The SOCPC also proposes a new broad and detailed occupations (29-9020 and 29-9021) for Health Information Technology, Health Information Management, and Health Informatics Specialists and Analysts. Finally, the SOCPC proposes adding illustrative examples to the existing 11-9111 Medical and Health Services Managers to include: Clinical Informatics Director, Health Information Services Manager, and Chief Medical Information Officer.

While I was pleased to see that our recommendation for the addition of a code for health informatics practitioners was accepted, I and others were disappointed that the code lumped together three distinct groups who work professionally with IT in healthcare, namely health informatics, health information management, and health IT. A number of leading health IT organizations support the view that these are distinct. I was pleased to have the opportunity to work with my colleagues from the American Medical Informatics Association (AMIA) and a number of other organizations to draft a letter endorsing the view that there should be three Detailed Occupation codes for these three areas.

In particular, the letter led by AMIA advocates a modification to the final 2018 SOC, to be released in 2017, that would split the new 29-9021 code into three new Detailed Occupations defined as follows:
  • Health Informatics professionals: Design, develop, select, test, implement, and evaluate new or modified informatics solutions, data structures, and clinical decision support mechanisms to support patients, healthcare professionals, and improved usability of such systems for patient safety within healthcare contexts.
  • HIM professionals: Acquire, analyze, and protect digital and traditional medical information vital to the daily operations management of health information and electronic health records (EHRs).
  • Health IT professionals: Apply knowledge of healthcare and information systems to assist in the design, development, and continued modification of computerized health care systems.
The letter also addresses the SOCPC proposal for 11-9111 Medical and Health Services Managers to include Clinical Informatics Director, Health Information Services Manager, and Chief Medical Information Officer as illustrative examples. We suggest the addition of “Chief Nursing Informatics Officer” to this list to add further clarity. Experience among our constituencies indicates a proliferation of senior executives and other management-level job titles within and across these distinct occupations, all of which need to be captured under this detailed code.

I agree with AMIA and others that the occupations of health informatics, health information management, and health IT are each important yet unique within healthcare. Having them represented in the SOC separately will hopefully allow further delineation of the contributions each makes to advancing the use of information and technology in healthcare.


1. Hersh, W (2010). The health information technology workforce: estimations of demands and a framework for requirements. Applied Clinical Informatics. 1: 197-212.

Wednesday, September 7, 2016

Free Course in Healthcare Data Analytics Offered by OHSU

I am pleased to announce that the Department of Medical Informatics & Clinical Epidemiology (DMICE) of Oregon Health & Science University (OHSU) is offering a free continuing education course, Update in Health Information Technology: Healthcare Data Analytics, to physicians, nurses, other healthcare professionals, and health informatics/IT professionals. Registration is available at https://www.surveymonkey.com/r/onc-course.

This course is made freely available via a grant from the Office of the National Coordinator for Health IT (ONC) that I described in a previous posting last year. The grant requires us to have 1000 individuals complete the course by June 2017. The full updated ONC Health IT curriculum will also be made freely available in 2017.

Although the course is open to all healthcare professionals and health informatics/IT professionals, physicians will additionally be able to obtain continuing medical education (CME) credit through OHSU. For physicians certified in the new Clinical Informatics Subspecialty, Lifelong Learning and Self-Assessment (LLSA) credits towards American Board of Preventive Medicine (ABPM) Maintenance of Certification Part II (MOC-II) requirements for the subspecialty are also available.

The course consists of 14 modules that are estimated to take about 18 hours to complete. The course is completely online, and consists of lectures and self-assessment quizzes. References to further information are also provided. Those completing the entire course (viewing all of the lectures and completing the self-assessment quizzes) and evaluation form will receive a Certificate of Completion from OHSU. Physicians will be able to claim 18 credits of CME or (for those certified in Clinical Informatics) MOC-II. (We are not able to offer OHSU academic credit for the course.)

The course will be offered 6 times in overlapping two-month blocks starting in October 2016. Because of the anticipated large enrollment, the entire course will need to be completed during one block in order to receive the Certificate of Completion and CME/MOC-II credit. If the course is not completed during the block, participants can re-enroll in a later block. The course will only be offered for free through May 2017.

The first step in taking the course is registering at https://www.surveymonkey.com/r/onc-course. Each participant will be asked to provide some basic information, including name, employer, and email address. (All data will be kept confidential by OHSU, with the exception of confidential reporting to ONC.) After registration, participants will be sent login information to OHSU's Sakai Learning Management System. After completing all of the modules and the self-assessment quizzes, each participant will need to complete the evaluation form. He or she will then be sent via email a PDF Certificate of Completion. (Physicians will additionally be sent certifications for CME or MOC-II credit after completing additional evaluation information.)

Within the Sakai system, each module will provide an overview of learning objectives, one or more lecture segments (in MP4 format, viewable on both computers and mobile devices), optional additional materials, and a self-assessment quiz of 5-10 multiple-choice questions. (Those seeking CME or MOC-II credit must answer at least 70% of the questions correctly to pass; each quiz can be taken up to 5 times.) Sakai will also provide an interactive forum for those having questions or comments about the materials. Due to the anticipated large enrollment, we will encourage participants to interact and answer questions among themselves, with OHSU teaching assistants bringing in course faculty as needed.

The 14 modules of the course include the following:

  • General Health Care Data Analytics
  • Extracting and Working with Data
  • Population Health and the Application of Health IT
  • Applying Health IT to Improve Population Health at the Community Level
  • Identifying Risk and Segmenting Populations: Predictive Analytics for Population Health
  • Big Data, Interoperability, and Analytics for Population Health
  • Data Analytics in Clinical Settings
  • Risk Adjustment and Predictive Modeling
  • Overview of Interoperable Health IT
  • Standards for Interoperable Health IT
  • Implementing Health Interoperability
  • Ensuring the Security and Privacy of Information Shared
  • Secondary Use of Clinical Data
  • Machine Learning and Natural Language Processing

The OHSU course faculty include:

  • William Hersh, MD, Department of Medical Informatics & Clinical Epidemiology
  • Vishnu Mohan, MD, MBI, Department of Medical Informatics & Clinical Epidemiology
  • David Dorr, MD, MS, Department of Medical Informatics & Clinical Epidemiology
  • Peter Graven, PhD, Department of Emergency Medicine
  • Karen Eden, PhD, Department of Medical Informatics & Clinical Epidemiology

The MOC-II credit is important for the new subspecialty, with those who are board-certified needing to obtain a certain amount to re-certify in 10 years. The American Medical Informatics Association (AMIA) has already developed MOC-II activities, largely through its meetings, but will also have online offerings as it implements its learning management system. It will also offer MOC-IV credits in the future.

Sunday, September 4, 2016

Kudos for the Informatics Professor - Summer 2016 Edition

It has been a busy but enjoyable summer for me, with the opportunity to give invited talks at a number of international locations as well as at some international conferences closer to home. I also had some publications released and carried out a number of teaching activities.

My talks began with leading a roundtable discussion at the Society for Imaging Informatics in Medicine 2016 Conference in Portland, OR. The title of the roundtable was, Clinical Informatics Certification for Physicians & Non-Physicians, and I provided a history and overview, and led a discussion of future directions, for the new clinical informatics subspecialty for physicians.

Later in July, I ventured to Pisa, Italy, where I gave the Keynote Talk at the Medical Information Retrieval (MedIR) Workshop, which was part of the ACM SIGIR 2016 meeting. Entitled, Challenges for Information Retrieval and Text Mining in Biomedicine: Imperatives for Systems and Their Evaluation, my talk described the challenges for search and text processing systems in the biomedical domain for computer science researchers.

In early August, back in Oregon, I delivered the Keynote Talk at the Joint International Conference on Biological Ontology and BioCreative at Oregon State University in Corvallis, OR. My talk, Information Retrieval and Text Mining Evaluation Must Go Beyond “Users”: Incorporating Real-World Context and Outcomes, discussed the challenges of evaluating search and text processing systems in the biomedical domain for bioinformatics researchers.

Later in August I was in a different part of the world, Thailand. Oregon Health & Science University (OHSU) has a growing international collaboration there in partnership with Bangkok Dusit Medical Services (BDMS). I delivered Grand Rounds at their flagship Bangkok Hospital. The title of my talk was, Overview of Clinical Informatics Activities in the US. I provided an overview of clinical informatics activities in the US, including adoption of electronic health records and the new clinical informatics subspecialty for physicians.

Also on that trip I was one of the keynote speakers at the HIMSS AsiaPAC 16 Conference in Bangkok. My talk was entitled, Advancing Digital and Patient-Centered Care Requires Competent Clinicians and Informatics Professionals, and I described the knowledge and training needed for optimal use of digital health systems for patients by clinicians and informatics professionals.

Finally on that trip I spent a day leading a workshop on various clinical informatics topics at Phuket International Hospital. Even better was getting to spend a weekend in that lovely beach city (see below)!

I also had some papers published this summer. One was a Technical Brief (hardly brief at over 60 pages!) prepared for the Agency for Healthcare Research & Quality (AHRQ) Effective Health Care Program on Telehealth: Mapping the Evidence for Patient Outcomes From Systematic Reviews. Another was a publication describing early experiences with clinical informatics fellowships for physicians in Journal of the American Medical Informatics Association.

I also carried out a substantial amount of teaching this summer. As I have every summer, I directed and taught in the AMIA Clinical Informatics Board Review Course. Next year is the last year of the “grandfathering” period that allows physicians to become board-certified without formal clinical informatics fellowship training, although a proposal has been put forth to the American Board of Preventive Medicine to extend that period for another five years. We will see what their decision is in November.

I also brought to a close the four-month-long introductory online course I had been teaching to clinical informatics leaders at BDMS (see above) in Thailand. We spent a couple of days at Bangkok Hospital reviewing course content, presenting papers, and preparing for course projects that will be presented when this group visits OHSU in November.

That trip also took me briefly to Singapore, where I led the in-person session at the end of the i10x10 course under the rubric of the Gateway to Health Informatics Course. This was the 15th offering of the course dating back to 2009.

Upon returning from Thailand and Singapore, I gave a lecture to new first-year OHSU medical students, as I did last year, entitled, Information is Different Now That You’re a Doctor. I enjoy giving this lecture to new medical students and describing the many ways that information is different now that they are becoming professionals, everything from seeking best evidence to maintaining professional behavior with highly private information, especially on social media.

I will also be doing some teaching in the next couple weeks for federal organizations, namely the National Library of Medicine (NLM) and the National Institutes of Health (NIH) Big Data to Knowledge (BD2K) Program. The NLM teaching involves giving the introductory lecture that kicks off their week-long in-residence biomedical informatics course. The BD2K teaching will involve giving a webinar in the year-long BD2K Guide to the Fundamentals of Data Science Series. My overview lecture will focus on data management, indexing, and retrieval.

There will be more talks, publishing, and teaching this fall, so stay tuned!

Sunday, July 31, 2016

AMIA Unveils Advanced Health Informatics Certification (AHIC) for Broader Health Professions

While the clinical informatics physician subspecialty has been an excellent way to recognize the value of the informatics profession [1], there are clearly many other important professionals in the informatics field who deserve the same professional recognition for their knowledge and skills in using data and information to improve health and healthcare. A further step in that evolution took place recently with the unveiling of the Advanced Health Informatics Certification (AHIC) by the American Medical Informatics Association (AMIA).

More details about the process can be found in three papers published in Journal of the American Medical Informatics Association and made freely available on the AMIA Web site. These papers describe the rationale and process for developing the certification [2], the eligibility requirements for it [3], and an explanation on how it fits in the larger perspective of the field [4]. The AHIC is viewed as a specialization in informatics beyond one’s initial health professional training. The latter training must be at the master’s or professional doctorate level, such as an MD, PharmD, Master of Nursing, etc. This pathway will also provide an alternative for physicians who are not eligible for the medical subspecialty, i.e., who do not have an active primary specialty certification. This includes those who never attained a formal specialty in their medical training as well as those who discontinued the practice of medicine and allowed their primary board certification to lapse. It also includes osteopathic physicians (DOs), although these physicians will eventually be eligible for the physician subspecialty as the merger between the Accreditation Council for Graduate Medical Education (ACGME) and the American Osteopathic Association (AOA) is implemented and AOA programs achieve ACGME accreditation.

To be eligible for AHIC, an individual with a health professional master’s or higher must also have a master’s degree or higher from an accredited informatics program and professional experience applying informatics to healthcare. The accreditation of informatics educational programs will be based on AMIA’s recently becoming a member of the Commission on Accreditation for Health Informatics and Information Management (CAHIIM), which is in the process of revising its health informatics accreditation standards. Similar to the physician subspecialty, a “grandfathering” period will allow individuals to achieve certification from educational programs deemed “acceptable” that are not yet accredited. There will also be a temporary pathway for those with no formal informatics education at all who are long-time practitioners, i.e., have 36 months of informatics experience over a five-year period that has been completed within the past 10 years. Those who obtain a PhD in an informatics-related field will also be eligible, even if they do not have formal health professional training.

One way AMIA hopes to see the process viewed is as analogous to the physician certification. In particular, those certified by AHIC are expected to have advanced training in a healthcare profession in addition to formal training and experience in informatics.

AMIA will also establish an entity to develop the certification exam, which will likely be aligned with the physician subspecialty certifying exam. The content for the certification exam will be based on an update of the core content for the physician subspecialty, which itself needs updating since it has not been revised since it was published in 2009 [5]. (I have always believed that there is little in the core content of the physician subspecialty that is truly specific to physicians. There really need not be any, since informatics is agnostic and complementary to one's healthcare field.)

As with the physician subspecialty, I am highly supportive of the new certification and its professional recognition of all who work in the field professionally. Our Biomedical Informatics Graduate Program at Oregon Health & Science University (OHSU) will certainly aim to align with it. (We are currently accredited under the original CAHIIM health informatics process but will transition to the new one when we are able to do so.)

Despite my optimism and support for AHIC, I do have one concern, which is the requirement to have a master’s degree in a health profession. I understand the rationale for aligning the AHIC with the physician subspecialty in requiring advanced training in both informatics and a health profession. At some level, this makes sense, and I have long advocated that informatics is primarily a health profession. However, this leaves out those with master’s degrees in applied informatics who do not also have a master’s degree in a health profession. That excludes those with pre-informatics training in non-health professional fields, such as computer science, life sciences, and health administration. It also leaves out health professionals whose health field has a bachelor’s degree as a terminal degree. I am not aware of any evidence that shows those with a healthcare master’s are any better operational informaticians than those without such a degree, nor do I know that potential employers share the view that they are different. Interestingly, those who obtain a PhD in informatics are not subject to this requirement, even though their degree is a research degree, i.e., less applied and less likely to have courses about the healthcare system.

There is no question that informaticians working in healthcare settings need to have a solid understanding of the healthcare system. Indeed, most applied informatics master’s programs (including ours at OHSU) have courses about the healthcare system, and their students are encouraged to obtain practical experience in healthcare settings. I worry that this process may cleave professional informatics master’s programs in two, with those who are eligible for AHIC and those who are not, despite having nearly the exact same training. Many current and former OHSU master’s graduates without formal clinical backgrounds have developed successful careers in applied health or clinical informatics. The AMIA leadership has vowed to consider other certifications for these types of individuals.

Nonetheless, the AHIC will be a great accomplishment for the field when it is fully implemented over the next year or two. The recognition brought on by certification of individuals will advance the profession as a whole and consolidate the important contributions that informaticians bring to 21st century healthcare and other health-related activities.


1. Detmer, DE and Shortliffe, EH (2014). Clinical informatics: prospects for a new medical subspecialty. Journal of the American Medical Association. 311: 2067-2068.
2. Gadd, CS, Williamson, JJ, et al. (2016). Creating advanced health informatics certification. Journal of the American Medical Informatics Association. 23: 848-850.
3. Gadd, CS, Williamson, JJ, et al. (2016). Eligibility requirements for advanced health informatics certification. Journal of the American Medical Informatics Association. 23: 851-854.
4. Fridsma, DB (2016). The scope of health informatics and the Advanced Health Informatics Certification. Journal of the American Medical Informatics Association. 23: 855-856.
5. Gardner, RM, Overhage, JM, et al. (2009). Core content for the subspecialty of clinical informatics. Journal of the American Medical Informatics Association. 16: 153-157.

Sunday, June 26, 2016

20 Years of Biomedical Informatics Graduate Education at OHSU

Earlier this month was graduation at Oregon Health & Science University (OHSU), and I was proud to see 41 individuals listed in the program receiving Graduate Certificates as well as master’s and PhD degrees in biomedical informatics. This year also marks the 20th year of the program, dating back to the first group of master’s degree students matriculating in the fall of 1996. (OHSU already had been funded as part of the National Library of Medicine [NLM] training grant program since 1992, but initially only accepted non-degree postdoc trainees. I also launched my introductory course prior to 1996 as an elective in the OHSU Master of Public Health program.)

The program has achieved many milestones, but clearly the most important is the number of people who have launched careers in the field by graduating from the program. As shown in the table below, the OHSU Biomedical Informatics Graduate Program now has awarded 716 degrees and certificates to 653 people. (Some of the latter have more than one of the former.)

Later this year will be the celebration of another milestone for informatics at OHSU, which will be the celebration of 25 years of the Biomedical Information Communication Center (BICC) Building. I have worked in this building since the day it opened and look forward to celebrating the sustained success of its occupants, especially the Department of Medical Informatics & Clinical Epidemiology and the OHSU Library.

Monday, June 6, 2016

Generalizability and Reproducibility of Scientific Literature and the Limits to Machine Learning

A couple of years ago, some colleagues and I wrote a paper raising a number of caveats about the enthusiasm for leveraging the growing volume of patient data in electronic health records and other clinical information systems for so-called re-use or secondary use, such as clinical research, quality improvement, and public health [1]. While we shared the enthusiasm for that type of use, we also recognized some major challenges in trying to extract knowledge from these sources of data, and advocated a disciplined approach [2].

Now that the world’s knowledge is increasingly available in electronic form in online scientific papers and other resources, a growing number of researchers, companies, and others are calling for the same type of approach that will allow computers to process the world’s scientific literature to answer questions, give advice, and perform other tasks. Extracting knowledge from scientific literature may be easier than from medical records. After all, scientific literature is written in a way to report findings and conclusions in a relatively unambiguous manner. In addition, scientific writing is usually subject to copy-editing that decreases the likelihood of grammatical or spelling errors, both of which often make processing medical records more difficult.

There is no question that machine processing of literature can help answer many questions we have [3]. Google does an excellent job of answering questions I have about the time in various geographic locations, the status of current airplane flights, and calories or fat in a given food. But for more complex questions, such as the best treatment for a complex patient, we still have a ways to go.

Perhaps the system with the most hype around this sort of functionality is IBM’s Watson. Recently, one of the early leaders of artificial intelligence research, Dr. Roger Schank, took IBM to task for its excessive claims (really marketing hype) around Watson [4]. Among the concerns Schank raised were IBM claims that Watson can “out-think” cancer. I too have written about Watson, in a posting to this blog now four years ago, in which I lamented the lack of published research describing its benefits (as opposed to hype pieces extolling its “graduation” from medical school) [5]. While there have been some conference abstracts presented on Watson’s work, we have yet to see any major contributions toward improving the diagnosis or treatment of cancer [6]. Like Schank, I find Watson’s technology interesting, but claims of its value in helping clinicians to treat cancer or other diseases need scientific verification as much as the underlying treatments being used.

We also, however, need to do some thought experiments as to how likely it is that computers can carry out machine learning in this manner. In fact, there are many reasons why the published scientific literature must be approached with care. It has become clear in recent years that what is reported in the scientific literature may not reflect the totality of knowledge, but may instead represent the “winner’s curse” of results that have been positive and thus more likely to be published [7,8]. Indeed, “publication bias” pervades all of science [9].

In addition, further problems plague the scientific literature. It has been discovered in recent years that a good many scientific experiments are not reproducible. This was found to be quite prevalent in preclinical studies analyzed by pharmaceutical companies looking for promising drugs that might be candidates for commercial development [10]. It has also been demonstrated in psychology [11]. In a recent survey of scientists, over half agreed with the statement that there is a “reproducibility crisis” in science, with 50-80% (depending on the field) unable to reproduce an experiment yet very few trying or able to publish about it [12].

Even on the clinical side we know there are many problems with randomized controlled trials (RCTs). Some recent analyses have documented that RCTs do not always reflect the larger population that their samples are intended to represent [13,14], something we can now document with the growing quantity of EHR data [15]. Additional recent work has questioned the use of surrogate outcomes in trials of cancer drugs, challenging their validity as indicators of the drugs' efficacy [16,17]. Indeed, it has been shown in many areas of medicine that initial studies are overturned by later, usually larger, studies [18-20].

In addition, the problem is not limited to published literature. A recent study was published that documented significant amounts of inaccuracy in drug compendia that are commonly used by clinicians [21].

My argument has always been that informatics interventions must prove their scientific mettle no differently than other interventions that claim to improve clinical practice, patient health, or other tasks for which we develop them. A few years back some colleagues and I raised some caveats about clinical data. For different reasons, there are challenges with scientific literature as well. Thus we should be wary of systems that claim to “ingest” scientific literature and perform machine learning from it. While it is important to continue this area of research, we must resist efforts to over-hype it and also must carry out research to validate its success.


1. Hersh, WR, Weiner, MG, et al. (2013). Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care. 51(Suppl 3): S30-S37.
2. Hersh, WR, Cimino, JJ, et al. (2013). Recommendations for the use of operational electronic health record data in comparative effectiveness research. eGEMs (Generating Evidence & Methods to improve patient outcomes). 1: 14. http://repository.academyhealth.org/egems/vol1/iss1/14/.
3. Wright, A (2016). Reimagining search. Communications of the ACM. 59(6): 17-19.
4. Schank, R (2016). The fraudulent claims made by IBM about Watson and AI. They are not doing "cognitive computing" no matter how many times they say they are. Roger Schank. http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI.
5. Hersh, W (2013). What is a Thinking Informatician to Think of IBM's Watson? Informatics Professor. http://informaticsprofessor.blogspot.com/2013/06/what-is-thinking-informatician-to-think.html.
6. Kim, C (2015). How much has IBM’s Watson improved? Abstracts at 2015 ASCO. Health + Digital. http://healthplusdigital.chiweon.com/?p=83.
7. Ioannidis, JP (2005). Why most published research findings are false. PLoS Medicine. 2(8): e124. http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124.
8. Young, NS, Ioannidis, JP, et al. (2008). Why current publication practices may distort science. PLoS Medicine. 5(10): e201. http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.0050201.
9. Dwan, K, Gamble, C, et al. (2013). Systematic review of the empirical evidence of study publication bias and outcome reporting bias - an updated review. PLoS ONE. 8(7): e66844. http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0066844.
10. Begley, CG and Ellis, LM (2012). Raise standards for preclinical cancer research. Nature. 483: 531-533.
11. Anonymous (2015). Estimating the reproducibility of psychological science. Science. 349: aac4716. http://science.sciencemag.org/content/349/6251/aac4716.
12. Baker, M (2016). 1,500 scientists lift the lid on reproducibility. Nature. 533: 452-454.
13. Prieto-Centurion, V, Rolle, AJ, et al. (2014). Multicenter study comparing case definitions used to identify patients with chronic obstructive pulmonary disease. American Journal of Respiratory and Critical Care Medicine. 190: 989-995.
14. Geifman, N and Butte, AJ (2016). Do cancer clinical trial populations truly represent cancer patients? A comparison of open clinical trials to the Cancer Genome Atlas. Pacific Symposium on Biocomputing, 309-320. http://www.worldscientific.com/doi/10.1142/9789814749411_0029.
15. Weng, C, Li, Y, et al. (2014). A distribution-based method for assessing the differences between clinical trial target populations and patient populations in electronic health records. Applied Clinical Informatics. 5: 463-479.
16. Prasad, V, Kim, C, et al. (2015). The strength of association between surrogate end points and survival in oncology: a systematic review of trial-level meta-analyses. JAMA Internal Medicine. 175: 1389-1398.
17. Kim, C and Prasad, V (2015). Strength of validation for surrogate end points used in the US Food and Drug Administration's approval of oncology drugs. Mayo Clinic Proceedings. Epub ahead of print.
18. Ioannidis, JP (2005). Contradicted and initially stronger effects in highly cited clinical research. Journal of the American Medical Association. 294: 218-228.
19. Prasad, V, Vandross, A, et al. (2013). A decade of reversal: an analysis of 146 contradicted medical practices. Mayo Clinic Proceedings. 88: 790-798.
20. Prasad, VK and Cifu, AS (2015). Ending Medical Reversal: Improving Outcomes, Saving Lives. Baltimore, MD, Johns Hopkins University Press.
21. Randhawa, AS, Babalola, O, et al. (2016). A collaborative assessment among 11 pharmaceutical companies of misinformation in commonly used online drug information compendia. Annals of Pharmacotherapy. 50: 352-359.
22. Malin, JL (2013). Envisioning Watson as a rapid-learning system for oncology. Journal of Oncology Practice. 9: 155-157.

Thursday, April 28, 2016

Earthquake Preparedness in the Pacific Northwest, 21st Century Style: Don't Forget the Data

Most of us who live in the Pacific Northwest have known for a couple of decades of the earthquake risk sitting 90 miles off the Pacific coast, the Cascadia subduction zone. Concern reached a fever pitch last year with the publication of an article in The New Yorker magazine authored by Kathryn Schulz, which was recently awarded a Pulitzer Prize. A follow-on article to the original piece provided good advice concerning planning.

At my family's household, we have always had earthquake insurance, although we had never really taken planning seriously. As with many in Portland, the New Yorker article spurred us into action. A first step was to educate ourselves, i.e., what might happen and how can we best prepare. We also wanted to take into account our 21st century lifestyles that include work-related travel as well as a good portion of our lives being digitized.

There are many resources available to get one started thinking and acting on the planning process. The first step is to understand the risk and what may happen. The State of Oregon Department of Geology and Mineral Industries, the Cascadia Region Earthquake Workgroup, and others have laid out the risk and likely consequences. While the most devastation will occur along the Oregon coast, the damage in Portland, 50 miles inland from the coast, will still be substantial. The figure below, from the Oregon Resilience Plan, shows the likely effects based on location.

One can quickly find a great deal of information. The State of Oregon also maintains a Web site about earthquakes. The state has also carried out some detailed analyses, The Earthquake Risk Study for Oregon's Critical Energy Infrastructure Hub, and The Oregon Resilience Plan – Cascadia: Oregon’s Greatest Natural Threat. The Cascadia Region Earthquake Workgroup has also published a report, Cascadia Subduction Zone Earthquakes: A magnitude 9.0 earthquake scenario.

So how does our family begin planning? I have found the most useful publication to be on the State Web site, Living on Shaky Ground: How to Survive Earthquakes and Tsunamis in Oregon. This report details the planning process, which our family has started to implement. Another report, Cascadia Subduction Zone Catastrophic Operations Plan, gives us a sense of planning for the region.

Our first step has been to start stockpiling water, food, and other supplies. Of course, it is impossible to know how much of what we will need, what the public emergency response will be, and where we will even be when an earthquake happens. But we have started stockpiling water, canned food, and medical supplies. Our supplies are in plastic bins on the side of our house, some of which can be seen in the picture below. We will be adding more over time.

Another concern is how we will communicate and know we are all safe. It is likely that all telecommunications - cell phones, land lines, and Internet - will initially be disabled. Our plan is to migrate toward Oregon Health & Science University (OHSU), which is not only where three of the four of us work and attend school, but is itself likely to be an epicenter of recovery activity.

Another important action is to retrofit our house to improve its chances of surviving an earthquake and our chances of being able to escape once one starts. Many homes in the Pacific Northwest consist of a concrete slab foundation with a wood-based frame sitting atop it. Most experts recommend a retrofit process that bolts the frame to the foundation. Ironically, we were “grandfathered” in on earthquake insurance back in the 1990s and never had to do this to obtain insurance. Many people now have their houses bolted in order to obtain insurance, but we are doing things backward in a sense, bolting the house even though we already have insurance.

There is uncertainty in the value of this procedure, since we will never know up until the onset of an earthquake whether it was a good investment. But it does bring some peace of mind, and even if the house does not remain livable after an earthquake, the added reinforcement will provide a greater chance of being able to escape once an earthquake starts. We recently had this process completed, with about two dozen plates installed that bolt the wood frame of the house to the concrete foundation or walls emanating from it. We also had an emergency shut-off switch installed for our gas line and bolted down our hot-water heater.

Below are some pictures of the bolting process. The first shows the east side of our house, where the upper level sits atop the garage. The garage walls are concrete, and this picture shows the plates above the garage level and before the siding was reinstalled.

The next picture shows the plates on the rear of the house, after the siding was reinstalled. There are no plates over the windows, since this part of the wall lacks strength.

On the west side of our house is a deck, which needed to be partially removed to install plates in that location.

In the area of the garage doors, the plates needed to be installed on the inside.

My work life also impacts my planning. One concern is my travel schedule, which finds me on the road once or twice per month. It is entirely possible an earthquake could happen while I was away from Oregon, which would complicate getting back after it happens. Unfortunately there is little I can do in advance for this possibility.

A final critical activity in this day and age is to preserve my data. I have always been meticulous about backing up my data, especially work data and my large personal photo and video archive. But most of the backup has been local, in particular to external hard disks at home and in my office. These may survive an earthquake, but an added layer of safety is provided by the technology of our time, namely the cloud.

As we use Box at OHSU for cloud-based storage, I have uploaded all of my work-related data to my account there. This includes archives of data, documents, and teaching materials. My work life will obviously be interrupted in a major way by an earthquake, but preserving my data will at least give me a chance to get restarted at some point.

I also want to preserve what I can of my personal life in the form of photos, videos, and other data. These days I rarely print pictures, preferring to view them on my computer, tablet, or phone. I have purchased a 1-terabyte Dropbox account to handle archiving of all of my personal data.

Clearly a major earthquake will be devastating to my personal and professional life. But by being prepared, I am improving my chances of survival as well as returning to a somewhat normal life after it happens.

Monday, March 28, 2016

Eligibility for the Clinical Informatics Subspecialty: 2016 Update

Some of the most highly viewed posts in this blog have been those on eligibility for the clinical informatics subspecialty for physicians, the first in January, 2013 and an update in June, 2014. I recently led a Webinar hosted by the American Medical Informatics Association (AMIA) on the eligibility that I am using as a segue here to provide another update, which may be my last one before the "grandfathering" period ends.

One of the reasons for these posts has been to use them as a starting point for replying to those who email or otherwise contact me with questions about their own eligibility. After all these years, I still get such emails and inquiries. While the advice in the previous posts is largely still correct, we have had the ensuing experience of three years of the board exam, who qualified to sit for it, and what proportion of those taking the test passed. There are still (only) two boards that qualify physicians for the exam, the American Board of Preventive Medicine (ABPM) and the American Board of Pathology (ABP). ABP handles qualifications for those with Pathology as a primary specialty and ABPM handles those from all other primary specialties.

The official eligibility statement for the subspecialty is unchanged from the beginning of the grandfathering period and is documented in the same PDF file posted then from the ABPM (and summarized by the ABP). One must be a physician who has board certification in one of the primary 23 medical specialties. They must have an active and unrestricted medical license in one US state. For the first five years of the subspecialty (through 2017), the "practice pathway" or completing a "non-traditional fellowship" (i.e. one not accredited by the Accreditation Council for Graduate Medical Education, or ACGME) will allow physicians to "grandfather" the training requirements, i.e., take the exam without completing a formal fellowship accredited by the ACGME. But starting in 2018, the only pathway to board eligibility will be via an ACGME-accredited fellowship.

In the Webinar, I made some observations about who was deemed eligible for the exam, although as always, I must provide the disclaimer that ABPM and ABP are the ultimate arbiters of eligibility, and anyone who has questions should contact ABPM or ABP. I only interpret their rules.

We have now learned from the experience of having the exam offered over three years. As I noted in the Webinar, there are now 1107 physicians who have achieved certification in the subspecialty. The exam pass rate has been high but has declined each year, starting at 91% in 2013 and falling to 90% in 2014 and 80% in 2015. While we cannot be sure that the exam has not changed, the declining pass rate most likely reflects highly experienced individuals taking the exam initially and those with less experience taking it in subsequent years.

I (and a number of others) have been somewhat surprised at the high pass rate in all years, given the vast body of knowledge covered by the exam and the lack of formal training, especially "book" training, of many who took the exam. It is not uncommon for pass rates for those grandfathering training requirements into a new subspecialty to be much lower. We will have to see how the pass rate changes going forward, and it will be especially interesting when those completing ACGME-accredited fellowships start taking the exam.

One bit of advice I definitely give to any physician who meets the practice pathway qualifications (or can do so by 2017) is to sit for the exam before the end of the grandfathering period. After that time, the only way to become certified in the subspecialty will be to complete a two-year, on-site, ACGME-accredited fellowship. While we were excited to be the third program nationally to launch a fellowship at OHSU, it will be a challenge for those who are mid-career, with jobs, family, and/or geographical roots, to up and move to become board-certified.

But starting in 2018, board certification for physicians not able to pursue fellowships will become much more difficult. There are many categories of individuals for whom getting certified in the subspecialty after the grandfathering period will be a challenge:
  • Those who are mid-career - I have written in the past that the age range of OHSU online informatics students, including physicians, is spread almost evenly across all ages up to 65. Many physicians transition into informatics during the course of their careers, and not necessarily at the start.
  • Those pursuing research training in informatics, such as an NLM fellowship or, in the case of some of our current students, an MD/PhD program (who will not finish their residency until after the grandfathering period ends) - Why must these individuals also pursue an ACGME-accredited clinical fellowship to be eligible for the board exam?
  • Those who already have had long medical training experiences, such as subspecialists with six or more years of training - Would such individuals want to do two additional years of informatics when, as I recently pointed out, it might be an ideal experience for them to overlay informatics and their subspecialty training?
Fortunately, one option for physicians who are not eligible for the board exam will be the Advanced Health Informatics Certification being developed by AMIA. This certification will be available to all practitioners of informatics trained at the master's level and higher. It will also provide a pathway for physicians who are not eligible for the board certification pathway. I am looking forward to AMIA releasing its detailed plans for this certification, not only for these physicians but also for other practitioners of informatics.

However, I also hold out hope for the ideal situation for physician-informaticians, which in my opinion will be our own specialty. The work of informatics carried out by physicians is unique and not really dependent on their initial clinical specialty (or lack of one at all). I still believe that robust training is required to be an informatician; I just don't believe it needs to be a two-year, in-residence experience. An online master's degree or something equivalent, with a good deal of experiential learning in real-world settings, should be an option. The lack of these sorts of options will keep many talented physicians from joining the field. Such training would also be consistent with the 21st century knowledge workforce that will involve many career transitions over one's working lifetime.

Saturday, February 13, 2016

Health IT and the Limits to Analogies

Many who write and talk about health IT, including myself, are fond of using analogies. One of the most common analogies that we use is that of the banking industry. I have noted that I can insert my Wells Fargo ATM card into just about any ATM in the world and receive local currency. This is all made possible by a standard adopted worldwide by the banking industry. Of course, there is another reason for banking interoperability that does not exist in healthcare, which is that the financial incentives are all aligned. Each time we make an ATM transaction, a fee goes to both the bank that owns the ATM and (if the machine is owned by a bank different from our own) our bank. While most of us grumble about ATM fees, we usually pay them, not only because we have to, but also because of the convenience.

Another common analogy we use for how health IT could be better is to discuss the aviation industry. There is no question that healthcare could learn more from not only the IT of the aviation industry, but also the relationships between all the players who ensure that planes take off and land safely [1]. With regard to the IT of aviation, there is definitely more human factors and usability analysis that goes into the design of cockpit displays for flying these complex machines than is done in healthcare.

An addition to the analogy list I hear with increasing frequency is the smartphone. In particular, many ask, why can’t the electronic health record (EHR) be as simple as a smartphone? Again, there is much to learn from the simplicity and ease of use of smartphones, especially their organization as allowing “substitutable” apps on top of a common data store and set of features, such as GPS [2]. However, there are also limitations to the smartphone analogy. First, the uses of the EHR are much more complex than most smartphone apps. There is a much larger quantity and diversity of data in the patient’s record. Second, the functions of viewing results, placing orders, and other actions are much more complex than our interactions with simple apps.

I look at my own smartphone usage and note that I spend a great deal of time (probably too much) using it. But there are many things I do with my laptop that I cannot do with my smartphone. For example, my phone is fine for reading email and typing simple replies. However, composing longer replies or working with attachments is not feasible, at least for me, on my phone. Likewise, writing documents, creating presentations, and carrying out other work requiring more than a small screen is also not possible on my phone with its limited screen, keyboard, and file storage capabilities.

While a key challenge of informatics is to make the EHR simpler and easier to use, it will never approach the simplicity of a highly focused smartphone app. Analogies can be helpful in elucidating problems, but we also must recognize their limitations.


1. Pronovost, PJ, Goeschel, CA, et al. (2009). Reducing health care hazards: lessons from the commercial aviation safety team. Health Affairs. 28: w479-w489.
2. Mandl, KD, Mandel, JC, et al. (2012). The SMART Platform: early experience enabling substitutable applications for electronic health records. Journal of the American Medical Informatics Association. 19: 597-603.

Monday, February 1, 2016

60 Years of Informatics: In the Context of Data Science

Like many academic health science universities, my institution has undertaken a planning process around data science. In the process of figuring out how to merge our various data-related silos, we tried to look at what other universities were doing. One high-profile effort has been launched at the University of Michigan, and the formation of their program and those of others inspired a statistician, David Donoho, to look at data science from the purview of his field 50 years after famed statistician John Tukey had called for reformulation of the discipline into a science of learning from data. Donoho’s resulting paper [1] motivated me to look at data science from the purview of my field, biomedical and health informatics.

Statistics has of course been around for centuries, although this author drew from an event 50 years ago, a lecture by John Tukey. The informatics field has not been in existence for as many centuries, but one summary of its history by Fourman credits the origin of the term to Philippe Dreyfus in 1962 [2]. However, the Wikipedia entry for informatics attributes the term to a German computer scientist Karl Steinbuch in 1956. Fourman also notes that the heaviest use of the term informatics comes from its attachment to various biomedical and health terms [2].

If the informatics field is indeed 60 years old, I have been working in it for about half of its existence, since I started my National Library of Medicine (NLM) medical informatics fellowship in 1987. I have certainly devoted a part of my career to raising awareness of the term informatics, making the case for it as a discipline [3]. Clearly the discipline has become recognized, with many academic departments, mostly in health science universities, and a new physician subspecialty devoted to it [4].

And now comes data science. What are we in informatics to make of this new field? Is it the same as informatics? If not, how does it differ? I have written about this before.

Donoho’s paper does offer some interesting insights [1]. I get a kick out of one tongue-in-cheek definition he gives of a data scientist, whom he defines as a “person who is better at statistics than any software engineer and better at software engineering than any statistician.” Perhaps we could substitute informatician for software engineer, i.e., a data scientist is someone who is better at statistics than any informatician and is better at informatics than any statistician?

Donoho does later provide a more serious definition of data science, which is that it is “the science of learning from data; it studies the methods involved in the analysis and processing of data and proposes technology to improve methods in an evidence-based manner.” He goes on to further note, “the scope and impact of this science will expand enormously in coming decades as scientific data and data about science itself become ubiquitously available.”

Donoho goes on to note six key aspects (he calls them “divisions” of “greater data science”) that I believe further serve to define the work of the field:
  • Data Exploration and Preparation
  • Data Representation and Transformation
  • Computing with Data
  • Data Modeling
  • Data Visualization and Presentation
  • Science about Data Science
Clearly data is important to informatics. But is it everything? We can begin to answer this question by thinking about the activities of informatics where data, or at least “Big Data,” is not central. While I suppose it could be argued that all applications of informatics make use of some amount of data, there are aspects of those applications where data is not the central element. Consider the many complaints that have emerged around the adoption of electronic health records, such as poor usability, impeding of workflow, and even concerns around patient safety [5]. Academic health science leaders can lead the charge in use of data but must do so in the context of a framework that protects the rights of patients, clinicians, and others [6].

Like many informaticians, I do remain enthusiastic about the prospect of the growing quantity of data advancing our understanding of human health and disease, and how to treat the latter better. But I also have some caveats. I have concerns that some data scientists read too much into correlations and associations, especially when so much medical data capture is imprecise, standards are not widely adopted, and data are inaccessible when not structured well (which can lead us to try to “unscramble eggs”).

It is clear that informatics cannot ignore data science, but our field must also be among the leaders in determining its proper place and usage, especially in health-related areas. We must recognize the overlap as well as appreciate the areas where informatics can be synergistic with data science.


1. Donoho, D (2015). 50 years of Data Science. Princeton NJ, Tukey Centennial Workshop. https://dl.dropboxusercontent.com/u/23421017/50YearsDataScience.pdf.
2. Fourman, M (2002). Informatics. In International Encyclopedia of Information and Library Science, 2nd Edition. J. Feather and P. Sturges. London, England, Routledge: 237-244.
3. Hersh, W (2009). A stimulus to define informatics and health information technology. BMC Medical Informatics & Decision Making. 9: 24. http://www.biomedcentral.com/1472-6947/9/24/.
4. Detmer, DE and Shortliffe, EH (2014). Clinical informatics: prospects for a new medical subspecialty. Journal of the American Medical Association. 311: 2067-2068.
5. Rosenbaum, L (2015). Transitional chaos or enduring harm? The EHR and the disruption of medicine. New England Journal of Medicine. 373: 1585-1588.
6. Koster, J, Stewart, E, et al. (2016). Health care transformation: a strategy rooted in data and analytics. Academic Medicine. Epub ahead of print.

Monday, January 25, 2016

Biomedical Data Science Needs Measures of Information Density and Value

I wrote recently that one of my concerns for data science is the Big Data over-emphasis on one of its four Vs, namely volume. Since then, I was emailing with Dr. Shaun Grannis and other colleagues from the Indiana Health Information Exchange (IHIE). I asked them about the size of their data for the nearly 6 billion clinical observations from the 17 million patients in their system. I was somewhat surprised to hear that the structured data only takes up 26 terabytes. I joked that I almost have that much disk storage lying around my office and home. That is a huge amount of data, but some in data science seem to imply that data sizes that do not start with at least “peta-” are somehow not real data science.

Of course, imaging and other binary data add much more to the size of the IHIE data, as will the intermediate products of various processing that are carried out when doing analysis. But it is clear that the information “density” or “value” contained in those 26 terabytes is probably much higher than a comparable amount of binary (e.g., imaging, genome, etc.) data. This leads me to wonder whether we should be thinking about how we might measure the density or value of different types of biomedical and health information, especially if we are talking about the Vs of Big Data.

The measurement of information is decades old. Its origin is attributed to Shannon and Weaver from a seminal publication in 1949 [1]. They defined information in terms of the number of forms a message could take, measured in bits as the base-2 logarithm of that number. As such, a coin flip (heads or tails, 2 forms) has 1 bit of information, a single die (6 forms) has about 2.6 bits, and a letter in the English language (26 forms) has about 4.7 bits. This measure is of course simplistic in that it assumes the value of each form in the message is equal. For this reason, others such as Bar-Hillel and Carnap began adding semantics (meaning) that, among other things, allowed differing values for each form [2].
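A minimal sketch of this measure, assuming every form is equally likely, in which case the information in one message counted in bits is the base-2 logarithm of the number of forms. The function name bits_for_forms is my own for illustration and does not come from Shannon and Weaver.

```python
import math


def bits_for_forms(num_forms: int) -> float:
    """Bits of information in one message that can take num_forms equally likely forms."""
    if num_forms < 1:
        raise ValueError("num_forms must be at least 1")
    return math.log2(num_forms)


# The three examples from the text: a coin flip, a single die, an English letter.
for label, forms in [("coin flip", 2), ("die roll", 6), ("English letter", 26)]:
    print(f"{label}: {forms} forms, {bits_for_forms(forms):.2f} bits")
```

The number of bits grows much more slowly than the number of forms, which is one reason raw data size says little about information value.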

We can certainly think of plenty of biomedical examples where the number of different forms that data can take yields widely divergent value of the information. For example, the human genome contains 3 billion nucleotide pairs, each of which can take 4 forms, or 2 bits. Uncompressed, and not accounting for the fact that a large proportion is identical across all humans [3], this genome by Shannon and Weaver’s measure would have 6 billion bits of information. The real picture of human genomic variation is more complex (such as through copy number variations), and the point is that there is less information density in the huge amount of data in a genome than in, say, a short clinical fact, such as a physical exam finding or a diagnosis.
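The genome arithmetic can be sketched as follows, counting bits as the base-2 logarithm of the number of forms at each position. The 3 billion figure is the approximation used above; the constant names are my own, and the calculation deliberately ignores compression and the sequence shared across all humans.

```python
import math

NUCLEOTIDE_PAIRS = 3_000_000_000  # approximate genome size used in the text
FORMS_PER_POSITION = 4            # each position can take one of 4 forms

# Bits needed for one position, assuming all 4 forms are equally likely.
bits_per_position = math.log2(FORMS_PER_POSITION)

# Uncompressed size of the whole genome in bits and in megabytes.
total_bits = NUCLEOTIDE_PAIRS * bits_per_position
total_megabytes = total_bits / 8 / 1_000_000

print(f"{bits_per_position:.0f} bits per position")
print(f"{total_bits:,.0f} bits uncompressed")
print(f"about {total_megabytes:,.0f} megabytes")
```

By this rough count an uncompressed genome is well under a gigabyte, yet most of it carries little patient-specific information compared with a short clinical fact.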

By the same token, images also have different information density than clinical facts. This is especially so as the resolution of digital images continues to increase. There is certainly value in higher-resolution images, but there are also diminishing returns in terms of the information value. Doubling or quintupling or any other increase of pixels or their depth will create more information as measured by Shannon and Weaver’s formula but not necessarily provide more value of that information.

Even clinical data may have diminishing returns based on its size. Some interesting work from OHSU faculty Nicole Weiskopf and colleagues demonstrates an obvious finding but one that has numerous implications for secondary use of clinical data, which is that sicker patients have more data in the electronic health record (EHR) [4-5]. The importance of this is that sicker patients may be “oversampled” in clinical data sets and thus skew secondary analysis by over-representing patients who have received more healthcare.
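A toy illustration of this oversampling effect, using four entirely hypothetical patients (the identifiers and observation counts are invented for the example and are not from the cited studies). It compares the share of patients who are sick with the share of observations that come from sick patients.

```python
# Hypothetical patients: one sick patient with many observations,
# three healthier patients with few.
patients = [
    {"id": "A", "sick": False, "observations": 2},
    {"id": "B", "sick": False, "observations": 3},
    {"id": "C", "sick": False, "observations": 1},
    {"id": "D", "sick": True, "observations": 14},
]

# Share of patients who are sick (each patient counted once).
sick_patients = sum(1 for p in patients if p["sick"])
sick_share_of_patients = sick_patients / len(patients)

# Share of observations contributed by sick patients
# (each observation counted once).
total_observations = sum(p["observations"] for p in patients)
sick_observations = sum(p["observations"] for p in patients if p["sick"])
sick_share_of_observations = sick_observations / total_observations

print(f"sick share of patients:     {sick_share_of_patients:.0%}")
print(f"sick share of observations: {sick_share_of_observations:.0%}")
```

Any analysis that treats each observation as an independent sample will weight the sick patient far more heavily than an analysis that counts each patient once.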

There are a number of implications for increasing volumes of data that we must take into consideration, especially when using such data for purposes for which it was not collected. This is probably true for any Big Data endeavor, where the data may be biased by the frequency and depth of its measurement. The EHR in particular is not a continuous sampling of a patient’s course, but rather represents periods of sampling that course. With the EHR there is also the challenge that different individual clinicians collect and enter data differently.

Another implication of data volume is its impact on statistical significance testing. It is related to what many criticize in science as “p-hacking,” where researchers modify the presentation of their data in order to achieve a certain value for the p statistic, which is commonly used to judge whether differences are likely due to chance [6]. Most researchers are well aware that their samples must be of sufficient size in order to achieve the statistical power to attain a significant difference. However, on the flip side, when one has very large quantities of data, it is very easy to obtain a p value showing that small, perhaps meaningless, differences are statistically significant.
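A rough sketch of the sample-size effect, assuming a simple two-sample z-test with equal group sizes and a known common standard deviation. The blood pressure numbers are hypothetical and the function name two_sided_p_value is my own; real analyses would use a method suited to the data.

```python
import math


def two_sided_p_value(mean_diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided p value for a two-sample z-test (equal group sizes, known common SD)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    # For a standard normal variable Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical: a clinically trivial 0.5 mmHg difference in systolic
# blood pressure, with a standard deviation of 15 mmHg in both groups.
for n in (100, 10_000, 1_000_000):
    p = two_sided_p_value(0.5, 15.0, n)
    print(f"n per group = {n:>9,}: p = {p:.3g}")
```

The difference itself never changes. Only the sample size does, and with enough data the same trivial difference crosses the conventional 0.05 threshold.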

The bottom line is that as we think about using data science, certainly in biomedicine and health, and the development of information systems to store and analyze it, we must consider the value of information. Just because data is big does not mean it is more important than when data is small. Data science needs to focus on all types and sizes of data.


1. Shannon, CE and Weaver, W (1949). The Mathematical Theory of Communication. Urbana, IL, University of Illinois Press.
2. Bar-Hillel, Y and Carnap, R (1953). Semantic information. British Journal for the Philosophy of Science. 4: 147-157.
3. Abecasis, GR, Auton, A, et al. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature. 491: 56-65.
4. Weiskopf, NG, Rusanov, A, et al. (2013). Sick patients have more data: the non-random completeness of electronic health records. AMIA Annual Symposium Proceedings 2013, Washington, DC. 1472-1477.
5. Rusanov, A, Weiskopf, NG, et al. (2014). Hidden in plain sight: bias towards sick patients when sampling patients with sufficient electronic health record data for research. BMC Medical Informatics & Decision Making. 14: 51. http://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/1472-6947-14-51.
6. Head, ML, Holman, L, et al. (2015). The extent and consequences of p-hacking in science. PLoS Biology. 13: e1002106. http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106.