Sunday, January 22, 2017

Response to Request for Information (RFI): Strategic Plan for the National Library of Medicine, National Institutes of Health

Under the leadership of its new Director, Patricia Brennan, PhD, RN, the National Library of Medicine (NLM) is undertaking a strategic planning process to develop goals and priorities for the NLM going forward. This process builds on a Request for Information (RFI) in 2015 from the NLM Working Group of the Advisory Committee to the National Institutes of Health (NIH) Director (ACD) to obtain input for a report on a vision for the future of NLM in the context of NLM’s leadership transition and emerging NIH data science priorities. The report was released in 2015. I posted to this blog both the comments that I submitted for the report and an overview of the report after it was published.

The new RFI asks for comments on four themes:
  1. Role of NLM in advancing data science, open science, and biomedical informatics
  2. Role of NLM in advancing biomedical discovery and translational science
  3. Role of NLM in supporting the public’s health: clinical systems, public health systems and services, and personal health
  4. Role of NLM in building collections to support discovery and health in the 21st century
For each theme, respondents are asked to:
  1. Identify what you consider an audacious goal in this area – a challenge that may be daunting but would represent a huge leap forward were it to be achieved
  2. The most important thing NLM does in this area, from your perspective
  3. Research areas that are most critical for NLM to conduct or support
  4. Other comments, suggestions, or considerations, keeping in mind that the aim is to build the NLM of the future
In the remainder of this post, I will provide the comments I submitted to the RFI. I chose to limit my comments to the first of the four themes because the role of NLM is to advance the other themes – discovery, translation, and the public’s health – in the context of the first theme – namely the field of biomedical informatics, and data/open science within it.

a. Identify what you consider an audacious goal in this area – a challenge that may be daunting but would represent a huge leap forward were it to be achieved. Include input on the barriers to and benefits of achieving the goal.

I have chosen to focus my comments on the first of the four themes because the role of NLM is to advance the other themes – discovery, translation, and the public’s health – by advancing the first theme – namely the field of biomedical informatics, and data/open science within it. Therefore, the most audacious goal for all of NLM is to build and sustain the infrastructure of biomedical informatics, i.e., the people, technology, and resources to advance discovery, translation, and the public’s health.

Biomedical informatics must leverage both achievements that are new, such as digital and networking technologies, and goals that are enduring, such as improving individual health, healthcare, public health, and research. The NLM must promote, educate about, and fund biomedical informatics and related disciplines at the level they deserve in relation to the larger biomedical research enterprise. While research in domain-specific areas (e.g., cancer, cardiovascular, mental health) is important, biomedical informatics can provide fundamental tools to advance science in all domain-specific areas. To achieve this, we still need basic research in biomedical informatics itself, improving our knowledge and tools in many areas, including but not limited to human-computer interaction, natural language understanding, standards and interoperability, data quality, the intersection of people and organizational issues with information technology, and workflow analysis.

b. The most important thing NLM does in this area, from your perspective.

Although there are many institutes within NIH (e.g., NCI, NHLBI, and the Fogarty International Center) and other entities outside of NIH (e.g., AHRQ and PCORI) that fund research in informatics-related areas, NLM is the only entity that funds basic research in biomedical informatics. Most of the other institutes and entities that fund informatics support projects that are highly applied and/or domain-focused. These projects are important, but basic informatics research is also key to improving discovery, translation, and the public’s health.

The NLM is also unique in developing emerging technologies, some of which we cannot foresee now. When I was an NLM informatics postdoctoral fellow in the late 1980s, I could not have imagined the emergence of the World Wide Web, the wireless ubiquitous Internet, modern mobile devices, or the widespread adoption of electronic health records that we now have. There are likely new technologies coming down the road that few if any of us can predict that will have major impacts on health and healthcare. It is critical that the NLM and the research it supports enable these technologies to be put to optimal use.

c. Research areas that are most critical for NLM to conduct or support.

Although it is critical for NLM to support research in biomedical informatics as applied to all areas of individual and public health and of healthcare and research, it is nearly unique in funding basic research in clinical informatics. A good deal of informatics research in the other NIH institutes is focused on basic science, e.g., genomics, bioinformatics, and computational biology. AHRQ and PCORI support clinical informatics research, but it is highly applied. Only NLM funds critical basic research in clinical informatics, and this function is vitally important as we strive to use informatics to achieve the triple aim of better health, improved healthcare, and reduced costs.

d. Other comments, suggestions, or considerations, keeping in mind that the aim is to build the NLM of the future.

Another critical function of NLM that has provided value and should be further augmented is its training programs for those who aspire to careers in informatics research. I count myself among many whose NLM fellowship training led to a successful career as a researcher, educator, and academician generally. NLM training grants have also provided support for my university to educate the next generation of informatics trainees, who have gone on to become successful researchers and other leaders in the field.

A final problem that I would like to see addressed is the name itself, "National Library of Medicine." This name does not connote all of what NLM does. Yes, the NLM is a world-renowned biomedical library, and that function is critically important to continue. But NLM also provides cutting-edge research and training in informatics, and an ideal change for NLM would be a name change to something like the "National Biomedical and Health Informatics Institute," of which a robust and innovative National Library of Medicine would be a vital part.

I look forward to seeing other input to the new NLM strategic planning process and the resulting strategic plan that will set priorities going forward for this great public resource that has benefitted patients, the healthcare system, and students, faculty, and others who have worked in biomedical informatics to advance human health.

Friday, January 20, 2017

What is the Value of Those Who Create and Disseminate Knowledge?

There is an old adage, “Those who can’t do, teach.” (And Woody Allen’s addition, “Those who can’t teach, teach gym.”) My usual retort is a quote from Aristotle, “Those that know, do. Those that understand, teach.”

But we seem to be entering an era where an individual’s worth is related mostly to his or her wealth. In addition, there are plenty of people, many of the same mind-set, who are highly critical of academia, in particular of people whose livelihood involves creating and/or disseminating knowledge.

I am not uncritical of some aspects of the academic world in which I work, but I am even more aghast at those who believe it to be misguided or unnecessary.

In essence, my job involves the creation and dissemination of knowledge. This takes a certain skill set and collection of talents, just like any other knowledge-oriented job. I believe that this work is important to society and worthy of its investment, even though the lion’s share of the funding of my teaching work comes from learners who pay tuition.

My job is hardly stress-free. Academia is like most pursuits in life, where a certain amount of stress and competition is good, leading to productivity and innovation. And there are times when the stress and pursuit become counterproductive.

I owe a lot to subsidized public academia that has enabled my professional success in life. I attended public schools for my entire education, from kindergarten through medical school. When I started college at the University of Illinois in 1976, tuition was $293 per semester. Not per course or per credit, but for all of the courses I took that term. Even medical school, also at the University of Illinois, was relatively inexpensive for me, with tuition around $3000 per year when I started in 1980. I am not against students having some “skin in the game” in higher education, but it must be within the means of anyone who wants to pursue it. By the same token, I believe that we in academia need to be accountable in providing a skill set that enables individuals to succeed in their chosen careers.

I am extremely gratified to have an academic job that I mostly enjoy going to each day. While most higher education faculty positions have a combination of research, teaching, and service, I have found my greatest passion in teaching. I particularly enjoy taking bodies of knowledge and distilling out the big themes and most salient facts, and I have received feedback from others that I have a knack for it. I do also enjoy research and building on the synergy of the two that characterizes optimal higher education. I make a good salary as a department chair at a public medical school. I could certainly make more money in other pursuits, but I have had plenty to live comfortably, save for retirement, send my children to college, and handle unexpected expenses.

I don’t begrudge rich people their wealth, especially those who earned it from modest beginnings and/or by producing things that truly benefit society. But wealth is hardly the only measure of a person’s contributions and value to society, and there must always be a role for those who create and disseminate knowledge.

Thursday, January 12, 2017

What is the Right Approach to Sharing Clinical Research Data?

While many people and organizations have long called for data from randomized clinical trials (RCTs) and other clinical research to be shared with other researchers for re-analysis and other re-use, the impetus for it accelerated about a year ago with two publications. One was a call by the International Committee of Medical Journal Editors (ICMJE) for de-identified data from RCTs to be shared as a condition of publication [1]. The other was the publication of an editorial in the New England Journal of Medicine wondering whether those who do secondary analysis of such data were “research parasites” [2]. The latter set off a flurry of debate across the spectrum, e.g., [3], from those who argued that primary researchers labored hard to devise experiments and collect their data, thus having claim to control over it, to those who argued that since most research is government-funded, the taxpayers deserve to have access to that data. (Some of those in the latter group proudly adopted the “research parasite” tag.)

Many groups and initiatives have advocated for the potential value of wider re-use of data from clinical research. The cancer genomics community has long seen the value of a data commons to facilitate sharing among researchers [4]. Recent US federal research initiatives, such as the Precision Medicine Initiative [5] and the 21st Century Cures program [6], envision an important role for large repositories of data to accompany patients in cutting-edge research. There are a number of large-scale efforts in clinical data collection that are beginning to accumulate substantial amounts of data, such as the National Patient-Centered Clinical Research Network (PCORnet) and the Observational Health Data Sciences and Informatics (OHDSI) initiative.

As with many contentious debates, there are valid points on both sides. The case for requiring publication of data is strong. As most research is taxpayer-funded, it only seems fair that those who paid are entitled to all the data for which they paid. Likewise, all of the subjects were real people who potentially took risks to participate in the research, and their data should be used for discovery of knowledge to the fullest extent possible. And finally, new discoveries may emerge from re-analysis of data. This was actually the case that prompted the Longo “research parasites” editorial, which was praising the “right way” to do secondary analysis, including working with the original researchers. The paper that the editorial described had discovered that the lack of expression of a gene (CDX2) was associated with benefit from adjuvant chemotherapy [7].

Some researchers, however, are pushing back. They argue that those who carry out the work of designing, implementing, and evaluating experiments certainly have some exclusive rights to the data generated by their work. Some also question whether the cost is a good expenditure of limited research dollars, especially since the demand for such data sets may be modest and the benefit is not clear. One group of 282 researchers in 33 countries, the International Consortium of Investigators for Fairness in Trial Data Sharing, notes that there are risks, such as misleading or inaccurate analyses as well as efforts aimed at discrediting or undermining the original research [8]. They also express concern about the costs, given that there are over 27,000 RCTs performed each year. As such, this group calls for an embargo on reuse of data for two years plus another half-year for each year of the length of the RCT. Even those who support data sharing point out the requirement for proper curation, wide availability to all researchers, and appropriate credit to and involvement of those who originally obtained the data [9].
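To make the proposed embargo concrete, here is a minimal sketch of the arithmetic as I read it: two years plus half a year for each year of trial length. The function name and the example trial lengths are my own illustrative assumptions and do not come from the consortium's article.

```python
def embargo_years(trial_length_years: float) -> float:
    """Embargo under the proposal described above:
    a two-year base plus half a year per year of trial length.
    (Hypothetical helper; the name and signature are illustrative.)"""
    if trial_length_years < 0:
        raise ValueError("trial length cannot be negative")
    return 2.0 + 0.5 * trial_length_years


if __name__ == "__main__":
    # Illustrative trial lengths, not taken from any cited study
    for length in (1, 3, 5):
        print(f"{length}-year RCT: {embargo_years(length)}-year embargo")
```

Under this reading, a three-year trial would carry a 3.5-year embargo, so data from longer trials would stay restricted well beyond the two-year base.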

There are a number of challenges to more widespread dissemination of RCT data for re-use. A number of pharmaceutical companies have begun making such data available over the last few years. Their experience has shown that the costs are not insignificant (estimated to be about $30,000-$50,000 per RCT) and a scientific review process is essential [10]. Another analysis found that the time to re-analyze data sets can be long, and so far the number of publications has been small [11]. An additional study found that identifiable data sets were only explicitly visible from 12% of all clinical research funded by the National Institutes of Health in 2011 [12]. This means that from 2011 alone, there are possibly more than 200,000 data sets that could be made publicly available, indicating some type of prioritization might be required.
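For a rough sense of scale, the per-trial cost estimate can be multiplied by the figure of over 27,000 RCTs per year mentioned earlier. This is only a back-of-the-envelope sketch: it assumes the pharmaceutical industry's per-RCT cost would apply to every trial, which neither source claims, and the variable and function names are my own.

```python
# Figures quoted in the text; combining them is an assumption of this sketch
RCTS_PER_YEAR = 27_000        # "over 27,000 RCTs performed each year"
COST_PER_RCT_LOW = 30_000     # low end of estimated cost per RCT (USD)
COST_PER_RCT_HIGH = 50_000    # high end of estimated cost per RCT (USD)


def annual_sharing_cost(n_trials: int, cost_per_trial: int) -> int:
    """Total cost in USD of preparing n_trials data sets for sharing."""
    return n_trials * cost_per_trial


low = annual_sharing_cost(RCTS_PER_YEAR, COST_PER_RCT_LOW)
high = annual_sharing_cost(RCTS_PER_YEAR, COST_PER_RCT_HIGH)
print(f"Rough annual cost: ${low:,} to ${high:,}")
```

Even if the true per-trial cost were well below these estimates, sharing data from every trial would be a substantial recurring expense, which is one argument for prioritization.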

There are also a number of informatics-related issues to be addressed. These not only include adherence to standards and interoperability [13], but also attention to workflows, integration with other data, such as that from electronic health records (EHRs), and consumer/patient engagement [14]. Clearly the trialists who generate the data must be given incentives for their data to be re-used [15]. My own work assessing the caveats of re-using EHR data is somewhat applicable here too, in that even RCT data may not have the breadth of data or cover sufficient periods of time for additional analyses [16].

There is definitely great potential for re-use of RCT and other clinical research data to advance research and ultimately health and clinical care for the population. However, it must be done in ways that represent an appropriate use of resources and result in data that truly advances research, clinical care, and ultimately individual health.

1. Taichman, DB, Backus, J, et al. (2016). Sharing clinical trial data: a proposal from the International Committee of Medical Journal Editors. New England Journal of Medicine. 374: 384-386.
2. Longo, DL and Drazen, JM (2016). Data sharing. New England Journal of Medicine. 374: 276-277.
3. Berger, B, Gaasterland, T, et al. (2016). ISCB’s initial reaction to The New England Journal of Medicine Editorial on data sharing. PLoS Computational Biology. 12(3): e1004816.
4. Grossman, RL, Heath, AP, et al. (2016). Toward a shared vision for cancer genomic data. New England Journal of Medicine. 375: 1109-1112.
5. Collins, FS and Varmus, H (2015). A new initiative on precision medicine. New England Journal of Medicine. 372: 793-795.
6. Kesselheim, AS and Avorn, J (2017). New "21st Century Cures" legislation: speed and ease vs science. Journal of the American Medical Association. Epub ahead of print.
7. Dalerba, P, Sahoo, D, et al. (2016). CDX2 as a prognostic biomarker in stage II and stage III colon cancer. New England Journal of Medicine. 374: 211-222.
8. Anonymous (2016). Toward fairness in data sharing. New England Journal of Medicine. 375: 405-407.
9. Merson, L, Gaye, O, et al. (2016). Avoiding data dumpsters — toward equitable and useful data sharing. New England Journal of Medicine. 374: 2414-2415.
10. Rockhold, F, Nisen, P, et al. (2016). Data sharing at a crossroads. New England Journal of Medicine. 375: 1115-1117.
11. Strom, BL, Buyse, ME, et al. (2016). Data sharing — is the juice worth the squeeze? New England Journal of Medicine. 375: 1608-1609.
12. Read, KB, Sheehan, JR, et al. (2015). Sizing the problem of improving discovery and access to NIH-funded data: a preliminary study. PLoS ONE. 10(7): e0132735.
13. Kush, R and Goldman, M (2014). Fostering responsible data sharing through standards. New England Journal of Medicine. 370: 2163-2165.
14. Tenenbaum, JD, Avillach, P, et al. (2016). An informatics research agenda to support precision medicine: seven key areas. Journal of the American Medical Informatics Association. 23: 791-795.
15. Lo, B and DeMets, DL (2016). Incentives for clinical trialists to share data. New England Journal of Medicine. 375: 1112-1115.
16. Hersh, WR, Weiner, MG, et al. (2013). Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care. 51(Suppl 3): S30-S37.