Wednesday, February 7, 2024

Translational AI: A Necessity and Opportunity for Biomedical Informatics and Data Science

How much of the hype around artificial intelligence (AI) will translate into true impact on health, healthcare, and research is unknown. The potential benefits are unequivocal, from assisting patients in pursuing actions to improve their health, to giving guidance to clinicians in diagnosis and treatment, to helping researchers find information and devise new ideas to advance their research.

I have published an invited post in the National Library of Medicine (NLM) Musings from the Mezzanine Blog, the blog of the Director of the NLM. I chose to update some of my past writings posted in this blog with a new discussion of what I call translational AI.

The tl;dr is:

  • The actual day-to-day use of clinical AI in healthcare is still modest, according to surveys.
  • While thousands of machine learning model papers have been published, along with many systematic reviews of those papers, there is a much smaller number, probably on the order of 100, of randomized controlled trials (RCTs) of AI interventions in healthcare.
  • Of those RCTs, not all have resulted in positive outcomes, and a number of them have risk-of-bias concerns.

Clearly, as in all of healthcare, we cannot do RCTs on every permutation of model, implementation, setting, etc. of AI. However, we must treat AI the same way as any other tool we use in healthcare: Show us the evidence. Granted, evaluating the use of AI has plenty of differences from evaluating other interventions used in patient care, such as drugs and devices. It is difficult to conjure a “placebo” for AI, and hard to perform controlled studies when AI, such as ChatGPT, is all around us.

Nonetheless, we can apply evidence-based medicine (EBM) to help inform its clinical use. The ideal way to do that is through RCTs, or better yet, systematic reviews of RCTs. As I note in the post, this is imperative not only for those of us who promote the use of AI and other biomedical and health informatics interventions, but also for students and trainees looking for projects to develop impactful research programs in their careers.

Tuesday, January 30, 2024

Whither Search? A New Perspective on the Impact of Generative AI on Information Retrieval (IR)

When I was putting the finishing touches on the 4th edition of my textbook on information retrieval (IR, also known as search) in the domain of biomedicine and health in 2020, I wondered whether the major problems in the field of IR were mostly solved. Retrieval systems such as Google for general Web searching and PubMed for the biomedical literature were robust and mature. One literally had the world’s written knowledge at their fingertips for general and biomedical topics from these systems, respectively (even if paywalls did not always allow immediate access to the content).

There were certainly some areas of IR where additional work was needed and important, e.g., search over specific types of content such as social media or, in the case of my own research, electronic health record (EHR) data and text. There were also some nascent advances in the application of machine learning, although the gains in experimental results were more incremental than transformative.

But any staidness of IR was upended by the emergence of generally available generative artificial intelligence (AI) chatbots, based on large language models (LLMs), initially with ChatGPT and with others soon to follow. Shortly thereafter came generative AI capabilities added to the two major Web search engines, Microsoft Bing and Google. All of a sudden, searching the Web was transformed in ways that most of us did not see coming.

I recently took advantage of the call for papers for a special issue devoted to ChatGPT and LLMs in biomedicine and health of the flagship journal for the field of informatics, JAMIA, to write a perspective piece on why search is still important, even in the era of generative AI. At least for me, while the answer to my question is important in a search, it is also critical to know where the information came from. In addition, as I am commonly synthesizing my own knowledge and views on a topic, I do not just want a single generative AI answer to my question but rather the source articles and documents so I can compare and contrast different views and develop my own answer.

At the close of the paper, I do acknowledge that there may well be areas of IR where generative AI will have a major impact going forward. I know that there is a lot of buzz around retrieval-augmented generation (RAG), although for many of the questions on which I search, I am much more interested in generation-augmented retrieval (GAR?). That is, how can generative AI methods improve the way we search to steer us to the kinds of authoritative, originally sourced information we seek to carry out our work?

The day before the article was published, a reporter who came across my preprint wrote a piece on the impact of AI on search, noting some of the issues I raise with regard to accuracy and authority for search in fields like medicine and in academia.

The paper itself has been published in JAMIA as an Advance Article, Hersh W, Search still matters: information retrieval in the era of generative AI, Journal of the American Medical Informatics Association, 2024, ocae014. Unfortunately, the open-access publishing fee for JAMIA is fairly steep ($4125), especially for a short perspective piece like this, but those wanting to read it can access the preprint that I posted.