This fascination guided my early research interests in the area of information retrieval. I write about it now because every spring I teach my course on this topic in the OHSU graduate program, BMI 514/614. (Hence the title of this posting.) My interest in this area resulted in dozens of scientific papers and a textbook, currently in its third edition [1]. Despite my admiration for today's systems, I always have to ask myself: why didn't I think of ranking the output (Web pages) by how many other pages pointed to them? Had I thought of that before a couple of Stanford graduate students named Brin and Page, my life might be considerably different. Or at least my wealth!
I suppose you are getting up in years when you marvel at how things are now relative to how you remember them. I certainly recall "searching" when I was in medical school in the 1980s, which involved thumbing through the giant Index Medicus books on long shelves in the library. You would "link" to the full text by walking to a different part of the library where the journals were. If your needs were really critical, you could call on a librarian for help, who would take your request to a special computer that accessed a database somewhere (which happened to be MEDLINE, from the National Library of Medicine).
I actually did my first on-line searching in the 1980s. I was able to access PaperChase, and later ELHILL, through dial-up networks, though at a price. For an even heftier price, you could get access to the full text … at least "text" in monospaced font and no figures or images. The world did advance, and by 1998 you could search PubMed for free. (Al Gore, who actually deserves more credit in this area than his critics give him, did the first "free" search.)
Now, of course, searching is ubiquitous. You can't even not do it, since most browsers will throw you into a search engine when you type in an invalid Web address (URL). And the world not only searches, but searches for health information. The two major periodic surveys of health information searching show that 80% of Internet users have searched for health information for themselves, their family, or their friends [2, 3].
Of course, like many areas of informatics, while use of systems is ubiquitous, not all of the problems of systems are solved. Indeed, a few years ago I wrote a short piece on this topic [4]. As wonderful as today's search systems are, we still have many areas for improvement. In that paper, I identified four areas where grand challenges remained:
- Content - getting diverse users to the right information for the right task
- Indexing - developing better metadata to get searchers to that proper content
- Linkage - allowing navigation across multiple resources, even those of different publishing entities
- Access - making access as open as possible but still being protective of intellectual property
References
1. Hersh, W. (2009). Information Retrieval: A Health and Biomedical Perspective (3rd Edition). New York, NY. Springer.
2. Fox, S. (2011). Health topics. Washington, DC, Pew Internet & American Life Project. http://www.pewinternet.org/~/media//Files/Reports/2011/PIP_HealthTopics.pdf.
3. Taylor, H. (2010). "Cyberchondriacs" on the Rise? Those who go online for healthcare information continues to increase. Rochester, NY, Harris Interactive. http://www.harrisinteractive.com/vault/HI-Harris-Poll-Cyberchondriacs-2010-08-04.pdf.
4. Hersh, W. (2008). Ubiquitous but unfinished: grand challenges for information retrieval. Health Information and Libraries Journal, 25(Suppl 1): 90-93.
Tom, you are referring to the PCAST report, on which I have shared my thoughts in another post (http://informaticsprofessor.blogspot.com/2011/03/pcast-report-whats-big-deal.html).
The PCAST universal exchange language (UEL) refers to a granular method for representing clinical data. The field of IR (search) has historically focused on what I call "knowledge-based information," which I distinguish from "patient-based information." The former focuses on the knowledge (science) base of health and biomedicine, whereas the latter represents information about patients.
Of course, moving forward, the distinction between the two may blur, especially as the "learning healthcare system" takes hold.
I do cover some aspects of clinical/patient data in my course, especially the natural language processing (NLP) aspects, which I also covered in a post here (http://informaticsprofessor.blogspot.com/2011/03/natural-language-processing-dream-that.html).
I do agree that these issues are important going forward, though the UEL concept needs a lot more work before it can be operationalized, as the ONC HIT Policy Committee agrees.