Friday, December 30, 2016

A Different Annual Reflection For This Past Year

Every year since the inception of this blog, my last posting of the year has been a reflection looking back over the year that is ending. This year’s reflection marks the completion of eight years of this blog, and writing this year’s posting feels different. This is no doubt because this blog has been very much tied into the key events of informatics over the last decade, in particular the Health Information Technology for Economic and Clinical Health (HITECH) Act and other actions emanating from the Presidency of Barack Obama. This has been a period of activist government with respect to our field, and now the US electorate (at least according to the rules of the Electoral College) has chosen a different path going forward.

Fortunately, the need for informatics is not going away. Even if the Affordable Care Act is repealed, the underlying problems in healthcare that led to its passage are still a challenge. Healthcare in the US is still the most fragmented, expensive, and inefficient of any country in the world. This does not mean I would want to get seriously ill anywhere else in the world, but I still believe there is also an ethical imperative to provide basic healthcare to all citizens in the least costly manner. Medicine is supposed to be a calling for physicians, and not just a job. Although I no longer care for patients directly, I view my work as a physician-informatician to support the delivery of more universal and efficient care by supporting the data, information, and knowledge needs of healthcare delivery and patients.

Informatics also supports other aspects of health that will continue to be important even if reform of the US healthcare delivery system takes different directions. Informatics should support the health of the population through public health. It can support expansion of our knowledge and best practices by enhancing basic, clinical, and translational research. It can extend the reach of healthcare through telehealth and telemedicine. And because the US is still a prosperous nation to whom many still look for leadership, we can share our knowledge and tools for better health and healthcare with our fellow planetary citizens around the world, especially clinical and informatics professionals.

As for the blog itself, it continues to thrive. I am always gratified when people tell me they find it a valuable source of information, especially for key topics in the application of informatics as well as for issues facing people seeking to start or advance careers in the field. The number of page views continues to increase, and in this last month, the total barreled through the 400,000 mark for the 267 posts (including this one) I have made over the eight years. I have no plans to change anything with my approach to the blog any time soon.

There is no question that people who work in academia, in research, and in health IT face uncertainty about the future. Nonetheless, I am grateful that I have a loving family, wonderful colleagues, and a great many other friends who bring happiness and stability to my life.

Wednesday, December 28, 2016

Benchmarks to Assess the New President

According to the rules of US elections, Donald Trump won the Presidency and Republicans control the Senate and House of Representatives. I respect that.

This does not mean, however, that Trump and his party have any sort of mandate. Not only did Trump lose the popular vote by about 2.8 million votes (48.2%-46.2%), but he won three large states (Pennsylvania, Michigan, and Wisconsin) by a combined total of only 77,000 votes, and those states would have swung the election to Hillary Clinton. And while there was a narrow overall majority of votes for House Republican candidates this year, in recent years there has been a majority of votes for Democratic candidates despite a hefty majority of seats filled by Republicans, owing to gerrymandering. There were also more popular votes for Democratic Senate candidates this year, although that result was somewhat anomalous because the California Senate race was a run-off between two Democratic candidates.

There is no question that this election was tilted to Mr. Trump by the growing number of mostly white working class people who have been left behind by both economic and social changes in our society. His election was also aided by an unknown amount from Russian hacking, fake news, and the questionable decision by the FBI to raise its investigation of the email issue in the last weeks of the campaign. This was a candidate who set new records for fact-checkers disputing his statements and had a large following of people who believed those falsehoods.

As such, the outcome of this election is anything but a mandate for Donald Trump. Yes, he did obtain more electoral votes than Hillary Clinton, but his victory was extremely narrow, and he and other Republicans need to be careful of overreach. This is especially true since it is not clear that Mr. Trump really stands for the kind of people and their views that he is installing in his political leadership. (It is often not clear what he stands for at all, since his governing philosophy is not very detailed or consistent.)

But the new Republican majority may find it harder to improve upon the economic situation than what they have been handed. The US economy certainly still has a number of problems, notably income inequality and technology that is changing the nature of work, particularly manual work. By most measures, however, the US economy is actually doing well. We finish the year, and President Obama’s second term, with strong economic growth (Gross Domestic Product [GDP] up at a 3.5% annual rate last quarter and positive for most of his second term), low unemployment (currently 4.6%, nearly full employment), low inflation, and a booming stock market (Dow Jones Industrial Average closing in on 20,000). Gas prices are low and the proportion of people lacking health insurance is lower than it has been in decades.

I believe an important task is to hold President Trump accountable. We will want to see how he adheres to his conflicting campaign pledges and the results of those policies when they are implemented. This includes promises to massively slash taxes, increase defense and infrastructure spending, make no cuts to Medicare or Social Security, build a border wall and deport 11 million people, renegotiate trade deals and implement tariffs if necessary, and come up with "something better" as the Affordable Care Act is repealed. While I disagree with many of these actions, it will be important to see whether Mr. Trump carries them out, and if he does, what their impact is.

Even though a good deal of what Mr. Trump says bothers many of us, I believe it will be more important to look at his actions. I hope he will especially be held accountable by those who are not conservative ideologues, such as workers who have been displaced from coal-mining and manufacturing jobs and those who don’t believe that the new health insurance they have received through the Affordable Care Act will be taken away. I also hope the impact of his policies on the environment, including climate change, will be objectively measured. And, of course, I hope for an objective assessment of a foreign policy administered via Twitter.

While I believe Mr. Trump should be judged more for his actions and their outcomes, I don't think he should be let off the hook for his words either. This includes all the vitriol he spread through the years of President Obama, from stoking the fires of the birther movement to making false statements on the economy. Despite attempts to "unify" the electorate after a divisive election, we cannot forget Mr. Trump's insults and lies about individual people and groups, from women to Muslims to Mexicans. I still shake my head in amazement when people are asked to not take everything Trump said during the campaign literally, that it is legitimate to enter some sort of "post-truth" era, or that a good proportion of his

In the end, a President is not responsible for everything that happens on his or her watch. But for a narcissistic individual who takes credit for things that go right, even when that credit is not deserved, we should also hold him or her to objective measures of performance. While Mr. Trump has mastered the neutering of the press through social media and other means, I hope that responsible journalism will rise to the task and objectively report the impact of the words and policies that emanate from his Presidency and his political party.

Thursday, December 8, 2016

Coping With Adversarial Information Retrieval in Modern Times

When I first chose my area of research focus in my postdoctoral fellowship in biomedical informatics in the late 1980s, I was intrigued by information retrieval (IR; also known as search). While most in informatics were still focused on artificial intelligence and expert systems, I was fascinated by the notion that computers could provide information in response to users entering text. At that time, of course, there were only modest amounts of information to retrieve. The main source was bibliographic databases such as MEDLINE. While the full text of journals and even some textbooks was starting to become available, it was mostly text and not figures or images.

The world of search started to change with the advent of the World Wide Web in the early 1990s. I had actually been skeptical that the Web could even deliver more than text in real-time, given how slow the Internet was at that time. This was also a time when my colleagues at Oregon Health & Science University (OHSU) started putting on continuing medical education (CME) courses for physicians about the growing amount of information available (including via CD-ROM drives). But when we taught about searching the Web, we presented many caveats, especially because there was no control over the quality of information [1].

A related happening about this same time was the growth of spam email [2]. In the 1980s and even into the early 1990s, the only real users of Internet email were academics and techies. But as the Web and underlying Internet spread to broader populations, so did spam email, especially because it was so easy to reach massive numbers of people.

These developments all gave rise to the notion of “adversarial” IR, something that was initially difficult to fathom when we were trying to develop the most effective methods to provide access to the highest quality information available [3]. But as content emerged that we hoped users would not retrieve, there started an additional focus in IR that considered ways to avoid providing users the worst information.

One advance that improved the ability of Web searching to retrieve high-quality material was Google and its PageRank algorithm [4]. A major change pioneered by Google was to rank pages based not on measures of similarity between words in the query and page, at the time considered to be our best approach, but instead by how many other pages pointed to them. While not perfect, the number of links to a page is indeed associated with its quality, e.g., more pages will point to those from the National Library of Medicine or Mayo Clinic than to a less credible site.
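
To make the link-based idea concrete, here is a minimal sketch in Python of ranking pages by incoming links. It is a simplified, textbook-style power iteration in the spirit of PageRank, not Google's production algorithm. The page names and the toy link graph are hypothetical, and the damping value of 0.85 is an assumption (a commonly used default).

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to a score; scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with equal rank everywhere.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank evenly to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy Web: most pages point to "nlm", none point to "dubious".
toy_links = {
    "blog_a": ["nlm", "mayo"],
    "blog_b": ["nlm"],
    "mayo": ["nlm"],
    "nlm": ["mayo"],
    "dubious": ["nlm"],
}

ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page:8s} {ranks[page]:.3f}")
```

In this toy graph the heavily linked pages rise to the top and the page nobody links to stays near the baseline. Note that the score depends on the rank of the linking pages and not only on their count, which is one reason the real algorithm is harder to game than a raw link tally.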

Of course, this situation resulted in a number of other consequences, not the least of which was the emergence of search engine optimization (SEO), enabling people to fight against PageRank and related algorithms [5]. It also set off a tit-for-tat battle of search engine sites hiring armies of engineers to figure out how people were trying to game their systems [6]. In more recent years, the emergence of new information streams, most notably the Facebook newsfeed, has provided new opportunities and led to the proliferation of “fake news,” which has been blamed for influencing the recent US presidential election [7].

While technology will play some role in solving the adversarial IR problem, it will not succeed by itself. Clever programmers and others will likely always find ways to exploit approaches to limiting the spread of false or incorrect information. The sheer volume of such information makes human intervention an unlikely solution, and of course one person’s high quality information is another person’s trash heap.

The main way to solve the problem, however, is through education. It is all part of the basic modern information literacy everyone must have in the 21st century. Just as I have argued that statistics should be a topic taught in high school if not earlier, so should modern information literacy, including as it relates to health. While there will always be shades of gray in terms of information quality, people can and should be taught how to recognize that which is flagrantly false.

I hope we will learn from fake news, newer variants of spam email such as phishing, and other risks of the Internet era that we must train society to better understand our new information ecosystem, and how we can benefit from its value while minimizing its risk.


1. Hersh, WR, Gorman, PN, et al. (1998). Applicability and quality of information for answering clinical questions on the Web. Journal of the American Medical Association. 280: 1307-1308.
2. Goodman, J, Cormack, GV, et al. (2007). Spam and the ongoing battle for the inbox. Communications of the ACM. 50(2): 25-33.
3. Castillo, C and Davison, BD (2011). Adversarial Web Search. Delft, Netherlands, now Publishers.
4. Brin, S and Page, L (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems. 30: 107-117.
5. Anonymous (2015). The Beginner's Guide to SEO. Seattle, WA, SEOmoz.
6. Singhal, A (2004). Challenges in Running a Commercial Web Search Engine. Mountain View, CA, Google.
7. Davis, W (2016). Fake Or Real? How To Self-Check The News And Get The Facts. Washington, DC, National Public Radio.