I have written over the years about the need for all who work in biomedical and health informatics to have appropriate knowledge and skills in data science, machine learning (ML), artificial intelligence (AI), and related topics. I am now excited to announce that our OHSU Biomedical Informatics Graduate Program is launching a new course in Applied Clinical Data Science and Machine Learning for Health & Clinical Informatics (HCIN) majors.
The goal of this new course is not to give students mastery of ML and AI tools and techniques; rather, it is to provide a conceptual understanding of their practical application in health and biomedicine. The course is not meant to be a substitute for the sequence of courses available in the other major in our program, Bioinformatics & Computational Biomedicine (BCB), whose offerings delve far more deeply into the theory, mathematics, and programming of these topics and include:
- BMI 551/651 - Statistical Methods
- BMI 531/631 - Probability and Statistical Inference
- BMI 543/643 - Machine Learning
- BMI 525/625 - Principles and Practice of Data Visualization
The new HCIN course will focus on applied data science and machine learning, emphasizing clinical data sets and the clinical issues and challenges that arise in their application. While the course will have some programming activity (requiring Python programming as a prerequisite), it will take a hands-on, high-level view of the different types of machine learning methods and their applications. It will also cover data management and selection, pitfalls in building and deploying models, and critical appraisal of the clinical machine learning literature. The course aims to provide an in-depth understanding for those who will work alongside the experts who build models and who develop, implement, and evaluate machine learning applications in health and clinical settings.
The textbook for the course will be: Hoyt, R. and Muenchen, R. (Eds.), 2019. Introduction to Biomedical Data Science, Lulu.com. The course syllabus provides further details on the topics to be covered.
The content of the course will be based on a combination of what faculty and students believe is most important for a course like this. Among the topics that will be included are:
- Data sources - electronic health records, registries (e.g., N3C, AllOfUs), patient-generated, social media, public health
- Data preparation (wrangling) - cleaning, quality analysis, feature selection, de-biasing
- Exploratory data analysis - summaries, correlations, visualizations
- Machine learning approaches and models - supervised, unsupervised, reinforcement, deep learning
- Software and tools available
- Common pitfalls and misunderstandings of applying machine learning
- Critical appraisal of clinical machine learning literature
- Ethical issues and challenges
The 3-credit course will be taught in the OHSU spring academic quarter, which runs from late March to early June. The lead instructors will be Steven Chamberlin, ND and myself, with other department faculty contributing. As with all courses in the HCIN major, it will be mostly online and asynchronous, with some optional synchronous activities (which will be recorded for those not able to attend). This course will be different from, and complementary to, other data science-related courses in the HCIN major, including:
- BSTA 525 - Introduction to Biostatistics
- BMI 540/640 - Computer Science and Programming for Clinical Informatics
- BMI 544/644 - Databases
- BMI 524/624 - Data Analytics for Healthcare
- BMI 516/616 - Standards/Interoperability in Healthcare
- BMI 537/637 - Healthcare Quality
- BMI 525/625 - Principles and Practice of Data Visualization
I will be excited to see how this course is received and how it evolves based on feedback from students and others. I suspect there will be interest beyond our graduate program.