Collaborative Data Science for Healthcare

Data and learning should be at the front and center of healthcare delivery. In this course, we bring together computer scientists, health providers and social scientists collaborating to improve population health by analyzing and mining data routinely collected in the process of patient care.

After a course session ends, it will be archived.
Estimated 12 weeks
2–3 hours per week
Progress at your own speed
Optional upgrade available

About this course

Research has traditionally been viewed as a purely academic undertaking, especially in limited-resource healthcare systems. Clinical trials, the hallmark of medical research, are expensive to perform and take place primarily in countries that can afford them. Around the world, the blood pressure thresholds for hypertension and the blood sugar targets for patients with diabetes are established based on research performed in a handful of countries. There is an implicit assumption that the findings and validity of studies carried out in the US and other Western countries generalize to patients around the world.

This course was created by members of MIT Critical Data, a global consortium of healthcare practitioners, computer scientists, and engineers from academia, industry, and government that seeks to place data and research at the front and center of healthcare operations.

Big data is proliferating in diverse forms within the healthcare field, not only because of the adoption of electronic health records, but also because of the growing use of wireless technologies for ambulatory monitoring. The world is abuzz with applications of data science in almost every field – commerce, transportation, banking, and more recently, healthcare. These breakthroughs are due to rediscovered algorithms, powerful computers to run them, and most importantly, the availability of bigger and better data to train the algorithms. This course provides an introductory survey of data science tools in healthcare through several hands-on workshops and exercises.

Who this course is aimed at

The most daunting global health issues right now are the result of interconnected crises. In this course, we highlight the importance of a multidisciplinary approach to health data science. It is intended for front-line clinicians and public health practitioners, as well as computer scientists, engineers and social scientists, whose goal is to understand health and disease better using digital data captured in the process of care.

We highly recommend that this course be taken as part of a team consisting of clinicians and computer scientists or engineers. Learners from the healthcare sector are likely to have difficulty with the programming aspects, while computer scientists and engineers are unlikely to be familiar with the clinical context of the exercises and workshops.

The MIT Critical Data team would like to acknowledge the contribution of the following members: Aldo Arevalo, Alistair Johnson, Alon Dagan, Amber Nigam, Amelie Mathusek, Andre Silva, Chaitanya Shivade, Christopher Cosgriff, Christina Chen, Daniel Ebner, Daniel Gruhl, Eric Yamga, Grigorich Schleifer, Haroun Chahed, Jesse Raffa, Jonathan Riesner, Joy Tzung-yu Wu, Kimiko Huang, Lawerence Baker, Marta Fernandes, Mathew Samuel, Philipp Klocke, Pragati Jaiswal, Ryan Kindle, Shrey Lakhotia, Tom Pollard, Yueh-Hsun Chuang, Ziyi Hou.

At a glance

  • Institution: MITx
  • Subject: Data Analysis & Statistics
  • Level: Advanced
  • Prerequisites: Experience with R, Python and/or SQL is required unless the course is taken with computer scientists in the team.

  • Language: English

What you'll learn

  • Principles of data science as applied to health

  • Analysis of electronic health records

  • Artificial intelligence and machine learning in healthcare

Section 1 provides a general perspective on digital health data, including their potential and their challenges for research, retrospective analysis, and modeling. Section 2 focuses on the Medical Information Mart for Intensive Care (MIMIC) database, curated by the Laboratory for Computational Physiology at MIT. Learners will have an opportunity to develop their analytical skills while following a research project, from the definition of a clinical question to the assessment of the analysis’ robustness. The last section is a collection of workshops on applications of data science in healthcare.

Who can take this course?

Unfortunately, learners from one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.
