Statistics.comX: Applied Data Science Ethics

AI’s popularity has resulted in numerous well-publicized cases of bias, injustice, and discrimination. Often these harms occur in machine learning projects that have the best of goals, developed by data scientists with good intentions. This course, the second in the data science ethics program for both practitioners and managers, provides guidance and practical tools to build better models and avoid these problems.

Applied Data Science Ethics
4 weeks
4–5 hours per week
Self-paced
Progress at your own speed
Free
Optional upgrade available

There is one session available:

After a course session ends, it will be archived.
Starts Mar 28
Ends Dec 31

About this course

Concern about the harmful effects of machine learning algorithms and big data AI models (bias and more) has resulted in greater attention to the fundamentals of data ethics. News stories appear regularly about credit algorithms that discriminate against women, medical algorithms that discriminate against African Americans, hiring algorithms that base decisions on gender, and more. In most cases, the data scientists who developed and deployed these decision-making algorithms and data processes had no such intentions and were unaware of the harmful impact of their work.

This data science ethics course, the second in the data science ethics program for both practitioners and managers, provides guidance and practical tools to build better models, do better data analysis, and avoid these problems. You’ll learn about:

  • Tools for model interpretability

  • Global versus local model interpretability methods

  • Metrics for model fairness

  • Auditing your model for bias and fairness

  • Remedies for biased models

The course offers real-world problems and datasets, a framework data scientists can use to develop their projects, and an audit process to follow in reviewing them. Case studies with ethical considerations, along with Python code, are provided.
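As a rough illustration of what a "metric for model fairness" can look like, here is a minimal Python sketch. It is not taken from the course materials. It computes the selection rate and false positive rate for each group, then the ratio of the lowest to the highest selection rate. The function names, the groups "A" and "B", and the toy data are all hypothetical.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return the selection rate and false positive rate for each group."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "neg": 0, "fp": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        c["pred_pos"] += pred          # predicted favorable outcome
        if truth == 0:
            c["neg"] += 1              # actual negatives in this group
            c["fp"] += pred            # actual negative but predicted positive
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "selection_rate": c["pred_pos"] / c["n"],
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else float("nan"),
        }
    return rates


def selection_rate_ratio(rates):
    """Lowest group selection rate divided by the highest (1.0 means parity)."""
    values = [r["selection_rate"] for r in rates.values()]
    highest = max(values)
    return min(values) / highest if highest else float("nan")


# Made-up data: 1 = favorable outcome; groups "A" and "B" are hypothetical.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
ratio = selection_rate_ratio(rates)
print(rates)
print(ratio)
```

In this toy data, group A is selected three times as often as group B. A bias audit would flag a gap like that for closer review.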

At a glance

  • Institution: Statistics.comX
  • Subject: Ethics
  • Level: Intermediate
  • Prerequisites:
    • Principles of Data Science Ethics
    • The course uses Python code for illustration, so some familiarity with Python is assumed.
    • You will need a Gmail account for the lab in Module 3, which is hosted on Google Colab (Colaboratory).
  • Associated programs:
  • Language: English
  • Video Transcript: English
  • Associated skills: Data Ethics, Auditing, Python (Programming Language), Algorithms, Data Analysis, News Stories, Machine Learning Algorithms, Machine Learning, Artificial Intelligence, Decision Making, Big Data, Data Science

What you'll learn

  • How to evaluate predictor impact in black box models using interpretability methods
  • How to explain the average contribution of features to predictions and the contribution of individual feature values to individual predictions

  • How to assess the performance of models with metrics to measure bias and unfairness

  • How to describe potential ethical issues that can arise with image and text data, and how to address them

  • How to conduct an audit of a data science project from an ethical standpoint to identify possible harms and potential areas for bias mitigation or harm reduction
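To make "evaluating predictor impact in black box models" more concrete, the sketch below shows permutation importance, one common global interpretability method. Whether the course uses this exact method is an assumption; the listing does not say. The idea is to shuffle one feature at a time and measure how much the model's accuracy drops. The `toy_model` function, the data, and all other names are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(labels)


def permutation_importance(model, rows, labels, n_features, seed=0):
    """Accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between feature j and the labels
        permuted = [
            row[:j] + [value] + row[j + 1:] for row, value in zip(rows, column)
        ]
        importances.append(baseline - accuracy(model, permuted, labels))
    return importances


def toy_model(row):
    """Hypothetical black box: only the first feature affects the prediction."""
    return 1 if row[0] > 0.5 else 0


# Made-up data; labels come from the toy model, so baseline accuracy is 1.0.
rows = [[0.1, 5.0], [0.9, 3.0], [0.2, 8.0], [0.8, 1.0], [0.3, 7.0], [0.7, 2.0]]
labels = [toy_model(row) for row in rows]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

The second feature is never used by the toy model, so shuffling it leaves accuracy unchanged and its importance is zero. Shuffling the first feature can only keep accuracy the same or lower it.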

In this course we will mostly be addressing things the data scientist can do to ensure that their projects and solutions are designed and implemented responsibly. We will primarily focus on issues of bias and unfairness across protected groups.

This course is arranged in 4 modules. We estimate that you will need to spend at least 5 hours per week. The course is self-paced, so you have the flexibility to complete modules in your own time.

Week 1 – Audit and Remediation

  • Videos:

    • Introduction

    • Audit and Remediation

    • Confusion Matrix

    • Beyond Classic Bias

    • Regression

  • Knowledge Checks

  • Lab 1 (for verified users only)

  • Discussion Prompt (for verified users only)

Week 2 – Interpretability in Practice

  • Videos:

    • Interpretability

    • Global Interpretability

    • Fidelity, Robustness, Caveats

    • Local Interpretability Methods

  • Knowledge Checks

  • Reading

  • Lab 2 (for verified users only)

  • Discussion Prompt (for verified users only)

Week 3 – Image and Text Data

  • Videos:

    • Image and Text Data

    • Neural Net Interpretability

  • Knowledge Checks

  • Readings

  • Lab 3 (for verified users only); you will need a Gmail account for this lab

  • Discussion Prompt (for verified users only)

Week 4 – Tools and Documentation

  • Videos:

    • Tools and Documentation

  • Readings

  • Knowledge Checks

  • Quiz (for verified users only)

Please note:

  • There are 4 modules in total.

  • Labs are for verified users only. They are 'open book' and there is no set time limit. You will need a Gmail account for the Week 3 lab, which is hosted on Google Colab (Colaboratory).

  • The exercises involve hands-on work with Python (we will provide useful hints).

  • You will only have one attempt to answer each exercise.

  • You can complete the exercises at any time while the course is open; however, we recommend that you complete them sequentially, after you complete the relevant module.

This course is part of the Data Science Ethics Professional Certificate Program

Expert instruction
2 skill-building courses
Self-paced
Progress at your own speed
2 months
4 - 5 hours per week