
Foundations of Data Science: Inferential Thinking by Resampling

Provided by University of California, Berkeley (BerkeleyX)
Introductory
See prerequisites
4–6 hours per week, for 5 weeks
Free

$99 USD for graded exams and assignments, plus a certificate

Learn how to use inferential thinking to draw conclusions about unknowns based on data in random samples.

Before you start

Course opens: Jan 28, 2019
Course ends: Dec 31, 2019

What you will learn

  • The logical and conceptual frameworks of statistical inference
  • How to conduct hypothesis testing, permutation testing, and A/B testing
  • The purpose and power of resampling methods
  • The relationship between sample size and accuracy
  • P-values, quantifying uncertainty, and generating confidence intervals using the bootstrap method
  • How to interpret the results from hypothesis testing

Overview

This course will teach you the power of statistical inference: given a random sample, how do we predict some quantity that we cannot observe directly?

Using real-world examples from a wide array of domains including law, medicine and football, you’ll learn how data scientists draw conclusions about unknowns based on the data available. Often, the data we have is incomplete, yet we’d still like to draw inferences about the world and quantify the uncertainty in our conclusions. This is called statistical inference. In this course, you will learn the framework for statistical inference and apply it to real-world data sets.

Notably, you will develop the practice of hypothesis testing—comparing theoretical predictions to actual data, and choosing whether to accept those predictions. This method allows us to evaluate theories or hypotheses about how the world works.
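
As a rough illustration of comparing a theoretical prediction to data, here is a minimal Python sketch of a simulation-based test. It is not taken from the course materials: the function name `simulate_p_value`, the fair-coin hypothesis, and the observed count of 60 heads in 100 flips are all assumptions made up for this example.

```python
import random


def simulate_p_value(observed_heads, num_flips, num_simulations=10_000, seed=0):
    """Empirical P-value for the null hypothesis that a coin is fair.

    The test statistic is the distance between the number of heads and
    the number expected under the null hypothesis (num_flips / 2).
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    expected = num_flips / 2
    observed_stat = abs(observed_heads - expected)

    at_least_as_extreme = 0
    for _ in range(num_simulations):
        # Simulate one experiment under the null: each flip is heads with chance 1/2.
        heads = sum(rng.random() < 0.5 for _ in range(num_flips))
        if abs(heads - expected) >= observed_stat:
            at_least_as_extreme += 1

    # Proportion of simulated experiments at least as far from the
    # prediction as the observed data.
    return at_least_as_extreme / num_simulations


# Hypothetical observation: 60 heads in 100 flips.
p = simulate_p_value(observed_heads=60, num_flips=100)
print(f"Empirical P-value: {p:.3f}")
```

A small P-value means data like these would rarely arise if the prediction were true. Where to draw the line between "rare" and "not rare" is a convention chosen before looking at the data.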

You will also learn how to quantify the uncertainty in the conclusions you draw from hypothesis testing. This helps assess whether patterns that appear to be present in the data actually represent a true relationship in the world, or whether they might merely reflect random fluctuations due to noise. Throughout this course, we will go over multiple methods for estimation and hypothesis testing, based on simulations and the bootstrap method. Finally, you will learn about randomized controlled experiments and how to draw conclusions about causality.
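
To make the bootstrap idea concrete, the sketch below resamples a data set with replacement and reads an approximate 95% confidence interval off the middle of the resampled estimates (the percentile method). It is an illustration under stated assumptions, not the course's own code: the function name `bootstrap_percentile_interval`, its parameters, and the sample values are hypothetical.

```python
import random
import statistics


def bootstrap_percentile_interval(sample, statistic=statistics.median,
                                  num_resamples=5_000, level=0.95, seed=0):
    """Approximate confidence interval for a statistic via the bootstrap.

    Each resample draws len(sample) values from the sample with replacement;
    the interval spans the middle `level` fraction of the resampled estimates.
    """
    rng = random.Random(seed)  # seeded so the result is repeatable
    n = len(sample)
    estimates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(num_resamples)
    )
    tail = (1 - level) / 2
    lower = estimates[round(tail * num_resamples)]
    upper = estimates[round((1 - tail) * num_resamples) - 1]
    return lower, upper


# Hypothetical sample values, invented for this example.
sample = [12, 15, 9, 22, 17, 14, 30, 11, 16, 13, 19, 25, 10, 18, 21]
low, high = bootstrap_percentile_interval(sample)
print(f"Approximate 95% interval for the median: {low} to {high}")
```

The interval is only as good as the original sample: the bootstrap treats a large random sample as a stand-in for the population, so it can mislead when the sample is small or not drawn at random.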

The course emphasizes the conceptual basis of inference, the logic of the decision-making process, and the sound interpretation of results.

Meet your instructors

Ani Adhikari
Teaching Professor of Statistics
UC Berkeley
John DeNero
Giancarlo Teaching Fellow in the EECS Department
UC Berkeley
David Wagner
Professor of Computer Science
UC Berkeley

This course is part of:

Earn a Professional Certificate in 2–4 months if courses are taken one at a time.

  1. 20–30 hours of effort

    Learn the basics of computational thinking, an essential skill in today’s data-driven world, using the popular programming language, Python.

  2. Foundations of Data Science: Inferential Thinking by Resampling
  3. 24–36 hours of effort

    Learn how to use machine learning, with a focus on regression and classification, to automatically identify patterns in your data and make better predictions.
