
DelftX: AI skills for Engineers: Supervised Machine Learning

Learn the fundamentals of machine learning to help you correctly apply various classification and regression machine learning algorithms to real-life problems using the Python toolbox scikit-learn.

6 weeks
6–8 hours per week
Self-paced
Progress at your own speed
Optional verification available

There is one session available:

Once the course session ends, the course will be archived.
Starts Feb 20
Ends Mar 26

About this course


Machine learning classification and regression techniques have potential uses in various engineering disciplines. These machine learning models allow you to make predictions for a category (classification) or for a number (regression) given sensor data, and can be used in, for example, predicting properties of objects (such as their weight or shape).

Using hands-on and interactive exercises you will get insight into:

Machine learning and its variants, such as supervised learning, semi-supervised learning, unsupervised learning and reinforcement learning.

Regression techniques such as linear regression, K-nearest neighbor regression, how to deal with outliers and evaluation metrics such as the mean squared error (MSE) and mean absolute error (MAE).

Classification techniques such as the histogram method, the nearest mean (or nearest medoid) method and the nearest neighbor classifier. We cover the classification setting and important concepts such as the Bayes classifier and the Bayes error, the optimal classifier in theory.

Training models using (stochastic) gradient descent and its variants. We learn how to tune this optimizer and how to use it to construct a logistic regression classification model.

Overfitting means a classifier works well on a training set but not on unseen test data. We discuss how to build complex non-linear models, and we analyze how we can understand overfitting using the bias-variance decomposition and the curse of dimensionality. Finally, we discuss how to evaluate and tune machine learning models fairly, and how to estimate how much data they need for sufficient performance.

Regularization methods can help to mitigate overfitting. We discuss two regularization techniques for estimating the linear regression coefficients: ridge regression and LASSO. The latter can also be used for variable selection.

Classifier evaluation metrics such as the ROC curve and confusion matrix can give more insight into the performance of classifiers. We also discuss what counts as a “good” accuracy by comparing against naïve baselines, the so-called dummy classifiers.

Support Vector Machines (SVMs) are more advanced classification models that can provide good performance even in high-dimensional spaces and with little data. We discuss their different variants such as the soft-margin SVM, the hard-margin SVM and the nonlinear kernel SVM.

Decision Trees are simple models that can easily be understood by lay people. They are easy to use and visualize, and rather than acting as a black box they form an interpretable white-box model, making them suitable for various applications.

The lectures feature a unique combination of videos mixed with hands-on interaction with machine learning algorithms to stimulate a deeper understanding. In the exercises you apply the algorithms in Python using scikit-learn and in the final project you will further deepen your understanding of the various concepts by building and tuning a machine learning pipeline from start to finish.

At a glance

  • Institution: DelftX
  • Subject: Computer Science
  • Level: Intermediate
  • Prerequisites:
    • Basic linear algebra
    • Basic Python programming skills
    • Basic probability & statistics
  • Language: English
  • Video transcripts: اَلْعَرَبِيَّةُ, Deutsch, English, Español, Français, हिन्दी, Bahasa Indonesia, Português, Kiswahili, తెలుగు, Türkçe, 中文
  • Associated programs:

What you'll learn

  • Apply common operations (pre-processing, plotting, etc.) to datasets using Python.
  • Explain the concept of supervised, semi-supervised, unsupervised machine learning and reinforcement learning.
  • Explain how various supervised learning models work and recognize their limitations.
  • Analyze which factors impact the performance of learning algorithms.
  • Apply learning algorithms to datasets using Python and Scikit-learn and evaluate their performance.
  • Optimize a machine learning pipeline using Python and Scikit-learn.

Syllabus


Topic 1: Introduction

This is an introduction to the course with an overview of the topics. We give a brief introduction to machine learning and its different variants.

• Why use machine learning?

• Machine learning basics and terminology

• The biggest challenge in machine learning

• Machine learning frameworks: supervised, semi-supervised, unsupervised and reinforcement learning

Topic 2: Regression

We will make a gentle start with regression. In the regression setting, a machine learning model will need to predict a number.

• The regression setting and its assumptions

• The mean squared error (MSE) and mean absolute error (MAE)

• Outliers in regression

• Linear regression and K-nearest neighbour regression
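
As a rough illustration of why outliers matter when choosing between the two error measures in this topic, the sketch below computes the MSE and MAE by hand, first on clean data and then with one outlier added. The target values, predictions and helper function names are invented for this example and are not taken from the course material.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals; large errors dominate the result."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute residuals; every error counts linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions, invented for illustration.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

mse_clean = mean_squared_error(y_true, y_pred)
mae_clean = mean_absolute_error(y_true, y_pred)

# Add one outlier: the true value is 15.0 but the prediction is 5.0.
y_true_out = y_true + [15.0]
y_pred_out = y_pred + [5.0]

mse_outlier = mean_squared_error(y_true_out, y_pred_out)
mae_outlier = mean_absolute_error(y_true_out, y_pred_out)

print(f"clean:        MSE={mse_clean:.3f}  MAE={mae_clean:.3f}")
print(f"with outlier: MSE={mse_outlier:.3f}  MAE={mae_outlier:.3f}")
```

With these numbers the single outlier moves the MSE from 0.125 to 20.1, while the MAE only moves from 0.25 to 2.2, because squaring amplifies the one large residual.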

Topic 3: Classification

In classification, a machine learning model will need to predict a category or class.

• Terminology and basics of classification

• Building classifiers using histograms, nearest mean (nearest medoid) classifier, K-nearest neighbour (KNN) classifier

• The Bayes classifier and the Bayes error

• How to use the KNN classifier in practice
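
A minimal sketch of using the KNN classifier in practice with scikit-learn's `KNeighborsClassifier`. The toy points, the labels and the choice of `n_neighbors=3` are assumptions made for this example, not settings prescribed by the course.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 2-D training points forming two well-separated groups.
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = [0, 0, 0, 1, 1, 1]

# Each new point gets the majority label among its 3 nearest training points.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

X_new = [[0.5, 0.5], [5.5, 5.5]]
predictions = knn.predict(X_new)
print(predictions)
```

The number of neighbours is a hyperparameter: small values follow the training data closely, larger values smooth the decision boundary.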

Topic 4: Training Models

Gradient descent is an iterative procedure to train models, such as logistic regression and neural networks.

• The basics of gradient descent

• The three variants of gradient descent: batch, mini-batch and stochastic gradient descent (SGD)

• How to tune gradient descent

• The basics of logistic regression
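
The following sketch shows one way batch gradient descent could be used to fit a one-feature logistic regression model with NumPy. The data, the learning rate of 0.5 and the 200 steps are assumptions chosen so that the example converges; they are not values from the course.

```python
import numpy as np


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(y, p):
    """Mean negative log-likelihood of labels y under probabilities p."""
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


# Hypothetical 1-D data: negative inputs are class 0, positive are class 1.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

w, b = 0.0, 0.0          # model parameters, initialised at zero
learning_rate = 0.5      # step size (assumed; usually needs tuning)
losses = []

for step in range(200):
    p = sigmoid(w * x + b)             # current predicted probabilities
    losses.append(log_loss(y, p))
    grad_w = np.mean((p - y) * x)      # gradient of the loss w.r.t. w
    grad_b = np.mean(p - y)            # gradient of the loss w.r.t. b
    w -= learning_rate * grad_w        # step against the gradient
    b -= learning_rate * grad_b

predicted = (sigmoid(w * x + b) >= 0.5).astype(int)
print(f"w={w:.2f}, b={b:.2f}, first loss={losses[0]:.3f}, last loss={losses[-1]:.3f}")
print(predicted)
```

This is the batch variant, since every step uses all samples. Mini-batch and stochastic gradient descent would compute the same gradient expressions on a random subset or a single sample per step.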

Topic 5: Overfitting

Overfitting is the problem where a machine learning algorithm performs well on the training set but does not perform well on new and unseen data.

• How to use linear models for nonlinear tasks?

• The bias-variance trade-off and the curse of dimensionality

• How to use learning curves to estimate the amount of data needed
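
One common way to use a linear model for a nonlinear task is to expand the input into polynomial features first. The sketch below does this with scikit-learn and compares training and test error for a few polynomial degrees. The sine-shaped data, the noise level, the sample sizes and the degrees are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical noisy samples of a sine curve; small training set on purpose.
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, size=(15, 1))
y_train = np.sin(2 * np.pi * X_train[:, 0]) + rng.normal(0, 0.2, size=15)
X_test = rng.uniform(0, 1, size=(200, 1))
y_test = np.sin(2 * np.pi * X_test[:, 0]) + rng.normal(0, 0.2, size=200)

train_mse = {}
test_mse = {}
for degree in (1, 3, 9):
    # Linear regression on polynomial features is still linear in its weights.
    model = make_pipeline(
        PolynomialFeatures(degree, include_bias=False),
        LinearRegression(),
    )
    model.fit(X_train, y_train)
    train_mse[degree] = mean_squared_error(y_train, model.predict(X_train))
    test_mse[degree] = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree}: train MSE={train_mse[degree]:.4f}, "
          f"test MSE={test_mse[degree]:.4f}")
```

The training error can only go down as the degree grows, since each model contains the previous one. The gap between training and test error is what signals overfitting, and it typically widens for the most flexible model when data is scarce.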

Topic 6: Cross Validation & Regularization

To get a good estimate of the performance of machine learning models, cross validation is an essential technique. This is also important to tune hyperparameters of models. Finally, we discuss regularization, a technique that aims to avoid overfitting.

• Cross validation, model selection and hyperparameter tuning

• Ridge regression

• LASSO regularization and how it’s used for variable selection
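
A small sketch of the ideas in this topic using scikit-learn: ridge regression scored with 5-fold cross validation, and LASSO used for variable selection. The synthetic data (only the first two of five features influence the target) and the `alpha` values are assumptions for this example. In practice `alpha` would itself be tuned with cross validation.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score

# Hypothetical data: 5 features, but only the first two affect the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.1, size=100)

# Ridge shrinks all coefficients towards zero but keeps them non-zero.
ridge = Ridge(alpha=1.0).fit(X, y)

# LASSO can set coefficients exactly to zero, dropping those variables.
lasso = Lasso(alpha=0.5).fit(X, y)

# 5-fold cross validation; for regressors the default score is R^2.
ridge_scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5)

print("ridge coefficients:", np.round(ridge.coef_, 3))
print("lasso coefficients:", np.round(lasso.coef_, 3))
print("ridge mean CV R^2: ", round(float(ridge_scores.mean()), 3))
```

The LASSO coefficients for the three irrelevant features come out as exact zeros here, which is the variable-selection effect. The price is that the two relevant coefficients are shrunk as well.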

Topic 7: Classifier Evaluation

Classifier evaluation delves deeper into the various evaluation metrics for classifiers.

• What a “good” accuracy means (e.g., naïve baselines/dummy classifiers)

• The confusion matrix (false positive, false negative, costs)

• ROC-curves
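
To make these evaluation ideas concrete, the sketch below compares a classifier with a dummy baseline on an imbalanced toy problem, then inspects the confusion matrix and the area under the ROC curve. The labels and scores are invented for illustration, and the 0.5 decision threshold is an assumption.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# Hypothetical imbalanced problem: 8 negatives and 2 positives.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# Hypothetical scores from some classifier (higher means "more positive").
y_score = np.array([0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.6, 0.9, 0.45])
y_pred = (y_score >= 0.5).astype(int)

# Naive baseline: always predict the most frequent class.
X_dummy = np.zeros((len(y_true), 1))  # features are ignored by the baseline
dummy = DummyClassifier(strategy="most_frequent").fit(X_dummy, y_true)
baseline_accuracy = dummy.score(X_dummy, y_true)

model_accuracy = accuracy_score(y_true, y_pred)
# Rows are true classes, columns are predicted classes.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
auc = roc_auc_score(y_true, y_score)

print(f"baseline accuracy={baseline_accuracy}, model accuracy={model_accuracy}")
print(f"TN={tn} FP={fp} FN={fn} TP={tp}, ROC AUC={auc}")
```

Both the model and the dummy baseline reach an accuracy of 0.8 here, so accuracy alone says little. The confusion matrix shows that the model does find one of the two positives, and the ROC AUC of 0.9375 shows that its scores rank positives above negatives most of the time.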

Topic 8: Support Vector Machines

The support vector machine is a well-known, more advanced classification model.

• Basics of the SVM, the margin and the hard-margin SVM

• The soft-margin SVM

• Kernels
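
A minimal sketch of why kernels matter, using scikit-learn's `SVC` on the classic XOR problem, which no straight line can separate. The four data points and the values `C=10.0` and `gamma=1.0` are assumptions made for this example.

```python
from sklearn.svm import SVC

# XOR-style toy problem: opposite corners of the unit square share a label.
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 0, 1, 1]

# Soft-margin SVM with a linear kernel: the decision boundary is a line.
linear_svm = SVC(kernel="linear", C=10.0).fit(X, y)

# Soft-margin SVM with an RBF kernel: the boundary can be nonlinear.
rbf_svm = SVC(kernel="rbf", C=10.0, gamma=1.0).fit(X, y)

linear_accuracy = linear_svm.score(X, y)
rbf_accuracy = rbf_svm.score(X, y)
print(f"linear kernel accuracy: {linear_accuracy}")
print(f"RBF kernel accuracy:    {rbf_accuracy}")
```

The parameter `C` controls how strongly margin violations are penalised. Very large values approach the hard-margin SVM, while smaller values allow more violations in exchange for a wider margin.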

Topic 9: Decision Trees

Decision trees are simple and interpretable models that are very user-friendly.

• Basics of decision trees and their terminology

• How to train decision trees with CART

• Overfitting and other pros and cons of decision trees
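
The interpretability of decision trees can be seen by printing a fitted tree as plain if/else rules. The sketch below uses scikit-learn's `DecisionTreeClassifier`, which its documentation describes as an optimised version of CART. The Iris dataset and the depth limit of 2 are choices made for this example rather than part of the course description.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Limiting the depth keeps the tree small, which also curbs overfitting.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Render the learned splits as human-readable rules.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)

train_accuracy = tree.score(iris.data, iris.target)
print(f"training accuracy: {train_accuracy:.2f}")
```

Removing the depth limit lets the tree grow until it fits the training data almost perfectly, which is where the overfitting risk mentioned above comes from.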

Topic 10: Final Project

The final project will involve building a machine learning pipeline, including hyperparameter tuning and a careful and fair evaluation, to solve a small practical application: the recognition of handwritten digits (MNIST).
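
To give a feel for what such a pipeline might look like, here is a minimal sketch using scikit-learn. It is not the course's project code. It swaps in scikit-learn's small bundled 8x8 digits dataset (`load_digits`) in place of MNIST so that it runs without downloads, and the choice of scaler, classifier and parameter grid is an assumption.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small 8x8 digits dataset bundled with scikit-learn (a stand-in for MNIST).
X, y = load_digits(return_X_y=True)

# Hold out a test set that is never touched during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pre-processing and model in one object, so scaling is re-fitted per fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("svm", SVC(kernel="rbf")),
])

# Tune the SVM's C with 3-fold cross validation on the training set only.
param_grid = {"svm__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"held-out test accuracy: {test_accuracy:.3f}")
```

Keeping the scaler inside the pipeline means its statistics are computed only on the training folds, which avoids leaking information from validation data into the tuning step.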

Who can take this course?

Unfortunately, residents of one or more of the following countries or regions will not be able to register for this course: Iran, Cuba, and the Crimea region of Ukraine. While edX has obtained licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX deeply regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.

This course is part of the AI Skills: Basic and Advanced Techniques in Machine Learning Professional Certificate program

Expert instruction
2 skill-building courses
Self-paced
Progress at your own speed
3 months
5–7 hours per week
