Apache Spark for Data Engineering and Machine Learning
This short course introduces you to the fundamentals of Data Engineering and Machine Learning with Apache Spark, including Spark Structured Streaming, ETL for Machine Learning (ML) Pipelines, and Spark ML. By the end of the course, you will have hands-on experience applying Spark skills to ETL and ML workflows.

After a course session ends, it will be archived.

- 3 weeks, 2–3 hours per week
- Self-paced: progress at your own speed
- Free, with an optional upgrade available
At a glance
- Institution: IBM
- Subject: Computer Science
- Level: Intermediate
- Prerequisites: Foundational Apache Spark knowledge and skills.
- Associated programs:
  - Professional Certificate in NoSQL, Big Data and Spark Fundamentals
  - Professional Certificate in Data Engineering
- Language: English
- Video Transcript: English
- Associated skills: Apache Spark, Unsupervised Learning, Spark Dataframes, Operations, Data Engineering, Machine Learning, Extract Transform Load (ETL), Cluster Analysis, Big Data, Batch Processing, SQL (Programming Language), Graph Theory, Apache Hadoop, Stream Processing
Who can take this course?
Unfortunately, learners residing in one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.

This course is part of the Data Engineering Professional Certificate Program:
- Expert instruction
- 14 skill-building courses
- Self-paced: progress at your own speed
- 1 year 2 months
- 3–4 hours per week