
HKUSTx: Big Data Computing with Spark

Learn the theory and gain hands-on experience of big data systems, using Spark as the exemplary platform.

8 weeks
6–10 hours per week
Self-paced
Progress at your own speed
Free
Optional upgrade available

There is one session available:

After a course session ends, it will be archived.
Starts Apr 24
Ends Jul 29

About this course

Big data systems such as Hadoop and Spark have emerged as enabling technologies for managing massive amounts of data across hundreds or even thousands of computing nodes. Meanwhile, cloud computing platforms have made these technologies easily accessible to individuals as well as large enterprises. This course is an online adaptation of MSBD 5003 Big Data Computing, the signature course of our popular MSc Program in Big Data Technology. In addition to 20+ hours of lecture videos, the course contains 100+ multiple-choice questions and 20 coding questions, aimed at equipping learners with both the theory and practical skills of big data systems, using Spark as the exemplary platform.

At a glance

  • Language: English
  • Video Transcript: English
  • Associated skills: Nodes (Networking), Apache Hadoop, Big Data, Apache Spark, Cloud Computing

What you'll learn

  • Spark programming using both RDD and DataFrame APIs
  • Useful packages including ML, GraphX/GraphFrames, and Spark Streaming
  • Spark internals and performance optimizations
  • Algorithm design for big data systems

Syllabus

  • Week 1: Overview, MapReduce, and Hadoop
  • Weeks 2–3: Spark Basics and RDD
  • Week 4: Spark SQL and MLlib
  • Week 5: Spark internals
  • Week 6: Algorithm design for big data
  • Week 7: GraphX/GraphFrames
  • Week 8: Spark Streaming

Who can take this course?

Unfortunately, learners residing in one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.

This course is part of Big Data Technology MicroMasters Program

Expert instruction
5 graduate-level courses
Self-paced
Progress at your own speed
9 months
6–10 hours per week
