  • Length:
    5 Weeks
  • Effort:
    4–6 hours per week
  • Price:

    FREE
    Add a Verified Certificate for $49 USD

  • Institution
  • Subject:
  • Level:
    Intermediate
  • Language:
    English
  • Video Transcript:
    English

Prerequisites

Exposure to programming and Linux. Basic understanding of matrix-matrix multiplication.

About this course

Is my code fast? Can it be faster? Scientific computing, machine learning, and data science all involve solving compute-intensive problems. Choosing the right algorithm, extracting parallelism at various levels, and amortizing the cost of data movement are vital to achieving scalable speedup and high performance.

In this course, the simple but important example of matrix-matrix multiplication illustrates fundamental techniques for attaining high performance on modern CPUs. A carefully designed and scaffolded sequence of exercises leads the learner from a naive implementation to one that effectively exploits instruction-level parallelism, culminating in a high-performance multithreaded implementation. Along the way, learners discover that careful attention to data movement is key to efficient computing.
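As a rough idea of the starting point, a naive matrix-matrix multiplication can be sketched in C as a triple loop. This is an illustration only, not the course's own skeleton code: the function name naive_gemm, the ELEM macro, and the choice of column-major storage with explicit leading dimensions are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element (i, j) of a column-major matrix X whose
   columns are stored ld entries apart. Not taken from the course material. */
#define ELEM(X, ld, i, j) ((X)[(size_t)(j) * (ld) + (i)])

/* Naive update C := A * B + C, where A is m x k, B is k x n, C is m x n.
   Each of the m * n * k iterations does one multiply and one add. */
void naive_gemm(int m, int n, int k,
                const double *A, int ldA,
                const double *B, int ldB,
                double *C, int ldC)
{
    /* Leading dimensions must cover at least one full column. */
    assert(ldA >= m && ldB >= k && ldC >= m);

    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++)
            for (int p = 0; p < k; p++)
                ELEM(C, ldC, i, j) += ELEM(A, ldA, i, p) * ELEM(B, ldB, p, j);
}
```

This version performs 2mnk floating-point operations but gives no thought to how data moves through the memory hierarchy. Reordering the loops, blocking for caches, vectorizing, and threading are the kinds of changes that close the gap to a high-performance implementation.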

Prerequisites for this course are a basic understanding of matrix computations (roughly equivalent to Weeks 1-5 of Linear Algebra: Foundations to Frontiers on edX) and some exposure to programming. Hands-on exercises start with skeletal code in the C programming language that is progressively modified, so extensive experience with C is not required. Access to a relatively recent x86 processor, such as Intel Haswell or AMD Ryzen (or newer), running Linux is required.

MATLAB Online licenses will be made available to the participants free of charge for the duration of the course.

Join us to satisfy your need for speed!

What you'll learn

  • Mapping algorithms to architectures
  • Extracting parallelism at multiple levels
  • Amortizing data movement over computation
  • Understanding performance data
  • Managing complexity through layering of software

Syllabus

0 Getting Started

1 Loops and More Loops

2 Start Your Engines

3 Pushing the Limits

4 Multithreaded Parallelism

Meet your instructors

Maggie Myers
Lecturer, Department of Statistics and Data Sciences
The University of Texas at Austin
Robert van de Geijn
Professor of Computer Science
The University of Texas at Austin
Devangi Parikh
Research Fellow
The University of Texas at Austin

Pursue a Verified Certificate to highlight the knowledge and skills you gain ($49.00 USD)

  • Official and Verified

    Receive an instructor-signed certificate with the institution's logo to verify your achievement and increase your job prospects

  • Easily Shareable

    Add the certificate to your CV or resume, or post it directly on LinkedIn

  • Proven Motivator

    Give yourself an additional incentive to complete the course

  • Support our Mission

    EdX, a non-profit, relies on verified certificates to help fund free education for everyone globally

Learner testimonials

“We will include this course as one of the main onboarding materials for our team.”

- Dr. Misha Smelyanskiy, Technical Lead and Manager of AI System Co-design Group at Facebook