
UTAustinX: LAFF-On Programming for High Performance

Learn to squeeze high performance out of modern CPUs.

5 weeks
4–6 hours per week
Progress at your own speed
Optional upgrade available

There is one session available:

5,222 already enrolled! After a course session ends, it will be archived.
Starts Mar 1
Ends Dec 31

About this course


Is my code fast? Can it be faster? Scientific computing, machine learning, and data science are about solving problems that are compute intensive. Choosing the right algorithm, extracting parallelism at various levels, and amortizing the cost of data movement are vital to achieving scalable speedup and high performance.

In this course, the simple but important example of matrix-matrix multiplication is used to illustrate fundamental techniques for attaining high performance on modern CPUs. A carefully designed and scaffolded sequence of exercises leads the learner from a naive implementation to one that effectively utilizes instruction-level parallelism and culminates in a high-performance multithreaded implementation. Along the way, the learner discovers that careful attention to data movement is key to efficient computing.
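
As a rough illustration of the naive starting point mentioned above, the sketch below computes C := A B + C with three nested loops. It is not taken from the course materials: the function name `naive_gemm`, the `ELEM` macro, and the choice of column-major storage with explicit leading dimensions are all assumptions made for this example.

```c
#include <stddef.h>

/* Assumption for this sketch: matrices are stored in column-major order,
   so element (i, j) of a matrix with leading dimension ld lives at
   index j * ld + i. */
#define ELEM(M, ld, i, j) ((M)[(size_t)(j) * (size_t)(ld) + (size_t)(i)])

/* Hypothetical helper: C := A * B + C, where A is m x k, B is k x n,
   and C is m x n. This is the straightforward triple loop, with no
   blocking, vectorization, or threading. */
void naive_gemm(int m, int n, int k,
                const double *A, int ldA,
                const double *B, int ldB,
                double *C, int ldC)
{
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            for (int p = 0; p < k; p++) {
                /* Each update is one multiply and one add. */
                ELEM(C, ldC, i, j) += ELEM(A, ldA, i, p) * ELEM(B, ldB, p, j);
            }
        }
    }
}
```

For square n x n matrices this loop nest performs 2n^3 floating-point operations on 3n^2 matrix elements, which is why reusing data once it has been brought close to the processor matters so much for performance. Calling `naive_gemm(2, 2, 2, A, 2, B, 2, C, 2)` with a 2 x 2 identity for B and a zeroed C should leave C equal to A.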

Prerequisites for this course are a basic understanding of matrix computations (roughly equivalent to Weeks 1-5 of Linear Algebra: Foundations to Frontiers on edX) and some exposure to programming. Hands-on exercises start with skeletal code in the C programming language that is progressively modified, so extensive experience with C is not required. Access to a relatively recent x86 processor such as Intel Haswell or AMD Ryzen (or newer) running Linux is required.

MATLAB Online licenses will be made available to the participants free of charge for the duration of the course.

Join us to satisfy your need for speed!

At a glance

  • Institution: UTAustinX
  • Subject: Computer Science
  • Level: Intermediate
  • Prerequisites: Exposure to programming and Linux. Basic understanding of matrix-matrix multiplication.

  • Language: English
  • Video Transcript: English
  • Associated skills: Machine Learning, Extract Transform Load (ETL), Linux, C (Programming Language), Algorithms, Scientific Computing, X86 Architecture, Linear Algebra, Amortization, Data Science, Scalability, Matrix Multiplication

What you'll learn

  • Mapping algorithms to architectures

  • Extracting parallelism at multiple levels

  • Amortizing data movement over computation

  • Understanding performance data

  • Managing complexity through layering of software

0 Getting Started

1 Loops and More Loops

2 Start Your Engines

3 Pushing the Limits

4 Multithreaded Parallelism

Learner testimonials

"We will include this course as one of the main onboarding materials for our team."

- Dr. Misha Smelyanskiy, Technical Lead and Manager of AI System Co-design Group at Facebook
