About this course
In this course, the simple but important example of matrix-matrix multiplication illustrates fundamental techniques for attaining high performance on modern CPUs. A carefully designed and scaffolded sequence of exercises leads the learner from a naive implementation to one that effectively exploits instruction-level parallelism, and culminates in a high-performance multithreaded implementation. Along the way, learners discover that careful attention to data movement is key to efficient computing.
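To give a sense of the starting point, here is a minimal sketch of what a naive implementation might look like. It is not taken from the course materials: the function name `naive_gemm`, the `ELEM` macro, the column-major storage, and the update form C := AB + C are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: column-major storage, so element (i, j) of a matrix
   with leading dimension ld lives at offset j * ld + i. */
#define ELEM(X, ld, i, j) ((X)[(size_t)(j) * (ld) + (i)])

/* Hypothetical naive routine computing C := A * B + C,
   where A is m x k, B is k x n, and C is m x n. */
void naive_gemm(int m, int n, int k,
                const double *A, int ldA,
                const double *B, int ldB,
                double *C, int ldC)
{
    /* Each leading dimension must cover at least one full column. */
    assert(ldA >= m && ldB >= k && ldC >= m);

    for (int i = 0; i < m; i++)          /* rows of C            */
        for (int j = 0; j < n; j++)      /* columns of C         */
            for (int p = 0; p < k; p++)  /* inner (dot) product  */
                ELEM(C, ldC, i, j) += ELEM(A, ldA, i, p) * ELEM(B, ldB, p, j);
}
```

The three loops can be ordered in six ways that all compute the same result but touch memory in different patterns, which is one reason a version like this typically runs far below what the hardware can deliver.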
Prerequisites for this course are a basic understanding of matrix computations (roughly equivalent to Weeks 1-5 of Linear Algebra: Foundations to Frontiers on edX) and some exposure to programming. Hands-on exercises start with skeletal code in the C programming language that is progressively modified, so extensive experience with C is not required. Access to a relatively recent x86 processor, such as Intel Haswell or AMD Ryzen (or newer), running Linux is required.
MATLAB Online licenses will be made available to the participants free of charge for the duration of the course.
Join us to satisfy your need for speed!
What you'll learn
· Extracting parallelism at multiple levels
· Amortizing data movement over computation
· Understanding performance data
· Managing complexity through layering of software
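As a rough illustration of what amortizing data movement over computation means, consider multiplying two n x n matrices. The count below is a back-of-the-envelope estimate added here for illustration, not a figure from the course. It assumes each of the three matrices is moved between main memory and the processor only once.

```latex
\[
\underbrace{2n^3}_{\text{floating-point operations}}
\quad \text{vs.} \quad
\underbrace{3n^2}_{\text{matrix elements}},
\qquad
\frac{2n^3}{3n^2} = \frac{2n}{3}.
\]
```

Under that assumption, n = 1000 gives roughly 667 floating-point operations per matrix element. There is therefore, in principle, ample computation over which to spread the cost of each memory access, provided data is reused while it is close to the processor.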
1 Loops and More Loops
2 Start Your Engines
3 Pushing the Limits
4 Multithreaded Parallelism