Course Description:

Artificial intelligence (AI) techniques, especially recent advances in generative large language models (LLMs), have surpassed human predictive performance in a variety of real-world tasks. This success is enabled by the recent development of machine learning (ML) systems (MLSys) that provide high-level programming interfaces, allowing developers to easily prototype ML models on modern hardware platforms.

In this course, we will explore the design of modern ML systems by learning how an ML model written in a high-level language is decomposed into low-level kernels and executed across hardware accelerators (e.g., GPUs) in a distributed fashion. Topics covered in this course include neural networks and backpropagation, programming models for expressing ML models, automatic differentiation, deep learning accelerators, distributed training techniques, computation graph optimizations, automated kernel generation, and memory optimizations. The main goal of this course is to provide a comprehensive view of how existing ML systems work. Throughout the course, we will also examine the design principles behind these systems and discuss the challenges and opportunities in building future ML systems for next-generation ML applications and hardware platforms.
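
To give a flavor of one of these topics, the sketch below is a minimal reverse-mode automatic differentiation engine in pure Python. The `Value` class and the example function are illustrative names chosen here, not part of the course materials; production systems such as PyTorch's autograd build a similar computation graph and traverse it in reverse to apply the chain rule.

```python
# Minimal reverse-mode automatic differentiation sketch (illustrative only).

class Value:
    def __init__(self, data, parents=(), backward_fn=lambda: None):
        self.data = data            # forward result
        self.grad = 0.0             # accumulated gradient dL/dself
        self._parents = parents     # incoming edges of the computation graph
        self._backward = backward_fn

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1
        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._backward = backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# Example: y = x*x + x, so dy/dx = 2x + 1 = 7 at x = 3.
x = Value(3.0)
y = x * x + x
y.backward()
print(y.data, x.grad)  # 12.0 7.0
```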


  • Lectures: Tuesday/Thursday 4:30-5:45pm, BRNG 1254
  • Zoom: The first few weeks of lectures will be held on Zoom. Links will be available on Brightspace.
  • Announcements and Assignments: Brightspace.
  • Discussions: Brightspace or Piazza.
  • Contact: For external inquiries, personal matters, or emergencies, please email the instructor.