91259 - Architecture and Platforms for Artificial Intelligence

Academic Year 2024/2025

  • Modules: Moreno Marzolla (Module 1), Davide Rossi (Module 2)
  • Teaching Mode: Traditional lectures (Module 1), Traditional lectures (Module 2)
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Artificial Intelligence (code 9063)

Learning outcomes

At the end of the course, the student has a deep understanding of the requirements that machine-learning workloads place on computing systems, understands the main architectures for accelerating machine-learning workloads and the heterogeneous architectures for embedded machine learning, and knows the most popular platforms that cloud providers offer specifically to support machine/deep learning applications.

Course contents

Module 1:

  1. Introduction to parallel programming.
  2. Parallel programming patterns: embarrassingly parallel, decomposition, master/worker, scan, reduce, ...
  3. Shared-Memory programming with OpenMP.
  4. OpenMP programming model: the “omp parallel” construct, scoping constructs, other work-sharing constructs.
  5. GPU programming with CUDA.
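
As an illustration of how the "reduce" pattern and OpenMP work-sharing listed above fit together, here is a minimal sketch in C. It is not taken from the course material; the function name `dot` is a hypothetical choice made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of two length-n vectors (hypothetical example function).
 * The OpenMP reduction clause gives each thread a private partial sum
 * and combines the partial sums when the loop ends.
 * When compiled without OpenMP support, the pragma is ignored and the
 * loop runs serially, giving the same result up to floating-point
 * rounding. */
double dot(const double *x, const double *y, size_t n)
{
    assert(x != NULL && y != NULL);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

For example, calling `dot` on the vectors (1, 2, 3) and (4, 5, 6) with n = 3 returns 32. With GCC or Clang, OpenMP is typically enabled with the `-fopenmp` flag.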

Module 2:

  1. From ML to DNNs - a computational perspective
    1. Introduction to key computational kernels (dot-product, matrix multiply...)
    2. Inference vs training: workload analysis and characterization
    3. The NN computational zoo: DNNs, CNNs, RNNs, GNNs, Attention-based Networks
  2. Running ML workloads on programmable processors
    1. Recap of processor instruction set architecture (ISA) with focus on data processing
    2. Improving processor ISAs for ML: RISC-V and ARM use cases
    3. Fundamentals of parallel processor architecture and parallelization of ML workloads
  3. Algorithmic optimizations for ML
    1. Key bottlenecks and taxonomy of optimization techniques
    2. Algorithmic techniques: Strassen, Winograd, FFT
    3. Topology optimization: efficient NN models - depthwise convolutions, inverted bottleneck, introduction to Neural Architecture Search
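
To make the "key computational kernels" of Module 2 concrete, the following is a minimal, unoptimized sketch of a dense matrix multiply in C. It is an illustrative assumption, not course code: the function name `matmul` and the row-major layout are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B for row-major matrices (hypothetical example function):
 * A is m x k, B is k x n, C is m x n.
 * Each output element is a dot product of a row of A and a column of B.
 * Rows of C are independent, so the outer loop can be shared among
 * OpenMP threads; without OpenMP the pragma is ignored. */
void matmul(const float *A, const float *B, float *C,
            size_t m, size_t k, size_t n)
{
    assert(A != NULL && B != NULL && C != NULL);
    #pragma omp parallel for
    for (long i = 0; i < (long)m; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++) {
                acc += A[(size_t)i * k + p] * B[p * n + j];
            }
            C[(size_t)i * n + j] = acc;
        }
    }
}
```

This naive version performs m·k·n multiply-accumulate operations. Algorithmic techniques such as Strassen or Winograd, listed above, aim to reduce the number of multiplications relative to this baseline.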

Readings/Bibliography

Suggested readings for Module 1: selected parts from the following books

An Introduction to Parallel Programming
Peter Pacheco
Morgan Kaufmann, 2011, ISBN 9780123742605
https://shop.elsevier.com/books/an-introduction-to-parallel-programming/pacheco/978-0-12-374260-5
Note: there is an updated version of this book, published in 2021 and coauthored with Matthew Malensek.

CUDA C Programming Guide
NVIDIA Corporation
http://docs.nvidia.com/cuda/cuda-c-programming-guide/

Teaching methods

Traditional lectures for theory.

In addition, both Module 1 and Module 2 include hands-on sessions, for which students need their own laptop.

Assessment methods

Module 1: Project work

Module 2: Written exam with oral discussion

Teaching tools

Lectures with projected slides, provided by the instructors.

Hands-on sessions.

Office hours

See the website of Moreno Marzolla

See the website of Davide Rossi