Optimizing Deep Learning Systems for Hardware
Topic summary
Skill Level: Intermediate; suitable for participants with prior experience in training deep neural networks.
Language: English
Workload:
- Entirely theoretical; no installation, setup, or configuration required.
- Includes an introduction to HPC frameworks (DeepSpeed) at the end.
- Optional hands-on direction: installing and experimenting with DeepSpeed after the course.
Topic: Deep Learning
Overview
Deep learning has revolutionized fields from computer vision to natural language processing, but its success depends as much on computational efficiency as on model accuracy. As models grow larger and more complex, the ability to train and deploy them efficiently across different hardware platforms becomes a critical skill. This course explores the hardware-software co-design principles that underpin modern deep learning systems. We begin with core performance concepts and hardware fundamentals, then move through model-level and system-level optimization strategies, from pruning and quantization to parallelism and mixed-precision training. You'll gain insight into how deep learning workloads are executed on CPUs, GPUs, TPUs, and specialized accelerators, and understand the trade-offs across edge devices, datacenters, and high-performance computing clusters. The course concludes with a look at scaling frameworks like DeepSpeed and future directions in AI infrastructure. Whether you're optimizing for speed, memory, energy, or cost, this course will equip you with the tools to build efficient and scalable AI systems.
Who Should Enroll:
- Graduate students, researchers, and professionals interested in deep learning efficiency, system-level optimization, and hardware-aware model design.
- Those seeking to understand trade-offs between hardware, memory, and computation in DL training and inference.
Prerequisites:
- Experience training deep neural networks (e.g., at least an introductory DL course).
- Familiarity with computer architecture concepts is recommended but not required.
- Basic knowledge of deep learning model optimization is helpful.
Tools, libraries, frameworks used:
- The course is theoretical; no hands-on software required.
- Concepts include DeepSpeed (introduction and configuration overview).
- Discussion may reference GPUs, CPUs, TPUs, FPGAs, and HPC environments.
Learning Objectives
- Understand how hardware and system design impact deep learning performance and efficiency.
- Learn strategies to optimize models and systems, covering memory usage, computation, and precision.
- Explore scaling concepts for deploying deep learning across edge, cloud, and HPC environments.
Course Description
The course is structured into five parts:
Introduction
Part I: Fundamentals
- Why hardware matters in deep learning: training vs inference bottlenecks
- Performance metrics: FLOPs, latency, throughput, energy, cost
- Case studies: edge devices vs datacenter vs supercomputers
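The performance metrics listed above can be made concrete with a back-of-the-envelope calculation. Below is a minimal sketch (the helper `matmul_stats` is hypothetical, written for this illustration) that estimates FLOPs, bytes moved, and arithmetic intensity for a dense matrix multiplication, assuming each matrix crosses the memory bus exactly once — a best-case estimate, since real kernels also re-read tiles:

```python
# Sketch: estimating FLOPs and arithmetic intensity for C = A @ B.
# Illustrative only; real workloads also move activations, gradients, etc.

def matmul_stats(m: int, k: int, n: int, bytes_per_elem: int = 4):
    """Return (FLOPs, bytes moved, arithmetic intensity) for C = A @ B
    with A of shape (m, k) and B of shape (k, n), assuming each matrix
    is read or written from memory exactly once (a best-case estimate)."""
    flops = 2 * m * k * n  # one multiply + one add per output term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops, bytes_moved, flops / bytes_moved

flops, nbytes, intensity = matmul_stats(1024, 1024, 1024)
print(f"{flops:.2e} FLOPs, {nbytes:.2e} bytes, {intensity:.1f} FLOP/byte")
```

Comparing the resulting intensity (FLOP/byte) against a device's compute-to-bandwidth ratio is the essence of roofline-style reasoning: it tells you whether a kernel is limited by arithmetic throughput or by data movement.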
Part II: Hardware & Memory Hierarchy
- Existing Solutions: CPU, GPU, TPU, FPGA, ASIC basics
- Memory hierarchy, bandwidth bottlenecks, and data movement costs
- Precision
Part III: Model-Level Optimizations
- Model compression: pruning, quantization, knowledge distillation
- Efficient architectures.
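Of the compression techniques above, unstructured magnitude pruning is the simplest to sketch: zero out the fraction of weights with the smallest absolute value. The snippet below is a minimal illustration, not a full pipeline (in practice pruning is followed by fine-tuning to recover accuracy, and the function name `magnitude_prune` is invented for this example):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold            # keep only larger weights
    return weights * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
w_pruned = magnitude_prune(w, sparsity=0.9)
print("achieved sparsity:", float(np.mean(w_pruned == 0)))
```

Note that unstructured sparsity like this only yields speedups on hardware or kernels that can exploit it; structured pruning (removing whole channels or heads) trades flexibility for dense-hardware friendliness.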
Part IV: System-Level Optimizations
- Parallelism: data, model, pipeline, tensor parallelism
- Mixed-precision training (AMP, bfloat16)
- Other system optimizations
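A core idea behind mixed-precision training is that low-precision formats are fine for most arithmetic but unsafe for long accumulations, which is why AMP-style schemes keep a high-precision master copy of weights and accumulators. The sketch below illustrates the failure mode with numpy's float16 standing in for bfloat16 (numpy has no bfloat16 type): summing 10,000 updates of 1e-4 should give roughly 1.0, but the float16 accumulator stalls once the running sum's spacing (ulp) exceeds the update size:

```python
import numpy as np

# 10,000 small "gradient updates" of 1e-4 each; the true sum is ~1.0.
updates = np.full(10_000, 1e-4, dtype=np.float16)

acc16 = np.float16(0.0)
for u in updates:                  # accumulate in float16: additions stop
    acc16 = np.float16(acc16 + u)  # registering once ulp(sum) > update

acc32 = np.float32(0.0)
for u in updates:                  # accumulate in float32 (the "master" copy)
    acc32 += np.float32(u)

print("float16 accumulator:", float(acc16))  # far below 1.0
print("float32 accumulator:", float(acc32))  # close to 1.0
```

The same reasoning motivates loss scaling in FP16 AMP (shifting small gradients up into representable range) and explains why bfloat16, with its FP32-sized exponent, is often usable without scaling.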
Part V: Scaling Deep Learning in HPC
- Introduction to DeepSpeed.
- Conclusions and Directions
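For orientation, DeepSpeed is driven by a JSON configuration file. The fragment below is a minimal sketch using fields from DeepSpeed's documented config schema (`train_batch_size`, `gradient_accumulation_steps`, `fp16`, `zero_optimization`); the values are illustrative, not recommendations:

```json
{
  "train_batch_size": 64,
  "gradient_accumulation_steps": 4,
  "fp16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}
```

Such a file is typically passed to `deepspeed.initialize`, which wraps the model, optimizer, and data loader; ZeRO stage 2 partitions optimizer states and gradients across data-parallel workers to reduce per-GPU memory.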
Instructor
Erdem Akagündüz is an Associate Professor at the Graduate School of Informatics, Middle East Technical University (METU), and a principal investigator at the Applied Intelligence Research Laboratory (AIRLab). His research interests include computer vision, deep learning, pattern recognition, image processing, machine learning, object tracking, and 3D modeling, with numerous journal publications, conference papers, and international patents in these areas. After completing his Ph.D. at METU Electrical and Electronics Engineering and conducting post-doctoral research at the University of York, he worked as a Computer Vision Scientist at ASELSAN Inc., focusing on real-time image processing and intelligent decision systems. He later held an academic position at Çankaya University before joining METU in his current role.