Topic outline

  • Overview

    Skill Level: Intermediate 

    Workload: 1.5 hours total 

    Topic: Artificial Intelligence

    Overview: This course offers a comprehensive introduction to large language models (LLMs), exploring their historical evolution, architectural foundations, and modern applications. Students will develop a strong understanding of LLMs and their role in various real-world scenarios.

    Course Description: This course provides a structured and in-depth exploration of large language models (LLMs), focusing on their historical, theoretical, and practical dimensions. 

    Students will trace the development of language models, from early statistical approaches to the breakthroughs in deep learning that paved the way for LLMs.

    Students will study the Transformer, the architecture that revolutionized natural language processing. They will examine its core component, the attention/self-attention mechanism, and learn how it enables LLMs to process and generate text.

    Students will learn the fundamental principles of training LLMs, including common LLM structures and training methods.

    Students will survey state-of-the-art LLMs currently in use, highlighting their capabilities, unique features, and applications across various domains.

    Course Content:

    Week 1: Evolution of Language Models 

       - Overview of language modeling 

       - Historical development: From n-gram models to neural networks (see the bigram sketch after this list)  

       - Key advancements in deep learning that paved the way for LLMs  

       - Introduction to early architectures like RNNs and LSTMs  
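
    To give a concrete taste of Week 1's starting point, here is a minimal sketch of a bigram language model, the simplest n-gram approach. It is illustrative only: the toy corpus, the function names, and the unsmoothed maximum-likelihood estimate are choices made for this sketch, not course materials.

```python
# A minimal bigram language model: P(word | prev_word) estimated
# by counting. The toy corpus is purely illustrative.
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()

bigram_counts = Counter(zip(corpus, corpus[1:]))   # counts of (prev, next) pairs
context_counts = Counter(corpus[:-1])              # counts of each context word

def bigram_prob(prev_word, word):
    """Maximum-likelihood estimate of P(word | prev_word)."""
    if context_counts[prev_word] == 0:
        return 0.0
    return bigram_counts[(prev_word, word)] / context_counts[prev_word]

print(bigram_prob("the", "cat"))  # 2/3: "the" is followed by "cat" twice out of three
```

    Neural language models, covered later in the week, replace these brittle counts with learned representations, which is what makes longer contexts tractable.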

    Week 2: The Power of LLMs: Transformer (Part 1)  

       - Introduction to the Transformer architecture

         • Recurrent Neural Networks (RNNs)
         • The Encoder-Decoder model with RNNs
         • Attention mechanism

       - Understanding attention mechanisms mathematically (see the sketch after this list)
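
    As a preview of the mathematical treatment, here is a NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V, the formulation from "Attention Is All You Need" (Vaswani et al., 2017). The toy dimensions and random inputs are illustrative assumptions.

```python
# A NumPy sketch of scaled dot-product attention.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # how well each query matches each key
    weights = softmax(scores)         # each row is a probability distribution
    return weights @ V                # output: weighted average of the values

# Toy example: 4 tokens, embedding dimension 8.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)
```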

    Week 3: The Power of LLMs: Transformer (Part 2)  

       - Self-attention mechanisms in detail  

       - Multi-head attention and its benefits  

       - Layer normalization in Transformers (see the sketch after this list)  
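
    For the last topic, here is a minimal NumPy sketch of layer normalization: each token's feature vector is normalized to zero mean and unit variance, then rescaled by learned parameters gamma and beta. The toy shapes are illustrative assumptions.

```python
# A NumPy sketch of layer normalization as used in Transformers.
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)   # per-token mean over features
    var = x.var(axis=-1, keepdims=True)     # per-token variance over features
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.default_rng(1).standard_normal((4, 8))     # 4 tokens, 8 features
out = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))
print(np.round(out.mean(axis=-1), 6))  # ~0 for every token
```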

    Week 4: Large Language Models: Key Concepts and Training  

       - Fundamentals of LLM training: Objectives and challenges (a sketch of the next-token objective follows this list)  

       - Overview of various LLM structures (e.g., encoder-only, decoder-only, encoder-decoder)  

       - Training techniques and model deployment 
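
    To make the training objective concrete, here is a NumPy sketch of next-token prediction with cross-entropy, the objective used by decoder-only models. The random logits stand in for a real model's output; all sizes are illustrative assumptions.

```python
# A NumPy sketch of the causal language-modeling objective:
# at each position t, predict token t+1 and score it with cross-entropy.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, seq_len = 10, 5

tokens = rng.integers(0, vocab_size, size=seq_len + 1)   # toy token ids
logits = rng.standard_normal((seq_len, vocab_size))      # one score vector per position

# Softmax over the vocabulary at each position.
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)

# Cross-entropy: the target at position t is the next token, t+1.
targets = tokens[1:]
loss = -np.log(probs[np.arange(seq_len), targets]).mean()
print(f"average next-token cross-entropy: {loss:.3f}")
```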

    Week 5: Available Large Language Models  

       - Survey of state-of-the-art LLMs (e.g., BERT, Llama, T5) 

       - Key features and differences among popular LLMs (see the sketch after this list) 
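
    As a hands-on complement to the survey, here is a sketch of loading one of these models, assuming the Hugging Face `transformers` library; the library and the checkpoint name are this sketch's choices, not prescribed by the course.

```python
# Loading a pretrained model with Hugging Face transformers
# (pip install transformers torch).
from transformers import AutoModel, AutoTokenizer

# BERT is encoder-only; T5 is encoder-decoder; Llama is decoder-only.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("LLMs are everywhere.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence length, hidden size)
```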

    Who Should Enroll: Anyone who wishes to understand the basic principles of LLMs

    Prerequisites: 

       - Experience with machine learning and deep neural networks 

       - Basic programming skills in Python 

    Learning Objectives: By participating in this course, you will gain an intermediate-level understanding of large language models.

    About the instructor(s): Tuğba Pamay Arslan is a lecturer in the Department of AI & Data Engineering at İstanbul Technical University. She is an active member of the İTÜ NLP Research Group and is currently a PhD candidate. Her expertise includes multilingual semantic-level NLP applications (e.g., coreference resolution) and the training and adaptation of large language models for various multilingual NLP tasks. She also works as an NLP researcher on a TÜBİTAK 2515 project (123E079) focusing on multilingual coreference resolution models using neural networks. 


  • Introduction

  • Part I

  • Part II

  • Part III

  • Part IV

  • Part V