Topic outline

  • Overview

    Prediction of Protein Structures Using Deep Learning Tools

    Skill Level: Advanced

    Language: English

    Workload: 2 hours total

    Topic: Computational structural biology

    Overview 

    This course introduces the principles and applications of protein structure prediction using state-of-the-art deep learning tools. It emphasizes the relationship between sequence, structure, and function, guiding learners through methods that connect evolutionary information with modern computational predictions. Participants will explore multiple sequence alignments, structure prediction pipelines, multimer modeling, and protein–ligand interactions. The course also highlights recent breakthroughs such as AlphaFold, ProteinMPNN, BioEmu, and emerging tools for dynamic conformational studies.

    The overall aim is to provide learners with a conceptual framework to understand how deep learning has revolutionized structural biology, without requiring them to implement the algorithms themselves. By the end, students will gain practical familiarity with prediction tools and an appreciation of their strengths, limitations, and applications in biological research.

    Course Description

    Prediction of Protein Structures Using Deep Learning Tools is designed for participants with prior experience in command-line environments, basic Python programming skills, and a general understanding of protein structures.

    The course is structured into six parts:

    1. Multiple Sequence Alignments (MSA) – Understanding evolutionary conservation and coevolution, using tools like MMseqs2, JackHMMER, ConSurf, GREMLIN, and EVcouplings.
    2. Protein Structure Prediction – Exploring AlphaFold2, ColabFold, and their role in solving the sequence-to-structure challenge.
    3. Protein Sequence Predictions from Structure – Introducing reverse-design frameworks such as ProteinMPNN for biomaterials and binder design.
    4. Protein Multimer Prediction – Examining how prediction extends to protein complexes, DNA/RNA interactions, and ligands, including AlphaFold3.
    5. Protein–Ligand Interactions – Investigating affinity prediction tools like Boltz-2 and applications in drug discovery.
    6. Protein Conformations – Addressing protein dynamics with physics-based methods, molecular dynamics simulations, and tools like BioEmu to capture flexibility beyond static models.
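
    As a rough illustration of the conservation analysis in Part 1, the sketch below scores each column of a toy alignment by Shannon entropy (lower means more conserved). The sequences and the function names column_entropy and conservation_profile are made up for illustration. Tools such as ConSurf use more elaborate, phylogeny-aware scoring, so this is only a conceptual stand-in.

```python
import math
from collections import Counter

# Toy alignment (hypothetical sequences, not real homologs of any protein).
toy_msa = [
    "MKTAY",
    "MKSAY",
    "MRTAF",
    "MKTGY",
]


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # sum of p * log2(1/p) over the residue types seen in this column
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def conservation_profile(msa):
    """Per-column entropy for an alignment; 0.0 means fully conserved."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*msa)]


profile = conservation_profile(toy_msa)
for position, entropy in enumerate(profile, start=1):
    print(f"column {position}: entropy = {entropy:.3f} bits")
```

    In this toy case only the first column is invariant, so it is the only one with zero entropy. Each of the other columns carries a single substitution.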

    This course provides a conceptual guide to understanding the sequence–structure relationship, the strengths and weaknesses of prediction methods, and practical experience with widely used tools through online resources and Google Colab. It is not focused on developing deep learning algorithms, fine-tuning parameters, or advanced coding, but instead offers an accessible, hands-on introduction to how these methods are applied in modern biology.

    Course Contents:

    Part 1: Multiple sequence alignments

    Part 2: Protein structure prediction

    Part 3: Protein sequence predictions from structure

    Part 4: Protein multimer prediction

    Part 5: Protein–ligand interactions

    Part 6: Protein conformations

    Who Should Enroll: Individuals interested in protein structure prediction.

    Prerequisites:

    Basic knowledge of protein structures

    Basic proficiency in Python programming

    Tools, libraries, frameworks used: Google Colaboratory (Colab)

    Learning Objectives

    By the end of this course, participants will be able to:

    Explain the relationship between protein sequence, structure, and function.

    Apply multiple sequence alignment (MSA) tools to identify conserved and coevolved residues.

    Use state-of-the-art deep learning–based tools such as AlphaFold and ColabFold for protein structure prediction.

    Evaluate predicted protein structures to understand their limitations and potential biological significance.

    Explore approaches for reverse protein design, including ProteinMPNN, to connect structure back to sequence.

    Analyze protein–protein and protein–ligand interactions through multimer prediction and interaction tools.

    Discuss how protein conformational flexibility can be studied using molecular dynamics simulations and differentiate what deep learning tools capture versus what requires experimental or physics-based validation.
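
    One common starting point for evaluating a predicted model is its per-residue confidence. AlphaFold2 and ColabFold write a pLDDT score (0 to 100) into the B-factor column of their PDB output. The sketch below reads that column for the CA atoms and flags low-confidence residues. The toy records, the helper make_ca_line, and the cutoff of 70 (a conventional boundary between "confident" and "low" confidence) are assumptions for illustration, not part of the course material.

```python
def make_ca_line(serial, res_name, res_seq, plddt):
    """Hypothetical helper: build one fixed-column PDB ATOM record for a CA atom."""
    return (
        f"ATOM  {serial:5d}  CA  {res_name:>3s} A{res_seq:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}"
    )


# Toy model with made-up pLDDT values (not a real prediction).
toy_pdb = "\n".join(
    [
        make_ca_line(1, "MET", 1, 95.0),
        make_ca_line(2, "THR", 2, 88.5),
        make_ca_line(3, "TYR", 3, 62.0),
        make_ca_line(4, "LYS", 4, 45.5),
    ]
)


def plddt_per_residue(pdb_text):
    """Return (residue number, pLDDT) pairs read from CA atom records."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB columns 13-16 hold the atom name, 23-26 the residue number,
        # and 61-66 the B-factor, which here carries the pLDDT value.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append((int(line[22:26]), float(line[60:66])))
    return scores


scores = plddt_per_residue(toy_pdb)
mean_plddt = sum(value for _, value in scores) / len(scores)
low_confidence = [res for res, value in scores if value < 70.0]

print(f"mean pLDDT: {mean_plddt:.2f}")
print(f"residues below 70: {low_confidence}")
```

    A low pLDDT stretch often marks a flexible or disordered region rather than a simple modeling failure, so such residues call for experimental or physics-based follow-up.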

    About the instructor(s): https://tfguclu.github.io/

    -------------

    The colab_intro_curated video gives a brief tutorial on using the Google Colaboratory environment for Python programming and for deep learning–based protein prediction tools.
    Our example protein is GB1, with the PDB code 1PGA (https://www.rcsb.org/structure/1PGA).
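
    For readers who want the example structure inside a Colab session, here is a minimal sketch of retrieving a PDB entry by its code. It assumes the public RCSB file-download URL pattern shown in the code. The function names rcsb_pdb_url and fetch_pdb_text are made up for illustration, and the network call is left as a commented usage line.

```python
from urllib.request import urlopen


def rcsb_pdb_url(pdb_id):
    """Build the RCSB download URL for a four-character PDB code."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB code: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


def fetch_pdb_text(pdb_id, timeout=30):
    """Download the PDB-format file as text (requires network access)."""
    with urlopen(rcsb_pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(rcsb_pdb_url("1pga"))

# Usage in a notebook cell with internet access:
# pdb_text = fetch_pdb_text("1PGA")
# print(pdb_text.splitlines()[0])
```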

    In Part 1, we used the online tools ConSurf (https://consurf.tau.ac.il/) and EVcouplings (https://v2.evcouplings.org/).
    In Part 2, we used Sergey Ovchinnikov’s (https://biology.mit.edu/profile/sergey-ovchinnikov/) GitHub repository for ColabFold (https://github.com/sokrypton/ColabFold).
    In Part 3, to predict the sequence space from structure, we employed ProteinMPNN through the ColabDesign repository, hosted under the same GitHub account (https://github.com/sokrypton/ColabDesign).
    In Part 4, to predict protein multimers, we turned to the AlphaFold3 Server (https://alphafoldserver.com/).
    In Part 5, we used Boltz-2 (https://boltz.bio/boltz2), an open-source tool for predicting the structures of proteins, multimers, and protein–ligand complexes. The Colab notebook for this tool, prepared by me, is included in the boltz2.zip file.
    Finally, in Part 6, we used BioEmu to generate conformations through the BioEmu Colab, available on the same page as ColabFold (https://github.com/sokrypton/ColabFold).

    Another useful set of Colab notebooks can be found in this GitHub repository: https://github.com/pablo-arantes/making-it-rain.


  • Course Introduction

  • Lesson 1 - Part 1

  • Lesson 1 - Part 2

  • Lesson 1 - Part 3

  • Lesson 1 - Part 4

  • Lesson 1 - Part 5

  • Lesson 1 - Part 6