
NilsDunlop/PROTACFold


PROTACFold

PROTACFold Workflow

Overview

PROTACFold is a comprehensive toolkit for analyzing and predicting Proteolysis Targeting Chimera (PROTAC) structures using AlphaFold 3 and Boltz-1. PROTACs are heterobifunctional molecules that induce targeted protein degradation by forming ternary complexes between a protein of interest (POI) and an E3 ubiquitin ligase. This toolkit provides methods for accurate prediction, evaluation, and analysis of these complex structures and models to advance PROTAC drug discovery.

Website

To make PROTAC analysis more accessible, we launched protacfold.xyz, our web platform that automates PDB extraction, identifies PROTAC POI & E3 ligase components, and prepares input files for both AlphaFold3 and Boltz-1.

Features

  • AF3 & B1 Integration: Streamlined setup and usage of both AlphaFold 3 and Boltz-1 for comparative PROTAC ternary complex prediction
  • Multiple Ligand Representation Methods: Support for both Chemical Component Dictionary (CCD) and SMILES formats
  • Comprehensive Structure Analysis: Calculate RMSD, DockQ scores, pTM, ipTM, and TM-scores for evaluating model quality
  • Molecular Property Analysis: Calculate and analyze physicochemical properties of PROTACs using RDKit
  • Advanced Visualization: Interactive plots and statistical analysis of prediction metrics
  • Benchmark Capabilities: Compare predictions with experimental structures and other computational methods

Installation

Prerequisites

  • Python 3.11+
  • CUDA-compatible GPU (for AlphaFold 3)
  • Docker (recommended for AlphaFold 3 setup)

AlphaFold 3 Setup (Docker Recommended)

We use AlphaFold 3 inference code available from Google DeepMind.

Detailed instructions for setting up AlphaFold 3 with Docker are in our installation guide. You can also consult the official AlphaFold 3 documentation, but our guide provides step-by-step instructions tailored to PROTACFold users.

Boltz-1 Setup

Install Boltz using pip:

pip install boltz -U

To run predictions with Boltz YAML input files, please refer to the detailed instructions in the official Boltz Prediction Guide.
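
As a rough illustration of what a Boltz YAML input for a ternary complex can look like, the sketch below assembles one as plain text. It assumes the Boltz input schema with `version`, `sequences`, `protein`, and `ligand` entries; the function name `boltz_yaml`, the chain IDs, and the sequences and SMILES string are placeholders invented for this example, not data from the PROTACFold study. The files produced by protacfold.xyz may be laid out differently.

```python
def boltz_yaml(poi_seq: str, e3_seq: str, protac_smiles: str) -> str:
    """Return a minimal Boltz YAML input: two protein chains plus one ligand.

    Chain IDs A/B/C are arbitrary choices for this sketch.
    """
    lines = [
        "version: 1",
        "sequences:",
        "  - protein:",
        "      id: A",
        f"      sequence: {poi_seq}",
        "  - protein:",
        "      id: B",
        f"      sequence: {e3_seq}",
        "  - ligand:",
        "      id: C",
        # Single quotes stop SMILES characters such as [ ] @ # from being
        # interpreted as YAML syntax.
        f"      smiles: '{protac_smiles}'",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sequences and ligand, not a real PROTAC system.
    print(boltz_yaml("MKTAYIAKQR", "GSHMEEVLLK", "CC(=O)NCCOCCOCCNC(C)=O"))
```

Saved to a file such as example_complex.yaml (a hypothetical name), the result could then be passed to `boltz predict`; depending on your setup you may also need an MSA option such as `--use_msa_server`, as described in the Boltz Prediction Guide.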

Manual Installation

  1. Clone the repository:
git clone https://github.com/NilsDunlop/PROTACFold.git
cd PROTACFold
  2. Install Python dependencies:
pip install -r requirements.txt

Directory Structure

  • data/: Contains datasets and analysis results
    • af3_input/: Input files for AlphaFold 3 (SMILES and CCD formats)
    • af3_results/: Consolidated results from AlphaFold 3 predictions
    • boltz_results/: Consolidated results from Boltz-1 predictions
    • plots/: Generated visualizations
    • hal_04732948/: Data from Pereira et al., 2024 for comparison
  • utils/: Utility scripts for structure analysis and property calculation
  • src/:
    • plots/: Scripts for generating all figures and data for our research.
    • website/: Local deployment of protacfold.xyz for private analysis using Ollama.
  • docs/: Documentation including installation guides and images

Usage

PROTAC Structure Prediction

Proposed workflow for predicting PROTAC ternary complexes using AlphaFold 3 and Boltz-1:

  1. Choose the PDB structures to analyze and generate the AlphaFold 3 JSON and Boltz YAML input files with protacfold.xyz.
  2. Run AlphaFold 3 and Boltz predictions.
  3. Analyze results using the provided utility scripts.
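
To make step 1 more concrete, here is a minimal sketch of an AlphaFold 3 input JSON for a POI, an E3 ligase, and a PROTAC given either as a CCD code or as a SMILES string. It assumes the AlphaFold 3 input format (`name`, `modelSeeds`, `sequences`, `dialect`, `version`); the helper `af3_input`, the chain IDs, and all sequences are hypothetical and only illustrate the two ligand representations. It is not the code used by protacfold.xyz.

```python
import json


def af3_input(name, poi_seq, e3_seq, *, ccd_code=None, smiles=None, seeds=(1,)):
    """Build an AlphaFold 3 input dict with two protein chains and one ligand.

    Exactly one of ccd_code or smiles must be given.
    """
    if (ccd_code is None) == (smiles is None):
        raise ValueError("Provide exactly one of ccd_code or smiles")
    ligand = {"id": "L"}
    if ccd_code is not None:
        ligand["ccdCodes"] = [ccd_code]  # CCD representation
    else:
        ligand["smiles"] = smiles        # SMILES representation
    return {
        "name": name,
        "modelSeeds": list(seeds),
        "sequences": [
            {"protein": {"id": "A", "sequence": poi_seq}},
            {"protein": {"id": "B", "sequence": e3_seq}},
            {"ligand": ligand},
        ],
        "dialect": "alphafold3",
        "version": 1,
    }


if __name__ == "__main__":
    # Placeholder sequences and ligand, not a real PROTAC system.
    job = af3_input("example_smiles", "MKTAYIAKQR", "GSHMEEVLLK",
                    smiles="CC(=O)NCCOCCOCCNC(C)=O")
    print(json.dumps(job, indent=2))
```

Building the same job once with `ccd_code=` and once with `smiles=` gives a matched pair of inputs for comparing the two ligand representations.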

Structure Prediction Evaluation

The utils/evaluation.py script automates the extraction of all quantitative metrics from our study (see Key Metrics). It uses the (PDBID)_analysis.txt files (generated by protacfold.xyz) to identify POI and E3 ligase chains, enabling fully automated, component-wise RMSD calculations with PyMOL.

Note: The script requires a local installation of PyMOL for structural alignments.

To run a complete analysis on a directory of PROTAC predictions:

# Analyze all AlphaFold 3 predictions in a given directory
python utils/evaluation.py --protac path/to/predictions --model_type AlphaFold3

# Analyze all Boltz-1 predictions in a given directory
python utils/evaluation.py --boltz path/to/predictions --model_type Boltz1

This will generate an evaluation_results.csv file in the data/af3_results/ directory.
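
For readers who want to inspect confidence metrics by hand, the following sketch shows one way to read pTM and ipTM from AlphaFold 3 output. It is not the implementation in utils/evaluation.py. It assumes each job has its own subdirectory containing a `<job>_summary_confidences.json` file with `ptm`, `iptm`, and `ranking_score` keys; the function names and the directory layout are assumptions for illustration, and the values in the demo are made up.

```python
import json
import tempfile
from pathlib import Path

SUFFIX = "_summary_confidences.json"


def read_confidences(path):
    """Return (pTM, ipTM, ranking score) from one summary confidences file."""
    data = json.loads(Path(path).read_text())
    return data.get("ptm"), data.get("iptm"), data.get("ranking_score")


def collect_confidences(results_dir):
    """Map job name -> (pTM, ipTM, ranking score) for every job subdirectory."""
    rows = {}
    for path in sorted(Path(results_dir).glob("*/*" + SUFFIX)):
        job = path.name[: -len(SUFFIX)]
        rows[job] = read_confidences(path)
    return rows


if __name__ == "__main__":
    # Demo on a temporary directory with made-up values.
    with tempfile.TemporaryDirectory() as tmp:
        job_dir = Path(tmp) / "example_job"
        job_dir.mkdir()
        (job_dir / ("example_job" + SUFFIX)).write_text(
            json.dumps({"ptm": 0.8, "iptm": 0.7, "ranking_score": 0.75})
        )
        for job, (ptm, iptm, score) in collect_confidences(tmp).items():
            print(job, ptm, iptm, score)
```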

Visualization and Plotting

The src/plots/ directory contains all the scripts used to generate the figures and perform the data analysis for our research. These scripts produce a variety of visualizations and can be run with:

python src/plots/main.py

Key Metrics

PROTACFold evaluates predictions using multiple metrics:

  • DockQ Score: Quality measure for protein-protein docking interfaces
  • RMSD: Root Mean Square Deviation between predicted and experimental structures
  • pTM/ipTM: AlphaFold confidence metrics for overall and interface quality
  • Molecular Descriptors: Physicochemical properties of PROTAC molecules
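
To illustrate two of these metrics, the sketch below computes RMSD as the square root of the mean squared distance between matched atoms, and bins a DockQ score into the quality classes commonly used with DockQ (incorrect below 0.23, acceptable from 0.23, medium from 0.49, high from 0.80). The RMSD function assumes the two coordinate sets are already superimposed and listed in the same atom order; PROTACFold itself relies on PyMOL for alignment, so this is a simplified stand-in rather than the project's method. Function names are chosen for this example.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    No superposition is performed; inputs are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("Coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def dockq_class(score):
    """Bin a DockQ score (0 to 1) into the usual quality classes."""
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


if __name__ == "__main__":
    # Toy coordinates in angstroms, not taken from any real structure.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(round(rmsd(reference, predicted), 3))
    print(dockq_class(0.5))
```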

Predicted Structures

All 124 predicted PROTAC structures, as well as two replicas of a 300 ns MD simulation of complex 9B9W, are available on Zenodo. An example of a high-quality prediction, complex 7PI4, is shown below: the experimental structure is in gray, the AlphaFold 3 prediction in gold, and the Boltz-1 prediction in cyan.

[Images: PDB ID 7PI4 AlphaFold3 prediction; PDB ID 7PI4 Boltz-1 prediction]

Tools

Protein Structure Prediction

  • AlphaFold 3 - DeepMind's state-of-the-art protein structure prediction model

  • Boltz-1 - Open-source biomolecular interaction model from MIT researchers

Structure Analysis and Comparison

  • DockQ - Quality measure for protein-protein docking models

Visualization and Chemoinformatics

  • PyMOL - Molecular visualization system
  • RDKit - Open-source chemoinformatics toolkit

Data Sources

This project integrates data from:

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • The AlphaFold team at Google DeepMind
  • The Boltz researchers at MIT
  • Developers of open-source tools used in this project (RDKit, DockQ)
  • PyMOL for visualization
  • Contributors to PROTAC databases and experimental data

Citation

If you use PROTACFold in your research, please cite the paper: Predicting PROTAC-Mediated Ternary Complexes with AlphaFold3 and Boltz-1
