Using the AlphaFold3 web server to model the MDM2-p53 complex
General Overview
This extra module provides a quick guide on how to use the AlphaFold Server to predict a structure of the MDM2-p53 complex and on how to use the output metrics to evaluate the prediction.
Introduction
The press release issued on October 9th, 2024, announcing the Nobel Prize in Chemistry, states that “The Nobel Prize in Chemistry 2024 is about proteins, life’s ingenious chemical tools. David Baker has succeeded with the almost impossible feat of building entirely new kinds of proteins. Demis Hassabis and John Jumper have developed an AI model to solve a 50-year-old problem: predicting proteins’ complex structures. These discoveries hold enormous potential”. Half of this prize went to the developers of AlphaFold2, a tool capable of predicting protein structures with previously unseen accuracy. Many of AlphaFold2’s predictions at the Critical Assessment of Protein Structure Prediction (CASP) were so accurate as to be indistinguishable from experimentally solved protein structures. Shortly after the release of AlphaFold2, AlphaFold-Multimer was introduced, extending the approach to the prediction of protein–protein complexes rather than individual protein structures. More background on the evolution from DeepMind’s early work on game-playing AI to AlphaFold2, as well as practical guidance on running AlphaFold-Multimer in Google Colab, can be found here.
By the time the developers of AlphaFold2 received the Nobel Prize, an even more powerful model had already been published. Unlike its predecessors, AlphaFold3 (AF3) is designed to predict the structures of a broad range of macromolecular complexes, including not only protein–protein complexes but also complexes involving DNA, RNA, small molecules, and ions. DeepMind achieved this by large-scale training on structural data, drawing on essentially all relevant entries available in the Protein Data Bank up to 30/09/2021, as well as additional curated chemical and biological information. Architecturally, AlphaFold3 differs from earlier versions by introducing a diffusion-based generative model for structure prediction.
Both AlphaFold2 and AlphaFold3 use attention-based neural networks for representation learning, but the structure generation process differs between the two models. In AF2, it is handled by the structure module, which predicts per-residue rigid frames (rotation + translation) along the protein backbone and fills in the side-chain atoms afterwards. In contrast, AF3 uses a diffusion-based generative process to generate all atomic coordinates jointly. What did not change is the reliance on evolutionary and structural context: multiple sequence alignments (MSAs) and templates are still used for modelling protein components, although their role is reduced or absent for non-protein partners.
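To make the notion of a per-residue rigid frame more concrete, the sketch below places the backbone atoms of a single residue by applying a rotation and a translation to idealized local coordinates. It is only a toy illustration in plain Python, not AlphaFold2 code: the local coordinates are approximate values chosen for illustration, and the names LOCAL_BACKBONE, rotation_z, apply_frame and place_residue are hypothetical.

```python
import math

# Approximate idealized backbone geometry in the residue's local frame
# (angstroms, CA at the origin). Illustrative values, assumed for this sketch.
LOCAL_BACKBONE = {
    "N": (-0.525, 1.363, 0.0),
    "CA": (0.0, 0.0, 0.0),
    "C": (1.526, 0.0, 0.0),
}


def rotation_z(angle_deg):
    """Return a 3x3 rotation matrix for a rotation about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_frame(rotation, translation, point):
    """Map a local point to global coordinates: x_global = R * x_local + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def place_residue(rotation, translation):
    """Place all backbone atoms of one residue using a single rigid frame."""
    return {
        name: apply_frame(rotation, translation, xyz)
        for name, xyz in LOCAL_BACKBONE.items()
    }


# One frame = one rotation + one translation for the whole residue.
residue = place_residue(rotation_z(90.0), (10.0, 0.0, 0.0))
```

Because the frame is rigid, the distances between atoms inside the residue stay fixed; only the position and orientation of the residue as a whole change. A diffusion-based process, by comparison, updates the coordinates of all atoms directly.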
The detailed mechanics and intuitions behind the AlphaFold3 pipeline have been described many times in various online sources, and we can recommend:
- A talk on AF3 by Sergey Ovchinnikov, one of the leading researchers in protein structure prediction.
- A non-ML expert breakdown of the AF3 pipeline.
What is the AlphaFold Server?
AlphaFold3 itself is a deep learning model, and while the inference pipeline, i.e. the code required to run the model, but not to train it, is publicly available on GitHub, it cannot realistically be run on a regular PC. The system is designed for datacenter-class GPUs and requires very large memory and storage resources, including up to ~1 TB of disk space to host the necessary databases, as well as hundreds of gigabytes of RAM for practical use.
The AlphaFold Server is the most user-friendly way to use this tool. It not only removes the burden of installing the pipeline but also provides the necessary resources for computations - free of charge! However, there is a limit of 30 jobs per day (as of January 2026), and you need a Google account to register.
Using the AlphaFold Server to predict the p53-MDM2 complex
Generating a prediction with the AlphaFold Server is extremely straightforward and can be done in just a few clicks (just 4, if you use keyboard shortcuts to copy-paste the sequences and navigate :) ). Once a set of 5 models has been generated, it is essential to examine the associated confidence metrics to assess the reliability of the predictions. In this section, we will use the AlphaFold Server to predict the 3D structure of the p53–MDM2 complex and examine the confidence metrics provided by the server.
Go to AlphaFold Server https://alphafoldserver.com/welcome and log in using the “Continue with Google” button.
Scroll past the information block and several examples of structures until you see the input form.
You should see the job appearing in the panel at the bottom of the page. Depending on the size of the molecules, the job can take several minutes to several hours. Once it’s done you will be able to click on it and see the results.
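If you plan to submit the same kind of job repeatedly, the input can in principle be prepared as a JSON file instead of being typed into the form. The sketch below builds such a job description in Python. It assumes that the server offers a JSON upload option and that the field names (name, modelSeeds, sequences, proteinChain, dialect, version) match the server's documented job format; please check both against the current server documentation. The helper build_job and the two sequences are placeholders, not the real MDM2 and p53 sequences.

```python
import json


def build_job(name, protein_sequences):
    """Return a one-job list in the (assumed) AlphaFold Server JSON layout."""
    return [
        {
            "name": name,
            "modelSeeds": [],  # assumed: an empty list lets the server pick the seed
            "sequences": [
                {"proteinChain": {"sequence": seq, "count": 1}}
                for seq in protein_sequences
            ],
            "dialect": "alphafoldserver",
            "version": 1,
        }
    ]


# Dummy sequences only: replace them with the real MDM2 and p53 sequences.
job = build_job("p53_mdm2", ["ACDEFGHIK", "LMNPQRST"])
job_json = json.dumps(job, indent=2)
```

Writing job_json to a file gives you a record of exactly what was submitted, which is handy when comparing several runs.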
Interpreting confidence metrics
On the result page you can see the 1st, i.e. top-ranked, predicted structure out of 5 and associated confidence metrics - pLDDT, PAE, pTM and ipTM scores.
You can find a description of the metrics online, for example:
- A practical guide to AlphaFold in the frame of the EMBL-EBI Training: https://www.ebi.ac.uk/training/online/courses/alphafold/. Navigate to “How to assess the quality of AlphaFold 3 predictions” to see info about the metrics specifically.
- An FAQ & Guides page of the AF Server: https://alphafoldserver.com/guides#overview:. Navigate to “Section 3: Interpreting results from AlphaFold Server” to see info about the metrics specifically.
Take a look at the result page:

Look at the pLDDT of MDM2. Can you trust this model? What about the N-ter?
Answer
The majority of the predicted MDM2 structure has a pLDDT of 90 or higher, so the structure can, in principle, be trusted. AlphaFold3, like any currently existing deep learning model, cannot guarantee the absence of so-called hallucinations, i.e. false information presented with high confidence; however, given the abundance of information about MDM2 that you used to build a homology model, it is safe to trust AlphaFold3 in this case. The N-terminus of the model has a low pLDDT score, which is expected for long terminal regions of proteins, as these regions are often intrinsically flexible or disordered. Such regions are frequently not experimentally resolved, or are resolved as an NMR ensemble with varied positions. As a result, these termini are not as well represented in the online databases, and AF3 is not as good at predicting them. This usually does not hinder the quality of the overall prediction.
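The same assessment can be made numerically from the downloaded results. The sketch below averages the per-atom pLDDT for each chain and maps the result to the usual AlphaFold confidence bands (90 and above, 70 to 90, 50 to 70, below 50). It assumes that the results archive contains a JSON file with the keys atom_plddts and atom_chain_ids; the toy dictionary stands in for that file and is not real server output, and the function names are hypothetical.

```python
from collections import defaultdict


def confidence_band(plddt):
    """Map a pLDDT value (0-100) to the usual AlphaFold confidence band."""
    if plddt >= 90:
        return "very high"
    if plddt >= 70:
        return "confident"
    if plddt >= 50:
        return "low"
    return "very low"


def mean_plddt_by_chain(full_data):
    """Average the per-atom pLDDT values separately for each chain."""
    values = defaultdict(list)
    for chain_id, plddt in zip(full_data["atom_chain_ids"], full_data["atom_plddts"]):
        values[chain_id].append(plddt)
    return {chain_id: sum(v) / len(v) for chain_id, v in values.items()}


# Toy stand-in for the parsed JSON file (assumed key names, invented values).
toy_full_data = {
    "atom_chain_ids": ["A", "A", "B", "B"],
    "atom_plddts": [95.0, 85.0, 40.0, 60.0],
}
chain_means = mean_plddt_by_chain(toy_full_data)
bands = {chain_id: confidence_band(m) for chain_id, m in chain_means.items()}
```

A per-chain average hides local effects such as a disordered N-terminus, so it complements rather than replaces the coloured structure view.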
Look at the ipTM and pTM scores - do they indicate that the 3D structure of the complex is trustworthy?
Answer
Both the ipTM and pTM scores are above 0.8, which indicates a confidently predicted interaction.
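The 0.8 rule of thumb can be written as a small check, which is convenient when comparing all 5 models. The sketch below assumes that each model comes with a summary JSON file containing the keys iptm and ptm; the function assess_complex, its wording of the verdicts, and the toy values are hypothetical and not taken from the actual p53-MDM2 result.

```python
def assess_complex(summary, threshold=0.8):
    """Classify a prediction from its ipTM and pTM scores (both range 0-1)."""
    iptm = summary["iptm"]
    ptm = summary["ptm"]
    if iptm > threshold and ptm > threshold:
        return "confident complex"
    if ptm > threshold:
        # The overall fold looks reliable, but the interface does not.
        return "confident fold, uncertain interface"
    return "low confidence"


# Toy summaries with invented values, standing in for the parsed JSON files.
toy_summaries = [
    {"iptm": 0.90, "ptm": 0.85},
    {"iptm": 0.40, "ptm": 0.85},
    {"iptm": 0.30, "ptm": 0.50},
]
verdicts = [assess_complex(s) for s in toy_summaries]
```

The threshold is a guideline rather than a hard boundary, so scores close to 0.8 deserve a closer look at the PAE plot.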
Answer
A dark green colour in the PAE plot indicates a low predicted position error, i.e. a low expected error in the relative position of a pair of tokens. The area corresponding to the peptide structure shows a very low overall predicted error, with core residues having higher confidence than terminal ones. The light-coloured vertical section corresponds to the N-terminus of MDM2, roughly residues 1-16. This tail can flop and move around without impacting the overall structure, so its predicted position error is high. You can compare its position across all 5 generated models - it is extremely likely to differ between them.
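What the plot shows visually can also be summarized per pair of chains. The sketch below averages the PAE values within each chain and between chains; a low between-chain average supports a well-defined relative placement of p53 and MDM2. It assumes that the downloaded JSON file stores the matrix under the key pae and the chain of each token under token_chain_ids; the toy matrix is invented and the function name is hypothetical.

```python
from collections import defaultdict


def mean_pae_blocks(pae, token_chain_ids):
    """Average the PAE matrix over every (row chain, column chain) block."""
    totals = defaultdict(lambda: [0.0, 0])  # (chain_i, chain_j) -> [sum, count]
    for i, chain_i in enumerate(token_chain_ids):
        for j, chain_j in enumerate(token_chain_ids):
            entry = totals[(chain_i, chain_j)]
            entry[0] += pae[i][j]
            entry[1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}


# Toy 3-token example (two tokens in chain A, one in chain B), values in angstroms.
toy_chain_ids = ["A", "A", "B"]
toy_pae = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 7.0],
    [4.0, 6.0, 0.0],
]
blocks = mean_pae_blocks(toy_pae, toy_chain_ids)
```

The (A, B) and (B, A) blocks are kept separate because the PAE matrix is not symmetric: the error depends on which token is used for the alignment.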
Align the models and colour them by B-factor to visualize the per-atom pLDDT values:
alignto fold_p53_mdm2_model_0 and chain B
spectrum b
As you can see, generating structural models with AF3 requires far less time and manual effort than classical structural modelling. This naturally raises an important question: how accurate are the models produced by AF3?
In structural modelling, the quality of the predicted models depends on many factors, one of the most important being the amount of relevant information available. As you observed during homology modelling, there is extensive experimental information for both MDM2 and the p53–MDM2 complex, including multiple experimentally solved structures of human MDM2 bound to p53 (PDB: 1YCR, 4HFZ etc.), which can serve as modelling templates. In such cases, producing an accurate prediction is relatively straightforward. The situation is very different for more challenging targets, where experimental information is sparse or entirely absent. A useful way to assess the current state of the field is through community-wide blind prediction experiments such as CASP and the Critical Assessment of Predicted Interactions (CAPRI). In these challenges, participants are given only the sequences of the constituents of a complex whose structure has been experimentally determined but not yet released, which provides an objective benchmark for computational methods. The results of these challenges show that the output of the AlphaFold Server is often less accurate than the models produced by human teams, especially for challenging targets. To sum up, while tools like AlphaFold3 represent a major advance, they do not eliminate the need for careful analysis, complementary methods, and human expertise.