
Computational Structural Biology group focusing on dissecting, understanding and predicting biomolecular interactions at the molecular level.







Structural Bioinformatics & Modelling

About this course

The Structural Bioinformatics & Modelling course, created and maintained by the Computational Structural Biology group of Utrecht University, is aimed at those interested in learning protein modelling, molecular simulation, and docking of biological molecules. The course material requires a solid understanding of molecular biology, namely of protein sequence and structure, as well as familiarity with basic chemistry concepts. Experience with a UNIX-like command-line environment is not required, but it helps: the less attention you spend on typing commands, the more you can give to the modelling concepts and the biology.

The practical course is divided into three modules, each covering a particular modelling method. While the goal is to combine the three methods to answer a biological problem, each module can be followed independently.

Part 1: Homology modelling

This first module is about performing homology modelling of a protein, consisting of:

  • Template search
  • Template selection
  • Model building
  • and model quality estimation.
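
To make the template selection step more concrete, here is a minimal sketch of one quantity commonly weighed at that stage: the sequence identity between the target and each candidate template. This is not how SWISS-MODEL ranks templates internally. The function name `percent_identity` and the toy aligned sequences are hypothetical and are not real MDM2 data.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignments (assumption: already aligned, '-' marks a gap).
target = "MKT-AYIAKQR"
candidates = {
    "template_A": "MKTLAYIGKQR",
    "template_B": "MRT-SYLAKNR",
}

# Rank candidate templates from most to least similar to the target.
ranked = sorted(
    candidates,
    key=lambda name: percent_identity(target, candidates[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {percent_identity(target, candidates[name]):.1f}% identity")
```

Sequence identity is only one criterion. In practice, coverage of the target sequence and the quality of the template structure should be weighed as well.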

Part 2: Molecular dynamics simulations of a peptide

This module introduces Molecular Dynamics (MD) simulations of proteins. The simulation protocol can be used as a starting point for the investigation of protein dynamics (provided your system does not contain non-standard groups). By the end of this tutorial, you should know the steps involved in:

  • setting up
  • running
  • and analyzing a simulation, including critically assessing the choices made at the different steps.
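
The three stages above can be sketched as a short plan of GROMACS commands. The snippet below only builds and prints the commands (a dry run) and does not execute them. The file names (`peptide.pdb`, `md.mdp`, `topol.top`, and so on) and the helper `build_md_plan` are assumptions made for illustration. The tutorial's own protocol and parameter files take precedence, and a realistic protocol typically also includes steps omitted here, such as adding ions, energy minimisation and equilibration.

```python
import shlex


def build_md_plan(structure="peptide.pdb", mdp="md.mdp", prefix="md"):
    """Return a bare-bones list of gmx commands as argument lists (dry run only)."""
    return [
        # Setting up: generate a topology, define a box, add solvent.
        ["gmx", "pdb2gmx", "-f", structure, "-o", "processed.gro", "-p", "topol.top"],
        ["gmx", "editconf", "-f", "processed.gro", "-o", "boxed.gro", "-c", "-d", "1.0"],
        ["gmx", "solvate", "-cp", "boxed.gro", "-cs", "spc216.gro",
         "-o", "solvated.gro", "-p", "topol.top"],
        # Running: preprocess the inputs into a run file, then run it.
        ["gmx", "grompp", "-f", mdp, "-c", "solvated.gro", "-p", "topol.top",
         "-o", f"{prefix}.tpr"],
        ["gmx", "mdrun", "-deffnm", prefix],
        # Analysing: RMSD over the trajectory (assumes the .mdp settings
        # write a compressed .xtc trajectory).
        ["gmx", "rms", "-s", f"{prefix}.tpr", "-f", f"{prefix}.xtc", "-o", "rmsd.xvg"],
    ]


plan = build_md_plan()
for cmd in plan:
    # shlex.join quotes arguments so each line could be pasted into a terminal.
    print(shlex.join(cmd))
```

Printing the plan before running anything makes it easier to question each choice (box size, water model, parameter file) rather than copy-pasting commands blindly. Note that `gmx pdb2gmx` prompts interactively for a force field and water model unless these are given on the command line.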

Part 3: Protein-peptide data-driven docking

The third module introduces protein-peptide docking using the HADDOCK2.4 web server. It also introduces the CPORT web server for interface prediction, based on evolutionary conservation and other biophysical properties. By the end of this tutorial, you should know how to:

  • set up a HADDOCK run
  • and interpret its results in terms of biological insights.

Conventions

Each module has its own set of web pages, but they all share the same conventions. Throughout the material, colored text is used to mark questions or instructions, Unix and/or PyMOL commands, and attention prompts (to avoid distractions!). Students following these tutorials should try their best to answer the questions instead of blindly copy-pasting commands!

  • This is an attention prompt: pay special attention to this!
  • This is a question prompt: try answering it!
  • This is an instruction prompt: follow it properly!
  • This is a PyMOL prompt: commands are for PyMOL only!
  • This is a Unix prompt: insert the commands in the terminal!


Requirements

For the homology modelling module, we will be using the SWISS-MODEL online modelling tool (https://swissmodel.expasy.org). It is very convenient because it covers all stages of homology modelling: template search, template selection, model building and model quality estimation.

The molecular dynamics module requires a specific software package: GROMACS. GROMACS is installed on the virtual machines, which students can access via NMRbox (see below). IMPORTANT: You must register with NMRbox before the course starts: https://nmrbox.org/signup.

Once you have registered, please enroll in the 2024 version of the course on NMRbox here.

We will also use PyMOL, a very popular molecular visualization program, throughout the course. PyMOL can be downloaded for free or used via NMRbox.

The third module, protein-peptide data-driven docking, uses the HADDOCK2.4 web server, which requires registration but is free for academic users. All the required scripts and data are available for free on a dedicated GitHub repository.

Modules 1 (homology modelling) and 3 (protein-peptide data-driven docking) can in principle be run from any computer, provided you have web access and have installed PyMOL. Module 2 (molecular dynamics simulations of a peptide) requires access to a terminal and uses Linux commands. In principle, the entire tutorial can also be run from within an NMRbox virtual machine (see below).


Use of NMRbox virtual machines (VMs)

In this course, we will be using NMRbox. NMRbox offers cloud-based virtual machines for running various biomolecular software that can complement NMR (Nuclear Magnetic Resonance) experiments. NMRbox users can choose from 261 pre-installed software packages that cover research topics such as metabolomics, molecular dynamics, structure, intrinsically disordered proteins and binding. All available packages can be searched at https://nmrbox.org/software.

Register to NMRhub

To use virtual machines through NMRbox, you first need to register with NMRhub, preferably using your institutional account. Since the registration has to be validated manually, which can take up to two business days, we strongly encourage students to register before the course starts. After successful validation, you will receive an e-mail with the NMRhub username and password that you will use to access your virtual machine.

Accessing NMRbox

To access the virtual machine from a local computer, you need to install VNC Viewer. The RealVNC client connects your computer to the NMRbox servers and gives you a virtual desktop with a graphical interface. More information about VNC Viewer can be found in the NMRbox FAQ.

To choose a virtual machine:

  • First, log in to the user dashboard: https://nmrbox.org/user-dashboard.
  • Download the zip file with bookmarks for the production NMRbox virtual machines.
  • In VNC Viewer, click File -> Import connections and select the downloaded zip file.
  • After importing, you will see the current release virtual machines. You can use any available virtual machine. The user dashboard provides information on machine capabilities and recent compute load, so it is wise to choose a less busy one.
  • Double click on one of the VMs. An “Authentication” panel appears.
  • Enter your NMRbox username and password.
  • Click on the “Remember password” box to have RealVNC save your information.

By default, your desktop remains running when you disconnect from it. If you log in to your VM repeatedly, you will see a screen symbol next to the VM you connected to recently. For more details, follow the quick start guide for using NMRbox with VNC Viewer here.

If everything runs correctly, you should have a window with your virtual desktop open. In the virtual desktop, you have access to the internet, with Chromium as the browser, and can use various programs, including PyMOL. You could therefore run all three modules of this course there, or transfer data between your local machine and the virtual machine. File transfer to and from the VM is straightforward and is described here: https://nmrbox.org/faqs/file-transfer.

In this course we will be working on the command line. For those of you who are not familiar with it, many useful tutorials and documentation can be found here. To find the terminal, look for a black icon with a $_ symbol on it. Once you are familiar with the terminal and basic commands, you can start the Molecular Dynamics tutorial.

Further NMRbox documentation can be found here.

Once you are done using your VM for the day, just log out of it using the top menu button as shown in this 9s video.


Tutorial layout & Biological Significance

The E3 ubiquitin-protein ligase MDM2 regulates p53, also known as the cell’s guardian angel, via two main mechanisms: ubiquitination-dependent proteasomal degradation and direct inhibition through binding to a region of the trans-activation domain of p53. Not surprisingly, many cancer types take advantage of this interplay and overexpress MDM2, making the p53/MDM2 interaction a prime target for drug development for cancer therapeutics. While many researchers focus on the human p53/MDM2 interaction, we believe mice (Mus musculus) also deserve their share of cutting-edge research, if only for their long-standing contribution to human disease! Therefore, the aim of this course is to probe the binding of a peptide sequence of the mouse p53 tumour suppressor protein to mouse MDM2. Since neither partner has an experimentally determined structure available yet, this scientific problem is perfectly suited for a course on computational structural biology.

To this end, the course will describe how to build structural models for the mouse MDM2 protein via homology modelling and for the p53 trans-activation peptide via molecular dynamics simulations. Afterwards, the resulting models will seed docking simulations to predict the structure of the p53/MDM2 complex. The final goal is to suggest a possible interface and produce a starting model to design and develop drugs that will help save millions of mice! Maybe, with the right amount of luck, these results will be transferable to the human p53/MDM2 complex and will also contribute to our well-being.


Get started!

To get started, click on the icons of the modules. Since the docking simulations require structures from the other two modules, we suggest leaving the docking module for last. If time is an issue, start with the molecular dynamics simulations and, while these run, have fun modelling MDM2.

Homology modelling
Molecular dynamics
Docking

Bonus!

AlphaFold 2
In this bonus module you will discover the power of artificial intelligence (AI) for structural biology. We will introduce AlphaFold 2 and use it to model the MDM2/p53 protein-peptide complex from sequence only.