HADDOCK2.2 manual
PDB files
In order to run HADDOCK you need to have the structures of the molecules (or fragments thereof) in PDB format. There are a few points to pay attention to when preparing the PDBs for HADDOCK.
- Make sure that all PDB files end with an END statement.
- HADDOCK will check for breaks in the chain (e.g. missing density in crystal structures, or
the gap between the two strands of a DNA molecule). In the case of multiple chains within one molecule (e.g.
DNA) or in the presence of co-factors, it is nevertheless recommended to add a TER statement
between the chains/sub-molecules.
- Remove the SEGIDs if present (the SEGID is a four-character
string in columns 73-76 of the PDB format). This is particularly
important for docking from an ensemble of starting conformations.
If the SEGIDs are not blanked, the topology and structure generation step
will run into problems.
For this purpose an awk script called pdb_blank_segid is provided in the tools directory. Its usage is:
pdb_blank_segid infile > outfile
- When starting from an ensemble of structures, for example
from an NMR PDB entry, split the ensemble into single PDB files. Make sure
that each structure file ends with an END statement and does not contain any
SEGID.
- HADDOCK can deal with ions. You will, however, have to make sure that the
ion naming is consistent with the ion topologies provided in HADDOCK
(for this, check the ion.top file in the toppar directory).
For example, a CA heteroatom with residue name CA will be interpreted as
a neutral calcium atom. A doubly charged calcium ion should be named
CA+2, with residue name CA2, to be properly recognized by HADDOCK.
(See also the FAQ for docking in the presence of ions.)
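The SEGID and ensemble points above can be sketched in a few lines of Python. This is only an illustration, not part of HADDOCK and not a replacement for pdb_blank_segid. The function names (blank_segid, split_models, write_models) are hypothetical. The sketch assumes a fixed-column PDB file in which ensemble members are delimited by MODEL/ENDMDL records, and it keeps only ATOM, HETATM and TER records.

```python
def blank_segid(line):
    """Blank columns 73-76 (the SEGID field) of an ATOM/HETATM record.

    Expects a line without its trailing newline; other records are
    returned unchanged.
    """
    if line.startswith(("ATOM", "HETATM")) and len(line) > 72:
        # Columns 73-76 are string indices 72..75 (0-based).
        return line[:72] + "    " + line[76:]
    return line


def split_models(lines):
    """Split an ensemble into one list of lines per structure.

    Assumption: members are delimited by MODEL/ENDMDL records. Only
    ATOM, HETATM and TER records are kept, SEGIDs are blanked, and
    every structure is terminated with an END statement.
    """
    models, current = [], []
    for raw in lines:
        line = raw.rstrip("\r\n")
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            models.append(current + ["END"])
            current = []
        elif record in ("ATOM", "HETATM", "TER"):
            current.append(blank_segid(line))
    if current:
        # No closing ENDMDL record: treat the rest as a single structure.
        models.append(current + ["END"])
    return models


def write_models(models, prefix):
    """Write each structure to <prefix>_<n>.pdb (hypothetical naming)."""
    for number, model in enumerate(models, start=1):
        with open(f"{prefix}_{number}.pdb", "w") as handle:
            handle.write("\n".join(model) + "\n")
```

For instance, given a hypothetical input file ensemble.pdb, calling write_models(split_models(open("ensemble.pdb")), "model") would write model_1.pdb, model_2.pdb, and so on, each ending with END and free of SEGIDs. TER statements already present in the input are preserved; the sketch does not add new ones.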