Usage Guide

This guide walks step by step through using haddock-runner to run large-scale HADDOCK docking benchmarks. It assumes no experience with previous versions.

Quick Start Workflow

Using haddock-runner involves three main steps, which the complete guide below expands into five:

Prepare your input files
Configure your benchmark
Run the benchmark

Complete Usage Guide

Step 1: Prepare Your Molecular Data

Before using haddock-runner, you need:

Protein structures: PDB files for your docking targets
Restraint files (optional): TBL files for guided docking
Topology/parameter files (optional): For ligands or special molecules

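Organize your files, for example in a single directory as shown here. Per-target subdirectories, as used in the later examples, also work, because the input list records the path to each file: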

your_project/
├── structures/
│   ├── target1_r_u.pdb    # Receptor structure
│   ├── target1_l_u.pdb    # Ligand structure
│   ├── target1_ti.tbl     # True interface restraints
│   └── target1_ref.pdb    # Reference structure (for evaluation)
└── ...

Step 2: Create the Input List File

The input list file specifies all files needed for each docking target.

Key points:

One target per section (separated by comments)
List all required files for each target
Paths can be relative or absolute
Use consistent naming conventions

Example (input_list.txt):

# Target 1A2K - Protein-protein complex
structures/1A2K/1A2K_r_u.pdb
structures/1A2K/1A2K_l_u.pdb
structures/1A2K/1A2K_ti.tbl
structures/1A2K/1A2K_unambig.tbl
structures/1A2K/1A2K_ref.pdb

# Target 1GGR - Another complex
structures/1GGR/1GGR_r_u.pdb
structures/1GGR/1GGR_l_u.pdb
structures/1GGR/1GGR_ti.tbl

Step 3: Write the Benchmark Configuration

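The YAML configuration file defines your benchmark scenarios and settings. In the example below, values such as _ti.tbl and _ref.pdb are file-name suffixes rather than full paths: they correspond to each target's files in the input list (for example, 1A2K_ti.tbl).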

Main sections:

general: Global settings (concurrency, resources, directories)
scenarios: The docking workflows to test; each scenario defines a complete HADDOCK workflow

Example (benchmark.yaml):

general:
  max_concurrent: 4             # How many jobs to run simultaneously
  ncores: 2                     # CPU cores per job
  execution: local              # Execution mode (local, slurm, etc.)
  mol_suffixes: [_r_u, _l_u]    # File name suffixes for molecules
  input_list: input_list.txt    # Path to your input list file
  work_dir: ./results           # Where to store results

scenarios:
  - name: true-interface
    workflow:
      topoaa:
        autohis: true
      rigidbody:
        sampling: 1000
        ambig_fname: _ti.tbl
      flexref:
        ambig_fname: _ti.tbl
      caprieval:
        reference_fname: _ref.pdb

  - name: center-of-mass
    workflow:
      topoaa:
        autohis: true
      rigidbody:
        sampling: 500
        cmrest: true

See the Configuration Reference for all available options.
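
The suffix values in the example (such as _ti.tbl) can be read as selectors over a target's files. The Python sketch below shows one way such matching could work. It assumes that a suffix must select exactly one file per target. The helper resolve_suffix is hypothetical and may not match how haddock-runner resolves these fields internally.

```python
def resolve_suffix(files, suffix):
    """Return the single path in `files` that ends with `suffix`.

    Assumption of this sketch: a pattern such as "_ti.tbl" selects exactly
    one file per target. Raises ValueError otherwise.
    """
    matches = [f for f in files if f.endswith(suffix)]
    if len(matches) != 1:
        raise ValueError(
            f"expected one file ending in {suffix!r}, found {len(matches)}"
        )
    return matches[0]


# Files of target 1A2K, taken from the example input list
files_1a2k = [
    "structures/1A2K/1A2K_r_u.pdb",
    "structures/1A2K/1A2K_l_u.pdb",
    "structures/1A2K/1A2K_ti.tbl",
    "structures/1A2K/1A2K_unambig.tbl",
    "structures/1A2K/1A2K_ref.pdb",
]

print(resolve_suffix(files_1a2k, "_ti.tbl"))
```

Failing loudly on zero or multiple matches helps catch a scenario that refers to a restraint file a target does not have.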

Step 4: Run the Benchmark

Execute haddock-runner with your configuration:

# Basic execution
haddock-runner benchmark.yaml

# Setup mode (validate without running)
haddock-runner --setup benchmark.yaml

# Debug mode (verbose logging)
haddock-runner --debug benchmark.yaml

What happens during execution:

Input validation and checksum verification
Job creation for each target-scenario combination
Concurrent execution according to resource limits
Results organization in the working directory
Progress logging and error handling
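
The job-creation and concurrency steps listed above can be sketched in Python as follows. This is an illustration of the idea (one job per scenario-target pair, with at most max_concurrent running at once), not haddock-runner's implementation. The function run_job is a hypothetical placeholder for launching one docking run.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

targets = ["1A2K", "1GGR"]
scenarios = ["true-interface", "center-of-mass"]
max_concurrent = 4  # same meaning as in the example configuration


def run_job(job):
    """Placeholder for launching one docking run (hypothetical)."""
    scenario, target = job
    return f"{scenario}/{target}"


# One job per scenario-target combination
jobs = list(product(scenarios, targets))

# At most max_concurrent jobs are in flight at any time
with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
    finished = list(pool.map(run_job, jobs))

print(len(jobs), finished)
```

With two targets and two scenarios this yields four jobs, so the total job count grows as the product of the two lists.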

See Running the Benchmark for runtime details.

Step 5: Analyze Results

After completion, results are organized by scenario and target:

results/
├── true-interface/
│   ├── 1A2K/
│   │   ├── haddock3.cfg
│   │   ├── run1/
│   │   └── ...
│   └── 1GGR/
│       └── ...
└── center-of-mass/
    ├── 1A2K/
    └── 1GGR/
        └── ...
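
For scripted analysis, the layout above can be enumerated with a few lines of Python. This sketch assumes the two-level work_dir/scenario/target structure shown in the tree. The function name list_runs is hypothetical.

```python
from pathlib import Path


def list_runs(work_dir):
    """Return sorted (scenario, target) pairs found under work_dir.

    Assumes the two-level layout shown above:
    work_dir/<scenario>/<target>/
    """
    root = Path(work_dir)
    pairs = []
    for scenario_dir in root.iterdir():
        # Ignore stray files at the top level
        if not scenario_dir.is_dir():
            continue
        for target_dir in scenario_dir.iterdir():
            if target_dir.is_dir():
                pairs.append((scenario_dir.name, target_dir.name))
    return sorted(pairs)
```

Calling list_runs("results") after a run would return pairs such as ("true-interface", "1A2K"), which can then drive per-run analysis.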

Result analysis tips:

Compare docking success rates between scenarios
Analyze CAPRI metrics for quality assessment
Examine computation times and resource usage
Use HADDOCK analysis tools for detailed evaluation
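
A common way to compare scenarios is the success rate: the fraction of targets with at least one acceptable model ranked within the top N. The Python sketch below computes it from a mapping of target to best acceptable rank. How you extract those ranks from the CAPRI evaluation output is left open here, and the input values are placeholders, not benchmark results.

```python
def success_rate(best_rank_by_target, top_n):
    """Fraction of targets with an acceptable model ranked within top_n.

    best_rank_by_target maps target ID -> rank (1-based) of the best-ranked
    acceptable model, or None if no acceptable model was produced.
    """
    if not best_rank_by_target:
        return 0.0
    hits = sum(
        1
        for rank in best_rank_by_target.values()
        if rank is not None and rank <= top_n
    )
    return hits / len(best_rank_by_target)


# Placeholder ranks for illustration only, not real benchmark results
placeholder = {"targetA": 1, "targetB": 7, "targetC": None, "targetD": 3}

for n in (1, 5, 10):
    print(f"top {n}: {success_rate(placeholder, n):.2f}")
```

Computing the same metric per scenario gives directly comparable numbers, provided every scenario is evaluated on the same set of targets.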

Practical Tips

Starting Small

For your first benchmark:

Use 2-3 well-characterized targets
Test 2 different scenarios
Start with small sampling numbers (100-500)
Use --setup mode to validate before full execution

Resource Management

Memory: Each job needs roughly 2-4 GB of RAM, so peak usage scales with max_concurrent
CPU: For local execution, keep max_concurrent × ncores within the number of cores available
Storage: Results can be large (1-10 GB per target)
Time: Docking runs can take hours to days
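
As a rough planning aid, the figures above can be combined into a peak estimate. The Python sketch below assumes local execution in which every concurrent job uses its full ncores and falls within the quoted RAM range. The function name estimate_peak is hypothetical.

```python
def estimate_peak(max_concurrent, ncores, ram_per_job_gb):
    """Return (total_cores, (min_ram_gb, max_ram_gb)) at full concurrency.

    ram_per_job_gb is a (low, high) range in GB, e.g. (2, 4).
    """
    low, high = ram_per_job_gb
    total_cores = max_concurrent * ncores
    return total_cores, (max_concurrent * low, max_concurrent * high)


# Values from the example configuration and the RAM range quoted above
cores, (ram_low, ram_high) = estimate_peak(
    max_concurrent=4, ncores=2, ram_per_job_gb=(2, 4)
)
print(f"{cores} cores, {ram_low}-{ram_high} GB RAM")
```

For the example configuration this gives 8 cores and 8-16 GB of RAM. If the machine has less, lower max_concurrent first.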

Common Workflows

Parameter optimization:

scenarios:
  - name: sampling-500
    workflow:
      rigidbody:
        sampling: 500
  - name: sampling-1000
    workflow:
      rigidbody:
        sampling: 1000
  - name: sampling-2000
    workflow:
      rigidbody:
        sampling: 2000
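
Sweeps like this are repetitive to write by hand. The Python sketch below generates the same scenarios block from a list of sampling values using plain string formatting, so it needs no YAML library. The function name sampling_sweep is hypothetical.

```python
def sampling_sweep(values):
    """Return YAML text with one rigidbody scenario per sampling value."""
    lines = ["scenarios:"]
    for value in values:
        lines += [
            f"  - name: sampling-{value}",
            "    workflow:",
            "      rigidbody:",
            f"        sampling: {value}",
        ]
    return "\n".join(lines) + "\n"


print(sampling_sweep([500, 1000, 2000]), end="")
```

The output can be pasted into the configuration file in place of the hand-written scenarios block.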

Restraint strategy comparison:

scenarios:
  - name: true-interface
    workflow:
      rigidbody:
        ambig_fname: _ti.tbl
  - name: hbond-only
    workflow:
      rigidbody:
        ambig_fname: _hb.tbl
  - name: center-of-mass
    workflow:
      rigidbody:
        cmrest: true

Troubleshooting

Common issues and solutions:

Input file errors:

Verify all files exist and are readable
Check file paths in your input list
Use absolute paths if relative paths don’t work
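
The first two checks can be automated. This Python sketch reports every listed path that is missing or unreadable. It resolves relative paths against a base_dir argument, which is an assumption of the sketch, so confirm how your setup resolves relative paths. The function name find_unreadable is hypothetical.

```python
import os


def find_unreadable(paths, base_dir="."):
    """Return the paths that are missing or not readable.

    Relative paths are resolved against base_dir (an assumption of this
    sketch, not a statement about haddock-runner).
    """
    problems = []
    for path in paths:
        full = path if os.path.isabs(path) else os.path.join(base_dir, path)
        if not (os.path.isfile(full) and os.access(full, os.R_OK)):
            problems.append(path)
    return problems
```

An empty result means every listed file was found and is readable from base_dir.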

HADDOCK module errors:

Ensure HADDOCK3 is properly installed
Verify all required modules are available
Check your HADDOCK3 configuration

Resource limitations:

Reduce max_concurrent if running out of memory
Lower sampling numbers for faster testing
Use --setup to validate before full runs

Permission issues:

Ensure write access to working directory
Check execution permissions for the binary
Verify HADDOCK3 has proper file access

Best Practices

File Organization

benchmark_project/
├── configs/
│   ├── benchmark.yaml
│   └── input_list.txt
├── structures/
│   ├── target1/
│   ├── target2/
│   └── ...
├── results/
│   └── (auto-generated)
└── analysis/
    └── (your analysis scripts)

Version Control

Keep configuration files in Git
Store input structures separately (large files)
Document changes between benchmark runs
Use meaningful commit messages

Reproducibility

Fix random seeds when comparing methods
Document exact HADDOCK3 version used
Record system specifications
Archive complete configuration files
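
One lightweight way to follow these practices is to store a small manifest next to each set of results. The Python sketch below records a SHA-256 hash of the configuration text together with basic system information. The HADDOCK3 version is passed in by the caller because this sketch does not query HADDOCK3. The function name build_manifest and the manifest keys are hypothetical.

```python
import hashlib
import json
import platform


def build_manifest(config_text, haddock_version):
    """Collect reproducibility metadata as a JSON-serializable dict.

    haddock_version is supplied by the caller; this sketch does not
    query HADDOCK3 itself.
    """
    digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
    return {
        "config_sha256": digest,
        "haddock_version": haddock_version,
        "platform": platform.platform(),
        "machine": platform.machine(),
    }


# Example call with a stand-in configuration snippet and version string
manifest = build_manifest(
    "general:\n  max_concurrent: 4\n", haddock_version="unknown"
)
print(json.dumps(manifest, indent=2))
```

Because the hash changes whenever the configuration changes, two runs with the same config_sha256 used the same settings.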

Next Steps

Now that you understand the basic workflow:

Set up your first benchmark → Setting Up a Benchmark
Explore example configurations → Examples
Learn about advanced features → Development
Get help with specific issues → Getting Help

Getting Help

If you encounter any issues:

Check the Troubleshooting section above
Search the GitHub Issues for similar reports
Review the HADDOCK3 documentation
Contact the support team via the channels mentioned in the main documentation
