haddock.modules.analysis.caprifilter.caprifilter module

Helper functions for the caprifilter module.

haddock.modules.analysis.caprifilter.caprifilter.collect_metrics(capri_objects: list) → dict[PDBFile, dict[str, float]]

Return {model: {metric: value}} for all CAPRI metrics.

Parameters:

capri_objects (list[CAPRI]) – List of CAPRI objects, one per model. When multiple references are given, the best reference has already been selected for each model.

Returns:

metrics_data (dict[PDBFile, dict[str, float]]) – Dictionary mapping each model’s PDBFile to a dict of metric values.
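The nested mapping described above can be pictured with a small sketch. FakeModel is a hypothetical stand-in for PDBFile (it only needs to be hashable), the metric values are made up for illustration, and metric_column is a helper invented here, not part of the module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeModel:
    """Hypothetical stand-in for PDBFile; frozen so it can be a dict key."""

    file_name: str


# Shape of the returned mapping: {model: {metric: value}}.
# The numbers below are illustrative only.
metrics_data = {
    FakeModel("model_1.pdb"): {"irmsd": 1.2, "fnat": 0.75},
    FakeModel("model_2.pdb"): {"irmsd": 6.8, "fnat": 0.10},
}


def metric_column(data, metric):
    """Pull one metric out of the nested mapping, keyed by file name."""
    return {model.file_name: values[metric] for model, values in data.items()}
```

Keying the outer dictionary by model keeps every metric for a model together, which is the layout filter_models expects as input.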

haddock.modules.analysis.caprifilter.caprifilter.filter_models(metrics_data: dict[PDBFile, dict[str, float]], filter_specs: dict[str, tuple[float, str]]) → tuple[list[PDBFile], list[PDBFile]]

Split models into kept and filtered_out based on multiple metric filters.

All filters are applied simultaneously with AND logic. A model must pass every active filter to be kept. Models with NaN for any filtered metric are always removed.

Parameters:
  • metrics_data (dict[PDBFile, dict[str, float]]) – Metric values per model, as returned by collect_metrics.

  • filter_specs (dict[str, tuple[float, str]]) – {metric: (cutoff, filter_out)}, where filter_out is ‘above’ or ‘below’. With ‘above’, models with value > cutoff are filtered out (models with value <= cutoff are kept). With ‘below’, models with value < cutoff are filtered out (models with value >= cutoff are kept).

Returns:

kept, filtered_out (tuple[list[PDBFile], list[PDBFile]]) – Models that passed every filter and models that were removed. filtered_out is not used by the caprifilter module itself; it is returned for use in tests.
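The filtering rules above (AND logic, inclusive cutoffs, NaN always removed) can be sketched as follows. This is an independent re-implementation for illustration, not the module's code: plain strings stand in for PDBFile objects, the names passes and filter_models_sketch are hypothetical, and raising ValueError on an unknown filter_out value is a choice made for this sketch.

```python
import math


def passes(value, cutoff, filter_out):
    """Return True if a single metric value survives a single filter."""
    if math.isnan(value):
        return False  # NaN never passes, whatever the direction
    if filter_out == "above":
        return value <= cutoff  # values above the cutoff are removed
    if filter_out == "below":
        return value >= cutoff  # values below the cutoff are removed
    raise ValueError(f"unknown filter_out value: {filter_out!r}")


def filter_models_sketch(metrics_data, filter_specs):
    """Split models into (kept, filtered_out); every filter must pass."""
    kept, filtered_out = [], []
    for model, metrics in metrics_data.items():
        ok = all(
            passes(metrics[metric], cutoff, direction)
            for metric, (cutoff, direction) in filter_specs.items()
        )
        (kept if ok else filtered_out).append(model)
    return kept, filtered_out


# Illustrative values only.
example_metrics = {
    "m1": {"irmsd": 1.5, "fnat": 0.6},       # passes both filters
    "m2": {"irmsd": 5.0, "fnat": 0.6},       # irmsd above the cutoff
    "m3": {"irmsd": 2.0, "fnat": math.nan},  # NaN in a filtered metric
    "m4": {"irmsd": 4.0, "fnat": 0.3},       # exactly on both cutoffs
}
example_specs = {"irmsd": (4.0, "above"), "fnat": (0.3, "below")}
kept, filtered_out = filter_models_sketch(example_metrics, example_specs)
```

Here m4 is kept because both comparisons are inclusive at the cutoff, while m3 is removed despite a good irmsd because its fnat is NaN.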

haddock.modules.analysis.caprifilter.caprifilter.get_capri_params(filter_by: list[str]) → dict[str, bool]

Return CAPRI computation params required for the given filter metrics.

Translates the user-facing filter_by metric names into the boolean computation flags expected by the CAPRI class. All six computation params are returned: those required by filter_by are set to True, the rest to False. If dockq is requested, its dependencies (fnat, irmsd, lrmsd) are also enabled automatically.

Parameters:

filter_by (list[str]) – List of metrics to filter on. Valid values: irmsd, lrmsd, ilrmsd, fnat, dockq, rmsd.

Returns:

params (dict[str, bool]) – Computation param flags ready to be injected into self.params before CAPRI jobs are created. Keys: irmsd, lrmsd, ilrmsd, fnat, dockq, global_rmsd.
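A minimal sketch of the translation described above. It assumes that the user-facing name rmsd corresponds to the global_rmsd flag, which is inferred from the two name lists rather than stated outright. The function name capri_params_sketch and the ValueError for unknown names are choices made for this sketch, not confirmed behaviour of the module.

```python
ALL_FLAGS = ("irmsd", "lrmsd", "ilrmsd", "fnat", "dockq", "global_rmsd")
DOCKQ_DEPENDENCIES = ("fnat", "irmsd", "lrmsd")


def capri_params_sketch(filter_by):
    """Map user-facing metric names to the six boolean computation flags."""
    params = {flag: False for flag in ALL_FLAGS}
    for metric in filter_by:
        # Assumption: "rmsd" is the user-facing name of "global_rmsd".
        flag = "global_rmsd" if metric == "rmsd" else metric
        if flag not in params:
            raise ValueError(f"unknown metric: {metric!r}")
        params[flag] = True
        if metric == "dockq":
            # dockq needs fnat, irmsd and lrmsd to be computed as well.
            for dependency in DOCKQ_DEPENDENCIES:
                params[dependency] = True
    return params


dockq_params = capri_params_sketch(["dockq"])
rmsd_params = capri_params_sketch(["rmsd"])
```

Requesting only dockq therefore switches on four flags (dockq plus its three dependencies) and leaves ilrmsd and global_rmsd off.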

haddock.modules.analysis.caprifilter.caprifilter.write_caprifilter(kept: list[PDBFile], capri_objects: list, filter_specs: dict[str, tuple[float, str]], fname: str = 'caprifilter.tsv') → None

Write TSV with kept models and only the user-requested metric columns.

Parameters:
  • kept (list[PDBFile]) – Models that passed all filters.

  • capri_objects (list[CAPRI]) – CAPRI objects (one per model, best reference already selected).

  • filter_specs (dict[str, tuple[float, str]]) – Active filter specifications {metric: (cutoff, filter_out)}.

  • fname (str) – Output file name.
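As a rough illustration of a table restricted to kept models and user-requested metrics, the sketch below writes tab-separated rows with the standard csv module. It differs from the real function in several labeled ways: it takes a metrics mapping instead of CAPRI objects, writes to an open handle instead of a file name, uses strings in place of PDBFile objects, and the column header "model" and the three-decimal formatting are assumptions. The actual column layout of caprifilter.tsv is not specified here.

```python
import csv
import io


def write_kept_tsv(kept, metrics_data, filter_specs, handle):
    """Write one row per kept model with only the filtered metrics."""
    metrics = list(filter_specs)  # only the metrics the user filtered on
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["model", *metrics])
    for model in kept:
        values = (f"{metrics_data[model][metric]:.3f}" for metric in metrics)
        writer.writerow([model, *values])


# Illustrative values only; lrmsd is present but not requested.
buffer = io.StringIO()
write_kept_tsv(
    kept=["m1"],
    metrics_data={
        "m1": {"irmsd": 1.5, "fnat": 0.6, "lrmsd": 3.2},
        "m2": {"irmsd": 5.0, "fnat": 0.6, "lrmsd": 9.9},
    },
    filter_specs={"irmsd": (4.0, "above"), "fnat": (0.3, "below")},
    handle=buffer,
)
tsv_text = buffer.getvalue()
```

Because the columns are derived from filter_specs, lrmsd does not appear in the output even though it was computed.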

haddock.modules.analysis.caprifilter.caprifilter.write_caprifilter_full(capri_objects: list, metrics_data: dict[PDBFile, dict[str, float]], kept: list[PDBFile], filter_specs: dict[str, tuple[float, str]], fname: str = 'caprifilter_all_models.tsv') → None

Write TSV with all models, user-requested metric columns, and a status column.

Parameters:
  • capri_objects (list[CAPRI]) – CAPRI objects (one per model, best reference already selected).

  • metrics_data (dict[PDBFile, dict[str, float]]) – Metric values per model, as returned by collect_metrics.

  • kept (list[PDBFile]) – Models that passed all filters.

  • filter_specs (dict[str, tuple[float, str]]) – Active filter specifications {metric: (cutoff, filter_out)}.

  • fname (str) – Output file name.
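The status column can be thought of as a membership test against kept. The sketch below builds the rows in memory rather than writing a file. The status labels "kept" and "filtered_out", the "model" key, and the function name status_rows are hypothetical, since the actual labels and column names are not given above; strings again stand in for PDBFile objects.

```python
def status_rows(metrics_data, kept, filter_specs):
    """Build one row per model: requested metrics plus a status label."""
    kept_set = set(kept)  # set lookup avoids rescanning the list per model
    metrics = list(filter_specs)
    rows = []
    for model, values in metrics_data.items():
        row = {"model": model}
        row.update({metric: values[metric] for metric in metrics})
        # Hypothetical labels; the real file may use different wording.
        row["status"] = "kept" if model in kept_set else "filtered_out"
        rows.append(row)
    return rows


# Illustrative values only.
rows = status_rows(
    metrics_data={"m1": {"irmsd": 1.5}, "m2": {"irmsd": 5.0}},
    kept=["m1"],
    filter_specs={"irmsd": (4.0, "above")},
)
```

Every model from metrics_data appears exactly once, so the full table can be used to check why a given model was dropped.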