haddock.modules.analysis.contactmap.contmap module

Module computing contact maps of complexes, alone or grouped by cluster.

Chord diagram functions were adapted from: https://plotly.com/python/v3/filled-chord-diagram/

class haddock.modules.analysis.contactmap.contmap.ClusteredContactMap(models: list[pathlib.Path], output: Path, params: dict)[source]

Bases: object

ContactMap analysis for set of clustered structures.

static aggregate_contacts(contacts_holder: dict, contact_keys: list[str], contacts: list[dict], key1: str, key2: str) None[source]

Aggregate single models data belonging to a cluster.

Parameters:
  • contacts_holder (dict) – Dictionary holding list of contact data

  • contact_keys (list[str]) – Order of the keys to access the dictionary

  • contacts (list[dict]) – Single model contact data.

  • key1 (str) – Name of the key to access first entry in data.

  • key2 (str) – Name of the key to access second entry in data.

run()[source]

Process analysis of contacts of a set of PDB structures.

class haddock.modules.analysis.contactmap.contmap.ContactsMap(model: Path, output: Path, params: dict)[source]

Bases: object

ContactMap analysis for single structure.

generate_output(res_res_contacts: list[dict], all_heavy_interchain_contacts: list[dict]) None[source]

Generate several outputs based on contacts.

Parameters:
  • res_res_contacts (list[dict]) – List of residue-residue contacts

  • all_heavy_interchain_contacts (list[dict]) – List of heavy atoms interchain contacts

run()[source]

Process analysis of contacts of a PDB structure.

class haddock.modules.analysis.contactmap.contmap.ContactsMapJob(output, params, name, contact_obj)[source]

Bases: SupportsRun

A Job dedicated to the running of contact maps objects.

run()[source]

Run this ContactMap Object Job.

haddock.modules.analysis.contactmap.contmap.add_chordchart_legends(fig: Figure) None[source]

Add custom legend to chordchart.

Parameters:

fig (go.Figure) – A plotly figure.

haddock.modules.analysis.contactmap.contmap.check_square_matrix(data_matrix: ndarray[Any, dtype[ScalarType]]) int[source]

Check if the matrix is a square one.

Parameters:

data_matrix (NDArray (2DArray)) – The matrix to be checked.

Returns:

nb_rows (int) – Number of rows in this matrix.

haddock.modules.analysis.contactmap.contmap.compute_distance_matrix(all_atm_coords: list[list[float]]) ndarray[Any, dtype[float64]][source]

Compute all vs all distance matrix.

Parameters:

all_atm_coords (list[list[float]]) – List of atomic coordinates.

Returns:

dist_matrix (NDFloat) – N*N distance matrix between all coordinates.
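
For illustration, an all-vs-all distance matrix can be sketched with the standard library alone. The name distance_matrix_sketch is hypothetical and the nested-list return type is an assumption made to keep the example dependency-free; the documented function returns a NumPy array.

```python
import math


def distance_matrix_sketch(coords: list[list[float]]) -> list[list[float]]:
    """Return the N*N matrix of Euclidean distances between all points."""
    # Hypothetical helper: one row per point, one column per point.
    return [[math.dist(a, b) for b in coords] for a in coords]


coords = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 1.0]]
dmat = distance_matrix_sketch(coords)
# The matrix is symmetric and its diagonal holds zeros.
```

Entry dmat[i][j] holds the distance between point i and point j, so row and column order follow the order of the input list.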

haddock.modules.analysis.contactmap.contmap.contacts_to_connect_matrix(matrix: ndarray[Any, dtype[float64]], labels: list[str]) list[list[int]][source]

Generate a connectivity matrix from a contact matrix.

Parameters:
  • matrix (NDFloat) – A square contact matrix.

  • labels (list[str]) – List of labels corresponding row & columns entries.

Returns:

connect_matrix (list[list[Union[str, int]]]) – The connectivity matrix without self contacts.

haddock.modules.analysis.contactmap.contmap.control_pts(angle: list[float], radius: float) list[tuple[float, float]][source]

Generate control points to draw a SVGpath.

Parameters:
  • angle (list[float]) – A list containing angular coordinates of the control points b0, b1, b2.

  • radius (float) – The distance from b1 to the origin O(0,0)

Returns:

control_points (list[tuple[float, float]]) – The set of control points.

Raises:

ValueError – Raised if the number of angular coordinates is not equal to 3.
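
A minimal sketch of how three angular coordinates might be turned into control points. It assumes, following the Plotly chord diagram tutorial this module cites, that b0 and b2 lie on the unit circle while b1 sits at the given radius; the name control_points_sketch is hypothetical.

```python
import math


def control_points_sketch(
    angles: list[float], radius: float
) -> list[tuple[float, float]]:
    """Convert angular coordinates of b0, b1, b2 into (x, y) points."""
    if len(angles) != 3:
        raise ValueError("expected exactly 3 angular coordinates")
    # Assumption: b0 and b2 are on the unit circle, b1 is pulled
    # toward the origin at distance `radius`.
    radii = [1.0, radius, 1.0]
    return [(r * math.cos(a), r * math.sin(a)) for r, a in zip(radii, angles)]


pts = control_points_sketch([0.0, math.pi / 2, math.pi], radius=0.2)
```

A smaller radius pulls the middle control point closer to the origin, which bends the resulting curve more strongly toward the center of the circle.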

haddock.modules.analysis.contactmap.contmap.ctrl_rib_chords(side1: tuple[float, float], side2: tuple[float, float], radius: float) list[list[tuple[float, float]]][source]

Generate polygon points used to draw ribbons.

Parameters:
  • side1 (tuple[float, float]) –

    List of angular variables of the ribbon arc ends defining

    the ribbon starting (ending) arc

  • side2 (tuple[float, float]) –

    List of angular variables of the ribbon arc ends defining

    the ribbon starting (ending) arc

  • radius (float, optional) – Circle radius size

Returns:

list[list[tuple[float, float]]] – Control points of the polygons used to draw the ribbon.

haddock.modules.analysis.contactmap.contmap.datakey_to_colorscale(data_key: str, color_scale: str = 'Greys') str[source]

Reverse the color scale if the data type requires it.

Parameters:
  • data_key (str) – A dictionary key pointing to data type.

  • color_scale (str) – Name of a base plotly color_scale.

Returns:

color_scale (str) – Possibly the reverse name of the color_scale.

haddock.modules.analysis.contactmap.contmap.extract_heavyatom_contacts(matrix: ndarray[Any, dtype[float64]], resdt: dict, res1_key: str, res2_key: str, contact_distance: float = 4.5) list[dict[str, Union[float, str]]][source]

Generate contacts data.

Parameters:
  • matrix (NDFloat) – The distance matrix.

  • resdt (dict) – Residues data with atom indices as returned by get_ordered_coords().

  • res1_key (str) – First residue of interest.

  • res2_key (str) – Second residue of interest.

  • contact_distance (float) – Distance defining a contact.

Returns:

all_contacts (list[dict[str, Union[float, str]]]) – List holding contact data

haddock.modules.analysis.contactmap.contmap.extract_pdb_coords(line: str) list[float][source]

Extract coordinates from a PDB line.

Parameters:

line (str) – A standard ATOM/HETATM pdb record.

Returns:

coords (list[float]) – List of the X, Y and Z coordinates of this atom.
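
As a sketch, the PDB format stores X, Y and Z in fixed columns 31-38, 39-46 and 47-54 of an ATOM/HETATM record, so they can be read by slicing. The helper name pdb_coords_sketch is hypothetical, and whether the documented function slices in exactly this way is an assumption.

```python
def pdb_coords_sketch(line: str) -> list[float]:
    """Read X, Y, Z from the fixed-width coordinate columns."""
    # Columns are 1-based in the PDB format, hence the 0-based slices.
    return [float(line[30:38]), float(line[38:46]), float(line[46:54])]


# Pad the record prefix to 30 characters so coordinates start at column 31.
prefix = "ATOM      1  N   MET A   1".ljust(30)
record = prefix + f"{27.34:8.3f}{24.43:8.3f}{2.614:8.3f}"
xyz = pdb_coords_sketch(record)
# xyz == [27.34, 24.43, 2.614]
```

Slicing by column is preferable to splitting on whitespace, since adjacent coordinate fields can touch when values are large or negative.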

haddock.modules.analysis.contactmap.contmap.extract_pdb_dt(path: Path) dict[source]

Read and extract ATOM/HETATM records from a pdb file.

Parameters:

path (Path) – Path to a pdb file.

Returns:

pdb_chains (dict) – A dictionary of the pdb file accessible using chains as keys.

haddock.modules.analysis.contactmap.contmap.extract_submatrix(matrix: ndarray[Any, dtype[float64]], indices: list[int], indices2: list[int] | None = None) ndarray[Any, dtype[float64]][source]

Extract submatrix based on desired indices.

Parameters:
  • matrix (NDFloat) – An N*N matrix.

  • indices (list[int]) – List of row indices to extract from this matrix.

  • indices2 (list[int], optional) – List of column indices to extract from this matrix. If unspecified, indices2 == indices and a symmetric matrix is extracted.

Returns:

submat (NDFloat) – The extracted submatrix.
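
The row and column selection described above can be sketched on plain nested lists. The name submatrix_sketch is hypothetical, and nested lists are an assumption made for a dependency-free example; the documented function works on NumPy arrays.

```python
from typing import Optional


def submatrix_sketch(
    matrix: list[list[float]],
    rows: list[int],
    cols: Optional[list[int]] = None,
) -> list[list[float]]:
    """Keep only the requested rows and columns of a matrix."""
    # When no column indices are given, reuse the row indices.
    cols = rows if cols is None else cols
    return [[matrix[r][c] for c in cols] for r in rows]


m = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
sym = submatrix_sketch(m, [0, 2])        # [[0, 2], [2, 0]]
rect = submatrix_sketch(m, [0], [1, 2])  # [[1, 2]]
```

Passing distinct row and column indices yields a rectangular block, which is the natural way to isolate contacts between two different groups of atoms.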

haddock.modules.analysis.contactmap.contmap.gen_contact_dt(matrix: ndarray[Any, dtype[float64]], resdt: dict, res1_key: str, res2_key: str) dict[source]

Generate contacts data.

Parameters:
  • matrix (NDFloat) – The distance matrix.

  • resdt (dict) – Residues data with atom indices as returned by get_ordered_coords().

  • res1_key (str) – First residue of interest.

  • res2_key (str) – Second residue of interest

Returns:

cont_dt (dict) – Dictionary holding contact data

haddock.modules.analysis.contactmap.contmap.get_all_ideograms_ends(chains: dict, gap: float = 0.031415926535897934) tuple[list[tuple[float, float]], list[tuple[float, float]]][source]

Generate both chain and residues ideograms ends.

Parameters:
  • chains (dict) – Dictionary mapping to list of residues labels.

  • gap (float, optional) – Gap distance used to separate two ideograms, by default 2*PI*0.005

Returns:

  • tuple[ideo_ends, chain_ideo_ends] – A tuple containing residues ideo ends and chains ideo ends.

  • ideo_ends (list[tuple[float, float]]) – List of residues ideograms start and ending positions.

  • chain_ideo_ends (list[tuple[float, float]]) – List of chain ideograms start and ending positions.

haddock.modules.analysis.contactmap.contmap.get_chains_ideograms_ends(chains: dict[str, list[str]], gap: float = 0.031415926535897934) tuple[list[tuple[float, float]], numpy.ndarray[Any, numpy.dtype[numpy.float64]]][source]

Build ideogram ends to represent protein chains.

Parameters:
  • chains (dict[str, list[str]]) – Dictionary mapping chains with their respective set of residues labels.

  • gap (float, optional) – Gap between two ideograms, by default 2*PI*0.005

Returns:

  • chain_ideo_ends (list[tuple[float, float]]) – Ideogram ends to represent protein chains.

  • chain_ideogram_length (NDFloat)

haddock.modules.analysis.contactmap.contmap.get_clusters_sets(models: list[haddock.libs.libontology.PDBFile]) dict[source]

Split models by clusters ids.

Parameters:

models (list) – List of pdb models/complexes.

Returns:

clusters_sets (dict) – Dictionary of models accessible by their cluster ids as keys.

haddock.modules.analysis.contactmap.contmap.get_cont_type(resn1: str, resn2: str) str[source]

Generate polarity key between two residues.

Parameters:
  • resn1 (str) – 3-letter code of the first residue.

  • resn2 (str) – 3-letter code of the second residue.

Returns:

pol_key (str) – Combined residues polarities

haddock.modules.analysis.contactmap.contmap.get_ideogram_ends(ideogram_len: ndarray[Any, dtype[float64]], gap: float) list[tuple[float, float]][source]

Generate ideogram ends.

Parameters:
  • ideogram_len (NDArray) – Length of each ideogram.

  • gap (float) – Gap to add in between each ideogram.

Returns:

ideo_ends (list[tuple[float]]) – List of start and end positions for each ideogram.
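
One way to lay ideograms end to end with a gap between them is sketched below. The name ideogram_ends_sketch is hypothetical and starting the first ideogram at angle 0 is an assumption.

```python
def ideogram_ends_sketch(
    lengths: list[float], gap: float
) -> list[tuple[float, float]]:
    """Return (start, end) positions of consecutive ideograms."""
    ends = []
    left = 0.0  # assumption: the first ideogram starts at 0
    for length in lengths:
        right = left + length
        ends.append((left, right))
        left = right + gap  # the next ideogram starts after the gap
    return ends


ends = ideogram_ends_sketch([1.0, 2.0], gap=0.5)
# ends == [(0.0, 1.0), (1.5, 3.5)]
```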

haddock.modules.analysis.contactmap.contmap.get_ordered_coords(pdb_chains: dict) tuple[list[list[float]], list[str], dict][source]

Generate list of all atom coordinates.

Parameters:

pdb_chains (dict) – A dictionary of the pdb file accessible using chains as keys, as provided by the extract_pdb_dt() function.

Returns:

  • all_coords (list[list[float]]) – All atomic coordinates in a single list.

  • resid_keys (list[str]) – Ordered list of residues keys.

  • resid_dt (dict) – Dictionary of coordinates indices for each residue.

haddock.modules.analysis.contactmap.contmap.invPerm(perm: list[int]) list[int][source]

Generate the inverse of a permutation.

Parameters:

perm (list[int]) – A permutation.

Returns:

inv (list[int]) – Inverse of a permutation.
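
As a short illustration of what inverting a permutation means: if perm sends position i to value perm[i], the inverse sends that value back to i. The name inverse_permutation_sketch is hypothetical.

```python
def inverse_permutation_sketch(perm: list[int]) -> list[int]:
    """Return inv such that inv[perm[i]] == i for every index i."""
    inv = [0] * len(perm)
    for position, value in enumerate(perm):
        inv[value] = position
    return inv


inv = inverse_permutation_sketch([2, 0, 1])
# inv == [1, 2, 0]
```

Applying the permutation and then its inverse restores the original order, which is useful for undoing a sort.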

haddock.modules.analysis.contactmap.contmap.make_chordchart(_contact_matrix: list[list[int]], _dist_matrix: list[list[float]], _interttype_matrix: list[list[str]], _labels: list[str], gap: float = 0.031415926535897934, output_fpath: str | Path = 'chordchart.html', title: str = 'Chord diagram', offline: bool = False) str | Path[source]

Generate a plotly chordchart graph.

Parameters:
  • _contact_matrix (list[list[int]]) – The contact matrix

  • _dist_matrix (list[list[float]]) – The distance matrix

  • _interttype_matrix (list[list[str]]) – The interaction type matrix

  • _labels (list[str]) – Labels of each matrix row (and column, as the matrix is expected to be symmetric)

  • gap (float, optional) – Gap between two ideograms, by default 2*PI*0.005

  • output_fpath (Union[str, Path], optional) – Path to the output file, by default ‘chordchart.html’

  • title (str, optional) – Title to give to the diagram, by default ‘Chord diagram’

Returns:

output_fpath (Union[str, Path]) – Path to the generated output file.

haddock.modules.analysis.contactmap.contmap.make_contactmap_report(contactmap_jobs: list[haddock.modules.analysis.contactmap.contmap.ContactsMapJob], outputpath: str | Path) str | Path[source]

Generate a HTML navigation page holding all generated files.

Parameters:
  • contactmap_jobs (list[ContactsMapJob]) – All the terminated jobs

  • outputpath (Union[str, Path]) – Output filepath where to write the report.

Returns:

outputpath (Union[str, Path]) – Path to the generated report.

haddock.modules.analysis.contactmap.contmap.make_ideo_shape(path: str, line_color: str, fill_color: str) dict[source]

Generate data to draw an ideogram shape.

Parameters:
  • path (str) – A SVGPath to be drawn.

  • line_color (str) – Color of the shape boundary.

  • fill_color (str) – Shape filling color for the ribbon shape.

Returns:

dict – Data enabling to draw an ideogram shape in layout.

haddock.modules.analysis.contactmap.contmap.make_ideogram_arc(radius: float, _phi: tuple[float, float], nb_points: float = 50) ndarray[Any, dtype[float64]][source]

Generate an ideogram arc.

Parameters:
  • radius (float) – The circle radius.

  • _phi (tuple[float, float]) – Tuple of ends angle coordinates of an arc.

  • nb_points (float) – Parameter that controls the number of points to be evaluated on an arc

Returns:

arc_positions (NDArray) – Array of 2D coordinates defining an arc.

haddock.modules.analysis.contactmap.contmap.make_layout(title: str, plot_size: float, layout_shapes: list[dict]) Layout[source]

Generate the chart layout.

Parameters:
  • title (str) – Title to be given to the chart.

  • plot_size (float) – Size of the chart.

  • layout_shapes (list[dict]) – Shapes to be drawn.

Returns:

layout (go.Layout) – The plotly layout.

haddock.modules.analysis.contactmap.contmap.make_q_bezier(control_points: list[tuple[float, float]]) str[source]

Define the Plotly SVG path for a quadratic Bezier curve defined by the list of its control points.

Parameters:

control_points (list[tuple[float, float]]) – List of control points

Returns:

svgpath (str) – An SVG path
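
In SVG path syntax, a quadratic Bezier curve is written as a move command M to the start point followed by a Q command with the control point and the end point. The sketch below builds such a string; the name q_bezier_path_sketch is hypothetical and the exact spacing used by the documented function is not known from this page.

```python
def q_bezier_path_sketch(control_points: list[tuple[float, float]]) -> str:
    """Build an SVG path string for a quadratic Bezier curve."""
    if len(control_points) != 3:
        raise ValueError("a quadratic Bezier curve needs 3 control points")
    (x0, y0), (x1, y1), (x2, y2) = control_points
    # M: move to the start point; Q: control point, then end point.
    return f"M {x0},{y0} Q {x1},{y1} {x2},{y2}"


path = q_bezier_path_sketch([(1.0, 0.0), (0.0, 0.2), (-1.0, 0.0)])
# path == "M 1.0,0.0 Q 0.0,0.2 -1.0,0.0"
```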

haddock.modules.analysis.contactmap.contmap.make_ribbon(side1: tuple[float, float], side2: tuple[float, float], line_color: str, fill_color: str, radius: float = 0.2) dict[source]

Generate data to draw a ribbon.

Parameters:
  • side1 (list[float]) –

    List of angular variables of first ribbon arc ends defining

    the ribbon starting (ending) arc.

  • side2 (list[float]) –

    List of angular variables of the other ribbon arc ends defining

    the ribbon starting (ending) arc.

  • line_color (str) – Color of the shape boundary.

  • fill_color (str) – Shape filling color for the ribbon shape.

  • radius (float, optional) – Circle radius size, by default 0.2.

Returns:

dict – Data enabling to draw a ribbon in layout.

haddock.modules.analysis.contactmap.contmap.make_ribbon_arc(theta0: float, theta1: float) str[source]

Generate a SVGpath to draw a ribbon arc.

Parameters:
  • theta0 (float) – Starting angle value

  • theta1 (float) – Ending angle value

Returns:

string_arc (str) – A string representing the SVGpath of the ribbon arc.

Raises:
  • ValueError – If provided theta0 and theta1 angles are incorrect for a ribbon.

  • ValueError – If the angle coordinates for an arc side of a ribbon are not in the appropriate range [0, 2*pi]

haddock.modules.analysis.contactmap.contmap.make_ribbon_ends(matrix: ndarray[Any, dtype[ScalarType]], row_sum: list[int], ideo_ends: list[tuple[float, float]], L: int) list[list[tuple[float, float]]][source]

Generate all connecting ribbons coordinates.

Parameters:
  • matrix (NDArray) – The data matrix.

  • row_sum (list[int]) – Number of connections in each row.

  • ideo_ends (list[tuple[float, float]]) – List of start and end positions for each ideogram.

Returns:

ribbon_boundary (list[list[tuple[float, float]]]) – Matrix of per residue ribbons start and end positions.

haddock.modules.analysis.contactmap.contmap.min_dist(matrix: ndarray[Any, dtype[float64]]) float[source]

Find minimum value in a matrix.

haddock.modules.analysis.contactmap.contmap.moduloAB(val: float, lb: float, ub: float) float[source]

Map a real number onto the unit circle.

The unit circle is identified with the interval [lb, ub), ub-lb=2*PI.

Parameters:
  • val (float) – The value to be mapped into the unit circle.

  • lb (float) – The lower boundary.

  • ub (float) – The upper boundary

Returns:

moduloab (float) – The modulo of val between lb and ub
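
The wrapping into [lb, ub) can be sketched with Python's modulo operator, which returns a non-negative result for a positive divisor. The name modulo_ab_sketch is hypothetical, and raising ValueError for an empty interval is an assumption, not documented behaviour.

```python
import math


def modulo_ab_sketch(val: float, lb: float, ub: float) -> float:
    """Wrap val into the half-open interval [lb, ub)."""
    if lb >= ub:
        # Assumption: an empty or reversed interval is rejected.
        raise ValueError("lb must be smaller than ub")
    return (val - lb) % (ub - lb) + lb


wrapped = modulo_ab_sketch(2.5 * math.pi, 0.0, 2 * math.pi)
# wrapped is close to pi / 2
negative = modulo_ab_sketch(-1.0, 0.0, 4.0)
# negative == 3.0
```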

haddock.modules.analysis.contactmap.contmap.split_labels_by_chains(labels: list[str]) dict[str, list[str]][source]

Map each label to its chain.

Parameters:

labels (list[str]) – List of residues keys. e.g.: A-SER-123 (chain A, serine 123)

Returns:

chains (dict[str, list[str]]) – Dictionary mapping chains with their respective set of residues labels.
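
A minimal sketch of the grouping, assuming the chain identifier is the text before the first hyphen, as in the A-SER-123 example above. The name split_by_chain_sketch is hypothetical.

```python
def split_by_chain_sketch(labels: list[str]) -> dict[str, list[str]]:
    """Group residue labels by their chain identifier."""
    chains: dict[str, list[str]] = {}
    for label in labels:
        # Assumption: labels look like "<chain>-<resname>-<resid>".
        chain_id = label.split("-")[0]
        chains.setdefault(chain_id, []).append(label)
    return chains


chains = split_by_chain_sketch(["A-SER-123", "A-GLY-124", "B-LYS-7"])
# chains == {"A": ["A-SER-123", "A-GLY-124"], "B": ["B-LYS-7"]}
```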

haddock.modules.analysis.contactmap.contmap.to_color_weight(distance: float, max_dist: float, min_dist: float = 2.0, min_weight: float = 0.2, max_weight: float = 0.9) float[source]

Compute color weight based on distance.

Parameters:
  • distance (float) – The distance to weight.

  • max_dist (float) – The max distance observed in the dataset.

  • min_dist (float, optional) – The minimum distance observed in the dataset, by default 2.

  • min_weight (float, optional) – Color weight for the maximum distance, by default 0.2

  • max_weight (float, optional) – Color weight for the minimum distance, by default 0.90

Returns:

weight (float) – The color weight, in range [min_weight, max_weight].
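
The exact formula is not given on this page. One plausible reading, sketched below, is a clamped linear interpolation in which the shortest distance receives max_weight and the longest receives min_weight. The name color_weight_sketch is hypothetical, the linear form is an assumption, and max_dist is assumed to be strictly greater than min_dist.

```python
def color_weight_sketch(
    distance: float,
    max_dist: float,
    min_dist: float = 2.0,
    min_weight: float = 0.2,
    max_weight: float = 0.9,
) -> float:
    """Map a distance to a weight: short distances get heavy colors."""
    # Clamp so that the result always stays in [min_weight, max_weight].
    clamped = min(max(distance, min_dist), max_dist)
    # fraction is 0 at min_dist and 1 at max_dist (assumes max_dist > min_dist).
    fraction = (clamped - min_dist) / (max_dist - min_dist)
    return max_weight - fraction * (max_weight - min_weight)


w_near = color_weight_sketch(2.0, max_dist=7.5)  # heaviest color
w_far = color_weight_sketch(7.5, max_dist=7.5)   # lightest color
```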

haddock.modules.analysis.contactmap.contmap.to_full_matrix(half_matrix: list[Union[int, float, str]], diag_val: int | float | str) ndarray[Any, dtype[ScalarType]][source]

Generate a full matrix from a half matrix.

Parameters:
  • half_matrix (list[Any]) – Values of the N*(N-1)/2 half matrix.

  • diag_val (Any) – Value to be placed in diagonal of the full matrix.

Returns:

matrix (NDArray) – The reconstituted full matrix.
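
To illustrate the reconstruction, the sketch below recovers N from the N*(N-1)/2 values and mirrors them across the diagonal. It assumes the half matrix lists pairs in row-major upper-triangle order, (0,1), (0,2), ..., (1,2), ..., which this page does not confirm; the name full_matrix_sketch is hypothetical and nested lists stand in for the NumPy array.

```python
import math


def full_matrix_sketch(half: list, diag_val) -> list[list]:
    """Rebuild a symmetric N*N matrix from its N*(N-1)/2 upper values."""
    # Solve N*(N-1)/2 == len(half) for N.
    n = (1 + math.isqrt(1 + 8 * len(half))) // 2
    if n * (n - 1) // 2 != len(half):
        raise ValueError("length is not N*(N-1)/2 for any integer N")
    full = [[diag_val] * n for _ in range(n)]
    values = iter(half)
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: values are ordered row by row above the diagonal.
            full[i][j] = full[j][i] = next(values)
    return full


full = full_matrix_sketch([1, 2, 3], diag_val=0)
# full == [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
```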

haddock.modules.analysis.contactmap.contmap.to_nice_label(label: str) str[source]

Convert a label into a user friendly label.

Parameters:

label (str) – Label name as found in csv

Returns:

nicelabel (str) – User friendly description of the label.

haddock.modules.analysis.contactmap.contmap.to_rgba_color_string(connect_color: tuple[int, int, int], alpha: float) str[source]

Generate a rgba string from list of colors and alpha.

Parameters:
  • connect_color (list[int]) – A 3-values list of integers defining the red, green and blue colors.

  • alpha (float) – color_weight

Returns:

rgba_color (str) – The html like rgba colors. e.g.: ‘rgba(123, 123, 123, 0.5)’
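
A small sketch producing the format shown in the example above. The name rgba_string_sketch is hypothetical.

```python
def rgba_string_sketch(rgb: tuple[int, int, int], alpha: float) -> str:
    """Format red, green, blue and alpha as a CSS-style rgba string."""
    red, green, blue = rgb
    return f"rgba({red}, {green}, {blue}, {alpha})"


color = rgba_string_sketch((123, 123, 123), 0.5)
# color == "rgba(123, 123, 123, 0.5)"
```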

haddock.modules.analysis.contactmap.contmap.topX_models(models: list[haddock.libs.libontology.PDBFile], topX: int = 10) list[Any][source]

Sort and return subset of top X best models.

Parameters:
  • models (list) – List of pdb models/complexes.

  • topX (int) – Number of models to return after sorting.

Returns:

subset_bests (list) – List of top X best models.

haddock.modules.analysis.contactmap.contmap.tsv_to_chordchart(tsv_path: Path, sep: str = '\t', data_key: str = 'ca-ca-dist', contact_threshold: float = 7.5, filter_intermolecular_contacts: bool = True, output_fname: Path | str = 'contacts_chordchart.html', title: str = 'Chord diagram', offline: bool = False) Path | str[source]

Read a tsv file and generate a chord diagram from it.

Parameters:
  • tsv_path (Path) – Path to the .tsv file containing contact data.

  • sep (str) – Separator character used to split data in each line.

  • data_key (str) – Data key used to draw the plot.

  • contact_threshold (float) – Upper boundary of maximum value to be plotted. Any value above it will be set to this value.

  • output_fname (Union[Path, str]) – Path where to generate the graph.

  • title (str) – Title to give to the Chord diagram

Returns:

chord_chart_fpath (Union[Path, str]) – Path to the generated graph

haddock.modules.analysis.contactmap.contmap.tsv_to_heatmap(tsv_path: Path, sep: str = '\t', data_key: str = 'ca-ca-dist', contact_threshold: float = 7.5, colorscale: str = 'Greys', output_fname: Path | str = 'contacts.html', offline: bool = False) Path | str[source]

Read a tsv file and generate a heatmap from it.

Parameters:
  • tsv_path (Path) – Path to the .tsv file containing contact data.

  • sep (str) – Separator character used to split data in each line.

  • data_key (str) – Data key used to draw the plot.

  • contact_threshold (float) – Upper boundary of maximum value to be plotted. Any value above it will be set to this value.

  • output_fname (Path) – Path to the generated graph.

Returns:

output_filepath (Union[Path, str]) – Path to the generated file.

haddock.modules.analysis.contactmap.contmap.within_2PI(val: float) bool[source]

Check if float value is within unit circle value range.

Parameters:

val (float) – The value to be tested.

haddock.modules.analysis.contactmap.contmap.write_res_contacts(res_res_contacts: list[dict], header: list[str], path: Path | str, sep: str = '\t', interchain_data: bool | dict | None = None) Path[source]

Write a tsv file based on residues-residues contacts data.

Parameters:
  • res_res_contacts (list[dict]) – List of dict holding data for each residue-residue contacts.

  • header (list[str]) – Ordered list of keys to access in the dicts.

  • path (Path) – Path to the output file to generate.

  • sep (str) – Character used to separate data within a line.

Returns:

path (Path) – Path to the generated file.