filters

Module for filtering structure files and their contents.

`ResidueFilterStatistics` `dataclass`

Statistics for filtering files based on residue count in a specific chain.

Parameters:

Name	Type	Description	Default
`input_file`	`Path`	The path to the input file.	required
`residue_count`	`int`	The number of residues.	required
`passed`	`bool`	Whether the file passed the filtering criteria.	required
`output_file`	`Path | None`	The path to the output file, or `None` if the file did not pass.	required
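
To make the relationship between `passed` and `output_file` concrete, here is a minimal sketch of a dataclass with the four documented fields. It is a stand-in written for illustration, not the library's source; the file names are made up, and the assumption that `output_file` is `None` for rejected files follows from the documented type.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ResidueFilterStatistics:
    """Stand-in mirroring the documented fields; not the library's source."""

    input_file: Path
    residue_count: int
    passed: bool
    output_file: Optional[Path]


# A file that passed carries the path it was written to ...
kept = ResidueFilterStatistics(Path("in/1abc.cif"), 120, True, Path("out/1abc.cif"))
# ... while a rejected file is assumed to have no output path.
dropped = ResidueFilterStatistics(Path("in/2xyz.cif"), 12, False, None)

# Collect the output paths of the files that passed.
passed_outputs = [s.output_file for s in (kept, dropped) if s.passed]
```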

Filter mmCIF/PDB files by chain.

Parameters:

Name	Type	Description	Default
`file2chains`	`Collection[tuple[Path, str]]`	Which chain to keep for each file. In each tuple, the first item is the PDB/mmCIF file path and the second item is the chain ID.	required
`output_dir`	`Path`	The directory where the filtered files will be written.	required
`out_chain`	`str`	The chain ID under which the kept chain is written.	`'A'`
`scheduler_address`	`str | Cluster | Literal['sequential'] | None`	The address of the Dask scheduler. If not provided, a local cluster is created. If set to `sequential`, tasks run sequentially.	`None`
`copy_method`	`CopyMethod`	How to copy when a direct copy is possible.	`'copy'`

Returns:

Type	Description
`list[ChainFilterStatistics]`	Result of the filtering process.
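
The `file2chains` argument is a collection of `(path, chain ID)` pairs. The sketch below shows one way such a collection might be assembled before calling the filter. `build_file2chains`, the mapping `chain_by_stem`, and the file names are all hypothetical and exist only for this illustration; they are not part of the module.

```python
from pathlib import Path


def build_file2chains(files, chain_by_stem):
    """Pair each structure file with the chain ID to keep (hypothetical helper)."""
    pairs = []
    for path in files:
        chain = chain_by_stem.get(path.stem)
        if chain is not None:  # skip files that have no chain assigned
            pairs.append((path, chain))
    return pairs


files = [
    Path("structures/1abc.cif"),
    Path("structures/2xyz.pdb"),
    Path("structures/3def.cif"),
]
# Chain to keep per file, keyed by file name without extension.
chain_by_stem = {"1abc": "B", "2xyz": "A"}

file2chains = build_file2chains(files, chain_by_stem)
```

The resulting list has the shape the `file2chains` parameter describes: one tuple per file, with the path first and the chain ID second.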

Filter PDB/mmCIF files by the number of residues in a given chain.

Parameters:

Name	Type	Description	Default
`input_files`	`list[Path]`	The list of input PDB/mmCIF files.	required
`output_dir`	`Path`	The directory where the filtered files will be written.	required
`min_residues`	`int`	The minimum number of residues in the chain.	required
`max_residues`	`int`	The maximum number of residues in the chain.	required
`chain`	`str`	The ID of the chain whose residues are counted.	`'A'`
`copy_method`	`CopyMethod`	How to copy files that passed the filter to the output directory.	`'copy'`

Yields:

Type	Description
`Generator[ResidueFilterStatistics]`	Objects containing information about the filtering process for each input file.
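
The generator behaviour described above can be sketched roughly as follows. This is not the module's implementation: `ResidueStats`, `filter_by_residue_count`, and the `count_residues` callable are hypothetical stand-ins, residue counts are faked instead of being parsed from structure files, the bounds are assumed to be inclusive, and `shutil.copy` stands in for whatever `copy_method` selects.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterator, List, Optional


@dataclass
class ResidueStats:
    """Hypothetical stand-in with the fields documented for ResidueFilterStatistics."""

    input_file: Path
    residue_count: int
    passed: bool
    output_file: Optional[Path]


def filter_by_residue_count(
    input_files: List[Path],
    output_dir: Path,
    min_residues: int,
    max_residues: int,
    count_residues: Callable[[Path], int],
) -> Iterator[ResidueStats]:
    """Yield one statistics object per input file, copying files that pass."""
    output_dir.mkdir(parents=True, exist_ok=True)
    for input_file in input_files:
        n = count_residues(input_file)
        # Assumption: both bounds are inclusive.
        passed = min_residues <= n <= max_residues
        output_file = None
        if passed:
            output_file = output_dir / input_file.name
            shutil.copy(input_file, output_file)
        yield ResidueStats(input_file, n, passed, output_file)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Fake residue counts; a real filter would parse the chain from each file.
    fake_counts = {"small.cif": 10, "medium.cif": 150, "large.cif": 900}
    inputs = []
    for name in fake_counts:
        path = root / name
        path.write_text("placeholder")
        inputs.append(path)

    results = list(
        filter_by_residue_count(
            inputs, root / "out", 100, 500, lambda p: fake_counts[p.name]
        )
    )
    passed_names = [r.input_file.name for r in results if r.passed]
    copied = sorted(p.name for p in (root / "out").iterdir())
```

Because the function is a generator, nothing is counted or copied until the caller iterates over it; wrapping the call in `list(...)` as above forces all files to be processed.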