filters
Module for filtering structure files and their contents.
ResidueFilterStatistics
dataclass
Statistics for filtering files based on residue count in a specific chain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_file | Path | The path to the input file. | required |
residue_count | int | The number of residues. | required |
passed | bool | Whether the file passed the filtering criteria. | required |
output_file | Path \| None | The path to the output file, if passed. | required |
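As a rough illustration of the shape of this record, the sketch below mirrors the four documented fields in a plain dataclass. The class name `ResidueFilterStatisticsSketch` and the file name `example.pdb` are hypothetical stand-ins; this is not the library's own source.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ResidueFilterStatisticsSketch:
    """Hypothetical stand-in with the same fields as the documented dataclass."""

    input_file: Path              # the file that was inspected
    residue_count: int            # residues counted in the chosen chain
    passed: bool                  # True when the count met the criteria
    output_file: Optional[Path]   # where the file was written, None if it did not pass


# A record for a file that did not pass has no output file.
rejected = ResidueFilterStatisticsSketch(
    input_file=Path("example.pdb"),  # hypothetical file name
    residue_count=42,
    passed=False,
    output_file=None,
)
```

Since every field is marked as required, all four values are passed explicitly, including `output_file=None` for a rejected file.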
filter_files_on_chain(file2chains, output_dir, out_chain='A', scheduler_address=None, copy_method='copy')
Filter mmCIF/PDB files by chain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file2chains | Collection[tuple[Path, str]] | Which chain to keep for each PDB file. First item is the PDB file path, second item is the chain ID. | required |
output_dir | Path | The directory where the filtered files will be written. | required |
out_chain | str | Under what name to write the kept chain. | 'A' |
scheduler_address | str \| Cluster \| Literal['sequential'] \| None | The address of the Dask scheduler. If not provided, will create a local cluster. If set to … | None |
copy_method | CopyMethod | How to copy when a direct copy is possible. | 'copy' |
Returns:
Type | Description |
---|---|
list[ChainFilterStatistics] | Result of the filtering process. |
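To make the `file2chains` and `out_chain` parameters more concrete, here is a minimal, self-contained sketch of the idea for PDB-format text only: keep the coordinate records of one chain and write them under a new chain ID. The helper `keep_chain` and the file names are hypothetical, and the fixed-column parsing is an assumption for illustration; the documented function may work differently and also handles mmCIF.

```python
from pathlib import Path

# Shape of the `file2chains` argument: (file path, chain ID to keep) pairs.
# The file names here are hypothetical.
file2chains = [
    (Path("structure1.pdb"), "B"),
    (Path("structure2.pdb"), "A"),
]


def keep_chain(pdb_text: str, chain: str, out_chain: str = "A") -> str:
    """Keep ATOM/HETATM records of `chain` and rename the chain to `out_chain`.

    Relies on the fixed-column PDB layout, where the chain ID is column 22
    (string index 21). This is an illustrative stand-in, not the library code.
    """
    kept = []
    for line in pdb_text.splitlines():
        is_coordinate = line.startswith(("ATOM", "HETATM"))
        if is_coordinate and len(line) > 21 and line[21] == chain:
            # Rewrite only the chain ID column, leave the rest untouched.
            kept.append(line[:21] + out_chain + line[22:])
    return "\n".join(kept) + "\n" if kept else ""
```

Under these assumptions, a file whose kept chain is `B` would end up with that chain labelled `A` in the output when `out_chain` is left at its default.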
filter_files_on_residues(input_files, output_dir, min_residues, max_residues, chain='A', copy_method='copy')
Filter PDB/mmCIF files by number of residues in given chain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_files | list[Path] | The list of input PDB/mmCIF files. | required |
output_dir | Path | The directory where the filtered files will be written. | required |
min_residues | int | The minimum number of residues in chain. | required |
max_residues | int | The maximum number of residues in chain. | required |
chain | str | The chain to count residues of. | 'A' |
copy_method | CopyMethod | How to copy passed files to output directory. | 'copy' |
Yields:
Type | Description |
---|---|
Generator[ResidueFilterStatistics] | Objects containing information about the filtering process for each input file. |
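Because the function yields one statistics object per input file, results can be consumed lazily. The sketch below imitates that behaviour for PDB-format files only, so the flow can be tried without the library. The names `ResidueStats`, `count_residues`, and `filter_on_residues_sketch` are hypothetical, inclusive bounds are an assumption, and a plain `shutil.copy` stands in for the `copy_method` option.

```python
import shutil
from collections.abc import Iterator
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ResidueStats:
    """Hypothetical stand-in mirroring the documented statistics fields."""

    input_file: Path
    residue_count: int
    passed: bool
    output_file: Optional[Path]


def count_residues(pdb_file: Path, chain: str) -> int:
    """Count distinct residues of `chain` using fixed PDB columns."""
    seen = set()
    for line in pdb_file.read_text().splitlines():
        if line.startswith("ATOM") and len(line) > 26 and line[21] == chain:
            seen.add(line[22:27])  # residue number plus insertion code
    return len(seen)


def filter_on_residues_sketch(
    input_files: list[Path],
    output_dir: Path,
    min_residues: int,
    max_residues: int,
    chain: str = "A",
) -> Iterator[ResidueStats]:
    """Yield one record per file; copy files whose count is within bounds."""
    output_dir.mkdir(parents=True, exist_ok=True)
    for input_file in input_files:
        count = count_residues(input_file, chain)
        # Assumption: both bounds are inclusive.
        passed = min_residues <= count <= max_residues
        output_file = None
        if passed:
            output_file = output_dir / input_file.name
            shutil.copy(input_file, output_file)
        yield ResidueStats(input_file, count, passed, output_file)
```

Since this is a generator, nothing is counted or copied until it is iterated, for example with `list(...)` or a `for` loop.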