filters
FilterStat
dataclass
Statistics for filtering files based on residue count in a specific chain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_file | Path | The path to the input file. | required |
residue_count | int | The number of residues. | required |
passed | bool | Whether the file passed the filtering criteria. | required |
output_file | Path \| None | The path to the output file, if passed. | required |
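As a rough illustration of the record described above, the sketch below defines a local stand-in dataclass with the same four documented fields. It is not the library's own class definition, and the file names are hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FilterStat:
    """Local stand-in that mirrors the documented fields (an assumption, not the library's code)."""

    input_file: Path
    residue_count: int
    passed: bool
    output_file: Optional[Path]  # same meaning as the documented `Path | None`


# A file that passed keeps an output path; a rejected file has None instead.
kept = FilterStat(Path("in/1abc.cif"), 120, True, Path("out/1abc.cif"))
dropped = FilterStat(Path("in/2xyz.cif"), 12, False, None)
```

Checking `output_file is None` alongside `passed` is a cheap way to guard against using a path that was never written.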
filter_files_on_chain(input_dir, id2chains, output_dir, scheduler_address=None, out_chain='A')
Filter mmCIF/PDB files by chain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_dir | Path | The directory containing the input mmCIF/PDB files. | required |
id2chains | dict[str, str] | Which chain to keep for each PDB ID. The key is the PDB ID, the value is the chain ID. | required |
output_dir | Path | The directory where the filtered files will be written. | required |
scheduler_address | str \| Cluster \| None | The address of the Dask scheduler. | None |
out_chain | str | The name under which the kept chain is written. | 'A' |
Returns:
Type | Description |
---|---|
list[tuple[str, str, Path \| None]] | A list of tuples containing the PDB ID, chain ID, and path to the filtered file. The last tuple item is None if something went wrong, for example if the chain is not present. |
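Because a failed entry is signalled only by None in the last tuple position, callers will usually want to separate successes from failures. The sketch below works on a hand-written sample shaped like the documented return value. `split_results`, the sample IDs, and the file names are hypothetical and not part of the library.

```python
from pathlib import Path
from typing import Optional

# Shape of one documented result: (PDB ID, chain ID, path or None).
Result = tuple[str, str, Optional[Path]]


def split_results(
    results: list[Result],
) -> tuple[dict[str, Path], list[tuple[str, str]]]:
    """Separate written files from entries whose path is None (hypothetical helper)."""
    written: dict[str, Path] = {}
    failed: list[tuple[str, str]] = []
    for pdb_id, chain_id, path in results:
        if path is None:
            # Something went wrong for this entry, e.g. the chain was not present.
            failed.append((pdb_id, chain_id))
        else:
            written[pdb_id] = path
    return written, failed


# Hand-written sample data standing in for a real call's return value.
sample: list[Result] = [
    ("1abc", "B", Path("out/1abc.cif")),
    ("2xyz", "C", None),
]
written, failed = split_results(sample)
```

In real use, `sample` would be replaced by the list returned from `filter_files_on_chain`.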
filter_files_on_residues(input_files, output_dir, min_residues, max_residues, chain='A')
Filter PDB/mmCIF files by the number of residues in a given chain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_files | list[Path] | The list of input PDB/mmCIF files. | required |
output_dir | Path | The directory where the filtered files will be written. | required |
min_residues | int | The minimum number of residues in the chain. | required |
max_residues | int | The maximum number of residues in the chain. | required |
chain | str | The chain whose residues are counted. | 'A' |
Yields:
Type | Description |
---|---|
Generator[FilterStat] | FilterStat objects containing information about the filtering process for each input file. |
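Since the function yields its results, nothing is reported until the generator is iterated. The sketch below shows one way to consume such a stream and summarise it. The `FilterStat` class here is a local stand-in with the documented fields, and `fake_stats` and `summarize` are hypothetical helpers that replace a real call so the example runs on its own.

```python
from collections.abc import Iterable, Iterator
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FilterStat:
    """Local stand-in with the documented fields (an assumption, not the library's code)."""

    input_file: Path
    residue_count: int
    passed: bool
    output_file: Optional[Path]


def fake_stats() -> Iterator[FilterStat]:
    """Hypothetical replacement for the generator returned by the real function."""
    yield FilterStat(Path("in/1abc.cif"), 120, True, Path("out/1abc.cif"))
    yield FilterStat(Path("in/2xyz.cif"), 12, False, None)


def summarize(
    stats: Iterable[FilterStat],
) -> tuple[list[Path], list[tuple[Path, int]]]:
    """Collect written files and (input file, residue count) for rejected files."""
    kept: list[Path] = []
    rejected: list[tuple[Path, int]] = []
    for stat in stats:  # iterating is what drives the generator
        if stat.passed and stat.output_file is not None:
            kept.append(stat.output_file)
        else:
            rejected.append((stat.input_file, stat.residue_count))
    return kept, rejected


kept_files, rejected_files = summarize(fake_stats())
```

Keeping the residue count for rejected files makes it easier to see whether `min_residues` or `max_residues` was the limiting bound.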