filter
Filter subcommands for protein-quest.
chain(chains, input_dir, output_dir, /, *, scheduler_address=None, cache=None, _=None)
Filter on chain.
For each combination of input PDB/mmCIF file and chain, write a PDB/mmCIF file containing only the given chain, renamed to chain A.
Filtering is done in parallel using a Dask cluster.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chains` | `InputFile` | CSV file with | required |
| `input_dir` | `InputDir` | Directory with PDB/mmCIF files. Expected filenames are | required |
| `output_dir` | `OutputDir` | Directory to write the single-chain PDB/mmCIF files. Output files are in same format as input files. | required |
| `scheduler_address` | `str \| None` | Address of the Dask scheduler to connect to. If not provided, will create a local cluster. If set to | `None` |
| `cache` | `CacheParameter \| None` | Cache options including no_cache, cache_dir, and copy_method. | `None` |
| `_` | `Common \| None` | Common CLI options. | `None` |
confidence(input_dir, output_dir, /, *, filters=None, write_stats=None, scheduler_address=None, cache=None, _=None)
Filter AlphaFold mmCIF/PDB files by confidence (pLDDT). Files that pass are written with residues below the threshold removed.
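The residue-removal step described above can be pictured with a minimal sketch. It does not use the protein-quest API: the `Residue` class, the `drop_low_confidence` helper, and the threshold value are all hypothetical, and the sketch assumes that a residue is kept when its pLDDT is at or above the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """Hypothetical stand-in for one residue of an AlphaFold model."""

    number: int
    plddt: float  # per-residue confidence score, 0-100


def drop_low_confidence(residues: list, threshold: float) -> list:
    """Keep only residues whose pLDDT is at or above the threshold (assumed rule)."""
    return [r for r in residues if r.plddt >= threshold]


# Made-up example data, not taken from a real structure.
model = [Residue(1, 35.2), Residue(2, 71.0), Residue(3, 92.4), Residue(4, 68.9)]
kept = drop_low_confidence(model, threshold=70.0)
print([r.number for r in kept])  # residues 2 and 3 remain
```

Whether a file passes at all is governed by the `filters` criteria, which this sketch does not model.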
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_dir` | `InputDir` | Directory with AlphaFold mmCIF/PDB files. | required |
| `output_dir` | `OutputDir` | Directory to write filtered mmCIF/PDB files. | required |
| `filters` | `ConfidenceFilterQuery \| None` | Confidence filtering criteria. | `None` |
| `write_stats` | `OutputFile \| None` | Write filter statistics to file. In CSV format with | `None` |
| `scheduler_address` | `str \| None` | Address of the Dask scheduler to connect to. If not provided, will create a local cluster. If set to | `None` |
| `cache` | `CacheParameter \| None` | Cache options including no_cache, cache_dir, and copy_method. | `None` |
| `_` | `Common \| None` | Common CLI options. | `None` |
residue(input_dir, output_dir, /, *, min_residues=0, max_residues=10000000, write_stats=None, cache=None, _=None)
Filter PDB/mmCIF files by number of residues in chain A.
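As an illustration of the selection rule, the following sketch applies a residue-count window to a few made-up files. It is not the library's implementation: `passes_residue_filter` and the file names are hypothetical, and inclusive bounds are an assumption. The default values mirror the signature above.

```python
def passes_residue_filter(
    n_residues: int,
    min_residues: int = 0,
    max_residues: int = 10_000_000,
) -> bool:
    """Return True when the chain A residue count lies within the window.

    Inclusive bounds are an assumption of this sketch.
    """
    return min_residues <= n_residues <= max_residues


# Hypothetical chain A residue counts keyed by file name.
chain_a_counts = {"small.cif": 42, "medium.cif": 310, "large.cif": 2048}

passed = [
    name
    for name, count in chain_a_counts.items()
    if passes_residue_filter(count, min_residues=100, max_residues=1000)
]
print(passed)  # only medium.cif falls inside 100..1000
```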
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_dir` | `InputDir` | Directory with PDB/mmCIF files (for example from 'filter chain'). | required |
| `output_dir` | `OutputDir` | Directory to write filtered PDB/mmCIF files. Files are copied without modification. | required |
| `min_residues` | `MinResidues` | Min residues in chain A. | `0` |
| `max_residues` | `MaxResidues` | Max residues in chain A. | `10000000` |
| `write_stats` | `OutputFile \| None` | Write filter statistics to file. In CSV format with | `None` |
| `cache` | `CacheParameter \| None` | Cache options including no_cache, cache_dir, and copy_method. | `None` |
| `_` | `Common \| None` | Common CLI options. | `None` |
resolution(input_dir, output_dir, /, *, group_by='uniprot_accession', no_group_by=False, top=1000, write_stats=None, cache=None, _=None)
Filter structure files by best resolution.
AlphaFold structures are preferred over non-AlphaFold structures. Structures with a lower resolution value are preferred. If the resolution is the same, structures with more residues are preferred. Structures with a missing resolution are considered undesirable.
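One way to read these preferences is as a sort key followed by a top-N cut, either per UniProt accession or globally. The sketch below is an interpretation, not the library's code: the `Structure` class, `rank_key`, and `select_top` are hypothetical names, and applying the criteria in the listed order (AlphaFold first, then resolution, then residue count, with missing resolution sorted last) is an assumption.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Structure:
    """Hypothetical summary of one structure file."""

    name: str
    uniprot_accession: Optional[str]
    is_alphafold: bool
    resolution: Optional[float]  # lower value is better; None when missing
    n_residues: int


def rank_key(s: Structure):
    """Smaller key means more preferred (assumed lexicographic order)."""
    resolution = s.resolution if s.resolution is not None else math.inf
    return (not s.is_alphafold, resolution, -s.n_residues)


def select_top(structures, top=1000, group=True):
    """Keep the top-N structures per UniProt accession, or globally."""
    if not group:
        return sorted(structures, key=rank_key)[:top]
    groups = defaultdict(list)
    for s in structures:
        if s.uniprot_accession is None:
            continue  # structures without an accession never pass when grouping
        groups[s.uniprot_accession].append(s)
    selected = []
    for members in groups.values():
        selected.extend(sorted(members, key=rank_key)[:top])
    return selected


# Made-up candidates for illustration.
candidates = [
    Structure("a.cif", "P12345", False, 2.5, 300),
    Structure("b.cif", "P12345", False, 1.8, 250),
    Structure("c.cif", "P12345", False, 1.8, 280),
    Structure("d.cif", "P12345", False, None, 400),
    Structure("e.cif", None, False, 1.0, 100),
    Structure("f.cif", "Q99999", True, None, 500),
]

grouped = [s.name for s in select_top(candidates, top=2)]
ungrouped = [s.name for s in select_top(candidates, top=2, group=False)]
print(grouped)    # ['c.cif', 'b.cif', 'f.cif']
print(ungrouped)  # ['f.cif', 'e.cif']
```

In the grouped case `c.cif` beats `b.cif` on residue count at equal resolution, and `e.cif` is dropped because it has no accession. In the ungrouped case `e.cif` is eligible again.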
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_dir` | `InputDir` | Directory with structure files. | required |
| `output_dir` | `OutputDir` | Directory to write the selected structure files. | required |
| `group_by` | `Annotated[GroupBy, Parameter(group=_GROUP_BY)]` | Pass top-N structures with best resolution per UniProt accession. Structures without UniProt accession are never passed. Mutually exclusive with | `'uniprot_accession'` |
| `no_group_by` | `Annotated[bool, Parameter(name='--no-group-by', negative='', group=_GROUP_BY)]` | Disable grouping and use global top-N ranking across all files. Mutually exclusive with | `False` |
| `top` | `PositiveInt` | Maximum number of files to keep. | `1000` |
| `write_stats` | `OutputFile \| None` | Write filter statistics to file. In CSV format. For | `None` |
| `cache` | `CacheParameter \| None` | Cache options | `None` |
| `_` | `Common \| None` | Common CLI options. | `None` |
secondary_structure(input_dir, output_dir, /, *, filters=None, write_stats=None, cache=None, _=None)
Filter PDB/mmCIF files by secondary structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_dir` | `InputDir` | Directory with PDB/mmCIF files. | required |
| `output_dir` | `OutputDir` | Directory to write filtered PDB/mmCIF files. Files are copied without modification. | required |
| `filters` | `SecondaryStructureFilterQuery \| None` | Secondary structure filtering criteria. | `None` |
| `write_stats` | `OutputFile \| None` | Write filter statistics to file. In CSV format with columns: | `None` |
| `cache` | `CacheParameter \| None` | Cache options including no_cache, cache_dir, and copy_method. | `None` |
| `_` | `Common \| None` | Common CLI options. | `None` |