io
Module for structure file input/output.
StructureFileExtensions = Literal['.pdb', '.pdb.gz', '.ent', '.ent.gz', '.cif', '.cif.gz', '.bcif', '.bcif.gz']
module-attribute
Type of supported structure file extensions.
valid_structure_file_extensions = set(get_args(StructureFileExtensions))
module-attribute
Set of valid structure file extensions.
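The two attributes above pair a `Literal` type with a runtime set derived from it via `typing.get_args`. A minimal sketch of that pattern follows; `has_valid_extension` is a hypothetical helper added for illustration and is not part of the module.

```python
from typing import Literal, get_args

# Same literal type as documented above.
StructureFileExtensions = Literal[
    ".pdb", ".pdb.gz", ".ent", ".ent.gz", ".cif", ".cif.gz", ".bcif", ".bcif.gz"
]

# get_args() returns the literal values as a tuple; a set gives fast membership tests.
valid_structure_file_extensions = set(get_args(StructureFileExtensions))


def has_valid_extension(filename: str) -> bool:
    """Hypothetical helper: True if filename ends with a supported extension."""
    return any(filename.endswith(ext) for ext in valid_structure_file_extensions)
```

Deriving the set from the type keeps the static type and the runtime check from drifting apart when an extension is added.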
bcif2cif(bcif_file)
bcif2structure(bcif_file)
Read a binary CIF (bcif) file and return a gemmi Structure object.
This is slower than reading other formats because gemmi cannot read bcif files directly. The file is first converted to a CIF string with the mmcif package, and gemmi then parses that string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
bcif_file | Path | Path to the binary CIF file. | required |
Returns:
Type | Description |
---|---|
Structure | A gemmi Structure object representing the structure in the bcif file. |
bcifgz2structure(bcif_gz_file)
Read a binary CIF (bcif) gzipped file and return a gemmi Structure object.
This is slower than reading other formats because gemmi cannot read bcif files directly. The file is first gunzipped to a temporary location, then converted to a CIF string with the mmcif package, and gemmi finally parses that string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
bcif_gz_file | Path | Path to the binary CIF gzipped file. | required |
Returns:
Type | Description |
---|---|
Structure | A gemmi Structure object representing the structure in the bcif.gz file. |
convert_to_cif_file(input_file, output_dir, copy_method)
Convert a single structure file to .cif format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_file | Path | The structure file to convert. See StructureFileExtensions for supported extensions. | required |
output_dir | Path | Directory to save the converted .cif file. | required |
copy_method | CopyMethod | How to copy when no changes are needed to output file. | required |
Returns:
Type | Description |
---|---|
Path | Path to the converted .cif file. |
convert_to_cif_files(input_files, output_dir, copy_method)
Convert structure files to .cif format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_files | Iterable[Path] | Iterable of structure files to convert. | required |
output_dir | Path | Directory to save the converted .cif files. | required |
copy_method | CopyMethod | How to copy when no changes are needed to output file. | required |
Yields:
Type | Description |
---|---|
Generator[tuple[Path, Path]] | A tuple of the input file and the output file. |
glob_structure_files(input_dir)
Glob for structure files in a directory.
Uses StructureFileExtensions as valid extensions. Does not search recursively.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_dir | Path | The input directory to search for structure files. | required |
Yields:
Type | Description |
---|---|
Generator[Path] | Paths to the found structure files. |
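The non-recursive search described above can be sketched with `pathlib` alone. This is an illustration under stated assumptions, not the module's implementation: `glob_structure_files_sketch` and `EXTENSIONS` are hypothetical names, and the tuple simply repeats the values of StructureFileExtensions.

```python
from collections.abc import Iterator
from pathlib import Path

# Values copied from StructureFileExtensions (assumed to be the full list).
EXTENSIONS = (".pdb", ".pdb.gz", ".ent", ".ent.gz", ".cif", ".cif.gz", ".bcif", ".bcif.gz")


def glob_structure_files_sketch(input_dir: Path) -> Iterator[Path]:
    """Yield structure files located directly inside input_dir."""
    for ext in EXTENSIONS:
        # A pattern without '**' only matches direct children, so
        # subdirectories are never searched.
        yield from input_dir.glob(f"*{ext}")
```

Because each pattern must match the end of the file name, a file such as `1abc.pdb.gz` is matched by `*.pdb.gz` only and is not yielded twice.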
gunzip_file(gz_file, output_file=None, keep_original=True)
Unzip a .gz file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gz_file | Path | Path to the .gz file. | required |
output_file | Path \| None | Optional path to the output unzipped file. If None, the .gz suffix is removed from gz_file. | None |
keep_original | bool | Whether to keep the original .gz file. Default is True. | True |
Returns:
Type | Description |
---|---|
Path | Path to the unzipped file. |
Raises:
Type | Description |
---|---|
ValueError | If output_file is None and gz_file does not end with .gz. |
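The documented behaviour (default output name, optional removal of the original, and the ValueError case) can be reproduced with the standard library. The sketch below is one way to do it and is not taken from the module's source; `gunzip_file_sketch` is a hypothetical name.

```python
import gzip
import shutil
from pathlib import Path
from typing import Optional


def gunzip_file_sketch(
    gz_file: Path, output_file: Optional[Path] = None, keep_original: bool = True
) -> Path:
    """Decompress gz_file and return the path of the unzipped file."""
    if output_file is None:
        if gz_file.suffix != ".gz":
            raise ValueError(f"Cannot derive output name: {gz_file} does not end with .gz")
        # with_suffix("") drops only the last suffix: 1abc.cif.gz -> 1abc.cif
        output_file = gz_file.with_suffix("")
    # Stream the decompressed bytes so large files are not held in memory.
    with gzip.open(gz_file, "rb") as src, output_file.open("wb") as dst:
        shutil.copyfileobj(src, dst)
    if not keep_original:
        gz_file.unlink()
    return output_file
```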
locate_structure_file(root, pdb_id)
Locate a structure file for a given PDB ID in the specified directory.
Uses StructureFileExtensions as potential extensions. Also tries different casing of the PDB ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
root | Path | The root directory to search in. | required |
pdb_id | str | The PDB ID to locate. | required |
Returns:
Type | Description |
---|---|
Path | The path to the located structure file. |
Raises:
Type | Description |
---|---|
FileNotFoundError | If no structure file is found for the given PDB ID. |
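A lookup of this kind can be sketched as a loop over ID casings and extensions. Several details here are assumptions, since the documentation above does not spell them out: the file is assumed to be named `<pdb_id><extension>`, the casings tried are assumed to be as given, lower case, and upper case, and `locate_structure_file_sketch` is a hypothetical name.

```python
from pathlib import Path

# Values copied from StructureFileExtensions (assumed to be the full list).
EXTENSIONS = (".pdb", ".pdb.gz", ".ent", ".ent.gz", ".cif", ".cif.gz", ".bcif", ".bcif.gz")


def locate_structure_file_sketch(root: Path, pdb_id: str) -> Path:
    """Return the first existing <id><ext> file directly under root."""
    # dict.fromkeys removes duplicate casings while keeping their order.
    for candidate_id in dict.fromkeys([pdb_id, pdb_id.lower(), pdb_id.upper()]):
        for ext in EXTENSIONS:
            candidate = root / f"{candidate_id}{ext}"
            if candidate.is_file():
                return candidate
    raise FileNotFoundError(f"No structure file found for {pdb_id} in {root}")
```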
read_structure(file)
Read a structure from a file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file | Path | Path to the input structure file. See StructureFileExtensions for supported extensions. | required |
Returns:
Type | Description |
---|---|
Structure | A gemmi Structure object representing the structure in the file. |
split_name_and_extension(name)
Split a filename into its name and extension.
.gz is considered part of the extension if present.
Examples:
Some example usages.
>>> from protein_quest.pdbe.io import split_name_and_extension
>>> split_name_and_extension("1234.pdb")
('1234', '.pdb')
>>> split_name_and_extension("1234.pdb.gz")
('1234', '.pdb.gz')
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The filename to split. | required |
Returns:
Type | Description |
---|---|
tuple[str, str] | A tuple containing the name and the extension. |
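The rule that .gz belongs to the extension can be sketched as follows. This is a possible implementation that agrees with the two documented examples, not the module's actual code; `split_name_and_extension_sketch` is a hypothetical name, and the handling of names without any dot is an assumption.

```python
def split_name_and_extension_sketch(name: str) -> tuple[str, str]:
    """Split a filename into (name, extension), keeping .gz with the extension."""
    gz = ""
    if name.endswith(".gz"):
        # Set the compression suffix aside so the real extension can be found.
        gz = ".gz"
        name = name[: -len(".gz")]
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # Assumption: without a dot, only a possible .gz counts as the extension.
        return name, gz
    return stem, f".{ext}{gz}"
```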
structure2bcif(structure, bcif_file)
Write a gemmi Structure object to a binary CIF (bcif) file.
This is slower than writing other formats because gemmi cannot write bcif files directly. The structure is first serialized to a CIF string with gemmi, and the mmcif package then converts that string to bcif.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
structure | Structure | The gemmi Structure object to write. | required |
bcif_file | Path | Path to the output binary CIF file. | required |
structure2bcifgz(structure, bcif_gz_file)
Write a gemmi Structure object to a binary CIF gzipped (bcif.gz) file.
This is slower than writing other formats because gemmi cannot write bcif files directly. The structure is first serialized to a CIF string with gemmi, the mmcif package then converts that string to bcif, and finally the bcif file is gzipped.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
structure | Structure | The gemmi Structure object to write. | required |
bcif_gz_file | Path | Path to the output binary CIF gzipped file. | required |
write_structure(structure, path)
Write a gemmi structure to a file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
structure | Structure | The gemmi structure to write. | required |
path | Path | The file path to write the structure to. The format depends on the file extension. See StructureFileExtensions for supported extensions. | required |
Raises:
Type | Description |
---|---|
ValueError | If the file extension is not supported. |
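Choosing the output format from the file extension can be illustrated without gemmi. The sketch below covers only the dispatch decision and the ValueError case. `describe_output_format` is a hypothetical helper, and mapping .ent to the PDB format is an assumption based on the usual PDB archive naming; the actual writer may be organised differently.

```python
from pathlib import Path

# Assumed mapping from base extension to format name.
FORMATS = {".pdb": "pdb", ".ent": "pdb", ".cif": "cif", ".bcif": "bcif"}


def describe_output_format(path: Path) -> tuple[str, bool]:
    """Hypothetical helper: return (format, gzipped) for a target path."""
    name = path.name
    gzipped = name.endswith(".gz")
    if gzipped:
        # Look at the extension underneath the compression suffix.
        name = name[: -len(".gz")]
    ext = Path(name).suffix
    if ext not in FORMATS:
        raise ValueError(f"Unsupported file extension in {path.name}")
    return FORMATS[ext], gzipped
```

A writer built on this would call the matching serializer for the returned format and gzip the result when the second element is True.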