fetch
Module for fetching AlphaFold data.
DownloadableFormat = Literal['summary', 'bcif', 'cif', 'pdb', 'paeDoc', 'amAnnotations', 'amAnnotationsHg19', 'amAnnotationsHg38', 'msa', 'plddtDoc']
module-attribute
Types of formats that can be downloaded from the AlphaFold web service.
UrlFileNamePair = tuple[URL, str]
module-attribute
A tuple of a URL and a filename.
UrlFileNamePairsOfFormats = dict[DownloadableFormat, UrlFileNamePair]
module-attribute
A mapping of DownloadableFormat to UrlFileNamePair.
downloadable_formats = set(get_args(DownloadableFormat))
module-attribute
Set of formats that can be downloaded from the AlphaFold web service.
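As a minimal sketch of how the `Literal` type and the derived set work together, the block below repeats the two documented definitions and adds a small validation helper. `check_formats` is a hypothetical name used for illustration; it is not part of this module.

```python
from typing import Literal, get_args

# Same literal type as documented above.
DownloadableFormat = Literal[
    "summary", "bcif", "cif", "pdb", "paeDoc",
    "amAnnotations", "amAnnotationsHg19", "amAnnotationsHg38",
    "msa", "plddtDoc",
]

# get_args() returns the literal values as a tuple; a set gives fast membership checks.
downloadable_formats = set(get_args(DownloadableFormat))


def check_formats(requested: set[str]) -> set[str]:
    """Hypothetical helper: reject names that are not downloadable formats."""
    unknown = requested - downloadable_formats
    if unknown:
        raise ValueError(f"Unknown formats: {sorted(unknown)}")
    return requested
```

Deriving the set from the type keeps the static type and the runtime check in one place, so adding a format only requires editing the `Literal`.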
AlphaFoldEntry
dataclass
AlphaFold entry with summary object and optionally local files.
See https://alphafold.ebi.ac.uk/api-docs for more details on the summary data structure.
by_format(dl_format)
Get the file path for a specific format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dl_format` | `DownloadableFormat` | The format for which to get the file path. | required |
Returns:
| Type | Description |
|---|---|
| `Path \| None` | The file path corresponding to the download format, or None if the file is not set. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the format is not valid. |
format2attr(dl_format)
classmethod
Get the attribute name for a specific download format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dl_format` | `DownloadableFormat` | The format for which to get the attribute name. | required |
Returns:
| Type | Description |
|---|---|
| `str` | The attribute name corresponding to the download format. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the format is not valid. |
nr_of_files()
Number of `_file` properties that are set.
Returns:
| Type | Description |
|---|---|
| `int` | The number of `_file` properties that are set. |
relative_to(session_dir)
Convert paths in an AlphaFoldEntry to be relative to the session directory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `session_dir` | `Path` | The session directory to which the paths should be made relative. | required |
Returns:
| Type | Description |
|---|---|
| `AlphaFoldEntry` | An AlphaFoldEntry instance with paths relative to the session directory. |
fetch_alphafold_db_version()
async
Fetch the current version of the AlphaFold database.
Returns:
| Type | Description |
|---|---|
| `str` | The current version of the AlphaFold database as a string. For example: "6". |
fetch_many(uniprot_accessions, save_dir, formats, db_version=None, max_parallel_downloads=5, cacher=None, gzip_files=False, all_isoforms=False)
Synchronously fetches summaries and/or files (such as cif) from the AlphaFold Protein Structure Database.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uniprot_accessions` | `Iterable[str]` | A set of Uniprot accessions to fetch. | required |
| `save_dir` | `Path` | The directory to save the fetched files to. | required |
| `formats` | `set[DownloadableFormat]` | A set of formats to download. If | required |
| `db_version` | `str \| None` | The version of the AlphaFold database to use. If None, the latest version will be used. | `None` |
| `max_parallel_downloads` | `int` | The maximum number of parallel downloads. | `5` |
| `cacher` | `Cacher \| None` | A cacher to use for caching the fetched files. | `None` |
| `gzip_files` | `bool` | Whether to gzip the downloaded files. Summaries are never gzipped. | `False` |
| `all_isoforms` | `bool` | Whether to yield all isoforms of each uniprot entry. When False then yields only the canonical sequence per uniprot entry. | `False` |
Returns:
| Type | Description |
|---|---|
| `list[AlphaFoldEntry]` | A list of AlphaFoldEntry dataclasses containing the summary, pdb file, and pae file. |
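The source does not show how the synchronous function relates to its asynchronous counterpart. One common pattern, sketched here as an assumption, is to drain an async generator into a list with `asyncio.run`. Both `fake_fetch_many_async` and `fake_fetch_many` are hypothetical stand-ins that perform no network access.

```python
import asyncio
from collections.abc import AsyncGenerator, Iterable


async def fake_fetch_many_async(accessions: Iterable[str]) -> AsyncGenerator[str, None]:
    """Hypothetical stand-in that yields one result per accession."""
    for acc in accessions:
        await asyncio.sleep(0)  # placeholder for waiting on a download
        yield f"entry:{acc}"


def fake_fetch_many(accessions: Iterable[str]) -> list[str]:
    """Collect the async generator into a list from synchronous code."""

    async def gather() -> list[str]:
        return [entry async for entry in fake_fetch_many_async(accessions)]

    # asyncio.run() must be called from a thread without a running event loop.
    return asyncio.run(gather())
```

A wrapper of this kind cannot be called from inside an already running event loop (for example in some notebook environments); in that situation the asynchronous variant is the one to use.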
fetch_many_async(uniprot_accessions, save_dir, formats, db_version=None, max_parallel_downloads=5, cacher=None, gzip_files=False, all_isoforms=False)
Asynchronously fetches summaries and/or files from AlphaFold Protein Structure Database.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uniprot_accessions` | `Iterable[str]` | A set of Uniprot accessions to fetch. | required |
| `save_dir` | `Path` | The directory to save the fetched files to. | required |
| `formats` | `set[DownloadableFormat]` | A set of formats to download. If | required |
| `db_version` | `str \| None` | The version of the AlphaFold database to use. If None, the latest version will be used. | `None` |
| `max_parallel_downloads` | `int` | The maximum number of parallel downloads. | `5` |
| `cacher` | `Cacher \| None` | A cacher to use for caching the fetched files. | `None` |
| `gzip_files` | `bool` | Whether to gzip the downloaded files. Summaries are never gzipped. | `False` |
| `all_isoforms` | `bool` | Whether to yield all isoforms of each uniprot entry. When False then yields only the canonical sequence per uniprot entry. | `False` |
Yields:
| Type | Description |
|---|---|
| `AsyncGenerator[AlphaFoldEntry]` | A dataclass containing the summary, pdb file, and pae file. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If 'formats' set is empty. |
| `ValueError` | If all_isoforms is True and 'summary' is not in 'formats' set. |
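The two documented `ValueError` cases can be expressed as a short pre-flight check. This is a sketch only; `validate_fetch_args` is a hypothetical helper and the error messages are invented for illustration.

```python
def validate_fetch_args(formats: set[str], all_isoforms: bool) -> None:
    """Hypothetical check mirroring the two documented ValueError cases."""
    if not formats:
        raise ValueError("'formats' set is empty")
    # Isoform handling is documented as requiring 'summary' in the formats set.
    if all_isoforms and "summary" not in formats:
        raise ValueError("all_isoforms=True requires 'summary' in 'formats'")
```

Running such a check before any download starts means a bad argument combination fails immediately instead of after partial work.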
fetch_summary(qualifier, session, semaphore, save_dir, cacher)
async
Fetches a summary from the AlphaFold database for a given qualifier.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `qualifier` | `str` | The uniprot accession for the protein or entry to fetch. For example | required |
| `session` | `RetryClient` | An asynchronous HTTP client session with retry capabilities. | required |
| `semaphore` | `Semaphore` | A semaphore to limit the number of concurrent requests. | required |
| `save_dir` | `Path \| None` | An optional directory to save the fetched summary as a JSON file. If set and summary exists then summary will be loaded from disk instead of being fetched from the API. If not set then the summary will not be saved to disk and will always be fetched from the API. | required |
| `cacher` | `Cacher` | A cacher to use for caching the fetched summary. Only used if save_dir is not None. | required |
Returns:
| Type | Description |
|---|---|
| `list[EntrySummary]` | A list of EntrySummary objects representing the fetched summary. When qualifier has multiple isoforms then multiple summaries are returned, otherwise a list of a single summary is returned. |
files_for_alphafold_entries(uniprot_accessions, formats, db_version, gzip_files)
Get the files to download for multiple AlphaFold entries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uniprot_accessions` | `Iterable[str]` | A set of Uniprot accessions. | required |
| `formats` | `set[DownloadableFormat]` | A set of formats to download. | required |
| `db_version` | `str` | The version of the AlphaFold database to use. | required |
| `gzip_files` | `bool` | Whether to download gzipped files. Otherwise downloads uncompressed files. | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, UrlFileNamePairsOfFormats]` | A mapping of Uniprot accession to a mapping of DownloadableFormat to UrlFileNamePair. |
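To make the nested return type concrete, here is a minimal sketch that builds the same shape of mapping for two formats. The base URL, the file-name templates, and the helper name `sketch_files_for_entries` are assumptions for illustration; the real function may build URLs differently. Gzip handling is left out because the source does not say how it affects URLs or file names.

```python
from collections.abc import Iterable

# The documented UrlFileNamePair holds a URL object; plain str keeps this sketch stdlib-only.
UrlFileNamePair = tuple[str, str]

# Assumed base URL and file-name templates, for illustration only.
BASE_URL = "https://alphafold.ebi.ac.uk/files"
TEMPLATES = {
    "cif": "AF-{acc}-F1-model_v{version}.cif",
    "pdb": "AF-{acc}-F1-model_v{version}.pdb",
}


def sketch_files_for_entries(
    uniprot_accessions: Iterable[str],
    formats: set[str],
    db_version: str,
) -> dict[str, dict[str, UrlFileNamePair]]:
    """Hypothetical helper: map accession -> format -> (url, filename)."""
    result: dict[str, dict[str, UrlFileNamePair]] = {}
    for acc in uniprot_accessions:
        per_format: dict[str, UrlFileNamePair] = {}
        for fmt in sorted(formats):
            filename = TEMPLATES[fmt].format(acc=acc, version=db_version)
            per_format[fmt] = (f"{BASE_URL}/{filename}", filename)
        result[acc] = per_format
    return result
```

Keeping the URL and the local file name together in one pair lets a downloader iterate over the mapping without recomputing either value.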