ocrd.resolver module
- class ocrd.resolver.Resolver
Bases: object
Handle uploads, downloads, repository access, and manage temporary directories
- download_to_directory(directory, url, basename=None, if_exists='skip', subdir=None, retries=None, timeout=None)
Download a URL ``url`` to a local file in ``directory``.
If ``url`` looks like a file path, check whether that exists. If it does exist and is within ``directory`` already, return early. If it does exist but is outside of ``directory``, copy it. If ``url`` does not appear to be a file path, try downloading via HTTP, retrying ``retries`` times with timeout ``timeout`` between calls.
If ``basename`` is not given but ``subdir`` is, set ``basename`` to the last path segment of ``url``.
If the target file already exists within ``directory``, behavior depends on ``if_exists``:
- ``skip`` (default): do nothing and return early. Note that this
- ``overwrite``: overwrite the existing file
- ``raise``: raise a ``FileExistsError``
- Parameters:
directory (string) – Directory to download files to
url (string) – URL to download from
- Keyword Arguments:
basename (string, None) – basename part of the filename on disk. Defaults to last path segment of ``url`` if unset.
if_exists (string, "skip") – What to do if target file already exists. One of ``skip`` (default), ``overwrite`` or ``raise``
subdir (string, None) – Subdirectory to create within the directory. Think ``mets:fileGrp[@USE]``.
retries (int, None) – Number of retries to attempt on network failure.
timeout (tuple, None) – Timeout in seconds for establishing a connection and reading next chunk of data.
- Returns:
Local filename string, relative to directory
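The local-file branch and the ``if_exists`` policy described above can be illustrated with a small standard-library sketch. This is not the implementation of ``Resolver.download_to_directory``: the function name ``copy_to_directory`` is hypothetical, the HTTP branch (``retries``, ``timeout``) is left out entirely, and the exact early-return behavior is an assumption based on the description.

```python
import shutil
from pathlib import Path


def copy_to_directory(directory, path, basename=None, if_exists="skip", subdir=None):
    """Hypothetical helper: copy a local file into `directory`, honoring `if_exists`.

    Returns the target filename relative to `directory`.
    """
    if if_exists not in ("skip", "overwrite", "raise"):
        raise ValueError(f"unknown if_exists value: {if_exists!r}")

    directory = Path(directory).resolve()
    src = Path(path).resolve()
    if not src.is_file():
        raise FileNotFoundError(src)

    # The source already lives inside the directory: nothing to copy.
    if directory in src.parents:
        return str(src.relative_to(directory))

    # Default the basename to the last path segment of the source.
    name = basename or src.name
    rel = Path(subdir, name) if subdir else Path(name)
    dst = directory / rel

    if dst.exists():
        if if_exists == "skip":
            return str(rel)
        if if_exists == "raise":
            raise FileExistsError(dst)
        # "overwrite" falls through to the copy below.

    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return str(rel)
```

With ``if_exists="skip"`` a second call leaves the existing target untouched even if the source has changed in the meantime, which is why ``overwrite`` exists as a separate option.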
- workspace_from_url(mets_url, dst_dir=None, clobber_mets=False, mets_basename=None, download=False, src_baseurl=None, mets_server_url=None)
Create a workspace from a METS by URL (i.e. clone if ``mets_url`` is remote or ``dst_dir`` is given).
- Parameters:
mets_url (string) – Source METS URL or filesystem path
- Keyword Arguments:
dst_dir (string, None) – Target directory for the workspace. By default create a temporary directory under ``ocrd.constants.TMP_PREFIX``. (The resulting path can be retrieved via ``ocrd.Workspace.directory``.)
clobber_mets (boolean, False) – Whether to overwrite existing ``mets.xml``. By default existing ``mets.xml`` will raise an exception.
download (boolean, False) – Whether to also download all the files referenced by the METS
src_baseurl (string, None) – Base URL for resolving relative file locations
Download (clone) ``mets_url`` to ``mets.xml`` in ``dst_dir``, unless the former is already local and the latter is ``None`` or already identical to its directory name.
- Returns:
a new ``Workspace``
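The cloning condition stated above (clone unless the METS is already local and ``dst_dir`` is ``None`` or equal to the METS file's own directory) can be written out as a small predicate. This is only a reading of the description, not code taken from the library: ``is_remote`` and ``needs_clone`` are hypothetical names, and treating only ``http``/``https`` URLs as remote is an assumption of this sketch.

```python
import os
from urllib.parse import urlparse


def is_remote(mets_url):
    """Assumption of this sketch: only http(s) URLs count as remote."""
    return urlparse(mets_url).scheme in ("http", "https")


def needs_clone(mets_url, dst_dir=None):
    """Return True if the METS would have to be copied or downloaded into dst_dir."""
    if is_remote(mets_url):
        # A remote METS always has to be fetched.
        return True
    if dst_dir is None:
        # Local METS, no target directory requested: use it in place.
        return False
    # Local METS: clone only if the target differs from its own directory.
    src_dir = os.path.dirname(os.path.abspath(mets_url))
    return os.path.abspath(dst_dir) != src_dir
```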
- workspace_from_nothing(directory, mets_basename='mets.xml', clobber_mets=False)
Create an empty workspace.
- Parameters:
directory (string) – Target directory for the workspace. If ``None``, create a temporary directory under ``ocrd.constants.TMP_PREFIX``. (The resulting path can be retrieved via ``ocrd.Workspace.directory``.)
- Keyword Arguments:
clobber_mets (boolean, False) – Whether to overwrite existing ``mets.xml``. By default existing ``mets.xml`` will raise an exception.
- Returns:
a new ``Workspace``
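The interplay of the temporary-directory default and ``clobber_mets`` can be sketched with the standard library alone. Everything here is illustrative: ``init_empty_workspace``, ``TMP_PREFIX`` and ``EMPTY_METS`` are hypothetical stand-ins (the real prefix lives in ``ocrd.constants.TMP_PREFIX``, and the library writes its own METS template), and ``FileExistsError`` is an assumed choice, since the text above only says that an exception is raised.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for ocrd.constants.TMP_PREFIX.
TMP_PREFIX = "sketch-workspace-"
# Placeholder document, not the METS template the library actually writes.
EMPTY_METS = '<mets:mets xmlns:mets="http://www.loc.gov/METS/"/>\n'


def init_empty_workspace(directory=None, mets_basename="mets.xml", clobber_mets=False):
    """Create `directory` (or a temporary one) and write an empty METS file into it.

    Returns the path of the METS file.
    """
    if directory is None:
        # No target given: fall back to a fresh temporary directory.
        directory = tempfile.mkdtemp(prefix=TMP_PREFIX)
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    mets_path = directory / mets_basename
    if mets_path.exists() and not clobber_mets:
        # Assumed exception type; the documentation does not name one.
        raise FileExistsError(f"{mets_path} exists and clobber_mets is not set")

    mets_path.write_text(EMPTY_METS, encoding="utf-8")
    return mets_path
```

Refusing to overwrite by default protects an existing workspace from being silently reset; callers opt in explicitly with ``clobber_mets=True``.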
- resolve_mets_arguments(directory, mets_url, mets_basename, mets_server_url)
Resolve the ``--mets``, ``--mets-basename``, ``--directory`` and ``--mets-server-url`` arguments into a coherent set of arguments according to https://github.com/OCR-D/core/issues/517