ocrd.resolver module
- class ocrd.resolver.Resolver
Bases: object
Handle uploads, downloads, repository access, and manage temporary directories.
- download_to_directory(directory, url, basename=None, if_exists='skip', subdir=None, retries=None, timeout=None)
Download a URL ``url`` to a local file in ``directory``.
If ``url`` looks like a file path, check whether that exists. If it does exist and is within ``directory`` already, return early. If it does exist but is outside of ``directory``, copy it. If ``url`` does not appear to be a file path, try downloading via HTTP, retrying ``retries`` times with timeout ``timeout`` between calls.
If ``basename`` is not given but ``subdir`` is, set ``basename`` to the last path segment of ``url``.
If the target file already exists within ``directory``, behavior depends on ``if_exists``:
  - ``skip`` (default): do nothing and return early. Note that this
  - ``overwrite``: overwrite the existing file
  - ``raise``: raise a ``FileExistsError``
- Parameters:
directory (string) – Directory to download files to
url (string) – URL to download from
- Keyword Arguments:
basename (string, None) – Basename part of the filename on disk. Defaults to last path segment of ``url`` if unset.
if_exists (string, "skip") – What to do if target file already exists. One of ``skip`` (default), ``overwrite`` or ``raise``.
subdir (string, None) – Subdirectory to create within the directory. Think ``mets:fileGrp[@USE]``.
retries (int, None) – Number of retries to attempt on network failure.
timeout (tuple, None) – Timeout in seconds for establishing a connection and reading next chunk of data.
- Returns:
Local filename string, relative to directory
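The local-file branch and the ``if_exists`` policy described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the OCR-D implementation: ``place_local_file`` is a hypothetical helper, it ignores the HTTP branch (``retries``, ``timeout``) entirely, and raising ``FileNotFoundError`` for a missing source is an assumption.

```python
import shutil
from pathlib import Path


def place_local_file(directory, src, basename=None, if_exists="skip", subdir=None):
    """Copy local file ``src`` into ``directory``; return the path relative to it.

    Hypothetical helper for illustration only. It is not part of ocrd and
    covers only sources that are existing local files.
    """
    if if_exists not in ("skip", "overwrite", "raise"):
        raise ValueError(f"unknown if_exists value: {if_exists!r}")
    directory = Path(directory).resolve()
    src = Path(src).resolve()
    if not src.is_file():
        # Assumption: a missing local source is reported as an error.
        raise FileNotFoundError(src)

    # Source already lives inside the directory: return early without copying.
    try:
        return str(src.relative_to(directory))
    except ValueError:
        pass  # src lies outside of directory, so it has to be copied

    # basename falls back to the last path segment of the source.
    target = directory / (subdir or "") / (basename or src.name)
    if target.exists():
        if if_exists == "skip":
            return str(target.relative_to(directory))
        if if_exists == "raise":
            raise FileExistsError(f"{target} already exists")
        # if_exists == "overwrite": fall through and replace the file

    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(src, target)
    return str(target.relative_to(directory))
```

Calling the helper twice with the default policy leaves the first copy untouched; only ``if_exists="overwrite"`` replaces it, and ``if_exists="raise"`` turns the collision into an error.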
- workspace_from_url(mets_url, dst_dir=None, clobber_mets=False, mets_basename=None, download=False, src_baseurl=None, mets_server_url=None, **kwargs)
Create a workspace from a METS by URL (i.e. clone if ``mets_url`` is remote or ``dst_dir`` is given).
- Parameters:
mets_url (string) – Source METS URL or filesystem path
- Keyword Arguments:
dst_dir (string, None) – Target directory for the workspace. By default create a temporary directory under ``ocrd.constants.TMP_PREFIX``. (The resulting path can be retrieved via ``ocrd.Workspace.directory``.)
clobber_mets (boolean, False) – Whether to overwrite existing ``mets.xml``. By default existing ``mets.xml`` will raise an exception.
download (boolean, False) – Whether to also download all the files referenced by the METS
src_baseurl (string, None) – Base URL for resolving relative file locations
mets_server_url (string, None) – URI of TCP or local path of UDS for METS server handling the OcrdMets of the workspace. By default the METS will be read from and written to the filesystem directly.
**kwargs – Passed on to ``OcrdMets.find_files`` if ``download == True``
Download (clone) ``mets_url`` to ``mets.xml`` in ``dst_dir``, unless the former is already local and the latter is ``None`` or already identical to its directory name.
- Returns:
a new ``Workspace``
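The cloning rule stated above (clone unless the METS is already local and ``dst_dir`` is unset or equal to the METS file's own directory) can be written as a small predicate. This is a sketch of the documented rule only: ``needs_clone`` is a hypothetical name, and treating just ``http``/``https`` URLs as "remote" is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse


def needs_clone(mets_url, dst_dir=None):
    """Return True if the METS would have to be downloaded or copied.

    Hypothetical helper; not part of ocrd.
    """
    # Assumption: "remote" means an http(s) URL; anything else is a local path.
    if urlparse(str(mets_url)).scheme in ("http", "https"):
        return True
    # Local METS and no target directory: use the METS in place.
    if dst_dir is None:
        return False
    # Local METS: clone only if the target differs from the METS directory.
    return Path(dst_dir).resolve() != Path(mets_url).resolve().parent
```

Under these assumptions, a local ``/data/ws/mets.xml`` with ``dst_dir="/data/ws"`` is used in place, while the same file with a different ``dst_dir`` is cloned.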
- workspace_from_nothing(directory, mets_basename='mets.xml', clobber_mets=False)
Create an empty workspace.
- Parameters:
directory (string) – Target directory for the workspace. If ``None``, create a temporary directory under ``ocrd.constants.TMP_PREFIX``. (The resulting path can be retrieved via ``ocrd.Workspace.directory``.)
- Keyword Arguments:
clobber_mets (boolean, False) – Whether to overwrite existing ``mets.xml``. By default existing ``mets.xml`` will raise an exception.
- Returns:
a new ``Workspace``
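The interplay of ``directory`` and ``clobber_mets`` can be illustrated with a standard-library sketch. Everything here is an assumption made for illustration: ``init_workspace_dir`` is a hypothetical helper, the temporary-directory prefix merely stands in for ``ocrd.constants.TMP_PREFIX``, the exception type is a guess (the text only says "an exception"), and the written file is a placeholder rather than a complete METS document.

```python
import tempfile
from pathlib import Path

# Placeholder content; a real workspace would hold a complete METS document.
EMPTY_METS = '<mets:mets xmlns:mets="http://www.loc.gov/METS/"/>\n'


def init_workspace_dir(directory=None, mets_basename="mets.xml", clobber_mets=False):
    """Create a directory containing a placeholder METS file and return its path.

    Hypothetical helper; not part of ocrd.
    """
    if directory is None:
        # Hypothetical prefix, standing in for ocrd.constants.TMP_PREFIX.
        directory = tempfile.mkdtemp(prefix="workspace-sketch-")
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    mets_path = directory / mets_basename
    if mets_path.exists() and not clobber_mets:
        # Assumption: the exception type is a guess for this sketch.
        raise FileExistsError(f"{mets_path} exists and clobber_mets is not set")
    mets_path.write_text(EMPTY_METS, encoding="utf-8")
    return directory
```

A second call on the same directory fails unless ``clobber_mets=True`` is passed, which mirrors the default described for the keyword argument.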
- resolve_mets_arguments(directory, mets_url, mets_basename='mets.xml', mets_server_url=None)
Resolve the ``--mets``, ``--mets-basename``, ``--directory`` and ``--mets-server-url`` arguments into a coherent set of arguments according to https://github.com/OCR-D/core/issues/517