ocrd.workspace_bagger module
- class ocrd.workspace_bagger.WorkspaceBagger(resolver, strict=False)
Bases: object
Serialize a workspace to OCRD-ZIP and deserialize an OCRD-ZIP back to a workspace.
- bag(workspace, ocrd_identifier, dest=None, ocrd_mets='mets.xml', ocrd_base_version_checksum=None, processes=1, skip_zip=False, tag_files=None, include_fileGrp=None, exclude_fileGrp=None)
Bag a workspace
See https://ocr-d.github.com/ocrd_zip#packing-a-workspace-as-ocrd-zip
- Parameters:
workspace (ocrd.Workspace) – workspace to bag
ocrd_identifier (string) – Ocrd-Identifier in bag-info.txt
dest (string) – Path of the generated OCRD-ZIP.
ocrd_mets (string) – Ocrd-Mets in bag-info.txt
ocrd_base_version_checksum (string) – Ocrd-Base-Version-Checksum in bag-info.txt
processes (integer) – Number of parallel processes used for checksumming
skip_zip (boolean) – Whether to leave the bag directory unzipped
tag_files (list<string>) – Path names of additional tag files to be bagged at the root of the bag
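A minimal sketch of calling bag with the parameters listed above. The helper names make_bagger and bag_to_zip are hypothetical, and the sketch assumes that Resolver can be imported from the top-level ocrd package, which this section does not state. The return value of bag is not documented here, so the sketch does not rely on it.

```python
def make_bagger(strict=False):
    """Build a WorkspaceBagger; imports are deferred so ocrd is only needed when called."""
    # Assumption: Resolver is exported by the top-level ocrd package.
    from ocrd import Resolver
    from ocrd.workspace_bagger import WorkspaceBagger
    return WorkspaceBagger(Resolver(), strict=strict)


def bag_to_zip(bagger, workspace, identifier, dest, processes=1):
    """Pack `workspace` into an OCRD-ZIP at `dest` (hypothetical helper)."""
    bagger.bag(
        workspace,
        identifier,             # written as Ocrd-Identifier in bag-info.txt
        dest=dest,              # path of the generated OCRD-ZIP
        ocrd_mets='mets.xml',   # written as Ocrd-Mets in bag-info.txt
        processes=processes,    # parallel processes used for checksumming
        skip_zip=False,         # False: produce a zip instead of a plain directory
    )
    return dest
```

With ocrd installed, the call would look like bag_to_zip(make_bagger(), workspace, 'example-id', 'example.ocrd.zip'), where workspace is an ocrd.Workspace and both string values are placeholders.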
- spill(src, dest)
Spill an OCRD-ZIP, i.e. unpack it and turn it into a workspace.
See https://ocr-d.github.com/ocrd_zip#unpacking-ocrd-zip-to-a-workspace
- Parameters:
src (string) – Path to OCRD-ZIP
dest (string) – Path to directory to unpack data folder to
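A short sketch of unpacking with spill. The wrapper spill_zip is a hypothetical helper, and the zip check before the call is an added convenience that is not part of the documented API. The type of the value returned by spill is not documented in this section, so it is passed through unchanged.

```python
import zipfile


def spill_zip(bagger, src, dest):
    """Unpack the OCRD-ZIP at `src` into directory `dest` (hypothetical helper)."""
    # Fail early with a clear message if `src` is missing or not a zip archive.
    if not zipfile.is_zipfile(src):
        raise ValueError(f"not a zip archive: {src}")
    # `bagger` is expected to be a WorkspaceBagger instance.
    return bagger.spill(src, dest)
```

Given a WorkspaceBagger instance named bagger, spill_zip(bagger, 'example.ocrd.zip', 'unpacked') would unpack the archive into the unpacked directory; both paths are placeholders.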
- recreate_checksums(src, dest=None, overwrite=False)
(Re)creates the files containing the checksums of a bag
This function uses bag.py to create new files: manifest-sha512.txt and tagmanifest-sha512.txt for the bag. Also ‘Payload-Oxum’ in bag-info.txt will be set to the appropriate value.
- Parameters:
src (string) – Path to the bag. May be a zipped or unzipped bag
dest (string) – Path to where the result should be stored. Not needed if overwrite is set
overwrite (bool) – Replace bag with newly created bag
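To illustrate what the recreated values look like, the following standard-library sketch computes the lines of a SHA-512 payload manifest and the Payload-Oxum for an unzipped bag, following the general BagIt layout (payload files under data/). It is not the code used by recreate_checksums, and payload_manifest is a hypothetical name.

```python
import hashlib
from pathlib import Path


def payload_manifest(bag_dir):
    """Return (manifest_lines, payload_oxum) for the files under <bag_dir>/data."""
    bag_dir = Path(bag_dir)
    lines, octets, streams = [], 0, 0
    for path in sorted(p for p in (bag_dir / 'data').rglob('*') if p.is_file()):
        content = path.read_bytes()
        digest = hashlib.sha512(content).hexdigest()
        # One manifest line per payload file: checksum, then path relative to the bag root.
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
        octets += len(content)
        streams += 1
    # Payload-Oxum has the form "<total bytes>.<number of payload files>".
    return lines, f"{octets}.{streams}"
```

The documented call itself, given a WorkspaceBagger instance named bagger, would be bagger.recreate_checksums(src, overwrite=True) to replace the bag in place, or bagger.recreate_checksums(src, dest=new_path) to write the result elsewhere, where new_path is a placeholder.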