ocrd_models.ocrd_mets module


class ocrd_models.ocrd_mets.OcrdMets(**kwargs)[source]

Bases: ocrd_models.ocrd_xml_base.OcrdXmlDocument

API to a single METS file

add_agent(*args, **kwargs)[source]

Add an OcrdAgent to the list of agents in the metsHdr.

add_file(fileGrp, mimetype=None, url=None, ID=None, pageId=None, force=False, local_filename=None, ignore=False, **kwargs)[source]

Add an OcrdFile.

  • fileGrp (string) – Add file to mets:fileGrp with this USE attribute

  • mimetype (string) –

  • url (string) –

  • ID (string) –

  • pageId (string) –

  • force (boolean) – Whether to add the file even if a mets:file with the same ID already exists.

  • ignore (boolean) – Don’t look for existing files. Shift responsibility for preventing errors from duplicate ID to the user.

  • local_filename (string) –

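To make the parameters above concrete, the following is a minimal standard-library sketch of the kind of METS structure that adding a file is expected to produce. It is not the OcrdMets implementation: the helper name sketch_add_file, the sample ID and URL, the LOCTYPE value, and the choice to create a missing mets:fileGrp on the fly are all assumptions made for illustration, and pageId handling is left out.

```python
import xml.etree.ElementTree as ET

# Standard METS and XLink namespace URIs
NS_METS = "http://www.loc.gov/METS/"
NS_XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", NS_METS)
ET.register_namespace("xlink", NS_XLINK)


def sketch_add_file(file_sec, file_grp, file_id, mimetype, url):
    """Hypothetical helper: append a mets:file to the mets:fileGrp with the given USE."""
    grp = None
    for candidate in file_sec.findall(f"{{{NS_METS}}}fileGrp"):
        if candidate.get("USE") == file_grp:
            grp = candidate
            break
    if grp is None:
        # Assumption for this sketch: a missing group is created on demand
        grp = ET.SubElement(file_sec, f"{{{NS_METS}}}fileGrp", {"USE": file_grp})
    el_file = ET.SubElement(
        grp, f"{{{NS_METS}}}file", {"ID": file_id, "MIMETYPE": mimetype}
    )
    # The url ends up as @xlink:href of a mets:FLocat child
    ET.SubElement(
        el_file,
        f"{{{NS_METS}}}FLocat",
        {"LOCTYPE": "URL", f"{{{NS_XLINK}}}href": url},
    )
    return el_file


file_sec = ET.Element(f"{{{NS_METS}}}fileSec")
sketch_add_file(file_sec, "OCR-D-IMG", "FILE_0001", "image/tiff", "images/0001.tif")
xml_text = ET.tostring(file_sec, encoding="unicode")
print(xml_text)
```

The sketch only mirrors the attribute mapping described in the parameter list (USE, ID, MIMETYPE, xlink:href); duplicate-ID checks controlled by force and ignore are not modelled.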


Add a new mets:fileGrp.


  • fileGrp (string) – USE attribute of the new file group.

property agents

List all OcrdAgent objects.

static empty_mets(now=None)[source]

Create an empty METS file from the bundled template.

property file_groups

List the USE attributes of all mets:fileGrp.

find_all_files(*args, **kwargs)[source]

Like find_files but return a list of all results.

Equivalent to list(self.find_files(...))
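The difference between yielding results and returning a list can be illustrated with a toy generator. This sketch does not use OcrdMets at all; find_numbers is a hypothetical stand-in for any lazy search that yields its results one at a time.

```python
def find_numbers(limit):
    """Toy generator standing in for a lazy search that yields results."""
    for n in range(limit):
        yield n


# A generator is consumed lazily, one result at a time ...
lazy = find_numbers(3)
first = next(lazy)
remaining = list(lazy)  # only the results not yet consumed

# ... whereas wrapping the call in list() materializes every result at once
all_results = list(find_numbers(3))
```

A list is convenient when the results need to be counted, indexed, or iterated more than once; the generator form avoids building the whole list when only the first few matches matter.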

find_files(ID=None, fileGrp=None, pageId=None, mimetype=None, url=None, local_only=False)[source]

Search mets:file in this METS document and yield results.

The ID, fileGrp, url and mimetype parameters can be either a literal string or a regular expression if the string starts with // (double slash). If it is a regex, the leading // is removed and candidates are matched against the regex with re.fullmatch. If it is a literal string, comparison is done with string equality.

  • ID (string) – ID of the file

  • fileGrp (string) – USE of the fileGrp to list files of

  • pageId (string) – ID of physical page manifested by matching files

  • url (string) – @xlink:href of mets:FLocat of mets:file

  • mimetype (string) – MIMETYPE of matching files

  • local_only (boolean) – Whether to restrict results to local files
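The literal-versus-regex rule described above can be sketched in a few lines of plain Python. This is an illustration of the stated convention, not the library's code: the function name matches, the sample file group names, and the treatment of None as "no constraint" are assumptions.

```python
import re


def matches(query, candidate):
    """Hypothetical helper applying the '//' regex convention to one candidate."""
    if query is None:
        # Assumption: an unset parameter places no constraint on the result
        return True
    if query.startswith("//"):
        # Strip the leading '//' and require the regex to match the whole string
        return re.fullmatch(query[2:], candidate) is not None
    # Otherwise compare as a literal string
    return query == candidate


file_groups = ["OCR-D-IMG", "OCR-D-IMG-BIN", "OCR-D-SEG-LINE"]

literal_hits = [g for g in file_groups if matches("OCR-D-IMG", g)]
regex_hits = [g for g in file_groups if matches("//OCR-D-IMG.*", g)]
# re.fullmatch means a regex without a trailing wildcard behaves like a literal here
partial_hits = [g for g in file_groups if matches("//OCR-D-IMG", g)]
```

Because matching uses re.fullmatch, a pattern such as //OCR-D-IMG does not select OCR-D-IMG-BIN; a prefix search has to be written explicitly, for example as //OCR-D-IMG.*.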


List of files.


Get the pageId for an ocrd_file.


List all page IDs (optionally for a subset of file IDs)

property physical_pages

List all page IDs

remove_file(*args, **kwargs)[source]

Delete all files matching the query. Takes the same arguments as OcrdMets.find_files.

remove_file_group(USE, recursive=False, force=False)[source]

Remove a fileGrp (fixed USE) or fileGrps (regex USE)

  • USE (string) – USE attribute of the fileGrp to delete. Can be a regex if prefixed with //

  • recursive (boolean) – Whether to recursively delete all files in the group

  • force (boolean) – Do not raise an exception if file group doesn’t exist


Delete an OcrdFile.

set_physical_page_for_file(pageId, ocrd_file, order=None, orderlabel=None)[source]

Create a new physical page

property unique_identifier

Get the unique identifier by looking through mods:identifier

See specs for details.