ocrd.processor.builtin.merge_processor module
- class ocrd.processor.builtin.merge_processor.MergeProcessor(workspace: Workspace | None, ocrd_tool=None, parameter=None, input_file_grp=None, output_file_grp=None, page_id=None, version=None)
Bases: Processor
Instantiate, but do not set up (neither for processing nor other usage). If given, do parse and validate parameter.
Parameters:
  workspace (Workspace) – The workspace to process. If not None, then chdir to that directory. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
Keyword Arguments:
  parameter (string) – JSON of the runtime choices for ocrd-tool parameters. Can be None even for processing, but then needs to be set before running.
  input_file_grp (string) – comma-separated list of METS fileGrp used for input. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
  output_file_grp (string) – comma-separated list of METS fileGrp used for output. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
  page_id (string) – comma-separated list of METS physical page IDs to process (or empty for all pages). Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
- process_page_pcgts(*input_pcgts: OcrdPage | None, page_id: str | None = None) -> OcrdPageResultVariadicListWrapper
Merge PAGE segment hierarchy elements from all input file groups.
For each page, open and deserialise PAGE input files. Rename all elements of the segment hierarchy to new (clash-free) identifiers. Redefine the Border coordinates as the convex hull of all input borders. Then add all regions from all input files, concatenating them into a single ReadingOrder in the order of input file groups.
Produce a new PAGE output file by serialising the resulting hierarchy.
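The two geometric and bookkeeping steps described above (clash-free renaming and the convex hull of all borders) can be illustrated with a small standalone sketch. This is not the processor's actual implementation, which is not shown here. The function names `rename_ids`, `merge_borders` and `convex_hull`, the `f{index}_` prefix scheme, and the representation of borders as lists of integer points are all assumptions made for illustration only.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull via Andrew's monotone chain (hypothetical helper).

    Returns the hull vertices starting at the lexicographically smallest
    point; collinear points on the hull boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> int:
        # z-component of the cross product (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def merge_borders(borders: List[List[Point]]) -> List[Point]:
    """Combine several border polygons into one convex outline."""
    return convex_hull([point for border in borders for point in border])


def rename_ids(ids_per_input: List[List[str]]) -> List[str]:
    """Prefix every ID with its input index (assumed scheme) and
    concatenate the results in the order of the inputs."""
    merged: List[str] = []
    for index, ids in enumerate(ids_per_input):
        merged.extend(f"f{index}_{id_}" for id_ in ids)
    return merged


if __name__ == "__main__":
    borders = [
        [(0, 0), (10, 0), (10, 10), (0, 10)],
        [(5, 5), (20, 5), (20, 15), (5, 15)],
    ]
    print(merge_borders(borders))
    print(rename_ids([["r1", "r2"], ["r1"]]))
```

Prefixing by input index keeps identifiers unique even when two input files both contain, say, a region `r1`, and concatenating in input order mirrors how the merged ReadingOrder follows the order of the input file groups.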
- property metadata_filename
Relative location of the ocrd-tool.json file inside the package.
Used by metadata_location.
(Override if ocrd-tool.json is not in the root of the module, e.g. namespace/ocrd-tool.json or data/ocrd-tool.json.)
- property executable
The executable name of this processor tool. Taken from the runtime filename.
Used by ocrd_tool for lookup in metadata.
(Override if your entry-point name deviates from the executable name, or the processor gets instantiated from another runtime.)