ocrd.processor.builtin.shell_processor module
- class ocrd.processor.builtin.shell_processor.ShellProcessor(workspace: Workspace | None, ocrd_tool=None, parameter=None, input_file_grp=None, output_file_grp=None, page_id=None, version=None)
Bases: Processor
Instantiate, but do not set up (neither for processing nor other usage). If given, parse and validate parameter.
- Parameters:
workspace (Workspace) – The workspace to process. If not None, then chdir to that directory. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
- Keyword Arguments:
parameter (string) – JSON of the runtime choices for ocrd-tool parameters. Can be None even for processing, but then needs to be set before running.
input_file_grp (string) – comma-separated list of METS fileGrp used for input. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
output_file_grp (string) – comma-separated list of METS fileGrp used for output. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
page_id (string) – comma-separated list of METS physical page IDs to process (or empty for all pages). Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
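The string-valued keyword arguments above can be sketched as follows. This is a minimal illustration, not the library's own parsing code: the file group names are hypothetical, the "command" key is inferred from the process_page_file description on this page, and split_list is a helper invented for this example.

```python
import json

# Hypothetical values, chosen only for illustration.
parameter = json.dumps({"command": "cp @INFILE @OUTFILE"})  # JSON string of runtime choices
input_file_grp = "OCR-D-IMG,OCR-D-SEG"                      # comma-separated METS fileGrp names
page_id = ""                                                # empty means "all pages"


def split_list(value: str) -> list:
    """Split a comma-separated argument into its items (hypothetical helper).

    An empty string yields an empty list.
    """
    return [item.strip() for item in value.split(",") if item.strip()]


# Decode the JSON parameter string back into a dict of runtime choices.
choices = json.loads(parameter)
groups = split_list(input_file_grp)
pages = split_list(page_id)
```

Here choices is a dict with a single "command" entry, groups holds the two file group names, and pages is empty.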
- setup()
Prepare the processor for actual data processing, prior to changing to the workspace directory but after parsing parameters.
(Override this to load models into memory etc.)
- process_page_file(*input_files: OcrdFile | ClientSideOcrdFile | None) -> None
Process PAGE files via arbitrary command line on the shell.
For each selected physical page of the workspace, pass command to the shell, replacing:
- the string @INFILE with the PAGE input file path,
- the string @OUTFILE with the PAGE output file path.
Modify the resulting PAGE output file with our new @pcGtsId and metadata.
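The placeholder substitution described above can be sketched in a few lines. This is an assumption-laden illustration, not the ShellProcessor implementation: fill_command and run_on_page are hypothetical names, and shell-quoting the paths with shlex.quote is a choice made for this sketch that the source does not confirm.

```python
import shlex
import subprocess


def fill_command(command: str, infile: str, outfile: str) -> str:
    """Replace @INFILE and @OUTFILE with shell-quoted file paths (hypothetical helper)."""
    return (
        command
        .replace("@INFILE", shlex.quote(infile))
        .replace("@OUTFILE", shlex.quote(outfile))
    )


def run_on_page(command: str, infile: str, outfile: str) -> str:
    """Fill the placeholders, run the command through the shell, and return the filled command.

    Raises subprocess.CalledProcessError if the command exits with a non-zero status.
    """
    filled = fill_command(command, infile, outfile)
    subprocess.run(filled, shell=True, check=True)
    return filled
```

For example, fill_command("cp @INFILE @OUTFILE", "in.xml", "out.xml") yields "cp in.xml out.xml". A path containing spaces is wrapped in single quotes, so the shell treats it as one argument.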
- property metadata_filename
Relative location of the ocrd-tool.json file inside the package.
Used by metadata_location.
(Override if ocrd-tool.json is not in the root of the module, e.g. namespace/ocrd-tool.json or data/ocrd-tool.json).
- property executable
The executable name of this processor tool. Taken from the runtime filename.
Used by ocrd_tool for lookup in metadata.
(Override if your entry-point name deviates from the executable name, or the processor gets instantiated from another runtime.)
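Overriding the metadata_filename and executable properties might look like the sketch below. To stay self-contained it uses StandInProcessor, a hypothetical stand-in for the real Processor base class. Its default values, the subclass name, and the executable name "ocrd-my-processor" are all assumptions made for illustration.

```python
class StandInProcessor:
    """Hypothetical stand-in for the real Processor base class (not the library's code)."""

    @property
    def metadata_filename(self) -> str:
        # Assumed default: ocrd-tool.json sits in the root of the module.
        return "ocrd-tool.json"

    @property
    def executable(self) -> str:
        # Assumed placeholder; the real property derives this from the runtime filename.
        return "ocrd-stand-in"


class MyProcessor(StandInProcessor):
    """Hypothetical processor whose ocrd-tool.json lives in a data/ subdirectory."""

    @property
    def metadata_filename(self) -> str:
        return "data/ocrd-tool.json"

    @property
    def executable(self) -> str:
        # Fixed name, for the case where the entry point is named differently.
        return "ocrd-my-processor"
```

In a real subclass, only the properties whose defaults do not fit need to be overridden.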