ocrd.processor.builtin.shell_processor module

class ocrd.processor.builtin.shell_processor.ShellProcessor(workspace: Workspace | None, ocrd_tool=None, parameter=None, input_file_grp=None, output_file_grp=None, page_id=None, version=None)

Bases: Processor

Instantiate the processor, but do not set it up (neither for processing nor for other usage). If parameter is given, parse and validate it.

Parameters:

workspace (Workspace) – The workspace to process. If not None, then chdir to that directory. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.

Keyword Arguments:
  • parameter (string) – JSON of the runtime choices for ocrd-tool parameters. Can be None even for processing, but then needs to be set before running.

  • input_file_grp (string) – comma-separated list of METS fileGrp used for input. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.

  • output_file_grp (string) – comma-separated list of METS fileGrp used for output. Deprecated since version 3.0: Should be None here, but then needs to be set before processing.

  • page_id (string) – comma-separated list of METS physical page IDs to process (or empty for all pages). Deprecated since version 3.0: Should be None here, but then needs to be set before processing.
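As a rough illustration of the argument formats listed above (not OCR-D's own parsing code), the following sketch decodes a JSON parameter string and splits comma-separated lists. The `command` key, the fileGrp names, and the helper `split_list` are assumptions made up for this example.

```python
import json


def split_list(value):
    """Split a comma-separated string into a list; None or empty yields []."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical values, as they might arrive from the command line.
parameter = json.loads('{"command": "cp @INFILE @OUTFILE"}')
input_file_grps = split_list("OCR-D-SEG,OCR-D-IMG")
page_ids = split_list("")  # empty is read here as "all pages"

print(parameter["command"])
print(input_file_grps)
```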

setup()

Prepare the processor for actual data processing, prior to changing to the workspace directory but after parsing parameters.

(Override this to load models into memory etc.)

process_page_file(*input_files: OcrdFile | ClientSideOcrdFile | None) → None

Process PAGE files via arbitrary command line on the shell.

For each selected physical page of the workspace, pass command to the shell, replacing:

  • the string @INFILE with the PAGE input file path,

  • the string @OUTFILE with the PAGE output file path.

Modify the resulting PAGE output file with our new @pcGtsId and metadata.
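The placeholder substitution described above can be sketched as follows. This is a minimal illustration and not the processor's actual code: the helper names `fill_command` and `run_on_page` are hypothetical, and the shell quoting via `shlex.quote` is an extra precaution added for this example rather than documented behaviour.

```python
import shlex
import subprocess


def fill_command(template: str, infile: str, outfile: str) -> str:
    """Replace the @INFILE and @OUTFILE placeholders with shell-quoted paths."""
    return (template
            .replace("@INFILE", shlex.quote(infile))
            .replace("@OUTFILE", shlex.quote(outfile)))


def run_on_page(template: str, infile: str, outfile: str) -> int:
    """Run the filled-in command through the shell and return its exit code."""
    command = fill_command(template, infile, outfile)
    return subprocess.run(command, shell=True, check=False).returncode


# Hypothetical file paths; only the string substitution is shown here.
print(fill_command("cp @INFILE @OUTFILE", "in/p 1.xml", "out/p1.xml"))
```

A caller would invoke `run_on_page` once per page and treat a non-zero return value as a failure for that page.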

property metadata_filename

Relative location of the ocrd-tool.json file inside the package.

Used by metadata_location.

(Override if ocrd-tool.json is not in the root of the module, e.g. namespace/ocrd-tool.json or data/ocrd-tool.json).
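A minimal sketch of such an override is shown below. To stay runnable without OCR-D installed, it uses a hypothetical stand-in base class instead of the real Processor; the default value `ocrd-tool.json` in the stand-in and the class name `MyProcessor` are assumptions for illustration.

```python
class ProcessorStandIn:
    """Hypothetical stand-in for the Processor base class."""

    @property
    def metadata_filename(self) -> str:
        # Assumed default: ocrd-tool.json in the root of the module.
        return "ocrd-tool.json"


class MyProcessor(ProcessorStandIn):
    @property
    def metadata_filename(self) -> str:
        # The metadata file lives in a data/ subdirectory of the package.
        return "data/ocrd-tool.json"


print(MyProcessor().metadata_filename)
```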

property executable

The executable name of this processor tool. Taken from the runtime filename.

Used by ocrd_tool for lookup in metadata.

(Override if your entry-point name deviates from the executable name, or the processor gets instantiated from another runtime.)
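One plausible reading of "taken from the runtime filename" is that the name is the final path component of the running script. The sketch below illustrates only that idea; the function `executable_name` and the example path are hypothetical, and the library's actual lookup may differ.

```python
from pathlib import PurePosixPath


def executable_name(runtime_filename: str) -> str:
    """Return the last path component of the given (POSIX-style) filename."""
    return PurePosixPath(runtime_filename).name


# Hypothetical installation path of an entry-point script.
print(executable_name("/usr/local/bin/ocrd-my-tool"))
```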