ocrd.processor.helpers module

Helper methods for running and documenting processors

ocrd.processor.helpers.generate_processor_help(ocrd_tool, processor_instance=None, subcommand=None)[source]

Generate a string describing the full CLI of this processor including params.

Parameters:
  • ocrd_tool (dict) – this processor’s tools section of the module’s ocrd-tool.json

  • processor_instance – the processor implementation (for adding any module/class/function docstrings)

ocrd.processor.helpers.run_cli(executable, mets_url=None, resolver=None, workspace=None, page_id=None, overwrite=None, log_level=None, log_filename=None, input_file_grp=None, output_file_grp=None, parameter=None, working_dir=None, mets_server_url=None)[source]

Open a workspace and run a processor on the command line.

If workspace is not None, reuse it. Otherwise, instantiate a Workspace for mets_url (and working_dir) using ocrd.Resolver.workspace_from_url() (i.e. open or clone a local workspace).

Run the processor CLI executable on the workspace, passing:

  • the workspace

  • page_id

  • input_file_grp

  • output_file_grp

  • parameter (after applying any parameter_override settings)

(This creates output files and updates the METS in the filesystem.)

Parameters:

executable (string) – Executable name of the module processor.
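To make the argument passing above concrete, here is a minimal sketch of how such a command line could be assembled. The helper name build_cli_args is hypothetical and not part of ocrd. The long option names follow the OCR-D command-line interface specification, and the actual run_cli implementation may build its command differently.

```python
import json


def build_cli_args(executable, mets_url, working_dir=None, page_id=None,
                   overwrite=False, log_level=None, input_file_grp=None,
                   output_file_grp=None, parameter=None, mets_server_url=None):
    """Hypothetical helper: build an argv list for an OCR-D processor CLI."""
    args = [executable, "--mets", mets_url]
    # Optional flags are only added when a value was supplied.
    optional = [
        ("--working-dir", working_dir),
        ("--mets-server-url", mets_server_url),
        ("--log-level", log_level),
        ("--page-id", page_id),
        ("--input-file-grp", input_file_grp),
        ("--output-file-grp", output_file_grp),
    ]
    for flag, value in optional:
        if value is not None:
            args += [flag, value]
    # Parameters are serialized to a JSON string (an assumption of this sketch).
    if parameter is not None:
        args += ["--parameter", json.dumps(parameter)]
    if overwrite:
        args.append("--overwrite")
    return args


# Example values are placeholders, not taken from a real workspace.
args = build_cli_args(
    "ocrd-dummy", "mets.xml",
    page_id="PHYS_0001",
    input_file_grp="OCR-D-IMG",
    output_file_grp="OCR-D-OUT",
    parameter={"copy_files": True},
)
print(args)
```

A list built this way could be handed to subprocess.run(args, check=True). Passing a list rather than a shell string avoids quoting problems in the JSON parameter value.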

ocrd.processor.helpers.run_processor(processorClass, mets_url=None, resolver=None, workspace=None, page_id=None, log_level=None, input_file_grp=None, output_file_grp=None, show_resource=None, list_resources=False, parameter=None, parameter_override=None, working_dir=None, mets_server_url=None, instance_caching=False)[source]

Instantiate a Pythonic processor, open a workspace, run the processor and save the workspace.

If workspace is not None, reuse it. Otherwise, instantiate a Workspace for mets_url (and working_dir) using ocrd.Resolver.workspace_from_url() (i.e. open or clone a local workspace).

Instantiate a Python object for processorClass, passing:

  • the workspace

  • page_id

  • input_file_grp

  • output_file_grp

  • parameter (after applying any parameter_override settings)

Warning: Avoid setting the instance_caching flag to True. It may have unexpected side effects. This flag supports an experimental feature that we would like to adopt in the future.

Run the processor on the workspace (creating output files in the filesystem).

Finally, write back the workspace (updating the METS in the filesystem).

Parameters:

processorClass (object) – Python class of the module processor.
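The phrase "after applying any parameter_override settings" can be illustrated with a small sketch. It assumes that overrides arrive as (key, value) pairs of strings and that each value is parsed as JSON where possible, falling back to the raw string. The function name apply_overrides is hypothetical, and the merging logic inside ocrd may differ in detail.

```python
import json


def apply_overrides(parameter, overrides):
    """Hypothetical helper: merge (key, value) string overrides into a dict."""
    merged = dict(parameter or {})  # copy, so the caller's dict is untouched
    for key, raw in overrides or []:
        try:
            # Values that are valid JSON (numbers, booleans, lists) are decoded.
            merged[key] = json.loads(raw)
        except json.JSONDecodeError:
            # Anything else is kept as a plain string.
            merged[key] = raw
    return merged


# Parameter names here are placeholders for illustration only.
base = {"dpi": 300, "model": "default"}
print(apply_overrides(base, [("dpi", "600"), ("model", "custom")]))
```

Overrides listed later win over earlier ones and over the base parameter, which matches the usual expectation for command-line overrides.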