ocrd_utils.config module

Most behavior of OCR-D is controlled via command-line flags or keyword args. Some behavior is global or too cumbersome to handle via explicit code and better solved by using environment variables.

OcrdEnvConfig is a base class that streamlines this. It is meant to be subclassed in the ocrd package, which supplies the actual values.
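The idea of a central registry of documented environment variables can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual ocrd_utils implementation: the names EnvVar, EnvRegistry and DEMO_MAX_CACHE are hypothetical, and attribute-style lookup is only one possible design.

```python
import os


class EnvVar:
    """A documented environment variable (hypothetical stand-in)."""

    def __init__(self, name, description, parser=str, default=None):
        self.name = name
        self.description = description
        self.parser = parser
        self.default = default


class EnvRegistry:
    """Registry that resolves registered variables as attributes."""

    def __init__(self):
        self._variables = {}

    def add(self, name, description, **kwargs):
        # Register a variable together with its documentation.
        self._variables[name] = EnvVar(name, description, **kwargs)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        variables = self.__dict__.get("_variables", {})
        if name not in variables:
            raise AttributeError(name)
        var = variables[name]
        raw = os.environ.get(name)
        # Read the environment at access time, so later changes are seen.
        return var.default if raw is None else var.parser(raw)


registry = EnvRegistry()
registry.add("DEMO_MAX_CACHE", "Maximum cache size (hypothetical).",
             parser=int, default=128)
```

Because the environment is read on every access, changing os.environ after registration changes the value returned by the registry.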


If set to true, access to the METS file is cached, speeding up in-memory search and modification.


Maximum number of processor instances (for each set of parameters) to be kept in memory (including loaded models) for processing workers or processor servers. (Default: “128”)


Maximum number of processor threads for page-parallel processing (within each Processor’s selected page range, independent of the number of Processing Workers or Processor Servers). If set >1, then a METS Server must be used for METS synchronisation. (Default: “1”)


Timeout in seconds for processing a single page. If set >0, when exceeded, the same as OCRD_MISSING_OUTPUT applies. (Default: “0”)


Whether to enable gathering runtime statistics on the ocrd.profile logger (comma-separated):

  • CPU: yields CPU and wall-time

  • RSS: also yields peak memory (resident set size)

  • PSS: also yields peak memory (proportional set size)

 (Default: “”)
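A comma-separated flag value like this one can be turned into a set for easy membership tests. The following is a sketch only: the helper name parse_profile_flags is hypothetical, and the whitespace trimming and upper-casing are assumptions, not documented behavior of the library.

```python
def parse_profile_flags(raw):
    """Split a comma-separated value into a set of upper-case flags."""
    # Empty segments (e.g. from an empty string) are dropped.
    return {part.strip().upper() for part in raw.split(",") if part.strip()}


flags = parse_profile_flags("CPU,RSS")
# Memory statistics are wanted if either memory flag is present.
measure_memory = bool(flags & {"RSS", "PSS"})
```

With the default empty string, the resulting set is empty and no statistics would be gathered.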


If set, then the CPU profile is written to this file for later perusal with an analysis tool such as snakeviz.


Number of times to retry failed attempts for downloads of resources or workspace files.


Timeout in seconds for connecting or reading (comma-separated) when downloading.
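One plausible way to interpret a comma-separated timeout value is as a (connect, read) pair, the form that HTTP clients such as requests accept for their timeout argument. The helper below is a hypothetical sketch; in particular, applying a single value to both phases is an assumption and not something the description above states.

```python
def parse_timeouts(raw):
    """Parse 'connect,read' (or a single number) into a 2-tuple of floats."""
    values = [float(part) for part in raw.split(",")]
    if len(values) == 1:
        # Assumption: one value applies to both connecting and reading.
        return values[0], values[0]
    if len(values) == 2:
        return values[0], values[1]
    raise ValueError(f"expected one or two numbers, got {raw!r}")


connect_timeout, read_timeout = parse_timeouts("10,20")
```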


Whether to download files not present locally during processing (Default: “True”)


How to deal with missing input files (for some fileGrp/pageId) during processing: 

  • SKIP: ignore and proceed with next page’s input

  • ABORT: throw MissingInputFile

 (Default: “SKIP”)


How to deal with missing output files (for some fileGrp/pageId) during processing: 

  • SKIP: ignore and proceed processing next page

  • COPY: fall back to copying input PAGE to output fileGrp for page

  • ABORT: re-throw whatever caused processing to fail

 (Default: “SKIP”)


Maximal rate of skipped/fallback pages among all processed pages before aborting (decimal fraction, ignored if negative). (Default: “0.1”)
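The rate check described above can be illustrated with a small function. This is a sketch under assumptions: the name too_many_skipped is hypothetical, and whether the real comparison is strict (greater than) or inclusive is not stated in the description.

```python
def too_many_skipped(skipped, processed, max_rate):
    """Return True if the share of skipped/fallback pages exceeds max_rate."""
    # A negative limit disables the check; no pages means nothing to judge.
    if max_rate < 0 or processed == 0:
        return False
    # Assumption: abort only when the rate is strictly above the limit.
    return skipped / processed > max_rate
```

With the default of 0.1, one skipped page out of ten would still be tolerated in this sketch, while two out of ten would trigger an abort.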


How to deal with already existing output files (for some fileGrp/pageId) during processing: 

  • SKIP: ignore and proceed processing next page

  • OVERWRITE: force writing result to output fileGrp for page

  • ABORT: re-throw FileExistsError

 (Default: “SKIP”)


Default address of Processing Server to connect to (for ocrd network client processing). (Default: “”)


How many seconds to sleep before trying again. (Default: “10”)


Timeout for a blocking ocrd network client (in seconds). (Default: “3600”)


Default address of Workflow Server to connect to (for ocrd network client workflow). (Default: “”)


Default address of Workspace Server to connect to (for ocrd network client workspace). (Default: “”)


Number of attempts for a RabbitMQ client to connect before failing. (Default: “3”)


Controls AMQP heartbeat timeout negotiation (in seconds) during connection tuning. An integer value always overrides the value proposed by the broker. Use 0 to deactivate the heartbeat. (Default: “0”)


The root directory where all METS Server-related socket files are created. (Default: “/tmp/ocrd_network_sockets”)


The root directory where all ocrd_network-related log files are stored. (Default: “/tmp/ocrd_network_logs”)

  • HOME

Directory to look for ocrd_logging.conf, fallback for unset XDG variables. (Default: “/home/kba”)


Directory to look for ./ocrd-resources/* (i.e. ocrd resmgr data location) (Default: “/home/kba/.local/share”)


Directory to look for ./ocrd/resources.yml (i.e. ocrd resmgr user database) (Default: “/home/kba/.config”)
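The defaults shown for the last two directories follow the XDG Base Directory convention, where an unset variable falls back to a path below the home directory. A minimal sketch of that fallback, assuming a POSIX system; the helper name xdg_dir and its env parameter are hypothetical:

```python
import os


def xdg_dir(variable, home_relative, env=None):
    """Return the value of an XDG variable, or a fallback below HOME."""
    env = os.environ if env is None else env
    value = env.get(variable)
    if value:
        return value
    # Unset or empty: fall back to a directory relative to HOME.
    home = env.get("HOME", os.path.expanduser("~"))
    return os.path.join(home, home_relative)


data_home = xdg_dir("XDG_DATA_HOME", ".local/share")
config_home = xdg_dir("XDG_CONFIG_HOME", ".config")
```

Passing a dictionary as env makes the lookup easy to try out without touching the real environment.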


Print information about the logging setup to STDERR (Default: “False”)

class ocrd_utils.config.OcrdEnvVariable(name, description, parser=<class 'str'>, validator=<function OcrdEnvVariable.<lambda>>, default=[False, None])[source]

Bases: object

An environment variable for use in OCR-D.

Parameters:
  • name (str) – Name of the environment variable

  • description (str) – Description of what the variable is used for.

Keyword Arguments:
  • parser (callable) – Function to transform the raw (string) value to whatever is needed.

  • validator (callable) – Function to validate that the raw (string) value is parseable.

  • default (tuple(bool, any)) – 2-tuple, first element is a bool whether there is a default value defined and second element contains that default value, which can be a callable for deferred evaluation
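The 2-tuple form of default, including deferred evaluation through a callable, might be resolved along these lines. This is an illustrative sketch, not the library's code: the function name resolve_default is hypothetical, and raising KeyError when no default is defined is an assumption.

```python
def resolve_default(default):
    """Resolve a (has_default, value) pair to a concrete value."""
    has_default, value = default
    if not has_default:
        # Assumption: a missing default is reported as an error.
        raise KeyError("no default value defined")
    # A callable is evaluated only now, i.e. at lookup time.
    return value() if callable(value) else value
```

Keeping an explicit flag in the first element allows None itself to be a valid default, which a bare sentinel of None could not express.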

describe(wrap_text=True, indent_text=True)[source]

Output help information on a config option.

If option.description is a multiline string with complex formatting (e.g. markdown lists), replace empty lines with  and set wrap_text to False.

class ocrd_utils.config.OcrdEnvConfig[source]

Bases: object

add(name, *args, **kwargs)[source]
describe(name, *args, **kwargs)[source]