ocrd.cli package

OCR-D Command-line interface

ocrd

Entry point of the multi-purpose CLI for OCR-D

ocrd [OPTIONS] COMMAND [ARGS]...

Options

--version

Show the version and exit.

-l, --log-level <log_level>

Log level

Options:

OFF | ERROR | WARN | INFO | DEBUG | TRACE
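As a minimal sketch of the usage line above, the following Python snippet assembles an argument list in which the group-level --log-level option precedes the subcommand. The helper name build_ocrd_argv is hypothetical and not part of ocrd; it only builds the list and does not run anything.

```python
from typing import List

# Log levels as listed for --log-level above.
LOG_LEVELS = ("OFF", "ERROR", "WARN", "INFO", "DEBUG", "TRACE")


def build_ocrd_argv(log_level: str, command: str, *args: str) -> List[str]:
    """Return an argv list of the form: ocrd --log-level LEVEL COMMAND [ARGS]...

    Hypothetical helper for illustration; it is not provided by ocrd.
    """
    if log_level not in LOG_LEVELS:
        raise ValueError(
            f"unknown log level {log_level!r}; expected one of {LOG_LEVELS}"
        )
    # Group-level options go before the subcommand, per the usage line.
    return ["ocrd", "--log-level", log_level, command, *args]


argv = build_ocrd_argv("DEBUG", "workspace", "--help")
print(" ".join(argv))
```

On a system where ocrd is installed, the resulting list could be handed to subprocess.run(argv, check=True).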

Variables:
PATH
Search path for processor executables
(affects ocrd process and ocrd resmgr)
HOME
Directory to look for ocrd_logging.conf,
fallback for unset XDG variables. (Default:
“/home/kba”)
XDG_CONFIG_HOME
Directory to look for ./ocrd/resources.yml (i.e.
ocrd resmgr user database) (Default:
“/home/kba/.config”)
XDG_DATA_HOME
Directory to look for ./ocrd-resources/* (i.e.
ocrd resmgr data location) (Default:
“/home/kba/.local/share”)
OCRD_DOWNLOAD_RETRIES
Number of times to retry failed attempts for
downloads of resources or workspace files.
OCRD_DOWNLOAD_TIMEOUT
Timeouts in seconds for connecting and reading
(comma-separated) when downloading.
OCRD_DOWNLOAD_INPUT
Whether to download files not present locally
during processing (Default: “True”)
OCRD_MISSING_INPUT
How to deal with missing input files
(for some fileGrp/pageId) during processing:

- SKIP: ignore and proceed with next page’s input
- ABORT: throw MissingInputFile

(Default: “SKIP”)
OCRD_MISSING_OUTPUT
How to deal with missing output files
(for some fileGrp/pageId) during processing:

- SKIP: ignore and proceed processing next page
- COPY: fall back to copying input PAGE to output fileGrp for page
- ABORT: re-throw whatever caused processing to fail

(Default: “SKIP”)
OCRD_EXISTING_OUTPUT
How to deal with already existing output files
(for some fileGrp/pageId) during processing:

- SKIP: ignore and proceed processing next page
- OVERWRITE: force writing result to output fileGrp for page
- ABORT: re-throw FileExistsError

(Default: “SKIP”)
OCRD_METS_CACHING
If set to true, access to the METS file is
cached, speeding up in-memory search and
modification.
OCRD_MAX_PROCESSOR_CACHE
Maximum number of processor instances (for each
set of parameters) to be kept in memory (including
loaded models) for processing workers or processor
servers. (Default: “128”)
OCRD_NETWORK_CLIENT_POLLING_SLEEP
How many seconds a polling ocrd network client
sleeps before trying again. (Default: “10”)
OCRD_NETWORK_CLIENT_POLLING_TIMEOUT
Timeout for a blocking ocrd network client (in
seconds). (Default: “3600”)
OCRD_NETWORK_SERVER_ADDR_PROCESSING
Default address of Processing Server to connect to
(for ocrd network client processing). (Default:
“”)
OCRD_NETWORK_SERVER_ADDR_WORKFLOW
Default address of Workflow Server to connect to
(for ocrd network client workflow). (Default:
“”)
OCRD_NETWORK_SERVER_ADDR_WORKSPACE
Default address of Workspace Server to connect to
(for ocrd network client workspace). (Default:
“”)
OCRD_NETWORK_RABBITMQ_CLIENT_CONNECT_ATTEMPTS
Number of attempts for a RabbitMQ client to
connect before failing. (Default: “3”)
OCRD_NETWORK_RABBITMQ_HEARTBEAT
Controls AMQP heartbeat timeout (in seconds)
negotiation during connection tuning. An integer
value always overrides the value proposed by
broker. Use 0 to deactivate heartbeat.
(Default: “0”)
OCRD_PROFILE_FILE
If set, then the CPU profile is written to this
file for later perusal with analysis tools like
snakeviz
OCRD_PROFILE
Whether to enable gathering runtime statistics
on the ocrd.profile logger (comma-separated):

- CPU: yields CPU and wall-time
- RSS: also yields peak memory (resident set size)
- PSS: also yields peak memory (proportional set size)

(Default: “”)
OCRD_NETWORK_SOCKETS_ROOT_DIR
The root directory where all METS Server related
socket files are created (Default:
“/tmp/ocrd_network_sockets”)
OCRD_NETWORK_LOGS_ROOT_DIR
The root directory where all ocrd_network related
log files are stored (Default:
“/tmp/ocrd_network_logs”)
OCRD_LOGGING_DEBUG
Print information about the logging setup to
STDERR (Default: “False”)
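
The variables above are read from the process environment, so one way to apply them is to prepare an environment mapping before launching ocrd. The sketch below checks the three policy variables against the values documented in this list and merges them into a copy of a base environment. The helper name ocrd_environment is hypothetical, and the strict, case-sensitive check is an assumption of this example that may be stricter than ocrd itself.

```python
import os
from typing import Dict, Mapping, Optional

# Allowed values as documented in the variable list above.
POLICIES = {
    "OCRD_MISSING_INPUT": ("SKIP", "ABORT"),
    "OCRD_MISSING_OUTPUT": ("SKIP", "COPY", "ABORT"),
    "OCRD_EXISTING_OUTPUT": ("SKIP", "OVERWRITE", "ABORT"),
}


def ocrd_environment(
    overrides: Mapping[str, str],
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of `base` (default: os.environ) with `overrides` applied.

    Hypothetical helper for illustration; it is not provided by ocrd.
    Policy variables are checked against the documented values; any other
    variable is passed through unchecked.
    """
    env = dict(os.environ if base is None else base)
    for name, value in overrides.items():
        allowed = POLICIES.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name}={value!r}; expected one of {allowed}")
        env[name] = value
    return env


env = ocrd_environment(
    {
        "OCRD_MISSING_OUTPUT": "COPY",
        "OCRD_EXISTING_OUTPUT": "OVERWRITE",
        "OCRD_DOWNLOAD_RETRIES": "3",
    },
    base={"PATH": "/usr/bin"},  # small stand-in environment for the example
)
print(env["OCRD_MISSING_OUTPUT"], env["OCRD_EXISTING_OUTPUT"])
```

The returned mapping can be passed as the env argument of subprocess.run when starting ocrd, which leaves the calling process's own environment unchanged.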

Commands

bashlib

Work with bash library

log

Logging

network

Managing network components

ocrd-tool

Work with ocrd-tool.json JSON_FILE

process

Process a series of tasks

resmgr

Managing processor resources

validate

All the validation in one CLI

workspace

Managing workspaces

zip

Bag/Spill/Validate OCRD-ZIP bags

Submodules