openapi: 3.0.1
info:
  title: OCR-D Web API
  description: |
    # HTTP API for offering OCR-D processing

    > This document defines the [data model](#/components/schemas) and various HTTP APIs related to OCR-D.

    ## OCR-D API compatibility

    An implementation may claim compatibility with an `OCR-D ${N} API v${V}` if

    * it implements all the methods tagged `${N}`
    * at major version `${V}` of this API definition

    ## Media types

    ### `application/json`

    Content serialized as `application/json` is defined by the [data model](#/components/schemas)

    ### `application/vnd.ocrd+zip`

    Defined in [https://ocr-d.de/en/spec/ocrd_zip](https://ocr-d.de/en/spec/ocrd_zip)
  contact:
    email: info@ocr-d.de
  license:
    name: Apache 2.0
    url: 'https://www.apache.org/licenses/LICENSE-2.0.html'
  version: 0.1.0
externalDocs:
  description: OCR-D Website
  url: 'https://ocr-d.de'
servers:
  - url: 'https://example.org/ocrd/v1'
    description: The URL of your server offering the OCR-D API.
tags:
  - name: discovery
    description: Discovery of capabilities of a server
  - name: processing
    description: OCR-D processing and processors
  - name: workflow
    description: Processing of workflows
  - name: workspace
    description: mets.xml-indexed BagIt container
  - name: training
    description: Training of OCR engines
  - name: acl
    description: Authorization and authentication
paths:
  '/processor':
    get:
      tags: ['processing', 'discovery']
      operationId: listProcessors
      responses:
        '200':
          description: A list of names of all available processors
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/OcrdExecutable'
  '/processor/info/{executable}':
    get:
      tags: ['processing', 'discovery']
      operationId: getProcessor
      parameters:
        - name: executable
          in: path
          description: Name of the executable
          schema:
            $ref: '#/components/schemas/OcrdExecutable'
          required: true
      responses:
        '200':
          description: Get this processor
          content:
            application/json:
              schema:
                type: object
                description: Ocrd-tool-json of the processor
        '404':
          description: Processor not available
  '/processor/run/{executable}':
    post:
      tags: ['processing']
      description: Run a processor
      operationId: runProcessor
      parameters:
        - name: executable
          in: path
          description: Name of the executable
          schema:
            $ref: '#/components/schemas/OcrdExecutable'
          required: true
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ProcessorArgs'
        required: true
      responses:
        '200':
          description: Return the ProcessorJob
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ProcessorJob'
  '/processor/job/{job-id}':
    get:
      tags: ['processing']
      description: Get information about a ProcessorJob
      operationId: getProcessorJob
      parameters:
        - name: job-id
          in: path
          description: ID of the ProcessorJob
          schema:
            type: string
          required: true
      responses:
        '200':
          description: Return ProcessorJob
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ProcessorJob'
        '404':
          description: ProcessorJob not found
  '/processor/log/{job-id}':
    get:
      tags: ['processing']
      description: Get the logs of a ProcessorJob
      operationId: getProcessorJobLog
      parameters:
        - name: job-id
          in: path
          description: ID of the ProcessorJob
          schema:
            type: string
          required: true
      responses:
        '200':
          description: Return log
          content:
            'text/plain':
              schema:
                type: string
        '404':
          description: Logs not found
  '/workflow':
    post:
      tags: ['workflow']
      description: Upload/Register a new workflow file
      operationId: postWorkflow
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                workflow:
                  type: string
                  format: binary
              required:
                - workflow
      responses:
        '200':
          description: Created a new OCR-D workflow
          content:
            application/json:
              schema:
                type: object
                properties:
                  workflow_id:
                    type: string
                    description: ID of the workflow created
        '422':
          description: Invalid workflow
  '/workflow/{workflow-id}':
    put:
      tags: ['workflow']
      description: Update/Replace a workflow file
      operationId: putWorkflow
      parameters:
        - name: workflow-id
          in: path
          description: ID of the workflow
          schema:
            type: string
          required: true
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                workflow:
                  type: string
                  format: binary
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: string
                description: ID of the workflow script
        '404':
          description: Workflow to update does not exist
        '422':
          description: Invalid workflow
    get:
      tags: ['workflow', 'discovery']
      description: Download a workflow file
      operationId: getWorkflow
      parameters:
        - name: workflow-id
          in: path
          description: ID of the workflow
          schema: {type: string}
          required: true
      responses:
        '200':
          description: Return workflow
          content:
            text/plain:
              schema:
                type: string
                format: binary
        '404':
          description: Workflow not available
  '/workflow/run':
    post:
      tags: ['workflow', 'processing']
      description: Run a workflow
      operationId: runWorkflow
      parameters:
        - name: mets_path
          in: query
          description: Path to the METS file
          schema:
            type: string
          required: true
        - name: agent_type
          in: query
          description: 'Type of agent to run the processor: worker or server'
          schema:
            type: string
            default: worker
          required: false
        - name: page_id
          in: query
          description: Page ID or page range to process
          schema:
            type: string
          required: false
        - name: page_wise
          in: query
          required: false
          schema:
            type: boolean
            default: false
          description: Split requests into pages to process in parallel
        - name: workflow_callback_url
          in: query
          description: URL to receive a callback after the workflow finishes
          schema:
            type: string
          required: false
        - name: workflow_id
          in: query
          required: false
          description: Use a previously uploaded workflow instead of the workflow in the request body
          schema:
            type: string
      requestBody:
        description: Text file with ocrd process syntax
        required: false
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                workflow:
                  type: string
                  format: binary
                  example: |
                    cis-ocropy-binarize -I DEFAULT -O BIN
                    anybaseocr-crop -I BIN -O CROP
                    cis-ocropy-denoise -I CROP -O DENOISE -P dpi 300 -P level-of-operation page
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/WorkflowJob'
        '404':
          description: Workspace not found
        '422':
          description: Input validation error
  '/workflow/job/{job-id}':
    get:
      tags: ['workflow']
      description: Get information about a workflow job
      operationId: getWorkflowJob
      parameters:
        - name: job-id
          in: path
          required: true
          description: ID of the workflow job
          schema:
            type: string
      responses:
        '200':
          description: Return status of the workflow job
          content:
            application/json:
              schema:
                type: object
                description: Processor jobs of the workflow with their status
        '404':
          description: Workflow job with this job-id not found
  '/workspace':
    get:
      tags: ['workspace']
      operationId: getWorkspaces
      responses:
        '200':
          description: Successful operation
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Workspace'
    post:
      tags: ['workspace']
      operationId: createWorkspace
      summary: Post a new workspace
      requestBody:
        description: OCRD-ZIP of the new workspace
        content:
          'application/vnd.ocrd+zip':
            schema:
              type: string
              format: binary
        required: true
      responses:
        '201':
          description: Created workspace
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Workspace'
        '400':
          description: Invalid workspace
  '/workspace/{workspace-id}':
    put:
      tags: ['workspace']
      operationId: replaceWorkspace
      summary: Replace an existing workspace
      parameters:
        - name: workspace-id
          in: path
          description: ID of the workspace
          schema:
            type: string
          required: true
      requestBody:
        description: OCRD-ZIP of the updated workspace
        content:
          'application/vnd.ocrd+zip':
            schema:
              type: string
              format: binary
        required: true
      responses:
        '200':
          description: Workspace replaced or created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Workspace'
        '400':
          description: Invalid workspace
    get:
      tags: ['workspace']
      operationId: getWorkspace
      parameters:
        - name: workspace-id
          in: path
          description: ID of the workspace
          schema: {type: string}
          required: true
      responses:
        '200':
          description: Return workspace or OCRD-ZIP
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Workspace'
            application/vnd.ocrd+zip: {}
        '404':
          description: Workspace not found
          content: {}
        '410':
          description: Workspace was already deleted
          content: {}
    delete:
      tags: ['workspace']
      operationId: deleteWorkspace
      parameters:
        - name: workspace-id
          in: path
          description: ID of the workspace
          schema:
            type: string
          required: true
      responses:
        '200':
          description: Workspace deleted
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Workspace'
        '404':
          description: Workspace not found
          content: {}
        '410':
          description: Workspace was already deleted
          content: {}
  '/discovery':
    get:
      tags: ['discovery']
      operationId: discover
      responses:
        '200':
          description: Return DiscoveryResponse
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/DiscoveryResponse'
components:
  schemas:
    Resource:
      type: object
      required: ['@id']
      properties:
        '@id':
          type: string
          description: URL of this resource
        description:
          type: string
          description: Description of this resource
    JobState:
      type: string
      pattern: '^(CACHED|CANCELLED|QUEUED|RUNNING|SUCCESS|FAILED)$'
    Workspace:
      allOf:
        - $ref: '#/components/schemas/Resource'
    OcrdExecutable:
      type: string
      pattern: '^ocrd-.*'
    ProcessorArgs:
      properties:
        processor_name:
          type: string
          description: Processor name
        path_to_mets:
          type: string
          description: Path to the METS file of the workspace to process, if available
        workspace_id:
          type: string
          description: ID of the workspace to process, if available
        description:
          type: string
          description: Description of the execution
        input_file_grps:
          items:
            type: string
          type: array
          description: Input file groups
        output_file_grps:
          items:
            type: string
          type: array
          description: Output file groups
        page_id:
          type: string
          description: Page ID
        parameters:
          type: object
          description: Parameters for processor execution
          default: {}
        result_queue_name:
          type: string
          description: Result queue name
        callback_url:
          type: string
          description: URL to send the response to after execution has finished
        agent_type:
          type: string
          description: Agent type, worker or server
          default: worker
        depends_on:
          items:
            type: string
          type: array
          description: List of job IDs to wait for before starting execution
      type: object
      required:
        - input_file_grps
      description: Wraps the parameters required to make a run-processor request
      example:
        path_to_mets: /path/to/mets.xml
        description: The description of this execution
        input_file_grps:
          - INPUT_FILE_GROUP
        output_file_grps:
          - OUTPUT_FILE_GROUP
        page_id: PAGE_ID
        parameters: {}
    ProcessorJob:
      description: Wraps output information for a job response
      properties:
        job_id:
          type: string
          description: Job ID
        processor_name:
          type: string
          description: Processor name
        state:
          $ref: '#/components/schemas/JobState'
        path_to_mets:
          type: string
          description: Path to the METS file of the workspace to process, if available
        workspace_id:
          type: string
          description: ID of the workspace to process, if available
        input_file_grps:
          items:
            type: string
          type: array
          description: Input file groups
        output_file_grps:
          items:
            type: string
          type: array
          description: Output file groups
        page_id:
          type: string
          title: Page Id
        log_file_path:
          type: string
          description: Path to the log file for this job
      type: object
      required:
        - job_id
        - processor_name
        - state
        - input_file_grps
    Workflow:
      allOf:
        - {$ref: '#/components/schemas/Resource'}
    WorkflowJob:
      properties:
        job_id:
          type: string
          title: Job Id
        page_id:
          type: string
          title: Page Id
        page_wise:
          type: boolean
          title: Page Wise
          default: false
        processing_job_ids:
          type: object
          title: Processing Job Ids
        path_to_mets:
          type: string
          title: Path To Mets
        workspace_id:
          type: string
          title: Workspace Id
        description:
          type: string
          title: Description
      type: object
      required:
        - job_id
        - page_id
        - processing_job_ids
      description: Wraps output information for a workflow job response
    DiscoveryResponse:
      type: object
      properties:
        ram:
          description: All available RAM in bytes
          type: integer
          format: int64
        cpu_cores:
          description: Number of available CPU cores
          type: integer
          format: int64
        has_cuda:
          description: Whether deployment supports NVIDIA's CUDA
          type: boolean
        cuda_version:
          description: Major/minor version of CUDA
          type: string
        has_ocrd_all:
          description: Whether deployment is based on ocrd_all
          type: boolean
        ocrd_all_version:
          description: Git tag of the ocrd_all version implemented
          type: string