ocrd_network.processing_server module¶
- class ocrd_network.processing_server.ProcessingServer(config_path: str, host: str, port: int)[source]¶
Bases: FastAPI
FastAPI app to make OCR-D processor calls
The Processing-Server receives calls conforming to the processing part of the OCR-D web API. It can run OCR-D processors and provides endpoints to discover processors and watch job status. The Processing-Server does not execute the processors itself; instead it starts up a queue and a database and delegates the calls to processing workers. The workers are started by the Processing-Server, and communication with them goes through the queue.
Initializes the application.
- Parameters:
debug – Boolean indicating if debug tracebacks should be returned on errors.
routes – A list of routes to serve incoming HTTP and WebSocket requests.
middleware – A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints.
exception_handlers – A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions.
on_startup – A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions.
on_shutdown – A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions.
lifespan – A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both.
- start() None[source]¶
Deploy agents (database, queue, workers) and start the Processing-Server with uvicorn.
- async on_shutdown() None[source]¶
- hosts and PIDs should be stored somewhere
- ensure the queue is empty or no processor is currently running
- connect to hosts and kill PIDs
- async forward_tcp_request_to_uds_mets_server(request: Request) Dict[source]¶
Forward a mets-server request.
A processor calls a METS-related method such as add_file via ClientSideOcrdMets, which sends a request to this endpoint. The request contains all information necessary to call the UDS mets server. MetsServerProxy uses this information to make the call to the UDS mets server, which is reachable only locally (local to the Processing-Server).
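As a rough illustration of the forwarding idea, the sketch below maps an incoming TCP request body onto a local UDS call description. The field names (`workspace_path`, `method_type`, `args`) and the socket path convention are assumptions for illustration only, not the actual MetsServerProxy API.

```python
def build_uds_forward(request_body: dict) -> dict:
    """Map an incoming TCP mets-server request onto a local UDS call.

    Hypothetical sketch: field names and the socket location are assumed,
    not taken from ocrd_network itself.
    """
    return {
        # the UDS socket is assumed to live inside the workspace directory
        "socket": request_body["workspace_path"] + "/mets_server.sock",
        # the mets method to replay against the local server, e.g. add_file
        "method": request_body["method_type"],
        # keyword arguments for that method; default to none
        "kwargs": request_body.get("args", {}),
    }
```

The point is that the TCP endpoint itself is stateless: everything needed to reach the UDS mets server travels inside the request body.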
- async validate_and_forward_job_to_worker(processor_name: str, data: PYJobInput) PYJobOutput[source]¶
- async push_job_to_worker(data: PYJobInput, db_job: DBProcessorJob) PYJobOutput[source]¶
- async push_job_to_processing_queue(db_job: DBProcessorJob) PYJobOutput[source]¶
- async get_processor_job(job_id: str) PYJobOutput[source]¶
- async push_cached_jobs_to_workers(processing_jobs: List[PYJobInput]) None[source]¶
- async remove_job_from_request_cache(result_message: PYResultMessage)[source]¶
- async task_sequence_to_processing_jobs(tasks: List[ProcessorTask], mets_path: str, page_id: str) List[PYJobOutput][source]¶
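Conceptually, turning a task sequence into processing jobs means chaining them: each task's output file groups feed the next task's inputs, so every job must wait for its predecessor. The sketch below illustrates that chaining with plain dicts; the field names loosely mirror PYJobInput but are assumptions, and the `depends_on` wiring is illustrative rather than the library's actual mechanism.

```python
def tasks_to_job_inputs(tasks: list, mets_path: str, page_id: str) -> list:
    """Hypothetical sketch: chain a task sequence into ordered job inputs.

    Each task dict is assumed to carry 'executable', 'input_file_grps'
    and 'output_file_grps'; job ids and dependency wiring are invented
    here for illustration.
    """
    jobs, previous_job_id = [], None
    for i, task in enumerate(tasks):
        job_id = f"job-{i}"
        jobs.append({
            "job_id": job_id,
            "processor": task["executable"],
            "path_to_mets": mets_path,
            "page_id": page_id,
            "input_file_grps": task["input_file_grps"],
            "output_file_grps": task["output_file_grps"],
            # each job waits for its predecessor in the sequence
            "depends_on": [previous_job_id] if previous_job_id else [],
        })
        previous_job_id = job_id
    return jobs
```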
- validate_tasks_worker_existence(tasks: List[ProcessorTask]) None[source]¶
- async run_workflow(mets_path: str, workflow: UploadFile | str | None = File(None), workflow_id: str = None, page_id: str = None, page_wise: bool = False, workflow_callback_url: str = None) PYWorkflowJobOutput[source]¶
- async kill_mets_server_zombies(minutes_ago: int | None = None, dry_run: bool | None = None) List[int][source]¶
- async get_workflow_info_simple(workflow_job_id) Dict[str, JobState][source]¶
Simplified version of get_workflow_info that returns a single state for the entire workflow:
- If a single processing job fails, the entire workflow job status is set to FAILED.
- If any processing job is running, regardless of other states such as QUEUED and CACHED, the entire workflow job status is set to RUNNING.
- Only if all processing jobs have finished successfully is the workflow job status set to SUCCESS.
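The aggregation rule above can be sketched as a pure function over the per-job states. The state names follow the docstring; the function itself is an illustration, not the library's own code.

```python
def aggregate_workflow_state(job_states: list) -> str:
    """Collapse per-job states into one workflow state (illustrative sketch)."""
    if "FAILED" in job_states:
        # a single failed job fails the whole workflow
        return "FAILED"
    if "RUNNING" in job_states:
        # any running job makes the workflow RUNNING, regardless of
        # other QUEUED/CACHED jobs
        return "RUNNING"
    if all(state == "SUCCESS" for state in job_states):
        # only when every job succeeded is the workflow a SUCCESS
        return "SUCCESS"
    # otherwise some jobs are still waiting (QUEUED or CACHED)
    return "QUEUED"
```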
- async upload_workflow(workflow: UploadFile | str) Dict[str, str][source]¶
Store a script for a workflow in the database.