ocrd_network.database module

The database is used to store information regarding jobs and workspaces.

Jobs: For every process request, a job is inserted into the database with a UUID, a status, and information about the process such as parameters and file groups. The database is mainly used to track the status (ocrd_network.constants.JobState) of a job so that it can be queried at any time. Finished jobs are not deleted from the database.
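As a minimal sketch of how a caller might track a job's status by polling: the helper `wait_for_job`, its parameters, the terminal state names, and the `state` attribute on the job record are assumptions made for illustration, not confirmed parts of the module. The lookup function is passed in so that the sketch runs without a database.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable, Iterable

# Assumed names of terminal states; check ocrd_network.constants.JobState
# for the values your installed version actually defines.
TERMINAL_STATES = frozenset({"success", "failed", "cancelled"})


def state_name(state: Any) -> str:
    """Return a lowercase name for an enum member or a plain string."""
    return str(getattr(state, "value", state)).lower()


def wait_for_job(
    get_job: Callable[[str], Any],
    job_id: str,
    interval: float = 1.0,
    max_polls: int = 60,
    terminal_states: Iterable[str] = TERMINAL_STATES,
) -> str:
    """Poll get_job(job_id) until the job reaches a terminal state."""
    terminal = set(terminal_states)
    for _ in range(max_polls):
        job = get_job(job_id)
        # Assumes the job record exposes a `state` attribute.
        name = state_name(job.state)
        if name in terminal:
            return name
        time.sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


# Stand-in lookup (hypothetical) so the sketch runs without MongoDB.
_fake_states = iter(["queued", "running", "success"])


def fake_get_job(job_id: str) -> SimpleNamespace:
    return SimpleNamespace(job_id=job_id, state=next(_fake_states))


final_state = wait_for_job(fake_get_job, "job-1", interval=0.0)
print(final_state)
```

With a running database, one would presumably pass `sync_db_get_processing_job` as `get_job`.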

Workspaces: A job or a processor always runs on a workspace, so a processor needs to know where the workspace is located. This location can be given either as an absolute path or as a workspace_id. In the latter case, the database is used to resolve the workspace_id to a path.

XXX: The information is currently not preserved after the processing server shuts down, because the database (which runs in Docker) has no volume configured.

async ocrd_network.database.initiate_database(db_url: str, db_name: str = 'ocrd')
ocrd_network.database.sync_initiate_database(db_url: str, db_name: str = 'ocrd')
async ocrd_network.database.db_create_workspace(mets_path: str) -> DBWorkspace

Create a workspace database entry from a METS path alone.

ocrd_network.database.sync_db_create_workspace(mets_path: str) -> DBWorkspace
async ocrd_network.database.db_get_workspace(workspace_id: str | None = None, workspace_mets_path: str | None = None) -> DBWorkspace
ocrd_network.database.sync_db_get_workspace(workspace_id: str = None, workspace_mets_path: str = None) -> DBWorkspace
async ocrd_network.database.db_update_workspace(workspace_id: str | None = None, workspace_mets_path: str | None = None, **kwargs) -> DBWorkspace
ocrd_network.database.sync_db_update_workspace(workspace_id: str = None, workspace_mets_path: str = None, **kwargs) -> DBWorkspace
async ocrd_network.database.db_create_processing_job(db_processing_job: DBProcessorJob) -> DBProcessorJob
ocrd_network.database.sync_db_create_processing_job(db_processing_job: DBProcessorJob) -> DBProcessorJob
async ocrd_network.database.db_get_processing_job(job_id: str) -> DBProcessorJob
ocrd_network.database.sync_db_get_processing_job(job_id: str) -> DBProcessorJob
async ocrd_network.database.db_update_processing_job(job_id: str, **kwargs) -> DBProcessorJob
ocrd_network.database.sync_db_update_processing_job(job_id: str, **kwargs) -> DBProcessorJob
async ocrd_network.database.db_create_workflow_job(db_workflow_job: DBWorkflowJob) -> DBWorkflowJob
ocrd_network.database.sync_db_create_workflow_job(db_workflow_job: DBWorkflowJob) -> DBWorkflowJob
async ocrd_network.database.db_get_workflow_job(job_id: str) -> DBWorkflowJob
ocrd_network.database.sync_db_get_workflow_job(job_id: str) -> DBWorkflowJob
async ocrd_network.database.db_get_processing_jobs(job_ids: List[str]) -> List[DBProcessorJob]
ocrd_network.database.sync_db_get_processing_jobs(job_ids: List[str]) -> List[DBProcessorJob]
async ocrd_network.database.db_create_workflow_script(db_workflow_script: DBWorkflowScript) -> DBWorkflowScript
ocrd_network.database.sync_db_create_workflow_script(db_workflow_script: DBWorkflowScript) -> DBWorkflowScript
async ocrd_network.database.db_get_workflow_script(workflow_id: str) -> DBWorkflowScript
ocrd_network.database.sync_db_get_workflow_script(workflow_id: str) -> DBWorkflowScript
async ocrd_network.database.db_find_first_workflow_script_by_content(content_hash: str) -> DBWorkflowScript
ocrd_network.database.sync_db_find_first_workflow_script_by_content(workflow_id: str) -> DBWorkflowScript
ocrd_network.database.verify_database_uri(mongodb_address: str) -> str
ocrd_network.database.verify_mongodb_available(mongo_url: str) -> None

    # The protocol is intentionally set to HTTP instead of MONGODB!
    mongodb_test_url = mongo_url.replace("mongodb", "http")
    if is_url_responsive(url=mongodb_test_url, tries=3):
        return
    raise RuntimeError(f"Verifying connection has failed: {mongodb_test_url}")
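The availability check above swaps the URL scheme and probes the MongoDB port over plain HTTP. A self-contained sketch of that idea follows. The names `to_http_probe_url` and `check_mongodb_reachable` are hypothetical, and the `is_url_responsive` shown here is a stand-in built on `urllib`, not the library's own helper. The sketch deliberately deviates in one respect: `str.replace` rewrites every occurrence of "mongodb", including one inside a hostname, so it swaps only the leading scheme. It assumes a plain `mongodb://host:port` URL without credentials.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def to_http_probe_url(mongo_url: str) -> str:
    """Swap only the leading 'mongodb://' scheme for 'http://'."""
    prefix = "mongodb://"
    if not mongo_url.startswith(prefix):
        raise ValueError(f"Not a mongodb:// URL: {mongo_url}")
    return "http://" + mongo_url[len(prefix):]


def is_url_responsive(url: str, tries: int = 3, wait: float = 1.0) -> bool:
    """Stand-in probe: True if the server sends any HTTP response."""
    for attempt in range(tries):
        try:
            with urllib.request.urlopen(url, timeout=3):
                return True
        except urllib.error.HTTPError:
            # The server answered, although with an error status.
            return True
        except (urllib.error.URLError, OSError, http.client.HTTPException):
            if attempt < tries - 1:
                time.sleep(wait)
    return False


def check_mongodb_reachable(
    mongo_url: str,
    probe: Callable[..., bool] = is_url_responsive,
) -> None:
    """Raise RuntimeError if the probe URL does not respond."""
    test_url = to_http_probe_url(mongo_url)
    if probe(test_url, tries=3):
        return
    raise RuntimeError(f"Verifying connection has failed: {test_url}")


# Compare the two rewriting strategies on a hostname containing "mongodb".
url = "mongodb://mongodb-host:27017"
print(url.replace("mongodb", "http"))
print(to_http_probe_url(url))
```

Passing the probe as a parameter keeps the check testable without a running database.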