# ocrd_utils.image module

Calculate the enlarged image size after rotation.

Given a numpy array `size` of an original canvas (width and height), and a rotation angle in degrees counter-clockwise `angle`, calculate the new size which is necessary to encompass the full image after rotation.

Return a numpy array of the enlarged width and height.
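The size calculation described above can be sketched with basic trigonometry. This is a minimal illustration in plain Python rather than numpy, and `enlarged_size` is a hypothetical name, not the library function; it assumes the rotation happens around the canvas center.

```python
import math

def enlarged_size(size, angle):
    """Return (width, height) of the smallest axis-aligned canvas that
    contains a width x height rectangle rotated by `angle` degrees."""
    width, height = size
    rad = math.radians(angle)
    # Absolute values make the result independent of rotation direction.
    cos, sin = abs(math.cos(rad)), abs(math.sin(rad))
    return (width * cos + height * sin, width * sin + height * cos)

# A 400x300 canvas rotated by 30 degrees needs roughly 496x460 pixels.
new_width, new_height = enlarged_size((400, 300), 30)
```

For multiples of 90° the result simply keeps or swaps the two dimensions (up to floating-point error).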

Calculate the flipped image size after transposition.

Given a numpy array `size` of an original canvas (width and height), and a transposition mode `method` (see `transpose_image`), calculate the new size after transposition.

Return a numpy array of the transposed width and height.
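As a rough sketch of the idea: only those transposition modes that exchange the two axes change the canvas size. The mode names below mirror the PIL constants but are plain strings chosen for this illustration, and `transposed_size` is a hypothetical helper.

```python
# Modes that exchange width and height (string names are an assumption
# made for this sketch; the library works with PIL.Image constants).
AXIS_SWAPPING = {"ROTATE_90", "ROTATE_270", "TRANSPOSE", "TRANSVERSE"}

def transposed_size(size, method):
    """Return (width, height) of the canvas after transposition."""
    width, height = size
    if method in AXIS_SWAPPING:
        return (height, width)
    return (width, height)

portrait = transposed_size((400, 300), "ROTATE_90")
```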

ocrd_utils.image.bbox_from_points(points)

Construct a numeric list representing a bounding box from polygon coordinates in page representation.

ocrd_utils.image.bbox_from_polygon(polygon)

Construct a numeric list representing a bounding box from polygon coordinates in numeric list representation.

ocrd_utils.image.bbox_from_xywh(xywh)

Convert a bounding box from a numeric dict to a numeric list representation.
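The three bounding-box constructors above differ only in their input representation. The following sketch re-implements the idea with hypothetical helper names; it assumes that the page representation is a string of space-separated `x,y` pairs, that the numeric dict uses the keys `x`, `y`, `w` and `h`, and that the numeric list is ordered `[minx, miny, maxx, maxy]`.

```python
def bbox_of_polygon(polygon):
    """Bounding box [minx, miny, maxx, maxy] of a list of [x, y] points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return [min(xs), min(ys), max(xs), max(ys)]

def bbox_of_points(points):
    """Same, for a string of space-separated "x,y" pairs (assumed format)."""
    polygon = [[int(v) for v in pair.split(",")] for pair in points.split()]
    return bbox_of_polygon(polygon)

def bbox_of_xywh(xywh):
    """Same, for a dict with the (assumed) keys x, y, w and h."""
    return [xywh["x"], xywh["y"],
            xywh["x"] + xywh["w"], xywh["y"] + xywh["h"]]

box = bbox_of_points("10,20 110,20 110,70 10,70")
```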

ocrd_utils.image.coordinates_for_segment(polygon, parent_image, parent_coords)

Convert relative coordinates to absolute.

Given…

• `polygon`, a numpy array of points relative to

• `parent_image`, a PIL.Image (not used), along with

• `parent_coords`, its corresponding affine transformation,

…calculate the absolute coordinates within the page.

That is, apply the given transform inversely to `polygon`. The transform encodes (recursively):

1. Whenever `parent_image` or any of its parents was cropped, all points must be shifted by the offset in opposite direction (i.e. coordinate system gets translated by the upper left).

2. Whenever `parent_image` or any of its parents was rotated, all points must be rotated around the center of that image in opposite direction (i.e. coordinate system gets translated by the center in opposite direction, rotated purely, and translated back; the latter involves an additional offset from the increase in canvas size necessary to accommodate all points).

Return the rounded numpy array of the resulting polygon.
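To make "apply the transform inversely" concrete, here is a minimal sketch in plain Python. The helper names are hypothetical, the matrix is a nested list instead of a numpy array, and the example covers only the cropping case (a pure translation).

```python
def invert_affine(m):
    """Invert a 3x3 affine matrix [[a, b, tx], [c, d, ty], [0, 0, 1]]."""
    (a, b, tx), (c, d, ty), _ = m
    det = a * d - b * c
    inv_a, inv_b, inv_c, inv_d = d / det, -b / det, -c / det, a / det
    return [[inv_a, inv_b, -(inv_a * tx + inv_b * ty)],
            [inv_c, inv_d, -(inv_c * tx + inv_d * ty)],
            [0, 0, 1]]

def apply_affine(m, polygon):
    """Apply the matrix to each [x, y] point and round the result."""
    return [[round(m[0][0] * x + m[0][1] * y + m[0][2]),
             round(m[1][0] * x + m[1][1] * y + m[1][2])]
            for x, y in polygon]

# The parent image was cropped at (100, 50), so page -> image coordinates
# subtracts that offset; going back to the page adds it again.
crop = [[1, 0, -100], [0, 1, -50], [0, 0, 1]]
relative = [[0, 0], [20, 0], [20, 10], [0, 10]]
absolute = apply_affine(invert_affine(crop), relative)
```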

ocrd_utils.image.coordinates_of_segment(segment, parent_image, parent_coords)

Extract the coordinates of a PAGE segment element relative to its parent.

Given…

• `segment`, a PAGE segment object in absolute coordinates (i.e. RegionType / TextLineType / WordType / GlyphType), and

• `parent_image`, the PIL.Image of its corresponding parent object (i.e. PageType / RegionType / TextLineType / WordType; not used), along with

• `parent_coords`, its corresponding affine transformation,

…calculate the relative coordinates of the segment within the image.

That is, apply the given transform to the points annotated in `segment`. The transform encodes (recursively):

1. Whenever `parent_image` or any of its parents was cropped, all points must be shifted by the offset (i.e. coordinate system gets translated by the upper left).

2. Whenever `parent_image` or any of its parents was rotated, all points must be rotated around the center of that image (i.e. coordinate system gets translated by the center in opposite direction, rotated purely, and translated back; the latter involves an additional offset from the increase in canvas size necessary to accommodate all points).

Return the rounded numpy array of the resulting polygon.
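The recursive nature of the transform can be illustrated by composing two crops, one per hierarchy level. This is a sketch under simplifying assumptions (translations only, nested lists instead of numpy arrays); all helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]

def apply_affine(m, polygon):
    return [[m[0][0] * x + m[0][1] * y + m[0][2],
             m[1][0] * x + m[1][1] * y + m[1][2]] for x, y in polygon]

# A region was cropped from the page at (100, 50); a line was then
# cropped from the region image at (10, 20).
page_to_region = translation(-100, -50)
region_to_line = translation(-10, -20)
# Later steps are multiplied from the left.
page_to_line = matmul(region_to_line, page_to_region)

absolute = [[110, 70], [210, 70], [210, 100], [110, 100]]
relative = apply_affine(page_to_line, absolute)
```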

ocrd_utils.image.image_from_polygon(image, polygon, fill='background', transparency=False)

Mask an image with a polygon.

Given a PIL.Image `image` and a numpy array `polygon` of relative coordinates into the image, fill everything outside the polygon hull to a color according to `fill`:

• if `none`, then do not touch the color channels at all;

• else if `background` (the default), then use the median color of the image;

• otherwise use the given color, e.g. `'white'` or `(255, 255, 255)`.

Moreover, if `transparency` is true, then add an alpha channel from the polygon mask (i.e. everything outside the polygon will be transparent, for those consumers that can interpret alpha channels). Images which already have an alpha channel will have it shrunk from the polygon mask (i.e. everything outside the polygon will be transparent, in addition to existing transparent pixels).

Return a new PIL.Image.
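The fill rule can be sketched on a tiny grayscale grid. This illustration does not use PIL: the image is a nested list, the polygon mask is given directly as booleans, and `fill_outside` is a hypothetical helper. It covers only the `background` and explicit-color cases, not `none` or transparency.

```python
from statistics import median

def fill_outside(pixels, mask, fill="background"):
    """pixels: rows of grayscale values; mask: rows of booleans
    (True = inside the polygon). Returns a new grid."""
    if fill == "background":
        # Estimate the background as the median over all pixels.
        fill = median(value for row in pixels for value in row)
    return [[value if inside else fill
             for value, inside in zip(pixel_row, mask_row)]
            for pixel_row, mask_row in zip(pixels, mask)]

pixels = [[200, 200, 200],
          [200, 10, 200],
          [200, 200, 30]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
masked = fill_outside(pixels, mask)
```

Here the dark pixel inside the mask survives, while the dark pixel outside it is replaced by the median value.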

ocrd_utils.image.points_from_bbox(minx, miny, maxx, maxy)

Construct polygon coordinates in page representation from a numeric list representing a bounding box.

ocrd_utils.image.points_from_polygon(polygon)

Convert polygon coordinates from a numeric list representation to a page representation.

ocrd_utils.image.points_from_x0y0x1y1(xyxy)

Construct a polygon representation from a rectangle described as a list [x0, y0, x1, y1].

ocrd_utils.image.points_from_xywh(box)

Construct polygon coordinates in page representation from numeric dict representing a bounding box.

ocrd_utils.image.points_from_y0x0y1x1(yxyx)

Construct a polygon representation from a rectangle described as a list [y0, x0, y1, x1].

ocrd_utils.image.polygon_from_bbox(minx, miny, maxx, maxy)

Construct polygon coordinates in numeric list representation from a numeric list representing a bounding box.

ocrd_utils.image.polygon_from_points(points)

Convert polygon coordinates in page representation to polygon coordinates in numeric list representation.
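The two polygon representations can be converted into each other with simple string handling. The sketch below assumes that the page representation is a string of space-separated `x,y` pairs with integer coordinates; `parse_points` and `format_points` are hypothetical names, not the library functions.

```python
def parse_points(points):
    """Page representation (string) -> numeric list of [x, y] pairs."""
    return [[int(v) for v in pair.split(",")] for pair in points.split()]

def format_points(polygon):
    """Numeric list of [x, y] pairs -> page representation (string)."""
    return " ".join("%d,%d" % (x, y) for x, y in polygon)

polygon = parse_points("100,200 300,200 300,400 100,400")
points = format_points(polygon)
```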

ocrd_utils.image.polygon_from_x0y0x1y1(x0y0x1y1)

Construct polygon coordinates in numeric list representation from a string list representing a bounding box.

ocrd_utils.image.polygon_from_xywh(xywh)

Construct polygon coordinates in numeric list representation from numeric dict representing a bounding box.
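Turning a rectangle into a polygon amounts to listing its four corners. The sketch below uses hypothetical helper names; the corner order (starting at the upper left and proceeding via the upper right) and the dict keys `x`, `y`, `w`, `h` are assumptions made for this illustration.

```python
def rectangle_polygon(minx, miny, maxx, maxy):
    """Four corners of an axis-aligned rectangle as [x, y] pairs."""
    return [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy]]

def rectangle_polygon_from_xywh(xywh):
    """Same, starting from a dict with the (assumed) keys x, y, w, h."""
    return rectangle_polygon(xywh["x"], xywh["y"],
                             xywh["x"] + xywh["w"], xywh["y"] + xywh["h"])

corners = rectangle_polygon_from_xywh({"x": 10, "y": 20, "w": 100, "h": 50})
```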

Create a mask image of a polygon.

Given a PIL.Image `image` (merely for dimensions), and a numpy array `polygon` of relative coordinates into the image, create a new image of the same size with black background, and fill everything inside the polygon hull with white.

Return the new PIL.Image.
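A mask of this kind can be sketched without PIL by testing each pixel center against the polygon with the even-odd (ray casting) rule. This is an illustrative stand-in, not the library's rasterization: boundary pixels may be treated differently, and `polygon_mask_grid` is a hypothetical name.

```python
def polygon_mask_grid(width, height, polygon):
    """Return rows of 0 (outside) / 255 (inside) for a width x height grid."""
    def inside(px, py):
        hit = False
        count = len(polygon)
        for i in range(count):
            (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % count]
            # Does this edge cross the horizontal ray starting at (px, py)?
            if (y1 > py) != (y2 > py):
                x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                if px < x_cross:
                    hit = not hit
        return hit
    # Sample at pixel centers, hence the 0.5 offsets.
    return [[255 if inside(x + 0.5, y + 0.5) else 0 for x in range(width)]
            for y in range(height)]

mask = polygon_mask_grid(4, 3, [[1, 0], [3, 0], [3, 2], [1, 2]])
```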

ocrd_utils.image.rotate_coordinates(transform, angle, orig=array([0, 0]))

Compose an affine coordinate transformation with a passive rotation.

Given a numpy array `transform` of an existing transformation matrix in homogeneous (3d) coordinates, and a rotation angle in degrees counter-clockwise `angle`, as well as a numpy array `orig` of the center of rotation, calculate the affine coordinate transform corresponding to the composition of both transformations. (This entails translation to the center, followed by pure rotation, and subsequent translation back. However, since rotation necessarily increases the bounding box, and thus image size, do not translate back the same amount, but to the enlarged offset.)
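The three steps named in the parenthesis (translate to the center, rotate, translate back to the enlarged offset) can be sketched as follows. This is an illustration in plain Python under stated assumptions: the sign convention of the passive rotation matrix assumes image coordinates with the y axis pointing down, the new center is derived with the enlarged-canvas formula, and all helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]

def compose_rotation(transform, angle, orig):
    rad = math.radians(angle)
    cos, sin = math.cos(rad), math.sin(rad)
    # Passive rotation (assumed sign convention, y axis pointing down).
    rot = [[cos, sin, 0], [-sin, cos, 0], [0, 0, 1]]
    cx, cy = orig
    # Center of the enlarged canvas after rotation.
    new_cx = cx * abs(cos) + cy * abs(sin)
    new_cy = cx * abs(sin) + cy * abs(cos)
    for step in (translation(-cx, -cy), rot, translation(new_cx, new_cy)):
        transform = matmul(step, transform)  # later steps go on the left
    return transform

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# A 100x50 image (center at (50, 25)) rotated by 90 degrees.
rotated = compose_rotation(identity, 90, (50, 25))
```

With these assumptions the old center lands on the center of the new 50x100 canvas, and the old upper-left corner ends up at the lower left.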

Return a numpy array of the resulting affine transformation matrix.

ocrd_utils.image.shift_coordinates(transform, offset)

Compose an affine coordinate transformation with a translation.

Given a numpy array `transform` of an existing transformation matrix in homogeneous (3d) coordinates, and a numpy array `offset` of the translation vector, calculate the affine coordinate transform corresponding to the composition of both transformations.
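Composition with a translation is a left-multiplication by a matrix whose last column holds the offset. A minimal sketch with nested lists instead of numpy arrays, where `compose_shift` is a hypothetical name:

```python
def compose_shift(transform, offset):
    """Left-multiply a 3x3 transform by a translation matrix."""
    dx, dy = offset
    shift = [[1, 0, dx], [0, 1, dy], [0, 0, 1]]
    return [[sum(shift[i][k] * transform[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Cropping at (100, 50) moves the coordinate origin to that point,
# so every coordinate decreases by the offset.
cropped = compose_shift(identity, (-100, -50))
```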

Return a numpy array of the resulting affine transformation matrix.

ocrd_utils.image.scale_coordinates(transform, factors)

Compose an affine coordinate transformation with a proportional scaling.

Given a numpy array `transform` of an existing transformation matrix in homogeneous (3d) coordinates, and a numpy array `factors` of the scaling factors, calculate the affine coordinate transform corresponding to the composition of both transformations.
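Scaling composes the same way as the other operations, with the factors on the diagonal of the matrix that is multiplied from the left. A minimal sketch with nested lists instead of numpy arrays, where `compose_scale` is a hypothetical name:

```python
def compose_scale(transform, factors):
    """Left-multiply a 3x3 transform by a diagonal scaling matrix."""
    fx, fy = factors
    scale = [[fx, 0, 0], [0, fy, 0], [0, 0, 1]]
    return [[sum(scale[i][k] * transform[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]

# An image cropped at (100, 50) and then downscaled to half size:
# the crop offset is scaled along with everything else.
crop = [[1, 0, -100], [0, 1, -50], [0, 0, 1]]
halved = compose_scale(crop, (0.5, 0.5))
```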

Return a numpy array of the resulting affine transformation matrix.

ocrd_utils.image.transform_coordinates(polygon, transform=None)

Apply an affine transformation to a set of points.

Augment the 2d numpy array of points `polygon` with an extra column of ones (homogeneous coordinates), then multiply with the transformation matrix `transform` (or the identity matrix), and finally remove the extra column from the result.
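The homogeneous-coordinate steps can be sketched with nested lists instead of numpy arrays. `apply_transform` is a hypothetical name used only for this illustration.

```python
def apply_transform(polygon, transform=None):
    if transform is None:
        transform = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity
    # 1. Augment each point with a constant 1.
    homogeneous = [[x, y, 1] for x, y in polygon]
    # 2. Multiply each augmented point with the matrix.
    transformed = [[sum(row[k] * point[k] for k in range(3))
                    for row in transform] for point in homogeneous]
    # 3. Drop the extra coordinate again.
    return [point[:2] for point in transformed]

shift = [[1, 0, 10], [0, 1, 20], [0, 0, 1]]
moved = apply_transform([[1, 2], [3, 4]], shift)
```

The extra column of ones is what lets a single matrix product express translations as well as rotations and scalings.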

ocrd_utils.image.transpose_coordinates(transform, method, orig=array([0, 0]))

Compose an affine coordinate transformation with a transposition (i.e. flip or rotate in 90° multiples).

Given a numpy array `transform` of an existing transformation matrix in homogeneous (3d) coordinates, a transposition mode `method`, as well as a numpy array `orig` of the center of the image, calculate the affine coordinate transform corresponding to the composition of both transformations, which is respectively:

• `PIL.Image.FLIP_LEFT_RIGHT`: entails translation to the center, followed by pure reflection about the y-axis, and subsequent translation back

• `PIL.Image.FLIP_TOP_BOTTOM`: entails translation to the center, followed by pure reflection about the x-axis, and subsequent translation back

• `PIL.Image.ROTATE_180`: entails translation to the center, followed by pure reflection about the origin, and subsequent translation back

• `PIL.Image.ROTATE_90`: entails translation to the center, followed by pure rotation by 90° counter-clockwise, and subsequent translation back

• `PIL.Image.ROTATE_270`: entails translation to the center, followed by pure rotation by 270° counter-clockwise, and subsequent translation back

• `PIL.Image.TRANSPOSE`: entails translation to the center, followed by pure rotation by 90° counter-clockwise and pure reflection about the x-axis, and subsequent translation back

• `PIL.Image.TRANSVERSE`: entails translation to the center, followed by pure rotation by 90° counter-clockwise and pure reflection about the y-axis, and subsequent translation back

Return a numpy array of the resulting affine transformation matrix.
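The modes listed above differ only in the 2x2 linear part applied between the two translations. The sketch below applies that scheme to a single point rather than composing a matrix. It is an illustration under assumptions: mode names are plain strings mirroring the PIL constants, the sign conventions assume image coordinates with the y axis pointing down, the translation back targets the center of the transposed canvas (with swapped coordinates where the axes are exchanged), and `transpose_point` is a hypothetical helper.

```python
LINEAR_PARTS = {
    "FLIP_LEFT_RIGHT": [[-1, 0], [0, 1]],
    "FLIP_TOP_BOTTOM": [[1, 0], [0, -1]],
    "ROTATE_180": [[-1, 0], [0, -1]],
    "ROTATE_90": [[0, 1], [-1, 0]],
    "ROTATE_270": [[0, -1], [1, 0]],
    "TRANSPOSE": [[0, 1], [1, 0]],
    "TRANSVERSE": [[0, -1], [-1, 0]],
}
SWAPS_AXES = {"ROTATE_90", "ROTATE_270", "TRANSPOSE", "TRANSVERSE"}

def transpose_point(point, method, orig):
    (a, b), (c, d) = LINEAR_PARTS[method]
    cx, cy = orig
    # Center of the canvas after transposition.
    new_cx, new_cy = (cy, cx) if method in SWAPS_AXES else (cx, cy)
    # Translate to the center, apply the linear part, translate back.
    x, y = point[0] - cx, point[1] - cy
    return (a * x + b * y + new_cx, c * x + d * y + new_cy)

# A 100x50 image has its center at (50, 25).
flipped = transpose_point((10, 5), "FLIP_LEFT_RIGHT", (50, 25))
```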

ocrd_utils.image.xywh_from_bbox(minx, miny, maxx, maxy)

Convert a bounding box from a numeric list to a numeric dict representation.

ocrd_utils.image.xywh_from_points(points)

Construct a numeric dict representing a bounding box from polygon coordinates in page representation.

ocrd_utils.image.xywh_from_polygon(polygon)

Construct a numeric dict representing a bounding box from polygon coordinates in numeric list representation.
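The dict representation stores the upper-left corner together with the extent instead of two corners. A minimal sketch, assuming the keys `x`, `y`, `w` and `h`; `xywh_of_polygon` is a hypothetical name.

```python
def xywh_of_polygon(polygon):
    """Bounding box of a list of [x, y] points as an x/y/w/h dict."""
    xs, ys = zip(*polygon)
    return {"x": min(xs), "y": min(ys),
            "w": max(xs) - min(xs), "h": max(ys) - min(ys)}

box = xywh_of_polygon([[10, 20], [110, 20], [110, 70], [10, 70]])
```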