ocrd_utils.image module¶
- ocrd_utils.image.adjust_canvas_to_rotation(size, angle)[source]¶
Calculate the enlarged image size after rotation.
Given size, a numpy array of the original canvas dimensions (width and height), and angle, a rotation angle in degrees counter-clockwise, calculate the new size needed to encompass the full image after rotation.
Return a numpy array of the enlarged width and height.
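The calculation described above can be sketched with plain numpy. This is a minimal illustration under the assumption that the enlarged size is the axis-aligned bounding box of the rotated rectangle; the name enlarged_canvas_size is hypothetical and not part of ocrd_utils.

```python
import numpy as np

def enlarged_canvas_size(size, angle):
    """Size (width, height) needed to hold a canvas rotated by `angle` degrees."""
    rad = np.deg2rad(angle)
    sin, cos = abs(np.sin(rad)), abs(np.cos(rad))
    # The bounding box of the rotated rectangle mixes width and height
    # through the absolute sine and cosine of the angle.
    mix = np.array([[cos, sin],
                    [sin, cos]])
    return mix @ np.asarray(size, dtype=float)

# A 100x50 canvas turned by a right angle needs (approximately) 50x100.
rotated = enlarged_canvas_size([100, 50], 90)
```

Because the sine and cosine enter as absolute values, the result does not depend on the direction of rotation.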
- ocrd_utils.image.adjust_canvas_to_transposition(size, method)[source]¶
Calculate the flipped image size after transposition.
Given size, a numpy array of the original canvas dimensions (width and height), and method, a transposition mode (see transpose_image), calculate the new size after transposition.
Return a numpy array of the resulting width and height.
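As a rough sketch of the rule above: only the modes that exchange the x and y axes change the canvas size, by swapping width and height. The Transpose enum below is a local stand-in that mirrors the integer values of PIL.Image.Transpose so the example runs without Pillow; transposed_canvas_size is a hypothetical name.

```python
from enum import IntEnum

class Transpose(IntEnum):
    """Local stand-in mirroring the integer values of PIL.Image.Transpose."""
    FLIP_LEFT_RIGHT = 0
    FLIP_TOP_BOTTOM = 1
    ROTATE_90 = 2
    ROTATE_180 = 3
    ROTATE_270 = 4
    TRANSPOSE = 5
    TRANSVERSE = 6

# Modes that exchange the x and y axes, and therefore width and height.
AXIS_SWAPPING = {Transpose.ROTATE_90, Transpose.ROTATE_270,
                 Transpose.TRANSPOSE, Transpose.TRANSVERSE}

def transposed_canvas_size(size, method):
    """Return (width, height) of the canvas after the given transposition."""
    width, height = size
    if method in AXIS_SWAPPING:
        return (height, width)
    return (width, height)
```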
- ocrd_utils.image.bbox_from_points(points)[source]¶
Construct a numeric list representing a bounding box from polygon coordinates in page representation.
- ocrd_utils.image.bbox_from_polygon(polygon)[source]¶
Construct a numeric list representing a bounding box from polygon coordinates in numeric list representation.
- ocrd_utils.image.bbox_from_xywh(xywh)[source]¶
Convert a bounding box from a numeric dict to a numeric list representation.
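The three bounding box constructors can be illustrated as follows. This sketch assumes that "page representation" means a PAGE-XML points string of the form "x1,y1 x2,y2 ...", that a numeric dict has the keys x, y, w and h, and that the numeric list is ordered (minx, miny, maxx, maxy). The function names are hypothetical stand-ins, not the library's implementation.

```python
def bbox_of_polygon(polygon):
    """Return (minx, miny, maxx, maxy) for a list of [x, y] pairs."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def bbox_of_points(points):
    """Parse an 'x1,y1 x2,y2 ...' string and return its bounding box."""
    polygon = [[int(v) for v in pair.split(',')] for pair in points.split(' ')]
    return bbox_of_polygon(polygon)

def bbox_of_xywh(xywh):
    """Convert {'x', 'y', 'w', 'h'} into (minx, miny, maxx, maxy)."""
    return xywh['x'], xywh['y'], xywh['x'] + xywh['w'], xywh['y'] + xywh['h']
```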
- ocrd_utils.image.coordinates_for_segment(polygon, parent_image, parent_coords)[source]¶
Convert relative coordinates to absolute.
Given…
  - polygon, a numpy array of points relative to
  - parent_image, a PIL.Image (not used), along with
  - parent_coords, its corresponding affine transformation,
…calculate the absolute coordinates within the page.
That is, apply the given transform inversely to polygon. The transform encodes (recursively):
  - Whenever parent_image or any of its parents was cropped, all points must be shifted by the offset in opposite direction (i.e. coordinate system gets translated by the upper left).
  - Whenever parent_image or any of its parents was rotated, all points must be rotated around the center of that image in opposite direction (i.e. coordinate system gets translated by the center in opposite direction, rotated purely, and translated back; the latter involves an additional offset from the increase in canvas size necessary to accommodate all points).
Return the rounded numpy array of the resulting polygon.
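To make the inverse application concrete, here is a small numpy sketch. It assumes the transform is a 3x3 matrix in homogeneous coordinates that maps page coordinates to image coordinates; the helper names and the example crop offset are hypothetical.

```python
import numpy as np

def apply_affine(polygon, transform):
    """Apply a 3x3 homogeneous transform to an (N, 2) array of points."""
    homogeneous = np.hstack([polygon, np.ones((len(polygon), 1))])
    return (homogeneous @ transform.T)[:, :2]

def to_absolute(polygon, transform):
    """Map image-relative points back to page coordinates (inverse transform)."""
    absolute = apply_affine(np.asarray(polygon, dtype=float),
                            np.linalg.inv(transform))
    return np.round(absolute).astype(int)

# Hypothetical parent: cropped from the page at upper-left corner (200, 100),
# so the page-to-image transform translates by (-200, -100).
crop = np.array([[1, 0, -200],
                 [0, 1, -100],
                 [0, 0, 1]], dtype=float)

relative = [[0, 0], [30, 0], [30, 10], [0, 10]]
absolute = to_absolute(relative, crop)
```

Inverting the crop translation shifts every point by the offset in the opposite direction, as the first case above describes.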
- ocrd_utils.image.coordinates_of_segment(segment, parent_image, parent_coords)[source]¶
Extract the coordinates of a PAGE segment element relative to its parent.
Given…
  - segment, a PAGE segment object in absolute coordinates (i.e. RegionType / TextLineType / WordType / GlyphType), and
  - parent_image, the PIL.Image of its corresponding parent object (i.e. PageType / RegionType / TextLineType / WordType; not used), along with
  - parent_coords, its corresponding affine transformation,
…calculate the relative coordinates of the segment within the image.
That is, apply the given transform to the points annotated in segment. The transform encodes (recursively):
  - Whenever parent_image or any of its parents was cropped, all points must be shifted by the offset (i.e. coordinate system gets translated by the upper left).
  - Whenever parent_image or any of its parents was rotated, all points must be rotated around the center of that image (i.e. coordinate system gets translated by the center in opposite direction, rotated purely, and translated back; the latter involves an additional offset from the increase in canvas size necessary to accommodate all points).
Return the rounded numpy array of the resulting polygon.
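The following sketch composes the two cases above (a crop followed by a rotation) into one matrix and applies it to page coordinates. It assumes 3x3 homogeneous matrices, image coordinates with the y-axis pointing down, and a passive rotation matrix for the coordinate system; all names and numbers are illustrative.

```python
import numpy as np

def translation(offset):
    """Homogeneous matrix translating by the given (x, y) offset."""
    matrix = np.eye(3)
    matrix[:2, 2] = offset
    return matrix

def passive_rotation(angle):
    """Homogeneous matrix rotating the coordinate system by `angle` degrees."""
    rad = np.deg2rad(angle)
    cos, sin = np.cos(rad), np.sin(rad)
    return np.array([[cos, sin, 0],
                     [-sin, cos, 0],
                     [0, 0, 1]])

def to_relative(polygon, transform):
    """Apply the transform to page points and round to integer pixels."""
    points = np.asarray(polygon, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return np.round((homogeneous @ transform.T)[:, :2]).astype(int)

# Hypothetical parent image: a 100x50 crop at page offset (200, 100),
# then rotated by 90 degrees, which turns the canvas into 50x100.
center, new_center = np.array([50, 25]), np.array([25, 50])
transform = (translation(new_center) @ passive_rotation(90)
             @ translation(-center) @ translation([-200, -100]))

page_polygon = [[200, 100], [300, 100], [300, 150], [200, 150]]
relative = to_relative(page_polygon, transform)
```

Translating back to new_center rather than center is the "additional offset" mentioned above: the rotated canvas is 50x100, so its center differs from that of the 100x50 crop.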
- ocrd_utils.image.image_from_polygon(image, polygon, fill='background', transparency=False)[source]¶
Mask an image with a polygon.
Given image, a PIL.Image, and polygon, a numpy array of relative coordinates into the image, fill everything outside the polygon hull with a color according to fill:
  - if none, then do not touch the color channels at all,
  - else if background (the default), then use the median color of the image;
  - otherwise use the given color, e.g. 'white' or (255,255,255).
Moreover, if transparency is true, then add an alpha channel from the polygon mask (i.e. everything outside the polygon will be transparent, for those consumers that can interpret alpha channels). Images which already have an alpha channel will have it reduced by the polygon mask (i.e. everything outside the polygon will be transparent, in addition to existing transparent pixels).
Return a new PIL.Image.
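The fill rule alone can be sketched on raw pixel arrays. This is not how the function is implemented: it assumes the image is already an (H, W, 3) numpy array and that a boolean mask of the polygon interior is available, it accepts only numeric color tuples (not names like 'white'), and it leaves out the alpha channel handling. The name fill_outside is hypothetical.

```python
import numpy as np

def fill_outside(pixels, inside, fill='background'):
    """Return a copy of an (H, W, 3) array with pixels outside the mask recolored."""
    result = pixels.copy()
    if fill == 'none':
        return result                      # leave the color channels alone
    if fill == 'background':
        # median of each channel over the whole image
        color = np.median(pixels.reshape(-1, pixels.shape[-1]), axis=0)
    else:
        color = np.asarray(fill)           # e.g. (255, 255, 255)
    result[~inside] = color.astype(pixels.dtype)
    return result

# A mostly uniform 4x4 image with one outlier pixel outside the polygon.
pixels = np.full((4, 4, 3), 10, dtype=np.uint8)
pixels[0, 0] = 200
inside = np.zeros((4, 4), dtype=bool)
inside[1:3, 1:3] = True

masked = fill_outside(pixels, inside)      # outlier replaced by the median
```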
- ocrd_utils.image.points_from_bbox(minx, miny, maxx, maxy)[source]¶
Construct polygon coordinates in page representation from a numeric list representing a bounding box.
- ocrd_utils.image.points_from_polygon(polygon)[source]¶
Convert polygon coordinates from a numeric list representation to a page representation.
- ocrd_utils.image.points_from_x0y0x1y1(xyxy)[source]¶
Construct a polygon representation from a rectangle described as a list [x0, y0, x1, y1].
- ocrd_utils.image.points_from_xywh(box)[source]¶
Construct polygon coordinates in page representation from numeric dict representing a bounding box.
- ocrd_utils.image.points_from_y0x0y1x1(yxyx)[source]¶
Construct a polygon representation from a rectangle described as a list [y0, x0, y1, x1].
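A sketch of the points constructors, assuming that "page representation" is a PAGE-XML points string of the form "x1,y1 x2,y2 ..." and that rectangles are walked from the upper left corner via the upper right, lower right and lower left. The function names are hypothetical stand-ins.

```python
def points_of_polygon(polygon):
    """Serialize [[x, y], ...] as 'x1,y1 x2,y2 ...' with integer coordinates."""
    return ' '.join('%i,%i' % (x, y) for x, y in polygon)

def points_of_bbox(minx, miny, maxx, maxy):
    """Serialize the four corners of a bounding box, starting at the upper left."""
    return points_of_polygon([[minx, miny], [maxx, miny],
                              [maxx, maxy], [minx, maxy]])

def points_of_xywh(box):
    """Serialize a {'x', 'y', 'w', 'h'} dict as a points string."""
    return points_of_bbox(box['x'], box['y'],
                          box['x'] + box['w'], box['y'] + box['h'])

def points_of_y0x0y1x1(yxyx):
    """Serialize a [y0, x0, y1, x1] rectangle as a points string."""
    y0, x0, y1, x1 = (int(v) for v in yxyx)
    return points_of_bbox(x0, y0, x1, y1)
```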
- ocrd_utils.image.polygon_from_bbox(minx, miny, maxx, maxy)[source]¶
Construct polygon coordinates in numeric list representation from a numeric list representing a bounding box.
- ocrd_utils.image.polygon_from_points(points)[source]¶
Convert polygon coordinates in page representation to polygon coordinates in numeric list representation.
- ocrd_utils.image.polygon_from_x0y0x1y1(x0y0x1y1)[source]¶
Construct polygon coordinates in numeric list representation from a string list representing a bounding box.
- ocrd_utils.image.polygon_from_xywh(xywh)[source]¶
Construct polygon coordinates in numeric list representation from numeric dict representing a bounding box.
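The polygon constructors can be sketched the same way, assuming the numeric list representation is a list of [x, y] pairs starting at the upper left corner, and that the string list for polygon_from_x0y0x1y1 holds decimal integers. The names below are hypothetical.

```python
def polygon_of_bbox(minx, miny, maxx, maxy):
    """Return the four corners of a bounding box as [x, y] pairs."""
    return [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy]]

def polygon_of_points(points):
    """Parse an 'x1,y1 x2,y2 ...' string into a list of [x, y] pairs."""
    return [[float(v) for v in pair.split(',')] for pair in points.split(' ')]

def polygon_of_x0y0x1y1(x0y0x1y1):
    """Build a rectangle polygon from a list of strings [x0, y0, x1, y1]."""
    return polygon_of_bbox(*(int(v) for v in x0y0x1y1))

def polygon_of_xywh(xywh):
    """Build a rectangle polygon from a {'x', 'y', 'w', 'h'} dict."""
    return polygon_of_bbox(xywh['x'], xywh['y'],
                           xywh['x'] + xywh['w'], xywh['y'] + xywh['h'])
```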
- ocrd_utils.image.polygon_mask(image, coordinates)[source]¶
Create a mask image of a polygon.
Given image, a PIL.Image (merely for dimensions), and coordinates, a numpy array of polygon coordinates relative to the image, create a new image of the same size with black background, and fill everything inside the polygon hull with white.
Return the new PIL.Image.
- ocrd_utils.image.rotate_coordinates(transform, angle, orig=array([0, 0]))[source]¶
Compose an affine coordinate transformation with a passive rotation.
Given transform, a numpy array of an existing transformation matrix in homogeneous (3d) coordinates, and angle, a rotation angle in degrees counter-clockwise, as well as orig, a numpy array of the center of rotation, calculate the affine coordinate transform corresponding to the composition of both transformations. (This entails translation to the center, followed by pure rotation, and subsequent translation back. However, since rotation necessarily increases the bounding box, and thus image size, do not translate back the same amount, but to the enlarged offset.)
Return a numpy array of the resulting affine transformation matrix.
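The three steps in parentheses can be written out as matrix products. This is a sketch under stated assumptions: image coordinates with the y-axis pointing down, a passive rotation matrix of the form [[cos, sin], [-sin, cos]], and an enlarged offset equal to the center of the rotated bounding box. The helper names are hypothetical.

```python
import numpy as np

def translate(transform, offset):
    """Compose `transform` with a translation by `offset`."""
    shift = np.eye(3)
    shift[:2, 2] = offset
    return shift @ transform

def rotate_about(transform, angle, orig):
    """Compose `transform` with a passive rotation around `orig`."""
    orig = np.asarray(orig, dtype=float)
    rad = np.deg2rad(angle)
    cos, sin = np.cos(rad), np.sin(rad)
    rot = np.array([[cos, sin, 0],
                    [-sin, cos, 0],
                    [0, 0, 1]])
    transform = translate(transform, -orig)       # move center to the origin
    transform = rot @ transform                   # pure rotation
    # translate back to the center of the enlarged canvas, not the old one
    new_orig = np.array([[abs(cos), abs(sin)],
                         [abs(sin), abs(cos)]]) @ orig
    return translate(transform, new_orig)

# Rotating a 100x50 image by 90 degrees around its center (50, 25):
# the old upper left corner ends up at the lower left of the 50x100 result.
matrix = rotate_about(np.eye(3), 90, [50, 25])
corner = matrix @ np.array([0, 0, 1])
```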
- ocrd_utils.image.shift_coordinates(transform, offset)[source]¶
Compose an affine coordinate transformation with a translation.
Given transform, a numpy array of an existing transformation matrix in homogeneous (3d) coordinates, and offset, a numpy array of the translation vector, calculate the affine coordinate transform corresponding to the composition of both transformations.
Return a numpy array of the resulting affine transformation matrix.
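In matrix terms, the composition amounts to left-multiplying the existing transform with a translation matrix. A minimal sketch, assuming the new operation is applied after the existing ones (compose_shift is a hypothetical name):

```python
import numpy as np

def compose_shift(transform, offset):
    """Compose `transform` with a subsequent translation by `offset`."""
    shift = np.eye(3)
    shift[0, 2], shift[1, 2] = offset
    return shift @ transform

# Two successive translations simply add up.
matrix = compose_shift(np.eye(3), [-200, -100])
matrix = compose_shift(matrix, [5, 5])
```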
- ocrd_utils.image.scale_coordinates(transform, factors)[source]¶
Compose an affine coordinate transformation with a proportional scaling.
Given transform, a numpy array of an existing transformation matrix in homogeneous (3d) coordinates, and factors, a numpy array of the scaling factors, calculate the affine coordinate transform corresponding to the composition of both transformations.
Return a numpy array of the resulting affine transformation matrix.
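Scaling composes in the same way, with the factors on the diagonal of the left-hand matrix. A minimal sketch, assuming factors holds one factor per axis (x, y) and that the scaling is applied after the existing transform; compose_scale is a hypothetical name.

```python
import numpy as np

def compose_scale(transform, factors):
    """Compose `transform` with a subsequent scaling by `factors` (x, y)."""
    scale = np.eye(3)
    scale[0, 0], scale[1, 1] = factors
    return scale @ transform

# A crop at page offset (200, 100), followed by downscaling to half size.
crop = np.array([[1, 0, -200],
                 [0, 1, -100],
                 [0, 0, 1]], dtype=float)
matrix = compose_scale(crop, [0.5, 0.5])
point = matrix @ np.array([300, 200, 1])
```

Note that the existing translation is scaled as well, since the scaling acts on the already transformed coordinates.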
- ocrd_utils.image.transform_coordinates(polygon, transform=None)[source]¶
Apply an affine transformation to a set of points.
Augment polygon, the 2d numpy array of points, with an extra column of ones (homogeneous coordinates), then multiply with the transformation matrix transform (or the identity matrix), and finally remove the extra column from the result.
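The three steps map directly onto numpy calls. The sketch below assumes points are stored one per row as (x, y) and that the matrix acts on column vectors; apply_transform is a hypothetical name.

```python
import numpy as np

def apply_transform(polygon, transform=None):
    """Apply a 3x3 affine transform to an (N, 2) array of points."""
    if transform is None:
        transform = np.eye(3)
    # append a column of ones to get homogeneous coordinates
    points = np.insert(np.asarray(polygon, dtype=float), 2, 1, axis=1)
    # multiply every point (as a column vector) with the matrix
    points = (transform @ points.T).T
    # drop the homogeneous column again
    return np.delete(points, 2, axis=1)

shift = np.array([[1, 0, 3],
                  [0, 1, 4],
                  [0, 0, 1]], dtype=float)
moved = apply_transform([[1, 2], [5, 6]], shift)
```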
- ocrd_utils.image.transpose_coordinates(transform, method, orig=array([0, 0]))[source]¶
Compose an affine coordinate transformation with a transposition (i.e. flip or rotate in 90° multiples).
Given transform, a numpy array of an existing transformation matrix in homogeneous (3d) coordinates, and method, a transposition mode, as well as orig, a numpy array of the center of the image, calculate the affine coordinate transform corresponding to the composition of both transformations, which is respectively:
  - PIL.Image.Transpose.FLIP_LEFT_RIGHT: entails translation to the center, followed by pure reflection about the y-axis, and subsequent translation back
  - PIL.Image.Transpose.FLIP_TOP_BOTTOM: entails translation to the center, followed by pure reflection about the x-axis, and subsequent translation back
  - PIL.Image.Transpose.ROTATE_180: entails translation to the center, followed by pure reflection about the origin, and subsequent translation back
  - PIL.Image.Transpose.ROTATE_90: entails translation to the center, followed by pure rotation by 90° counter-clockwise, and subsequent translation back
  - PIL.Image.Transpose.ROTATE_270: entails translation to the center, followed by pure rotation by 270° counter-clockwise, and subsequent translation back
  - PIL.Image.Transpose.TRANSPOSE: entails translation to the center, followed by pure rotation by 90° counter-clockwise and pure reflection about the x-axis, and subsequent translation back
  - PIL.Image.Transpose.TRANSVERSE: entails translation to the center, followed by pure rotation by 90° counter-clockwise and pure reflection about the y-axis, and subsequent translation back
Return a numpy array of the resulting affine transformation matrix.
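The table of modes above can be expressed as sequences of three elementary matrices. In this sketch the string keys are hypothetical stand-ins for the PIL.Image.Transpose members (so the example runs without Pillow), the y-axis is assumed to point down, and "translation back" is assumed to target the center of the transposed canvas, which has width and height swapped for the axis-exchanging modes.

```python
import numpy as np

ROT90 = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 1]])   # 90° counter-clockwise
REFLX = np.array([[1, 0, 0], [0, -1, 0], [0, 0, 1]])   # about the x-axis
REFLY = np.array([[-1, 0, 0], [0, 1, 0], [0, 0, 1]])   # about the y-axis

OPERATIONS = {
    'FLIP_LEFT_RIGHT': [REFLY],
    'FLIP_TOP_BOTTOM': [REFLX],
    'ROTATE_90': [ROT90],
    'ROTATE_180': [ROT90, ROT90],
    'ROTATE_270': [ROT90, ROT90, ROT90],
    'TRANSPOSE': [ROT90, REFLX],
    'TRANSVERSE': [ROT90, REFLY],
}
SWAPS_AXES = {'ROTATE_90', 'ROTATE_270', 'TRANSPOSE', 'TRANSVERSE'}

def translate(transform, offset):
    """Compose `transform` with a translation by `offset`."""
    shift = np.eye(3)
    shift[:2, 2] = offset
    return shift @ transform

def transpose_about(transform, method, orig):
    """Compose `transform` with a transposition around the center `orig`."""
    orig = np.asarray(orig, dtype=float)
    transform = translate(transform, -orig)       # move center to the origin
    for operation in OPERATIONS[method]:
        transform = operation @ transform         # pure rotation/reflection
    # the center of the result has x and y swapped if the axes were exchanged
    new_orig = orig[::-1] if method in SWAPS_AXES else orig
    return translate(transform, new_orig)

def map_point(method, point, orig=(50, 25)):
    """Map one (x, y) point of a 100x50 image through the given mode."""
    matrix = transpose_about(np.eye(3), method, orig)
    return (matrix @ np.array([point[0], point[1], 1.0]))[:2]
```

For instance, map_point('TRANSPOSE', (10, 5)) exchanges the two coordinates, which matches the usual meaning of transposing an image.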
- ocrd_utils.image.xywh_from_bbox(minx, miny, maxx, maxy)[source]¶
Convert a bounding box from a numeric list to a numeric dict representation.
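A sketch of this conversion, assuming the numeric dict uses the keys x, y, w and h for the upper left corner, width and height (xywh_of_bbox is a hypothetical name):

```python
def xywh_of_bbox(minx, miny, maxx, maxy):
    """Convert (minx, miny, maxx, maxy) into an {'x', 'y', 'w', 'h'} dict."""
    return {'x': minx, 'y': miny, 'w': maxx - minx, 'h': maxy - miny}

box = xywh_of_bbox(10, 20, 40, 60)
# Converting back recovers the original corners.
corners = (box['x'], box['y'], box['x'] + box['w'], box['y'] + box['h'])
```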