Crops

Transforms

class BBoxSafeRandomCrop(erosion_rate: float = 0.0, always_apply=False, p=1.0)[source]

Bases: DualTransform

Crop a random part of the input without loss of bboxes.

Parameters:
  • erosion_rate (float) – erosion rate applied on input image height before crop.

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes

Image types:

uint8, float32

apply(img: ndarray, crop_height: int = 0, crop_width: int = 0, crop_depth: int = 0, h_start: int = 0, w_start: int = 0, d_start: int = 0, **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], crop_height: int = 0, crop_width: int = 0, crop_depth: int = 0, h_start: int = 0, w_start: int = 0, d_start: int = 0, rows: int = 0, cols: int = 0, slices: int = 0, **params) Tuple[float, float, float, float][source]
get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
property targets_as_params: List[str]
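
The idea of a crop that never cuts a bounding box can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `bbox_safe_crop_bounds` is hypothetical, boxes are assumed to be normalized `(x_min, y_min, z_min, x_max, y_max, z_max)` tuples, and erosion is left out.

```python
import random
from typing import List, Tuple

BBox = Tuple[float, float, float, float, float, float]


def bbox_safe_crop_bounds(bboxes: List[BBox], rng: random.Random) -> BBox:
    """Return normalized crop bounds that enclose every box in `bboxes`.

    Hypothetical helper: boxes are (x_min, y_min, z_min, x_max, y_max, z_max)
    with coordinates in [0, 1].
    """
    # Union of all boxes: the region the crop must not cut into.
    lo = [min(b[i] for b in bboxes) for i in range(3)]
    hi = [max(b[i + 3] for b in bboxes) for i in range(3)]
    # Each crop face is sampled between the image border and the union.
    start = [rng.uniform(0.0, lo[i]) for i in range(3)]
    end = [rng.uniform(hi[i], 1.0) for i in range(3)]
    return (start[0], start[1], start[2], end[0], end[1], end[2])


boxes = [(0.2, 0.3, 0.1, 0.5, 0.6, 0.4), (0.4, 0.1, 0.3, 0.7, 0.5, 0.8)]
crop = bbox_safe_crop_bounds(boxes, random.Random(0))
```

Because every crop face lies on or outside the union of the boxes, no box is truncated regardless of the random draw.
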
class CenterCrop(height: int, width: int, depth: int, always_apply=False, p=1.0)[source]

Bases: DualTransform

Crop the central part of the input.

Parameters:
  • height (int) – height of the crop.

  • width (int) – width of the crop.

  • depth (int) – depth of the crop.

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, int32, float32

apply(img: ndarray, **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
get_transform_init_args_names() Tuple[str, ...][source]
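
As a rough sketch of what cropping the central part means in three dimensions, the start of the crop along each axis can be taken as half of the leftover size. The function name `center_crop_coords` is hypothetical and the rounding choice (floor division) is an assumption, not something this reference states.

```python
from typing import Tuple


def center_crop_coords(
    height: int, width: int, depth: int,
    crop_height: int, crop_width: int, crop_depth: int,
) -> Tuple[int, int, int, int, int, int]:
    """Return (x1, y1, z1, x2, y2, z2) of a crop centered in the volume."""
    if crop_height > height or crop_width > width or crop_depth > depth:
        raise ValueError("crop size must not exceed the input size")
    y1 = (height - crop_height) // 2
    x1 = (width - crop_width) // 2
    z1 = (depth - crop_depth) // 2
    return x1, y1, z1, x1 + crop_width, y1 + crop_height, z1 + crop_depth


coords = center_crop_coords(100, 80, 40, crop_height=50, crop_width=40, crop_depth=10)
# coords == (20, 25, 15, 60, 75, 25)
```
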
class Crop(x_min: int = 0, y_min: int = 0, z_min: int = 0, x_max: int = 1024, y_max: int = 1024, z_max: int = 1024, always_apply=False, p=1.0)[source]

Bases: DualTransform

Crop region from image.

Parameters:
  • x_min (int) – Minimum closest upper left x coordinate.

  • y_min (int) – Minimum closest upper left y coordinate.

  • z_min (int) – Minimum closest upper left z coordinate.

  • x_max (int) – Maximum furthest lower right x coordinate.

  • y_max (int) – Maximum furthest lower right y coordinate.

  • z_max (int) – Maximum furthest lower right z coordinate.

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, int32, float32

apply(img: ndarray, **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
get_transform_init_args_names() Tuple[str, ...][source]
class CropAndPad(px: int | Sequence[float] | Sequence[Tuple] | None = None, percent: float | Sequence[float] | Sequence[Tuple] | None = None, pad_mode: str = 'constant', pad_cval: float | Sequence[float] = 0, pad_cval_mask: float | Sequence[float] = 0, keep_size: bool = True, sample_independently: bool = True, interpolation: int = 1, always_apply: bool = False, p: float = 1.0)[source]

Bases: DualTransform

Crop and pad images by pixel amounts or fractions of image sizes. Cropping removes pixels at the sides (i.e. extracts a subimage from a given full image). Padding adds pixels to the sides (e.g. black pixels). This transformation will never crop images below a height or width of 1.

Note

This transformation automatically resizes images back to their original size. To deactivate this, add the parameter keep_size=False.

Parameters:
  • px (int or tuple) –

    The number of pixels to crop (negative values) or pad (positive values) on each side of the image. Either this or the parameter percent may be set, not both at the same time.
    • If None, then pixel-based cropping/padding will not be used.

    • If int, then that exact number of pixels will always be cropped/padded.

    • If a tuple of two ints with values a and b, then each side will be cropped/padded by a random amount sampled uniformly per image and side from the interval [a, b]. If however sample_independently is set to False, only one value will be sampled per image and used for all sides.

    • If a tuple of six entries, then the entries represent top, bottom, left, right, close, far. Each entry may be a single int (always crop/pad by exactly that value), a tuple of two ints a and b (crop/pad by an amount within [a, b]), or a list of ints (crop/pad by a random value that is contained in the list).

  • percent (float or tuple) –

    The number of pixels to crop (negative values) or pad (positive values) on each side of the image given as a fraction of the image height/width. E.g. if this is set to -0.1, the transformation will always crop away 10% of the image’s height at both the top and the bottom (both 10% each), as well as 10% of the width at the right and left. Expected value range is (-1.0, inf). Either this or the parameter px may be set, not both at the same time:
    • If None, then fraction-based cropping/padding will not be used

    • If float, then that fraction will always be cropped/padded

    • If a tuple of two floats with values a and b, then each side will be cropped/padded by a random fraction sampled uniformly per image and side from the interval [a, b]. If however sample_independently is set to False, only one value will be sampled per image and used for all sides.

    • If a tuple of six entries, then the entries represent top, bottom, left, right, close, far. Each entry may be a single float (always crop/pad by exactly that percent value), a tuple of two floats a and b (crop/pad by a fraction from [a, b]), or a list of floats (crop/pad by a random value that is contained in the list).

  • pad_mode (str) –

    scipy parameter that determines how the input image is extended beyond its edges when padding. Must be one of the following:
    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the pad_cval parameter.

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html

    Default: constant

  • pad_cval (number, Sequence[number]) –

    The constant value to use if pad_mode is constant.
    • If number, then that value will be used.

    • If a tuple of two numbers and at least one of them is a float, then a random number will be uniformly sampled per image from the continuous interval [a, b] and used as the value. If both numbers are ints, the interval is discrete.

    • If a list of numbers, then a random value will be chosen from the elements of the list and used as the value.

  • pad_cval_mask (number, Sequence[number]) – Same as pad_cval but only for masks.

  • keep_size (bool) – After cropping and padding, the result image will usually have a different height/width compared to the original input image. If this parameter is set to True, then the cropped/padded image will be resized to the input image’s size, i.e. the output shape is always identical to the input shape.

  • sample_independently (bool) – If False and the values for px/percent result in exactly one probability distribution for all image sides, only one single value will be sampled from that probability distribution and used for all sides. I.e. the crop/pad amount then is the same for all sides. If True, one value will be sampled independently per side.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST) Default: dicaugment.INTER_LINEAR.
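
The rules for px described above can be sketched as a small resolver that turns any accepted px value into six per-side amounts. This is only an illustration of the documented rules under stated assumptions: `sample_entry` and `resolve_px` are hypothetical names, and the sampling details (for example inclusive integer endpoints) are assumptions rather than the library's behavior.

```python
import random
from typing import List, Tuple, Union

Entry = Union[int, Tuple[int, int], List[int]]


def sample_entry(entry: Entry, rng: random.Random) -> int:
    """Sample one crop/pad amount from a single per-side entry."""
    if isinstance(entry, int):
        return entry                  # exact amount
    if isinstance(entry, tuple):
        return rng.randint(*entry)    # uniform integer from [a, b]
    return rng.choice(entry)          # list: one of the listed values


def resolve_px(px, sample_independently: bool, rng: random.Random) -> List[int]:
    """Return six amounts ordered top, bottom, left, right, close, far.

    Negative amounts crop, positive amounts pad.
    """
    if isinstance(px, int):
        return [px] * 6
    if len(px) == 2 and all(isinstance(v, int) for v in px):
        if sample_independently:
            return [rng.randint(*px) for _ in range(6)]
        return [rng.randint(*px)] * 6  # one draw shared by all sides
    if len(px) != 6:
        raise ValueError("expected an int, a pair of ints, or six entries")
    return [sample_entry(entry, rng) for entry in px]


rng = random.Random(0)
per_side = resolve_px((-10, (0, 5), [2, 4], 0, -3, (1, 1)), True, rng)
shared = resolve_px((-4, 4), False, rng)
```

Here `per_side` crops 10 pixels from the top and 3 from the close side, while `shared` applies a single sampled amount to all six sides.
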

Targets:

image, mask, bboxes, keypoints

Image types:

any

apply(img: ndarray, crop_params: Sequence[int] = (), pad_params: Sequence[int] = (), pad_value: int | float = 0, rows: int = 0, cols: int = 0, slices: int = 0, interpolation: int = 1, **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], crop_params: Sequence[int] | None = None, pad_params: Sequence[int] | None = None, rows: int = 0, cols: int = 0, slices: int = 0, result_rows: int = 0, result_cols: int = 0, result_slices: int = 0, **params) Tuple[float, float, float, float][source]
apply_to_keypoint(keypoint: Tuple[float, float, float, float], crop_params: Sequence[int] | None = None, pad_params: Sequence[int] | None = None, rows: int = 0, cols: int = 0, slices: int = 0, result_rows: int = 0, result_cols: int = 0, result_slices: int = 0, **params) Tuple[float, float, float, float][source]
apply_to_mask(img: ndarray, crop_params: Sequence[int] | None = None, pad_params: Sequence[int] | None = None, pad_value_mask: float | None = None, rows: int = 0, cols: int = 0, slices: int = 0, interpolation: int = 0, **params) ndarray[source]
get_params_dependent_on_targets(params) dict[source]
get_transform_init_args_names() Tuple[str, ...][source]
property targets_as_params: List[str]
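
The pad_mode patterns listed in the parameters (for example d c b a | a b c d | d c b a for reflect) can be reproduced on a one-dimensional sequence with a few lines of index arithmetic. The helper `pad_1d` below is a hypothetical, pure-Python sketch for building intuition; it does not call scipy and is not how the transform pads volumes.

```python
from typing import List, Sequence

MODES = ("reflect", "constant", "nearest", "mirror", "wrap")


def pad_1d(values: Sequence, n: int, mode: str, cval=0) -> List:
    """Extend `values` by `n` items on each side using the given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown pad mode: {mode}")
    size = len(values)

    def pick(i: int):
        if 0 <= i < size:
            return values[i]
        if mode == "constant":
            return cval
        if mode == "nearest":
            return values[min(max(i, 0), size - 1)]
        if mode == "wrap":
            return values[i % size]
        if mode == "reflect":           # half-sample symmetric: edge item repeated
            period = 2 * size
            j = i % period
            return values[j if j < size else period - 1 - j]
        # mirror: whole-sample symmetric, edge item not repeated
        if size == 1:
            return values[0]
        period = 2 * size - 2
        j = i % period
        return values[j if j < size else period - j]

    return [pick(i) for i in range(-n, size + n)]


print("".join(pad_1d("abcd", 4, "reflect")))   # dcbaabcddcba
print("".join(pad_1d("abcd", 3, "mirror")))    # dcbabcdcba
```
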
class RandomCrop(height: int, width: int, depth: int, always_apply=False, p=1.0)[source]

Bases: DualTransform

Crop a random part of the input.

Parameters:
  • height (int) – height of the crop.

  • width (int) – width of the crop.

  • depth (int) – depth of the crop.

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, int32, float32

apply(img: ndarray, h_start: int = 0, w_start: int = 0, d_start: int = 0, **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
get_params() Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
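
One plausible way to turn the float-typed h_start, w_start, and d_start values into crop coordinates is to treat them as fractions in [0, 1) of the available slack along each axis. That interpretation is an assumption made for this sketch, and `random_crop_coords` is a hypothetical name.

```python
import random
from typing import Tuple


def random_crop_coords(
    height: int, width: int, depth: int,
    crop_height: int, crop_width: int, crop_depth: int,
    h_start: float, w_start: float, d_start: float,
) -> Tuple[int, int, int, int, int, int]:
    """Return (x1, y1, z1, x2, y2, z2); the *_start values are fractions in [0, 1)."""
    # (size - crop + 1) is the number of valid start positions along an axis.
    y1 = int((height - crop_height + 1) * h_start)
    x1 = int((width - crop_width + 1) * w_start)
    z1 = int((depth - crop_depth + 1) * d_start)
    return x1, y1, z1, x1 + crop_width, y1 + crop_height, z1 + crop_depth


rng = random.Random(0)
coords = random_crop_coords(100, 80, 40, 50, 40, 10,
                            rng.random(), rng.random(), rng.random())
fixed = random_crop_coords(100, 80, 40, 50, 40, 10, 0.0, 0.5, 0.999)
# fixed == (20, 0, 30, 60, 50, 40)
```
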
class RandomCropFromBorders(crop_left: float = 0.1, crop_right: float = 0.1, crop_top: float = 0.1, crop_bottom: float = 0.1, crop_close: float = 0.1, crop_far: float = 0.1, always_apply=False, p=1.0)[source]

Bases: DualTransform

Randomly cut parts from the borders of the image, without resizing at the end.

Parameters:
  • crop_left (float) – single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut from left side in range [0, crop_left * width)

  • crop_right (float) – single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut from right side in range [(1 - crop_right) * width, width)

  • crop_top (float) – single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut from top side in range [0, crop_top * height)

  • crop_bottom (float) – single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut from bottom side in range [(1 - crop_bottom) * height, height)

  • crop_close (float) – single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut from close side in range [0, crop_close * depth)

  • crop_far (float) – single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut from far side in range [(1 - crop_far) * depth, depth)

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32

apply(img: ndarray, x_min: int = 0, x_max: int = 0, y_min: int = 0, y_max: int = 0, z_min: int = 0, z_max: int = 0, **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], x_min: int = 0, x_max: int = 0, y_min: int = 0, y_max: int = 0, z_min: int = 0, z_max: int = 0, **params) Tuple[float, float, float, float][source]
apply_to_keypoint(keypoint: Tuple[float, float, float, float], x_min: int = 0, x_max: int = 0, y_min: int = 0, y_max: int = 0, z_min: int = 0, z_max: int = 0, **params) Tuple[float, float, float, float][source]
apply_to_mask(mask: ndarray, x_min: int = 0, x_max: int = 0, y_min: int = 0, y_max: int = 0, z_min: int = 0, z_max: int = 0, **params) ndarray[source]
get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
property targets_as_params: List[str]
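
The per-side ranges in the parameter list can be illustrated with a one-axis sampler applied three times. This is a sketch under assumptions: `sample_axis` is a hypothetical helper, and endpoint handling is simplified to inclusive integer draws, which may differ from the half-open ranges stated above.

```python
import random
from typing import Tuple


def sample_axis(size: int, crop_low: float, crop_high: float,
                rng: random.Random) -> Tuple[int, int]:
    """Sample (start, stop) slice bounds along one axis of length `size`."""
    start = rng.randint(0, int(crop_low * size))
    # Keep at least one voxel: stop must lie strictly after start.
    stop = rng.randint(max(start + 1, int((1 - crop_high) * size)), size)
    return start, stop


rng = random.Random(0)
height, width, depth = 128, 128, 64
y_min, y_max = sample_axis(height, 0.1, 0.1, rng)   # crop_top, crop_bottom
x_min, x_max = sample_axis(width, 0.1, 0.1, rng)    # crop_left, crop_right
z_min, z_max = sample_axis(depth, 0.1, 0.1, rng)    # crop_close, crop_far
```

The output size varies from call to call because, as the class description says, no resize is applied afterwards.
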
class RandomCropNearBBox(max_part_shift: float | Tuple[float, float, float] = (0.3, 0.3, 0.3), cropping_box_key: str = 'cropping_bbox', always_apply: bool = False, p: float = 1.0)[source]

Bases: DualTransform

Crop a bbox from the image with a random shift along the x, y, and z coordinates.

Parameters:
  • max_part_shift (float, (float, float, float)) – Max shift in height, width, and depth dimensions relative to cropping_bbox dimension. If max_part_shift is a single float, the range will be (max_part_shift, max_part_shift, max_part_shift). Default (0.3, 0.3, 0.3).

  • cropping_box_key (str) – Additional target key for cropping box. Default cropping_bbox

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32

Examples

>>> aug = Compose([RandomCropNearBBox(max_part_shift=(0.1, 0.5, 0.5), cropping_box_key='test_box')],
...               bbox_params=BboxParams("pascal_voc"))
>>> result = aug(image=image, bboxes=bboxes, test_box=[0, 5, 10, 20, 25, 30])
apply(img: ndarray, x_min: int = 0, y_min: int = 0, z_min: int = 0, x_max: int = 0, y_max: int = 0, z_max: int = 0, **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
apply_to_keypoint(keypoint: Tuple[float, float, float, float], x_min: int = 0, y_min: int = 0, z_min: int = 0, x_max: int = 0, y_max: int = 0, z_max: int = 0, **params) Tuple[float, float, float, float][source]
get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, int][source]
get_transform_init_args_names() Tuple[str, ...][source]
property targets_as_params: List[str]
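
To make the role of max_part_shift concrete, the sketch below jitters each face of a pixel-space cropping box by at most the given fraction of the box size along that axis. It is an assumption-laden illustration: `crop_near_bbox` is a hypothetical name, the box layout `(x_min, y_min, z_min, x_max, y_max, z_max)` is assumed, and clamping to the image extent is reduced to keeping the lower bounds non-negative.

```python
import random
from typing import Sequence, Tuple


def crop_near_bbox(bbox: Sequence[int],
                   max_part_shift: Tuple[float, float, float],
                   rng: random.Random) -> Tuple[int, ...]:
    """Jitter each face of a pixel bbox (x_min, y_min, z_min, x_max, y_max, z_max)."""
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    # max_part_shift is ordered (height, width, depth), as in the parameter description.
    h_shift = round((y_max - y_min) * max_part_shift[0])
    w_shift = round((x_max - x_min) * max_part_shift[1])
    d_shift = round((z_max - z_min) * max_part_shift[2])
    shifts = (w_shift, h_shift, d_shift)  # reorder to (x, y, z)
    lows = [max(0, lo - rng.randint(-s, s))
            for lo, s in zip((x_min, y_min, z_min), shifts)]
    highs = [hi + rng.randint(-s, s)
             for hi, s in zip((x_max, y_max, z_max), shifts)]
    return tuple(lows + highs)


crop = crop_near_bbox((10, 20, 5, 50, 60, 25), (0.3, 0.3, 0.3), random.Random(0))
```
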
class RandomSizedBBoxSafeCrop(height: int, width: int, depth: int, erosion_rate: float = 0.0, interpolation: int = 1, always_apply=False, p=1.0)[source]

Bases: BBoxSafeRandomCrop

Crop a random part of the input and rescale it to some size without loss of bboxes.

Parameters:
  • height (int) – height after crop and resize.

  • width (int) – width after crop and resize.

  • depth (int) – depth after crop and resize.

  • erosion_rate (float) – erosion rate applied on input image height before crop.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST) Default: dicaugment.INTER_LINEAR.

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes

Image types:

uint8, float32

apply(img: ndarray, crop_height: int = 0, crop_width: int = 0, crop_depth: int = 0, h_start: int = 0, w_start: int = 0, d_start: int = 0, interpolation: int = 1, **params) ndarray[source]
get_transform_init_args_names() Tuple[str, ...][source]
class RandomSizedCrop(min_max_height: Tuple[int, int], height: int, width: int, depth: int, w2h_ratio: float = 1.0, d2h_ratio: float = 1.0, interpolation: int = 1, always_apply=False, p=1.0)[source]

Bases: _BaseRandomSizedCrop

Crop a random part of the input and rescale it to some size.

Parameters:
  • min_max_height ((int, int)) – crop size limits.

  • height (int) – height after crop and resize.

  • width (int) – width after crop and resize.

  • depth (int) – depth after crop and resize.

  • w2h_ratio (float) – ratio of crop width to crop height.

  • d2h_ratio (float) – ratio of crop depth to crop height.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST) Default: dicaugment.INTER_LINEAR.

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, float32

get_params() Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
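
The interplay of min_max_height, w2h_ratio, and d2h_ratio can be sketched as follows: a crop height is drawn from the given limits, and the crop width and depth are derived from it through the two ratios before the crop is resized to (height, width, depth). The helper name `sample_crop_shape` and the truncation to int are assumptions made for this illustration.

```python
import random
from typing import Tuple


def sample_crop_shape(min_max_height: Tuple[int, int], w2h_ratio: float,
                      d2h_ratio: float, rng: random.Random) -> Tuple[int, int, int]:
    """Return (crop_height, crop_width, crop_depth) before resizing."""
    crop_height = rng.randint(min_max_height[0], min_max_height[1])
    crop_width = int(crop_height * w2h_ratio)
    crop_depth = int(crop_height * d2h_ratio)
    return crop_height, crop_width, crop_depth


crop_h, crop_w, crop_d = sample_crop_shape((64, 128), 1.0, 0.5, random.Random(0))
```

With w2h_ratio=1.0 and d2h_ratio=0.5, the sampled region is square in-plane and half as deep as it is tall.
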

Functional

bbox_center_crop(bbox: Tuple[float, float, float, float], crop_height: int, crop_width: int, crop_depth: int, rows: int, cols: int, slices: int)[source]
bbox_crop(bbox: Tuple[float, float, float, float], x_min: int, y_min: int, z_min: int, x_max: int, y_max: int, z_max: int, rows: int, cols: int, slices: int)[source]

Crop a bounding box.

Parameters:
  • bbox (tuple) – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • x_min (int) –

  • y_min (int) –

  • z_min (int) –

  • x_max (int) –

  • y_max (int) –

  • z_max (int) –

  • rows (int) – Image height.

  • cols (int) – Image width.

  • slices (int) – Image depth.

Returns:

A cropped bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

Return type:

tuple

bbox_random_crop(bbox: Tuple[float, float, float, float], crop_height: int, crop_width: int, crop_depth: int, h_start: float, w_start: float, d_start: float, rows: int, cols: int, slices: int)[source]
center_crop(img: ndarray, crop_height: int, crop_width: int, crop_depth: int)[source]
clamping_crop(img: ndarray, x_min: int, y_min: int, z_min: int, x_max: int, y_max: int, z_max: int)[source]
crop(img: ndarray, x_min: int, y_min: int, z_min: int, x_max: int, y_max: int, z_max: int)[source]
crop_and_pad(img: ndarray, crop_params: Sequence[int] | None, pad_params: Sequence[int] | None, pad_value: float | None, rows: int, cols: int, slices: int, interpolation: int, pad_mode: int, keep_size: bool) ndarray[source]
crop_and_pad_bbox(bbox: Tuple[float, float, float, float], crop_params: Sequence[int] | None, pad_params: Sequence[int] | None, rows, cols, slices, result_rows, result_cols, result_slices) Tuple[float, float, float, float][source]
crop_and_pad_keypoint(keypoint: Tuple[float, float, float, float], crop_params: Sequence[int] | None, pad_params: Sequence[int] | None, rows: int, cols: int, slices: int, result_rows: int, result_cols: int, result_slices: int, keep_size: bool) Tuple[float, float, float, float][source]
crop_bbox_by_coords(bbox: Tuple[float, float, float, float], crop_coords: Tuple[int, int, int, int, int, int], crop_height: int, crop_width: int, crop_depth: int, rows: int, cols: int, slices: int)[source]

Crop a bounding box using the provided coordinates of the closest-top-left and furthest-bottom-right corners in pixels and the required height, width, and depth of the crop.

Parameters:
  • bbox (tuple) – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • crop_coords (tuple) – Crop coordinates (x1, y1, z1, x2, y2, z2).

  • crop_height (int) –

  • crop_width (int) –

  • crop_depth (int) –

  • rows (int) – Image rows.

  • cols (int) – Image cols.

  • slices (int) – Image slices.

Returns:

A cropped bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

Return type:

tuple
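
A minimal sketch of the coordinate arithmetic behind cropping a bounding box, assuming the box is stored in normalized form (fractions of the image size): the box is converted to pixels, shifted by the crop origin, and renormalized by the crop size. The name `crop_bbox_sketch` is hypothetical, and the normalized-coordinate convention is an assumption that this reference does not state.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float, float, float]


def crop_bbox_sketch(bbox: BBox, crop_coords: Tuple[int, int, int, int, int, int],
                     rows: int, cols: int, slices: int) -> BBox:
    """Re-express a normalized bbox relative to a crop region."""
    x1, y1, z1, x2, y2, z2 = crop_coords
    crop_width, crop_height, crop_depth = x2 - x1, y2 - y1, z2 - z1
    bx1, by1, bz1, bx2, by2, bz2 = bbox
    return (
        (bx1 * cols - x1) / crop_width,     # x: cols is the image width
        (by1 * rows - y1) / crop_height,    # y: rows is the image height
        (bz1 * slices - z1) / crop_depth,   # z: slices is the image depth
        (bx2 * cols - x1) / crop_width,
        (by2 * rows - y1) / crop_height,
        (bz2 * slices - z1) / crop_depth,
    )


cropped = crop_bbox_sketch((0.25, 0.5, 0.2, 0.5, 0.75, 0.6),
                           (40, 40, 0, 140, 90, 40), rows=100, cols=200, slices=50)
```

The result is again normalized, this time relative to the 100 x 50 x 40 crop rather than the original volume.
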

crop_keypoint_by_coords(keypoint: Tuple[float, float, float, float], crop_coords: Tuple[int, int, int, int, int, int])[source]

Crop a keypoint using the provided coordinates of closest-top-left and furthest-bottom-right corners in pixels and the required height, width, and depth of the crop.

Parameters:
  • keypoint (tuple) – A keypoint (x, y, z, angle, scale).

  • crop_coords (tuple) – Crop box coords (x1, y1, z1, x2, y2, z2).

Returns:

A keypoint (x, y, z, angle, scale).
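
For keypoints the same idea reduces to a translation: the crop origin is subtracted from the position, while angle and scale pass through unchanged. The sketch below uses the hypothetical name `shift_keypoint` and assumes pixel coordinates.

```python
from typing import Tuple

Keypoint = Tuple[float, float, float, float, float]


def shift_keypoint(keypoint: Keypoint,
                   crop_coords: Tuple[int, int, int, int, int, int]) -> Keypoint:
    """Translate a keypoint (x, y, z, angle, scale) into crop coordinates."""
    x, y, z, angle, scale = keypoint
    x1, y1, z1 = crop_coords[:3]  # only the crop origin matters for a translation
    return (x - x1, y - y1, z - z1, angle, scale)


moved = shift_keypoint((30, 40, 12, 0.0, 1.0), (10, 20, 5, 110, 120, 45))
# moved == (20, 20, 7, 0.0, 1.0)
```
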

get_center_crop_coords(height: int, width: int, depth: int, crop_height: int, crop_width: int, crop_depth: int)[source]
get_random_crop_coords(height: int, width: int, depth: int, crop_height: int, crop_width: int, crop_depth: int, h_start: float, w_start: float, d_start: float)[source]
keypoint_center_crop(keypoint: Tuple[float, float, float, float], crop_height: int, crop_width: int, crop_depth: int, rows: int, cols: int, slices: int)[source]

Keypoint center crop.

Parameters:
  • keypoint (tuple) – A keypoint (x, y, z, angle, scale).

  • crop_height (int) – Crop height.

  • crop_width (int) – Crop width.

  • crop_depth (int) – Crop depth.

  • rows (int) – Image height.

  • cols (int) – Image width.

  • slices (int) – Image depth.

Returns:

A keypoint (x, y, z, angle, scale).

Return type:

tuple

keypoint_random_crop(keypoint: Tuple[float, float, float, float], crop_height: int, crop_width: int, crop_depth: int, h_start: float, w_start: float, d_start: float, rows: int, cols: int, slices: int)[source]

Keypoint random crop.

Parameters:
  • keypoint (tuple) – A keypoint (x, y, z, angle, scale).

  • crop_height (int) – Crop height.

  • crop_width (int) – Crop width.

  • crop_depth (int) – Crop depth.

  • h_start (float) – Crop height start.

  • w_start (float) – Crop width start.

  • d_start (float) – Crop depth start.

  • rows (int) – Image height.

  • cols (int) – Image width.

  • slices (int) – Image depth.

Returns:

A keypoint (x, y, z, angle, scale).

random_crop(img: ndarray, crop_height: int, crop_width: int, crop_depth: int, h_start: float, w_start: float, d_start: float)[source]