Augmentations

Subpackages

Transforms

class Downscale(scale_min: float = 0.25, scale_max: float = 0.25, interpolation: int | Interpolation | Dict[str, int] | None = None, always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Decreases image quality by downscaling and upscaling back.

Parameters:
  • scale_min (float) – lower bound on the image scale. Should be < 1.

  • scale_max (float) – upper bound on the image scale. Should be < 1.

  • interpolation (int, dict, Interpolation) –

    scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Could be:

    • Single Scipy interpolation flag: The selected method will be used for both downscale and upscale.

    • dict of flags: Dictionary with keys ‘downscale’ and ‘upscale’ specifying the interpolation flags for each operation.

    • Interpolation object: Downscale.Interpolation object with flags for both downscale and upscale.

    Default: Interpolation(downscale=dicaugment.INTER_NEAREST, upscale=dicaugment.INTER_NEAREST)

Targets:

image

Image types:

uint8, uint16, int16, int32, float32
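
The effect described for Downscale can be pictured with a minimal NumPy sketch. It assumes a single 2D slice and nearest-neighbor resampling in both directions; the function name `downscale_sketch` is hypothetical and this is not the library's implementation.

```python
import numpy as np

def downscale_sketch(img: np.ndarray, scale: float) -> np.ndarray:
    """Shrink a 2D image by `scale`, then enlarge it back (nearest-neighbor)."""
    h, w = img.shape
    small_h = max(1, round(h * scale))
    small_w = max(1, round(w * scale))
    # Downscale: keep evenly spaced source rows and columns.
    rows = np.arange(small_h) * h // small_h
    cols = np.arange(small_w) * w // small_w
    small = img[np.ix_(rows, cols)]
    # Upscale: map every output pixel back to a low-resolution pixel.
    up_rows = np.arange(h) * small_h // h
    up_cols = np.arange(w) * small_w // w
    return small[np.ix_(up_rows, up_cols)]

img = np.arange(64, dtype=np.uint8).reshape(8, 8)
degraded = downscale_sketch(img, scale=0.25)
```

The output keeps the original shape but only carries the information of the 2 x 2 intermediate image, which is the quality loss the transform is meant to simulate.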

class Interpolation(*, downscale: int = 0, upscale: int = 0)[source]

Bases: object

apply(img: ndarray, scale: float | None = None, **params) ndarray[source]
get_params() Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
class Equalize(range: int | Tuple[int, int] | None = None, mask: ndarray | callable | None = None, mask_params: Sequence[str] = (), always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Equalize the image histogram. For multi-channel images, each channel is processed individually.

Parameters:
  • range (int, list of int) – Histogram range. If int, then range is defined as [0, range]. If None, the range is calculated as [0, max(img)]. Default: None

  • mask (np.ndarray, callable) – If given, only the pixels selected by the mask are included in the analysis. Function signature must include image argument.

  • mask_params (list of str) – Params for mask function.

Targets:

image

Image types:

uint8, uint16, int16
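
As a rough illustration of histogram equalization over the range [0, range], here is a NumPy sketch for a single-channel image with non-negative integer values. The helper name `equalize_sketch` is hypothetical, the mask option is left out, and the library's actual algorithm may differ in detail.

```python
import numpy as np

def equalize_sketch(img: np.ndarray, hist_range=None) -> np.ndarray:
    """Equalize a single-channel, non-negative integer image over [0, hi]."""
    # If no range is given, use [0, max(img)] as described for `range=None`.
    hi = int(img.max()) if hist_range is None else int(hist_range)
    clipped = np.clip(img, 0, hi).astype(np.int64)
    hist = np.bincount(clipped.ravel(), minlength=hi + 1)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:
        # Constant image: nothing to spread out.
        return img.copy()
    # Lookup table that stretches the cumulative distribution over [0, hi].
    lut = np.clip((cdf - cdf_min) / denom, 0.0, 1.0) * hi
    return np.round(lut).astype(img.dtype)[clipped]

img = np.array([[0, 0, 1], [1, 2, 4]], dtype=np.uint8)
equalized = equalize_sketch(img)
```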

apply(image: ndarray, mask: None | ndarray = None, **params) ndarray[source]
get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
property targets_as_params: List[str]
class FromFloat(dtype: str = 'int16', min_value: float | None = None, max_value: float | None = None, always_apply=False, p=1.0)[source]

Bases: ImageOnlyTransform

Take an input array where all values should lie in the range [0, 1.0], multiply them by max_value and then cast the resulting value to a type specified by dtype. If max_value is None, the transform will try to infer the maximum value for the data type from the dtype argument.

This is the inverse transform for ToFloat.

Parameters:
  • min_value (float) – minimum possible input value. Default: None.

  • max_value (float) – maximum possible input value. Default: None.

  • dtype (string or numpy data type) – data type of the output. See the ‘Data types’ page from the NumPy docs. Default: ‘int16’.

  • p (float) – probability of applying the transform. Default: 1.0.

Targets:

image

Image types:

float32
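
The relationship between ToFloat and FromFloat can be sketched in NumPy as a round trip. The function names are hypothetical, min_value is ignored because its role is not described here, and the rounding before the cast is an assumption of this sketch rather than documented behavior.

```python
import numpy as np

def to_float_sketch(img: np.ndarray, max_value=None) -> np.ndarray:
    """Divide by max_value (inferred from the integer dtype if None)."""
    if max_value is None:
        max_value = np.iinfo(img.dtype).max
    return (img.astype(np.float64) / max_value).astype(np.float32)

def from_float_sketch(img: np.ndarray, dtype="int16", max_value=None) -> np.ndarray:
    """Multiply by max_value (inferred from `dtype` if None) and cast."""
    if max_value is None:
        max_value = np.iinfo(np.dtype(dtype)).max
    # Rounding avoids truncating values such as 999.9999 down to 999.
    return np.rint(img.astype(np.float64) * max_value).astype(dtype)

original = np.array([0, 1000, 32767], dtype=np.int16)
as_float = to_float_sketch(original)
restored = from_float_sketch(as_float, dtype="int16")
```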

apply(img: ndarray, **params) ndarray[source]
get_transform_init_args() Dict[str, Any][source]
class GaussNoise(var_limit: float | Tuple[float, float] = (10.0, 50.0), mean: float = 0, apply_to_channel_idx: int | None = None, per_channel: bool = True, always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Apply gaussian noise to the input image.

Parameters:
  • var_limit ((float, float) or float) – variance range for noise. If var_limit is a single float, the range will be (0, var_limit). Default: (10.0, 50.0).

  • mean (float) – mean of the noise. Default: 0

  • apply_to_channel_idx (int, None) – If not None, noise is applied only to the specified channel index. Default: None

  • per_channel (bool) – if set to True, noise will be sampled for each channel independently. Otherwise, the noise will be sampled once for all channels. Ignored if apply_to_channel_idx is not None. Default: True

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image

Image types:

uint8, uint16, int16, float32
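
The interaction of var_limit, mean and per_channel can be sketched as follows. This assumes a channels-last array and that the standard deviation is the square root of a variance drawn uniformly from var_limit; the helper name `sample_gauss_sketch` is hypothetical and apply_to_channel_idx is not covered.

```python
import numpy as np

def sample_gauss_sketch(shape, var_limit=(10.0, 50.0), mean=0.0,
                        per_channel=True, rng=None):
    """Return a noise array of `shape` (channel axis assumed to be last)."""
    rng = np.random.default_rng() if rng is None else rng
    # A single float is treated as the range (0, var_limit).
    if isinstance(var_limit, (int, float)):
        var_limit = (0.0, float(var_limit))
    variance = rng.uniform(var_limit[0], var_limit[1])
    sigma = variance ** 0.5
    if per_channel:
        # Independent noise for every channel.
        return rng.normal(mean, sigma, size=shape)
    # One noise field, repeated across all channels.
    shared = rng.normal(mean, sigma, size=shape[:-1])
    return np.broadcast_to(shared[..., np.newaxis], shape).copy()

image = np.zeros((4, 4, 3), dtype=np.float32)
gauss = sample_gauss_sketch(image.shape, per_channel=False,
                            rng=np.random.default_rng(0))
noisy = image + gauss.astype(np.float32)
```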

apply(img: ndarray, gauss: None | ndarray = None, **params) ndarray[source]
get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
property targets_as_params: List[str]
class InvertImg(always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Invert the input image by subtracting pixel values from the maximum value for the input image dtype.

Parameters:

p (float) – probability of applying the transform. Default: 0.5.

Targets:

image

Image types:

uint8, uint16, int16, float32
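
For unsigned integer images, the inversion described above reduces to one subtraction. The sketch below covers only that case; how signed and floating-point inputs are handled is not stated here, so the hypothetical `invert_sketch` helper rejects them instead of guessing.

```python
import numpy as np

def invert_sketch(img: np.ndarray) -> np.ndarray:
    """Invert an unsigned integer image: dtype maximum minus pixel value."""
    if not np.issubdtype(img.dtype, np.unsignedinteger):
        raise TypeError("this sketch only covers unsigned integer images")
    return np.iinfo(img.dtype).max - img

inverted = invert_sketch(np.array([0, 10, 255], dtype=np.uint8))
```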

apply(img: ndarray, **params) ndarray[source]
get_transform_init_args_names() Tuple[str, ...][source]
class Normalize(mean: None | float | Tuple[float] = None, std: None | float | Tuple[float] = None, always_apply: bool = False, p: float = 1.0)[source]

Bases: ImageOnlyTransform

Normalization is applied by the formula: img = (img - mean) / (std)

Parameters:
  • mean (None, float, list of float) – mean values along channel dimension. If None, mean is calculated per image at runtime.

  • std (None, float, list of float) – std values along channel dimension. If None, std is calculated per image at runtime.

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 1.0.

Targets:

image

Image types:

uint8, float32
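
The formula img = (img - mean) / std, including the fallback to per-image statistics, can be sketched in NumPy. The name `normalize_sketch` is hypothetical, and computing a single global mean and std when the arguments are None is an assumption of this sketch.

```python
import numpy as np

def normalize_sketch(img: np.ndarray, mean=None, std=None) -> np.ndarray:
    """Apply (img - mean) / std, using per-image statistics when None."""
    img = img.astype(np.float32)
    # Sequences broadcast against the last (channel) axis.
    mean = img.mean() if mean is None else np.asarray(mean, dtype=np.float32)
    std = img.std() if std is None else np.asarray(std, dtype=np.float32)
    return (img - mean) / std

normalized = normalize_sketch(np.array([[0, 2], [4, 6]], dtype=np.uint8))
```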

apply(image: ndarray, **params) ndarray[source]
get_transform_init_args_names() Tuple[str, ...][source]
class PixelDropout(dropout_prob: float = 0.01, per_channel: bool = False, drop_value: float | Sequence[float] | None = 0, mask_drop_value: float | Sequence[float] | None = None, always_apply: bool = False, p: float = 0.5)[source]

Bases: DualTransform

Set pixels to 0 with some probability.

Parameters:
  • dropout_prob (float) – pixel drop probability. Default: 0.01

  • per_channel (bool) – if set to True, the drop mask will be sampled for each channel; otherwise the same mask will be used for all channels. Default: False

  • drop_value (number or sequence of numbers or None) – Value that will be set in dropped places. If set to None, the value will be sampled randomly from the default range for the image dtype:

    • uint8: [0, 255]

    • uint16: [0, 65535]

    • uint32: [0, 4294967295]

    • int16: [-32768, 32767]

    • int32: [-2147483648, 2147483647]

    • float, double: [0, 1]

    Default: 0

  • mask_drop_value (number or sequence of numbers or None) – Value that will be set in dropped places in masks. If set to None, masks will be unchanged. Default: None

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask

Image types:

any
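
A minimal NumPy sketch of the dropout step, assuming an independent Bernoulli draw for every array element and a scalar drop_value. The name `pixel_dropout_sketch` is hypothetical; per_channel handling and mask targets are omitted.

```python
import numpy as np

def pixel_dropout_sketch(img: np.ndarray, dropout_prob=0.01,
                         drop_value=0, rng=None) -> np.ndarray:
    """Replace each element with drop_value with probability dropout_prob."""
    rng = np.random.default_rng() if rng is None else rng
    # True where the element is dropped.
    drop_mask = rng.random(img.shape) < dropout_prob
    fill = np.asarray(drop_value, dtype=img.dtype)
    return np.where(drop_mask, fill, img)

image = np.full((4, 4), 7, dtype=np.int16)
all_dropped = pixel_dropout_sketch(image, dropout_prob=1.0)
untouched = pixel_dropout_sketch(image, dropout_prob=0.0)
```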

apply(img: ndarray, drop_mask: ndarray = array(None, dtype=object), drop_value: float | Sequence[float] = (), **params) ndarray[source]
apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
apply_to_mask(img: ndarray, drop_mask: ndarray = array(None, dtype=object), **params) ndarray[source]
get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
property targets_as_params: List[str]
class Posterize(num_bits=8, always_apply=False, p=0.5)[source]

Bases: ImageOnlyTransform

Reduce the number of bits for each color channel.

Parameters:
  • num_bits ((int, int) or int, or list of ints [r, g, b], or list of ints [[r1, r2], [g1, g2], [b1, b2]]) – number of high bits. If num_bits is a single value, the range will be [num_bits, num_bits]. Must be in range [0, n], where n is the number of bits in the image dtype. Default: 8.

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image

Image types:

uint8, uint16, int16, int32
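
Keeping only the high bits can be illustrated with a bit mask. The sketch below is restricted to uint8 input and a single num_bits value; the name `posterize_sketch` is hypothetical and the other dtypes listed above are not covered.

```python
import numpy as np

def posterize_sketch(img: np.ndarray, num_bits: int) -> np.ndarray:
    """Keep the `num_bits` most significant bits of every uint8 value."""
    if img.dtype != np.uint8:
        raise TypeError("this sketch only covers uint8 images")
    if not 0 <= num_bits <= 8:
        raise ValueError("num_bits must be in [0, 8] for uint8")
    # For example, num_bits=2 gives the mask 0b11000000.
    mask = np.uint8((0xFF << (8 - num_bits)) & 0xFF)
    return img & mask

pixels = np.array([255, 129, 64, 7], dtype=np.uint8)
posterized = posterize_sketch(pixels, num_bits=2)
```

With num_bits equal to the full bit depth (8 here), the image is returned unchanged, which matches the default.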

apply(image, num_bits=1, **params)[source]
get_params()[source]
get_transform_init_args_names()[source]
class RandomBrightnessContrast(max_brightness: int | float | None = None, brightness_limit: float | Tuple[float, float] = 0.2, contrast_limit: float | Tuple[float, float] = 0.2, always_apply: bool = False, p: bool = 0.5)[source]

Bases: ImageOnlyTransform

Randomly change brightness and contrast of the input image.

Parameters:
  • max_brightness (int,float,None) – If not None, adjust contrast by specified maximum and clip to maximum, else adjust contrast by image mean. Default: None

  • brightness_limit ((float, float) or float) – factor range for changing brightness. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2).

  • contrast_limit ((float, float) or float) – factor range for changing contrast. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2).

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image

Image types:

uint8, uint16, int16, float32
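
The rule that a single float limit expands to (-limit, limit) can be sketched in plain Python. Both helper names are hypothetical, and how the sampled factors are turned into the alpha and beta arguments of apply is not specified here, so the sketch stops at sampling.

```python
import random

def to_symmetric_range(limit):
    """Expand a single number to (-limit, limit); pass pairs through."""
    if isinstance(limit, (int, float)):
        return (-abs(limit), abs(limit))
    low, high = limit
    return (low, high)

def sample_factors(brightness_limit=0.2, contrast_limit=0.2, rng=None):
    """Draw one brightness factor and one contrast factor uniformly."""
    rng = rng or random.Random()
    b_low, b_high = to_symmetric_range(brightness_limit)
    c_low, c_high = to_symmetric_range(contrast_limit)
    return rng.uniform(b_low, b_high), rng.uniform(c_low, c_high)

brightness, contrast = sample_factors(rng=random.Random(0))
```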

apply(img: ndarray, alpha: float = 1.0, beta: float = 0.0, **params) ndarray[source]
get_params() Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
class RandomGamma(gamma_limit=(80, 120), always_apply=False, p=0.5)[source]

Bases: ImageOnlyTransform

Parameters:

gamma_limit (float or (float, float)) – If gamma_limit is a single float value, the range will be (-gamma_limit, gamma_limit). Default: (80, 120).

Targets:

image

Image types:

uint8, float32

apply(img: ndarray, gamma: float = 1, **params) ndarray[source]
get_params() Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
class Sharpen(alpha: Tuple[float, float] | float = (0.2, 0.5), lightness: Tuple[float, float] | float = (0.5, 1.0), mode: str = 'constant', cval: float | int = 0, always_apply=False, p=0.5)[source]

Bases: ImageOnlyTransform

Sharpen the input image and overlay the result with the original image.

Parameters:
  • alpha ((float, float)) – range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5).

  • lightness ((float, float)) – range to choose the lightness of the sharpened image. Default: (0.5, 1.0).

  • mode (str) –

    scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html

    Default: constant

  • cval (int,float) – The fill value when mode = constant. Default: 0

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image
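
The five boundary modes listed for the mode parameter can be reproduced on a 1D row with numpy.pad, which makes the letter diagrams concrete. Note that NumPy uses different names for some modes; the mapping table and the `extend_1d` helper below are illustrative only and are not part of the library.

```python
import numpy as np

# SciPy-style mode name -> equivalent numpy.pad mode name.
SCIPY_TO_NUMPY_PAD = {
    "reflect": "symmetric",  # d c b a | a b c d | d c b a
    "constant": "constant",  # k k k k | a b c d | k k k k
    "nearest": "edge",       # a a a a | a b c d | d d d d
    "mirror": "reflect",     # d c b | a b c d | c b a
    "wrap": "wrap",          # a b c d | a b c d | a b c d
}

def extend_1d(values, mode, pad=2, cval=0):
    """Extend a 1D sequence by `pad` samples on each side."""
    kwargs = {"constant_values": cval} if mode == "constant" else {}
    return np.pad(np.asarray(values), pad, mode=SCIPY_TO_NUMPY_PAD[mode], **kwargs)

row = [1, 2, 3, 4]
for mode in SCIPY_TO_NUMPY_PAD:
    print(mode, extend_1d(row, mode).tolist())
```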

apply(img: ndarray, sharpening_matrix: None | ndarray = None, **params) ndarray[source]
get_params() Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]
class ToFloat(min_value: float | None = None, max_value: float | None = None, always_apply=False, p=1.0)[source]

Bases: ImageOnlyTransform

Divide pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. If max_value is None the transform will try to infer the maximum value by inspecting the data type of the input image.

See also

FromFloat

Parameters:
  • min_value (float) – minimum possible input value. Default: None.

  • max_value (float) – maximum possible input value. Default: None.

  • p (float) – probability of applying the transform. Default: 1.0.

Targets:

image

Image types:

any type

apply(img: ndarray, **params) ndarray[source]
get_transform_init_args_names() Tuple[str, ...][source]
class UnsharpMask(blur_limit: int | Sequence[int] = (3, 7), sigma_limit: float | Sequence[float] = 0.0, alpha: float | Sequence[float] = (0.2, 0.5), threshold: float = 0.05, mode: str = 'constant', cval: int | float = 0, always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Sharpen the input image using unsharp masking and overlay the result with the original image.

Parameters:
  • blur_limit (int, (int, int)) – maximum Gaussian kernel size for blurring the input image. Must be zero or odd and in range [0, inf). If set to 0, it will be computed from sigma as round(sigma * 4 * 2) + 1. If set to a single value, blur_limit will be in range (0, blur_limit). Default: (3, 7).

  • sigma_limit (float, (float, float)) – Gaussian kernel standard deviation. Must be in range [0, inf). If set to a single value, sigma_limit will be in range (0, sigma_limit). If set to 0, sigma will be computed as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8. Default: 0.

  • alpha (float, (float, float)) – range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5).

  • threshold (float) – Value to limit sharpening only to areas with a high pixel difference between the original image and its smoothed version. Higher threshold means less sharpening on flat areas. Must be in range [0, 1]. Default: 0.05.

  • mode (str) –

    scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html

    Default: constant

  • cval (int,float) – The fill value when mode = constant. Default: 0

  • p (float) – probability of applying the transform. Default: 0.5.

Reference:

https://arxiv.org/pdf/2107.10833.pdf

Targets:

image
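
The two fallback formulas quoted for blur_limit and sigma_limit are easy to check by hand. The helper names below are hypothetical; only the arithmetic comes from the parameter descriptions.

```python
def ksize_from_sigma(sigma: float) -> int:
    """Kernel size used when blur_limit is 0: round(sigma * 4 * 2) + 1."""
    return round(sigma * 4 * 2) + 1

def sigma_from_ksize(ksize: int) -> float:
    """Sigma used when sigma_limit is 0: 0.3*((ksize-1)*0.5 - 1) + 0.8."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8

kernel_for_unit_sigma = ksize_from_sigma(1.0)
sigma_for_3 = sigma_from_ksize(3)
sigma_for_7 = sigma_from_ksize(7)
```

For example, a kernel size of 3 gives sigma = 0.8, and a kernel size of 7 gives sigma of about 1.4.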

apply(img: ndarray, ksize: int = 3, sigma: float = 0, alpha: float = 0.2, **params) ndarray[source]
get_params() Dict[str, Any][source]
get_transform_init_args_names() Tuple[str, ...][source]

Functional

brightness_contrast_adjust(img, alpha=1, beta=0, max_brightness=None)[source]
convolve(img, kernel, mode='constant', cval=0)[source]
downscale(img, scale, down_interpolation=1, up_interpolation=1)[source]
equalize(img, hist_range=None, mask=None)[source]

Equalize the image histogram.

Parameters:
  • img (numpy.ndarray) – image.

  • hist_range (tuple) – The histogram range

  • mask (numpy.ndarray) – An optional mask. If given, only the pixels selected by the mask are included in the analysis. May be a 1-channel or 3-channel array.

Returns:

Equalized image.

Return type:

numpy.ndarray

from_float(img, dtype, min_value=None, max_value=None)[source]
gamma_transform(img, gamma)[source]
gauss_noise(image, gauss)[source]
invert(img: ndarray) ndarray[source]
multiply(img, multiplier)[source]
Parameters:
  • img (numpy.ndarray) – Image.

  • multiplier (numpy.ndarray) – Multiplier coefficient.

Returns:

Image multiplied by multiplier coefficient.

Return type:

numpy.ndarray

noop(input_obj, **params)[source]
normalize(img, mean, std)[source]
to_float(img, min_value=None, max_value=None)[source]
unsharp_mask(image: ndarray, ksize: int, sigma: float = 0.0, alpha: float = 0.2, threshold: float = 0.05, mode: str = 'constant', cval: float | int = 0)[source]

Utils

angle_2pi_range(func: Callable[[Concatenate[Tuple[float, float, float, float], P]], Tuple[float, float, float, float]]) Callable[[Concatenate[Tuple[float, float, float, float], P]], Tuple[float, float, float, float]][source]
clip(img: ndarray, dtype: dtype, minval: float, maxval: float) ndarray[source]
clipped(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray][source]
ensure_contiguous(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray][source]

Ensure that input img is contiguous.

get_num_channels(image: ndarray) int[source]
is_grayscale_image(image: ndarray) bool[source]
is_multispectral_image(image: ndarray) bool[source]
is_rgb_image(image: ndarray) bool[source]
non_rgb_warning(image: ndarray) None[source]
preserve_channel_dim(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray][source]

Preserve dummy channel dim.

preserve_shape(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray][source]

Preserve shape of the image.

read_dcm_image(path: str, include_header: bool = True, ends_with: str = '')[source]

Reads a series of dcm files stored in a directory, sorted alphabetically, into a np.ndarray and optionally returns the DICOM header in dict format.

Parameters:
  • path (str) – The filepath to the directory that stores the dcm files.

  • include_header (bool) – Whether to return the dicom header metadata associated with the scan. Default: True

  • ends_with (str) – If empty string, then all files in directory will be processed. If multiple file types are within the directory, you may filter the results by setting ends_with=".dcm". Default: ""

Note

DICOM object types are dictionaries with the following keys:
PixelSpacing (tuple)

The space in mm between pixels for both height and width of a slice, respectively

RescaleIntercept (float)

The value to add to each pixel of the scan after scaling with RescaleSlope to turn the pixel values of the scan into Hounsfield Units (HU)

RescaleSlope (float)

The value to multiply each pixel of the scan by before adding RescaleIntercept to turn the pixel values of the scan into Hounsfield Units (HU)

ConvolutionKernel (str)

A label describing the convolution kernel or algorithm used to reconstruct the data

XRayTubeCurrent (int)

X-Ray Tube Current in mA.

See example below:

dicom = {
    "PixelSpacing" : (0.5, 0.5),
    "RescaleIntercept" : -1024.0,
    "RescaleSlope" : 1.0,
    "ConvolutionKernel" : 'STANDARD',
    "XRayTubeCurrent" : 160
}
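
The RescaleSlope and RescaleIntercept entries describe a linear mapping from stored pixel values to Hounsfield Units. A minimal sketch of that mapping, using the example dictionary above; the function name `to_hounsfield_sketch` and the raw values are made up for illustration.

```python
import numpy as np

dicom = {
    "PixelSpacing": (0.5, 0.5),
    "RescaleIntercept": -1024.0,
    "RescaleSlope": 1.0,
    "ConvolutionKernel": "STANDARD",
    "XRayTubeCurrent": 160,
}

def to_hounsfield_sketch(raw: np.ndarray, header: dict) -> np.ndarray:
    """HU = raw * RescaleSlope + RescaleIntercept."""
    scaled = raw.astype(np.float32) * header["RescaleSlope"]
    return scaled + header["RescaleIntercept"]

raw = np.array([0, 1024, 2048], dtype=np.int16)
hu = to_hounsfield_sketch(raw, dicom)
```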