Augmentations

Subpackages

Transforms

class Downscale(scale_min: float = 0.25, scale_max: float = 0.25, interpolation: int | Interpolation | Dict[str, int] | None = None, always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Decreases image quality by downscaling and upscaling back.

Parameters:

scale_min (float) – lower bound on the image scale. Should be < 1.
scale_max (float) – upper bound on the image scale. Should be < 1.
interpolation (int, dict, Interpolation) –
SciPy interpolation method (e.g. dicaugment.INTER_NEAREST). Could be:
- Single SciPy interpolation flag: The selected method will be used for both downscale and upscale.
- dict of flags: Dictionary with keys ‘downscale’ and ‘upscale’ specifying the interpolation flags for each operation.
- Interpolation object: Downscale.Interpolation object with flags for both downscale and upscale.
Default: Interpolation(downscale=dicaugment.INTER_NEAREST, upscale=dicaugment.INTER_NEAREST)

Targets:: image
Image types:: uint8, uint16, int16, int32, float32
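
To illustrate what "downscaling and upscaling back" does to an image, here is a minimal NumPy sketch using nearest-neighbour sampling on a 2D array. The function name downscale_nearest_sketch is hypothetical and the index arithmetic is an assumption made for illustration; it is not the library's implementation, which delegates to SciPy interpolation.

```python
import numpy as np


def downscale_nearest_sketch(img, scale):
    """Shrink a 2D array by `scale`, then enlarge it back to its original shape."""
    h, w = img.shape
    small_h = max(1, int(round(h * scale)))
    small_w = max(1, int(round(w * scale)))

    # Downscale: keep one source row/column per cell of the smaller grid.
    rows = np.arange(small_h) * h // small_h
    cols = np.arange(small_w) * w // small_w
    small = img[np.ix_(rows, cols)]

    # Upscale: repeat the retained samples to fill the original grid.
    up_rows = np.arange(h) * small_h // h
    up_cols = np.arange(w) * small_w // w
    return small[np.ix_(up_rows, up_cols)]


image = np.arange(64).reshape(8, 8)
degraded = downscale_nearest_sketch(image, scale=0.25)
```

The output keeps the input shape but only retains the values sampled on the coarse grid, which is the quality loss the transform is meant to simulate.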

class Interpolation(*, downscale: int = 0, upscale: int = 0)[source]: Bases: object

apply(img: ndarray, scale: float | None = None, **params) → ndarray[source]

get_params() → Dict[str, Any][source]

get_transform_init_args_names() → Tuple[str, ...][source]

class Equalize(range: int | Tuple[int, int] | None = None, mask: ndarray | callable | None = None, mask_params: Sequence[str] = (), always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Equalize the image histogram. For multi-channel images, each channel is processed individually.

Parameters:

range (int, list of int) – Histogram range. If int, then range is defined as [0, range]. If None, the range is calculated as [0, max(img)]. Default: None
mask (np.ndarray, callable) – If given, only the pixels selected by the mask are included in the analysis. Function signature must include image argument.
mask_params (list of str) – Params for mask function.

Targets:: image
Image types:: uint8, uint16, int16
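
The following sketch shows one common way to equalize a histogram over the range [0, range] with a cumulative-distribution lookup table. It assumes a single-channel, non-negative integer image and no mask; equalize_sketch and hist_max are hypothetical names, and the library's own algorithm may differ in detail.

```python
import numpy as np


def equalize_sketch(img, hist_max=None):
    """Histogram-equalize a non-negative integer image over [0, hist_max]."""
    if hist_max is None:
        # Mirrors the documented default: the range becomes [0, max(img)].
        hist_max = int(img.max())

    hist = np.bincount(img.ravel(), minlength=hist_max + 1)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest occupied bin
    denom = cdf[-1] - cdf_min
    if denom == 0:
        return img.copy()              # constant image: nothing to spread out

    # Map each intensity through the normalized CDF, scaled to the range.
    lut = np.round((cdf - cdf_min) / denom * hist_max)
    lut = np.clip(lut, 0, hist_max).astype(img.dtype)
    return lut[img]


flat = np.array([[0, 0, 1, 1], [2, 2, 3, 3]], dtype=np.uint8)
narrow = np.array([[10, 10], [10, 20]], dtype=np.uint8)
flat_eq = equalize_sketch(flat, hist_max=3)
narrow_eq = equalize_sketch(narrow)
```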

apply(image: ndarray, mask: None | ndarray = None, **params) → ndarray[source]

get_params_dependent_on_targets(params: Dict[str, Any]) → Dict[str, Any][source]

get_transform_init_args_names() → Tuple[str, ...][source]

property targets_as_params: List[str]

class FromFloat(dtype: str = 'int16', min_value: float | None = None, max_value: float | None = None, always_apply=False, p=1.0)[source]

Bases: ImageOnlyTransform

Take an input array where all values should lie in the range [0, 1.0], multiply them by max_value, and then cast the resulting values to the type specified by dtype. If max_value is None, the transform will try to infer the maximum value for the data type from the dtype argument.

This is the inverse transform for ToFloat.

Parameters:

min_value (float) – minimum possible input value. Default: None.
max_value (float) – maximum possible input value. Default: None.
dtype (string or numpy data type) – data type of the output. See the ‘Data types’ page from the NumPy docs. Default: ‘int16’.
p (float) – probability of applying the transform. Default: 1.0.

Targets:: image
Image types:: float32
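
As a rough illustration of the FromFloat/ToFloat pair, the sketch below divides by the dtype maximum and multiplies back. The function names are hypothetical, min_value is deliberately ignored because its exact handling is not described here, and inferring the maximum with np.iinfo is an assumption about what "infer the maximum value" means.

```python
import numpy as np


def to_float_sketch(img, max_value=None):
    """Scale an integer image into [0, 1] as float32."""
    if max_value is None:
        max_value = np.iinfo(img.dtype).max   # assumed inference rule
    return img.astype(np.float32) / max_value


def from_float_sketch(img, dtype="int16", max_value=None):
    """Scale a [0, 1] float image back up and cast to `dtype`."""
    if max_value is None:
        max_value = np.iinfo(np.dtype(dtype)).max
    return (img * max_value).astype(dtype)


unit = np.array([0.0, 0.5, 1.0], dtype=np.float32)
restored = from_float_sketch(unit, dtype="int16")
as_float = to_float_sketch(np.array([0, 32767], dtype=np.int16))
```

Note that the cast truncates toward zero, so a round trip through float32 is not guaranteed to reproduce every intermediate integer exactly.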

apply(img: ndarray, **params) → ndarray[source]

get_transform_init_args() → Dict[str, Any][source]

class GaussNoise(var_limit: float | Tuple[float, float] = (10.0, 50.0), mean: float = 0, apply_to_channel_idx: int | None = None, per_channel: bool = True, always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Apply gaussian noise to the input image.

Parameters:

var_limit ((float, float) or float) – variance range for noise. If var_limit is a single float, the range will be (0, var_limit). Default: (10.0, 50.0).
mean (float) – mean of the noise. Default: 0
apply_to_channel_idx (int, None) – If not None, noise is applied only to the specified channel index. Default: None
per_channel (bool) – if set to True, noise will be sampled for each channel independently. Otherwise, the noise will be sampled once for all channels. Ignored if apply_to_channel_idx is not None. Default: True
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, uint16, int16, float32
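
The interaction between var_limit and per_channel can be sketched as follows. This assumes a channel-last array, samples the variance uniformly from var_limit, and omits apply_to_channel_idx and any clipping back to the input dtype; gauss_noise_sketch is a hypothetical name, not the library function.

```python
import numpy as np


def gauss_noise_sketch(img, var_limit=(10.0, 50.0), mean=0.0,
                       per_channel=True, rng=None):
    """Add Gaussian noise with a variance drawn from `var_limit`."""
    rng = np.random.default_rng() if rng is None else rng
    variance = rng.uniform(var_limit[0], var_limit[1])
    sigma = variance ** 0.5

    if per_channel:
        # Independent noise for every channel.
        noise = rng.normal(mean, sigma, size=img.shape)
    else:
        # One noise map, broadcast across the trailing channel axis.
        noise = rng.normal(mean, sigma, size=img.shape[:-1] + (1,))
    return img.astype(np.float32) + noise


blank = np.zeros((4, 4, 3), dtype=np.uint8)
shared = gauss_noise_sketch(blank, per_channel=False, rng=np.random.default_rng(0))
independent = gauss_noise_sketch(blank, per_channel=True, rng=np.random.default_rng(0))
```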

apply(img: ndarray, gauss: None | ndarray = None, **params) → ndarray[source]

get_params_dependent_on_targets(params: Dict[str, Any]) → Dict[str, Any][source]

get_transform_init_args_names() → Tuple[str, ...][source]

property targets_as_params: List[str]

class InvertImg(always_apply: bool = False, p: float = 0.5)[source]

Bases: ImageOnlyTransform

Invert the input image by subtracting pixel values from the maximum value for the input image dtype.

Parameters:: p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, uint16, int16, float32
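
For unsigned integer images, the inversion described above reduces to a single subtraction. The sketch below covers only that case; how signed and floating-point dtypes are handled is not stated here, so they are left out, and invert_sketch is a hypothetical name.

```python
import numpy as np


def invert_sketch(img):
    """Invert an unsigned-integer image: max_for_dtype - pixel."""
    if img.dtype.kind != "u":
        raise TypeError("this sketch only covers unsigned integer images")
    return np.iinfo(img.dtype).max - img


pixels = np.array([0, 100, 255], dtype=np.uint8)
inverted = invert_sketch(pixels)
```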

apply(img: ndarray, **params) → ndarray[source]

get_transform_init_args_names() → Tuple[str, ...][source]

class Normalize(mean: None | float | Tuple[float] = None, std: None | float | Tuple[float] = None, always_apply: bool = False, p: float = 1.0)[source]

Bases: ImageOnlyTransform

Normalization is applied by the formula: img = (img - mean) / (std)

Parameters:

mean (None, float, list of float) – mean values along channel dimension. If None, mean is calculated per image at runtime.
std (None, float, list of float) – std values along channel dimension. If None, std is calculated per image at runtime.
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 1.0.

Targets:: image
Image types:: uint8, float32
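
The formula above, including the fallback when mean or std is None, might look like this in NumPy. The sketch computes the fallback statistics over the whole image, which is one reading of "calculated per image at runtime"; normalize_sketch is a hypothetical name.

```python
import numpy as np


def normalize_sketch(img, mean=None, std=None):
    """Apply img = (img - mean) / std, estimating missing statistics from img."""
    img = img.astype(np.float32)
    if mean is None:
        mean = img.mean()   # assumption: one statistic for the whole image
    if std is None:
        std = img.std()
    return (img - mean) / std


sample = np.array([[0, 2], [4, 6]], dtype=np.uint8)
auto = normalize_sketch(sample)
fixed = normalize_sketch(sample, mean=2.0, std=2.0)
```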

apply(image: ndarray, **params) → ndarray[source]

get_transform_init_args_names() → Tuple[str, ...][source]

class PixelDropout(dropout_prob: float = 0.01, per_channel: bool = False, drop_value: float | Sequence[float] | None = 0, mask_drop_value: float | Sequence[float] | None = None, always_apply: bool = False, p: float = 0.5)[source]

Bases: DualTransform

Set pixels to 0 with some probability.

Parameters:

dropout_prob (float) – pixel drop probability. Default: 0.01
per_channel (bool) – if set to True, the drop mask will be sampled for each channel; otherwise the same mask will be used for all channels. Default: False
drop_value (number or sequence of numbers or None) – Value that will be set in dropped place. If set to None, the value will be sampled randomly from the default range for the image dtype:
- uint8: [0, 255]
- uint16: [0, 65535]
- uint32: [0, 4294967295]
- int16: [-32768, 32767]
- int32: [-2147483648, 2147483647]
- float, double: [0, 1]
Default: 0
mask_drop_value (number or sequence of numbers or None) – Value that will be set in dropped place in masks. If set to None, masks will be unchanged. Default: None
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask
Image types:: any
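
A minimal sketch of how dropout_prob and per_channel could shape the drop mask is shown below. It assumes a channel-last image and a fixed scalar drop_value; the random sampling used when drop_value is None and the mask target are omitted, and pixel_dropout_sketch is a hypothetical name.

```python
import numpy as np


def pixel_dropout_sketch(img, dropout_prob=0.01, per_channel=False,
                         drop_value=0, rng=None):
    """Replace each pixel with `drop_value` with probability `dropout_prob`."""
    rng = np.random.default_rng() if rng is None else rng
    # Either one decision per element, or one per pixel shared by all channels.
    mask_shape = img.shape if per_channel else img.shape[:-1] + (1,)
    drop_mask = rng.random(mask_shape) < dropout_prob
    return np.where(drop_mask, drop_value, img).astype(img.dtype)


volume = np.arange(1, 25, dtype=np.uint8).reshape(2, 4, 3)
all_dropped = pixel_dropout_sketch(volume, dropout_prob=1.0)
untouched = pixel_dropout_sketch(volume, dropout_prob=0.0)
```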

apply(img: ndarray, drop_mask: ndarray = array(None, dtype=object), drop_value: float | Sequence[float] = (), **params) → ndarray[source]

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) → Tuple[float, float, float, float][source]

apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) → Tuple[float, float, float, float][source]

apply_to_mask(img: ndarray, drop_mask: ndarray = array(None, dtype=object), **params) → ndarray[source]

get_params_dependent_on_targets(params: Dict[str, Any]) → Dict[str, Any][source]

get_transform_init_args_names() → Tuple[str, ...][source]

property targets_as_params: List[str]

class Posterize(num_bits=8, always_apply=False, p=0.5)[source]

Bases: ImageOnlyTransform

Reduce the number of bits for each color channel.

Parameters:

num_bits ((int, int) or int, or list of ints [r, g, b], or list of ints [[r1, r2], [g1, g2], [b1, b2]]) – number of high bits. If num_bits is a single value, the range will be [num_bits, num_bits]. Must be in range [0, n], where n is the number of bits in the image dtype. Default: 8.
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, uint16, int16, int32
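
Keeping only the num_bits highest bits can be sketched with two shifts. The example is restricted to unsigned integer images to sidestep sign-bit questions, and posterize_sketch is a hypothetical name rather than the library's function.

```python
import numpy as np


def posterize_sketch(img, num_bits):
    """Zero out all but the `num_bits` most significant bits of each pixel."""
    total_bits = img.dtype.itemsize * 8
    if not 0 <= num_bits <= total_bits:
        raise ValueError("num_bits must lie in [0, number of bits in the dtype]")
    if num_bits == 0:
        return np.zeros_like(img)
    shift = total_bits - num_bits
    # Shifting right then left discards the low-order bits.
    return (img >> shift) << shift


levels = np.array([255, 130, 15], dtype=np.uint8)
coarse = posterize_sketch(levels, num_bits=4)
```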

apply(image, num_bits=1, **params)[source]

get_params()[source]

get_transform_init_args_names()[source]

class RandomBrightnessContrast(max_brightness: int | float | None = None, brightness_limit: float | Tuple[float, float] = 0.2, contrast_limit: float | Tuple[float, float] = 0.2, always_apply: bool = False, p: bool = 0.5)[source]

Bases: ImageOnlyTransform

Randomly change brightness and contrast of the input image.

Parameters:

max_brightness (int,float,None) – If not None, adjust contrast by specified maximum and clip to maximum, else adjust contrast by image mean. Default: None
brightness_limit ((float, float) or float) – factor range for changing brightness. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2).
contrast_limit ((float, float) or float) – factor range for changing contrast. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2).
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, uint16, int16, float32
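
One plausible reading of the alpha and beta arguments of apply is sketched here: alpha scales contrast and beta shifts brightness relative to either max_brightness or the image mean. The sampling rule (alpha = 1 + a value from contrast_limit, beta = a value from brightness_limit) and the use of the original image mean are assumptions borrowed from similar transforms, not confirmed behaviour of this library.

```python
import numpy as np


def brightness_contrast_sketch(img, alpha=1.0, beta=0.0, max_brightness=None):
    """Scale by `alpha`, then shift by `beta` times a reference brightness."""
    out = img.astype(np.float32) * alpha
    if max_brightness is not None:
        # Shift relative to the given maximum and clip to it.
        return np.clip(out + beta * max_brightness, None, max_brightness)
    # Otherwise shift relative to the mean of the input image.
    return out + beta * img.mean()


rng = np.random.default_rng(0)
alpha = 1.0 + rng.uniform(-0.2, 0.2)   # assumed sampling for contrast_limit=0.2
beta = rng.uniform(-0.2, 0.2)          # assumed sampling for brightness_limit=0.2

grey = np.full((2, 2), 100.0, dtype=np.float32)
by_max = brightness_contrast_sketch(grey, alpha=1.5, beta=0.1, max_brightness=255)
by_mean = brightness_contrast_sketch(grey, alpha=1.5, beta=0.1)
```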

apply(img: ndarray, alpha: float = 1.0, beta: float = 0.0, **params) → ndarray[source]

get_params() → Dict[str, Any][source]

get_transform_init_args_names() → Tuple[str, ...][source]

class RandomGamma(gamma_limit=(80, 120), always_apply=False, p=0.5)[source]

Bases: ImageOnlyTransform

Parameters:: gamma_limit (float or (float, float)) – If gamma_limit is a single float value, the range will be (-gamma_limit, gamma_limit). Default: (80, 120).

Targets:: image
Image types:: uint8, float32
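
A gamma adjustment on a float image in [0, 1] is a per-pixel power. The sketch below assumes, as in comparable augmentation libraries, that a value sampled from gamma_limit is divided by 100 to obtain the exponent, so that (80, 120) corresponds to exponents between 0.8 and 1.2. That scaling is an assumption and is not stated on this page.

```python
import numpy as np


def gamma_sketch(img, gamma):
    """Apply a gamma curve to a float image with values in [0, 1]."""
    return np.power(img, gamma)


rng = np.random.default_rng(0)
gamma = rng.uniform(80, 120) / 100.0   # assumed mapping from gamma_limit

ramp = np.array([0.0, 0.25, 1.0])
brightened = gamma_sketch(ramp, 0.5)
```

Exponents below 1 brighten mid-tones and exponents above 1 darken them, while 0 and 1 are left unchanged.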

apply(img: ndarray, gamma: float = 1, **params) → ndarray[source]

get_params() → Dict[str, Any][source]

get_transform_init_args_names() → Tuple[str, ...][source]

class Sharpen(alpha: Tuple[float, float] | float = (0.2, 0.5), lightness: Tuple[float, float] | float = (0.5, 1.0), mode: str = 'constant', cval: float | int = 0, always_apply=False, p=0.5)[source]

Bases: ImageOnlyTransform

Sharpen the input image and overlay the result on the original image.

Parameters:

alpha ((float, float)) – range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5).
lightness ((float, float)) – range to choose the lightness of the sharpened image. Default: (0.5, 1.0).
mode (str) –
scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:
- reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.
- constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.
- nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.
- mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.
- wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.
Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html

Default: constant
cval (int,float) – The fill value when mode = constant. Default: 0
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
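
The five boundary modes listed for the mode parameter can be reproduced on a 1D array with np.pad, which makes the diagrams above easy to check. The mapping from SciPy mode names to np.pad mode names is the only translation involved; extend_sketch is a hypothetical helper and not part of the library.

```python
import numpy as np

# SciPy ndimage mode name -> equivalent np.pad mode name.
SCIPY_TO_NUMPY_PAD = {
    "reflect": "symmetric",   # d c b a | a b c d | d c b a
    "constant": "constant",   # k k k k | a b c d | k k k k
    "nearest": "edge",        # a a a a | a b c d | d d d d
    "mirror": "reflect",      # d c b | a b c d | c b a
    "wrap": "wrap",           # a b c d | a b c d | a b c d
}


def extend_sketch(values, mode, width, cval=0):
    """Extend a 1D array on both sides the way the given SciPy mode would."""
    pad_mode = SCIPY_TO_NUMPY_PAD[mode]
    if pad_mode == "constant":
        return np.pad(values, width, mode="constant", constant_values=cval)
    return np.pad(values, width, mode=pad_mode)


row = np.array([1, 2, 3, 4])
extended = {mode: extend_sketch(row, mode, 2).tolist() for mode in SCIPY_TO_NUMPY_PAD}
```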

apply(img: ndarray, sharpening_matrix: None | ndarray = None, **params) → ndarray[source]

get_params() → Dict[str, Any][source]

get_transform_init_args_names() → Tuple[str, ...][source]

class ToFloat(min_value: float | None = None, max_value: float | None = None, always_apply=False, p=1.0)[source]

Bases: ImageOnlyTransform

Divide pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. If max_value is None the transform will try to infer the maximum value by inspecting the data type of the input image.

Functional

brightness_contrast_adjust(img, alpha=1, beta=0, max_brightness=None)[source]

convolve(img, kernel, mode='constant', cval=0)[source]

downscale(img, scale, down_interpolation=1, up_interpolation=1)[source]

equalize(img, hist_range=None, mask=None)[source]

Equalize the image histogram.

Parameters:

img (numpy.ndarray) – image.
hist_range (tuple) – The histogram range
mask (numpy.ndarray) – An optional mask. If given, only the pixels selected by the mask are included in the analysis. May be a 1-channel or 3-channel array.

Returns:

Equalized image.

Return type:

numpy.ndarray

from_float(img, dtype, min_value=None, max_value=None)[source]

gamma_transform(img, gamma)[source]

gauss_noise(image, gauss)[source]

invert(img: ndarray) → ndarray[source]

multiply(img, multiplier)[source]

Parameters:

img (numpy.ndarray) – Image.
multiplier (numpy.ndarray) – Multiplier coefficient.

Returns:

Image multiplied by multiplier coefficient.

Return type:

numpy.ndarray

noop(input_obj, **params)[source]

normalize(img, mean, std)[source]

to_float(img, min_value=None, max_value=None)[source]

unsharp_mask(image: ndarray, ksize: int, sigma: float = 0.0, alpha: float = 0.2, threshold: float = 0.05, mode: str = 'constant', cval: float | int = 0)[source]

Utils

angle_2pi_range(func: Callable[[Concatenate[Tuple[float, float, float, float], P]], Tuple[float, float, float, float]]) → Callable[[Concatenate[Tuple[float, float, float, float], P]], Tuple[float, float, float, float]][source]

clip(img: ndarray, dtype: dtype, minval: float, maxval: float) → ndarray[source]

clipped(func: Callable[[Concatenate[ndarray, P]], ndarray]) → Callable[[Concatenate[ndarray, P]], ndarray][source]

ensure_contiguous(func: Callable[[Concatenate[ndarray, P]], ndarray]) → Callable[[Concatenate[ndarray, P]], ndarray][source]: Ensure that input img is contiguous.

get_num_channels(image: ndarray) → int[source]

is_grayscale_image(image: ndarray) → bool[source]

is_multispectral_image(image: ndarray) → bool[source]

is_rgb_image(image: ndarray) → bool[source]

non_rgb_warning(image: ndarray) → None[source]

preserve_channel_dim(func: Callable[[Concatenate[ndarray, P]], ndarray]) → Callable[[Concatenate[ndarray, P]], ndarray][source]: Preserve dummy channel dim.

preserve_shape(func: Callable[[Concatenate[ndarray, P]], ndarray]) → Callable[[Concatenate[ndarray, P]], ndarray][source]: Preserve shape of the image.
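
Decorators such as preserve_shape wrap an image function and repair its output. A minimal sketch of that pattern, under the assumption that "preserve shape" means reshaping the result back to the input shape, could look like this; preserve_shape_sketch and double_flat are hypothetical names.

```python
from functools import wraps

import numpy as np


def preserve_shape_sketch(func):
    """Wrap `func` so that its result is reshaped to the input image shape."""
    @wraps(func)
    def wrapped(img, *args, **kwargs):
        shape = img.shape
        result = func(img, *args, **kwargs)
        return result.reshape(shape)
    return wrapped


@preserve_shape_sketch
def double_flat(img):
    # Simulates an operation that loses the original dimensions.
    return img.ravel() * 2


restored_shape = double_flat(np.ones((2, 3)))
```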

read_dcm_image(path: str, include_header: bool = True, ends_with: str = '')[source]

Reads an alphabetically sorted series of dcm files stored in a directory into a np.ndarray and, optionally, returns the dicom header in dict format.

Parameters:

path (str) – The filepath to the directory that stores the dcm files.
include_header (bool) – Whether to return the dicom header metadata associated with the scan. Default: True
ends_with (str) – If an empty string, all files in the directory will be processed. If the directory contains multiple file types, you may filter the results by setting ends_with=".dcm". Default: ""

Note

DICOM object types are dictionaries with the following keys:

PixelSpacing (tuple): The space in mm between pixels for both height and width of a slice, respectively
RescaleIntercept (float): The value to add to each pixel of the scan after scaling with RescaleSlope to turn the pixel values of the scan into Hounsfield Units (HU)
RescaleSlope (float): The value to multiply each pixel of the scan by before adding RescaleIntercept to turn the pixel values of the scan into Hounsfield Units (HU)
ConvolutionKernel (str): A label describing the convolution kernel or algorithm used to reconstruct the data
XRayTubeCurrent (int): X-Ray Tube Current in mA.

See example below:

dicom = {
    "PixelSpacing" : (0.5, 0.5),
    "RescaleIntercept" : -1024.0,
    "RescaleSlope" : 1.0,
    "ConvolutionKernel" : 'STANDARD',
    "XRayTubeCurrent" : 160
}
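
The RescaleSlope and RescaleIntercept entries describe a linear mapping to Hounsfield Units, which the short sketch below applies to a few raw pixel values. The helper name to_hounsfield_sketch is hypothetical and the raw values are made up for illustration.

```python
import numpy as np

dicom = {
    "PixelSpacing": (0.5, 0.5),
    "RescaleIntercept": -1024.0,
    "RescaleSlope": 1.0,
    "ConvolutionKernel": "STANDARD",
    "XRayTubeCurrent": 160,
}


def to_hounsfield_sketch(scan, header):
    """Convert raw pixel values to HU: pixel * RescaleSlope + RescaleIntercept."""
    scaled = scan.astype(np.float32) * header["RescaleSlope"]
    return scaled + header["RescaleIntercept"]


raw = np.array([0, 1024, 2048], dtype=np.int16)
hu = to_hounsfield_sketch(raw, dicom)
```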