Core
bbox_utils
- class BboxParams(format: str, label_fields: Sequence[str] | None = None, min_planar_area: float = 0.0, min_volume: float = 0.0, min_area_visibility: float = 0.0, min_volume_visibility: float = 0.0, min_width: float = 0.0, min_height: float = 0.0, min_depth: float = 0.0, check_each_transform: bool = True)[source]
Bases:
Params
Parameters of bounding boxes.
- Parameters:
format (str) –
format of bounding boxes. Should be ‘coco_3d’, ‘pascal_voc_3d’, ‘dicaugment_3d’ or ‘yolo_3d’.
- The coco_3d format
[x_min, y_min, z_min, width, height, depth], e.g. [97, 12, 5, 150, 200, 10].
- The pascal_voc_3d format
[x_min, y_min, z_min, x_max, y_max, z_max], e.g. [97, 12, 5, 247, 212, 15].
- The dicaugment_3d format
is like pascal_voc_3d, but normalized, in other words: [x_min, y_min, z_min, x_max, y_max, z_max], e.g. [0.2, 0.3, 0.5, 0.4, 0.5, 0.8].
- The yolo_3d format
[x, y, z, width, height, depth], e.g. [0.3, 0.4, 0.5, 0.1, 0.2, 0.3]; x, y, z - normalized bbox center; width, height, depth - normalized bbox width, height, and depth
You may also pass a predefined string such as albumentation3D.BB_COCO_3D or
label_fields (list) – list of fields that are joined with boxes, e.g. labels. Should be the same type as boxes.
min_planar_area (float) – minimum area of a bounding box for a single slice. All bounding boxes whose visible area in pixels is less than this value will be removed. Default: 0.0.
min_volume (float) – minimum volume of a bounding box. All bounding boxes whose visible volume in pixels is less than this value will be removed. Default: 0.0. This assumes that the pixel spacing of the Height and Width dimensions is equal to the slice spacing of the Depth dimension.
min_area_visibility (float) – minimum fraction of a bounding box's planar area that must remain visible for the box to stay in the list. Default: 0.0.
min_volume_visibility (float) – minimum fraction of a bounding box's volume that must remain visible for the box to stay in the list. Default: 0.0.
min_width (float) – Minimum width of a bounding box. All bounding boxes whose width is less than this value will be removed. Default: 0.0.
min_height (float) – Minimum height of a bounding box. All bounding boxes whose height is less than this value will be removed. Default: 0.0.
min_depth (float) – Minimum depth of a bounding box. All bounding boxes whose depth is less than this value will be removed. Default: 0.0.
check_each_transform (bool) – if True, then bboxes will be checked after each dual transform. Default: True
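The relationship between the pixel-based formats above can be sketched in plain Python. The helper name `coco3d_to_pascal_voc3d` is hypothetical and not part of dicaugment; the sketch assumes only the format definitions listed above, with the maximum corner obtained by adding the size to the minimum corner.

```python
def coco3d_to_pascal_voc3d(box):
    """Hypothetical helper: corner-plus-size box -> two-corner box (both in pixels)."""
    x_min, y_min, z_min, width, height, depth = box
    # The far corner is the near corner shifted by the box size along each axis.
    return [x_min, y_min, z_min, x_min + width, y_min + height, z_min + depth]


print(coco3d_to_pascal_voc3d([97, 12, 5, 150, 200, 10]))  # [97, 12, 5, 247, 212, 15]
```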
- class BboxProcessor(params: BboxParams, additional_targets: Dict[str, str] | None = None)[source]
Bases:
DataProcessor
- convert_from_dicaugment(data: Sequence, rows: int, cols: int, slices: int) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
- convert_to_dicaugment(data: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], rows: int, cols: int, slices: int) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
- property default_data_name: str
- check_bbox(bbox: Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]) None[source]
Check that the bbox boundaries are in the range [0, 1] and that the minimums are less than the maximums.
- check_bboxes(bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]]) None[source]
Check that the boundaries of all bboxes are in the range [0, 1] and that the minimums are less than the maximums.
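A minimal sketch of the two checks described above, written in plain Python. The function names and error messages are hypothetical; the sketch assumes a box is a sequence whose first six values are (x_min, y_min, z_min, x_max, y_max, z_max) and that any further values (such as labels) are ignored.

```python
def check_bbox_sketch(bbox):
    """Raise ValueError unless the first six values form a valid normalized box."""
    names = ("x_min", "y_min", "z_min", "x_max", "y_max", "z_max")
    values = dict(zip(names, bbox[:6]))
    # Every coordinate must be normalized.
    for name, value in values.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")
    # Along each axis the minimum must be strictly below the maximum.
    for axis in "xyz":
        if values[f"{axis}_max"] <= values[f"{axis}_min"]:
            raise ValueError(f"{axis}_max must be greater than {axis}_min")


def check_bboxes_sketch(bboxes):
    """Apply the single-box check to every box in a sequence."""
    for bbox in bboxes:
        check_bbox_sketch(bbox)
```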
- convert_bbox_from_dicaugment(bbox: Tuple[float, float, float, float] | Tuple[float, float, float, float, Any], target_format: str, rows: int, cols: int, slices: int, check_validity: bool = False) Tuple[float, float, float, float] | Tuple[float, float, float, float, Any][source]
Convert a bounding box from the format used by dicaugment to a format, specified in target_format.
- Parameters:
bbox – A dicaugment bounding box (x_min, y_min, z_min, x_max, y_max, z_max).
target_format – required format of the output bounding box. Should be ‘coco_3d’, ‘pascal_voc_3d’ or ‘yolo_3d’.
rows – Image height.
cols – Image width.
slices – Image depth.
check_validity – Check if all boxes are valid boxes.
- Returns:
A bounding box.
- Return type:
tuple
Note
The coco_3d format of a bounding box looks like (x_min, y_min, z_min, width, height, depth), e.g. (97, 12, 5, 150, 200, 10). The pascal_voc_3d format of a bounding box looks like (x_min, y_min, z_min, x_max, y_max, z_max), e.g. (97, 12, 5, 247, 212, 15). The yolo_3d format of a bounding box looks like (x, y, z, width, height, depth), e.g. (0.3, 0.1, 0.5, 0.05, 0.07, 0.23), where x, y, and z are the coordinates of the center of the box and all values are normalized to 1 by image height, width, and depth.
- Raises:
ValueError – if target_format is not equal to coco_3d, pascal_voc_3d or yolo_3d.
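The conversion described above can be sketched without the library. `from_dicaugment_sketch` is a hypothetical name, and the body is an assumption based only on the format descriptions in this section (x scales with cols, y with rows, z with slices); it is not the library's implementation.

```python
def from_dicaugment_sketch(bbox, target_format, rows, cols, slices):
    """Convert a normalized two-corner box to coco_3d, pascal_voc_3d, or yolo_3d."""
    if target_format not in {"coco_3d", "pascal_voc_3d", "yolo_3d"}:
        raise ValueError(f"Unknown target_format {target_format}")
    x_min, y_min, z_min, x_max, y_max, z_max = bbox[:6]
    tail = tuple(bbox[6:])  # labels or other extra fields are carried through

    if target_format == "yolo_3d":
        # yolo_3d stays normalized: box center plus box size.
        w, h, d = x_max - x_min, y_max - y_min, z_max - z_min
        return (x_min + w / 2, y_min + h / 2, z_min + d / 2, w, h, d) + tail

    # The other two formats are expressed in pixels.
    x_min, x_max = x_min * cols, x_max * cols
    y_min, y_max = y_min * rows, y_max * rows
    z_min, z_max = z_min * slices, z_max * slices
    if target_format == "coco_3d":
        return (x_min, y_min, z_min, x_max - x_min, y_max - y_min, z_max - z_min) + tail
    return (x_min, y_min, z_min, x_max, y_max, z_max) + tail
```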
- convert_bbox_to_dicaugment(bbox: Tuple[float, float, float, float] | Tuple[float, float, float, float, Any], source_format: str, rows: int, cols: int, slices: int, check_validity: bool = False) Tuple[float, float, float, float] | Tuple[float, float, float, float, Any][source]
Convert a bounding box from a format specified in source_format to the format used by dicaugment: normalized coordinates of closest top-left and furthest bottom-right corners of the bounding box in a form of (x_min, y_min, z_min, x_max, y_max, z_max) e.g. (0.15, 0.27, 0.12, 0.67, 0.5, 0.48).
- Parameters:
bbox – A bounding box tuple.
source_format – format of the bounding box. Should be ‘coco_3d’, ‘pascal_voc_3d’, or ‘yolo_3d’.
check_validity – Check if all boxes are valid boxes.
rows – Image height.
cols – Image width.
slices – Image depth
- Returns:
A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).
- Return type:
tuple
Note
The coco_3d format of a bounding box looks like (x_min, y_min, z_min, width, height, depth), e.g. (97, 12, 5, 150, 200, 10). The pascal_voc_3d format of a bounding box looks like (x_min, y_min, z_min, x_max, y_max, z_max), e.g. (97, 12, 5, 247, 212, 15). The yolo_3d format of a bounding box looks like (x, y, z, width, height, depth), e.g. (0.3, 0.1, 0.5, 0.05, 0.07, 0.23), where x, y, and z are the coordinates of the center of the box and all values are normalized to 1 by image height, width, and depth.
- Raises:
ValueError – if source_format is not equal to coco_3d, pascal_voc_3d, or yolo_3d.
ValueError – if, in yolo_3d format, not all bounding box values are in the range (0, 1).
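Going in the other direction, the sketch below turns a box in one of the three source formats into the normalized two-corner form. The name `to_dicaugment_sketch` is hypothetical, and the code assumes only the format definitions from the note above; range validation of yolo_3d input is left out for brevity.

```python
def to_dicaugment_sketch(bbox, source_format, rows, cols, slices):
    """Convert a coco_3d, pascal_voc_3d, or yolo_3d box to normalized corners."""
    if source_format not in {"coco_3d", "pascal_voc_3d", "yolo_3d"}:
        raise ValueError(f"Unknown source_format {source_format}")
    a, b, c, d, e, f = bbox[:6]
    tail = tuple(bbox[6:])  # labels or other extra fields are carried through

    if source_format == "yolo_3d":
        # Already normalized: (center, size) -> (min corner, max corner).
        return (a - d / 2, b - e / 2, c - f / 2, a + d / 2, b + e / 2, c + f / 2) + tail

    if source_format == "coco_3d":
        # (min corner, size) in pixels -> (min corner, max corner) in pixels.
        d, e, f = a + d, b + e, c + f

    # Normalize: x by image width, y by image height, z by image depth.
    return (a / cols, b / rows, c / slices, d / cols, e / rows, f / slices) + tail
```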
- convert_bboxes_from_dicaugment(bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], target_format: str, rows: int, cols: int, slices: int, check_validity: bool = False) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
Convert a list of bounding boxes from the format used by dicaugment to a format, specified in target_format.
- Parameters:
bboxes – List of dicaugment bounding boxes (x_min, y_min, z_min, x_max, y_max, z_max).
target_format – required format of the output bounding box. Should be ‘coco_3d’, ‘pascal_voc_3d’ or ‘yolo_3d’.
rows – Image height.
cols – Image width.
slices – Image depth
check_validity – Check if all boxes are valid boxes.
- Returns:
List of bounding boxes.
- convert_bboxes_to_dicaugment(bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], source_format: str, rows: int, cols: int, slices: int, check_validity=False) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
Convert a list of bounding boxes from a format specified in source_format to the format used by dicaugment.
- denormalize_bbox(bbox: TBox, rows: int, cols: int, slices: int) TBox[source]
Denormalize coordinates of a bounding box. Multiply x-coordinates by image width, y-coordinates by image height, and z-coordinates by image depth. This is the inverse operation of normalize_bbox().
- Parameters:
bbox – Normalized bounding box (x_min, y_min, z_min, x_max, y_max, z_max).
rows – Image height.
cols – Image width.
slices – Image depth.
- Returns:
Denormalized bounding box (x_min, y_min, z_min, x_max, y_max, z_max).
- Raises:
ValueError – If rows, cols, or slices is less than or equal to zero.
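The scaling rule and its inverse can be illustrated with two small functions. Both names are hypothetical and the error message is an assumption; the arithmetic follows the description above (x with cols, y with rows, z with slices).

```python
def _require_positive(rows, cols, slices):
    # Mirrors the documented ValueError for non-positive image dimensions.
    for name, value in (("rows", rows), ("cols", cols), ("slices", slices)):
        if value <= 0:
            raise ValueError(f"Argument {name} must be positive, got {value}")


def denormalize_bbox_sketch(bbox, rows, cols, slices):
    """Normalized corners -> pixel corners; extra fields are kept."""
    _require_positive(rows, cols, slices)
    x_min, y_min, z_min, x_max, y_max, z_max = bbox[:6]
    return (x_min * cols, y_min * rows, z_min * slices,
            x_max * cols, y_max * rows, z_max * slices) + tuple(bbox[6:])


def normalize_bbox_sketch(bbox, rows, cols, slices):
    """Pixel corners -> normalized corners; the inverse of the function above."""
    _require_positive(rows, cols, slices)
    x_min, y_min, z_min, x_max, y_max, z_max = bbox[:6]
    return (x_min / cols, y_min / rows, z_min / slices,
            x_max / cols, y_max / rows, z_max / slices) + tuple(bbox[6:])
```

Applying one function and then the other with the same rows, cols, and slices returns the original box, up to floating-point rounding.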
- denormalize_bboxes(bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], rows: int, cols: int, slices: int) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
Denormalize a list of bounding boxes.
- Parameters:
bboxes – Normalized bounding boxes [(x_min, y_min, z_min, x_max, y_max, z_max)].
rows – Image height.
cols – Image width.
slices – Image depth.
- Returns:
Denormalized bounding boxes [(x_min, y_min, z_min, x_max, y_max, z_max)].
- Return type:
List
- filter_bboxes(bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], rows: int, cols: int, slices: int, min_area_visibility: float = 0.0, min_volume_visibility: float = 0.0, min_planar_area: float = 0.0, min_volume: float = 0.0, min_width: float = 0.0, min_height: float = 0.0, min_depth: float = 0.0) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
Remove bounding boxes that either lie outside of the visible planar area or volume by more than min_area_visibility or min_volume_visibility, respectively, as well as any bounding boxes whose planar area or volume in pixels is under the threshold set by min_planar_area and min_volume. It also crops boxes to the final image size.
- Parameters:
bboxes – List of dicaugment bounding boxes (x_min, y_min, z_min, x_max, y_max, z_max).
rows – Image height.
cols – Image width.
slices – Image depth.
min_planar_area – Minimum planar area of a bounding box. All bounding boxes whose visible planar area in pixels is less than this value will be removed. Default: 0.0.
min_volume – Minimum volume of a bounding box. All bounding boxes whose visible volume in pixels is less than this value will be removed. Default: 0.0.
min_area_visibility – Minimum fraction of a bounding box's planar area that must remain visible for the box to stay in the list. Default: 0.0.
min_volume_visibility – Minimum fraction of a bounding box's volume that must remain visible for the box to stay in the list. Default: 0.0.
min_width – Minimum width of a bounding box in pixels. All bounding boxes whose width is less than this value will be removed. Default: 0.0.
min_height – Minimum height of a bounding box in pixels. All bounding boxes whose height is less than this value will be removed. Default: 0.0.
min_depth – Minimum depth of a bounding box in pixels. All bounding boxes whose depth is less than this value will be removed. Default: 0.0.
- Returns:
List of bounding boxes.
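A simplified, volume-only sketch of this filtering is shown below. It is an illustration under stated assumptions, not the library's code: the name `filter_bboxes_sketch` is hypothetical, only min_volume and min_volume_visibility are handled, and a box whose visible fraction is less than or equal to the visibility threshold is dropped (so boxes with no visible volume never survive).

```python
def _volume_px(bbox, rows, cols, slices):
    """Volume in pixels of a normalized two-corner box."""
    x_min, y_min, z_min, x_max, y_max, z_max = bbox[:6]
    return (x_max - x_min) * cols * (y_max - y_min) * rows * (z_max - z_min) * slices


def filter_bboxes_sketch(bboxes, rows, cols, slices,
                         min_volume=0.0, min_volume_visibility=0.0):
    kept = []
    for bbox in bboxes:
        full = _volume_px(bbox, rows, cols, slices)
        # Crop the box to the image by clipping each coordinate to [0, 1].
        clipped = tuple(min(max(v, 0.0), 1.0) for v in bbox[:6]) + tuple(bbox[6:])
        visible = _volume_px(clipped, rows, cols, slices)
        if full <= 0 or visible / full <= min_volume_visibility:
            continue  # too little of the box is still inside the image
        if visible < min_volume:
            continue  # the visible part is too small in absolute terms
        kept.append(clipped)
    return kept
```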
- filter_bboxes_by_visibility(original_shape: Sequence[int], bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], transformed_shape: Sequence[int], transformed_bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], area_threshold: float = 0.0, volume_threshold: float = 0.0, min_area: float = 0.0, min_volume: float = 0.0) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
Filter bounding boxes and return only those boxes whose visibility after transformation is above the threshold and whose area in pixels is more than min_area.
- Parameters:
original_shape – Original image shape (height, width, depth, …).
bboxes – Original bounding boxes [(x_min, y_min, z_min, x_max, y_max, z_max)].
transformed_shape – Transformed image shape (height, width, depth).
transformed_bboxes – Transformed bounding boxes [(x_min, y_min, z_min, x_max, y_max, z_max)].
area_threshold – planar area visibility threshold. Should be a value in the range [0.0, 1.0].
volume_threshold – volume visibility threshold. Should be a value in the range [0.0, 1.0].
min_area – Minimal area threshold.
min_volume – Minimal volume threshold.
- Returns:
Filtered bounding boxes [(x_min, y_min, z_min, x_max, y_max, z_max)].
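The visibility test compares each box's size after the transform with its size before. The volume-only sketch below uses a hypothetical function name and assumes that visibility is the ratio of transformed volume to original volume, both measured in pixels of their respective image shapes; the planar-area variant is omitted.

```python
def _volume_px(bbox, shape):
    """Volume in pixels of a normalized box for an image of shape (height, width, depth, ...)."""
    rows, cols, slices = shape[:3]
    x_min, y_min, z_min, x_max, y_max, z_max = bbox[:6]
    return (x_max - x_min) * cols * (y_max - y_min) * rows * (z_max - z_min) * slices


def filter_by_visibility_sketch(original_shape, bboxes, transformed_shape,
                                transformed_bboxes, volume_threshold=0.0, min_volume=0.0):
    kept = []
    for before, after in zip(bboxes, transformed_bboxes):
        volume_before = _volume_px(before, original_shape)
        volume_after = _volume_px(after, transformed_shape)
        if volume_after < min_volume:
            continue  # transformed box is too small in absolute terms
        if volume_after / volume_before >= volume_threshold:
            kept.append(after)  # enough of the original box survived
    return kept
```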
- normalize_bbox(bbox: TBox, rows: int, cols: int, slices: int) TBox[source]
Normalize coordinates of a bounding box. Divide x-coordinates by image width, y-coordinates by image height, and z-coordinates by image depth
- Parameters:
bbox – Denormalized bounding box (x_min, y_min, z_min, x_max, y_max, z_max).
rows – Image height.
cols – Image width.
slices – Image depth.
- Returns:
Normalized bounding box (x_min, y_min, z_min, x_max, y_max, z_max).
- Raises:
ValueError – If rows, cols, or slices is less than or equal to zero.
- normalize_bboxes(bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], rows: int, cols: int, slices: int) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
Normalize a list of bounding boxes.
- Parameters:
bboxes – Denormalized bounding boxes [(x_min, y_min, z_min, x_max, y_max, z_max)].
rows – Image height.
cols – Image width.
slices – Image depth
- Returns:
Normalized bounding boxes [(x_min, y_min, z_min, x_max, y_max, z_max)].
- union_of_bboxes(height: int, width: int, depth: int, bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], erosion_rate: float = 0.0) Tuple[float, float, float, float] | Tuple[float, float, float, float, Any][source]
Calculate union of bounding boxes.
- Parameters:
height – Height of image or space.
width – Width of image or space.
depth – Depth of image or space.
bboxes – List of dicaugment bounding boxes (x_min, y_min, z_min, x_max, y_max, z_max).
erosion_rate – How much each bounding box can be shrunk; useful for erosive cropping. Set this in the range [0, 1]. 0 is not erosive at all; 1.0 can make any bbox lose its volume.
- Returns:
A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).
- Return type:
tuple
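One way to picture the union with erosion is the sketch below. The function name is hypothetical, the height, width, and depth arguments of the real function are left out, and the erosion rule (shrinking each box from both sides by erosion_rate times its size before taking the union) is an assumption consistent with the parameter description above. It expects at least one box.

```python
def union_of_bboxes_sketch(bboxes, erosion_rate=0.0):
    """Smallest box enclosing all (optionally eroded) boxes."""
    mins = [float("inf")] * 3
    maxs = [float("-inf")] * 3
    for bbox in bboxes:
        lo, hi = bbox[:3], bbox[3:6]
        for axis in range(3):
            size = hi[axis] - lo[axis]
            # Shrink the box from both sides before merging it into the union.
            mins[axis] = min(mins[axis], lo[axis] + erosion_rate * size)
            maxs[axis] = max(maxs[axis], hi[axis] - erosion_rate * size)
    return tuple(mins) + tuple(maxs)
```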
composition
- class BaseCompose(transforms: Sequence[BasicTransform | BaseCompose], p: float)[source]
Bases:
Serializable
- class Compose(transforms: Sequence[BasicTransform | BaseCompose], bbox_params: dict | BboxParams | None = None, keypoint_params: dict | KeypointParams | None = None, additional_targets: Dict[str, str] | None = None, p: float = 1.0, is_check_shapes: bool = True)[source]
Bases:
BaseCompose
Compose transforms and handle all transformations regarding bounding boxes.
- Parameters:
transforms (list) – list of transformations to compose.
bbox_params (BboxParams) – Parameters for bounding boxes transforms
keypoint_params (KeypointParams) – Parameters for keypoints transforms
additional_targets (dict) – Dict with keys - new target name, values - old target name. ex: {‘image2’: ‘image’}
p (float) – probability of applying the whole list of transforms. Default: 1.0.
is_check_shapes (bool) – If True, the shape consistency of images/mask/masks is checked on each call. To disable this check, pass False (do this only if you are sure of your data consistency).
- class OneOf(transforms: Sequence[BasicTransform | BaseCompose], p: float = 0.5)[source]
Bases:
BaseCompose
Select one of the transforms to apply. The selected transform will be called with force_apply=True. Transform probabilities will be normalized to sum to 1, so in this case they work as weights.
- Parameters:
transforms (list) – list of transformations to compose.
p (float) – probability of applying selected transform. Default: 0.5.
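The weighting behaviour can be illustrated with the standard library alone. Everything in this sketch is hypothetical (the function name, the use of transform names instead of transform objects, and the `rng` argument); it only shows how per-transform probabilities can act as weights once normalized.

```python
import random


def pick_one_sketch(transform_probs, p=0.5, rng=None):
    """Return (chosen name or None, normalized weights)."""
    rng = rng or random.Random()
    total = sum(transform_probs.values())
    # Normalize the individual probabilities so that they sum to 1.
    weights = {name: prob / total for name, prob in transform_probs.items()}
    if rng.random() >= p:
        return None, weights  # the whole block is skipped with probability 1 - p
    chosen = rng.choices(list(weights), weights=list(weights.values()), k=1)[0]
    return chosen, weights


chosen, weights = pick_one_sketch({"flip": 1.0, "rotate": 0.5, "blur": 0.5}, p=1.0)
print(weights)  # {'flip': 0.5, 'rotate': 0.25, 'blur': 0.25}
```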
- class OneOrOther(first: BasicTransform | BaseCompose | None = None, second: BasicTransform | BaseCompose | None = None, transforms: Sequence[BasicTransform | BaseCompose] | None = None, p: float = 0.5)[source]
Bases:
BaseCompose
Select one or another transform to apply. The selected transform will be called with force_apply=True.
- class ReplayCompose(transforms: Sequence[BasicTransform | BaseCompose], bbox_params: dict | BboxParams | None = None, keypoint_params: dict | KeypointParams | None = None, additional_targets: Dict[str, str] | None = None, p: float = 1.0, is_check_shapes: bool = True, save_key: str = 'replay')[source]
Bases:
Compose
- class Sequential(transforms: Sequence[BasicTransform | BaseCompose], p: float = 0.5)[source]
Bases:
BaseCompose
Sequentially applies all transforms to targets.
Note
This transform is not intended to be a replacement for Compose. Instead, it should be used inside Compose the same way OneOf or OneOrOther are used. For instance, you can combine OneOf with Sequential to create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly chosen sequence to the input data (see the Example section for an example definition of such a pipeline).
Example
>>> import dicaugment as dca
>>> transform = dca.Compose([
>>>     dca.OneOf([
>>>         dca.Sequential([
>>>             dca.HorizontalFlip(p=0.5),
>>>             dca.ShiftScaleRotate(p=0.5),
>>>         ]),
>>>         dca.Sequential([
>>>             dca.VerticalFlip(p=0.5),
>>>             dca.RandomBrightnessContrast(p=0.5),
>>>         ]),
>>>     ], p=1)
>>> ])
- class SomeOf(transforms: Sequence[BasicTransform | BaseCompose], n: int, replace: bool = True, p: float = 1)[source]
Bases:
BaseCompose
Select N transforms to apply. The selected transforms will be called with force_apply=True. Transform probabilities will be normalized to sum to 1, so in this case they work as weights.
- Parameters:
transforms (list) – list of transformations to compose.
n (int) – number of transforms to apply.
replace (bool) – Whether the sampled transforms are with or without replacement. Default: True.
p (float) – probability of applying selected transform. Default: 1.
keypoints_utils
- class KeypointParams(format: str, label_fields: Sequence[str] | None = None, remove_invisible: bool = True, angle_in_degrees: bool = True, check_each_transform: bool = True)[source]
Bases:
Params
Parameters of keypoints.
- Parameters:
format (str) –
format of keypoints. Should be ‘xyz’, ‘zyx’, ‘xyza’, ‘xyzs’, ‘xyzas’, ‘xyzsa’.
x - X coordinate,
y - Y coordinate,
z - Z coordinate,
s - Keypoint scale
a - Keypoint planar orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)
label_fields (list) – list of fields that are joined with keypoints, e.g. labels. Should be the same type as keypoints.
remove_invisible (bool) – whether to remove invisible points after a transform.
angle_in_degrees (bool) – whether the planar angle in ‘xyza’, ‘xyzas’, and ‘xyzsa’ keypoints is given in degrees (True) or radians (False).
check_each_transform (bool) – if True, then keypoints will be checked after each dual transform. Default: True
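The format strings can be read as one letter per value. The sketch below unpacks a keypoint according to its format; the function name, the returned dictionary layout, and the choice to convert the angle to radians are all hypothetical and say nothing about how dicaugment stores keypoints internally.

```python
import math

KEYPOINT_FORMATS = {"xyz", "zyx", "xyza", "xyzs", "xyzas", "xyzsa"}


def parse_keypoint_sketch(keypoint, fmt, angle_in_degrees=True):
    """Unpack a keypoint tuple into named fields according to fmt."""
    if fmt not in KEYPOINT_FORMATS:
        raise ValueError(f"Unknown keypoint format {fmt}")
    fields = dict(zip(fmt, keypoint))       # one letter of fmt per leading value
    extra = tuple(keypoint[len(fmt):])      # e.g. labels that follow the coordinates
    angle = fields.get("a", 0.0)
    if angle_in_degrees:
        angle = math.radians(angle)
    return {
        "x": fields["x"], "y": fields["y"], "z": fields["z"],
        "angle_rad": angle, "scale": fields.get("s", 0.0), "extra": extra,
    }
```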
- class KeypointsProcessor(params: KeypointParams, additional_targets: Dict[str, str] | None = None)[source]
Bases:
DataProcessor
- convert_from_dicaugment(data: Sequence[Sequence], rows: int, cols: int, slices: int) List[Tuple][source]
- convert_to_dicaugment(data: Sequence[Sequence], rows: int, cols: int, slices: int) List[Tuple][source]
- property default_data_name: str
- check_keypoints(keypoints: Sequence[Sequence], rows: int, cols: int, slices: int) None[source]
Check that keypoint coordinates lie within the image shape.
- convert_keypoints_from_dicaugment(keypoints: Sequence[Sequence], target_format: str, rows: int, cols: int, slices: int, check_validity: bool = False, angle_in_degrees: bool = True) List[Tuple][source]
serialization
- from_dict(transform_dict: Dict[str, Any], nonserializable: Dict[str, Any] | None = None, lambda_transforms: Dict[str, Any] | None | str = 'deprecated') Serializable | None[source]
- Parameters:
transform_dict (dict) – A dictionary with serialized transform pipeline.
nonserializable (dict) – A dictionary that contains non-serializable transforms. This dictionary is required when you are restoring a pipeline that contains non-serializable transforms. Keys in that dictionary should match the name arguments of the respective transforms in the serialized pipeline.
lambda_transforms (dict) – Deprecated. Use ‘nonserializable’ instead.
- load(filepath: str, data_format: str = 'json', nonserializable: Dict[str, Any] | None = None, lambda_transforms: Dict[str, Any] | None | str = 'deprecated') object[source]
Load a serialized pipeline from a json or yaml file and construct a transform pipeline.
- Parameters:
filepath (str) – Filepath to read from.
data_format (str) – Serialization format. Should be either ‘json’ or ‘yaml’.
nonserializable (dict) – A dictionary that contains non-serializable transforms. This dictionary is required when you are restoring a pipeline that contains non-serializable transforms. Keys in that dictionary should match the name arguments of the respective transforms in the serialized pipeline.
lambda_transforms (dict) – Deprecated. Use ‘nonserializable’ instead.
- save(transform: Serializable, filepath: str, data_format: str = 'json', on_not_implemented_error: str = 'raise') None[source]
Take a transform pipeline, serialize it and save a serialized version to a file using either json or yaml format.
- Parameters:
transform (obj) – Transform to serialize.
filepath (str) – Filepath to write to.
data_format (str) – Serialization format. Should be either ‘json’ or ‘yaml’.
on_not_implemented_error (str) – Parameter that describes what to do if a transform doesn’t implement the to_dict method. If ‘raise’, then NotImplementedError is raised; if ‘warn’, then the exception will be ignored and no transform arguments will be saved.
- to_dict(transform: Serializable, on_not_implemented_error: str = 'raise') Dict[str, Any][source]
Take a transform pipeline and convert it to a serializable representation that uses only standard python data types: dictionaries, lists, strings, integers, and floats.
- Parameters:
transform – A transform that should be serialized. If the transform doesn’t implement the to_dict method and on_not_implemented_error equals ‘raise’, then NotImplementedError is raised. If on_not_implemented_error equals ‘warn’, then NotImplementedError will be ignored but no transform parameters will be serialized.
on_not_implemented_error (str) – ‘raise’ or ‘warn’.
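The effect of on_not_implemented_error can be pictured with a toy example. The classes, the function name, and the dictionary layout below are hypothetical and are not dicaugment's serialization format; the sketch only shows the ‘raise’ versus ‘warn’ behaviour and that a result built from standard Python types survives a JSON round trip.

```python
import json
import warnings


class FlipSketch:
    """Toy transform that knows how to describe itself."""
    def to_dict(self):
        return {"name": "FlipSketch", "p": 0.5}


class OpaqueSketch:
    """Toy transform without a usable to_dict implementation."""
    def to_dict(self):
        raise NotImplementedError


def to_dict_sketch(transform, on_not_implemented_error="raise"):
    if on_not_implemented_error not in {"raise", "warn"}:
        raise ValueError("on_not_implemented_error must be 'raise' or 'warn'")
    try:
        return transform.to_dict()
    except NotImplementedError:
        if on_not_implemented_error == "raise":
            raise
        warnings.warn(f"{type(transform).__name__} is not serializable")
        return {}  # nothing about this transform is saved


serialized = to_dict_sketch(FlipSketch())
restored = json.loads(json.dumps(serialized))  # dicts, strings, and floats round-trip
```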
transforms_interface
- class BasicTransform(always_apply: bool = False, p: float = 0.5)[source]
Bases:
Serializable
- add_targets(additional_targets: Dict[str, str])[source]
Add targets so that they are transformed the same way as one of the existing targets, e.g. {‘target_image’: ‘image’} or {‘obj1_mask’: ‘mask’, ‘obj2_mask’: ‘mask’}. At least one object with the key ‘image’ must be present.
- Parameters:
additional_targets (dict) – keys - new target name, values - old target name. ex: {‘image2’: ‘image’}
- call_backup = None
- fill_value: Any
- interpolation: Any
- mask_fill_value: Any
- set_deterministic(flag: bool, save_key: str = 'replay') BasicTransform[source]
- property target_dependence: Dict
- property targets: Dict[str, Callable]
- property targets_as_params: List[str]
- class DualTransform(always_apply: bool = False, p: float = 0.5)[source]
Bases:
BasicTransform
Transform for segmentation task.
- apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
- apply_to_bboxes(bboxes: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], **params) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
- apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
- apply_to_keypoints(keypoints: Sequence[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]], **params) List[Tuple[float, float, float, float] | Tuple[float, float, float, float, Any]][source]
- property targets: Dict[str, Callable]
- class ImageOnlyTransform(always_apply: bool = False, p: float = 0.5)[source]
Bases:
BasicTransform
Transform applied to image only.
- property targets: Dict[str, Callable]
- class NoOp(always_apply: bool = False, p: float = 0.5)[source]
Bases:
DualTransform
Does nothing.
- apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]
- to_tuple(param, low=None, bias=None)[source]
Convert input argument to min-max tuple.
- Parameters:
param (scalar, tuple or list of 2+ elements) – Input value. If value is scalar, return value would be (offset - value, offset + value). If value is tuple, return value would be value + offset (broadcasted).
low – Second element of tuple can be passed as optional argument
bias – An offset factor added to each element
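The documented behaviour can be approximated as follows. This is a sketch under assumptions: the name `to_tuple_sketch` is hypothetical, a scalar with low set is taken to produce the ordered pair of low and the scalar, and how the real function treats low and bias when both are given is not stated above, so the sketch simply applies both.

```python
def to_tuple_sketch(param, low=None, bias=None):
    """Turn a scalar or a sequence into a (min, max)-style tuple."""
    if param is None:
        return None
    if isinstance(param, (int, float)):
        if low is None:
            result = (-param, param)  # symmetric range around zero
        else:
            # low supplies the other end of the range; keep the pair ordered
            result = (low, param) if low < param else (param, low)
    else:
        result = tuple(param)
    if bias is not None:
        # Shift every element by the offset.
        result = tuple(bias + value for value in result)
    return result


print(to_tuple_sketch(10))            # (-10, 10)
print(to_tuple_sketch(10, bias=100))  # (90, 110)
```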
utils
- class DataProcessor(params: Params, additional_targets: Dict[str, str] | None = None)[source]
Bases:
ABC
- check_and_convert(data: Sequence, rows: int, cols: int, slices: int, direction: str = 'to') Sequence[source]
- abstract convert_from_dicaugment(data: Sequence, rows: int, cols: int, slices: int) Sequence[source]
- abstract property default_data_name: str