mindspore.dataset.vision

mindspore.dataset.vision.c_transforms

The vision.c_transforms module is inherited from _c_dataengine and is implemented in C++ on top of OpenCV. It is a high-performance module for image processing. Users can apply suitable augmentations to image data to improve model training.

Note

A constructor’s arguments for every class in this module must be saved into the class attributes (self.xxx) to support save() and load().

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Border, Inter
>>>
>>> dataset_dir = "path/to/imagefolder_directory"
>>> # create a dataset that reads all files in dataset_dir with 8 threads
>>> data1 = ds.ImageFolderDataset(dataset_dir, num_parallel_workers=8)
>>> # create a list of transformations to be applied to the image data
>>> transforms_list = [c_vision.Decode(),
>>>                    c_vision.Resize((256, 256), interpolation=Inter.LINEAR),
>>>                    c_vision.RandomCrop(200, padding_mode=Border.EDGE),
>>>                    c_vision.RandomRotation((0, 15)),
>>>                    c_vision.Normalize((100, 115.0, 121.0), (71.0, 68.0, 70.0)),
>>>                    c_vision.HWC2CHW()]
>>> onehot_op = c_transforms.OneHot(num_classes=10)
>>> # apply the transformation to the dataset through data1.map()
>>> data1 = data1.map(operations=transforms_list, input_columns="image")
>>> data1 = data1.map(operations=onehot_op, input_columns="label")
class mindspore.dataset.vision.c_transforms.AutoContrast(cutoff=0.0, ignore=None)[source]

Apply automatic contrast on input image.

Parameters
  • cutoff (float, optional) – Percent of pixels to cut off from the histogram (default=0.0).

  • ignore (Union[int, sequence], optional) – Pixel values to ignore (default=None).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.AutoContrast(cutoff=10.0, ignore=[10, 20])]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.BoundingBoxAugment(transform, ratio=0.3)[source]

Apply a given image transform on a random selection of bounding box regions of a given image.

Parameters
  • transform – C++ transformation function to be applied on a random selection of bounding box regions of a given image.

  • ratio (float, optional) – Ratio of bounding boxes to apply augmentation on. Range: [0, 1] (default=0.3).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # set bounding box operation with ratio of 1 to apply rotation on all bounding boxes
>>> bbox_aug_op = c_vision.BoundingBoxAugment(c_vision.RandomRotation(90), 1)
>>> # map to apply ops
>>> data3 = data3.map(operations=[bbox_aug_op],
>>>                   input_columns=["image", "bbox"],
>>>                   output_columns=["image", "bbox"],
>>>                   column_order=["image", "bbox"])
class mindspore.dataset.vision.c_transforms.CenterCrop(size)[source]

Crops the input image at the center to the given size.

Parameters

size (Union[int, sequence]) – The output size of the cropped image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # crop image to a square
>>> transforms_list1 = [c_vision.Decode(), c_vision.CenterCrop(50)]
>>> data1 = data1.map(operations=transforms_list1, input_columns=["image"])
>>> # crop image to portrait style
>>> transforms_list2 = [c_vision.Decode(), c_vision.CenterCrop((60, 40))]
>>> data2 = data2.map(operations=transforms_list2, input_columns=["image"])
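As an illustration of how the size argument is interpreted, the following sketch computes the crop window for a center crop. The helper name center_crop_box and the floor-division rounding are assumptions made for this example; they are not taken from the library's implementation.

```python
def center_crop_box(height, width, size):
    """Return (top, left, crop_height, crop_width) for a center crop.

    Hypothetical helper: an int `size` means a square crop, a 2-sequence
    means (height, width), as in the parameter description above.
    """
    if isinstance(size, int):
        crop_h, crop_w = size, size
    else:
        crop_h, crop_w = size
    # Floor division is an assumed rounding rule for odd margins.
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, crop_h, crop_w


print(center_crop_box(100, 200, 50))        # (25, 75, 50, 50)
print(center_crop_box(100, 200, (60, 40)))  # (20, 80, 60, 40)
```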
class mindspore.dataset.vision.c_transforms.CutMixBatch(image_batch_format, alpha=1.0, prob=1.0)[source]

Apply the CutMix transformation to an input batch of images and labels. Labels must be converted to one-hot format and the dataset must be batched before this operation is applied.

Parameters
  • image_batch_format (Image Batch Format) – The format of the image batch. Can be any of [ImageBatchFormat.NHWC, ImageBatchFormat.NCHW].

  • alpha (float, optional) – Hyperparameter of beta distribution (default = 1.0).

  • prob (float, optional) – The probability by which CutMix is applied to each image (default = 1.0).

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import ImageBatchFormat
>>>
>>> onehot_op = c_transforms.OneHot(num_classes=10)
>>> data1 = data1.map(operations=onehot_op, input_columns=["label"])
>>> cutmix_batch_op = c_vision.CutMixBatch(ImageBatchFormat.NHWC, 1.0, 0.5)
>>> data1 = data1.batch(5)
>>> data1 = data1.map(operations=cutmix_batch_op, input_columns=["image", "label"])
class mindspore.dataset.vision.c_transforms.CutOut(length, num_patches=1)[source]

Randomly cut (mask) out a given number of square patches from the input NumPy image array.

Parameters
  • length (int) – The side length of each square patch.

  • num_patches (int, optional) – Number of patches to be cut out of an image (default=1).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.CutOut(80, num_patches=10)]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.Decode(rgb=True)[source]

Decode the input image in RGB mode.

Parameters

rgb (bool, optional) – Mode of decoding input image (default=True).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomHorizontalFlip()]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.Equalize[source]

Apply histogram equalization on input image.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.Equalize()]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.HWC2CHW[source]

Transpose the input image from shape (H, W, C) to shape (C, H, W).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomHorizontalFlip(0.75), c_vision.RandomCrop(512),
>>>     c_vision.HWC2CHW()]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
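The axis reordering can be illustrated without the library. The sketch below transposes a nested-list image from (H, W, C) to (C, H, W); the helper name hwc_to_chw is hypothetical and the code only demonstrates the index mapping, not how the C++ operation is implemented.

```python
def hwc_to_chw(image):
    """Transpose a nested-list image from (H, W, C) to (C, H, W)."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    # Element [c][h][w] of the result is element [h][w][c] of the input.
    return [[[image[h][w][c] for w in range(width)]
             for h in range(height)]
            for c in range(channels)]


# A 1 x 2 image with 3 channels (R, G, B).
hwc = [[[1, 2, 3], [4, 5, 6]]]
chw = hwc_to_chw(hwc)
print(chw)  # [[[1, 4]], [[2, 5]], [[3, 6]]]
```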
class mindspore.dataset.vision.c_transforms.Invert[source]

Invert the colors of the input image in RGB mode.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.Invert()]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.MixUpBatch(alpha=1.0)[source]

Apply the MixUp transformation to an input batch of images and labels. Each image is multiplied by a random weight (lambda) and then added to a randomly selected image from the batch multiplied by (1 - lambda). The same formula is also applied to the one-hot labels. Labels must be converted to one-hot format and the dataset must be batched before this operation is applied.

Parameters

alpha (float, optional) – Hyperparameter of beta distribution (default = 1.0).

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> onehot_op = c_transforms.OneHot(num_classes=10)
>>> data1 = data1.map(operations=onehot_op, input_columns=["label"])
>>> mixup_batch_op = c_vision.MixUpBatch(alpha=0.9)
>>> data1 = data1.batch(5)
>>> data1 = data1.map(operations=mixup_batch_op, input_columns=["image", "label"])
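The blending formula described above can be written out for a single pair of samples. This is a minimal sketch in plain Python: mixup_pair is a hypothetical helper, images are flattened to lists, and lambda is passed in as a fixed value rather than sampled.

```python
def mixup_pair(image_a, image_b, label_a, label_b, lam):
    """Blend two flattened images and their one-hot labels with weight lam."""
    mixed_image = [lam * a + (1 - lam) * b for a, b in zip(image_a, image_b)]
    mixed_label = [lam * a + (1 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_image, mixed_label


# lam is fixed here for readability; in the operation it is a random weight.
image, label = mixup_pair([0.0, 100.0], [200.0, 100.0], [1, 0], [0, 1], lam=0.25)
print(image)  # [150.0, 100.0]
print(label)  # [0.25, 0.75]
```

Because the labels are blended with the same weight, they must already be one-hot vectors when the operation runs.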
class mindspore.dataset.vision.c_transforms.Normalize(mean, std)[source]

Normalize the input image with respect to mean and standard deviation.

Parameters
  • mean (sequence) – List or tuple of mean values for each channel, with respect to channel order. The mean values must be in range (0.0, 255.0].

  • std (sequence) – List or tuple of standard deviations for each channel, with respect to channel order. The standard deviation values must be in range (0.0, 255.0].

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> decode_op = c_vision.Decode()
>>> normalize_op = c_vision.Normalize(mean=[121.0, 115.0, 100.0], std=[70.0, 68.0, 71.0])
>>> transforms_list = [decode_op, normalize_op]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
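For one pixel, normalization can be sketched as follows, assuming the conventional per-channel formula (value - mean) / std. The helper normalize_pixel is hypothetical and exists only for this illustration.

```python
def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of one pixel."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


mean = [121.0, 115.0, 100.0]
std = [70.0, 68.0, 71.0]
print(normalize_pixel([191.0, 115.0, 29.0], mean, std))  # [1.0, 0.0, -1.0]
```

The mean and std sequences follow the channel order of the image, so they must be listed in the same order as the decoded channels.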
class mindspore.dataset.vision.c_transforms.Pad(padding, fill_value=0, padding_mode=Border.CONSTANT)[source]

Pads the image according to padding parameters.

Parameters
  • padding (Union[int, sequence]) – The number of pixels to pad the image. If a single number is provided, it pads all borders with this value. If a tuple or list of 2 values is provided, it pads the left and top with the first value and the right and bottom with the second value. If 4 values are provided as a list or tuple, it pads the left, top, right and bottom respectively.

  • fill_value (Union[int, tuple], optional) – The pixel intensity of the borders if the padding_mode is Border.CONSTANT (default=0). If it is a 3-tuple, it is used to fill R, G, B channels respectively.

  • padding_mode (Border mode, optional) –

    The method of padding (default=Border.CONSTANT). Can be any of [Border.CONSTANT, Border.EDGE, Border.REFLECT, Border.SYMMETRIC].

    • Border.CONSTANT, means it fills the border with constant values.

    • Border.EDGE, means it pads with the last value on the edge.

    • Border.REFLECT, means it reflects the values on the edge omitting the last value of edge.

    • Border.SYMMETRIC, means it reflects the values on the edge repeating the last value of edge.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Border
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.Pad([100, 100, 100, 100])]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
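The four border modes are easiest to compare on a single row of pixels. The sketch below is an illustration only: pad_row is a hypothetical helper, and the mode strings stand in for the Border enum members.

```python
def pad_row(row, pad, mode, fill_value=0):
    """Pad a 1-D list on both sides by `pad` elements (0 <= pad < len(row))."""
    if pad == 0:
        return list(row)
    if mode == "constant":
        left = [fill_value] * pad
        right = [fill_value] * pad
    elif mode == "edge":
        left = [row[0]] * pad
        right = [row[-1]] * pad
    elif mode == "reflect":
        # Mirror around the edge element without repeating it.
        left = row[1:pad + 1][::-1]
        right = row[-pad - 1:-1][::-1]
    elif mode == "symmetric":
        # Mirror around the edge, repeating the edge element.
        left = row[:pad][::-1]
        right = row[-pad:][::-1]
    else:
        raise ValueError("unknown mode: " + mode)
    return left + list(row) + right


row = [1, 2, 3, 4]
print(pad_row(row, 2, "constant"))   # [0, 0, 1, 2, 3, 4, 0, 0]
print(pad_row(row, 2, "edge"))       # [1, 1, 1, 2, 3, 4, 4, 4]
print(pad_row(row, 2, "reflect"))    # [3, 2, 1, 2, 3, 4, 3, 2]
print(pad_row(row, 2, "symmetric"))  # [2, 1, 1, 2, 3, 4, 4, 3]
```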
class mindspore.dataset.vision.c_transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=Inter.NEAREST, fill_value=0)[source]

Apply Random affine transformation to the input image.

Parameters
  • degrees (int or float or sequence) – Range of the rotation degrees. If degrees is a number, the range will be (-degrees, degrees). If degrees is a sequence, it should be (min, max).

  • translate (sequence, optional) – Sequence (tx_min, tx_max, ty_min, ty_max) of minimum/maximum translation in x (horizontal) and y (vertical) directions (default=None). The horizontal and vertical shifts are selected randomly from the ranges (tx_min*width, tx_max*width) and (ty_min*height, ty_max*height), respectively. If a tuple or list of size 2, then a translation parallel to the X axis in the range of (translate[0], translate[1]) is applied. If a tuple or list of size 4, then a translation parallel to the X axis in the range of (translate[0], translate[1]) and a translation parallel to the Y axis in the range of (translate[2], translate[3]) are applied. If None, no translation is applied.

  • scale (sequence, optional) – Scaling factor interval (default=None, original scale is used).

  • shear (int or float or sequence, optional) – Range of shear factor (default=None). If a number, then a shear parallel to the X axis in the range of (-shear, +shear) is applied. If a tuple or list of size 2, then a shear parallel to the X axis in the range of (shear[0], shear[1]) is applied. If a tuple or list of size 4, then a shear parallel to the X axis in the range of (shear[0], shear[1]) and a shear parallel to the Y axis in the range of (shear[2], shear[3]) are applied. If None, no shear is applied.

  • resample (Inter mode, optional) –

    An optional resampling filter (default=Inter.NEAREST). If omitted, or if the image has mode “1” or “P”, it is set to be Inter.NEAREST. It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means resample method is bilinear interpolation.

    • Inter.NEAREST, means resample method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means resample method is bicubic interpolation.

  • fill_value (tuple or int, optional) – Optional fill_value to fill the area outside the transform in the output image. If it is a tuple, it must have three elements, each in the range [0, 255]. Used only in Pillow versions > 5.0.0 (default=0, filling is performed).

Raises
  • ValueError – If degrees is negative.

  • ValueError – If translation value is not between -1 and 1.

  • ValueError – If scale is not positive.

  • ValueError – If shear is a number but is not positive.

  • TypeError – If degrees is not a number, list or tuple, or if degrees is a list or tuple whose length is not 2.

  • TypeError – If translate is specified but is not list or a tuple of length 2 or 4.

  • TypeError – If scale is not a list or tuple of length 2.

  • TypeError – If shear is not a list or tuple of length 2 or 4.

  • TypeError – If fill_value is not a single integer or a 3-tuple.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> decode_op = c_vision.Decode()
>>> random_affine_op = c_vision.RandomAffine(degrees=15, translate=(-0.1, 0.1, 0, 0), scale=(0.9, 1.1),
>>>     resample=Inter.NEAREST)
>>> transforms_list = [decode_op, random_affine_op]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
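The translate argument is given as fractions of the image size. The following sketch converts it to pixel ranges according to the rules stated in the parameter description; translate_pixel_ranges is a hypothetical helper, not part of the module.

```python
def translate_pixel_ranges(translate, width, height):
    """Convert fractional translate bounds into pixel ranges.

    Returns ((x_min, x_max), (y_min, y_max)); a missing axis maps to (0.0, 0.0).
    """
    if translate is None:
        return (0.0, 0.0), (0.0, 0.0)
    if len(translate) == 2:
        tx_min, tx_max = translate
        ty_min, ty_max = 0.0, 0.0
    elif len(translate) == 4:
        tx_min, tx_max, ty_min, ty_max = translate
    else:
        raise TypeError("translate must have length 2 or 4")
    for value in (tx_min, tx_max, ty_min, ty_max):
        if not -1.0 <= value <= 1.0:
            raise ValueError("translation values must be between -1 and 1")
    return (tx_min * width, tx_max * width), (ty_min * height, ty_max * height)


print(translate_pixel_ranges((-0.25, 0.25, 0, 0.5), width=200, height=100))
# ((-50.0, 50.0), (0, 50.0))
```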
class mindspore.dataset.vision.c_transforms.RandomColor(degrees=(0.1, 1.9))[source]

Adjust the color of the input image by a fixed or random degree. This operation works only with 3-channel color images.

Parameters

degrees (sequence, optional) – Range of random color adjustment degrees. It should be in (min, max) format. If min=max, then it is a single fixed magnitude operation (default=(0.1, 1.9)).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomColor((0.5, 2.0))]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomColorAdjust(brightness=(1, 1), contrast=(1, 1), saturation=(1, 1), hue=(0, 0))[source]

Randomly adjust the brightness, contrast, saturation, and hue of the input image.

Parameters
  • brightness (Union[float, tuple], optional) – Brightness adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-brightness), 1+brightness]. If it is a sequence, it should be [min, max] for the range.

  • contrast (Union[float, tuple], optional) – Contrast adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-contrast), 1+contrast]. If it is a sequence, it should be [min, max] for the range.

  • saturation (Union[float, tuple], optional) – Saturation adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-saturation), 1+saturation]. If it is a sequence, it should be [min, max] for the range.

  • hue (Union[float, tuple], optional) – Hue adjustment factor (default=(0, 0)). If it is a float, the range will be [-hue, hue]. Value should be 0 <= hue <= 0.5. If it is a sequence, it should be [min, max] where -0.5 <= min <= max <= 0.5.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> decode_op = c_vision.Decode()
>>> transform_op = c_vision.RandomColorAdjust(brightness=(0.5, 1), contrast=(0.4, 1), saturation=(0.3, 1))
>>> transforms_list = [decode_op, transform_op]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
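The way a single float expands into a sampling range can be sketched as below. The helpers factor_range and hue_range are hypothetical; they only restate the range rules from the parameter descriptions above.

```python
def factor_range(value):
    """Sampling range for brightness, contrast or saturation."""
    if isinstance(value, (int, float)):
        if value < 0:
            raise ValueError("factor cannot be negative")
        # A float f expands to [max(0, 1 - f), 1 + f].
        return (max(0.0, 1.0 - value), 1.0 + value)
    low, high = value
    return (low, high)


def hue_range(hue):
    """Sampling range for hue."""
    if isinstance(hue, (int, float)):
        if not 0 <= hue <= 0.5:
            raise ValueError("hue must satisfy 0 <= hue <= 0.5")
        return (-hue, hue)
    low, high = hue
    if not -0.5 <= low <= high <= 0.5:
        raise ValueError("hue range must satisfy -0.5 <= min <= max <= 0.5")
    return (low, high)


print(factor_range(0.5))       # (0.5, 1.5)
print(factor_range(1.5))       # (0.0, 2.5)
print(factor_range((0.5, 1)))  # (0.5, 1)
print(hue_range(0.25))         # (-0.25, 0.25)
```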
class mindspore.dataset.vision.c_transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill_value=0, padding_mode=Border.CONSTANT)[source]

Crop the input image at a random location.

Parameters
  • size (Union[int, sequence]) – The output size of the cropped image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • padding (Union[int, sequence], optional) – The number of pixels to pad the image (default=None). If padding is not None, first pad the image with padding values. If a single number is provided, pad all borders with this value. If a tuple or list of 2 values is provided, pad the left and top with the first value and the right and bottom with the second value. If 4 values are provided as a list or tuple, pad the left, top, right and bottom respectively.

  • pad_if_needed (bool, optional) – Pad the image if either side is smaller than the given output size (default=False).

  • fill_value (Union[int, tuple], optional) – The pixel intensity of the borders if the padding_mode is Border.CONSTANT (default=0). If it is a 3-tuple, it is used to fill R, G, B channels respectively.

  • padding_mode (Border mode, optional) –

    The method of padding (default=Border.CONSTANT). It can be any of [Border.CONSTANT, Border.EDGE, Border.REFLECT, Border.SYMMETRIC].

    • Border.CONSTANT, means it fills the border with constant values.

    • Border.EDGE, means it pads with the last value on the edge.

    • Border.REFLECT, means it reflects the values on the edge omitting the last value of edge.

    • Border.SYMMETRIC, means it reflects the values on the edge repeating the last value of edge.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Border
>>>
>>> decode_op = c_vision.Decode()
>>> random_crop_op = c_vision.RandomCrop(512, [200, 200, 200, 200], padding_mode=Border.EDGE)
>>> transforms_list = [decode_op, random_crop_op]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
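The three accepted forms of the padding argument can be reduced to one (left, top, right, bottom) tuple. The sketch below follows the rules in the parameter description; expand_padding and padded_size are hypothetical helpers used only for illustration.

```python
def expand_padding(padding):
    """Return (left, top, right, bottom) from an int, 2-sequence or 4-sequence."""
    if isinstance(padding, int):
        return (padding, padding, padding, padding)
    if len(padding) == 2:
        # First value: left and top. Second value: right and bottom.
        return (padding[0], padding[0], padding[1], padding[1])
    if len(padding) == 4:
        return tuple(padding)
    raise ValueError("padding must be an int or a sequence of length 2 or 4")


def padded_size(height, width, padding):
    """Return the (height, width) of the image after padding."""
    left, top, right, bottom = expand_padding(padding)
    return height + top + bottom, width + left + right


print(expand_padding(10))                             # (10, 10, 10, 10)
print(expand_padding((1, 2)))                         # (1, 1, 2, 2)
print(padded_size(300, 400, [200, 200, 200, 200]))    # (700, 800)
```

In the example above, a 300 x 400 image padded by 200 pixels on every side becomes 700 x 800, which is large enough for the requested 512 x 512 crop.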
class mindspore.dataset.vision.c_transforms.RandomCropDecodeResize(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=Inter.BILINEAR, max_attempts=10)[source]

Equivalent to RandomResizedCrop, but crops before decoding.

Parameters
  • size (Union[int, sequence]) – The size of the output image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • scale (tuple, optional) – Range [min, max) of the size of the cropped region, relative to the original image size (default=(0.08, 1.0)).

  • ratio (tuple, optional) – Range [min, max) of aspect ratio to be cropped (default=(3. / 4., 4. / 3.)).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BILINEAR). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid crop area (default=10). If exceeded, fall back to a center crop.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> resize_crop_decode_op = c_vision.RandomCropDecodeResize(size=(50, 75), scale=(0.25, 0.5),
>>>     interpolation=Inter.NEAREST, max_attempts=5)
>>> transforms_list = [resize_crop_decode_op]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomCropWithBBox(size, padding=None, pad_if_needed=False, fill_value=0, padding_mode=Border.CONSTANT)[source]

Crop the input image at a random location and adjust bounding boxes accordingly.

Parameters
  • size (Union[int, sequence]) – The output size of the cropped image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • padding (Union[int, sequence], optional) – The number of pixels to pad the image (default=None). If padding is not None, first pad image with padding values. If a single number is provided, pad all borders with this value. If a tuple or list of 2 values are provided, pad the (left and top) with the first value and (right and bottom) with the second value. If 4 values are provided as a list or tuple, pad the left, top, right and bottom respectively.

  • pad_if_needed (bool, optional) – Pad the image if either side is smaller than the given output size (default=False).

  • fill_value (Union[int, tuple], optional) – The pixel intensity of the borders if the padding_mode is Border.CONSTANT (default=0). If it is a 3-tuple, it is used to fill R, G, B channels respectively.

  • padding_mode (Border mode, optional) –

    The method of padding (default=Border.CONSTANT). It can be any of [Border.CONSTANT, Border.EDGE, Border.REFLECT, Border.SYMMETRIC].

    • Border.CONSTANT, means it fills the border with constant values.

    • Border.EDGE, means it pads with the last value on the edge.

    • Border.REFLECT, means it reflects the values on the edge omitting the last value of edge.

    • Border.SYMMETRIC, means it reflects the values on the edge repeating the last value of edge.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> decode_op = c_vision.Decode()
>>> random_crop_with_bbox_op = c_vision.RandomCropWithBBox([512, 512], [200, 200, 200, 200])
>>> transforms_list = [decode_op, random_crop_with_bbox_op]
>>> data3 = data3.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomHorizontalFlip(prob=0.5)[source]

Flip the input image horizontally, randomly with a given probability.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomHorizontalFlip(0.75)]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomHorizontalFlipWithBBox(prob=0.5)[source]

Flip the input image horizontally, randomly with a given probability and adjust bounding boxes accordingly.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomHorizontalFlipWithBBox(0.70)]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
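Adjusting a bounding box for a horizontal flip amounts to mirroring its left edge. The sketch below assumes boxes are stored as (x_min, y_min, width, height); that layout and the helper name hflip_bbox are assumptions for this illustration.

```python
def hflip_bbox(bbox, image_width):
    """Mirror one (x_min, y_min, width, height) box across the vertical axis."""
    x_min, y_min, box_w, box_h = bbox
    # The old right edge (x_min + box_w) becomes the new left edge after mirroring.
    return (image_width - x_min - box_w, y_min, box_w, box_h)


print(hflip_bbox((10, 20, 30, 40), image_width=100))  # (60, 20, 30, 40)
```

Applying the function twice returns the original box, which is a quick sanity check for any flip adjustment.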
class mindspore.dataset.vision.c_transforms.RandomPosterize(bits=(8, 8))[source]

Reduce the number of bits for each color channel.

Parameters

bits (sequence or int, optional) – Range of random posterize to compress image. Bits values must be in range of [1, 8], and include at least one integer value in the given range. It must be in (min, max) or integer format. If min=max, then it is a single fixed magnitude operation (default=(8, 8)).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomPosterize((6, 8))]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
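Reducing the number of bits can be pictured as masking off the low-order bits of each 8-bit channel value. This is a sketch of a common definition of posterization, assumed here for illustration; posterize_value is a hypothetical helper and the C++ operation may differ in detail.

```python
def posterize_value(value, bits):
    """Keep the `bits` most significant bits of an 8-bit channel value."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be in [1, 8]")
    # Clear the (8 - bits) least significant bits.
    mask = 0xFF & ~((1 << (8 - bits)) - 1)
    return value & mask


print(posterize_value(203, 8))  # 203 (unchanged)
print(posterize_value(203, 4))  # 192
print(posterize_value(203, 1))  # 128
```

With bits=8 every value is kept as is, which is why a range of (8, 8) leaves the image unchanged under this definition.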
class mindspore.dataset.vision.c_transforms.RandomResize(size)[source]

Tensor operation to resize the input image using a randomly selected interpolation mode.

Parameters

size (Union[int, sequence]) – The output size of the resized image. If size is an integer, smaller edge of the image will be resized to this value with the same image aspect ratio. If size is a sequence of length 2, it should be (height, width).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # randomly resize image, keeping aspect ratio
>>> transforms_list1 = [c_vision.Decode(), c_vision.RandomResize(50)]
>>> data1 = data1.map(operations=transforms_list1, input_columns=["image"])
>>> # randomly resize image to landscape style
>>> transforms_list2 = [c_vision.Decode(), c_vision.RandomResize((40, 60))]
>>> data2 = data2.map(operations=transforms_list2, input_columns=["image"])
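The two forms of the size argument lead to different output shapes. The sketch below computes the shape for each case; resized_shape is a hypothetical helper, and rounding to the nearest integer is an assumption.

```python
def resized_shape(height, width, size):
    """Return the output (height, width) for a given size argument."""
    if not isinstance(size, int):
        out_h, out_w = size
        return out_h, out_w
    # Integer size: scale so that the smaller edge equals `size`,
    # keeping the aspect ratio (rounding rule is assumed).
    if height <= width:
        return size, int(round(width * size / height))
    return int(round(height * size / width)), size


print(resized_shape(100, 200, 50))        # (50, 100)
print(resized_shape(300, 150, 50))        # (100, 50)
print(resized_shape(100, 200, (40, 60)))  # (40, 60)
```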
class mindspore.dataset.vision.c_transforms.RandomResizeWithBBox(size)[source]

Tensor operation to resize the input image using a randomly selected interpolation mode and adjust bounding boxes accordingly.

Parameters

size (Union[int, sequence]) – The output size of the resized image. If size is an integer, smaller edge of the image will be resized to this value with the same image aspect ratio. If size is a sequence of length 2, it should be (height, width).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # randomly resize image with bounding boxes, keeping aspect ratio
>>> transforms_list1 = [c_vision.Decode(), c_vision.RandomResizeWithBBox(60)]
>>> data1 = data1.map(operations=transforms_list1, input_columns=["image"])
>>> # randomly resize image with bounding boxes to portrait style
>>> transforms_list2 = [c_vision.Decode(), c_vision.RandomResizeWithBBox((80, 60))]
>>> data2 = data2.map(operations=transforms_list2, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=Inter.BILINEAR, max_attempts=10)[source]

Crop the input image to a random size and aspect ratio.

Parameters
  • size (Union[int, sequence]) – The size of the output image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • scale (tuple, optional) – Range [min, max) of the size of the cropped region, relative to the original image size (default=(0.08, 1.0)).

  • ratio (tuple, optional) – Range [min, max) of aspect ratio to be cropped (default=(3. / 4., 4. / 3.)).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BILINEAR). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid crop area (default=10). If exceeded, fall back to a center crop.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> decode_op = c_vision.Decode()
>>> resize_crop_op = c_vision.RandomResizedCrop(size=(50, 75), scale=(0.25, 0.5),
>>>     interpolation=Inter.BILINEAR)
>>> transforms_list = [decode_op, resize_crop_op]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
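How scale, ratio and max_attempts interact can be sketched as a rejection-sampling loop. The code below is a rough illustration under stated assumptions: scale is treated as a fraction of the image area, both values are sampled uniformly, and the fallback is the largest centered square. None of these details are confirmed by this reference, and sample_crop_box is a hypothetical helper.

```python
import math
import random


def sample_crop_box(height, width, scale, ratio, max_attempts=10, rng=random):
    """Return (top, left, crop_height, crop_width) for a random resized crop."""
    area = height * width
    for _ in range(max_attempts):
        # Assumed: scale is a fraction of the original area.
        target_area = area * rng.uniform(scale[0], scale[1])
        # Assumed: aspect ratio (width / height) is sampled uniformly.
        aspect = rng.uniform(ratio[0], ratio[1])
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # No valid proposal within max_attempts: fall back to a center crop.
    crop = min(height, width)
    return (height - crop) // 2, (width - crop) // 2, crop, crop


box = sample_crop_box(300, 400, scale=(0.25, 0.5), ratio=(0.75, 4 / 3),
                      rng=random.Random(0))
top, left, crop_h, crop_w = box
print(0 <= top and top + crop_h <= 300)    # True
print(0 <= left and left + crop_w <= 400)  # True
```

The selected region would then be resized to the requested size with the chosen interpolation mode.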
class mindspore.dataset.vision.c_transforms.RandomResizedCropWithBBox(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=Inter.BILINEAR, max_attempts=10)[source]

Crop the input image to a random size and aspect ratio and adjust bounding boxes accordingly.

Parameters
  • size (Union[int, sequence]) – The size of the output image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • scale (tuple, optional) – Range (min, max) of the size of the cropped region, relative to the original image size (default=(0.08, 1.0)).

  • ratio (tuple, optional) – Range (min, max) of aspect ratio to be cropped (default=(3. / 4., 4. / 3.)).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BILINEAR). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid crop area (default=10). If exceeded, fall back to a center crop.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> decode_op = c_vision.Decode()
>>> bbox_op = c_vision.RandomResizedCropWithBBox(size=50, interpolation=Inter.NEAREST)
>>> transforms_list = [decode_op, bbox_op]
>>> data3 = data3.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomRotation(degrees, resample=Inter.NEAREST, expand=False, center=None, fill_value=0)[source]

Rotate the input image by a random angle.

Parameters
  • degrees (Union[int, float, sequence]) – Range of random rotation degrees. If degrees is a number, the range will be converted to (-degrees, degrees). If degrees is a sequence, it should be (min, max).

  • resample (Inter mode, optional) –

    An optional resampling filter (default=Inter.NEAREST). If omitted, or if the image has mode “1” or “P”, it is set to be Inter.NEAREST. It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means resample method is bilinear interpolation.

    • Inter.NEAREST, means resample method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means resample method is bicubic interpolation.

  • expand (bool, optional) – Optional expansion flag (default=False). If set to True, expand the output image to make it large enough to hold the entire rotated image. If set to False or omitted, make the output image the same size as the input. Note that the expand flag assumes rotation around the center and no translation.

  • center (tuple, optional) – Optional center of rotation (a 2-tuple) (default=None). Origin is the top left corner. None sets to the center of the image.

  • fill_value (Union[int, tuple], optional) – Optional fill color for the area outside the rotated image (default=0). If it is a 3-tuple, it is used for R, G, B channels respectively. If it is an integer, it is used for all RGB channels.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> transforms_list = [c_vision.Decode(),
>>>                    c_vision.RandomRotation(degrees=5.0, resample=Inter.NEAREST, expand=True)]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
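Two points from the parameter list can be made concrete: how a single number becomes a range, and how large the output must be when expand=True. The sketch below uses the standard bounding-box formula for a rectangle rotated about its center; degree_range and expanded_size are hypothetical helpers, and rounding to the nearest integer is an assumption.

```python
import math


def degree_range(degrees):
    """Return the (min, max) rotation range for a number or a 2-sequence."""
    if isinstance(degrees, (int, float)):
        return (-degrees, degrees)
    low, high = degrees
    return (low, high)


def expanded_size(height, width, angle_degrees):
    """Size (height, width) of the smallest axis-aligned canvas that
    holds the whole image after rotation about its center."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    new_w = width * cos_t + height * sin_t
    new_h = width * sin_t + height * cos_t
    return int(round(new_h)), int(round(new_w))


print(degree_range(5.0))            # (-5.0, 5.0)
print(degree_range((0, 15)))        # (0, 15)
print(expanded_size(100, 200, 90))  # (200, 100)
print(expanded_size(100, 200, 0))   # (100, 200)
```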
class mindspore.dataset.vision.c_transforms.RandomSelectSubpolicy(policy)[source]

Choose a random sub-policy from a list to be applied on the input image. A sub-policy is a list of tuples (op, prob), where op is a TensorOp operation and prob is the probability that this op will be applied. Once a sub-policy is selected, each op within the sub-policy will be applied in sequence according to its probability.

Parameters

policy (list(list(tuple(TensorOp, float)))) – List of sub-policies to choose from.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> policy = [[(c_vision.RandomRotation((45, 45)), 0.5), (c_vision.RandomVerticalFlip(), 1),
>>>            (c_vision.RandomColorAdjust(), 0.8)],
>>>           [(c_vision.RandomRotation((90, 90)), 1), (c_vision.RandomColorAdjust(), 0.2)]]
>>> data_policy = data1.map(operations=c_vision.RandomSelectSubpolicy(policy), input_columns=["image"])
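The selection logic described above can be simulated in plain Python. In this sketch, ordinary callables stand in for TensorOp objects, and apply_random_subpolicy is a hypothetical helper; uniform selection of the sub-policy is an assumption.

```python
import random


def apply_random_subpolicy(policy, image, rng=random):
    """Pick one sub-policy at random and apply its ops with their probabilities."""
    sub_policy = rng.choice(policy)
    for op, prob in sub_policy:
        # Each op runs independently with its own probability.
        if rng.random() < prob:
            image = op(image)
    return image


# Plain functions stand in for TensorOp objects in this sketch.
def add_one(x):
    return x + 1


def double(x):
    return x * 2


policy = [[(add_one, 1.0), (double, 1.0)],   # always gives (x + 1) * 2
          [(double, 1.0), (add_one, 0.0)]]   # always gives x * 2
result = apply_random_subpolicy(policy, 5, rng=random.Random(0))
print(result in (12, 10))  # True
```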
class mindspore.dataset.vision.c_transforms.RandomSharpness(degrees=(0.1, 1.9))[source]

Adjust the sharpness of the input image by a fixed or random degree. Degree of 0.0 gives a blurred image, degree of 1.0 gives the original image, and degree of 2.0 gives a sharpened image.

Parameters

degrees (tuple, optional) – Range of random sharpness adjustment degrees. It should be in (min, max) format. If min=max, then it is a single fixed magnitude operation (default = (0.1, 1.9)).

Raises
  • TypeError – If degrees is not a list or tuple.

  • ValueError – If degrees is negative.

  • ValueError – If degrees is in (max, min) format instead of (min, max).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomSharpness(degrees=(0.2, 1.9))]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomSolarize(threshold=(0, 255))[source]

Invert all pixel values above a threshold.

Parameters

threshold (tuple, optional) – Range of random solarize threshold. Threshold values should always be in the range (0, 255), include at least one integer value in the given range and be in (min, max) format. If min=max, then it is a single fixed magnitude operation (default=(0, 255)).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomSolarize(threshold=(10,100))]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomVerticalFlip(prob=0.5)[source]

Flip the input image vertically, randomly with a given probability.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomVerticalFlip(0.25)]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.RandomVerticalFlipWithBBox(prob=0.5)[source]

Flip the input image vertically, randomly with a given probability and adjust bounding boxes accordingly.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.RandomVerticalFlipWithBBox(0.20)]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.Rescale(rescale, shift)[source]

Tensor operation to rescale the input image.

Parameters
  • rescale (float) – Rescale factor.

  • shift (float) – Shift factor.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> transforms_list = [c_vision.Decode(), c_vision.Rescale(1.0 / 255.0, -1.0)]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
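
As a rough illustration of what the two factors do, the sketch below assumes the usual linear form output = pixel * rescale + shift. That formula is not stated in this section, so treat it as an assumption; the helper name rescale_pixels is hypothetical.

```python
def rescale_pixels(pixels, rescale, shift):
    """Apply output = pixel * rescale + shift to every value (assumed formula)."""
    return [p * rescale + shift for p in pixels]


# Same arguments as the example above: 8-bit values end up in about [-1.0, 0.0].
out = rescale_pixels([0, 255], 1.0 / 255.0, -1.0)
```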
class mindspore.dataset.vision.c_transforms.Resize(size, interpolation=Inter.BILINEAR)[source]

Resize the input image to the given size.

Parameters
  • size (Union[int, sequence]) – The output size of the resized image. If size is an integer, the smaller edge of the image will be resized to this value with the same image aspect ratio. If size is a sequence of length 2, it should be (height, width).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.LINEAR). It can be any of [Inter.LINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.LINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> decode_op = c_vision.Decode()
>>> resize_op = c_vision.Resize([100, 75], Inter.BICUBIC)
>>> transforms_list = [decode_op, resize_op]
>>> data1 = data1.map(operations=transforms_list, input_columns=["image"])
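
The two forms of size can be illustrated with a small shape calculation. This is a sketch only: the helper name resized_shape is hypothetical, and rounding the longer edge to the nearest integer is an assumption about the implementation.

```python
def resized_shape(height, width, size):
    """Return the (height, width) a resize would target for the given size."""
    if isinstance(size, int):
        # Scale so that the smaller edge equals `size`, keeping the aspect ratio.
        if height <= width:
            return size, int(round(width * size / height))
        return int(round(height * size / width)), size
    out_h, out_w = size  # a length-2 sequence is read as (height, width)
    return out_h, out_w


shape_int = resized_shape(480, 640, 240)
shape_seq = resized_shape(480, 640, [100, 75])
```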
class mindspore.dataset.vision.c_transforms.ResizeWithBBox(size, interpolation=Inter.BILINEAR)[source]

Resize the input image to the given size and adjust bounding boxes accordingly.

Parameters
  • size (Union[int, sequence]) – The output size of the resized image. If size is an integer, the smaller edge of the image will be resized to this value with the same image aspect ratio. If size is a sequence of length 2, it should be (height, width).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.LINEAR). It can be any of [Inter.LINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.LINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> decode_op = c_vision.Decode()
>>> bbox_op = c_vision.ResizeWithBBox(50, Inter.NEAREST)
>>> transforms_list = [decode_op, bbox_op]
>>> data3 = data3.map(operations=transforms_list, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.SoftDvppDecodeRandomCropResizeJpeg(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), max_attempts=10)[source]

Tensor operation to decode, randomly crop, and resize a JPEG image using the simulation algorithm of the Ascend series chip DVPP module.

The usage scenario is consistent with SoftDvppDecodeResizeJpeg. The input image size should be in the range [32*32, 8192*8192]. The zoom-out and zoom-in multiples of the image length and width should be in the range [1/32, 16]. Only images with an even resolution can be output; output with an odd resolution is not supported.

Parameters
  • size (Union[int, sequence]) – The size of the output image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • scale (tuple, optional) – Range [min, max) of the crop size relative to the original image size (default=(0.08, 1.0)).

  • ratio (tuple, optional) – Range [min, max) of the aspect ratio of the crop (default=(3. / 4., 4. / 3.)).

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid crop_area (default=10). If exceeded, fall back to use center_crop instead.

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # decode, randomly crop and resize image to a square of size (90, 90)
>>> transforms_list1 = [c_vision.SoftDvppDecodeRandomCropResizeJpeg(90)]
>>> data1 = data1.map(operations=transforms_list1, input_columns=["image"])
>>> # decode, randomly crop and resize to landscape style
>>> transforms_list2 = [c_vision.SoftDvppDecodeRandomCropResizeJpeg((80, 100))]
>>> data2 = data2.map(operations=transforms_list2, input_columns=["image"])
class mindspore.dataset.vision.c_transforms.SoftDvppDecodeResizeJpeg(size)[source]

Tensor operation to decode and resize a JPEG image using the simulation algorithm of the Ascend series chip DVPP module.

It is recommended to use this algorithm in the following scenario: the DVPP of the Ascend chip is not used during training but is used during inference, and the inference accuracy is lower than the training accuracy. The input image size should be in the range [32*32, 8192*8192]. The zoom-out and zoom-in multiples of the image length and width should be in the range [1/32, 16]. Only images with an even resolution can be output; output with an odd resolution is not supported.

Parameters

size (Union[int, sequence]) – The output size of the resized image. If size is an integer, the smaller edge of the image will be resized to this value with the same image aspect ratio. If size is a sequence of length 2, it should be (height, width).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # decode and resize image, keeping aspect ratio
>>> transforms_list1 = [c_vision.SoftDvppDecodeResizeJpeg(70)]
>>> data1 = data1.map(operations=transforms_list1, input_columns=["image"])
>>> # decode and resize to portrait style
>>> transforms_list2 = [c_vision.SoftDvppDecodeResizeJpeg((80, 60))]
>>> data2 = data2.map(operations=transforms_list2, input_columns=["image"])
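
The size constraints listed for the SoftDvpp operations can be checked ahead of time with a small helper. This is a hypothetical sketch, not part of the library: the name check_soft_dvpp_resize is made up, and reading [32*32, 8192*8192] as per-side bounds of 32 to 8192 pixels is an assumption.

```python
def check_soft_dvpp_resize(in_h, in_w, out_h, out_w):
    """Return a list of violated constraints (empty if the resize looks valid)."""
    # Assumption: the documented range applies to each side separately.
    if not (32 <= in_h <= 8192 and 32 <= in_w <= 8192):
        return ["input size outside [32*32, 8192*8192]"]
    problems = []
    for name, src, dst in (("height", in_h, out_h), ("width", in_w, out_w)):
        if not 1 / 32 <= dst / src <= 16:
            problems.append(f"{name} zoom factor outside [1/32, 16]")
    if out_h % 2 or out_w % 2:
        problems.append("output resolution must be even")
    return problems


ok = check_soft_dvpp_resize(480, 640, 80, 60)
bad = check_soft_dvpp_resize(480, 640, 81, 60)
```

Here ok is an empty list, while bad reports the odd output height.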
class mindspore.dataset.vision.c_transforms.UniformAugment(transforms, num_ops=2)[source]

Tensor operation to perform randomly selected augmentation.

Parameters
  • transforms – List of C++ operations (Python operations are not accepted).

  • num_ops (int, optional) – Number of operations to be selected and applied (default=2).

Examples

>>> import mindspore.dataset.vision.c_transforms as c_vision
>>> import mindspore.dataset.vision.py_transforms as py_vision
>>>
>>> transforms_list = [c_vision.RandomHorizontalFlip(),
>>>                    c_vision.RandomVerticalFlip(),
>>>                    c_vision.RandomColorAdjust(),
>>>                    c_vision.RandomRotation(degrees=45)]
>>> uni_aug_op = c_vision.UniformAugment(transforms=transforms_list, num_ops=2)
>>> transforms_all = [c_vision.Decode(), c_vision.Resize(size=[224, 224]),
>>>                   uni_aug_op, py_vision.ToTensor()]
>>> data_aug = data1.map(operations=transforms_all, input_columns="image",
>>>                      num_parallel_workers=1)

mindspore.dataset.vision.py_transforms

The module vision.py_transforms is implemented based on Python PIL. This module provides many kinds of image augmentations. It also provides methods for converting between PIL images and NumPy arrays. For users who prefer Python PIL in image learning tasks, this module is a good tool to process images. Users can also define their own augmentations with Python PIL.

class mindspore.dataset.vision.py_transforms.AutoContrast(cutoff=0.0, ignore=None)[source]

Automatically maximize the contrast of the input PIL image.

Parameters
  • cutoff (float, optional) – Percent of pixels to cut off from the histogram (default=0.0).

  • ignore (Union[int, sequence], optional) – Pixel values to ignore (default=None).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.AutoContrast(),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.CenterCrop(size)[source]

Crop the central region of the input PIL image to the given size.

Parameters

size (Union[int, sequence]) – The output size of the cropped image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.CenterCrop(64),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.Cutout(length, num_patches=1)[source]

Randomly cut (mask) out a given number of square patches from the input NumPy image array.

Terrance DeVries and Graham W. Taylor, ‘Improved Regularization of Convolutional Neural Networks with Cutout’, 2017. See https://arxiv.org/pdf/1708.04552.pdf

Parameters
  • length (int) – The side length of each square patch.

  • num_patches (int, optional) – Number of patches to be cut out of an image (default=1).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.ToTensor(),
>>>          py_vision.Cutout(80)])
class mindspore.dataset.vision.py_transforms.Decode[source]

Decode the input image to PIL image format in RGB mode.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomHorizontalFlip(0.5),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.Equalize[source]

Equalize the histogram of the input PIL image.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.Equalize(),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.FiveCrop(size)[source]

Generate 5 cropped images (one central image and four corner images).

Parameters

size (int or sequence) – The output size of the crop. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

Examples

>>> import numpy
>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.FiveCrop(size=200),  # 200 is an illustrative crop size
>>>          # 4D stack of 5 images
>>>          lambda images: numpy.stack([py_vision.ToTensor()(image) for image in images])])
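
The geometry of the five crops can be sketched as box coordinates. This is illustrative only: the helper name five_crop_boxes, the (top, left, bottom, right) box layout, and the order of the returned boxes are assumptions rather than documented behavior.

```python
def five_crop_boxes(height, width, size):
    """Return (top, left, bottom, right) boxes: four corners, then the center."""
    crop_h, crop_w = (size, size) if isinstance(size, int) else size
    if crop_h > height or crop_w > width:
        raise ValueError("crop size is larger than the image")
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return [
        (0, 0, crop_h, crop_w),                            # top-left
        (0, width - crop_w, crop_h, width),                # top-right
        (height - crop_h, 0, height, crop_w),              # bottom-left
        (height - crop_h, width - crop_w, height, width),  # bottom-right
        (top, left, top + crop_h, left + crop_w),          # center
    ]


boxes = five_crop_boxes(100, 200, 50)
```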
class mindspore.dataset.vision.py_transforms.Grayscale(num_output_channels=1)[source]

Convert the input PIL image to a grayscale image.

Parameters

num_output_channels (int) – Number of channels of the output grayscale image (1 or 3). Default is 1. If set to 3, the returned image has 3 identical RGB channels.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.Grayscale(3),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.HWC2CHW[source]

Transpose a NumPy image array from shape (H, W, C) to shape (C, H, W).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.HWC2CHW()])
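
To make the axis reordering concrete, here is a dependency-free sketch using nested lists in place of a NumPy array. The helper name hwc_to_chw is hypothetical.

```python
def hwc_to_chw(image):
    """Transpose a nested list of shape (H, W, C) into shape (C, H, W)."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [[[image[h][w][c] for w in range(width)] for h in range(height)]
            for c in range(channels)]


# A 1 x 2 image with 3 channels: two RGB pixels side by side.
hwc = [[[10, 20, 30], [40, 50, 60]]]
chw = hwc_to_chw(hwc)
```

After the transpose, each of the three outer entries holds one full channel plane.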
class mindspore.dataset.vision.py_transforms.HsvToRgb(is_hwc=False)[source]

Convert a NumPy HSV image, or a batch of NumPy HSV images, to RGB.

Parameters

is_hwc (bool) – The flag of image shape, (H, W, C) or (N, H, W, C) if True and (C, H, W) or (N, C, H, W) if False (default=False).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.CenterCrop(20),
>>>          py_vision.ToTensor(),
>>>          py_vision.HsvToRgb()])
class mindspore.dataset.vision.py_transforms.Invert[source]

Invert the colors of the input PIL image.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.Invert(),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.LinearTransformation(transformation_matrix, mean_vector)[source]

Apply linear transformation to the input NumPy image array, given a square transformation matrix and a mean vector.

The transformation first flattens the input array and subtracts the mean vector from it. It then computes the dot product with the transformation matrix, and reshapes it back to its original shape.

Parameters
  • transformation_matrix (numpy.ndarray) – a square transformation matrix of shape (D, D), D = C x H x W.

  • mean_vector (numpy.ndarray) – a NumPy ndarray of shape (D,) where D = C x H x W.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.Resize(256),
>>>          py_vision.ToTensor(),
>>>          py_vision.LinearTransformation(transformation_matrix, mean_vector)])
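
The arithmetic described above (subtract the mean vector, then take the dot product with the matrix) can be sketched on an already-flattened image. The helper name linear_transformation is hypothetical, the flatten and reshape steps are omitted, and the row-vector ordering of the product is an assumption.

```python
def linear_transformation(flat_image, transformation_matrix, mean_vector):
    """Subtract the mean, then take the dot product with a (D, D) matrix."""
    dim = len(flat_image)
    centered = [x - m for x, m in zip(flat_image, mean_vector)]
    # Row-vector convention (centered @ matrix); this ordering is an assumption.
    return [sum(centered[i] * transformation_matrix[i][j] for i in range(dim))
            for j in range(dim)]


flat = [1.0, 2.0, 3.0, 4.0]  # already-flattened image, D = 4
mean = [0.5, 0.5, 0.5, 0.5]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
result = linear_transformation(flat, identity, mean)
```

With an identity matrix the operation reduces to mean subtraction, which makes the result easy to check by hand.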
class mindspore.dataset.vision.py_transforms.MixUp(batch_size, alpha, is_single=True)[source]

Apply the mix-up transformation to the input image and label, combining each input sample with others.

Parameters
  • batch_size (int) – Batch size of dataset.

  • alpha (float) – Mix up rate.

  • is_single (bool) – Identify if single batch or multi-batch mix up transformation is to be used (Default=True, which is single batch).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>>
>>> # Setup multi-batch mixup transformation
>>> transform = [py_vision.MixUp(batch_size=16, alpha=0.2, is_single=False)]
>>> # Apply the transform to the dataset through dataset.map()
>>> dataset = dataset.map(input_columns="image", operations=transform)
class mindspore.dataset.vision.py_transforms.Normalize(mean, std)[source]

Normalize the input NumPy image array of shape (C, H, W) with the given mean and standard deviation.

The values of the array need to be in the range (0.0, 1.0].

Parameters
  • mean (sequence) – List or tuple of mean values for each channel, with respect to channel order. The mean values must be in the range (0.0, 1.0].

  • std (sequence) – List or tuple of standard deviations for each channel, w.r.t. channel order. The standard deviation values must be in the range (0.0, 1.0].

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomHorizontalFlip(0.5),
>>>          py_vision.ToTensor(),
>>>          py_vision.Normalize((0.491, 0.482, 0.447), (0.247, 0.243, 0.262))])
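
Per-channel normalization amounts to (value - mean) / std for each channel. The sketch below shows this on a nested list in (C, H, W) layout; the helper name normalize_chw is hypothetical and the sample values are made up for illustration.

```python
def normalize_chw(image, mean, std):
    """Apply (value - mean[c]) / std[c] to every value of channel c."""
    return [[[(v - m) / s for v in row] for row in channel]
            for channel, m, s in zip(image, mean, std)]


# A 2-channel, 1 x 2 image with values already scaled to at most 1.0.
image = [[[0.5, 1.0]], [[0.25, 0.75]]]
out = normalize_chw(image, mean=(0.5, 0.25), std=(0.5, 0.25))
```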
class mindspore.dataset.vision.py_transforms.Pad(padding, fill_value=0, padding_mode=Border.CONSTANT)[source]

Pad the input PIL image according to padding parameters.

Parameters
  • padding (Union[int, sequence]) – The number of pixels to pad the image. If a single number is provided, pad all borders with this value. If a tuple or list of 2 values is provided, pad the left and top with the first value and the right and bottom with the second value. If 4 values are provided as a list or tuple, pad the left, top, right and bottom respectively.

  • fill_value (Union[int, tuple], optional) – Filling value for the pixel intensity of the borders if the padding_mode is Border.CONSTANT (Default=0). If it is a 3-tuple, it is used to fill R, G, B channels respectively.

  • padding_mode (Border mode, optional) –

    The method of padding (default=Border.CONSTANT). It can be any of [Border.CONSTANT, Border.EDGE, Border.REFLECT, Border.SYMMETRIC].

    • Border.CONSTANT, means it fills the border with constant values.

    • Border.EDGE, means it pads with the last value on the edge.

    • Border.REFLECT, means it reflects the values on the edge omitting the last value of edge.

    • Border.SYMMETRIC, means it reflects the values on the edge repeating the last value of edge.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          # adds 10 pixels (default black) to each side of the border of the image
>>>          py_vision.Pad(padding=10),
>>>          py_vision.ToTensor()])
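
The three accepted forms of padding can be expanded to explicit per-side values as follows. This is a sketch of the rules stated above, not library code: the helper names expand_padding and padded_size are hypothetical.

```python
def expand_padding(padding):
    """Return padding as a (left, top, right, bottom) tuple."""
    if isinstance(padding, int):
        return (padding,) * 4
    if len(padding) == 2:
        # First value pads left and top, second pads right and bottom.
        return (padding[0], padding[0], padding[1], padding[1])
    if len(padding) == 4:
        return tuple(padding)
    raise ValueError("padding must be an int or a sequence of length 2 or 4")


def padded_size(height, width, padding):
    """Return the (height, width) of the image after padding."""
    left, top, right, bottom = expand_padding(padding)
    return height + top + bottom, width + left + right


size_after = padded_size(32, 32, 10)
```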
class mindspore.dataset.vision.py_transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=Inter.NEAREST, fill_value=0)[source]

Apply a random affine transformation to the input PIL image.

Parameters
  • degrees (Union[int, float, sequence]) – Range of the rotation degrees. If degrees is a number, the range will be (-degrees, degrees). If degrees is a sequence, it should be (min, max).

  • translate (sequence, optional) – Sequence (tx, ty) of maximum translation in x(horizontal) and y(vertical) directions (default=None). The horizontal shift and vertical shift are selected randomly from the range: (-tx*width, tx*width) and (-ty*height, ty*height), respectively. If None, no translations are applied.

  • scale (sequence, optional) – Scaling factor interval (default=None, original scale is used).

  • shear (Union[int, float, sequence], optional) – Range of shear factor (default=None). If shear is a number, then a shear parallel to the X axis in the range of (-shear, +shear) is applied. If shear is a tuple or list of size 2, then a shear parallel to the X axis in the range of (shear[0], shear[1]) is applied. If shear is a tuple or list of size 4, then a shear parallel to the X axis in the range of (shear[0], shear[1]) and a shear parallel to the Y axis in the range of (shear[2], shear[3]) are applied. If shear is None, no shear is applied.

  • resample (Inter mode, optional) –

    An optional resampling filter (default=Inter.NEAREST). If omitted, or if the image has mode “1” or “P”, it is set to be Inter.NEAREST. It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means resample method is bilinear interpolation.

    • Inter.NEAREST, means resample method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means resample method is bicubic interpolation.

  • fill_value (Union[tuple, int], optional) – Optional filling value to fill the area outside the transform in the output image. There must be three elements in the tuple and the value of a single element is within the range [0, 255]. Used only in Pillow versions > 5.0.0 (default=0, filling is performed).

Raises
  • ValueError – If degrees is negative.

  • ValueError – If translation value is not between 0 and 1.

  • ValueError – If scale is not positive.

  • ValueError – If shear is a number but is not positive.

  • TypeError – If degrees is not a number, list, or tuple, or if it is a list or tuple whose length is not 2.

  • TypeError – If translate is specified but is not a list or tuple of length 2.

  • TypeError – If scale is not a list or tuple of length 2.

  • TypeError – If shear is not a list or tuple of length 2 or 4.

  • TypeError – If fill_value is not a single integer or a 3-tuple.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1)),
>>>          py_vision.ToTensor()])
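
The translate ranges given above can be sketched as a sampling helper. This is illustrative only: the name sample_translation is hypothetical, and drawing the shifts from a uniform distribution is an assumption about the implementation.

```python
import random


def sample_translation(translate, width, height, rng=random):
    """Draw (dx, dy) from (-tx*width, tx*width) and (-ty*height, ty*height)."""
    if translate is None:
        return 0.0, 0.0  # no translation is applied
    tx, ty = translate
    if not (0 <= tx <= 1 and 0 <= ty <= 1):
        raise ValueError("translation values should be between 0 and 1")
    dx = rng.uniform(-tx * width, tx * width)
    dy = rng.uniform(-ty * height, ty * height)
    return dx, dy


dx, dy = sample_translation((0.1, 0.2), width=100, height=50, rng=random.Random(1))
```

For a 100 x 50 image and translate=(0.1, 0.2), both shifts stay within about 10 pixels of zero.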
class mindspore.dataset.vision.py_transforms.RandomColor(degrees=(0.1, 1.9))[source]

Adjust the color of the input PIL image by a random degree.

Parameters

degrees (sequence) – Range of random color adjustment degrees. It should be in (min, max) format (default=(0.1,1.9)).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomColor((0.5, 2.0)),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.RandomColorAdjust(brightness=(1, 1), contrast=(1, 1), saturation=(1, 1), hue=(0, 0))[source]

Perform a random brightness, contrast, saturation, and hue adjustment on the input PIL image.

Parameters
  • brightness (Union[float, tuple], optional) – Brightness adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-brightness), 1+brightness]. If it is a sequence, it should be [min, max] for the range.

  • contrast (Union[float, tuple], optional) – Contrast adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-contrast), 1+contrast]. If it is a sequence, it should be [min, max] for the range.

  • saturation (Union[float, tuple], optional) – Saturation adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-saturation), 1+saturation]. If it is a sequence, it should be [min, max] for the range.

  • hue (Union[float, tuple], optional) – Hue adjustment factor (default=(0, 0)). If it is a float, the range will be [-hue, hue]. Value should be 0 <= hue <= 0.5. If it is a sequence, it should be [min, max] where -0.5 <= min <= max <= 0.5.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomColorAdjust(0.4, 0.4, 0.4, 0.1),
>>>          py_vision.ToTensor()])
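
The way a single float expands into a sampling range can be written out directly from the parameter descriptions above. The helper names factor_range and hue_range are hypothetical and only restate those rules.

```python
def factor_range(value):
    """Return the (min, max) range for brightness, contrast, or saturation."""
    if isinstance(value, (int, float)):
        if value < 0:
            raise ValueError("the adjustment factor cannot be negative")
        return (max(0, 1 - value), 1 + value)
    low, high = value  # a sequence is taken as [min, max] directly
    return (low, high)


def hue_range(hue):
    """Return the (min, max) range for the hue shift."""
    if isinstance(hue, (int, float)):
        if not 0 <= hue <= 0.5:
            raise ValueError("hue should satisfy 0 <= hue <= 0.5")
        return (-hue, hue)
    low, high = hue
    if not -0.5 <= low <= high <= 0.5:
        raise ValueError("hue range should satisfy -0.5 <= min <= max <= 0.5")
    return (low, high)


brightness = factor_range(0.4)  # as in RandomColorAdjust(0.4, ...)
unchanged = factor_range((1, 1))  # the default leaves the image untouched
hue = hue_range(0.1)
```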
class mindspore.dataset.vision.py_transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill_value=0, padding_mode=Border.CONSTANT)[source]

Crop the input PIL image at a random location.

Parameters
  • size (Union[int, sequence]) – The output size of the cropped image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • padding (Union[int, sequence], optional) – The number of pixels to pad the image (default=None). If padding is not None, first pad image with padding values. If a single number is provided, pad all borders with this value. If a tuple or list of 2 values is provided, pad the left and top with the first value and the right and bottom with the second value. If 4 values are provided as a list or tuple, pad the left, top, right and bottom respectively.

  • pad_if_needed (bool, optional) – Pad the image if either side is smaller than the given output size (default=False).

  • fill_value (int or tuple, optional) – filling value (default=0). The pixel intensity of the borders if the padding_mode is Border.CONSTANT. If it is a 3-tuple, it is used to fill R, G, B channels respectively.

  • padding_mode (Border mode, optional) –

    The method of padding (default=Border.CONSTANT). It can be any of [Border.CONSTANT, Border.EDGE, Border.REFLECT, Border.SYMMETRIC].

    • Border.CONSTANT, means it fills the border with constant values.

    • Border.EDGE, means it pads with the last value on the edge.

    • Border.REFLECT, means it reflects the values on the edge omitting the last value of edge.

    • Border.SYMMETRIC, means it reflects the values on the edge repeating the last value of edge.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomCrop(224),
>>>          py_vision.ToTensor()])
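
The four padding modes are easiest to compare on a one-dimensional row of pixel values. In this sketch the string mode names are hypothetical stand-ins for the Border enum members, and the helper pad_1d is not part of the library. The reflect branch assumes the pad width is smaller than the row length.

```python
def pad_1d(values, pad, mode, fill_value=0):
    """Pad a 1-D list on both sides, mimicking the four Border modes."""
    n = len(values)
    if mode == "constant":
        left = right = [fill_value] * pad
    elif mode == "edge":
        left, right = [values[0]] * pad, [values[-1]] * pad
    elif mode == "reflect":    # mirror without repeating the edge value
        left = [values[i] for i in range(pad, 0, -1)]
        right = [values[n - 1 - i] for i in range(1, pad + 1)]
    elif mode == "symmetric":  # mirror and repeat the edge value
        left = [values[i] for i in range(pad - 1, -1, -1)]
        right = [values[n - 1 - i] for i in range(pad)]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return left + values + right


row = [1, 2, 3, 4]
padded = {mode: pad_1d(row, 2, mode)
          for mode in ("constant", "edge", "reflect", "symmetric")}
```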
class mindspore.dataset.vision.py_transforms.RandomErasing(prob=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False, max_attempts=10)[source]

Erase the pixels, within a selected rectangle region, to the given value.

Randomly applied on the input NumPy image array with a given probability.

Zhun Zhong et al., ‘Random Erasing Data Augmentation’, 2017. See https://arxiv.org/pdf/1708.04896.pdf

Parameters
  • prob (float, optional) – Probability of applying RandomErasing (default=0.5).

  • scale (sequence of floats, optional) – Range of the relative erase area to the original image (default=(0.02, 0.33)).

  • ratio (sequence of floats, optional) – Range of the aspect ratio of the erase area (default=(0.3, 3.3)).

  • value (Union[int, sequence, string]) – Erasing value (default=0). If value is a single integer, it is applied to all pixels to be erased. If value is a sequence of length 3, it is applied to R, G, B channels respectively. If value is a string ‘random’, the erase value will be obtained from a standard normal distribution.

  • inplace (bool, optional) – Apply this transform in-place (default=False).

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid erase_area (default=10). If exceeded, return the original image.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.ToTensor(),
>>>          py_vision.RandomErasing(value='random')])
class mindspore.dataset.vision.py_transforms.RandomGrayscale(prob=0.1)[source]

Randomly convert the input image to a grayscale image with a given probability.

Parameters

prob (float, optional) – Probability of the image being converted to grayscale (default=0.1).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomGrayscale(0.3),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.RandomHorizontalFlip(prob=0.5)[source]

Randomly flip the input image horizontally with a given probability.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomHorizontalFlip(0.5),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.RandomPerspective(distortion_scale=0.5, prob=0.5, interpolation=Inter.BICUBIC)[source]

Randomly apply perspective transformation to the input PIL image with a given probability.

Parameters
  • distortion_scale (float, optional) – The scale of distortion, a float value between 0 and 1 (default=0.5).

  • prob (float, optional) – Probability of applying the perspective transformation to the image (default=0.5).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BICUBIC). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means the interpolation method is bilinear interpolation.

    • Inter.NEAREST, means the interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means the interpolation method is bicubic interpolation.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomPerspective(prob=0.1),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=Inter.BILINEAR, max_attempts=10)[source]

Crop a region of random size and aspect ratio from the input image, then resize it to the given size.

Parameters
  • size (Union[int, sequence]) – The size of the output image. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • scale (tuple, optional) – Range (min, max) of the crop size relative to the original image size (default=(0.08, 1.0)).

  • ratio (tuple, optional) – Range (min, max) of the aspect ratio of the crop (default=(3. / 4., 4. / 3.)).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BILINEAR). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means the interpolation method is bilinear interpolation.

    • Inter.NEAREST, means the interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means the interpolation method is bicubic interpolation.

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid crop area (default=10). If exceeded, fall back to use center crop instead.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomResizedCrop(224),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.RandomRotation(degrees, resample=Inter.NEAREST, expand=False, center=None, fill_value=0)[source]

Rotate the input PIL image by a random angle.

Parameters
  • degrees (Union[int, float, sequence]) – Range of random rotation degrees. If degrees is a number, the range will be converted to (-degrees, degrees). If degrees is a sequence, it should be (min, max).

  • resample (Inter mode, optional) –

    An optional resampling filter (default=Inter.NEAREST). If omitted, or if the image has mode “1” or “P”, it is set to be Inter.NEAREST. It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means the resampling method is bilinear interpolation.

    • Inter.NEAREST, means the resampling method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means the resampling method is bicubic interpolation.

  • expand (bool, optional) – Optional expansion flag (default=False). If set to True, expand the output image to make it large enough to hold the entire rotated image. If set to False or omitted, make the output image the same size as the input. Note that the expand flag assumes rotation around the center and no translation.

  • center (tuple, optional) – Optional center of rotation (a 2-tuple) (default=None). Origin is the top left corner. Default None sets to the center of the image.

  • fill_value (Union[int, tuple], optional) – Optional fill color for the area outside the rotated image (default=0). If it is a 3-tuple, it is used for R, G, B channels respectively. If it is an integer, it is used for all RGB channels.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomRotation(30),
>>>          py_vision.ToTensor()])
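
How the degrees argument maps to a sampling range can be sketched as follows. The helper names normalize_degrees and sample_angle are hypothetical, and rejecting a negative number is an assumption rather than documented behavior.

```python
import numbers
import random


def normalize_degrees(degrees):
    """Turn the degrees argument into a (min, max) tuple of floats."""
    if isinstance(degrees, numbers.Real):
        if degrees < 0:
            # Assumption: a single number is expected to be non-negative.
            raise ValueError("a single degrees value must be non-negative")
        return (-float(degrees), float(degrees))
    low, high = degrees  # a sequence is taken as (min, max)
    return (float(low), float(high))


def sample_angle(degrees, rng=random):
    """Draw one rotation angle uniformly from the normalized range."""
    low, high = normalize_degrees(degrees)
    return rng.uniform(low, high)


angle = sample_angle(30)  # some value between -30 and 30
```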
class mindspore.dataset.vision.py_transforms.RandomSharpness(degrees=(0.1, 1.9))[source]

Adjust the sharpness of the input PIL image by a random degree.

Parameters

degrees (sequence, optional) – Range of random sharpness adjustment degrees. It should be in (min, max) format (default=(0.1, 1.9)).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomSharpness((0.5, 1.5)),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.RandomVerticalFlip(prob=0.5)[source]

Randomly flip the input image vertically with a given probability.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.RandomVerticalFlip(0.5),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.Resize(size, interpolation=Inter.BILINEAR)[source]

Resize the input PIL image to the given size.

Parameters
  • size (Union[int, sequence]) – The output size of the resized image. If size is an integer, the smaller edge of the image will be resized to this value with the same image aspect ratio. If size is a sequence of length 2, it should be (height, width).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BILINEAR). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means the interpolation method is bilinear interpolation.

    • Inter.NEAREST, means the interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means the interpolation method is bicubic interpolation.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.Resize(256),
>>>          py_vision.ToTensor()])
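
For an integer size, the output shape follows from the aspect ratio of the input. The sketch below shows the arithmetic only; resize_output_size is a hypothetical helper, and truncating the longer edge to an integer is an assumption about rounding.

```python
def resize_output_size(height, width, size):
    """Return the (height, width) that an input of the given shape is resized to."""
    if isinstance(size, int):
        if width <= height:
            # Width is the smaller edge: it becomes size, height scales with it.
            return (size * height // width, size)
        # Height is the smaller edge: it becomes size, width scales with it.
        return (size, size * width // height)
    out_height, out_width = size  # a sequence is taken as (height, width)
    return (out_height, out_width)


landscape = resize_output_size(480, 640, 256)
portrait = resize_output_size(640, 480, 256)
exact = resize_output_size(480, 640, (224, 224))
```

With these inputs the smaller edge becomes 256 in both orientations, while the explicit (224, 224) request ignores the input aspect ratio.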
class mindspore.dataset.vision.py_transforms.RgbToHsv(is_hwc=False)[source]

Convert a NumPy RGB image or a batch of NumPy RGB images to HSV images.

Parameters

is_hwc (bool, optional) – Whether the channel axis comes last. If True, the input shape is (H, W, C) or (N, H, W, C); if False, it is (C, H, W) or (N, C, H, W) (default=False).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(),
>>>          py_vision.CenterCrop(20),
>>>          py_vision.ToTensor(),
>>>          py_vision.RgbToHsv()])
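
The effect of is_hwc on the expected layout can be illustrated with NumPy and the standard-library colorsys module. This is a sketch, not the library's code: rgb_to_hsv_sketch is a hypothetical name, and it assumes float RGB values in [0, 1], as produced by ToTensor.

```python
import colorsys

import numpy as np


def rgb_to_hsv_sketch(image, is_hwc=False):
    """Convert a float RGB array to HSV, keeping the input layout."""
    arr = np.asarray(image, dtype=np.float64)
    if arr.ndim not in (3, 4):
        raise ValueError("expected a 3-D image or a 4-D batch of images")
    # The channel axis is last for (H, W, C) / (N, H, W, C),
    # and third from the end for (C, H, W) / (N, C, H, W).
    channel_axis = -1 if is_hwc else arr.ndim - 3
    channels_last = np.moveaxis(arr, channel_axis, -1)
    pixels = channels_last.reshape(-1, 3)
    hsv = np.array([colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels])
    # Restore the original layout.
    return np.moveaxis(hsv.reshape(channels_last.shape), -1, channel_axis)


red_chw = np.array([[[1.0]], [[0.0]], [[0.0]]])  # shape (3, 1, 1)
hsv_chw = rgb_to_hsv_sketch(red_chw)             # is_hwc=False

green_hwc = np.array([[[0.0, 1.0, 0.0]]])        # shape (1, 1, 3)
hsv_hwc = rgb_to_hsv_sketch(green_hwc, is_hwc=True)
```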
class mindspore.dataset.vision.py_transforms.TenCrop(size, use_vertical_flip=False)[source]

Generate 10 cropped images: the first 5 are the FiveCrop results (four corners and the center), and the second 5 are their flipped counterparts. The flip is horizontal by default, or vertical if use_vertical_flip is True.

Parameters
  • size (Union[int, sequence]) – The output size of the crop. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • use_vertical_flip (bool, optional) – Flip the image vertically instead of horizontally if set to True (default=False).

Examples

>>> import numpy
>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> size = 224  # example crop size
>>> Compose([py_vision.Decode(),
>>>          py_vision.TenCrop(size),
>>>          # 4D stack of 10 images
>>>          lambda images: numpy.stack([py_vision.ToTensor()(image) for image in images])])
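
The crop geometry can be sketched on a plain NumPy array in (H, W, C) layout. The helper names are hypothetical, and the order of the crops and the integer centering are assumptions; the real transform operates on PIL images.

```python
import numpy as np


def five_crop_sketch(image, size):
    """Return the four corner crops and the center crop of an (H, W, C) array."""
    crop_h, crop_w = (size, size) if isinstance(size, int) else size
    height, width = image.shape[:2]
    if crop_h > height or crop_w > width:
        raise ValueError("crop size must not exceed the image size")
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return [
        image[:crop_h, :crop_w],                      # top left
        image[:crop_h, width - crop_w:],              # top right
        image[height - crop_h:, :crop_w],             # bottom left
        image[height - crop_h:, width - crop_w:],     # bottom right
        image[top:top + crop_h, left:left + crop_w],  # center
    ]


def ten_crop_sketch(image, size, use_vertical_flip=False):
    """Five crops of the image followed by five crops of its flipped copy."""
    flipped = image[::-1] if use_vertical_flip else image[:, ::-1]
    return five_crop_sketch(image, size) + five_crop_sketch(flipped, size)


image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
crops = ten_crop_sketch(image, 2)
stacked = np.stack(crops)  # one array holding all 10 crops
```

Because the transform returns a tuple of images rather than a single image, a following step such as the numpy.stack lambda in the example above is needed to obtain one array.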
class mindspore.dataset.vision.py_transforms.ToPIL[source]

Convert the input decoded NumPy image array of RGB mode to a PIL image of RGB mode.

Examples

>>> # data is already decoded, but not in PIL image format
>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.ToPIL(), py_vision.RandomHorizontalFlip(0.5),
>>>          py_vision.ToTensor()])
class mindspore.dataset.vision.py_transforms.ToTensor(output_type=<class 'numpy.float32'>)[source]

Convert the input NumPy image array or PIL image of shape (H, W, C) to a NumPy ndarray of shape (C, H, W).

Note

The values in the input arrays are rescaled from [0, 255] to [0.0, 1.0]. The type is cast to output_type (default NumPy float32). The number of channels remains the same.

Parameters

output_type (NumPy datatype, optional) – The datatype of the NumPy output (default=np.float32).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> Compose([py_vision.Decode(), py_vision.RandomHorizontalFlip(0.5),
>>>          py_vision.ToTensor()])
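
The rescaling and axis reordering described in the note can be approximated in NumPy as follows. This is a sketch only: to_tensor_sketch is a hypothetical name, and treating a 2-D input as a single-channel image is an assumption.

```python
import numpy as np


def to_tensor_sketch(image, output_type=np.float32):
    """Rescale an (H, W, C) image from [0, 255] to [0.0, 1.0] and return (C, H, W)."""
    arr = np.asarray(image)
    if arr.ndim == 2:
        # Assumption: a grayscale (H, W) input gets a channel axis of length 1.
        arr = arr[:, :, np.newaxis]
    if arr.ndim != 3:
        raise ValueError("expected an image of shape (H, W) or (H, W, C)")
    chw = arr.transpose(2, 0, 1) / 255.0  # move channels first, then rescale
    return chw.astype(output_type)


hwc = np.full((2, 3, 3), 255, dtype=np.uint8)  # H=2, W=3, C=3
tensor = to_tensor_sketch(hwc)
```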
class mindspore.dataset.vision.py_transforms.ToType(output_type)[source]

Convert the input NumPy image array to desired NumPy dtype.

Parameters

output_type (NumPy datatype) – The datatype of the NumPy output, e.g. numpy.float32.

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>> import numpy as np
>>>
>>> Compose([py_vision.Decode(), py_vision.RandomHorizontalFlip(0.5),
>>>          py_vision.ToTensor(),
>>>          py_vision.ToType(np.float32)])
class mindspore.dataset.vision.py_transforms.UniformAugment(transforms, num_ops=2)[source]

Uniformly select and apply a number of transforms sequentially from a list of transforms. Randomly assign a probability to each transform for each image to decide whether to apply the transform or not.

All the transforms in the transforms list must have the same input/output data type.

Parameters
  • transforms (list) – List of transformations to be chosen from to apply.

  • num_ops (int, optional) – Number of transforms to sequentially apply (default=2).

Examples

>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose
>>>
>>> transforms_list = [py_vision.CenterCrop(64),
>>>                    py_vision.RandomColor(),
>>>                    py_vision.RandomSharpness(),
>>>                    py_vision.RandomRotation(30)]
>>> Compose([py_vision.Decode(),
>>>          py_vision.UniformAugment(transforms_list),
>>>          py_vision.ToTensor()])
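
The selection logic described above can be sketched in plain Python. The function name uniform_augment_sketch and the toy integer transforms are hypothetical, and selecting without replacement is an assumption.

```python
import random


def uniform_augment_sketch(image, transforms, num_ops=2, rng=random):
    """Pick num_ops transforms and apply each one with its own random probability."""
    # Assumption: the transforms are selected without replacement.
    chosen = rng.sample(transforms, num_ops)
    for transform in chosen:
        probability = rng.random()       # probability assigned to this transform
        if rng.random() < probability:   # decide whether to apply it
            image = transform(image)
    return image


# Toy stand-ins for image transforms: each adds a distinct amount to an integer.
toy_transforms = [lambda x: x + 1, lambda x: x + 10, lambda x: x + 100]
result = uniform_augment_sketch(0, toy_transforms, num_ops=2)
```

Because each selected transform may be skipped, an image can pass through with anywhere from zero to num_ops transforms applied, which is why every transform in the list must accept and return the same data type.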