mindspore.dataset.transforms.vision.py_transforms

The vision.py_transforms module is implemented on top of the Python Imaging Library (PIL). It provides many kinds of image augmentations, along with methods for converting between PIL Images and Numpy arrays. It is a good fit for users who prefer PIL for image-learning tasks, and users can also define their own augmentations with PIL.

class mindspore.dataset.transforms.vision.py_transforms.AutoContrast[source]

Automatically maximize the contrast of the input PIL image.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.AutoContrast(),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.CenterCrop(size)[source]

Crop the central region of the input PIL Image to the given size.

Parameters

size (int or sequence) – The output size of the cropped image. If size is an int, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.CenterCrop(64),
>>>                          py_transforms.ToTensor()])
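
The size convention above can be illustrated by computing the crop box directly. This is a minimal sketch: `center_crop_box` is a hypothetical helper, not part of py_transforms, and the integer rounding used here is an assumption that may differ from the library's.

```python
def center_crop_box(img_height, img_width, size):
    """Return (top, left, height, width) of a central crop.

    Hypothetical helper for illustration; not part of py_transforms.
    """
    # An int means a square crop; a sequence is (height, width).
    if isinstance(size, int):
        crop_h, crop_w = size, size
    else:
        crop_h, crop_w = size
    # Assumption: offsets are rounded down when the margin is odd.
    top = (img_height - crop_h) // 2
    left = (img_width - crop_w) // 2
    return top, left, crop_h, crop_w


box_square = center_crop_box(100, 200, 64)
box_rect = center_crop_box(100, 200, (50, 80))
```
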
class mindspore.dataset.transforms.vision.py_transforms.ComposeOp(transforms)[source]

Compose a list of transforms.

Note

ComposeOp takes a list of transformations, either provided in py_transforms or user-defined. Each can be an initialized transformation class or a lambda function, as long as the output of the last transformation is a single tensor of type numpy.ndarray. See below for an example of using ComposeOp with py_transforms classes, and see FiveCrop or TenCrop for examples that combine them with lambda functions.

Parameters

transforms (list) – List of transformations to be applied.

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.transforms.vision.py_transforms as py_transforms
>>> dataset_dir = "path/to/imagefolder_directory"
>>> # create a dataset that reads all files in dataset_dir with 8 threads
>>> dataset = ds.ImageFolderDatasetV2(dataset_dir, num_parallel_workers=8)
>>> # create a list of transformations to be applied to the image data
>>> transform = py_transforms.ComposeOp([py_transforms.Decode(),
>>>                                      py_transforms.RandomHorizontalFlip(0.5),
>>>                                      py_transforms.ToTensor(),
>>>                                      py_transforms.Normalize((0.491, 0.482, 0.447), (0.247, 0.243, 0.262)),
>>>                                      py_transforms.RandomErasing()])
>>> # apply the transform to the dataset through dataset.map()
>>> dataset = dataset.map(input_columns="image", operations=transform())
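
The idea of chaining initialized classes and lambda functions can be sketched without the library. The `compose` function and the `Scale` class below are hypothetical stand-ins written for illustration; they are not the ComposeOp implementation.

```python
def compose(transforms):
    """Return a function that applies each transform in order (illustrative only)."""
    def apply(img):
        for transform in transforms:
            img = transform(img)
        return img
    return apply


class Scale:
    """Hypothetical stand-in for an initialized transformation class."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [v * self.factor for v in values]


# A class instance and a lambda can be mixed freely in the same list.
pipeline = compose([Scale(2), lambda values: [v + 1 for v in values]])
result = pipeline([1, 2, 3])
```
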
class mindspore.dataset.transforms.vision.py_transforms.Cutout(length, num_patches=1)[source]

Randomly cut (mask) out a given number of square patches from the input Numpy image array.

Terrance DeVries and Graham W. Taylor ‘Improved Regularization of Convolutional Neural Networks with Cutout’ 2017 See https://arxiv.org/pdf/1708.04552.pdf

Parameters
  • length (int) – The side length of each square patch.

  • num_patches (int, optional) – Number of patches to be cut out of an image (default=1).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.ToTensor(),
>>>                          py_transforms.Cutout(80)])
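
A minimal Numpy sketch of the masking step follows. It assumes a (C, H, W) input, patch centres drawn uniformly over the image, and patches clipped at the border; these details and the `cutout` function itself are assumptions for illustration, not the library's code.

```python
import numpy as np


def cutout(img, length, num_patches=1, rng=None):
    """Zero out square patches in a (C, H, W) array (illustrative sketch)."""
    rng = rng or np.random.default_rng()
    out = img.copy()
    _, height, width = out.shape
    for _ in range(num_patches):
        # Assumption: the patch centre is uniform over the image, and the
        # patch is clipped where it crosses the border.
        cy = int(rng.integers(0, height))
        cx = int(rng.integers(0, width))
        top, bottom = max(0, cy - length // 2), min(height, cy + length // 2)
        left, right = max(0, cx - length // 2), min(width, cx + length // 2)
        out[:, top:bottom, left:right] = 0.0
    return out


image = np.ones((3, 32, 32), dtype=np.float32)
masked = cutout(image, length=8, rng=np.random.default_rng(0))
```
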
class mindspore.dataset.transforms.vision.py_transforms.Decode[source]

Decode the input image to PIL Image format in RGB mode.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomHorizontalFlip(0.5),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.Equalize[source]

Equalize the histogram of the input PIL image.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.Equalize(),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.FiveCrop(size)[source]

Generate 5 cropped images (one central and four corners).

Parameters

size (int or sequence) – The output size of the crop. If size is an int, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.FiveCrop(size),
>>>                          # 4D stack of 5 images
>>>                          lambda images: numpy.stack([py_transforms.ToTensor()(image) for image in images])])
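
The five crop regions can be written out as boxes. `five_crop_boxes` is a hypothetical helper, and the ordering of the returned boxes is an assumption made for this sketch.

```python
def five_crop_boxes(img_height, img_width, size):
    """Return five (top, left, height, width) boxes: four corners, then centre.

    Hypothetical helper; the ordering is an assumption.
    """
    crop_h, crop_w = (size, size) if isinstance(size, int) else size
    bottom = img_height - crop_h
    right = img_width - crop_w
    return [
        (0, 0, crop_h, crop_w),                     # top-left
        (0, right, crop_h, crop_w),                 # top-right
        (bottom, 0, crop_h, crop_w),                # bottom-left
        (bottom, right, crop_h, crop_w),            # bottom-right
        (bottom // 2, right // 2, crop_h, crop_w),  # centre
    ]


boxes = five_crop_boxes(100, 120, 40)
```
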
class mindspore.dataset.transforms.vision.py_transforms.Grayscale(num_output_channels=1)[source]

Convert the input PIL image to a grayscale image.

Parameters

num_output_channels (int) – Number of channels of the output grayscale image (1 or 3). Default is 1. If set to 3, the returned image has 3 identical RGB channels.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.Grayscale(3),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.HWC2CHW[source]

Transpose a Numpy image array from shape (H, W, C) to shape (C, H, W).
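
In plain Numpy the same axis reordering is a single transpose call. This is a sketch of the operation, not the class itself.

```python
import numpy as np

hwc = np.arange(2 * 3 * 4).reshape(2, 3, 4)  # H=2, W=3, C=4
chw = hwc.transpose(2, 0, 1)                 # move the channel axis first
```
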

class mindspore.dataset.transforms.vision.py_transforms.HsvToRgb(is_hwc=False)[source]

Convert a Numpy HSV image, or a batch of Numpy HSV images, to RGB.

Parameters

is_hwc (bool) – Whether the input is channel-last: True for shape (H, W, C) or (N, H, W, C), False for shape (C, H, W) or (N, C, H, W) (default=False).
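
For a single pixel, the HSV to RGB mapping can be checked with Python's standard colorsys module. This sketch assumes channel values in [0, 1], which is the convention colorsys uses; whether the class expects the same range is an assumption.

```python
import colorsys

# Hue 0 with full saturation and value is pure red.
red = colorsys.hsv_to_rgb(0.0, 1.0, 1.0)
# Hue 0.5 is cyan.
cyan = colorsys.hsv_to_rgb(0.5, 1.0, 1.0)
# Converting back recovers the original HSV triple.
round_trip = colorsys.rgb_to_hsv(*red)
```
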

class mindspore.dataset.transforms.vision.py_transforms.Invert[source]

Invert the colors of the input PIL image.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.Invert(),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.LinearTransformation(transformation_matrix, mean_vector)[source]

Apply linear transformation to the input Numpy image array, given a square transformation matrix and a mean_vector.

The transformation first flattens the input array and subtracts mean_vector from it, then computes the dot product with the transformation matrix, and finally reshapes the result back to the original shape.

Parameters
  • transformation_matrix (numpy.ndarray) – a square transformation matrix of shape (D, D), D = C x H x W.

  • mean_vector (numpy.ndarray) – a numpy ndarray of shape (D,) where D = C x H x W.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.Resize(256),
>>>                          py_transforms.ToTensor(),
>>>                          py_transforms.LinearTransformation(transformation_matrix, mean_vector)])
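
The flatten, subtract, project, and reshape steps described above can be sketched in Numpy. The function below is illustrative only; in particular, multiplying the flattened row vector on the left of the matrix is an assumption about the order of the dot product.

```python
import numpy as np


def linear_transformation(img, transformation_matrix, mean_vector):
    """Flatten, centre, project, and reshape (sketch of the steps above)."""
    flat = img.reshape(-1) - mean_vector       # shape (D,)
    # Assumption: the flattened vector multiplies the matrix from the left.
    projected = flat @ transformation_matrix   # shape (D,)
    return projected.reshape(img.shape)


img = np.arange(12, dtype=np.float64).reshape(3, 2, 2)  # C=3, H=2, W=2, D=12
identity = np.eye(12)
mean = np.full(12, 1.0)
out = linear_transformation(img, identity, mean)
```

With an identity matrix the result is simply the input minus the mean vector, which makes the sketch easy to check by hand.
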
class mindspore.dataset.transforms.vision.py_transforms.MixUp(batch_size, alpha, is_single=True)[source]

Apply the mixup transformation to the input image and label, combining each input sample with others.

Parameters
  • batch_size (int) – the batch size of dataset.

  • alpha (float) – the mix up rate.

  • is_single (bool) – Whether to use the single-batch or the multi-batch mixup transformation (default=True).

class mindspore.dataset.transforms.vision.py_transforms.Normalize(mean, std)[source]

Normalize the input Numpy image array of shape (C, H, W) with the given mean and standard deviation.

The values of the array need to be in range [0.0, 1.0].

Parameters
  • mean (sequence) – List or tuple of mean values for each channel, w.r.t. channel order.

  • std (sequence) – List or tuple of standard deviations for each channel, w.r.t. channel order.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomHorizontalFlip(0.5),
>>>                          py_transforms.ToTensor(),
>>>                          py_transforms.Normalize((0.491, 0.482, 0.447), (0.247, 0.243, 0.262))])
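
Per-channel normalization of a (C, H, W) array amounts to a broadcasted subtraction and division. The `normalize` function is a sketch of that arithmetic, not the library's implementation.

```python
import numpy as np


def normalize(img, mean, std):
    """Normalize a (C, H, W) array channel by channel (illustrative sketch)."""
    # Reshape to (C, 1, 1) so each value broadcasts over its own channel.
    mean = np.asarray(mean, dtype=img.dtype).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=img.dtype).reshape(-1, 1, 1)
    return (img - mean) / std


img = np.full((3, 2, 2), 0.5, dtype=np.float32)
out = normalize(img, (0.5, 0.25, 0.0), (0.25, 0.5, 1.0))
```
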
class mindspore.dataset.transforms.vision.py_transforms.Pad(padding, fill_value=0, padding_mode=Border.CONSTANT)[source]

Pad the input PIL image according to padding parameters.

Parameters
  • padding (int or sequence) – The number of pixels to pad the image. If a single number is provided, it pads all borders with this value. If a tuple or list of 2 values is provided, it pads the left and top with the first value and the right and bottom with the second value. If 4 values are provided as a list or tuple, it pads the left, top, right and bottom respectively.

  • fill_value (int or tuple, optional) – Filling value (default=0). The pixel intensity of the borders if the padding_mode is Border.CONSTANT. If it is a 3-tuple, it is used to fill R, G, B channels respectively.

  • padding_mode (Border mode, optional) –

    The method of padding (default=Border.CONSTANT). Can be any of [Border.CONSTANT, Border.EDGE, Border.REFLECT, Border.SYMMETRIC].

    • Border.CONSTANT, means it fills the border with constant values.

    • Border.EDGE, means it pads with the last value on the edge.

    • Border.REFLECT, means it reflects the values on the edge omitting the last value of edge.

    • Border.SYMMETRIC, means it reflects the values on the edge repeating the last value of edge.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          # adds 10 pixels (default black) to each side of the border of the image
>>>                          py_transforms.Pad(padding=10),
>>>                          py_transforms.ToTensor()])
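
The four border modes are easiest to compare on a single row of pixels. The sketch below uses numpy.pad, whose modes of the same names match the descriptions above; treating them as equivalent to the Border modes is an assumption.

```python
import numpy as np

row = np.array([1, 2, 3, 4])

constant = np.pad(row, 2, mode="constant", constant_values=0)  # fixed fill value
edge = np.pad(row, 2, mode="edge")            # repeat the last value on the edge
reflect = np.pad(row, 2, mode="reflect")      # mirror, omitting the edge value
symmetric = np.pad(row, 2, mode="symmetric")  # mirror, repeating the edge value
```
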
class mindspore.dataset.transforms.vision.py_transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=Inter.NEAREST, fill_value=0)[source]

Apply a random affine transformation to the input PIL image.

Parameters
  • degrees (int or float or sequence) – Range of the rotation degrees. If degrees is a number, the range will be (-degrees, degrees). If degrees is a sequence, it should be (min, max).

  • translate (sequence, optional) – Sequence (tx, ty) of maximum translation in x (horizontal) and y (vertical) directions (default=None). The horizontal and vertical shifts are selected randomly from the ranges (-tx*width, tx*width) and (-ty*height, ty*height), respectively. If None, no translation is applied.

  • scale (sequence, optional) – Scaling factor interval (default=None, original scale is used).

  • shear (int or float or sequence, optional) – Range of shear factor (default=None). If a number ‘shear’, then a shear parallel to the x axis in the range of (-shear, +shear) is applied. If a tuple or list of size 2, then a shear parallel to the x axis in the range of (shear[0], shear[1]) is applied. If a tuple or list of size 4, then a shear parallel to the x axis in the range of (shear[0], shear[1]) and a shear parallel to the y axis in the range of (shear[2], shear[3]) are applied. If None, no shear is applied.

  • resample (Inter mode, optional) –

    An optional resampling filter (default=Inter.NEAREST). If omitted, or if the image has mode “1” or “P”, it is set to be Inter.NEAREST. It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means resample method is bilinear interpolation.

    • Inter.NEAREST, means resample method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means resample method is bicubic interpolation.

  • fill_value (tuple or int, optional) – Optional fill_value to fill the area outside the transform in the output image. Used only in Pillow versions > 5.0.0 (default=0, filling is performed).

Raises
  • ValueError – If degrees is negative.

  • ValueError – If translation value is not between 0 and 1.

  • ValueError – If scale is not positive.

  • ValueError – If shear is a number but is not positive.

  • TypeError – If degrees is not a number, list, or tuple, or if it is a list or tuple whose length is not 2.

  • TypeError – If translate is specified but is not a list or tuple of length 2.

  • TypeError – If scale is not a list or tuple of length 2.

  • TypeError – If shear is not a list or tuple of length 2 or 4.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1)),
>>>                          py_transforms.ToTensor()])
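
How the degrees and translate arguments turn into sampling ranges can be sketched as follows. `sample_affine_params` is a hypothetical helper; uniform sampling without rounding is an assumption, and scale and shear are left out for brevity.

```python
import random


def sample_affine_params(degrees, translate, img_width, img_height, rng=random):
    """Sample a rotation angle and a pixel shift (illustrative sketch)."""
    # A single number d means the range (-d, d).
    if isinstance(degrees, (int, float)):
        degrees = (-degrees, degrees)
    angle = rng.uniform(degrees[0], degrees[1])
    if translate is None:
        shift = (0.0, 0.0)
    else:
        tx, ty = translate
        shift = (rng.uniform(-tx * img_width, tx * img_width),
                 rng.uniform(-ty * img_height, ty * img_height))
    return angle, shift


angle, (dx, dy) = sample_affine_params(15, (0.1, 0.1), 200, 100, random.Random(0))
```
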
class mindspore.dataset.transforms.vision.py_transforms.RandomApply(transforms, prob=0.5)[source]

Randomly perform a series of transforms with a given probability.

Parameters
  • transforms (list) – List of transformations to be applied.

  • prob (float, optional) – The probability to apply the transformation list (default=0.5).

Examples

>>> transforms_list = [py_transforms.CenterCrop(64), py_transforms.RandomRotation(30)]
>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomApply(transforms_list, prob=0.6),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomChoice(transforms)[source]

Randomly select one transform from a series of transforms and apply it to the image.

Parameters

transforms (list) – List of transformations to be chosen from to apply.

Examples

>>> transforms_list = [py_transforms.CenterCrop(64), py_transforms.RandomRotation(30)]
>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomChoice(transforms_list),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomColor(degrees=(0.1, 1.9))[source]

Adjust the color of the input PIL image by a random degree.

Parameters

degrees (sequence) – Range of random color adjustment degrees. It should be in (min, max) format (default=(0.1,1.9)).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomColor((0.5, 1.5)),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomColorAdjust(brightness=(1, 1), contrast=(1, 1), saturation=(1, 1), hue=(0, 0))[source]

Perform a random brightness, contrast, saturation, and hue adjustment on the input PIL image.

Parameters
  • brightness (float or tuple, optional) – Brightness adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-brightness), 1+brightness]. If it is a sequence, it should be [min, max] for the range.

  • contrast (float or tuple, optional) – Contrast adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-contrast), 1+contrast]. If it is a sequence, it should be [min, max] for the range.

  • saturation (float or tuple, optional) – Saturation adjustment factor (default=(1, 1)). Cannot be negative. If it is a float, the factor is uniformly chosen from the range [max(0, 1-saturation), 1+saturation]. If it is a sequence, it should be [min, max] for the range.

  • hue (float or tuple, optional) – Hue adjustment factor (default=(0, 0)). If it is a float, the range will be [-hue, hue]. Value should be 0 <= hue <= 0.5. If it is a sequence, it should be [min, max] where -0.5 <= min <= max <= 0.5.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomColorAdjust(0.4, 0.4, 0.4, 0.1),
>>>                          py_transforms.ToTensor()])
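
The rules for turning a float or a sequence into a sampling range can be written down directly. `factor_range` and `hue_range` are hypothetical helpers that follow the parameter descriptions above; they are not library functions.

```python
import random


def factor_range(value):
    """Range for brightness, contrast, or saturation (hypothetical helper)."""
    if isinstance(value, (int, float)):
        return (max(0, 1 - value), 1 + value)
    return tuple(value)


def hue_range(value):
    """Range for hue (hypothetical helper)."""
    if isinstance(value, (int, float)):
        return (-value, value)
    return tuple(value)


# Draw one brightness factor for the argument 0.4 used in the example above.
low, high = factor_range(0.4)
brightness_factor = random.uniform(low, high)
```
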
class mindspore.dataset.transforms.vision.py_transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill_value=0, padding_mode=Border.CONSTANT)[source]

Crop the input PIL Image at a random location.

Parameters
  • size (int or sequence) – The output size of the cropped image. If size is an int, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • padding (int or sequence, optional) – The number of pixels to pad the image (default=None). If padding is not None, the image is first padded with the given values. If a single number is provided, it pads all borders with this value. If a tuple or list of 2 values is provided, it pads the left and top with the first value and the right and bottom with the second value. If 4 values are provided as a list or tuple, it pads the left, top, right and bottom respectively.

  • pad_if_needed (bool, optional) – Pad the image if either side is smaller than the given output size (default=False).

  • fill_value (int or tuple, optional) – Filling value (default=0). The pixel intensity of the borders if the padding_mode is Border.CONSTANT. If it is a 3-tuple, it is used to fill R, G, B channels respectively.

  • padding_mode (Border mode, optional) –

    The method of padding (default=Border.CONSTANT). Can be any of [Border.CONSTANT, Border.EDGE, Border.REFLECT, Border.SYMMETRIC].

    • Border.CONSTANT, means it fills the border with constant values.

    • Border.EDGE, means it pads with the last value on the edge.

    • Border.REFLECT, means it reflects the values on the edge omitting the last value of edge.

    • Border.SYMMETRIC, means it reflects the values on the edge repeating the last value of edge.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomCrop(224),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomErasing(prob=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False, max_attempts=10)[source]

Erase the pixels, within a selected rectangle region, to the given value.

Randomly applied on the input Numpy image array with a given probability.

Zhun Zhong et al. ‘Random Erasing Data Augmentation’ 2017 See https://arxiv.org/pdf/1708.04896.pdf

Parameters
  • prob (float, optional) – Probability of applying RandomErasing (default=0.5).

  • scale (sequence of floats, optional) – Range of the relative erase area to the original image (default=(0.02, 0.33)).

  • ratio (sequence of floats, optional) – Range of the aspect ratio of the erase area (default=(0.3, 3.3)).

  • value (int, sequence or str, optional) – Erasing value (default=0). If value is a single int, it is applied to all pixels to be erased. If value is a sequence of length 3, it is applied to R, G, B channels respectively. If value is the str ‘random’, the erase value will be obtained from a standard normal distribution.

  • inplace (bool, optional) – Apply this transform inplace (default=False).

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid erase_area (default=10). If exceeded, return the original image.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.ToTensor(),
>>>                          py_transforms.RandomErasing(value='random')])
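
The roles of scale, ratio, and max_attempts can be sketched as a rejection-sampling loop. `sample_erase_region` is a hypothetical helper, and drawing the aspect ratio uniformly is an assumption made for this sketch.

```python
import math
import random


def sample_erase_region(height, width, scale=(0.02, 0.33), ratio=(0.3, 3.3),
                        max_attempts=10, rng=random):
    """Return (top, left, h, w) of a region to erase, or None if none fits."""
    for _ in range(max_attempts):
        # Target area as a fraction of the image, then a target aspect ratio.
        area = rng.uniform(scale[0], scale[1]) * height * width
        aspect = rng.uniform(ratio[0], ratio[1])
        h = int(round(math.sqrt(area * aspect)))
        w = int(round(math.sqrt(area / aspect)))
        if 0 < h < height and 0 < w < width:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            return top, left, h, w
    return None  # the caller keeps the original image


region = sample_erase_region(64, 64, rng=random.Random(0))
```
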
class mindspore.dataset.transforms.vision.py_transforms.RandomGrayscale(prob=0.1)[source]

Randomly convert the input image into a grayscale image with a given probability.

Parameters

prob (float, optional) – Probability of the image being converted to grayscale (default=0.1).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomGrayscale(0.3),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomHorizontalFlip(prob=0.5)[source]

Randomly flip the input image horizontally with a given probability.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomHorizontalFlip(0.5),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomOrder(transforms)[source]

Perform a series of transforms on the input PIL image in a random order.

Parameters

transforms (list) – List of the transformations to be applied.

Examples

>>> transforms_list = [py_transforms.CenterCrop(64), py_transforms.RandomRotation(30)]
>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomOrder(transforms_list),
>>>                          py_transforms.ToTensor()])
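
RandomApply, RandomChoice, and RandomOrder differ only in how they pick from the list of transforms. The three functions below are a plain-Python sketch of that difference under the descriptions given in this page; they are not the library classes.

```python
import random


def random_apply(img, transforms, prob=0.5, rng=random):
    """Apply the whole list with probability prob, else return img unchanged."""
    if rng.random() < prob:
        for transform in transforms:
            img = transform(img)
    return img


def random_choice(img, transforms, rng=random):
    """Apply exactly one transform, picked uniformly at random."""
    return rng.choice(transforms)(img)


def random_order(img, transforms, rng=random):
    """Apply every transform once, in a shuffled order."""
    shuffled = list(transforms)
    rng.shuffle(shuffled)
    for transform in shuffled:
        img = transform(img)
    return img


def add_one(x):
    return x + 1


def double(x):
    return x * 2
```

With the toy transforms `add_one` and `double` applied to 3, `random_order` yields 8 or 7 depending on the order, while `random_choice` yields 4 or 6.
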
class mindspore.dataset.transforms.vision.py_transforms.RandomPerspective(distortion_scale=0.5, prob=0.5, interpolation=Inter.BICUBIC)[source]

Randomly apply perspective transformation to the input PIL Image with a given probability.

Parameters
  • distortion_scale (float, optional) – The scale of distortion, float between 0 and 1 (default=0.5).

  • prob (float, optional) – Probability of applying the perspective transformation to the image (default=0.5).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BICUBIC). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomPerspective(prob=0.1),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=Inter.BILINEAR, max_attempts=10)[source]

Crop a region of random size and aspect ratio from the input image and resize it to the given size.

Parameters
  • size (int or sequence) – The size of the output image. If size is an int, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • scale (tuple, optional) – Range (min, max) of the size of the crop relative to the size of the original image (default=(0.08, 1.0)).

  • ratio (tuple, optional) – Range (min, max) of aspect ratio to be cropped (default=(3. / 4., 4. / 3.)).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BILINEAR). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

  • max_attempts (int, optional) – The maximum number of attempts to propose a valid crop_area (default=10). If exceeded, fall back to a center crop instead.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomResizedCrop(224),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomRotation(degrees, resample=Inter.NEAREST, expand=False, center=None, fill_value=0)[source]

Rotate the input PIL image by a random angle.

Parameters
  • degrees (int or float or sequence) – Range of random rotation degrees. If degrees is a number, the range will be converted to (-degrees, degrees). If degrees is a sequence, it should be (min, max).

  • resample (Inter mode, optional) –

    An optional resampling filter (default=Inter.NEAREST). If omitted, or if the image has mode “1” or “P”, it is set to be Inter.NEAREST. It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means resampling method is bilinear interpolation.

    • Inter.NEAREST, means resampling method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means resampling method is bicubic interpolation.

  • expand (bool, optional) – Optional expansion flag (default=False). If set to True, expand the output image to make it large enough to hold the entire rotated image. If set to False or omitted, make the output image the same size as the input. Note that the expand flag assumes rotation around the center and no translation.

  • center (tuple, optional) – Optional center of rotation (a 2-tuple) (default=None). Origin is the top left corner. Default None sets to the center of the image.

  • fill_value (int or tuple, optional) – Optional fill color for the area outside the rotated image (default=0). If it is a 3-tuple, it is used for R, G, B channels respectively. If it is an int, it is used for all RGB channels.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomRotation(30),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomSharpness(degrees=(0.1, 1.9))[source]

Adjust the sharpness of the input PIL image by a random degree.

Parameters

degrees (sequence) – Range of random sharpness adjustment degrees. It should be in (min, max) format (default=(0.1,1.9)).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomSharpness((0.5, 1.5)),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.RandomVerticalFlip(prob=0.5)[source]

Randomly flip the input image vertically with a given probability.

Parameters

prob (float, optional) – Probability of the image being flipped (default=0.5).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomVerticalFlip(0.5),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.Resize(size, interpolation=Inter.BILINEAR)[source]

Resize the input PIL Image to the given size.

Parameters
  • size (int or sequence) – The output size of the resized image. If size is an int, smaller edge of the image will be resized to this value with the same image aspect ratio. If size is a sequence of length 2, it should be (height, width).

  • interpolation (Inter mode, optional) –

    Image interpolation mode (default=Inter.BILINEAR). It can be any of [Inter.BILINEAR, Inter.NEAREST, Inter.BICUBIC].

    • Inter.BILINEAR, means interpolation method is bilinear interpolation.

    • Inter.NEAREST, means interpolation method is nearest-neighbor interpolation.

    • Inter.BICUBIC, means interpolation method is bicubic interpolation.

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.Resize(256),
>>>                          py_transforms.ToTensor()])
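
The difference between an int size and a (height, width) size can be made concrete by computing the output shape. `resized_shape` is a hypothetical helper, and truncating the scaled edge to an integer is an assumption.

```python
def resized_shape(img_height, img_width, size):
    """Return the (height, width) after resizing (hypothetical helper)."""
    if not isinstance(size, int):
        return tuple(size)
    # The smaller edge becomes `size`; the other edge keeps the aspect ratio.
    if img_height <= img_width:
        return size, int(size * img_width / img_height)
    return int(size * img_height / img_width), size


landscape = resized_shape(480, 640, 256)
portrait = resized_shape(640, 480, 256)
explicit = resized_shape(480, 640, (100, 50))
```
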
class mindspore.dataset.transforms.vision.py_transforms.RgbToHsv(is_hwc=False)[source]

Convert a Numpy RGB image, or a batch of Numpy RGB images, to HSV.

Parameters

is_hwc (bool) – Whether the input is channel-last: True for shape (H, W, C) or (N, H, W, C), False for shape (C, H, W) or (N, C, H, W) (default=False).

class mindspore.dataset.transforms.vision.py_transforms.TenCrop(size, use_vertical_flip=False)[source]

Generate 10 cropped images (the 5 crops produced by FiveCrop, followed by the flipped version of each).

Parameters
  • size (int or sequence) – The output size of the crop. If size is an int, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).

  • use_vertical_flip (bool, optional) – Flip the image vertically instead of horizontally if set to True (default=False).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.TenCrop(size),
>>>                          # 4D stack of 10 images
>>>                          lambda images: numpy.stack([py_transforms.ToTensor()(image) for image in images])])
class mindspore.dataset.transforms.vision.py_transforms.ToPIL[source]

Convert the input decoded Numpy image array of RGB mode to a PIL Image of RGB mode.

Examples

>>> # data is already decoded, but not in PIL Image format
>>> py_transforms.ComposeOp([py_transforms.ToPIL(),
>>>                          py_transforms.RandomHorizontalFlip(0.5),
>>>                          py_transforms.ToTensor()])
class mindspore.dataset.transforms.vision.py_transforms.ToTensor(output_type=<class 'numpy.float32'>)[source]

Convert the input Numpy image array or PIL image of shape (H,W,C) to a Numpy ndarray of shape (C,H,W).

Note

Pixel values are rescaled from the range [0, 255] to [0.0, 1.0] and cast to output_type (default numpy.float32). The size of the channel dimension is unchanged.

Parameters

output_type (numpy datatype, optional) – The datatype of the numpy output (default=np.float32).

Examples

>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomHorizontalFlip(0.5),
>>>                          py_transforms.ToTensor()])
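
The rescaling and axis change can be sketched in Numpy for an array input. `to_tensor` below is an illustrative function that assumes a uint8 (H, W, C) array; it is not the library's implementation and does not handle PIL images.

```python
import numpy as np


def to_tensor(img_hwc, output_type=np.float32):
    """Rescale a uint8 (H, W, C) array to [0, 1] and move channels first."""
    scaled = np.asarray(img_hwc).astype(output_type) / 255.0
    return scaled.transpose(2, 0, 1)


pixels = np.zeros((2, 4, 3), dtype=np.uint8)  # H=2, W=4, C=3
pixels[0, 0, 1] = 255                         # one fully saturated green value
tensor = to_tensor(pixels)
```
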
class mindspore.dataset.transforms.vision.py_transforms.ToType(output_type)[source]

Convert the input Numpy image array to desired numpy dtype.

Parameters

output_type (numpy datatype) – The datatype of the numpy output. e.g. np.float32.

Examples

>>> import numpy as np
>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.RandomHorizontalFlip(0.5),
>>>                          py_transforms.ToTensor(),
>>>                          py_transforms.ToType(np.float32)])
class mindspore.dataset.transforms.vision.py_transforms.UniformAugment(transforms, num_ops=2)[source]

Uniformly select and apply a number of transforms sequentially from a list of transforms. A probability is randomly assigned to each transform for each image to decide whether to apply it.

Parameters
  • transforms (list) – List of transformations to be chosen from to apply.

  • num_ops (int, optional) – Number of transforms to sequentially apply (default=2).

Examples

>>> transforms_list = [py_transforms.CenterCrop(64),
>>>                    py_transforms.RandomColor(),
>>>                    py_transforms.RandomSharpness(),
>>>                    py_transforms.RandomRotation(30)]
>>> py_transforms.ComposeOp([py_transforms.Decode(),
>>>                          py_transforms.UniformAugment(transforms_list),
>>>                          py_transforms.ToTensor()])
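
The selection logic can be sketched in plain Python. The function below assumes the transforms are drawn without replacement and that each one is applied when a second random draw falls below its randomly assigned probability; both points are assumptions, and `uniform_augment` is not the library class.

```python
import random


def uniform_augment(img, transforms, num_ops=2, rng=random):
    """Pick num_ops transforms and apply each with a random probability."""
    # Assumption: transforms are picked uniformly, without replacement.
    chosen = rng.sample(transforms, num_ops)
    for transform in chosen:
        # Assumption: a fresh probability is drawn for every transform.
        probability = rng.random()
        if rng.random() < probability:
            img = transform(img)
    return img


# Toy transforms that record their own name, so the outcome is easy to inspect.
tags = [lambda seq, t=t: seq + [t]
        for t in ("crop", "color", "sharpness", "rotate")]
applied = uniform_augment([], tags, num_ops=2, rng=random.Random(0))
```
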