mindspore.dataset.transforms

mindspore.dataset.transforms.c_transforms

The module transforms.c_transforms provides common operations, including OneHot and TypeCast.

class mindspore.dataset.transforms.c_transforms.Compose(transforms)[source]

Compose a list of transforms into a single transform.

Parameters

transforms (list) – List of transformations to be applied.

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # RandomCrop requires a size argument; 512 is an illustrative value
>>> compose = c_transforms.Compose([c_vision.Decode(), c_vision.RandomCrop(512)])
>>> data1 = data1.map(operations=compose)
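
The idea of composing a list of transforms into a single transform can be sketched in plain Python. This is a minimal illustration, not the library implementation: the helper name compose and the two stand-in transforms are hypothetical, and real pipelines would pass dataset operations instead.

```python
from functools import reduce


def compose(transforms):
    """Return a callable that applies each transform in list order (hypothetical helper)."""
    def apply(value):
        # Feed the output of each transform into the next one.
        return reduce(lambda acc, fn: fn(acc), transforms, value)
    return apply


# Two stand-in transforms used only for illustration.
double = lambda x: x * 2
add_one = lambda x: x + 1

pipeline = compose([double, add_one])
result = pipeline(5)  # (5 * 2) + 1
print(result)  # 11
```

The order of the list matters: swapping the two stand-in transforms would give (5 + 1) * 2 instead.
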
class mindspore.dataset.transforms.c_transforms.Concatenate(axis=0, prepend=None, append=None)[source]

Tensor operation that concatenates all columns into a single tensor.

Parameters
  • axis (int, optional) – Concatenate the tensors along given axis (Default=0).

  • prepend (numpy.array, optional) – NumPy array to be prepended to the already concatenated tensors (Default=None).

  • append (numpy.array, optional) – NumPy array to be appended to the already concatenated tensors (Default=None).

Examples

>>> import numpy as np
>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>>
>>> # concatenate string
>>> prepend_tensor = np.array(["dw", "df"], dtype='S')
>>> append_tensor = np.array(["dwsdf", "df"], dtype='S')
>>> concatenate_op = c_transforms.Concatenate(0, prepend_tensor, append_tensor)
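
The ordering described above (prepend, then the concatenated columns, then append) can be illustrated with a small pure-Python sketch for rank-1 data along axis 0. The helper name concatenate_columns is hypothetical and plain lists stand in for tensors; it only mimics the documented behaviour under those assumptions.

```python
def concatenate_columns(columns, prepend=None, append=None):
    """Join rank-1 columns end to end, then add optional prepend/append values.

    Hypothetical helper: plain lists stand in for tensors, and axis=0 is assumed.
    """
    merged = []
    for column in columns:
        merged.extend(column)
    return list(prepend or []) + merged + list(append or [])


row_columns = [[b"This", b"is"], [b"a", b"string"]]
result = concatenate_columns(row_columns,
                             prepend=[b"dw", b"df"],
                             append=[b"dwsdf", b"df"])
print(result)
```
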
class mindspore.dataset.transforms.c_transforms.Duplicate[source]

Duplicate the input tensor to a new output tensor. The input tensor is carried over to the output list.

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>>
>>> # Data before
>>> # |  x      |
>>> # +---------+
>>> # | [1,2,3] |
>>> # +---------+
>>> data1 = data1.map(operations=c_transforms.Duplicate(), input_columns=["x"],
...                   output_columns=["x", "y"], column_order=["x", "y"])
>>> # Data after
>>> # |  x      |  y      |
>>> # +---------+---------+
>>> # | [1,2,3] | [1,2,3] |
>>> # +---------+---------+
class mindspore.dataset.transforms.c_transforms.Fill(fill_value)[source]

Tensor operation to create a tensor filled with input scalar value. The output tensor will have the same shape and type as the input tensor.

Parameters

fill_value (Union[str, bytes, int, float, bool]) – Scalar value used to fill the created tensor.

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>>
>>> fill_op = c_transforms.Fill(3)
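
To make the "same shape and type as the input" statement concrete, here is a pure-Python sketch that uses nested lists in place of tensors. The helper fill_like is hypothetical, and casting the fill value with the element's Python type is an assumption made for illustration.

```python
def fill_like(tensor, fill_value):
    """Return a structure shaped like `tensor`, with every element set to fill_value.

    Hypothetical helper: nested lists stand in for tensors, and the fill value is
    cast to the Python type of the element it replaces.
    """
    if isinstance(tensor, list):
        return [fill_like(item, fill_value) for item in tensor]
    return type(tensor)(fill_value)


filled = fill_like([[1.5, 2.5], [3.5, 4.5]], 3)
print(filled)  # [[3.0, 3.0], [3.0, 3.0]]
```
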
class mindspore.dataset.transforms.c_transforms.Mask(operator, constant, dtype=mindspore.bool)[source]

Mask content of the input tensor with the given predicate. Any element of the tensor that matches the predicate will be evaluated to True, otherwise False.

Parameters
  • operator (Relational) – One of the relational operators EQ, NE, LT, GT, LE or GE.

  • constant (Union[str, int, float, bool]) – Constant to be compared to. Constant will be cast to the type of the input tensor.

  • dtype (mindspore.dtype, optional) – Type of the generated mask (Default to bool).

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>>
>>> # Data before
>>> # |  col1   |
>>> # +---------+
>>> # | [1,2,3] |
>>> # +---------+
>>> data1 = data1.map(operations=c_transforms.Mask(c_transforms.Relational.EQ, 2))
>>> # Data after
>>> # |       col1         |
>>> # +--------------------+
>>> # | [False,True,False] |
>>> # +--------------------+
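
The predicate evaluation can be sketched with the standard operator module. The RELATIONAL table and the mask helper below are hypothetical stand-ins for the Relational enumeration and the Mask operation; they reproduce the data shown in the example above for a rank-1 list.

```python
import operator

# Hypothetical stand-in for the Relational enumeration, keyed by operator name.
RELATIONAL = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "LT": operator.lt,
    "GT": operator.gt,
    "LE": operator.le,
    "GE": operator.ge,
}


def mask(tensor, op_name, constant):
    """Compare every element with `constant` and return a list of booleans."""
    compare = RELATIONAL[op_name]
    # The constant is cast to the element type before comparing, as described above.
    return [compare(element, type(element)(constant)) for element in tensor]


result = mask([1, 2, 3], "EQ", 2)
print(result)  # [False, True, False]
```
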
class mindspore.dataset.transforms.c_transforms.OneHot(num_classes)[source]

Tensor operation to apply one hot encoding.

Parameters

num_classes (int) – Number of classes of the label. It should be larger than the largest label number in the dataset.

Raises

RuntimeError – If the feature size is bigger than num_classes.

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> onehot_op = c_transforms.OneHot(num_classes=10)
>>> data1 = data1.map(operations=onehot_op, input_columns=["label"])
>>> mixup_batch_op = c_vision.MixUpBatch(alpha=0.8)
>>> data1 = data1.batch(4)
>>> data1 = data1.map(operations=mixup_batch_op, input_columns=["image", "label"])
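
For readers unfamiliar with one-hot encoding, the following pure-Python sketch shows what the encoding produces for a single integer label. The helper one_hot is hypothetical, and raising ValueError for an out-of-range label is a choice made for this sketch rather than the library's behaviour.

```python
def one_hot(label, num_classes):
    """Return a list of length num_classes with a 1 at position `label` (hypothetical helper)."""
    if not 0 <= label < num_classes:
        # num_classes must be larger than the largest label.
        raise ValueError("label must be in the range [0, num_classes)")
    encoded = [0] * num_classes
    encoded[label] = 1
    return encoded


result = one_hot(3, num_classes=5)
print(result)  # [0, 0, 0, 1, 0]
```
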
class mindspore.dataset.transforms.c_transforms.PadEnd(pad_shape, pad_value=None)[source]

Pad the input tensor according to pad_shape. The input tensor must have the same rank as pad_shape.

Parameters
  • pad_shape (list(int)) – List of integers representing the required shape. Dimensions set to None will not be padded (i.e., the original dimension will be used). Dimensions shorter than the input will truncate the values.

  • pad_value (Union[str, bytes, int, float, bool], optional) – Value used to pad. Defaults to 0, or to an empty string for tensors of strings.

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>>
>>> # Data before
>>> # |   col   |
>>> # +---------+
>>> # | [1,2,3] |
>>> # +---------+
>>> data1 = data1.map(operations=c_transforms.PadEnd(pad_shape=[4], pad_value=10))
>>> # Data after
>>> # |    col     |
>>> # +------------+
>>> # | [1,2,3,10] |
>>> # +------------+
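
The pad-or-truncate rule can be sketched for the rank-1 case in plain Python. The helper pad_end_rank1 is hypothetical and takes a single target length rather than a full pad_shape list; None keeps the original length, as described for pad_shape above.

```python
def pad_end_rank1(tensor, length, pad_value=0):
    """Pad a rank-1 list at the end, or truncate it, to reach `length` (hypothetical helper)."""
    if length is None:
        # None means: keep the original dimension.
        return list(tensor)
    truncated = list(tensor[:length])
    return truncated + [pad_value] * (length - len(truncated))


padded = pad_end_rank1([1, 2, 3], 4, pad_value=10)
shortened = pad_end_rank1([1, 2, 3], 2)
print(padded)     # [1, 2, 3, 10]
print(shortened)  # [1, 2]
```
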
class mindspore.dataset.transforms.c_transforms.RandomApply(transforms, prob=0.5)[source]

Randomly perform a series of transforms with a given probability.

Parameters
  • transforms (list) – List of transformations to be applied.

  • prob (float, optional) – The probability to apply the transformation list (default=0.5).

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # RandomCrop requires a size argument; 512 is an illustrative value
>>> rand_apply = c_transforms.RandomApply([c_vision.RandomCrop(512)])
>>> data1 = data1.map(operations=rand_apply)
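
The all-or-nothing nature of RandomApply (the whole list is applied, or none of it) can be sketched as follows. The helper random_apply and its rng parameter are hypothetical; the sketch assumes a single random draw per input decides whether the list runs.

```python
import random


def random_apply(transforms, prob=0.5, rng=random):
    """Apply the whole list of transforms with probability `prob` (hypothetical helper)."""
    def apply(value):
        # One draw decides whether every transform runs or none does.
        if rng.random() < prob:
            for transform in transforms:
                value = transform(value)
        return value
    return apply


steps = [lambda x: x + 1, lambda x: x * 10]
always = random_apply(steps, prob=1.0)(1)  # list is always applied: (1 + 1) * 10
never = random_apply(steps, prob=0.0)(1)   # list is never applied
print(always, never)  # 20 1
```
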
class mindspore.dataset.transforms.c_transforms.RandomChoice(transforms)[source]

Randomly select one transform from a list of transforms and apply it.

Parameters

transforms (list) – List of transformations to be chosen from to apply.

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> # CenterCrop and RandomCrop require a size argument; 50 and 512 are illustrative values
>>> rand_choice = c_transforms.RandomChoice([c_vision.CenterCrop(50), c_vision.RandomCrop(512)])
>>> data1 = data1.map(operations=rand_choice)
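
In contrast to RandomApply, RandomChoice runs exactly one transform per input. A minimal pure-Python sketch, with a hypothetical helper random_choice and two stand-in transforms:

```python
import random


def random_choice(transforms, rng=random):
    """Pick one transform at random for each input and apply it (hypothetical helper)."""
    def apply(value):
        return rng.choice(transforms)(value)
    return apply


# Stand-in transforms used only for illustration.
negate = lambda x: -x
square = lambda x: x * x

chooser = random_choice([negate, square], rng=random.Random(0))
outputs = {chooser(3) for _ in range(50)}
print(outputs)  # every output is either -3 or 9
```
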
class mindspore.dataset.transforms.c_transforms.Relational(value)[source]

An enumeration of the relational operators accepted by Mask: EQ, NE, LT, GT, LE and GE.

class mindspore.dataset.transforms.c_transforms.Slice(*slices)[source]

Slice operation that extracts a sub-tensor using the given n slices.

The functionality of Slice is similar to NumPy’s indexing feature (currently only rank-1 tensors are supported).

Parameters

slices (Union[int, list(int), slice, None, Ellipsis]) –

At most n arguments can be given to slice a tensor of rank n. Each object in slices can be one of:

  1. int: Slice this index only. Negative index is supported.

  2. list(int): Slice only the indices in the list. Negative indices are supported.

  3. slice: Slice the generated indices from the slice object. Similar to start:stop:step.

  4. None: Slice the whole dimension. Similar to : in Python indexing.

  5. Ellipsis: Slice all dimensions between the two slices. Similar to ... in Python indexing.

Examples

>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>>
>>> # Data before
>>> # |   col   |
>>> # +---------+
>>> # | [1,2,3] |
>>> # +---------+
>>> data1 = data1.map(operations=c_transforms.Slice(slice(1,3))) # slice indices 1 and 2 only
>>> # Data after
>>> # |   col   |
>>> # +---------+
>>> # |  [2,3]  |
>>> # +---------+
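
The accepted argument kinds listed above can be illustrated on a rank-1 Python list. The helper slice_rank1 is hypothetical; in particular, returning a one-element list for an int argument is a choice made for this sketch, and the library's output shape may differ.

```python
def slice_rank1(tensor, selector):
    """Select elements of a rank-1 list according to one slice argument (hypothetical helper)."""
    if selector is None or selector is Ellipsis:
        # Whole dimension, like tensor[:].
        return list(tensor)
    if isinstance(selector, int):
        # Single index; negative indices count from the end.
        return [tensor[selector]]
    if isinstance(selector, slice):
        # start:stop:step semantics.
        return list(tensor[selector])
    # Otherwise treat the selector as a list of indices.
    return [tensor[index] for index in selector]


data = [10, 20, 30, 40]
print(slice_rank1(data, slice(1, 3)))  # [20, 30]
print(slice_rank1(data, -1))           # [40]
print(slice_rank1(data, [0, -1]))      # [10, 40]
print(slice_rank1(data, None))         # [10, 20, 30, 40]
```
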
class mindspore.dataset.transforms.c_transforms.TypeCast(data_type)[source]

Tensor operation to cast to a given MindSpore data type.

Parameters

data_type (mindspore.dtype) – mindspore.dtype to be cast to.

Examples

>>> import mindspore.common.dtype as mstype
>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>>
>>> type_cast_op = c_transforms.TypeCast(mstype.int32)

mindspore.dataset.transforms.py_transforms

The module transforms.py_transforms is implemented in Python. It provides common operations, including OneHotOp.

class mindspore.dataset.transforms.py_transforms.Compose(transforms)[source]

Compose a list of transforms.

Note

Compose takes a list of transformations, either provided in py_transforms or user-defined. Each can be an initialized transformation class or a lambda function, as long as the output of the last transformation is a single tensor of type numpy.ndarray. See below for an example of using Compose with py_transforms classes, and see FiveCrop or TenCrop for its use together with lambda functions.

Parameters

transforms (list) – List of transformations to be applied.

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> import mindspore.dataset.transforms.py_transforms as py_transforms
>>>
>>> dataset_dir = "path/to/imagefolder_directory"
>>> # create a dataset that reads all files in dataset_dir with 8 threads
>>> dataset = ds.ImageFolderDataset(dataset_dir, num_parallel_workers=8)
>>> # create a list of transformations to be applied to the image data
>>> transform = py_transforms.Compose([py_vision.Decode(),
...                                    py_vision.RandomHorizontalFlip(0.5),
...                                    py_vision.ToTensor(),
...                                    py_vision.Normalize((0.491, 0.482, 0.447), (0.247, 0.243, 0.262)),
...                                    py_vision.RandomErasing()])
>>> # apply the transform to the dataset through dataset.map()
>>> dataset = dataset.map(operations=transform, input_columns="image")
>>>
>>> # Compose is also invoked implicitly when a list of ops is passed in directly
>>> # the above example then becomes:
>>> transform_list = [py_vision.Decode(),
...                   py_vision.RandomHorizontalFlip(0.5),
...                   py_vision.ToTensor(),
...                   py_vision.Normalize((0.491, 0.482, 0.447), (0.247, 0.243, 0.262)),
...                   py_vision.RandomErasing()]
>>>
>>> # apply the transform to the dataset through dataset.map()
>>> dataset = dataset.map(operations=transform_list, input_columns="image")
>>>
>>> # Certain C++ and Python ops can be combined, but not all of them
>>> # An example of combined operations
>>> import numpy as np
>>> import mindspore.dataset as ds
>>> import mindspore.dataset.transforms.c_transforms as c_transforms
>>> import mindspore.dataset.vision.c_transforms as c_vision
>>>
>>> arr = [0, 1]  # illustrative labels (assumed sample data)
>>> data = ds.NumpySlicesDataset(arr, column_names=["cols"], shuffle=False)
>>> transformed_list = [py_transforms.OneHotOp(2), c_transforms.Mask(c_transforms.Relational.EQ, 1)]
>>> data = data.map(operations=transformed_list, input_columns=["cols"])
>>>
>>> # Here is an example of mixing vision ops
>>> data_dir = "/path/to/imagefolder_directory"
>>> data1 = ds.ImageFolderDataset(dataset_dir=data_dir, shuffle=False)
>>> input_columns = ["image"]  # ImageFolderDataset provides the columns "image" and "label"
>>> op_list = [c_vision.Decode(),
...            c_vision.Resize((224, 244)),
...            py_vision.ToPIL(),
...            np.array,  # convert the PIL image to a NumPy array before passing it to a C++ operation
...            c_vision.Resize((24, 24))]
>>> data1 = data1.map(operations=op_list, input_columns=input_columns)
class mindspore.dataset.transforms.py_transforms.OneHotOp(num_classes, smoothing_rate=0.0)[source]

Apply one-hot encoding to the input label, optionally smoothing the label so that it is more continuous.

Parameters
  • num_classes (int) – Number of classes of objects in dataset. Value must be larger than 0.

  • smoothing_rate (float, optional) – Adjustable hyperparameter for label smoothing level. (Default=0.0 means no smoothing is applied.)

Examples

>>> import mindspore.dataset.transforms.py_transforms as py_transforms
>>>
>>> transforms_list = [py_transforms.OneHotOp(num_classes=10, smoothing_rate=0.1)]
>>> transform = py_transforms.Compose(transforms_list)
>>> data1 = data1.map(input_columns=["label"], operations=transform)
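
The effect of smoothing_rate can be illustrated with the common label-smoothing formula, (1 - rate) * one_hot + rate / num_classes. Whether OneHotOp uses exactly this formula is an assumption here; the helper one_hot_smoothed is hypothetical and works on a single integer label.

```python
def one_hot_smoothed(label, num_classes, smoothing_rate=0.0):
    """One-hot encode `label`, then smooth it with the common label-smoothing formula.

    Hypothetical helper; the formula is an assumption, not taken from the library.
    """
    encoded = [0.0] * num_classes
    encoded[label] = 1.0
    return [(1.0 - smoothing_rate) * value + smoothing_rate / num_classes
            for value in encoded]


hard = one_hot_smoothed(1, num_classes=4)  # no smoothing
smoothed = one_hot_smoothed(1, num_classes=4, smoothing_rate=0.1)
print(hard)      # [0.0, 1.0, 0.0, 0.0]
print(smoothed)  # approximately [0.025, 0.925, 0.025, 0.025]
```

With this formula the smoothed values still sum to 1, so the result can be read as a probability distribution over the classes.
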
class mindspore.dataset.transforms.py_transforms.RandomApply(transforms, prob=0.5)[source]

Randomly perform a series of transforms with a given probability.

Parameters
  • transforms (list) – List of transformations to apply.

  • prob (float, optional) – The probability to apply the transformation list (default=0.5).

Examples

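>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose, RandomApply
>>>
>>> # illustrative list of PIL-based transforms
>>> transforms_list = [py_vision.RandomHorizontalFlip(0.5), py_vision.RandomVerticalFlip(0.5)]
>>> Compose([py_vision.Decode(),
...          RandomApply(transforms_list, prob=0.6),
...          py_vision.ToTensor()])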
class mindspore.dataset.transforms.py_transforms.RandomChoice(transforms)[source]

Randomly select one transform from a series of transforms and apply it to the image.

Parameters

transforms (list) – List of transformations to be chosen from to apply.

Examples

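>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose, RandomChoice
>>>
>>> # illustrative list of PIL-based transforms
>>> transforms_list = [py_vision.RandomHorizontalFlip(0.5), py_vision.RandomVerticalFlip(0.5)]
>>> Compose([py_vision.Decode(),
...          RandomChoice(transforms_list),
...          py_vision.ToTensor()])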
class mindspore.dataset.transforms.py_transforms.RandomOrder(transforms)[source]

Perform a series of transforms to the input PIL image in a random order.

Parameters

transforms (list) – List of the transformations to apply.

Examples

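>>> import mindspore.dataset.vision.py_transforms as py_vision
>>> from mindspore.dataset.transforms.py_transforms import Compose, RandomOrder
>>>
>>> # illustrative list of PIL-based transforms
>>> transforms_list = [py_vision.RandomHorizontalFlip(0.5), py_vision.RandomVerticalFlip(0.5)]
>>> Compose([py_vision.Decode(),
...          RandomOrder(transforms_list),
...          py_vision.ToTensor()])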
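
Applying transforms in a random order means every transform runs, but the sequence is shuffled for each input. A minimal pure-Python sketch, using a hypothetical helper random_order and two stand-in transforms whose order visibly changes the result:

```python
import random


def random_order(transforms, rng=random):
    """Apply every transform once, in a freshly shuffled order (hypothetical helper)."""
    def apply(value):
        shuffled = list(transforms)  # copy so the caller's list is not modified
        rng.shuffle(shuffled)
        for transform in shuffled:
            value = transform(value)
        return value
    return apply


# Stand-in transforms used only for illustration.
add_one = lambda x: x + 1
double = lambda x: x * 2

result = random_order([add_one, double], rng=random.Random(0))(3)
print(result)  # 8 if add_one runs first, 7 if double runs first
```
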