# Auto Augmentation

`Linux` `Ascend` `GPU` `CPU` `Data Preparation` `Intermediate` `Expert`

[![View Source On Gitee](../_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.0/tutorials/training/source_en/advanced_use/enable_auto_augmentation.md)

## Overview

Auto Augmentation [1] finds a suitable image augmentation scheme for a specific dataset by searching through a series of image augmentation sub-policies. The `c_transforms` module of MindSpore provides various C++ operators that are used in Auto Augmentation. Users can also customize functions or operators to implement Auto Augmentation. For more details about the MindSpore operators, see the [API document](https://www.mindspore.cn/doc/api_python/en/r1.0/mindspore/mindspore.dataset.vision.html).

The mapping between MindSpore operators and Auto Augmentation operators is as follows:

| Auto Augmentation Operators | MindSpore Operators | Introduction |
| :------: | :------ | ------ |
| shearX | RandomAffine | Horizontal shear |
| shearY | RandomAffine | Vertical shear |
| translateX | RandomAffine | Horizontal translation |
| translateY | RandomAffine | Vertical translation |
| rotate | RandomRotation | Rotational transformation |
| color | RandomColor | Color transformation |
| posterize | RandomPosterize | Decrease the number of color channels |
| solarize | RandomSolarize | Invert all pixels within the specified threshold range |
| contrast | RandomColorAdjust | Contrast adjustment |
| sharpness | RandomSharpness | Sharpness adjustment |
| brightness | RandomColorAdjust | Brightness adjustment |
| autocontrast | AutoContrast | Maximize image contrast |
| equalize | Equalize | Equalize image histogram |
| invert | Invert | Image inversion |

## Auto Augmentation on ImageNet

This tutorial uses the implementation of Auto Augmentation on the ImageNet dataset as an example.

The data augmentation policy for the ImageNet dataset contains 25 sub-policies, and each sub-policy contains two transformations. A sub-policy is randomly selected for each image in a batch, and each transformation in the selected sub-policy is executed with a preset probability. Users can use the `RandomSelectSubpolicy` interface of the `c_transforms` module in MindSpore to implement Auto Augmentation.
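The policy passed to `RandomSelectSubpolicy` is simply a list of sub-policies, where each sub-policy is a list of `(operation, probability)` pairs. The following minimal sketch only illustrates this structure; the two toy sub-policies and the names `toy_policy` and `toy_transform` are illustrative and are not part of the ImageNet policy defined later in this tutorial.

```python
import mindspore.dataset.vision.c_transforms as c_vision

# each inner list is one sub-policy; each pair is (operation, execution probability)
toy_policy = [
    [(c_vision.RandomRotation(degrees=30), 0.8), (c_vision.Invert(), 0.2)],
    [(c_vision.Equalize(), 0.5), (c_vision.AutoContrast(), 0.5)],
]

# for each input image, one sub-policy is chosen at random and its operations
# are applied in order, each with its own probability
toy_transform = c_vision.RandomSelectSubpolicy(toy_policy)
```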
The standard data augmentation method in ImageNet classification training includes the following steps:

- `RandomCropDecodeResize`: Randomly crop then decode.
- `RandomHorizontalFlip`: Randomly flip horizontally.
- `Normalize`: Normalize the data.
- `HWC2CHW`: Change the image channel order from HWC to CHW.

Add the Auto Augmentation transformation after `RandomCropDecodeResize` as follows:

1. Import the related modules.

    ```python
    import mindspore.common.dtype as mstype
    import mindspore.dataset.engine as de
    import mindspore.dataset.vision.c_transforms as c_vision
    import mindspore.dataset.transforms.c_transforms as c_transforms
    import matplotlib.pyplot as plt
    ```

2. Define the mapping from the MindSpore operators to the Auto Augmentation operators.

    ```python
    # define Auto Augmentation operators
    # each operator's strength is controlled by a level in [0, PARAMETER_MAX]
    PARAMETER_MAX = 10

    def float_parameter(level, maxval):
        return float(level) * maxval / PARAMETER_MAX

    def int_parameter(level, maxval):
        return int(level * maxval / PARAMETER_MAX)

    # for signed transforms, RandomChoice randomly picks the positive or negative magnitude
    def shear_x(level):
        v = float_parameter(level, 0.3)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, shear=(-v, -v)),
                                          c_vision.RandomAffine(degrees=0, shear=(v, v))])

    def shear_y(level):
        v = float_parameter(level, 0.3)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, shear=(0, 0, -v, -v)),
                                          c_vision.RandomAffine(degrees=0, shear=(0, 0, v, v))])

    def translate_x(level):
        v = float_parameter(level, 150 / 331)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, translate=(-v, -v)),
                                          c_vision.RandomAffine(degrees=0, translate=(v, v))])

    def translate_y(level):
        v = float_parameter(level, 150 / 331)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, translate=(0, 0, -v, -v)),
                                          c_vision.RandomAffine(degrees=0, translate=(0, 0, v, v))])

    def color_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomColor(degrees=(v, v))

    def rotate_impl(level):
        v = int_parameter(level, 30)
        return c_transforms.RandomChoice([c_vision.RandomRotation(degrees=(-v, -v)),
                                          c_vision.RandomRotation(degrees=(v, v))])

    def solarize_impl(level):
        level = int_parameter(level, 256)
        v = 256 - level
        return c_vision.RandomSolarize(threshold=(0, v))

    def posterize_impl(level):
        level = int_parameter(level, 4)
        v = 4 - level
        return c_vision.RandomPosterize(bits=(v, v))

    def contrast_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomColorAdjust(contrast=(v, v))

    def autocontrast_impl(level):
        return c_vision.AutoContrast()

    def sharpness_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomSharpness(degrees=(v, v))

    def brightness_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomColorAdjust(brightness=(v, v))
    ```

3. Define the Auto Augmentation policy for the ImageNet dataset.

    ```python
    # define the Auto Augmentation policy
    imagenet_policy = [
        [(posterize_impl(8), 0.4), (rotate_impl(9), 0.6)],
        [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
        [(c_vision.Equalize(), 0.8), (c_vision.Equalize(), 0.6)],
        [(posterize_impl(7), 0.6), (posterize_impl(6), 0.6)],
        [(c_vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
        [(c_vision.Equalize(), 0.4), (rotate_impl(8), 0.8)],
        [(solarize_impl(3), 0.6), (c_vision.Equalize(), 0.6)],
        [(posterize_impl(5), 0.8), (c_vision.Equalize(), 1.0)],
        [(rotate_impl(3), 0.2), (solarize_impl(8), 0.6)],
        [(c_vision.Equalize(), 0.6), (posterize_impl(6), 0.4)],
        [(rotate_impl(8), 0.8), (color_impl(0), 0.4)],
        [(rotate_impl(9), 0.4), (c_vision.Equalize(), 0.6)],
        [(c_vision.Equalize(), 0.0), (c_vision.Equalize(), 0.8)],
        [(c_vision.Invert(), 0.6), (c_vision.Equalize(), 1.0)],
        [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
        [(rotate_impl(8), 0.8), (color_impl(2), 1.0)],
        [(color_impl(8), 0.8), (solarize_impl(7), 0.8)],
        [(sharpness_impl(7), 0.4), (c_vision.Invert(), 0.6)],
        [(shear_x(5), 0.6), (c_vision.Equalize(), 1.0)],
        [(color_impl(0), 0.4), (c_vision.Equalize(), 0.6)],
        [(c_vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
        [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
        [(c_vision.Invert(), 0.6), (c_vision.Equalize(), 1.0)],
        [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
        [(c_vision.Equalize(), 0.8), (c_vision.Equalize(), 0.6)],
    ]
    ```
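    As an optional sanity check (a minimal sketch, assuming `imagenet_policy` is defined as above), you can verify that the policy matches the structure described earlier: 25 sub-policies, each holding two `(operation, probability)` pairs.

    ```python
    # optional structural check of the policy defined above
    assert len(imagenet_policy) == 25
    for sub_policy in imagenet_policy:
        # each sub-policy holds exactly two (operation, probability) pairs
        assert len(sub_policy) == 2
        for _, prob in sub_policy:
            # execution probabilities must lie in [0, 1]
            assert 0.0 <= prob <= 1.0
    ```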
4. Add the Auto Augmentation transformations after the `RandomCropDecodeResize` operation.

    ```python
    def create_dataset(dataset_path, do_train, repeat_num=1, batch_size=32, shuffle=True,
                       num_samples=5, target="Ascend"):
        # create a train or eval imagenet2012 dataset for ResNet-50
        ds = de.ImageFolderDataset(dataset_path, num_parallel_workers=8,
                                   shuffle=shuffle, num_samples=num_samples)

        image_size = 224
        mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
        std = [0.229 * 255, 0.224 * 255, 0.225 * 255]

        # define map operations
        if do_train:
            trans = [
                c_vision.RandomCropDecodeResize(image_size, scale=(0.08, 1.0), ratio=(0.75, 1.333)),
            ]
            post_trans = [
                c_vision.RandomHorizontalFlip(prob=0.5),
            ]
        else:
            trans = [
                c_vision.Decode(),
                c_vision.Resize(256),
                c_vision.CenterCrop(image_size),
                c_vision.Normalize(mean=mean, std=std),
                c_vision.HWC2CHW()
            ]
        ds = ds.map(operations=trans, input_columns="image")
        if do_train:
            ds = ds.map(operations=c_vision.RandomSelectSubpolicy(imagenet_policy), input_columns=["image"])
            ds = ds.map(operations=post_trans, input_columns="image")
        type_cast_op = c_transforms.TypeCast(mstype.int32)
        ds = ds.map(operations=type_cast_op, input_columns="label")
        # apply the batch operation
        ds = ds.batch(batch_size, drop_remainder=True)
        # apply the repeat operation
        ds = ds.repeat(repeat_num)

        return ds
    ```

5. Verify the effects of Auto Augmentation.

    ```python
    # Define the path to the image folder directory.
    # This directory needs to contain sub-directories which contain the images.
    DATA_DIR = "/path/to/imagefolder_directory"
    ds = create_dataset(dataset_path=DATA_DIR, do_train=True, batch_size=5, shuffle=False, num_samples=5)

    epochs = 5
    itr = ds.create_dict_iterator()
    fig = plt.figure(figsize=(8, 8))
    columns = 5
    rows = 5

    step_num = 0
    for ep_num in range(epochs):
        for data in itr:
            step_num += 1
            for index in range(rows):
                fig.add_subplot(rows, columns, ep_num * rows + index + 1)
                plt.imshow(data['image'].asnumpy()[index])
    plt.show()
    ```

> For better visualization, only five images are read from the dataset without performing the `shuffle`, `Normalize`, or `HWC2CHW` operations.

![augment](./images/auto_augmentation.png)

The images above visualize the effect of Auto Augmentation. The horizontal direction displays the five images of one batch, and the vertical direction displays five batches.

## References

[1] [AutoAugment: Learning Augmentation Policies from Data](https://arxiv.org/abs/1805.09501).