Auto Augmentation

Overview

Auto Augmentation [1] finds a suitable image augmentation scheme for a specific dataset by searching through a series of image augmentation sub-policies. The c_transforms module of MindSpore provides various C++ operators that are used in Auto Augmentation. Users can also customize functions or operators to implement Auto Augmentation. For more details about the MindSpore operators, see the API document.

The mapping between MindSpore operators and Auto Augmentation operators is as follows:

| Auto Augmentation Operators | MindSpore Operators | Introduction |
|---|---|---|
| shearX | RandomAffine | Horizontal shear |
| shearY | RandomAffine | Vertical shear |
| translateX | RandomAffine | Horizontal translation |
| translateY | RandomAffine | Vertical translation |
| rotate | RandomRotation | Rotational transformation |
| color | RandomColor | Color transformation |
| posterize | RandomPosterize | Reduce the number of bits per color channel |
| solarize | RandomSolarize | Invert all pixels within the specified threshold range |
| contrast | RandomColorAdjust | Contrast adjustment |
| sharpness | RandomSharpness | Sharpness adjustment |
| brightness | RandomColorAdjust | Brightness adjustment |
| autocontrast | AutoContrast | Maximize image contrast |
| equalize | Equalize | Equalize image histogram |
| invert | Invert | Image inversion |
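
To make three of the pixel-level entries concrete, here is a minimal pure-Python sketch applied to single 8-bit values. It is not the MindSpore implementation: the function names are hypothetical, and the inclusive bounds in `solarize` are an assumption based on the one-line description of solarize given in this tutorial.

```python
# Illustrative stand-ins for invert, solarize and posterize on 8-bit pixels.
# These are NOT the MindSpore operators; names and bounds are assumptions.

def invert(pixel):
    """Image inversion: map v to 255 - v."""
    return 255 - pixel

def solarize(pixel, low, high):
    """Invert only pixels inside [low, high] (inclusive bounds assumed)."""
    return 255 - pixel if low <= pixel <= high else pixel

def posterize(pixel, bits):
    """Keep only the top `bits` bits of an 8-bit value."""
    mask = (0xFF << (8 - bits)) & 0xFF
    return pixel & mask

row = [0, 37, 128, 200, 255]
print([invert(p) for p in row])            # [255, 218, 127, 55, 0]
print([solarize(p, 0, 128) for p in row])  # [255, 218, 127, 200, 255]
print([posterize(p, 2) for p in row])      # [0, 0, 128, 192, 192]
```

The real operators work on whole image arrays and, for the `Random*` variants, sample their parameter from the range passed to the constructor.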

Auto Augmentation on ImageNet

This tutorial uses the implementation of Auto Augmentation on the ImageNet dataset as an example.

The data augmentation policy for the ImageNet dataset contains 25 sub-policies, and each sub-policy contains two transformations. One sub-policy is randomly selected for each image in a batch, and each transformation in that sub-policy is applied with its own preset probability.
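
The selection rule can be sketched in plain Python. This is only an illustration of one reading of the rule (pick one sub-policy uniformly at random, then apply each of its transformations with its own probability). It does not show how `RandomSelectSubpolicy` is implemented, and `apply_random_subpolicy`, `tag`, and `toy_policy` are hypothetical names.

```python
import random

def apply_random_subpolicy(image, policy, rng):
    """Pick one sub-policy uniformly, then apply each (op, prob) pair in order."""
    subpolicy = rng.choice(policy)
    for op, prob in subpolicy:
        # rng.random() is in [0, 1), so prob=0.0 never fires and prob=1.0 always does
        if rng.random() < prob:
            image = op(image)
    return image

def tag(name):
    """Hypothetical stand-in transformation: records its name instead of editing pixels."""
    return lambda image: image + [name]

# A toy policy with two sub-policies of two transformations each.
toy_policy = [
    [(tag("posterize"), 0.4), (tag("rotate"), 0.6)],
    [(tag("equalize"), 1.0), (tag("invert"), 0.0)],
]

rng = random.Random(0)
for _ in range(4):
    # Each "image" starts as an empty list and collects the names of applied ops.
    print(apply_random_subpolicy([], toy_policy, rng))
```

Because each transformation fires independently, the same sub-policy can leave an image untouched, apply one transformation, or apply both.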

Users can use the RandomSelectSubpolicy interface of the c_transforms module in MindSpore to implement Auto Augmentation. The standard data augmentation method in ImageNet classification training includes the following steps:

  • RandomCropDecodeResize: Randomly crop, decode, and resize the image.

  • RandomHorizontalFlip: Randomly flip horizontally.

  • Normalize: Normalize the data.

  • HWC2CHW: Transpose the image layout from HWC (height, width, channel) to CHW.
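
As a rough illustration of the last two steps, the sketch below normalizes a tiny image stored as nested lists and then reorders it from HWC to CHW. It uses the per-channel mean and standard deviation that appear in the dataset code later in this tutorial. The helper names `normalize` and `hwc_to_chw` are hypothetical, and the pixel values are made up.

```python
# Per-channel statistics on the 0..255 scale, as used later in this tutorial.
MEAN = [0.485 * 255, 0.456 * 255, 0.406 * 255]
STD = [0.229 * 255, 0.224 * 255, 0.225 * 255]

def normalize(image_hwc, mean, std):
    """Compute (pixel[c] - mean[c]) / std[c] for every pixel of an H x W x C image."""
    return [[[(px[c] - mean[c]) / std[c] for c in range(len(px))]
             for px in row] for row in image_hwc]

def hwc_to_chw(image_hwc):
    """Reorder an H x W x C nested list into C x H x W."""
    channels = len(image_hwc[0][0])
    return [[[px[c] for px in row] for row in image_hwc] for c in range(channels)]

# A 1 x 2 image with 3 channels (illustrative values only).
image = [[[255, 0, 0], [0, 255, 0]]]
chw = hwc_to_chw(normalize(image, MEAN, STD))
print(len(chw), len(chw[0]), len(chw[0][0]))  # 3 1 2
```

In practice the MindSpore operators perform the same arithmetic and transpose on image arrays rather than nested lists.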

Add the Auto Augmentation transformations after the RandomCropDecodeResize operation as follows:

  1. Import related modules.

    import mindspore.common.dtype as mstype
    import mindspore.dataset.engine as de
    import mindspore.dataset.vision.c_transforms as c_vision
    import mindspore.dataset.transforms.c_transforms as c_transforms
    import matplotlib.pyplot as plt
    
  2. Define the mapping from the MindSpore operators to the Auto Augmentation operators.

    # define Auto Augmentation operators
    PARAMETER_MAX = 10
    
    def float_parameter(level, maxval):
        return float(level) * maxval /  PARAMETER_MAX
    
    def int_parameter(level, maxval):
        return int(level * maxval / PARAMETER_MAX)
    
    def shear_x(level):
        v = float_parameter(level, 0.3)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, shear=(-v,-v)), c_vision.RandomAffine(degrees=0, shear=(v, v))])
    
    def shear_y(level):
        v = float_parameter(level, 0.3)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, shear=(0, 0, -v,-v)), c_vision.RandomAffine(degrees=0, shear=(0, 0, v, v))])
    
    def translate_x(level):
        v = float_parameter(level, 150 / 331)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, translate=(-v,-v)), c_vision.RandomAffine(degrees=0, translate=(v, v))])
    
    def translate_y(level):
        v = float_parameter(level, 150 / 331)
        return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, translate=(0, 0, -v,-v)), c_vision.RandomAffine(degrees=0, translate=(0, 0, v, v))])
    
    def color_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomColor(degrees=(v, v))
    
    def rotate_impl(level):
        v = int_parameter(level, 30)
        return c_transforms.RandomChoice([c_vision.RandomRotation(degrees=(-v, -v)), c_vision.RandomRotation(degrees=(v, v))])
    
    def solarize_impl(level):
        level = int_parameter(level, 256)
        v = 256 - level
        return c_vision.RandomSolarize(threshold=(0, v))
    
    def posterize_impl(level):
        level = int_parameter(level, 4)
        v = 4 - level
        return c_vision.RandomPosterize(bits=(v, v))
    
    def contrast_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomColorAdjust(contrast=(v, v))
    
    def autocontrast_impl(level):
        return c_vision.AutoContrast()
    
    def sharpness_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomSharpness(degrees=(v, v))
    
    def brightness_impl(level):
        v = float_parameter(level, 1.8) + 0.1
        return c_vision.RandomColorAdjust(brightness=(v, v))
    
  3. Define the Auto Augmentation policy for the ImageNet dataset.

    # define the Auto Augmentation policy
    imagenet_policy = [
          [(posterize_impl(8), 0.4), (rotate_impl(9), 0.6)],
          [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
          [(c_vision.Equalize(), 0.8), (c_vision.Equalize(), 0.6)],
          [(posterize_impl(7), 0.6), (posterize_impl(6), 0.6)],
          [(c_vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
    
          [(c_vision.Equalize(), 0.4), (rotate_impl(8), 0.8)],
          [(solarize_impl(3), 0.6), (c_vision.Equalize(), 0.6)],
          [(posterize_impl(5), 0.8), (c_vision.Equalize(), 1.0)],
          [(rotate_impl(3), 0.2), (solarize_impl(8), 0.6)],
          [(c_vision.Equalize(), 0.6), (posterize_impl(6), 0.4)],
    
          [(rotate_impl(8), 0.8), (color_impl(0), 0.4)],
          [(rotate_impl(9), 0.4), (c_vision.Equalize(), 0.6)],
          [(c_vision.Equalize(), 0.0), (c_vision.Equalize(), 0.8)],
          [(c_vision.Invert(), 0.6), (c_vision.Equalize(), 1.0)],
          [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
    
          [(rotate_impl(8), 0.8), (color_impl(2), 1.0)],
          [(color_impl(8), 0.8), (solarize_impl(7), 0.8)],
          [(sharpness_impl(7), 0.4), (c_vision.Invert(), 0.6)],
          [(shear_x(5), 0.6), (c_vision.Equalize(), 1.0)],
          [(color_impl(0), 0.4), (c_vision.Equalize(), 0.6)],
    
          [(c_vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
          [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
          [(c_vision.Invert(), 0.6), (c_vision.Equalize(), 1.0)],
          [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
          [(c_vision.Equalize(), 0.8), (c_vision.Equalize(), 0.6)],
        ]
    
  4. Add Auto Augmentation transformations after the RandomCropDecodeResize operation.

    def create_dataset(dataset_path, do_train, repeat_num=1, batch_size=32, shuffle=True, num_samples=5, target="Ascend"):
      # create a train or eval imagenet2012 dataset for ResNet-50
      ds = de.ImageFolderDataset(dataset_path, num_parallel_workers=8,
              shuffle=shuffle, num_samples=num_samples)
    
      image_size = 224
      mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
      std = [0.229 * 255, 0.224 * 255, 0.225 * 255]
    
      # define map operations
      if do_train:
          trans = [
                  c_vision.RandomCropDecodeResize(image_size, scale=(0.08, 1.0), ratio=(0.75, 1.333)),
                  ]
    
          post_trans = [
                  c_vision.RandomHorizontalFlip(prob=0.5),
                  ]
      else:
          trans = [
                  c_vision.Decode(),
                  c_vision.Resize(256),
                  c_vision.CenterCrop(image_size),
                  c_vision.Normalize(mean=mean, std=std),
                  c_vision.HWC2CHW()
                  ]
      ds = ds.map(operations=trans, input_columns="image")
      if do_train:
          ds = ds.map(operations=c_vision.RandomSelectSubpolicy(imagenet_policy), input_columns=["image"])
          ds = ds.map(operations=post_trans, input_columns="image")
      type_cast_op = c_transforms.TypeCast(mstype.int32)
      ds = ds.map(operations=type_cast_op, input_columns="label")
      # apply the batch operation
      ds = ds.batch(batch_size, drop_remainder=True)
      # apply the repeat operation
      ds = ds.repeat(repeat_num)
    
      return ds
    
  5. Verify the effects of Auto Augmentation.

    # Define the path to image folder directory. This directory needs to contain sub-directories which contain the images.
    DATA_DIR = "/path/to/imagefolder_directory"
    ds = create_dataset(dataset_path=DATA_DIR, do_train=True, batch_size=5, shuffle=False, num_samples=5)
    
    epochs = 5
    itr = ds.create_dict_iterator()
    fig = plt.figure(figsize=(8, 8))
    columns = 5    # images per batch, one batch per row
    rows = epochs  # one row per epoch
    
    for ep_num in range(epochs):
        # each epoch yields a single batch because num_samples == batch_size == 5
        for data in itr:
            images = data['image'].asnumpy()
            for index in range(columns):
                fig.add_subplot(rows, columns, ep_num * columns + index + 1)
                plt.imshow(images[index])
    plt.show()
    

    For better visualization, only five images are read from the dataset, and the shuffle, Normalize, and HWC2CHW operations are skipped.

    [Figure: grid of augmented images, five batches of five images each]

    The images above visualize the effect of Auto Augmentation. Each row shows the five images of one batch, and the five rows correspond to five successive batches of the same images.
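
To make the level arguments in the policy concrete, the sketch below reuses `float_parameter` and `int_parameter` from step 2 and prints the magnitudes that a few policy entries resolve to. It needs only plain Python, and the choice of entries is illustrative.

```python
PARAMETER_MAX = 10

def float_parameter(level, maxval):
    # Scale a level in [0, PARAMETER_MAX] to a float in [0, maxval].
    return float(level) * maxval / PARAMETER_MAX

def int_parameter(level, maxval):
    # Same scaling, truncated to an integer.
    return int(level * maxval / PARAMETER_MAX)

# rotate_impl(9): rotation angle in degrees (applied as +v or -v)
print(int_parameter(9, 30))                     # 27
# solarize_impl(5): upper end of the threshold range (0, v)
print(256 - int_parameter(5, 256))              # 128
# posterize_impl(8): number of bits kept per channel
print(4 - int_parameter(8, 4))                  # 1
# color_impl(4): color degree, rounded for display
print(round(float_parameter(4, 1.8) + 0.1, 2))  # 0.82
```

Higher levels therefore mean a larger rotation, a lower solarize threshold ceiling, and fewer bits kept by posterize.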

References

[1] AutoAugment: Learning Augmentation Policies from Data.