Illustration of vision transforms

View Source On Gitee

This example illustrates the various transforms available in the mindspore.dataset.vision module.

Preparation

[1]:
from download import download
import matplotlib.pyplot as plt
from PIL import Image

import mindspore.dataset as ds
import mindspore.dataset.vision as vision

# Download opensource datasets
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/flamingos.jpg"
download(url, './flamingos.jpg', replace=True)
orig_img = Image.open('flamingos.jpg')

# Env set for randomness and prepare plot function
ds.config.set_seed(66)

def plot(imgs, first_origin=True, **kwargs):
    num_rows = 1
    num_cols = len(imgs) + first_origin

    _, axs = plt.subplots(nrows=num_rows, ncols=num_cols, squeeze=False)
    if first_origin:
        imgs = [orig_img] + imgs
    for idx, img in enumerate(imgs):
        ax = axs[0, idx]
        ax.imshow(img, **kwargs)
        ax.set(xticklabels=[], yticklabels=[], xticks=[], yticks=[])

    if first_origin:
        axs[0, 0].set(title='Original image')
        axs[0, 0].title.set_size(8)
    plt.tight_layout()
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/flamingos.jpg (45 kB)

file_sizes: 100%|███████████████████████████| 45.8k/45.8k [00:00<00:00, 639kB/s]
Successfully downloaded file to ./flamingos.jpg

Geometric Transforms

Geometric image transformation refers to the process of altering the geometric properties of an image, such as its shape, size, orientation, or position. It involves applying mathematical operations to the image pixels or coordinates to achieve the desired transformation.

Pad

The mindspore.dataset.vision.Pad transform pads the borders of image with some pixels.

[2]:
padded_imgs = [vision.Pad(padding=padding)(orig_img) for padding in (3, 10, 30, 50)]
plot(padded_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_4_0.png

Resize

The mindspore.dataset.vision.Resize transform resizes an image to a given size.

[3]:
resized_imgs = [vision.Resize(size=size)(orig_img) for size in (30, 50, 100)]
plot(resized_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_6_0.png

CenterCrop

The mindspore.dataset.vision.CenterCrop transform crop the image at the center with given size.

[4]:
center_crops = [vision.CenterCrop(size=size)(orig_img) for size in (30, 50, 100)]
plot(center_crops)
../../../_images/api_python_samples_dataset_vision_gallery_8_0.png

FiveCrop

The mindspore.dataset.vision.FiveCrop transform crops the given image into one central crop and four corners.

[5]:
(top_left, top_right, bottom_left, bottom_right, center) = vision.FiveCrop(size=(100, 100))(orig_img)
plot([top_left, top_right, bottom_left, bottom_right, center], True)
../../../_images/api_python_samples_dataset_vision_gallery_10_0.png

RandomPerspective

The mindspore.dataset.vision.RandomPerspective transform performs random perspective transform on an image.

[6]:
perspective_transformer = vision.RandomPerspective(distortion_scale=0.6, prob=1.0)
perspective_imgs = [perspective_transformer(orig_img) for _ in range(4)]
plot(perspective_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_12_0.png

RandomRotation

The mindspore.dataset.vision.RandomRotation transform rotates an image with random angle.

[7]:
rotater = vision.RandomRotation(degrees=(0, 180))
rotated_imgs = [rotater(orig_img) for _ in range(4)]
plot(rotated_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_14_0.png

RandomAffine

The mindspore.dataset.vision.RandomAffine transform performs random affine transform on an image.

[8]:
affine_transformer = vision.RandomAffine(degrees=(30, 70), translate=(0.1, 0.3), scale=(0.5, 0.75))
affine_imgs = [affine_transformer(orig_img) for _ in range(4)]
plot(affine_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_16_0.png

RandomCrop

The mindspore.dataset.vision.RandomCrop transform crops an image at a random location.

[9]:
cropper = vision.RandomCrop(size=(128, 128))
crops = [cropper(orig_img) for _ in range(4)]
plot(crops)
../../../_images/api_python_samples_dataset_vision_gallery_18_0.png

RandomResizedCrop

The mindspore.dataset.vision.RandomResizedCrop transform crops an image at a random location, and then resizes the crop to a given size.

[10]:
resize_cropper = vision.RandomResizedCrop(size=(32, 32))
resized_crops = [resize_cropper(orig_img) for _ in range(4)]
plot(resized_crops)
../../../_images/api_python_samples_dataset_vision_gallery_20_0.png

Photometric Transforms

Photometric image transformation refers to the process of modifying the photometric properties of an image, such as its brightness, contrast, color, or tone. These transformations are applied to change the visual appearance of an image while preserving its geometric structure.

Grayscale

The mindspore.dataset.vision.Grayscale transform converts an image to grayscale.

[11]:
gray_img = vision.Grayscale()(orig_img)
plot([gray_img], cmap='gray')
../../../_images/api_python_samples_dataset_vision_gallery_23_0.png

RandomColorAdjust

The mindspore.dataset.vision.RandomColorAdjust transform randomly changes the brightness, contrast, saturation and hue of the input image.

[12]:
jitter = vision.RandomColorAdjust(brightness=.5, hue=.3)
jitted_imgs = [jitter(orig_img) for _ in range(4)]
plot(jitted_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_25_0.png

GaussianBlur

The mindspore.dataset.vision.GaussianBlur transform performs gaussian blur transform on an image.

[13]:
blurrer = vision.GaussianBlur(kernel_size=(5, 9), sigma=(0.1, 5))
blurred_imgs = [blurrer(orig_img) for _ in range(4)]
plot(blurred_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_27_0.png

RandomInvert

The mindspore.dataset.vision.RandomInvert transform randomly inverts the colors of the given image.

[14]:
inverter = vision.RandomInvert()
invertered_imgs = [inverter(orig_img) for _ in range(4)]
plot(invertered_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_29_0.png

RandomPosterize

The mindspore.dataset.vision.RandomPosterize transform randomly reduces the bit depth of the color channels of image to create a high contrast and vivid color image.

[15]:
posterizer = vision.RandomPosterize(bits=2)
posterized_imgs = [posterizer(orig_img) for _ in range(4)]
plot(posterized_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_31_0.png

RandomSolarize

The mindspore.dataset.vision.RandomSolarize transform randomly solarizes the image by inverting pixel values within specified threshold.

[16]:
solarizer = vision.RandomSolarize(threshold=(0, 192))
solarized_imgs = [solarizer(orig_img) for _ in range(4)]
plot(solarized_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_33_0.png

RandomAdjustSharpness

The mindspore.dataset.vision.RandomAdjustSharpness transform randomly adjusts the sharpness of the given image.

[17]:
sharpness_adjuster = vision.RandomAdjustSharpness(degree=2)
sharpened_imgs = [sharpness_adjuster(orig_img) for _ in range(4)]
plot(sharpened_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_35_0.png

RandomAutoContrast

The mindspore.dataset.vision.RandomAutoContrast transform randomly applies autocontrast to the given image.

[18]:
autocontraster = vision.RandomAutoContrast()
autocontrasted_imgs = [autocontraster(orig_img) for _ in range(4)]
plot(autocontrasted_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_37_0.png

RandomEqualize

The mindspore.dataset.vision.RandomEqualize transform randomly equalizes the histogram of the given image.

[19]:
equalizer = vision.RandomEqualize()
equalized_imgs = [equalizer(orig_img) for _ in range(4)]
plot(equalized_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_39_0.png

Augmentation Transforms

The following transforms are combinations of multiple transforms, they usually come from papers.

AutoAugment

The mindspore.dataset.vision.AutoAugment transform applies AutoAugment method based on AutoAugment: Learning Augmentation Strategies from Data.

[20]:
augmenter = vision.AutoAugment(policy=vision.AutoAugmentPolicy.IMAGENET)
imgs = [augmenter(orig_img) for _ in range(4)]
plot(imgs)
../../../_images/api_python_samples_dataset_vision_gallery_42_0.png

RandAugment

The mindspore.dataset.vision.RandAugment applies RandAugment method based on RandAugment: Learning Augmentation Strategies from Data.

[21]:
augmenter = vision.RandAugment()
imgs = [augmenter(orig_img) for _ in range(4)]
plot(imgs)
../../../_images/api_python_samples_dataset_vision_gallery_44_0.png

TrivialAugmentWide

The mindspore.dataset.vision.TrivialAugmentWide applies TrivialAugmentWide method based on TrivialAugmentWide: Tuning-free Yet State-of-the-Art Data Augmentation.

[22]:
augmenter = vision.TrivialAugmentWide()
imgs = [augmenter(orig_img) for _ in range(4)]
plot(imgs)
../../../_images/api_python_samples_dataset_vision_gallery_46_0.png

Randomly-applied transforms

Some transforms are randomly-applied with a probability. That is, the transformed image may be the same as the original one.

RandomHorizontalFlip

The mindspore.dataset.vision.RandomHorizontalFlip transform performs horizontal flip of an image, with a given probability.

[23]:
hflipper = vision.RandomHorizontalFlip(0.5)
transformed_imgs = [hflipper(orig_img) for _ in range(4)]
plot(transformed_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_48_0.png

RandomVerticalFlip

The mindspore.dataset.vision.RandomVerticalFlip transform performs vertical flip of an image, with a given probability.

[24]:
vflipper = vision.RandomVerticalFlip(0.5)
transformed_imgs = [vflipper(orig_img) for _ in range(4)]
plot(transformed_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_50_0.png

RandomApply

The mindspore.dataset.transforms.RandomApply transform randomly applies a list of transforms, with a given probability.

[25]:
import mindspore.dataset.transforms as T

applier = T.RandomApply(transforms=[vision.RandomCrop(size=(64, 64))], prob=0.5)
transformed_imgs = [applier(orig_img) for _ in range(4)]
plot(transformed_imgs)
../../../_images/api_python_samples_dataset_vision_gallery_52_0.png

Process Image File In Dataset Pipeline

Use the mindspore.dataset.ImageFolderDataset to read image content into dataset pipeline and then we can do further transforms based on pipeline.

[26]:
from download import download

import os
import mindspore.dataset as ds

# Download a small imagenet as example
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/imageset.zip"
download(url, "./", kind="zip", replace=True)

# There are 5 classes in the image folder.
os.listdir("./imageset")

# Load these 5 classes into dataset pipeline
dataset = ds.ImageFolderDataset("./imageset", shuffle=False)

# check the column names inside the dataset. "image" column represents the image content and "label" column represents the corresponding label of image.
print("column names:", dataset.get_col_names())

# since the original image is not decoded, apply decode first on "image" column
dataset = dataset.map(vision.Decode(), input_columns=["image"])

# check results
print(">>>>> after decode")
for data, label in dataset:
    print(data.shape, label)

# let's do some transforms on dataset
# apply resize on images
dataset = dataset.map(vision.Resize(size=(48, 48)), input_columns=["image"])

# check results
print(">>>>> after resize")
images = []
for image, label in dataset:
    images.append(image.asnumpy())
    print(image.shape, label)

plot(images[:5], first_origin=False)
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/imageset.zip (45 kB)

file_sizes: 100%|███████████████████████████| 45.7k/45.7k [00:00<00:00, 996kB/s]
Extracting zip file...
Successfully downloaded / unzipped to ./
column names: ['image', 'label']
>>>>> after decode
(64, 64, 3) 0
(64, 64, 3) 0
(64, 64, 3) 0
(64, 64, 3) 1
(64, 64, 3) 1
(64, 64, 3) 1
(64, 64, 3) 1
(64, 64, 3) 2
(64, 64, 3) 2
(64, 64, 3) 2
(64, 64, 3) 3
(64, 64, 3) 3
(64, 64, 3) 3
(64, 64, 3) 4
(64, 64, 3) 4
(64, 64, 3) 4
>>>>> after resize
(48, 48, 3) 0
(48, 48, 3) 0
(48, 48, 3) 0
(48, 48, 3) 1
(48, 48, 3) 1
(48, 48, 3) 1
(48, 48, 3) 1
(48, 48, 3) 2
(48, 48, 3) 2
(48, 48, 3) 2
(48, 48, 3) 3
(48, 48, 3) 3
(48, 48, 3) 3
(48, 48, 3) 4
(48, 48, 3) 4
(48, 48, 3) 4
../../../_images/api_python_samples_dataset_vision_gallery_54_1.png