MindSpore Case Study | AnimeGAN2 for Animation Style Transfer

Author: Zhang Tengfei | School: Tianjin University

Animation is a common art form in daily life, widely used in advertising, film, and children's education, among other fields. Today, animation is still produced largely by hand, which is labor-intensive and demands highly specialized artistic skills. To create high-quality work, animation artists must carefully consider lines, textures, colors, and shadows, which makes the whole process both difficult and time-consuming. A technique that automatically transforms real-life photos into high-quality animation-style images is therefore of significant value: it lets artists concentrate on creative work and makes it easier for non-specialists to create their own animations. This case study explains the AnimeGAN model in detail, walks through its algorithm, and analyzes its strengths and weaknesses in animation style transfer.

Model Overview

AnimeGAN is a study from Wuhan University and Hubei University of Technology. It combines neural style transfer with a generative adversarial network (GAN) to convert real-life images into animation-style images. The model was proposed in the paper "AnimeGAN: A Novel Lightweight GAN for Photo Animation". The generator has a symmetric encoder-decoder structure built from standard convolutions, depthwise separable convolutions, inverted residual blocks (IRBs), and upsampling and downsampling modules. The discriminator consists of standard convolutions.
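
To give a feel for why depthwise separable convolutions help keep the generator lightweight, the following sketch compares weight counts for a standard 3x3 convolution and its depthwise separable counterpart. The 128-to-256 channel sizes are chosen for illustration only, the helper names are hypothetical, and biases are ignored.

```python
def standard_conv_params(in_ch, out_ch, k):
    """Weights in a standard k x k convolution (no bias)."""
    return in_ch * out_ch * k * k


def depthwise_separable_params(in_ch, out_ch, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = in_ch * k * k   # one k x k filter per input channel
    pointwise = in_ch * out_ch  # the 1 x 1 conv mixes channels
    return depthwise + pointwise


std = standard_conv_params(128, 256, 3)
sep = depthwise_separable_params(128, 256, 3)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For these sizes the separable version needs roughly one ninth of the weights, which is the usual saving for a 3x3 kernel with many output channels.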

Network Features

Compared with the original AnimeGAN, AnimeGANv2 has the following improvements:

1. The problem of high-frequency artifacts in generated images is solved.

2. The model is easy to train and can directly achieve the effect described in the paper.

3. The number of parameters of the generator network is further reduced (generator size now: 8.07 MB).

4. The style data is taken from high-quality BD movies wherever possible.

Data Preparation

The dataset contains 6656 real landscape images and three animation styles: Hayao, Shinkai, and Paprika. The images for each animation style are generated by randomly cropping video frames from the corresponding movie. The dataset also includes images of various sizes for testing. The following figure shows the dataset information.

The following shows some images in the dataset.

After the dataset is downloaded and decompressed, its directory structure is as follows:

This model uses the VGG 19 network for image feature extraction and loss function calculation, so we need to load parameters of the pre-trained network.

After downloading the pre-trained VGG 19 network, place the vgg.ckpt file in the same directory as this file.

Data Preprocessing

Animation images with smooth edges are required for loss function calculation. The dataset mentioned above already contains such images. To create an animation dataset by yourself, you can use the following code to generate the required animation images with smooth edges:

from src.animeganv2_utils.edge_smooth import make_edge_smooth

# Animation image directory
style_dir = './dataset/Sakura/style'
# Output image directory
output_dir = './dataset/Sakura/smooth'
# Size of each output image
size = 256

# Smooth the images. The output is stored in the smooth folder.
make_edge_smooth(style_dir, output_dir, size)
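
The implementation of make_edge_smooth is not shown in this case study. As a rough idea of what edge smoothing can look like, the NumPy-only sketch below finds pixels near strong intensity changes, dilates that region, and replaces it with a blurred copy. It is an assumption-laden stand-in: the gradient threshold replaces a proper edge detector, the box filter replaces a Gaussian blur, and all function names and parameter values are hypothetical.

```python
import numpy as np


def box_blur(img, radius=2):
    """Blur an H x W x C float image with a (2r+1) x (2r+1) mean filter."""
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode='edge')
    out = np.zeros_like(img, dtype=np.float64)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)


def edge_mask(img, threshold=0.1, dilate=2):
    """Mark pixels near strong intensity changes (a crude edge detector)."""
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    mask = np.hypot(gx, gy) > threshold
    # Dilate the mask by OR-ing shifted copies of itself.
    padded = np.pad(mask, dilate, mode='constant')
    grown = np.zeros_like(mask)
    for dy in range(2 * dilate + 1):
        for dx in range(2 * dilate + 1):
            grown |= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return grown


def smooth_edges(img):
    """Replace pixels around edges with their blurred version."""
    mask = edge_mask(img)[..., None]
    return np.where(mask, box_blur(img), img)


# Toy image with values in [0, 1]: left half black, right half white.
image = np.zeros((16, 16, 3))
image[:, 8:] = 1.0
smoothed = smooth_edges(image)
print(smoothed[0, 5:11, 0])
```

In the toy image the hard black-to-white step becomes a gradual ramp, while pixels far from the edge are left untouched.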

Training Dataset Visualization

import argparse

import matplotlib.pyplot as plt
import numpy as np

from src.process_datasets.animeganv2_dataset import AnimeGANDataset

# Load parameters.
parser = argparse.ArgumentParser()
parser.add_argument('--dataset', default='Hayao', choices=['Hayao', 'Shinkai', 'Paprika'], type=str)
parser.add_argument('--data_dir', default='./dataset', type=str)
parser.add_argument('--batch_size', default=4, type=int)
parser.add_argument('--debug_samples', default=0, type=int)
parser.add_argument('--num_parallel_workers', default=1, type=int)
args = parser.parse_args(args=[])

plt.figure()

# Load the dataset.
data = AnimeGANDataset(args)
data = data.run()
# Take one batch: (img, anime, anime_gray, anime_smt_gray).
batch = next(data.create_tuple_iterator())

# Show the first sample of each of the four tensors in the batch.
for i in range(1, 5):
    plt.subplot(1, 4, i)
    # Convert CHW to HWC for display and clip the values to [0, 1].
    temp = np.clip(batch[i - 1][0].asnumpy().transpose(1, 2, 0), 0, 1)
    plt.imshow(temp)
    plt.axis("off")

Mean(B, G, R) of Hayao are [-4.4346958 -8.66591597 13.10061177]

Dataset: real 6656 style 1792, smooth 1792

Network Building

After data processing, let's build the network. According to the AnimeGAN paper, all model weights should be randomly initialized according to a normal distribution with mean of 0 and sigma of 0.02.
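
As a quick illustration of this initialization scheme, the sketch below draws a weight tensor from a normal distribution with mean 0 and standard deviation 0.02 using NumPy and prints its statistics. The tensor shape is an arbitrary choice for the example; in the MindSpore code the same scheme is requested through weight_init=Normal(mean=0, sigma=0.02).

```python
import numpy as np

rng = np.random.default_rng(seed=0)
# Shape (out_channels, in_channels, kH, kW) is an arbitrary illustrative choice.
weights = rng.normal(loc=0.0, scale=0.02, size=(64, 32, 3, 3))

print(f"mean={weights.mean():.5f}, std={weights.std():.5f}")
```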

Generator

The function of generator G is to transform real-life photos into animation-style images. In practice, this is implemented by the convolution, depthwise separable convolution, IRBs, and upsampling and downsampling modules. The network architecture is as follows.

import mindspore.nn as nn
from mindspore.common.initializer import Normal

from src.models.upsample import UpSample
from src.models.conv2d_block import ConvBlock
from src.models.inverted_residual_block import InvertedResBlock


class Generator(nn.Cell):
    """AnimeGAN network generator"""

    def __init__(self):
        super(Generator, self).__init__()
        has_bias = False

        self.generator = nn.SequentialCell()
        # Encoder: convolutions with two stride-2 downsampling steps.
        self.generator.append(ConvBlock(3, 32, kernel_size=7))
        self.generator.append(ConvBlock(32, 64, stride=2))
        self.generator.append(ConvBlock(64, 128, stride=2))
        self.generator.append(ConvBlock(128, 128))
        self.generator.append(ConvBlock(128, 128))
        # Bottleneck: inverted residual blocks.
        self.generator.append(InvertedResBlock(128, 256))
        self.generator.append(InvertedResBlock(256, 256))
        self.generator.append(InvertedResBlock(256, 256))
        self.generator.append(InvertedResBlock(256, 256))
        self.generator.append(ConvBlock(256, 128))
        # Decoder: two upsampling steps back to the input resolution.
        self.generator.append(UpSample(128, 128))
        self.generator.append(ConvBlock(128, 128))
        self.generator.append(UpSample(128, 64))
        self.generator.append(ConvBlock(64, 64))
        self.generator.append(ConvBlock(64, 32, kernel_size=7))
        # Output layer: 1x1 convolution to 3 channels, followed by Tanh.
        self.generator.append(
            nn.Conv2d(32, 3, kernel_size=1, stride=1, pad_mode='same', padding=0,
                      weight_init=Normal(mean=0, sigma=0.02), has_bias=has_bias))
        self.generator.append(nn.Tanh())

    def construct(self, x):
        out1 = self.generator(x)
        return out1
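
The symmetric encoder-decoder shape of the generator can be checked without MindSpore by tracing channel counts and resolutions through the layer list. The sketch below assumes that each stride-2 ConvBlock halves the resolution, that each UpSample doubles it, and that all other layers preserve it; ConvBlock, InvertedResBlock, and UpSample are project modules whose code is not shown here, so these are assumptions rather than verified behavior.

```python
# (layer name, output channels, spatial scale factor), read off the Generator layers.
LAYERS = [
    ("ConvBlock k7", 32, 1.0),
    ("ConvBlock s2", 64, 0.5),
    ("ConvBlock s2", 128, 0.5),
    ("ConvBlock", 128, 1.0),
    ("ConvBlock", 128, 1.0),
    ("InvertedResBlock", 256, 1.0),
    ("InvertedResBlock", 256, 1.0),
    ("InvertedResBlock", 256, 1.0),
    ("InvertedResBlock", 256, 1.0),
    ("ConvBlock", 128, 1.0),
    ("UpSample", 128, 2.0),
    ("ConvBlock", 128, 1.0),
    ("UpSample", 64, 2.0),
    ("ConvBlock", 64, 1.0),
    ("ConvBlock k7", 32, 1.0),
    ("Conv2d k1 + Tanh", 3, 1.0),
]


def trace_shapes(height, width, layers=LAYERS):
    """Return (name, channels, height, width) after every layer."""
    shapes = []
    for name, channels, scale in layers:
        height, width = int(height * scale), int(width * scale)
        shapes.append((name, channels, height, width))
    return shapes


shapes = trace_shapes(256, 256)
for name, c, h, w in shapes:
    print(f"{name:<18} -> ({c}, {h}, {w})")
```

Under these assumptions a 256 x 256 photo is reduced to a 64 x 64 feature map in the bottleneck and restored to a 3-channel 256 x 256 image at the output.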

Discriminator

Discriminator D is a binary classifier that judges whether an input image is a genuine animation image or one produced by the generator. The image passes through a series of Conv2d, LeakyReLU, and InstanceNorm layers, and a final Conv2d layer outputs the prediction.

import mindspore.nn as nn
from mindspore.common.initializer import Normal


class Discriminator(nn.Cell):
    """AnimeGAN network discriminator"""

    def __init__(self, args):
        super(Discriminator, self).__init__()
        self.name = f'discriminator_{args.dataset}'
        self.has_bias = False
        channels = args.ch // 2

        # Input layer.
        layers = [
            nn.Conv2d(3, channels, kernel_size=3, stride=1, pad_mode='same', padding=0,
                      weight_init=Normal(mean=0, sigma=0.02), has_bias=self.has_bias),
            nn.LeakyReLU(alpha=0.2)
        ]

        # Each stage downsamples once and quadruples the channel count.
        for _ in range(1, args.n_dis):
            layers += [
                nn.Conv2d(channels, channels * 2, kernel_size=3, stride=2, pad_mode='same', padding=0,
                          weight_init=Normal(mean=0, sigma=0.02), has_bias=self.has_bias),
                nn.LeakyReLU(alpha=0.2),
                nn.Conv2d(channels * 2, channels * 4, kernel_size=3, stride=1, pad_mode='same', padding=0,
                          weight_init=Normal(mean=0, sigma=0.02), has_bias=self.has_bias),
                nn.InstanceNorm2d(channels * 4, affine=False),
                nn.LeakyReLU(alpha=0.2),
            ]
            channels *= 4

        # Output layers: the last convolution produces a single-channel prediction map.
        layers += [
            nn.Conv2d(channels, channels, kernel_size=3, stride=1, pad_mode='same', padding=0,
                      weight_init=Normal(mean=0, sigma=0.02), has_bias=self.has_bias),
            nn.InstanceNorm2d(channels, affine=False),
            nn.LeakyReLU(alpha=0.2),
            nn.Conv2d(channels, 1, kernel_size=3, stride=1, pad_mode='same', padding=0,
                      weight_init=Normal(mean=0, sigma=0.02), has_bias=self.has_bias),
        ]

        self.discriminate = nn.SequentialCell(layers)

    def construct(self, x):
        return self.discriminate(x)
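
To make the channel arithmetic in the discriminator easier to follow, the sketch below replays the same loop in plain Python and lists every Conv2d layer it would create. The helper name discriminator_channels is hypothetical; the defaults ch=64 and n_dis=3 match the argument defaults used in the training script of this case study.

```python
def discriminator_channels(ch=64, n_dis=3):
    """List (in_channels, out_channels, stride) for each Conv2d in the discriminator."""
    channels = ch // 2
    convs = [(3, channels, 1)]
    for _ in range(1, n_dis):
        convs.append((channels, channels * 2, 2))
        convs.append((channels * 2, channels * 4, 1))
        channels *= 4
    convs.append((channels, channels, 1))
    convs.append((channels, 1, 1))
    return convs


convs = discriminator_channels()
for in_ch, out_ch, stride in convs:
    print(f"Conv2d({in_ch:>3} -> {out_ch:>3}, stride={stride})")

# Each stride-2 convolution halves the resolution under 'same' padding.
downsample = 2 ** sum(1 for _, _, s in convs if s == 2)
print("output map size for a 256 x 256 input:", 256 // downsample)
```

With the default settings the discriminator has seven convolutions and reduces the resolution by a factor of four, so its output is a map of per-region predictions rather than a single number.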

Loss Function

The loss function includes the adversarial loss, content loss, grayscale style loss, and color reconstruction loss. Different losses have different weight coefficients. The overall loss function is expressed as follows:
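
Read off the GeneratorLoss and DiscriminatorLoss code in this section, the two objectives can be written as below. The weights correspond to the wadvg, wcon, wgra, wcol, and wadvd arguments. The symbols themselves are notation chosen here for illustration: p is a real photo, a an animation image, x its grayscale version, and y its grayscale smooth-edged version.

```latex
\begin{aligned}
L(G) &= \omega_{advg}\, L_{adv}\big(D(G(p))\big)
      + \omega_{con}\, L_{con}\big(p, G(p)\big)
      + \omega_{gra}\, L_{gra}\big(x, G(p)\big)
      + \omega_{col}\, L_{col}\big(p, G(p)\big) \\
L(D) &= \omega_{advd} \Big[ 1.7\, L_{real}\big(D(a)\big)
      + 1.7\, L_{fake}\big(D(G(p))\big)
      + 1.7\, L_{fake}\big(D(x)\big)
      + 1.0\, L_{fake}\big(D(y)\big) \Big]
\end{aligned}
```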

import mindspore
from mindspore import nn, load_checkpoint, load_param_into_net

from src.losses.gram_loss import GramLoss
from src.losses.color_loss import ColorLoss
from src.losses.vgg19 import Vgg


def vgg19(args, num_classes=1000):
    """Load the parameters of the pre-trained VGG19 model."""
    # Build the network.
    net = Vgg([64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512], num_classes=num_classes,
              batch_norm=True)
    # Load the model.
    param_dict = load_checkpoint(args.vgg19_path)
    load_param_into_net(net, param_dict)
    net.requires_grad = False
    return net


class GeneratorLoss(nn.Cell):
    """Connect the generator and loss function."""

    def __init__(self, discriminator, generator, args):
        super(GeneratorLoss, self).__init__(auto_prefix=True)
        self.discriminator = discriminator
        self.generator = generator
        self.content_loss = nn.L1Loss()
        self.gram_loss = GramLoss()
        self.color_loss = ColorLoss()
        # Loss weights.
        self.wadvg = args.wadvg
        self.wadvd = args.wadvd
        self.wcon = args.wcon
        self.wgra = args.wgra
        self.wcol = args.wcol
        self.vgg19 = vgg19(args)
        self.adv_type = args.gan_loss
        self.bce_loss = nn.BCELoss()
        self.relu = nn.ReLU()

    def construct(self, img, anime_gray):
        """Construct the loss calculation structure of the generator."""
        fake_img = self.generator(img)
        fake_d = self.discriminator(fake_img)
        # Extract VGG19 features of the generated, grayscale animation, and real images.
        fake_feat = self.vgg19(fake_img)
        anime_feat = self.vgg19(anime_gray)
        img_feat = self.vgg19(img)

        # Weighted sum of adversarial, content, grayscale style, and color losses.
        result = self.wadvg * self.adv_loss_g(fake_d) + \
            self.wcon * self.content_loss(img_feat, fake_feat) + \
            self.wgra * self.gram_loss(anime_feat, fake_feat) + \
            self.wcol * self.color_loss(img, fake_img)
        return result

    def adv_loss_g(self, pred):
        """Select a loss function type."""
        if self.adv_type == 'hinge':
            return -mindspore.numpy.mean(pred)
        if self.adv_type == 'lsgan':
            return mindspore.numpy.mean(mindspore.numpy.square(pred - 1.0))
        if self.adv_type == 'normal':
            return self.bce_loss(pred, mindspore.numpy.zeros_like(pred))
        # Fall back to the LSGAN loss for any other value.
        return mindspore.numpy.mean(mindspore.numpy.square(pred - 1.0))
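
GramLoss is imported from the project and its code is not shown in this case study. A common way to build such a style loss is to compare Gram matrices of feature maps, which capture channel-to-channel correlations and discard spatial layout. The NumPy sketch below illustrates that idea under this assumption; the normalization by the element count and the function names are illustrative choices, not the project's implementation.

```python
import numpy as np


def gram_matrix(feat):
    """Gram matrix of a (C, H, W) feature map, normalized by its element count."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def gram_l1_loss(feat_a, feat_b):
    """Mean absolute difference between the Gram matrices of two feature maps."""
    return np.abs(gram_matrix(feat_a) - gram_matrix(feat_b)).mean()


rng = np.random.default_rng(0)
style_feat = rng.standard_normal((8, 4, 4))  # stand-in for VGG features of a style image
fake_feat = rng.standard_normal((8, 4, 4))   # stand-in for VGG features of a generated image

print(gram_matrix(style_feat).shape)
print(gram_l1_loss(style_feat, style_feat))
print(gram_l1_loss(style_feat, fake_feat))
```

The loss is zero when both feature maps share the same channel correlations and grows as their textures diverge.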

Discriminator Loss

class DiscriminatorLoss(nn.Cell):
    """Connect the discriminator and loss function."""

    def __init__(self, discriminator, generator, args):
        nn.Cell.__init__(self, auto_prefix=True)
        self.discriminator = discriminator
        self.generator = generator
        self.content_loss = nn.L1Loss()
        self.gram_loss = nn.L1Loss()
        self.color_loss = ColorLoss()
        # Loss weights.
        self.wadvg = args.wadvg
        self.wadvd = args.wadvd
        self.wcon = args.wcon
        self.wgra = args.wgra
        self.wcol = args.wcol
        self.vgg19 = vgg19(args)
        self.adv_type = args.gan_loss
        self.bce_loss = nn.BCELoss()
        self.relu = nn.ReLU()

    def construct(self, img, anime, anime_gray, anime_smt_gray):
        """Construct the loss calculation structure of the discriminator."""
        fake_img = self.generator(img)
        fake_d = self.discriminator(fake_img)
        real_anime_d = self.discriminator(anime)
        real_anime_gray_d = self.discriminator(anime_gray)
        real_anime_smt_gray_d = self.discriminator(anime_smt_gray)

        # Only color animation images count as real; generated, grayscale,
        # and smooth-edged images are all treated as fake.
        return self.wadvd * (
            1.7 * self.adv_loss_d_real(real_anime_d) +
            1.7 * self.adv_loss_d_fake(fake_d) +
            1.7 * self.adv_loss_d_fake(real_anime_gray_d) +
            1.0 * self.adv_loss_d_fake(real_anime_smt_gray_d)
        )

    def adv_loss_d_real(self, pred):
        """Loss type of a real animation image"""
        if self.adv_type == 'hinge':
            return mindspore.numpy.mean(self.relu(1.0 - pred))
        if self.adv_type == 'lsgan':
            return mindspore.numpy.mean(mindspore.numpy.square(pred - 1.0))
        if self.adv_type == 'normal':
            return self.bce_loss(pred, mindspore.numpy.ones_like(pred))
        return mindspore.numpy.mean(mindspore.numpy.square(pred - 1.0))

    def adv_loss_d_fake(self, pred):
        """Loss type of the generated animation image"""
        if self.adv_type == 'hinge':
            return mindspore.numpy.mean(self.relu(1.0 + pred))
        if self.adv_type == 'lsgan':
            return mindspore.numpy.mean(mindspore.numpy.square(pred))
        if self.adv_type == 'normal':
            return self.bce_loss(pred, mindspore.numpy.zeros_like(pred))
        return mindspore.numpy.mean(mindspore.numpy.square(pred))
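
The hinge and LSGAN branches above are simple enough to try out on hand-picked discriminator outputs. The following NumPy sketch mirrors those two branches (the BCE branch is omitted) so that the behavior of each term can be inspected without MindSpore; the function names and sample values are illustrative only.

```python
import numpy as np


def d_loss_real(pred, kind="lsgan"):
    """Discriminator loss on images it should accept."""
    if kind == "hinge":
        return np.mean(np.maximum(0.0, 1.0 - pred))
    return np.mean(np.square(pred - 1.0))


def d_loss_fake(pred, kind="lsgan"):
    """Discriminator loss on images it should reject."""
    if kind == "hinge":
        return np.mean(np.maximum(0.0, 1.0 + pred))
    return np.mean(np.square(pred))


def g_loss(pred, kind="lsgan"):
    """Generator loss on the discriminator's output for generated images."""
    if kind == "hinge":
        return -np.mean(pred)
    return np.mean(np.square(pred - 1.0))


scored_real = np.array([1.0, 1.0])  # D outputs the LSGAN "real" target
scored_fake = np.array([0.0, 0.0])  # D outputs the LSGAN "fake" target

print("D is right about real images:", d_loss_real(scored_real))
print("D is right about fake images:", d_loss_fake(scored_fake))
print("G when D rejects its images: ", g_loss(scored_fake))
print("G when D accepts its images: ", g_loss(scored_real))
```

The discriminator terms vanish when D scores real images as 1 and fake images as 0, while the generator term vanishes only when D scores generated images as 1, which is what sets the two networks against each other.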

Model Implementation

Because of the GAN structure, the loss has two outputs, one for the discriminator and one for the generator, which makes a GAN different from a common classification network. In MindSpore, operations related to the loss function and optimizer are implemented as subclasses of nn.Cell, so we define a custom AnimeGAN class that connects the networks and the loss functions.

class AnimeGAN(nn.Cell):
    """Define the AnimeGAN network."""

    def __init__(self, my_train_one_step_cell_for_d, my_train_one_step_cell_for_g):
        super(AnimeGAN, self).__init__(auto_prefix=True)
        self.my_train_one_step_cell_for_g = my_train_one_step_cell_for_g
        self.my_train_one_step_cell_for_d = my_train_one_step_cell_for_d

    def construct(self, img, anime, anime_gray, anime_smt_gray):
        # Update the discriminator first, then the generator.
        output_d_loss = self.my_train_one_step_cell_for_d(img, anime, anime_gray, anime_smt_gray)
        output_g_loss = self.my_train_one_step_cell_for_g(img, anime_gray)
        return output_d_loss, output_g_loss

Model Training

Training is divided into two parts: discriminator training and generator training. The discriminator is trained to maximize the probability of correctly distinguishing real animation images from generated ones. The generator is trained to produce increasingly convincing animation-style images. Each network is optimized by minimizing its own loss function.

import argparse
import os

import cv2
import numpy as np
import mindspore
from mindspore import Tensor
from mindspore import float32 as dtype
from mindspore import nn
from tqdm import tqdm

from src.models.generator import Generator
from src.models.discriminator import Discriminator
from src.models.animegan import AnimeGAN
from src.animeganv2_utils.pre_process import denormalize_input
from src.losses.loss import GeneratorLoss, DiscriminatorLoss
from src.process_datasets.animeganv2_dataset import AnimeGANDataset

# Load parameters.
parser = argparse.ArgumentParser(description='train')
parser.add_argument('--device_target', default='Ascend', choices=['CPU', 'GPU', 'Ascend'], type=str)
parser.add_argument('--device_id', default=0, type=int)
parser.add_argument('--dataset', default='Paprika', choices=['Hayao', 'Shinkai', 'Paprika'], type=str)
parser.add_argument('--data_dir', default='./dataset', type=str)
parser.add_argument('--checkpoint_dir', default='./checkpoints', type=str)
parser.add_argument('--vgg19_path', default='./vgg.ckpt', type=str)
parser.add_argument('--save_image_dir', default='./images', type=str)
# type=bool would treat any non-empty string as True, so use a flag instead.
parser.add_argument('--resume', action='store_true')
parser.add_argument('--phase', default='train', type=str)
parser.add_argument('--epochs', default=2, type=int)
parser.add_argument('--init_epochs', default=5, type=int)
parser.add_argument('--batch_size', default=4, type=int)
parser.add_argument('--num_parallel_workers', default=1, type=int)
parser.add_argument('--save_interval', default=1, type=int)
parser.add_argument('--debug_samples', default=0, type=int)
parser.add_argument('--lr_g', default=2.0e-4, type=float)
parser.add_argument('--lr_d', default=4.0e-4, type=float)
parser.add_argument('--init_lr', default=1.0e-3, type=float)
# The loss classes check for 'hinge', 'lsgan', and 'normal' (BCE).
parser.add_argument('--gan_loss', default='lsgan', choices=['lsgan', 'hinge', 'normal'], type=str)
parser.add_argument('--wadvg', default=1.7, type=float, help='Adversarial loss weight for G')
parser.add_argument('--wadvd', default=300, type=float, help='Adversarial loss weight for D')
parser.add_argument('--wcon', default=1.8, type=float, help='Content loss weight')
parser.add_argument('--wgra', default=3.0, type=float, help='Gram loss weight')
parser.add_argument('--wcol', default=10.0, type=float, help='Color loss weight')
parser.add_argument('--img_ch', default=3, type=int, help='The size of image channel')
parser.add_argument('--ch', default=64, type=int, help='Base channel number per layer')
parser.add_argument('--n_dis', default=3, type=int, help='The number of discriminator layer')
args = parser.parse_args(args=[])

# Instantiate the generator and discriminator.
generator = Generator()
discriminator = Discriminator(args)

# Set up two separate optimizers, one for D and the other for G.
optimizer_g = nn.Adam(generator.trainable_params(), learning_rate=args.lr_g, beta1=0.5, beta2=0.999)
optimizer_d = nn.Adam(discriminator.trainable_params(), learning_rate=args.lr_d, beta1=0.5, beta2=0.999)

# Instantiate WithLossCell.
net_d_with_criterion = DiscriminatorLoss(discriminator, generator, args)
net_g_with_criterion = GeneratorLoss(discriminator, generator, args)

# Instantiate TrainOneStepCell.
my_train_one_step_cell_for_d = nn.TrainOneStepCell(net_d_with_criterion, optimizer_d)
my_train_one_step_cell_for_g = nn.TrainOneStepCell(net_g_with_criterion, optimizer_g)

animegan = AnimeGAN(my_train_one_step_cell_for_d, my_train_one_step_cell_for_g)
animegan.set_train()

# Load the dataset.
data = AnimeGANDataset(args)
data = data.run()
size = data.get_dataset_size()

# Make sure the output directories exist before saving images and checkpoints.
os.makedirs(os.path.join(args.save_image_dir, args.dataset), exist_ok=True)
os.makedirs(os.path.join(args.checkpoint_dir, args.dataset), exist_ok=True)

for epoch in range(args.epochs):
    iters = 0
    # Read data for each round of training.
    for img, anime, anime_gray, anime_smt_gray in tqdm(data.create_tuple_iterator()):
        img = Tensor(img, dtype=dtype)
        anime = Tensor(anime, dtype=dtype)
        anime_gray = Tensor(anime_gray, dtype=dtype)
        anime_smt_gray = Tensor(anime_smt_gray, dtype=dtype)
        net_d_loss, net_g_loss = animegan(img, anime, anime_gray, anime_smt_gray)
        if iters % 50 == 0:
            # Output training records.
            print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f' % (
                epoch + 1, args.epochs, iters, size, net_d_loss.asnumpy().min(), net_g_loss.asnumpy().min()))
        # After each epoch ends, use the generator to generate a group of images.
        if (epoch % args.save_interval) == 0 and (iters == size - 1):
            stylized = denormalize_input(generator(img)).asnumpy()
            no_stylized = denormalize_input(img).asnumpy()
            imgs = cv2.cvtColor(stylized[0, :, :, :].transpose(1, 2, 0), cv2.COLOR_RGB2BGR)
            imgs1 = cv2.cvtColor(no_stylized[0, :, :, :].transpose(1, 2, 0), cv2.COLOR_RGB2BGR)
            # Concatenate the images of the batch side by side.
            for i in range(1, args.batch_size):
                imgs = np.concatenate(
                    (imgs, cv2.cvtColor(stylized[i, :, :, :].transpose(1, 2, 0), cv2.COLOR_RGB2BGR)), axis=1)
                imgs1 = np.concatenate(
                    (imgs1, cv2.cvtColor(no_stylized[i, :, :, :].transpose(1, 2, 0), cv2.COLOR_RGB2BGR)), axis=1)
            # Original photos on the top row, stylized results on the bottom row.
            cv2.imwrite(
                os.path.join(args.save_image_dir, args.dataset, 'epoch_' + str(epoch) + '.jpg'),
                np.concatenate((imgs1, imgs), axis=0))
            # Save the network model parameters as a CKPT file.
            mindspore.save_checkpoint(generator, os.path.join(args.checkpoint_dir, args.dataset,
                                                              'netG_' + str(epoch) + '.ckpt'))
        iters += 1

Mean(B, G, R) of Paprika are [-22.43617309 -0.19372649 22.62989958]

Dataset: real 6656 style 1553, smooth 1553

Model Inference

Run the following code and input a real-life landscape image into the network to generate an animation image:

import argparse
import os

import cv2
from mindspore import Tensor
from mindspore import float32 as dtype
from mindspore import load_checkpoint, load_param_into_net
from mindspore.train.model import Model
from tqdm import tqdm

from src.models.generator import Generator
from src.animeganv2_utils.pre_process import transform, inverse_transform_infer

# Load parameters.
parser = argparse.ArgumentParser(description='infer')
parser.add_argument('--device_target', default='Ascend', choices=['CPU', 'GPU', 'Ascend'], type=str)
parser.add_argument('--device_id', default=0, type=int)
parser.add_argument('--infer_dir', default='./dataset/test/real', type=str)
parser.add_argument('--infer_output', default='./dataset/output', type=str)
parser.add_argument('--ckpt_file_name', default='./checkpoints/Hayao/netG_30.ckpt', type=str)
args = parser.parse_args(args=[])

# Instantiate the generator.
net = Generator()

# Obtain model parameters from the file and load them to the network.
param_dict = load_checkpoint(args.ckpt_file_name)
load_param_into_net(net, param_dict)

data = os.listdir(args.infer_dir)
bar = tqdm(data)
model = Model(net)

if not os.path.exists(args.infer_output):
    os.mkdir(args.infer_output)

# Read and process images cyclically.
for img_path in bar:
    img = transform(os.path.join(args.infer_dir, img_path))
    img = Tensor(img, dtype=dtype)
    output = model.predict(img)
    img = inverse_transform_infer(img)
    output = inverse_transform_infer(output)
    # Resize the result to the size of the input image.
    output = cv2.resize(output, (img.shape[1], img.shape[0]))
    # Save the generated image.
    cv2.imwrite(os.path.join(args.infer_output, img_path), output)

print('Successfully output images in ' + args.infer_output)

Model inference results of each style:

Video Processing

The following method expects an MP4 file as the input video. The audio track is not retained in the processed video.

import argparse

import cv2
from mindspore import Tensor
from mindspore import float32 as dtype
from mindspore import load_checkpoint, load_param_into_net
from mindspore.train.model import Model
from tqdm import tqdm

from src.models.generator import Generator
from src.animeganv2_utils.adjust_brightness import adjust_brightness_from_src_to_dst
from src.animeganv2_utils.pre_process import preprocessing, convert_image, inverse_image

# Load parameters. Set video_input and video_output to the actual input and output video paths,
# and select an inference model for video_ckpt_file_name.
parser = argparse.ArgumentParser(description='video2anime')
parser.add_argument('--device_target', default='GPU', choices=['CPU', 'GPU', 'Ascend'], type=str)
parser.add_argument('--device_id', default=0, type=int)
parser.add_argument('--video_ckpt_file_name', default='./checkpoints/Hayao/netG_30.ckpt', type=str)
parser.add_argument('--video_input', default='./video/test.mp4', type=str)
parser.add_argument('--video_output', default='./video/output.mp4', type=str)
parser.add_argument('--output_format', default='mp4v', type=str)
# type=list would split a string into characters, so read two integers instead.
parser.add_argument('--img_size', default=[256, 256], nargs=2, type=int, help='The size of image: H and W')
args = parser.parse_args(args=[])

# Instantiate the generator.
net = Generator()
param_dict = load_checkpoint(args.video_ckpt_file_name)

# Read the video file.
vid = cv2.VideoCapture(args.video_input)
total = int(vid.get(cv2.CAP_PROP_FRAME_COUNT))
fps = int(vid.get(cv2.CAP_PROP_FPS))
codec = cv2.VideoWriter_fourcc(*args.output_format)

# Obtain model parameters from the file and load them to the network.
load_param_into_net(net, param_dict)
model = Model(net)

# Read the first frame to determine the output resolution.
ret, img = vid.read()
img = preprocessing(img, args.img_size)
height, width = img.shape[:2]

# Set the resolution of the output video.
out = cv2.VideoWriter(args.video_output, codec, fps, (width, height))
pbar = tqdm(total=total)
# Rewind to the first frame.
vid.set(cv2.CAP_PROP_POS_FRAMES, 0)

# Process video frames.
while ret:
    ret, frame = vid.read()
    if frame is None:
        print('Warning: got empty frame.')
        continue
    img = convert_image(frame, args.img_size)
    img = Tensor(img, dtype=dtype)
    fake_img = model.predict(img).asnumpy()
    fake_img = inverse_image(fake_img)
    fake_img = adjust_brightness_from_src_to_dst(fake_img, cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    # Convert back to OpenCV's BGR channel order and write the frame to the video file.
    out.write(cv2.cvtColor(fake_img, cv2.COLOR_RGB2BGR))
    pbar.update(1)

pbar.close()
vid.release()
# Release the writer so that the output file is finalized.
out.release()
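
The helper adjust_brightness_from_src_to_dst is imported from the project and its code is not shown here. One plausible way to implement such a step, sketched below with NumPy, is to scale the stylized frame so that its average brightness matches that of the source frame. The luma weights, the rounding, and the function names are assumptions made for this illustration, not the project's implementation, and the sketch does not handle a completely black stylized frame.

```python
import numpy as np


def average_brightness(img):
    """Mean luma of an RGB image, using the common 0.299/0.587/0.114 weights."""
    r, g, b = (img[..., i].mean() for i in range(3))
    return 0.299 * r + 0.587 * g + 0.114 * b


def match_brightness(dst, src):
    """Scale dst so that its average brightness matches that of src."""
    ratio = average_brightness(src) / average_brightness(dst)
    scaled = np.rint(dst.astype(np.float64) * ratio)
    return np.clip(scaled, 0, 255).astype(np.uint8)


src = np.full((4, 4, 3), 100, dtype=np.uint8)  # stand-in for the original frame
dst = np.full((4, 4, 3), 50, dtype=np.uint8)   # stand-in for a darker stylized frame
adjusted = match_brightness(dst, src)
print(average_brightness(dst), average_brightness(adjusted))
```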

Algorithm Process

For more MindSpore application cases, visit https://www.mindspore.cn/en.