[{"data":1,"prerenderedAt":524},["ShallowReactive",2],{"content-query-ZkvBTcLUTp":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"date":10,"cover":11,"type":12,"category":13,"body":14,"_type":518,"_id":519,"_source":520,"_file":521,"_stem":522,"_extension":523},"/technology-blogs/en/2529","en",false,"","A Practice of Using the Automatic Colorization Algorithm","This article shares a practice of using Colorization for automatic coloring of grayscale images.","2023-04-02","https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/e56978dedcfb4089ae6c6fb066f96d34.png","technology-blogs","Practices",{"type":15,"children":16,"toc":515},"root",[17,25,35,60,68,73,78,83,93,124,132,163,170,179,187,201,206,214,219,226,233,238,245,264,271,276,283,288,295,300,319,326,331,338,356,363,368,375,380,387,394,399,406,411,418,429,441,448,465,472,477,484,491,496,503,510],{"type":18,"tag":19,"props":20,"children":22},"element","h1",{"id":21},"a-practice-of-using-the-automatic-colorization-algorithm",[23],{"type":24,"value":8},"text",{"type":18,"tag":26,"props":27,"children":28},"p",{},[29],{"type":18,"tag":30,"props":31,"children":32},"strong",{},[33],{"type":24,"value":34},"Model Description",{"type":18,"tag":26,"props":36,"children":37},{},[38,40,49,51,58],{"type":24,"value":39},"The Colorization algorithm (",{"type":18,"tag":41,"props":42,"children":46},"a",{"href":43,"rel":44},"https://github.com/mindspore-courses/applications/tree/master/Colorization",[45],"nofollow",[47],{"type":24,"value":48},"project code",{"type":24,"value":50},") that uses the convolutional neural network (CNN) structure is a study from the University of California. This algorithm realizes automatic coloring of grayscale images. It was proposed by Richard Zhang et al. 
in the paper ",{"type":18,"tag":41,"props":52,"children":55},{"href":53,"rel":54},"https://arxiv.org/pdf/1603.08511.pdf",[45],[56],{"type":24,"value":57},"Colorful Image Colorization",{"type":24,"value":59},", published at the European Conference on Computer Vision (ECCV) in 2016. The proposed network consists of eight convolutional blocks; each block contains two or three repeated convolution and ReLU layers, followed by a BatchNorm layer. The network contains no pooling layers.",{"type":18,"tag":26,"props":61,"children":62},{},[63],{"type":18,"tag":30,"props":64,"children":65},{},[66],{"type":24,"value":67},"Network Features",{"type":18,"tag":26,"props":69,"children":70},{},[71],{"type":24,"value":72},"A loss function tailored to the multi-modal uncertainty of colorization is designed, which maintains color diversity.",{"type":18,"tag":26,"props":74,"children":75},{},[76],{"type":24,"value":77},"The image coloring task is transformed into a self-supervised representation learning task.",{"type":18,"tag":26,"props":79,"children":80},{},[81],{"type":24,"value":82},"State-of-the-art results are achieved on several benchmarks.",{"type":18,"tag":84,"props":85,"children":87},"pre",{"code":86},"# Build the network, loss, optimizer, and training cell.\nencode_layer = NNEncLayer(opt)\nboost_layer = PriorBoostLayer(opt)\nnon_gray_mask = NonGrayMaskLayer()\nnet = ColorizationModel()\nnet_opt = nn.Adam(net.trainable_params(), learning_rate=opt.learning_rate)\nnet_with_criterion = NetLoss(net)\nscale_sense = nn.FixedLossScaleUpdateCell(1)\nmy_train_one_step_cell_for_net = nn.TrainOneStepWithLossScaleCell(net_with_criterion, net_opt,\n                                                                  scale_sense=scale_sense)\ncolormodel = ColorModel(my_train_one_step_cell_for_net)\ncolormodel.set_train()
\n",[88],{"type":18,"tag":89,"props":90,"children":91},"code",{"__ignoreMap":7},[92],{"type":24,"value":86},{"type":18,"tag":26,"props":94,"children":95},{},[96,98,103,105,110,112,116,118,122],{"type":24,"value":97},"The main principle of the algorithm is as follows: Feed the ",{"type":18,"tag":30,"props":99,"children":100},{},[101],{"type":24,"value":102},"L",{"type":24,"value":104}," channel of a grayscale image in LAB color space into the model to infer the ",{"type":18,"tag":30,"props":106,"children":107},{},[108],{"type":24,"value":109},"AB",{"type":24,"value":111}," channel, and then combine the original ",{"type":18,"tag":30,"props":113,"children":114},{},[115],{"type":24,"value":102},{"type":24,"value":117}," channel with the inferred ",{"type":18,"tag":30,"props":119,"children":120},{},[121],{"type":24,"value":109},{"type":24,"value":123}," channel to obtain a colored image.",{"type":18,"tag":26,"props":125,"children":126},{},[127],{"type":18,"tag":128,"props":129,"children":131},"img",{"alt":7,"src":130},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/4b1909d6000248069b4b06d795807c66.png",[],{"type":18,"tag":26,"props":133,"children":134},{},[135,137,141,143,147,149,154,156,161],{"type":24,"value":136},"RGB is a common image format that uses three color channels: red, green, and blue. The three are combined to reproduce a broad variety of colors. In the LAB color space, channel ",{"type":18,"tag":30,"props":138,"children":139},{},[140],{"type":24,"value":102},{"type":24,"value":142}," indicates the image lightness, whose value ranges from 0 to 100. A larger value indicates a brighter color. The value of ",{"type":18,"tag":30,"props":144,"children":145},{},[146],{"type":24,"value":109},{"type":24,"value":148}," ranges from –128 to +128. 
",{"type":18,"tag":30,"props":150,"children":151},{},[152],{"type":24,"value":153},"A",{"type":24,"value":155}," indicates the red-green component of a color, and ",{"type":18,"tag":30,"props":157,"children":158},{},[159],{"type":24,"value":160},"B",{"type":24,"value":162}," the blue-yellow component of a color.",{"type":18,"tag":26,"props":164,"children":165},{},[166],{"type":18,"tag":128,"props":167,"children":169},{"alt":7,"src":168},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/ad0a2d4b41e24de19a303518b77397af.png",[],{"type":18,"tag":26,"props":171,"children":172},{},[173],{"type":18,"tag":174,"props":175,"children":176},"em",{},[177],{"type":24,"value":178},"S__ource: Linshang Technology",{"type":18,"tag":26,"props":180,"children":181},{},[182],{"type":18,"tag":30,"props":183,"children":184},{},[185],{"type":24,"value":186},"Data Preparation",{"type":18,"tag":26,"props":188,"children":189},{},[190,192,199],{"type":24,"value":191},"This practice uses subsets of the ",{"type":18,"tag":41,"props":193,"children":196},{"href":194,"rel":195},"https://www.image-net.org/",[45],[197],{"type":24,"value":198},"ImageNet dataset",{"type":24,"value":200}," (152 GB) as the training dataset and test dataset. The training dataset contains 1,000 categories and about 1.2 million images in total. 
The test dataset contains 50,000 images.",{"type":18,"tag":26,"props":202,"children":203},{},[204],{"type":24,"value":205},"Visualize the training dataset.",{"type":18,"tag":84,"props":207,"children":209},{"code":208},"import os\nimport argparse\nfrom tqdm import tqdm\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport mindspore\nfrom src.process_datasets.data_generator import ColorizationDataset\n\n# Load parameters.\nparser = argparse.ArgumentParser()\nparser.add_argument('--image_dir', type=str, default='./dataset/train', help='path to dataset')\nparser.add_argument('--batch_size', type=int, default=4)\nparser.add_argument('--num_parallel_workers', type=int, default=1)\n# Note: argparse 'type=bool' does not parse command-line strings correctly;\n# it is harmless here because the defaults are used via parse_args(args=[]).\nparser.add_argument('--shuffle', type=bool, default=True)\nargs = parser.parse_args(args=[])\nplt.figure()\n\n# Load the dataset.\ndataset = ColorizationDataset(args.image_dir, args.batch_size, args.shuffle, args.num_parallel_workers)\ndata = dataset.run()\nshow_data = next(data.create_tuple_iterator())\nshow_images_original, _ = show_data\nshow_images_original = show_images_original.asnumpy()\n\n# Display the first four images in a row.\nfor i in range(1, 5):\n    plt.subplot(1, 4, i)\n    temp = show_images_original[i-1]\n    temp = np.clip(temp, 0, 1)\n    plt.imshow(temp)\n    plt.axis(\"off\")\n    plt.subplots_adjust(wspace=0.05, hspace=0)\nplt.show()\n",[210],{"type":18,"tag":89,"props":211,"children":212},{"__ignoreMap":7},[213],{"type":24,"value":208},{"type":18,"tag":26,"props":215,"children":216},{},[217],{"type":24,"value":218},"A message is displayed, indicating that a dependency is 
missing.",{"type":18,"tag":26,"props":220,"children":221},{},[222],{"type":18,"tag":128,"props":223,"children":225},{"alt":7,"src":224},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/a0b1992c533745ee83d2463d0102b412.png",[],{"type":18,"tag":26,"props":227,"children":228},{},[229],{"type":18,"tag":128,"props":230,"children":232},{"alt":7,"src":231},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/46ee409930f14f8399fddcb00199b937.png",[],{"type":18,"tag":26,"props":234,"children":235},{},[236],{"type":24,"value":237},"The visualization result is as follows:",{"type":18,"tag":26,"props":239,"children":240},{},[241],{"type":18,"tag":128,"props":242,"children":244},{"alt":7,"src":243},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/bc24d42bdba54716a2143db30ecd2240.png",[],{"type":18,"tag":26,"props":246,"children":247},{},[248,250,255,257,262],{"type":24,"value":249},"Train the model by executing ",{"type":18,"tag":30,"props":251,"children":252},{},[253],{"type":24,"value":254},"train.py",{"type":24,"value":256}," in the ",{"type":18,"tag":30,"props":258,"children":259},{},[260],{"type":24,"value":261},"src",{"type":24,"value":263}," directory.",{"type":18,"tag":26,"props":265,"children":266},{},[267],{"type":18,"tag":128,"props":268,"children":270},{"alt":7,"src":269},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/f2d85c1295e646a99f2bfe008d1ac8fa.png",[],{"type":18,"tag":26,"props":272,"children":273},{},[274],{"type":24,"value":275},"A message is displayed, indicating that another dependency is missing.",{"type":18,"tag":26,"props":277,"children":278},{},[279],{"type":18,"tag":128,"props":280,"children":282},{"alt":7,"src":281},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/6128e327da1b49ba964fc7cb0a93ad9a.png",[],{"type":18,"tag":26,"props":284,"children":285},{},[286],{"type":24,"value":287},"Install the dependency and 
continue with the training.",{"type":18,"tag":26,"props":289,"children":290},{},[291],{"type":18,"tag":128,"props":292,"children":294},{"alt":7,"src":293},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/1106c420003343319477b996e9a157b0.png",[],{"type":18,"tag":26,"props":296,"children":297},{},[298],{"type":24,"value":299},"An error is reported again.",{"type":18,"tag":26,"props":301,"children":302},{},[303,305,310,312,317],{"type":24,"value":304},"Checking the code, we find that the value of ",{"type":18,"tag":30,"props":306,"children":307},{},[308],{"type":24,"value":309},"device_id",{"type":24,"value":311}," is set to 1. Since there is only one GPU card, change the value to ",{"type":18,"tag":30,"props":313,"children":314},{},[315],{"type":24,"value":316},"0",{"type":24,"value":318},".",{"type":18,"tag":26,"props":320,"children":321},{},[322],{"type":18,"tag":128,"props":323,"children":325},{"alt":7,"src":324},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/60a664372e60427d88531073a1c823d3.png",[],{"type":18,"tag":26,"props":327,"children":328},{},[329],{"type":24,"value":330},"Let's proceed.",{"type":18,"tag":26,"props":332,"children":333},{},[334],{"type":18,"tag":128,"props":335,"children":337},{"alt":7,"src":336},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/fc6a4d65debd4798abd161012f670b92.png",[],{"type":18,"tag":26,"props":339,"children":340},{},[341,343,348,350,355],{"type":24,"value":342},"Our GPU has only 12 GB of memory, which is insufficient when the batch size is set to ",{"type":18,"tag":30,"props":344,"children":345},{},[346],{"type":24,"value":347},"128",{"type":24,"value":349},". 
So reduce the value to ",{"type":18,"tag":30,"props":351,"children":352},{},[353],{"type":24,"value":354},"64",{"type":24,"value":318},{"type":18,"tag":26,"props":357,"children":358},{},[359],{"type":18,"tag":128,"props":360,"children":362},{"alt":7,"src":361},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/2db2755554984ad2883edba8304a9a05.png",[],{"type":18,"tag":26,"props":364,"children":365},{},[366],{"type":24,"value":367},"Check the GPU status. The power draw is 200 W, and the memory usage is approximately 12 GB.",{"type":18,"tag":26,"props":369,"children":370},{},[371],{"type":18,"tag":128,"props":372,"children":374},{"alt":7,"src":373},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/d14ae3c7438848818eef39860ff2ca79.png",[],{"type":18,"tag":26,"props":376,"children":377},{},[378],{"type":24,"value":379},"Constrained to a single device, the training still has not finished after about 12 hours.",{"type":18,"tag":26,"props":381,"children":382},{},[383],{"type":18,"tag":128,"props":384,"children":386},{"alt":7,"src":385},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/470132e85703460bad3153752de68d86.png",[],{"type":18,"tag":26,"props":388,"children":389},{},[390],{"type":18,"tag":128,"props":391,"children":393},{"alt":7,"src":392},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/01bea0e421b94f32af44e2b079c9030c.png",[],{"type":18,"tag":26,"props":395,"children":396},{},[397],{"type":24,"value":398},"The saved model files total 22 GB.",{"type":18,"tag":26,"props":400,"children":401},{},[402],{"type":18,"tag":128,"props":403,"children":405},{"alt":7,"src":404},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/a33521a33b914058a6e2f5c28479c5f3.png",[],{"type":18,"tag":26,"props":407,"children":408},{},[409],{"type":24,"value":410},"Stop training and select the latest weight for 
inference.",{"type":18,"tag":26,"props":412,"children":413},{},[414],{"type":18,"tag":128,"props":415,"children":417},{"alt":7,"src":416},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/7c8e8ee6ffb74bfe860255df73802236.png",[],{"type":18,"tag":26,"props":419,"children":420},{},[421,423,428],{"type":24,"value":422},"Execute ",{"type":18,"tag":30,"props":424,"children":425},{},[426],{"type":24,"value":427},"infer.py",{"type":24,"value":318},{"type":18,"tag":26,"props":430,"children":431},{},[432,434,439],{"type":24,"value":433},"During dataset reading, it is found that there is no folder in dataset ",{"type":18,"tag":30,"props":435,"children":436},{},[437],{"type":24,"value":438},"valtest",{"type":24,"value":440},". So create a folder.",{"type":18,"tag":26,"props":442,"children":443},{},[444],{"type":18,"tag":128,"props":445,"children":447},{"alt":7,"src":446},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/8d01075b23da45f1a4e816c9810338a2.png",[],{"type":18,"tag":26,"props":449,"children":450},{},[451,453,457,459,463],{"type":24,"value":452},"Change the value of ",{"type":18,"tag":30,"props":454,"children":455},{},[456],{"type":24,"value":309},{"type":24,"value":458}," to ",{"type":18,"tag":30,"props":460,"children":461},{},[462],{"type":24,"value":316},{"type":24,"value":464}," as previously.",{"type":18,"tag":26,"props":466,"children":467},{},[468],{"type":18,"tag":128,"props":469,"children":471},{"alt":7,"src":470},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/31532d7e06ff4fbe9477044f2f182aa5.png",[],{"type":18,"tag":26,"props":473,"children":474},{},[475],{"type":24,"value":476},"The middle part shows the inference result of the pretrained MindSpore model, and the rightmost part shows the inference result of the trained model in this 
practice.",{"type":18,"tag":26,"props":478,"children":479},{},[480],{"type":18,"tag":128,"props":481,"children":483},{"alt":7,"src":482},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/a86d1510c157469ca4c4169f294ebad0.png",[],{"type":18,"tag":26,"props":485,"children":486},{},[487],{"type":18,"tag":128,"props":488,"children":490},{"alt":7,"src":489},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/97d4e0f67c1e45e59eed004337cffdf3.png",[],{"type":18,"tag":26,"props":492,"children":493},{},[494],{"type":24,"value":495},"Uh, it seems that the input should be grayscale images. Let's get it correct and check the result.",{"type":18,"tag":26,"props":497,"children":498},{},[499],{"type":18,"tag":128,"props":500,"children":502},{"alt":7,"src":501},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/bf8531df27c34637b3a868b9f2e30eb2.png",[],{"type":18,"tag":26,"props":504,"children":505},{},[506],{"type":18,"tag":128,"props":507,"children":509},{"alt":7,"src":508},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/06/05/7deb091217774738954e82804467bcd3.png",[],{"type":18,"tag":26,"props":511,"children":512},{},[513],{"type":24,"value":514},"Well, as is seen, there's still large room for improvement in our model accuracy due to a lot of reasons including the limited training time. Stay tuned for more inspiring practices.",{"title":7,"searchDepth":516,"depth":516,"links":517},4,[],"markdown","content:technology-blogs:en:2529.md","content","technology-blogs/en/2529.md","technology-blogs/en/2529","md",1776506106297]