[{"data":1,"prerenderedAt":341},["ShallowReactive",2],{"content-query-D7oltSYxZb":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"date":10,"cover":11,"type":12,"body":13,"_type":335,"_id":336,"_source":337,"_file":338,"_stem":339,"_extension":340},"/technology-blogs/en/3135","en",false,"","Implementation of MindSpore-Powered Models in CV (2) — Fashion-MNIST Image Classification Experiment: Functional Programming","This experiment outlines the process of building a feedforward neural network (FNN) using MindSpore and leveraging the Fashion-MNIST dataset for model training and testing.","2024-04-12","https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2024/05/31/e66810ec74554173a77f5c5284cbe56d.png","technology-blogs",{"type":14,"children":15,"toc":332},"root",[16,24,29,38,43,48,53,61,66,71,79,87,100,108,116,121,126,131,162,172,180,185,190,195,200,205,213,218,226,238,243,251,256,264,272,277,285,290,298,303,311,319,324],{"type":17,"tag":18,"props":19,"children":21},"element","h1",{"id":20},"implementation-of-mindspore-powered-models-in-cv-2-fashion-mnist-image-classification-experiment-functional-programming",[22],{"type":23,"value":8},"text",{"type":17,"tag":25,"props":26,"children":27},"p",{},[28],{"type":23,"value":9},{"type":17,"tag":25,"props":30,"children":31},{},[32],{"type":17,"tag":33,"props":34,"children":35},"strong",{},[36],{"type":23,"value":37},"1. 
Objective",{"type":17,"tag":25,"props":39,"children":40},{},[41],{"type":23,"value":42},"Master the construction of a basic FNN using MindSpore.",{"type":17,"tag":25,"props":44,"children":45},{},[46],{"type":23,"value":47},"Learn how to use MindSpore to train models for simple image classification tasks.",{"type":17,"tag":25,"props":49,"children":50},{},[51],{"type":23,"value":52},"Learn how to use MindSpore for testing and prediction in simple image classification tasks.",{"type":17,"tag":25,"props":54,"children":55},{},[56],{"type":17,"tag":33,"props":57,"children":58},{},[59],{"type":23,"value":60},"2. FNN Principles",{"type":17,"tag":25,"props":62,"children":63},{},[64],{"type":23,"value":65},"An FNN is a type of artificial neural network. It adopts a unidirectional multi-layer structure, in which each layer contains several neurons. In this network, every neuron is fed signals from the neurons in the preceding layer and generates output for the next layer. Layer 0 is called the input layer, the last layer is called the output layer, and the other intermediate layers are called hidden layers. The network incorporates one or more hidden layers.",{"type":17,"tag":25,"props":67,"children":68},{},[69],{"type":23,"value":70},"There is no feedback in the network: signals are transmitted unidirectionally from the input layer to the output layer, so the network can be depicted as a directed acyclic graph (DAG).",{"type":17,"tag":25,"props":72,"children":73},{},[74],{"type":17,"tag":75,"props":76,"children":78},"img",{"alt":7,"src":77},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2024/05/31/ad572de27bf94527ae1381fc00589c92.png",[],{"type":17,"tag":25,"props":80,"children":81},{},[82],{"type":17,"tag":33,"props":83,"children":84},{},[85],{"type":23,"value":86},"3. Experiment Environment",{"type":17,"tag":25,"props":88,"children":89},{},[90,92,98],{"type":23,"value":91},"MindSpore 2.0 or later. The MindSpore version is updated periodically, and this guide will also be updated periodically to align with the version. This experiment can be conducted on the win_x86 and Linux OSs, and can run on CPUs, GPUs, and Ascend AI processors. If you run this experiment on a local computer, refer to the ",{"type":17,"tag":93,"props":94,"children":95},"em",{},[96],{"type":23,"value":97},"MindSpore Lab Environment Setup Manual",{"type":23,"value":99}," to install MindSpore on the computer.",
{"type":17,"tag":25,"props":101,"children":102},{},[103],{"type":17,"tag":33,"props":104,"children":105},{},[106],{"type":23,"value":107},"4. Data Processing",{"type":17,"tag":25,"props":109,"children":110},{},[111],{"type":17,"tag":33,"props":112,"children":113},{},[114],{"type":23,"value":115},"4.1 Dataset Preparation",{"type":17,"tag":25,"props":117,"children":118},{},[119],{"type":23,"value":120},"Fashion-MNIST is an image dataset that serves as a replacement for the MNIST handwritten digit dataset. It is provided by the research department of Zalando, a German fashion technology company. It covers a total of 70,000 front images of products from 10 classes. The size, format, and training/test dataset division of Fashion-MNIST are the same as those of the original MNIST dataset: the training/test split is 60,000/10,000, and the images are 28 x 28 x 1 grayscale images.",{"type":17,"tag":25,"props":122,"children":123},{},[124],{"type":23,"value":125},"Here is an introduction to the classic MNIST (handwritten digits) dataset. This dataset contains a large number of handwritten digit images. For over a decade, researchers from the fields of machine learning, computer vision, artificial intelligence, and deep learning have used it as one of the benchmarks for evaluating algorithms, and it has become one of the must-test datasets for algorithm developers. However, it is too simple: many deep learning algorithms have achieved an accuracy of 99.6% on its test dataset.",{"type":17,"tag":25,"props":127,"children":128},{},[129],{"type":23,"value":130},"Download the following four files from the Fashion-MNIST repository on GitHub to the local computer and decompress them:",{"type":17,"tag":25,"props":132,"children":133},{},[134,139,141,146,148,153,155,160],{"type":17,"tag":33,"props":135,"children":136},{},[137],{"type":23,"value":138},"train-images-idx3-ubyte",{"type":23,"value":140}," training dataset images (47,042,560 bytes) ",{"type":17,"tag":33,"props":142,"children":143},{},[144],{"type":23,"value":145},"train-labels-idx1-ubyte",{"type":23,"value":147}," training dataset labels (61,440 bytes) ",{"type":17,"tag":33,"props":149,"children":150},{},[151],{"type":23,"value":152},"t10k-images-idx3-ubyte",{"type":23,"value":154}," test dataset images (7,843,840 bytes) ",{"type":17,"tag":33,"props":156,"children":157},{},[158],{"type":23,"value":159},"t10k-labels-idx1-ubyte",{"type":23,"value":161}," test dataset labels (12,288 bytes)",{"type":17,"tag":163,"props":164,"children":166},"pre",{"code":165},"from download import download\n\n# Download the Fashion-MNIST dataset.\nurl = \"https://ascend-professional-construction-dataset.obs.cn-north-4.myhuaweicloud.com:443/deep-learning/Fashion-MNIST.zip\"\npath = download(url, \"./\", kind=\"zip\", replace=True)\n",[167],{"type":17,"tag":168,"props":169,"children":170},"code",{"__ignoreMap":7},[171],{"type":23,"value":165},{"type":17,"tag":25,"props":173,"children":174},{},[175],{"type":17,"tag":33,"props":176,"children":177},{},[178],{"type":23,"value":179},"4.2 Data Loading",{"type":17,"tag":25,"props":181,"children":182},{},[183],{"type":23,"value":184},"Import MindSpore and auxiliary modules, which are described as follows:",{"type":17,"tag":25,"props":186,"children":187},{},[188],{"type":23,"value":189},"MindSpore, used to build neural
networks",{"type":17,"tag":25,"props":191,"children":192},{},[193],{"type":23,"value":194},"NumPy, used to process certain data",{"type":17,"tag":25,"props":196,"children":197},{},[198],{"type":23,"value":199},"Matplotlib, used to draw and display images",{"type":17,"tag":25,"props":201,"children":202},{},[203],{"type":23,"value":204},"struct, used to process binary files",{"type":17,"tag":163,"props":206,"children":208},{"code":207},"import os\nimport struct\nfrom easydict import EasyDict as edict\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nimport mindspore\nimport mindspore.dataset as ds\nimport mindspore.nn as nn\nfrom mindspore.train import Model, Accuracy\nfrom mindspore.train import ModelCheckpoint, CheckpointConfig, LossMonitor\nfrom mindspore import Tensor\n\nmindspore.set_context(mode=mindspore.GRAPH_MODE, device_target='Ascend')\n",[209],{"type":17,"tag":168,"props":210,"children":211},{"__ignoreMap":7},[212],{"type":23,"value":207},{"type":17,"tag":25,"props":214,"children":215},{},[216],{"type":23,"value":217},"Variable definitions:",{"type":17,"tag":163,"props":219,"children":221},{"code":220},"cfg = edict({\n    'train_size': 60000,    # training dataset size\n    'test_size': 10000,     # test dataset size\n    'channel': 1,           # number of image channels\n    'image_height': 28,     # image height\n    'image_width': 28,      # image width\n    'batch_size': 60,       # batch size\n    'num_classes': 10,      # number of classes\n    'lr': 0.001,            # learning rate\n    'epoch_size': 10,       # number of training epochs\n    # Change the path here to the actual path that stores your dataset. Use the train and test folders to store the training dataset and test dataset, respectively.\n    'data_dir_train': os.path.join('./Fashion-MNIST/train/'),\n    'data_dir_test': os.path.join('./Fashion-MNIST/test/'),\n    'save_checkpoint_steps': 1,  # number of steps for saving a model\n    'keep_checkpoint_max': 3,    # maximum number of models that can be saved\n    'output_directory': './model_fashion',         # path for saving models\n    'output_prefix': \"checkpoint_fashion_forward\"  # name of a saved model file\n})\n",[222],{"type":17,"tag":168,"props":223,"children":224},{"__ignoreMap":7},[225],{"type":23,"value":220},{"type":17,"tag":25,"props":227,"children":228},{},[229,231,236],{"type":23,"value":230},"Read and process data. After being read by the data read function ",{"type":17,"tag":33,"props":232,"children":233},{},[234],{"type":23,"value":235},"read_image",{"type":23,"value":237},", the data is in the following format:",{"type":17,"tag":25,"props":239,"children":240},{},[241],{"type":23,"value":242},"Binary format of the images for training:",{"type":17,"tag":163,"props":244,"children":246},{"code":245},"[offset] [type]          [value]          [description]\n0000     32 bit integer  0x00000803(2051) magic number\n0004     32 bit integer  60000            number of images\n0008     32 bit integer  28               number of rows\n0012     32 bit integer  28               number of columns\n0016     unsigned byte   ??               pixel\n0017     unsigned byte   ??               pixel\n........\nxxxx     unsigned byte   ??               pixel\nLabel format:\n[offset] [type]          [value]          [description]\n0000     32 bit integer  0x00000801(2049) magic number (MSB first)\n0004     32 bit integer  60000            number of items\n0008     unsigned byte   ??               label\n0009     unsigned byte   ??               label\n........\nxxxx     unsigned byte   ??               label\nThe label values are 0 to 9.\nThe code for reading data is as follows:\ndef read_image(file_name):\n    file_handle = open(file_name, \"rb\")  # Open the file in binary mode.\n    file_content = file_handle.read()    # Read data to the buffer.\n    file_handle.close()\n    head = struct.unpack_from('>IIII', file_content, 0)  # Take the first four integers and return a tuple.\n    offset = struct.calcsize('>IIII')\n    imgNum = head[1]   # number of images\n    height = head[2]   # number of rows (image height)\n    width = head[3]    # number of columns (image width)\n    bits = imgNum * width * height      # The data has 60,000 x 28 x 28 pixels.\n    bitsString = '>' + str(bits) + 'B'  # fmt format: '>47040000B'\n    imgs = struct.unpack_from(bitsString, file_content, offset)    # Take data and return a tuple.\n    imgs_array = np.array(imgs).reshape((imgNum, width * height))  # Reshape the read data into a two-dimensional array of [number of images, image pixels].\n    return imgs_array\n\n\ndef read_label(file_name):\n    file_handle = open(file_name, \"rb\")  # Open the file in binary mode.\n    file_content = file_handle.read()    # Read data to the buffer.\n    file_handle.close()\n    head = struct.unpack_from('>II', file_content, 0)  # Take the first two integers and return a tuple.\n    offset = struct.calcsize('>II')\n    labelNum = head[1]  # number of labels\n    bitsString = '>' + str(labelNum) + 'B'  # fmt format, e.g. '>60000B'\n    label = struct.unpack_from(bitsString, file_content, offset)  # Take data and return a tuple.\n    return np.array(label)\n\n\ndef get_data():\n    # Obtain files.\n    train_image = os.path.join(cfg.data_dir_train, 'train-images-idx3-ubyte')\n    test_image = os.path.join(cfg.data_dir_test, \"t10k-images-idx3-ubyte\")\n    train_label = os.path.join(cfg.data_dir_train, \"train-labels-idx1-ubyte\")\n    test_label = os.path.join(cfg.data_dir_test, \"t10k-labels-idx1-ubyte\")\n    # Read data.\n    train_x = read_image(train_image)\n    test_x = read_image(test_image)\n    train_y = read_label(train_label)\n    test_y = read_label(test_label)\n    return train_x, train_y, test_x, test_y\n",[247],{"type":17,"tag":168,"props":248,"children":249},{"__ignoreMap":7},[250],{"type":23,"value":245},{"type":17,"tag":25,"props":252,"children":253},{},[254],{"type":23,"value":255},"Data preprocessing and result image display",{"type":17,"tag":163,"props":257,"children":259},{"code":258},"train_x, train_y, test_x, test_y = get_data()\n# The first dimension is the number of samples, the second dimension is the number of image channels, and the third and fourth dimensions are the image height and width.\ntrain_x = train_x.reshape(-1, 1, cfg.image_height, cfg.image_width)\ntest_x = test_x.reshape(-1, 1, cfg.image_height, cfg.image_width)\n# Normalize data to values between 0 and 1.\ntrain_x = train_x / 255.0\ntest_x = test_x / 255.0\n# Modify the data format.\ntrain_x = train_x.astype('float32')\ntest_x = test_x.astype('float32')\ntrain_y = train_y.astype('int32')\ntest_y = test_y.astype('int32')\nprint('number of samples in the training dataset: ', train_x.shape[0])\nprint('number of samples in the test dataset:', test_y.shape[0])\nprint('number of channels/image length/width: ', train_x.shape[1:])\n# There are 10 classes, expressed in numbers from 0 to 9.\nprint('label style of an image: ', train_y[0])  \n\nplt.figure()\nplt.imshow(train_x[0,0,...])\nplt.colorbar()\nplt.grid(False)\nplt.show()\n# Use the MindSpore GeneratorDataset API to convert data of the numpy.ndarray type into the dataset type.\n# Convert the data to the dataset type.\nXY_train = list(zip(train_x, train_y))\n# Convert the data and label to the dataset type, and set the data to x and label to y.\nds_train = ds.GeneratorDataset(XY_train, ['x', 'y'])\nds_train = ds_train.shuffle(buffer_size=cfg.train_size).batch(cfg.batch_size, drop_remainder=True)\nXY_test = list(zip(test_x, test_y))\nds_test = ds.GeneratorDataset(XY_test, ['x', 'y'])\nds_test = ds_test.shuffle(buffer_size=cfg.test_size).batch(cfg.batch_size,
drop_remainder=True)\n",[260],{"type":17,"tag":168,"props":261,"children":262},{"__ignoreMap":7},[263],{"type":23,"value":258},{"type":17,"tag":25,"props":265,"children":266},{},[267],{"type":17,"tag":33,"props":268,"children":269},{},[270],{"type":23,"value":271},"5. FNN Construction",{"type":17,"tag":25,"props":273,"children":274},{},[275],{"type":23,"value":276},"The FNN is the simplest neural network architecture: neurons are organized into layers, with each layer consisting of multiple neurons, and each neuron is connected only to neurons in the previous layer. It receives the output of the previous layer as its input and passes its computation result to the next layer. The FNN is currently one of the most widely used and rapidly developing artificial neural networks. Layer 0 is called the input layer, the last layer is called the output layer, and the other intermediate layers are called hidden layers. The network incorporates one or more hidden layers, which are formed by stacking fully connected layers.",{"type":17,"tag":163,"props":278,"children":280},{"code":279},"# Define an FNN.\nclass Forward_fashion(nn.Cell):\n    def __init__(self, num_class=10):  # a total of ten classes, with one image channel.\n        super(Forward_fashion, self).__init__()\n        self.num_class = num_class\n        self.flatten = nn.Flatten()\n        self.fc1 = nn.Dense(cfg.channel * cfg.image_height * cfg.image_width, 128)\n        self.relu = nn.ReLU()\n        self.fc2 = nn.Dense(128, self.num_class)\n\n    def construct(self, x):\n        x = self.flatten(x)\n        x = self.fc1(x)\n        x = self.relu(x)\n        x = self.fc2(x)\n        return x\n",[281],{"type":17,"tag":168,"props":282,"children":283},{"__ignoreMap":7},[284],{"type":23,"value":279},{"type":17,"tag":25,"props":286,"children":287},{},[288],{"type":23,"value":289},"6. Model Training",{"type":17,"tag":163,"props":291,"children":293},{"code":292},"# Build a network.\nnetwork = Forward_fashion(cfg.num_classes)\n# Define the loss function and optimizer of the model.\nnet_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\nnet_opt = nn.Adam(network.trainable_params(), cfg.lr)\n# Encapsulate the network, loss function, and optimizer into a Model (used later for evaluation and prediction).\nmodel = Model(network, loss_fn=net_loss, optimizer=net_opt, metrics={\"acc\"})\n\n# Define the train_loop function for training.\ndef train_loop(model, dataset, loss_fn, optimizer):\n    # Define the forward propagation function.\n    def forward_fn(data, label):\n        logits = model(data)\n        loss = loss_fn(logits, label)\n        return loss\n\n    # Define the differentiation function. Use mindspore.value_and_grad to obtain the grad_fn differentiation function, and output the loss and gradient.\n    # Since taking derivatives with respect to model parameters is involved, set grad_position to None and pass trainable parameters.\n    grad_fn = mindspore.value_and_grad(forward_fn, None, optimizer.parameters)\n\n    # Define the one-step training function.\n    def train_step(data, label):\n        loss, grads = grad_fn(data, label)\n        optimizer(grads)\n        return loss\n\n    size = dataset.get_dataset_size()\n    model.set_train()\n    for batch, (data, label) in enumerate(dataset.create_tuple_iterator()):\n        loss = train_step(data, label)\n\n        if batch % 100 == 0:\n            loss, current = loss.asnumpy(), batch\n            print(f\"loss: {loss:>7f}  [{current:>3d}/{size:>3d}]\")\n\n# Define the test_loop function for testing.\ndef test_loop(model, dataset, loss_fn):\n    num_batches = dataset.get_dataset_size()\n    model.set_train(False)\n    total, test_loss, correct = 0, 0, 0\n    for data, label in dataset.create_tuple_iterator():\n        pred = model(data)\n        total += len(data)\n        test_loss += loss_fn(pred, label).asnumpy()\n        correct += (pred.argmax(1) ==
label).asnumpy().sum()\n    test_loss /= num_batches\n    correct /= total\n    print(f\"Test: \\n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \\n\")\n\nepochs = cfg.epoch_size\nfor t in range(epochs):\n    print(f\"Epoch {t+1}\\n-------------------------------\")\n    train_loop(network, ds_train, net_loss, net_opt)\n    test_loop(network, ds_test, net_loss)\nprint(\"Done!\")\n",[294],{"type":17,"tag":168,"props":295,"children":296},{"__ignoreMap":7},[297],{"type":23,"value":292},{"type":17,"tag":25,"props":299,"children":300},{},[301],{"type":23,"value":302},"Use the Fashion-MNIST dataset to train the FNN model defined above.",{"type":17,"tag":25,"props":304,"children":305},{},[306],{"type":17,"tag":33,"props":307,"children":308},{},[309],{"type":23,"value":310},"7. Model Prediction and Visualization",{"type":17,"tag":163,"props":312,"children":314},{"code":313},"# Use the test dataset to evaluate the model and print the overall accuracy.\nmetric = model.eval(ds_test)\nprint(metric)\n\nclass_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',\n               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']\n\n# Take a group of samples from the test dataset and input them to the model for prediction.\ntest_ = ds_test.create_dict_iterator().__next__()\n# Use the key value to select samples.\ntest = Tensor(test_['x'], mindspore.float32)\npredictions = model.predict(test)\nsoftmax = nn.Softmax()\npredictions = softmax(predictions)\npredictions = predictions.asnumpy()\ntrue_label = test_['y'].asnumpy()\n\nfor i in range(15):\n    p_np = predictions[i, :]\n    pre_label = np.argmax(p_np)\n    print(str(i) + ' sample prediction result: ', class_names[pre_label], '   actual result: ', class_names[true_label[i]])\n",[315],{"type":17,"tag":168,"props":316,"children":317},{"__ignoreMap":7},[318],{"type":23,"value":313},{"type":17,"tag":25,"props":320,"children":321},{},[322],{"type":23,"value":323},"Visualize the prediction results. The visualization function takes the prediction result sequence, the real label sequence, and the image sequence as input, and colors each label according to the prediction: blue for a correct prediction and red for an incorrect one. It then predicts 15 labeled images and displays the predicted class probabilities in a bar chart, again using blue for correct predictions and red for incorrect ones.",{"type":17,"tag":163,"props":325,"children":327},{"code":326},"# -------------------Define the visualization function.--------------------------------\n# Input the prediction result sequence, real label sequence, and image sequence.\n# The goal is to display the labels in red or blue based on the predicted values. Correct: blue label; incorrect: red label.\ndef plot_image(predicted_label, true_label, img):\n    plt.grid(False)\n    plt.xticks([])\n    plt.yticks([])\n    # Display the corresponding image.\n    plt.imshow(img, cmap=plt.cm.binary)\n    # Color the label according to the prediction result: blue for correct predictions and red for incorrect predictions.\n    if predicted_label == true_label:\n        color = 'blue'\n    else:\n        color = 'red'\n    # Display the predicted label with the true label in parentheses.\n    plt.xlabel('{} ({})'.format(class_names[predicted_label],\n                                    class_names[true_label]), color=color)\n# Display the prediction results in a bar chart, with blue representing correct predictions and red representing incorrect predictions.\ndef plot_value_array(predicted_label, true_label, predicted_array):\n    plt.grid(False)\n    plt.xticks([])\n    plt.yticks([])\n    this_plot = plt.bar(range(10), predicted_array, color='#777777')\n    plt.ylim([0, 1])\n    this_plot[predicted_label].set_color('red')\n    this_plot[true_label].set_color('blue')\n# Predict 15 images with labels, and display the results.\nnum_rows = 5\nnum_cols = 3\nnum_images = num_rows * num_cols\nplt.figure(figsize=(2 * 2 * num_cols, 2 * num_rows))\n\nfor i in range(num_images):\n    plt.subplot(num_rows, 2 * num_cols, 2 * i + 1)\n    pred_np_ = predictions[i, :]\n    predicted_label = np.argmax(pred_np_)\n    image_single = test_['x'][i, 0, ...].asnumpy()\n    plot_image(predicted_label, true_label[i], image_single)\n    plt.subplot(num_rows, 2 * num_cols, 2 * i + 2)\n    plot_value_array(predicted_label, true_label[i], pred_np_)\nplt.show()\n",[328],{"type":17,"tag":168,"props":329,"children":330},{"__ignoreMap":7},[331],{"type":23,"value":326},{"title":7,"searchDepth":333,"depth":333,"links":334},4,[],"markdown","content:technology-blogs:en:3135.md","content","technology-blogs/en/3135.md","technology-blogs/en/3135","md",1776506110670]