{ "cells": [ { "cell_type": "markdown", "source": [ "# Running Methods\n", "\n", "`Ascend` `GPU` `Model Running`\n", "\n", "[![Run Online](https://gitee.com/mindspore/docs/raw/r1.6/resource/_static/logo_modelarts.png)](https://authoring-modelarts-cnnorth4.huaweicloud.com/console/lab?share-url-b64=aHR0cHM6Ly9taW5kc3BvcmUtd2Vic2l0ZS5vYnMuY24tbm9ydGgtNC5teWh1YXdlaWNsb3VkLmNvbS9ub3RlYm9vay9tb2RlbGFydHMvcHJvZ3JhbW1pbmdfZ3VpZGUvbWluZHNwb3JlX3J1bi5pcHluYg==&imageid=65f636a0-56cf-49df-b941-7d2a07ba8c8c) [![Download Notebook](https://gitee.com/mindspore/docs/raw/r1.6/resource/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.6/programming_guide/zh_cn/mindspore_run.ipynb) [![Download Sample Code](https://gitee.com/mindspore/docs/raw/r1.6/resource/_static/logo_download_code.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.6/programming_guide/zh_cn/mindspore_run.py) [![View Source](https://gitee.com/mindspore/docs/raw/r1.6/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.6/docs/mindspore/programming_guide/source_zh_cn/run.ipynb)" ], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Overview\n", "\n", "There are three main execution methods: executing a single operator, executing an ordinary function, and executing a network training model.\n", "\n", "> The examples in this document apply to GPU and Ascend environments.\n", "\n", "## Executing a Single Operator\n", "\n", "Execute a single operator and print the result.\n", "\n", "A code example is as follows:" ], "metadata": {} }, { "cell_type": "code", "execution_count": 1, "source": [ "import numpy as np\n", "import mindspore.nn as nn\n", "from mindspore import context, Tensor\n", "\n", "context.set_context(mode=context.GRAPH_MODE, device_target=\"GPU\")\n", "\n", "conv = nn.Conv2d(3, 4, 3, bias_init='zeros')\n", "input_data = Tensor(np.ones([1, 3, 5, 5]).astype(np.float32))\n", "output = conv(input_data)\n", "print(output.asnumpy())" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "[[[[ 0.01647821 0.05310077 0.05310077 0.05310077 0.05118286]\n", " [ 0.03007141 0.0657572 0.0657572 0.0657572 0.04350833]\n", " [ 0.03007141 0.0657572 0.0657572 0.0657572 
0.04350833]\n", " [ 0.03007141 0.0657572 0.0657572 0.0657572 0.04350833]\n", " [ 0.01847598 0.04713529 0.04713529 0.04713529 0.03720935]]\n", "\n", " [[-0.03362034 -0.06124294 -0.06124294 -0.06124294 -0.04334928]\n", " [-0.02676596 -0.08040315 -0.08040315 -0.08040315 -0.06846539]\n", " [-0.02676596 -0.08040315 -0.08040315 -0.08040315 -0.06846539]\n", " [-0.02676596 -0.08040315 -0.08040315 -0.08040315 -0.06846539]\n", " [-0.00557975 -0.06808633 -0.06808633 -0.06808633 -0.08389233]]\n", "\n", " [[-0.01602227 0.02266152 0.02266152 0.02266152 0.06030601]\n", " [-0.06764769 -0.02966945 -0.02966945 -0.02966945 0.04861854]\n", " [-0.06764769 -0.02966945 -0.02966945 -0.02966945 0.04861854]\n", " [-0.06764769 -0.02966945 -0.02966945 -0.02966945 0.04861854]\n", " [-0.06528193 -0.03500666 -0.03500666 -0.03500666 0.02858584]]\n", "\n", " [[-0.03102187 -0.03846825 -0.03846825 -0.03846825 -0.00858424]\n", " [-0.04270145 -0.070785 -0.070785 -0.070785 -0.05362675]\n", " [-0.04270145 -0.070785 -0.070785 -0.070785 -0.05362675]\n", " [-0.04270145 -0.070785 -0.070785 -0.070785 -0.05362675]\n", " [-0.01230605 -0.04999261 -0.04999261 -0.04999261 -0.04718029]]]]\n" ] } ], "metadata": { "scrolled": true } }, { "cell_type": "markdown", "source": [ "> Because weight initialization involves random factors, the actual output may differ; the result above is for reference only." ], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Executing an Ordinary Function\n", "\n", "Combine several operators into a function, execute them through a direct function call, and print the result, as shown in the following example.\n", "\n", "A code example is as follows:" ], "metadata": {} }, { "cell_type": "code", "execution_count": 2, "source": [ "import numpy as np\n", "from mindspore import context, Tensor\n", "import mindspore.ops as ops\n", "\n", "context.set_context(mode=context.GRAPH_MODE, device_target=\"GPU\")\n", "\n", "def add_func(x, y):\n", " z = ops.add(x, y)\n", " z = ops.add(z, x)\n", " return z\n", "\n", "x = Tensor(np.ones([3, 3], dtype=np.float32))\n", "y = Tensor(np.ones([3, 3], dtype=np.float32))\n", "output = add_func(x, y)\n", "print(output.asnumpy())" ], "outputs": [ { "output_type": "stream", "name": 
"stdout", "text": [ "[[3. 3. 3.]\n", " [3. 3. 3.]\n", " [3. 3. 3.]]\n" ] } ], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Executing a Network Model\n", "\n", "MindSpore's [Model interface](https://www.mindspore.cn/docs/api/zh-CN/r1.6/api_python/mindspore/mindspore.Model.html) is a high-level interface for training and validation. It combines layers with training or inference functionality into a single object, and calling the `train`, `eval`, and `predict` interfaces performs training, evaluation, and prediction respectively.\n", "\n", "> MindSpore does not support using multiple threads for training, evaluation, or prediction.\n", "\n", "Users can initialize the Model interface with a network, loss function, optimizer, and so on as needed; they can also configure `amp_level` to enable mixed precision and `metrics` to enable model evaluation.\n", "\n", "> Executing a network model generates a `kernel_meta` directory under the execution directory, where the operator cache files (`.o`, `.info`, and `.json`) produced by network compilation are saved during execution. If the user executes the same network model again, or one with only partial changes, MindSpore automatically reuses the cached operator files in the `kernel_meta` directory, significantly reducing network compilation time and improving execution performance. For details, see [Incremental Operator Build](https://www.mindspore.cn/docs/programming_guide/zh-CN/r1.6/incremental_operator_build.html).\n", "\n", "Before executing the network, run the following sample code to download the dataset and extract it to the specified location." ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "import os\n", "import requests\n", "\n", "requests.packages.urllib3.disable_warnings()\n", "\n", "def download_dataset(dataset_url, path):\n", " filename = dataset_url.split(\"/\")[-1]\n", " save_path = os.path.join(path, filename)\n", " if os.path.exists(save_path):\n", " return\n", " if not os.path.exists(path):\n", " os.makedirs(path)\n", " res = requests.get(dataset_url, stream=True, verify=False)\n", " with open(save_path, \"wb\") as f:\n", " for chunk in res.iter_content(chunk_size=512):\n", " if chunk:\n", " f.write(chunk)\n", " print(\"The {} file is downloaded and saved in the path {} after processing\".format(os.path.basename(dataset_url), path))\n", "\n", "train_path = \"datasets/MNIST_Data/train\"\n", "test_path = \"datasets/MNIST_Data/test\"\n", "\n", "download_dataset(\"https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/train-labels-idx1-ubyte\", train_path)\n", "download_dataset(\"https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/train-images-idx3-ubyte\", train_path)\n", 
"download_dataset(\"https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/t10k-labels-idx1-ubyte\", test_path)\n", "download_dataset(\"https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/t10k-images-idx3-ubyte\", test_path)" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "The directory structure of the extracted dataset files is as follows:\n", "\n", "```text\n", "./datasets/MNIST_Data\n", "├── test\n", "│ ├── t10k-images-idx3-ubyte\n", "│ └── t10k-labels-idx1-ubyte\n", "└── train\n", " ├── train-images-idx3-ubyte\n", " └── train-labels-idx1-ubyte\n", "```" ], "metadata": {} }, { "cell_type": "markdown", "source": [ "### Executing a Training Model\n", "\n", "Training is performed by calling the `train` interface of Model.\n", "\n", "A code example is as follows:" ], "metadata": {} }, { "cell_type": "code", "execution_count": 4, "source": [ "import os\n", "import mindspore.dataset.vision.c_transforms as CV\n", "from mindspore.dataset.vision import Inter\n", "import mindspore.dataset as ds\n", "import mindspore.dataset.transforms.c_transforms as CT\n", "import mindspore.nn as nn\n", "from mindspore import context, Model\n", "from mindspore import dtype as mstype\n", "from mindspore.common.initializer import Normal\n", "from mindspore.train.callback import LossMonitor, ModelCheckpoint, CheckpointConfig\n", "\n", "\n", "def create_dataset(data_path, batch_size=32, repeat_size=1,\n", " num_parallel_workers=1):\n", " \"\"\"\n", " create dataset for train or test\n", " \"\"\"\n", " # define dataset\n", " mnist_ds = ds.MnistDataset(data_path)\n", "\n", " resize_height, resize_width = 32, 32\n", " rescale = 1.0 / 255.0\n", " shift = 0.0\n", " rescale_nml = 1 / 0.3081\n", " shift_nml = -1 * 0.1307 / 0.3081\n", "\n", " # define map operations\n", " resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR) # Bilinear mode\n", " rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)\n", " rescale_op = CV.Rescale(rescale, shift)\n", " hwc2chw_op = CV.HWC2CHW()\n", " type_cast_op = CT.TypeCast(mstype.int32)\n", "\n", " # apply 
map operations on images\n", " mnist_ds = mnist_ds.map(input_columns=\"label\", operations=type_cast_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=resize_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=rescale_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=rescale_nml_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=hwc2chw_op, num_parallel_workers=num_parallel_workers)\n", "\n", " # apply DatasetOps\n", " buffer_size = 10000\n", " mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size) # 10000 as in LeNet train script\n", " mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)\n", " mnist_ds = mnist_ds.repeat(repeat_size)\n", "\n", " return mnist_ds\n", "\n", "\n", "class LeNet5(nn.Cell):\n", " \"\"\"\n", " Lenet network\n", "\n", " Args:\n", " num_class (int): Num classes. Default: 10.\n", " num_channel (int): Num channels. 
Default: 1.\n", "\n", " Returns:\n", " Tensor, output tensor\n", " Examples:\n", " >>> LeNet(num_class=10)\n", "\n", " \"\"\"\n", "\n", " def __init__(self, num_class=10, num_channel=1):\n", " super(LeNet5, self).__init__()\n", " self.conv1 = nn.Conv2d(num_channel, 6, 5, pad_mode='valid')\n", " self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')\n", " self.fc1 = nn.Dense(16 * 5 * 5, 120, weight_init=Normal(0.02))\n", " self.fc2 = nn.Dense(120, 84, weight_init=Normal(0.02))\n", " self.fc3 = nn.Dense(84, num_class, weight_init=Normal(0.02))\n", " self.relu = nn.ReLU()\n", " self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n", " self.flatten = nn.Flatten()\n", "\n", " def construct(self, x):\n", " x = self.max_pool2d(self.relu(self.conv1(x)))\n", " x = self.max_pool2d(self.relu(self.conv2(x)))\n", " x = self.flatten(x)\n", " x = self.relu(self.fc1(x))\n", " x = self.relu(self.fc2(x))\n", " x = self.fc3(x)\n", " return x\n", "\n", "\n", "if __name__ == \"__main__\":\n", " context.set_context(mode=context.GRAPH_MODE, device_target=\"GPU\")\n", "\n", " model_path = \"./models/ckpt/mindspore_run/\"\n", " os.system(\"rm -rf {0}*.ckpt {0}*.meta {0}*.pb\".format(model_path))\n", "\n", " ds_train_path = \"./datasets/MNIST_Data/train/\"\n", " ds_train = create_dataset(ds_train_path, 32)\n", "\n", " network = LeNet5(10)\n", " net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", " net_opt = nn.Momentum(network.trainable_params(), 0.01, 0.9)\n", " config_ck = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=5)\n", " ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_lenet\", directory=model_path, config=config_ck)\n", " model = Model(network, net_loss, net_opt)\n", "\n", " print(\"============== Starting Training ==============\")\n", " model.train(1, ds_train, callbacks=[LossMonitor(375), ckpoint_cb], dataset_sink_mode=True)" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "============== Starting Training 
==============\n", "epoch: 1 step: 375, loss is 2.2898183\n", "epoch: 1 step: 750, loss is 2.2777305\n", "epoch: 1 step: 1125, loss is 0.27802905\n", "epoch: 1 step: 1500, loss is 0.032973606\n", "epoch: 1 step: 1875, loss is 0.06105463\n" ] } ], "metadata": {} }, { "cell_type": "markdown", "source": [ "> For how to obtain the MNIST dataset used in this example, see the dataset download section of [Quick Start](https://www.mindspore.cn/tutorials/zh-CN/r1.6/quick_start.html); the same applies below.\n", ">\n", "> For debugging in PyNative mode, including executing single operators, ordinary functions, and network training models, see [Debugging in PyNative Mode](https://www.mindspore.cn/docs/programming_guide/zh-CN/r1.6/debug_in_pynative_mode.html).\n", ">\n", "> To freely control the number of loop iterations, dataset traversal, and so on, see the custom training section of [Building Training and Evaluation Networks](https://www.mindspore.cn/docs/programming_guide/zh-CN/r1.6/train_and_eval.html)." ], "metadata": {} }, { "cell_type": "markdown", "source": [ "### Executing an Inference Model\n", "\n", "Inference is performed by calling the `eval` interface of Model. To make it easier to evaluate model quality, metrics can be set when the Model interface is initialized.\n", "\n", "A metric is an indicator for evaluating model quality. Common metrics include Accuracy, Fbeta, Precision, Recall, and TopKCategoricalAccuracy. Because a single metric usually cannot evaluate a model comprehensively, multiple metrics are generally combined to evaluate the model together.\n", "\n", "Commonly used built-in metrics:\n", "\n", "- `Accuracy`: an indicator for evaluating classification models. Informally, accuracy is the proportion of predictions that the model gets right. Formula:\n", "\n", " $$Accuracy = (TP+TN)/(TP+TN+FP+FN)$$\n", "\n", "- `Precision`: among the samples identified as positive, the proportion that are actually positive. Formula:\n", "\n", " $$Precision = TP/(TP+FP)$$\n", "\n", "- `Recall`: among all positive samples, the proportion correctly identified as positive. Formula:\n", "\n", " $$Recall = TP/(TP+FN)$$\n", "\n", "- `Fbeta`: the weighted harmonic mean of precision and recall. Formula:\n", "\n", " $$F_\beta = (1 + \beta^2) \cdot \frac{precision \cdot recall}{(\beta^2 \cdot precision) + recall}$$\n", "\n", "- `TopKCategoricalAccuracy`: computes the top-K classification accuracy.\n", "\n", "A code example is as follows:" ], "metadata": {} }, { "cell_type": "code", "execution_count": 5, "source": [ "import mindspore.dataset as ds\n", "import mindspore.dataset.transforms.c_transforms as CT\n", "import mindspore.dataset.vision.c_transforms as CV\n", "import mindspore.nn as nn\n", "from mindspore import context, Model, load_checkpoint, load_param_into_net\n", "from mindspore import dtype as mstype\n", "from 
mindspore.common.initializer import Normal\n", "from mindspore.dataset.vision import Inter\n", "from mindspore.nn import Accuracy, Precision\n", "\n", "\n", "class LeNet5(nn.Cell):\n", " \"\"\"\n", " Lenet network\n", "\n", " Args:\n", " num_class (int): Num classes. Default: 10.\n", " num_channel (int): Num channels. Default: 1.\n", "\n", " Returns:\n", " Tensor, output tensor\n", " Examples:\n", " >>> LeNet(num_class=10)\n", "\n", " \"\"\"\n", "\n", " def __init__(self, num_class=10, num_channel=1):\n", " super(LeNet5, self).__init__()\n", " self.conv1 = nn.Conv2d(num_channel, 6, 5, pad_mode='valid')\n", " self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')\n", " self.fc1 = nn.Dense(16 * 5 * 5, 120, weight_init=Normal(0.02))\n", " self.fc2 = nn.Dense(120, 84, weight_init=Normal(0.02))\n", " self.fc3 = nn.Dense(84, num_class, weight_init=Normal(0.02))\n", " self.relu = nn.ReLU()\n", " self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n", " self.flatten = nn.Flatten()\n", "\n", " def construct(self, x):\n", " x = self.max_pool2d(self.relu(self.conv1(x)))\n", " x = self.max_pool2d(self.relu(self.conv2(x)))\n", " x = self.flatten(x)\n", " x = self.relu(self.fc1(x))\n", " x = self.relu(self.fc2(x))\n", " x = self.fc3(x)\n", " return x\n", "\n", "\n", "def create_dataset(data_path, batch_size=32, repeat_size=1,\n", " num_parallel_workers=1):\n", " \"\"\"\n", " create dataset for train or test\n", " \"\"\"\n", " # define dataset\n", " mnist_ds = ds.MnistDataset(data_path)\n", "\n", " resize_height, resize_width = 32, 32\n", " rescale = 1.0 / 255.0\n", " shift = 0.0\n", " rescale_nml = 1 / 0.3081\n", " shift_nml = -1 * 0.1307 / 0.3081\n", "\n", " # define map operations\n", " resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR) # Bilinear mode\n", " rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)\n", " rescale_op = CV.Rescale(rescale, shift)\n", " hwc2chw_op = CV.HWC2CHW()\n", " type_cast_op = CT.TypeCast(mstype.int32)\n", "\n", " # 
apply map operations on images\n", " mnist_ds = mnist_ds.map(input_columns=\"label\", operations=type_cast_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=resize_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=rescale_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=rescale_nml_op, num_parallel_workers=num_parallel_workers)\n", " mnist_ds = mnist_ds.map(input_columns=\"image\", operations=hwc2chw_op, num_parallel_workers=num_parallel_workers)\n", "\n", " # apply DatasetOps\n", " buffer_size = 10000\n", " mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size) # 10000 as in LeNet train script\n", " mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)\n", " mnist_ds = mnist_ds.repeat(repeat_size)\n", "\n", " return mnist_ds\n", "\n", "\n", "if __name__ == \"__main__\":\n", " context.set_context(mode=context.GRAPH_MODE, device_target=\"GPU\")\n", "\n", " model_path = \"./models/ckpt/mindspore_run/\"\n", " ds_eval_path = \"./datasets/MNIST_Data/test/\"\n", " network = LeNet5(10)\n", " net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", " repeat_size = 1\n", " net_opt = nn.Momentum(network.trainable_params(), 0.01, 0.9)\n", " model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy(), \"Precision\": Precision()})\n", "\n", " print(\"============== Starting Testing ==============\")\n", " param_dict = load_checkpoint(model_path+\"checkpoint_lenet-1_1875.ckpt\")\n", " load_param_into_net(network, param_dict)\n", " ds_eval = create_dataset(ds_eval_path, 32, repeat_size)\n", "\n", " acc = model.eval(ds_eval, dataset_sink_mode=True)\n", " print(\"============== {} ==============\".format(acc))" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "============== Starting Testing ==============\n", "============== 
{'Accuracy': 0.960136217948718, 'Precision': array([0.95763547, 0.98059965, 0.99153439, 0.93333333, 0.97322348,\n", " 0.99385749, 0.98502674, 0.93179724, 0.8974359 , 0.97148676])} ==============\n" ] } ], "metadata": {} }, { "cell_type": "markdown", "source": [ "Where:\n", "\n", "- `load_checkpoint`: loads a CheckPoint parameter file and returns a parameter dictionary.\n", "\n", "- `checkpoint_lenet-1_1875.ckpt`: the file name of the saved CheckPoint model.\n", "\n", "- `load_param_into_net`: loads the parameters into the network.\n", "\n", "> For how the `checkpoint_lenet-1_1875.ckpt` file is saved, see the network training section of [Quick Start](https://www.mindspore.cn/tutorials/zh-CN/r1.6/quick_start.html)." ], "metadata": {} } ], "metadata": { "kernelspec": { "display_name": "MindSpore", "language": "python", "name": "mindspore" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }