{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 损失函数\n", "\n", "[![下载Notebook](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/advanced/network/mindspore_loss.ipynb) [![下载样例代码](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_download_code.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/advanced/network/mindspore_loss.py) [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.8/tutorials/source_zh_cn/advanced/network/loss.ipynb)\n", "\n", "损失函数,亦称目标函数,用于衡量预测值与真实值差异的程度。\n", "\n", "在深度学习中,模型训练就是通过不断迭代来缩小损失函数值的过程。因此,在模型训练过程中损失函数的选择非常重要,一个好的损失函数能有效提升模型的性能。\n", "\n", "`mindspore.nn`模块中提供了许多[通用损失函数](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore.nn.html#损失函数),但这些通用损失函数无法满足所有需求,很多情况需要用户自定义所需的损失函数。因此,本教程介绍如何自定义损失函数。\n", "\n", "![lossfun.png](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/tutorials/source_zh_cn/advanced/network/images/loss_function.png)\n", "\n", "## 内置损失函数\n", "\n", "首先介绍`mindspore.nn`模块中内置的[损失函数](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore.nn.html#损失函数)。\n", "\n", "下例以`nn.L1Loss`为例,计算预测值和目标值之间的平均绝对误差:\n", "\n", "$$\\ell(x, y) = L = \\{l_1,\\dots,l_N\\}^\\top, \\quad \\text{with } l_n = \\left| x_n - y_n \\right|$$\n", "\n", "其中N为数据集中的`batch_size`值。\n", "\n", "$$\\ell(x, y) =\n", " \\begin{cases}\n", " \\operatorname{mean}(L), & \\text{if reduction} = \\text{'mean';}\\\\\n", " \\operatorname{sum}(L), & \\text{if reduction} = \\text{'sum'.}\n", " \\end{cases}$$\n", "\n", "`nn.L1Loss`中的参数`reduction`取值可为`mean`,`sum`,或`none`。若`reduction` 为`mean`或`sum`,则输出一个经过均值或求和后的标量Tensor(降维);若`reduction`为`none`,则所输出Tensor的shape为广播后的shape。" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2021-12-29T03:42:22.717822Z", "start_time": "2021-12-29T03:42:20.636585Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "loss: 1.5\n", "loss_sum: 9.0\n", "loss_none:\n", " [[1. 0. 2.]\n", " [1. 2. 3.]]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore.nn as nn\n", "import mindspore as ms\n", "\n", "# 输出loss均值\n", "loss = nn.L1Loss()\n", "# 输出loss和\n", "loss_sum = nn.L1Loss(reduction='sum')\n", "# 输出loss原值\n", "loss_none = nn.L1Loss(reduction='none')\n", "\n", "input_data = ms.Tensor(np.array([[1, 2, 3], [2, 3, 4]]).astype(np.float32))\n", "target_data = ms.Tensor(np.array([[0, 2, 5], [3, 1, 1]]).astype(np.float32))\n", "\n", "print(\"loss:\", loss(input_data, target_data))\n", "print(\"loss_sum:\", loss_sum(input_data, target_data))\n", "print(\"loss_none:\\n\", loss_none(input_data, target_data))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 自定义损失函数\n", "\n", "自定义损失函数的方法有两种:一是基于`nn.Cell`来定义损失函数;二是`nn.LossBase`来定义损失函数。`nn.LossBase`继承自`nn.Cell`,额外提供了`get_loss`方法,利用`reduction`参数对损失值求和或求均值,输出一个标量。\n", "\n", "下面将分别使用继承`Cell`和继承`LossBase`的方法,来定义平均绝对误差损失函数(Mean Absolute Error,MAE),MAE算法的公式如下所示:\n", "\n", "$$ loss= \\frac{1}{m}\\sum_{i=1}^m\\lvert y_i-f(x_i) \\rvert$$\n", "\n", "上式中$f(x)$为预测值,$y$为样本真实值,$loss$为预测值与真实值之间距离的平均值。\n", "\n", "### 基于`nn.Cell`构造损失函数\n", "\n", "`nn.Cell`是MindSpore的基类,不但可用于构建网络,还可用于定义损失函数。使用`nn.Cell`定义损失函数的过程与定义一个普通的网络相似,差别在于,其执行逻辑部分要计算的是前向网络输出与真实值之间的误差。\n", "\n", "下面展示怎样基于`nn.Cell`自定义损失函数`MAELoss`:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-12-29T03:42:22.729232Z", "start_time": "2021-12-29T03:42:22.723517Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.033333335\n" ] } ], "source": [ "import mindspore.ops as ops\n", "\n", "class MAELoss(nn.Cell):\n", " \"\"\"自定义损失函数MAELoss\"\"\"\n", "\n", " def __init__(self):\n", " \"\"\"初始化\"\"\"\n", " super(MAELoss, self).__init__()\n", " self.abs = ops.Abs()\n", " self.reduce_mean = ops.ReduceMean()\n", "\n", " def construct(self, base, target):\n", " \"\"\"调用算子\"\"\"\n", " x = self.abs(base - target)\n", " return self.reduce_mean(x)\n", "\n", "loss = MAELoss()\n", "\n", "input_data = ms.Tensor(np.array([0.1, 0.2, 0.3]).astype(np.float32)) # 生成预测值\n", "target_data = ms.Tensor(np.array([0.1, 0.2, 0.2]).astype(np.float32)) # 生成真实值\n", "\n", "output = loss(input_data, target_data)\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 基于`nn.LossBase`构造损失函数\n", "\n", "基于[nn.LossBase](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/nn/mindspore.nn.LossBase.html#mindspore.nn.LossBase)构造损失函数`MAELoss`与基于`nn.Cell`构造损失函数的过程类似,都要重写`__init__`方法和`construct`方法。\n", "\n", "`nn.LossBase`可使用方法`get_loss`将`reduction`应用于损失计算。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2021-12-29T03:42:22.766767Z", "start_time": "2021-12-29T03:42:22.759510Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.033333335\n" ] } ], "source": [ "class MAELoss(nn.LossBase):\n", " \"\"\"自定义损失函数MAELoss\"\"\"\n", "\n", " def __init__(self, reduction=\"mean\"):\n", " \"\"\"初始化并求loss均值\"\"\"\n", " super(MAELoss, self).__init__(reduction)\n", " self.abs = ops.Abs() # 求绝对值算子\n", "\n", " def construct(self, base, target):\n", " x = self.abs(base - target)\n", " return self.get_loss(x) # 返回loss均值\n", "\n", "loss = MAELoss()\n", "\n", "input_data = ms.Tensor(np.array([0.1, 0.2, 0.3]).astype(np.float32)) # 生成预测值\n", "target_data = ms.Tensor(np.array([0.1, 0.2, 0.2]).astype(np.float32)) # 生成真实值\n", "\n", "output = loss(input_data, target_data)\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 损失函数与模型训练\n", "\n", "损失函数`MAELoss`自定义完成后,可使用MindSpore的接口[Model](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore/mindspore.Model.html#mindspore.Model)中`train`接口进行模型训练,构造`Model`时需传入前向网络、损失函数和优化器,`Model`会在内部将它们关联起来,生成一个可用于训练的网络模型。\n", "\n", "在`Model`中,前向网络和损失函数通过[nn.WithLossCell](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/nn/mindspore.nn.WithLossCell.html#mindspore.nn.WithLossCell)关联起来,`nn.WithLossCell`支持两个输入,分别为`data`和`label`。" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2021-12-29T03:42:23.488075Z", "start_time": "2021-12-29T03:42:23.312491Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch:[ 0/ 1], step:[ 1/ 10], loss:[9.169/9.169], time:365.966 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 2/ 10], loss:[5.861/7.515], time:0.806 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 3/ 10], loss:[8.759/7.930], time:0.768 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 4/ 10], loss:[9.503/8.323], time:1.080 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 5/ 10], loss:[8.541/8.367], time:0.762 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 6/ 10], loss:[9.158/8.499], time:0.707 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 7/ 10], loss:[9.168/8.594], time:0.900 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 8/ 10], loss:[6.828/8.373], time:1.184 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 9/ 10], loss:[7.149/8.237], time:0.962 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 10/ 10], loss:[6.342/8.048], time:1.273 ms, lr:0.00500\n", "Epoch time: 390.358 ms, per step time: 39.036 ms, avg loss: 8.048\n" ] } ], "source": [ "import mindspore as ms\n", "from mindspore import dataset as ds\n", "from mindspore.common.initializer import Normal\n", "from mindvision.engine.callback import LossMonitor\n", "\n", "def get_data(num, w=2.0, b=3.0):\n", " \"\"\"生成数据及对应标签\"\"\"\n", " for _ in range(num):\n", " x = np.random.uniform(-10.0, 10.0)\n", " noise = np.random.normal(0, 1)\n", " y = x * w + b + noise\n", " yield np.array([x]).astype(np.float32), np.array([y]).astype(np.float32)\n", "\n", "def create_dataset(num_data, batch_size=16):\n", " \"\"\"加载数据集\"\"\"\n", " dataset = ds.GeneratorDataset(list(get_data(num_data)), column_names=['data', 'label'])\n", " dataset = dataset.batch(batch_size)\n", " return dataset\n", "\n", "class LinearNet(nn.Cell):\n", " \"\"\"定义线性回归网络\"\"\"\n", " def __init__(self):\n", " super(LinearNet, self).__init__()\n", " self.fc = nn.Dense(1, 1, Normal(0.02), Normal(0.02))\n", "\n", " def construct(self, x):\n", " return self.fc(x)\n", "\n", "ds_train = create_dataset(num_data=160)\n", "net = LinearNet()\n", "loss = MAELoss()\n", "opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)\n", "\n", "# 使用model接口将网络、损失函数和优化器关联起来\n", "model = ms.Model(net, loss, opt)\n", "model.train(epoch=1, train_dataset=ds_train, callbacks=[LossMonitor(0.005)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 多标签损失函数与模型训练\n", "\n", "上述定义了一个简单的平均绝对误差损失函数`MAELoss`,但许多深度学习应用的数据集较复杂,如目标检测网络Faster R-CNN的数据中就包含多个标签,而非简单的一条数据对应一个标签,这时损失函数的定义和使用略有不同。\n", "\n", "本节介绍在多标签数据集场景下,如何定义多标签损失函数(Multi label loss function),并使用Model进行模型训练。\n", "\n", "### 多标签数据集\n", "\n", "下例通过`get_multilabel_data`函数拟合两组线性数据$y1$和$y2$,拟合的目标函数为:\n", "\n", "$$f(x)=2x+3$$\n", "\n", "最终数据集应随机分布于函数周边,这里按以下公式的方式生成,其中`noise`为服从标准正态分布的随机值。`get_multilabel_data`函数返回数据$x$、$y1$和$y2$:\n", "\n", "$$f(x)=2x+3+noise$$\n", "\n", "通过`create_multilabel_dataset`生成多标签数据集,并将`GeneratorDataset`中的`column_names`参数设置为['data', 'label1', 'label2'],最终返回的数据集格式为一条数据`data`对应两个标签`label1`和`label2`。" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from mindspore import dataset as ds\n", "\n", "def get_multilabel_data(num, w=2.0, b=3.0):\n", " for _ in range(num):\n", " x = np.random.uniform(-10.0, 10.0)\n", " noise1 = np.random.normal(0, 1)\n", " noise2 = np.random.normal(-1, 1)\n", " y1 = x * w + b + noise1\n", " y2 = x * w + b + noise2\n", " yield np.array([x]).astype(np.float32), np.array([y1]).astype(np.float32), np.array([y2]).astype(np.float32)\n", "\n", "def create_multilabel_dataset(num_data, batch_size=16):\n", " dataset = ds.GeneratorDataset(list(get_multilabel_data(num_data)), column_names=['data', 'label1', 'label2'])\n", " dataset = dataset.batch(batch_size) # 每个batch有16个数据\n", " return dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 多标签损失函数\n", "\n", "针对上一步创建的多标签数据集,定义多标签损失函数`MAELossForMultiLabel`。\n", "\n", "$$ loss1= \\frac{1}{m}\\sum_{i=1}^m\\lvert y1_i-f(x_i) \\rvert$$\n", "\n", "$$ loss2= \\frac{1}{m}\\sum_{i=1}^m\\lvert y2_i-f(x_i) \\rvert$$\n", "\n", "$$ loss = \\frac{(loss1 + loss2)}{2}$$\n", "\n", "上式中,$f(x)$ 为样例标签的预测值,$y1$ 和 $y2$ 为样例标签的真实值,$loss1$ 为预测值与真实值 $y1$ 之间距离的平均值,$loss2$ 为预测值与真实值 $y2$ 之间距离的平均值 ,$loss$ 为损失值 $loss1$ 与损失值 $loss2$ 平均值。\n", "\n", "在`MAELossForMultiLabel`中的`construct`方法的输入有三个,预测值`base`,真实值`target1`和`target2`,在`construct`中分别计算预测值与真实值`target1`、预测值与真实值`target2`之间的误差,将两误差取平均后作为最终的损失函数值.\n", "\n", "示例代码如下:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "class MAELossForMultiLabel(nn.LossBase):\n", " def __init__(self, reduction=\"mean\"):\n", " super(MAELossForMultiLabel, self).__init__(reduction)\n", " self.abs = ops.Abs()\n", "\n", " def construct(self, base, target1, target2):\n", " x1 = self.abs(base - target1)\n", " x2 = self.abs(base - target2)\n", " return (self.get_loss(x1) + self.get_loss(x2))/2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 多标签模型训练\n", "\n", "使用`Model`关联指定的前向网络、损失函数和优化器时,因`Model`内默认使用的`nn.WithLossCell`只接受两个输入:`data`和`label`,故不适用于多标签场景。\n", "\n", "在多标签场景下,若想使用`Model`进行模型训练,则需事先把前向网络与多标签损失函数关联起来,即自定义损失网络。\n", "\n", "- 定义损失网络\n", "\n", "下例展示了如何定义损失网络`CustomWithLossCell`,其中`__init__`方法的两个参数`backbone`和`loss_fn`分别表示前向网络和损失函数,`construct`方法的输入分别为样例输入`data`和样例真实标签`label1`、`label2`,将样例输入`data`传给前向网络`backbone`,将预测值和两标签值传给损失函数`loss_fn`。" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "class CustomWithLossCell(nn.Cell):\n", " def __init__(self, backbone, loss_fn):\n", " super(CustomWithLossCell, self).__init__(auto_prefix=False)\n", " self._backbone = backbone\n", " self._loss_fn = loss_fn\n", "\n", " def construct(self, data, label1, label2):\n", " output = self._backbone(data)\n", " return self._loss_fn(output, label1, label2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- 定义网络模型并训练\n", "\n", "使用Model连接前向网络、多标签损失函数和优化器时,`Model`的网络`network`指定为自定义的损失网络`loss_net`,损失函数`loss_fn`不指定,优化器仍使用`Momentum`。\n", "\n", "在未指定`loss_fn`时,`Model`会默认`network`内部已实现损失函数的逻辑,不再在内部使用`nn.WithLossCell`关联前向网络和损失函数。" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2021-12-29T03:42:24.079033Z", "start_time": "2021-12-29T03:42:23.851418Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch:[ 0/ 1], step:[ 1/ 10], loss:[10.329/10.329], time:290.788 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 2/ 10], loss:[10.134/10.231], time:0.813 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 3/ 10], loss:[9.862/10.108], time:2.410 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 4/ 10], loss:[11.182/10.377], time:1.154 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 5/ 10], loss:[8.571/10.015], time:1.137 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 6/ 10], loss:[7.763/9.640], time:0.928 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 7/ 10], loss:[7.542/9.340], time:1.001 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 8/ 10], loss:[8.644/9.253], time:1.156 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 9/ 10], loss:[5.815/8.871], time:1.908 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 10/ 10], loss:[5.086/8.493], time:1.575 ms, lr:0.00500\n", "Epoch time: 323.467 ms, per step time: 32.347 ms, avg loss: 8.493\n" ] } ], "source": [ "ds_train = create_multilabel_dataset(num_data=160)\n", "net = LinearNet()\n", "\n", "# 定义多标签损失函数\n", "loss = MAELossForMultiLabel()\n", "\n", "# 定义损失网络,连接前向网络和多标签损失函数\n", "loss_net = CustomWithLossCell(net, loss)\n", "\n", "# 定义优化器\n", "opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)\n", "\n", "# 定义Model,多标签场景下Model无需指定损失函数\n", "model = ms.Model(network=loss_net, optimizer=opt)\n", "\n", "model.train(epoch=1, train_dataset=ds_train, callbacks=[LossMonitor(0.005)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "本章节简单讲解了多标签数据集场景下,如何定义损失函数并使用Model进行模型训练。在很多其他场景中,也可采用此类方法进行模型训练。" ] } ], "metadata": { "kernelspec": { "display_name": "MindSpore", "language": "python", "name": "mindspore" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }