{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 算子\n", "\n", "[![](https://gitee.com/mindspore/docs/raw/r1.2/docs/programming_guide/source_zh_cn/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.2/docs/programming_guide/source_zh_cn/operators.ipynb) [![](https://gitee.com/mindspore/docs/raw/r1.2/docs/programming_guide/source_zh_cn/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.2/programming_guide/mindspore_operators.ipynb) [![](https://gitee.com/mindspore/docs/raw/r1.2/docs/programming_guide/source_zh_cn/_static/logo_modelarts.png)](https://console.huaweicloud.com/modelarts/?region=cn-north-4#/notebook/loading?share-url-b64=aHR0cHM6Ly9vYnMuZHVhbHN0YWNrLmNuLW5vcnRoLTQubXlodWF3ZWljbG91ZC5jb20vbWluZHNwb3JlLXdlYnNpdGUvbm90ZWJvb2svbW9kZWxhcnRzL3Byb2dyYW1taW5nX2d1aWRlL21pbmRzcG9yZV9vcGVyYXRvcnMuaXB5bmI=&image_id=65f636a0-56cf-49df-b941-7d2a07ba8c8c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 概述\n", "\n", "MindSpore的算子组件,可从算子使用方式和算子功能两种维度进行划分。以下示例代码需在PyNative模式运行。\n", "\n", "## 算子使用方式\n", "\n", "算子相关接口主要包括operations、functional和composite,可通过ops直接获取到这三类算子。\n", "\n", "- operations提供单个的Primitive算子。一个算子对应一个原语,是最小的执行对象,需要实例化之后使用。\n", "\n", "- composite提供一些预定义的组合算子,以及复杂的涉及图变换的算子,如`GradOperation`。\n", "\n", "- functional提供operations和composite实例化后的对象,简化算子的调用流程。\n", "\n", "### mindspore.ops.operations\n", "\n", "operations提供了所有的Primitive算子接口,是开放给用户的最低阶算子接口。算子支持情况可查询[算子支持列表](https://www.mindspore.cn/doc/note/zh-CN/r1.2/operator_list.html)。\n", "\n", "Primitive算子也称为算子原语,它直接封装了底层的Ascend、GPU、AICPU、CPU等多种算子的具体实现,为用户提供基础算子能力。\n", "\n", "Primitive算子接口是构建高阶接口、自动微分、网络模型等能力的基础。\n", "\n", "代码样例如下:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "output = [ 1. 8. 64.]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore\n", "from mindspore import Tensor\n", "import mindspore.ops.operations as P\n", "\n", "input_x = mindspore.Tensor(np.array([1.0, 2.0, 4.0]), mindspore.float32)\n", "input_y = 3.0\n", "pow = P.Pow()\n", "output = pow(input_x, input_y)\n", "print(\"output =\", output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### mindspore.ops.functional\n", "\n", "为了简化没有属性的算子的调用流程,MindSpore提供了一些算子的functional版本。入参要求参考原算子的输入输出要求。算子支持情况可以查询[算子支持列表](https://www.mindspore.cn/doc/note/zh-CN/r1.2/operator_list_ms.html#mindspore-ops-functional)。\n", "\n", "例如`P.Pow`算子,我们提供了functional版本的`F.tensor_pow`算子。\n", "\n", "使用functional的代码样例如下:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "output = [ 1. 8. 64.]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore\n", "from mindspore import Tensor\n", "from mindspore.ops import functional as F\n", "\n", "input_x = mindspore.Tensor(np.array([1.0, 2.0, 4.0]), mindspore.float32)\n", "input_y = 3.0\n", "output = F.tensor_pow(input_x, input_y)\n", "print(\"output =\", output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### mindspore.ops.composite\n", "\n", "composite提供了一些算子的组合,包括`clip_by_value`和`random`相关的一些算子,以及涉及图变换的函数(`GradOperation`、`HyperMap`和`Map`等)。\n", "\n", "算子的组合可以直接像一般函数一样使用,例如使用`normal`生成一个随机分布:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "output = [[2.4911082 0.7941146 1.3117087]\n", " [0.3058231 1.7729738 1.525996 ]]\n" ] } ], "source": [ "from mindspore import dtype as mstype\n", "from mindspore.ops import composite as C\n", "from mindspore import Tensor\n", "\n", "mean = Tensor(1.0, mstype.float32)\n", "stddev = Tensor(1.0, mstype.float32)\n", "output = C.normal((2, 3), mean, stddev, seed=5)\n", "print(\"output =\", output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> 以上代码运行于MindSpore的GPU版本。\n", "\n", "针对涉及图变换的函数,用户可以使用`MultitypeFuncGraph`定义一组重载的函数,根据不同类型,采用不同实现。\n", "\n", "代码样例如下:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor [[2.4 4.2]\n", " [4.4 6.4]]\n", "scalar 3\n" ] } ], "source": [ "import numpy as np\n", "from mindspore.ops.composite import MultitypeFuncGraph\n", "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "\n", "add = MultitypeFuncGraph('add')\n", "@add.register(\"Number\", \"Number\")\n", "def add_scalar(x, y):\n", " return ops.scalar_add(x, y)\n", "\n", "@add.register(\"Tensor\", \"Tensor\")\n", "def add_tensor(x, y):\n", " return ops.tensor_add(x, y)\n", "\n", "tensor1 = Tensor(np.array([[1.2, 2.1], [2.2, 3.2]]).astype('float32'))\n", "tensor2 = Tensor(np.array([[1.2, 2.1], [2.2, 3.2]]).astype('float32'))\n", "print('tensor', add(tensor1, tensor2))\n", "print('scalar', add(1, 2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "此外,高阶函数`GradOperation`提供了根据输入的函数,求这个函数对应的梯度函数的方式,详细可以参阅[API文档](https://www.mindspore.cn/doc/api_python/zh-CN/r1.2/mindspore/ops/mindspore.ops.GradOperation.html)。\n", "\n", "### operations/functional/composite三类算子合并用法\n", "\n", "为了在使用过程中更加简便,除了以上介绍的几种用法外,我们还将`operations`,`functional`和`composite`三种算子封装到了`mindspore.ops`中,推荐直接调用`mindspore.ops`下的接口。\n", "\n", "代码样例如下:\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "import mindspore.ops.operations as P\n", "pow = P.Pow()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import mindspore.ops as ops\n", "pow = ops.Pow()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> 以上两种写法效果相同。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 算子功能\n", "\n", "算子按功能可分为张量操作、网络操作、数组操作、图像操作、编码操作、调试操作和量化操作七个功能模块。所有的算子在Ascend AI处理器、GPU和CPU的支持情况,参见[算子支持列表](https://www.mindspore.cn/doc/note/zh-CN/r1.2/operator_list.html)。\n", "\n", "### 张量操作\n", "\n", "张量操作包括张量的结构操作和张量的数学运算。\n", "\n", "张量结构操作有:张量创建、索引切片、维度变换和合并分割。\n", "\n", "张量数学运算有:标量运算、向量运算和矩阵运算。\n", "\n", "这里以张量的数学运算和运算的广播机制为例,介绍使用方法。\n", "\n", "### 标量运算\n", "\n", "张量的数学运算符可以分为标量运算符、向量运算符以及矩阵运算符。\n", "\n", "加减乘除乘方,以及三角函数、指数、对数等常见函数,逻辑比较运算符等都是标量运算符。\n", "\n", "标量运算符的特点是对张量实施逐元素运算。\n", "\n", "有些标量运算符对常用的数学运算符进行了重载。并且支持类似NumPy的广播特性。\n", "\n", " 以下代码实现了对`input_x`作乘方数为`input_y`的乘方操作:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 1. 8. 64.]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore\n", "from mindspore import Tensor\n", "\n", "input_x = mindspore.Tensor(np.array([1.0, 2.0, 4.0]), mindspore.float32)\n", "input_y = 3.0\n", "print(input_x**input_y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 加法\n", "\n", "上述代码中`input_x`和`input_y`的相加实现方式如下:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[4. 5. 7.]\n" ] } ], "source": [ "print(input_x + input_y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Element-wise乘法\n", "\n", "以下代码实现了Element-wise乘法示例:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 4. 10. 18.]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore\n", "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "\n", "input_x = Tensor(np.array([1.0, 2.0, 3.0]), mindspore.float32)\n", "input_y = Tensor(np.array([4.0, 5.0, 6.0]), mindspore.float32)\n", "mul = ops.Mul()\n", "res = mul(input_x, input_y)\n", "\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 求三角函数\n", "\n", "以下代码实现了Acos:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "import numpy as np\n", "import mindspore\n", "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "\n", "acos = ops.ACos()\n", "input_x = Tensor(np.array([0.74, 0.04, 0.30, 0.56]), mindspore.float32)\n", "output = acos(input_x)\n", "print(output)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "输出如下:\n", "\n", "```text\n", "[0.7377037, 1.5307858, 1.2661037,0.97641146]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 向量运算\n", "\n", "向量运算符只在一个特定轴上运算,将一个向量映射到一个标量或者另外一个向量。\n", "\n", "#### Squeeze\n", "\n", "以下代码实现了压缩第3个通道维度为1的通道:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 1.]\n", " [1. 1.]\n", " [1. 1.]]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore\n", "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "\n", "input_tensor = Tensor(np.ones(shape=[3, 2, 1]), mindspore.float32)\n", "squeeze = ops.Squeeze(2)\n", "output = squeeze(input_tensor)\n", "\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 矩阵运算\n", "\n", "矩阵运算包括矩阵乘法、矩阵范数、矩阵行列式、矩阵求特征值、矩阵分解等运算。\n", "\n", "#### 矩阵乘法\n", "\n", " 以下代码实现了`input_x`和`input_y`的矩阵乘法:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[3. 3. 3. 3.]]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore\n", "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "\n", "input_x = Tensor(np.ones(shape=[1, 3]), mindspore.float32)\n", "input_y = Tensor(np.ones(shape=[3, 4]), mindspore.float32)\n", "matmul = ops.MatMul()\n", "output = matmul(input_x, input_y)\n", "\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 广播机制\n", "\n", "广播表示输入各变量channel数目不一致时,改变他们的channel数以得到结果。\n", "\n", "- 以下代码实现了广播机制的示例:\n", "\n", "```python\n", "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "\n", "shape = (2, 3)\n", "input_x = Tensor(np.array([1, 2, 3]).astype(np.float32))\n", "broadcast_to = ops.BroadcastTo(shape)\n", "output = broadcast_to(input_x)\n", "\n", "print(output)\n", "```\n", "\n", "输出如下:\n", "\n", "```text\n", "[[1. 2. 3.]\n", " [1. 2. 3.]]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 网络操作\n", "\n", "网络操作包括特征提取、激活函数、LossFunction、优化算法等。\n", "\n", "#### 特征提取\n", "\n", "特征提取是机器学习中的常见操作,核心是提取比原输入更具代表性的Tensor。\n", "\n", "卷积操作\n", "\n", "以下代码实现了常见卷积操作之一的2D convolution 操作:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[[288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " ...\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]]]\n", "\n", " ...\n", "\n", " [[288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " ...\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]]\n", "\n", "\n", " ...\n", "\n", "\n", " [[288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " ...\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]\n", " [288. 288. 288. ... 288. 288. 288.]]]]\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "input = Tensor(np.ones([10, 32, 32, 32]), mindspore.float32)\n", "weight = Tensor(np.ones([32, 32, 3, 3]), mindspore.float32)\n", "conv2d = ops.Conv2D(out_channel=32, kernel_size=3)\n", "res = conv2d(input, weight)\n", "\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "卷积的反向传播算子操作\n", "\n", "以下代码实现了反向梯度算子传播操作的具体代码,输出存于dout, weight:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[[ 32. 64. 96. ... 96. 64. 32.]\n", " [ 64. 128. 192. ... 192. 128. 64.]\n", " [ 96. 192. 288. ... 288. 192. 96.]\n", " ...\n", " [ 96. 192. 288. ... 288. 192. 96.]\n", " [ 64. 128. 192. ... 192. 128. 64.]\n", " [ 32. 64. 96. ... 96. 64. 32.]]\n", "\n", " ...\n", "\n", " [[ 32. 64. 96. ... 96. 64. 32.]\n", " [ 64. 128. 192. ... 192. 128. 64.]\n", " [ 96. 192. 288. ... 288. 192. 96.]\n", " ...\n", " [ 96. 192. 288. ... 288. 192. 96.]\n", " [ 64. 128. 192. ... 192. 128. 64.]\n", " [ 32. 64. 96. ... 96. 64. 32.]]]]\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "dout = Tensor(np.ones([10, 32, 30, 30]), mindspore.float32)\n", "weight = Tensor(np.ones([32, 32, 3, 3]), mindspore.float32)\n", "x = Tensor(np.ones([10, 32, 32, 32]))\n", "conv2d_backprop_input = ops.Conv2DBackpropInput(out_channel=32, kernel_size=3)\n", "res = conv2d_backprop_input(dout, weight, ops.shape(x))\n", "\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 激活函数\n", "\n", "以下代码实现Softmax激活函数计算:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.01165623 0.03168492 0.08612853 0.23412164 0.63640857]\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "input_x = Tensor(np.array([1, 2, 3, 4, 5]), mindspore.float32)\n", "softmax = ops.Softmax()\n", "res = softmax(input_x)\n", "\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### LossFunction\n", "\n", " 以下代码实现了L1 loss function:\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 0. 0.5]\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "loss = ops.SmoothL1Loss()\n", "input_data = Tensor(np.array([1, 2, 3]), mindspore.float32)\n", "target_data = Tensor(np.array([1, 2, 2]), mindspore.float32)\n", "res = loss(input_data, target_data)\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 优化算法\n", "\n", " 以下代码实现了SGD梯度下降算法的具体实现,输出是result:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(Tensor(shape=[4], dtype=Float32, value= [ 1.99000001e+00, -4.90300000e-01, 1.69500005e+00, 3.98009992e+00]),)\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "sgd = ops.SGD()\n", "parameters = Tensor(np.array([2, -0.5, 1.7, 4]), mindspore.float32)\n", "gradient = Tensor(np.array([1, -1, 0.5, 2]), mindspore.float32)\n", "learning_rate = Tensor(0.01, mindspore.float32)\n", "accum = Tensor(np.array([0.1, 0.3, -0.2, -0.1]), mindspore.float32)\n", "momentum = Tensor(0.1, mindspore.float32)\n", "stat = Tensor(np.array([1.5, -0.3, 0.2, -0.7]), mindspore.float32)\n", "result = sgd(parameters, gradient, learning_rate, accum, momentum, stat)\n", "\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 数组操作\n", "\n", "数组操作指操作对象是一些数组的操作。\n", "\n", "#### DType\n", "\n", "返回跟输入的数据类型一致的并且适配Mindspore的Tensor变量,常用于Mindspore工程内。\n", "\n", "具体可参见示例:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Float32\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "input_tensor = Tensor(np.array([[2, 2], [2, 2]]), mindspore.float32)\n", "typea = ops.DType()(input_tensor)\n", "\n", "print(typea)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Cast\n", "\n", "转换输入的数据类型并且输出与目标数据类型相同的变量。\n", "\n", "具体参见以下示例:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Float16\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "input_np = np.random.randn(2, 3, 4, 5).astype(np.float32)\n", "input_x = Tensor(input_np)\n", "type_dst = mindspore.float16\n", "cast = ops.Cast()\n", "result = cast(input_x, type_dst)\n", "print(result.dtype)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Shape\n", "\n", "返回输入数据的形状。\n", "\n", " 以下代码实现了返回输入数据input_tensor的操作:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3, 2, 1)\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "input_tensor = Tensor(np.ones(shape=[3, 2, 1]), mindspore.float32)\n", "shape = ops.Shape()\n", "output = shape(input_tensor)\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 图像操作\n", "\n", "图像操作包括图像预处理操作,如图像剪切(Crop,便于得到大量训练样本)和大小变化(Reise,用于构建图像金字塔等)。\n", "\n", " 以下代码实现了Crop和Resize操作:\n", "\n", "```python\n", "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "\n", "BATCH_SIZE = 1\n", "NUM_BOXES = 5\n", "IMAGE_HEIGHT = 256\n", "IMAGE_WIDTH = 256\n", "CHANNELS = 3\n", "image = np.random.normal(size=[BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, CHANNELS]).astype(np.float32)\n", "boxes = np.random.uniform(size=[NUM_BOXES, 4]).astype(np.float32)\n", "box_index = np.random.uniform(size=[NUM_BOXES], low=0, high=BATCH_SIZE).astype(np.int32)\n", "crop_size = (24, 24)\n", "crop_and_resize = ops.CropAndResize()\n", "output = crop_and_resize(Tensor(image), Tensor(boxes), Tensor(box_index), crop_size)\n", "print(output.asnumpy())\n", "```\n", "\n", "输出如下:\n", "\n", "```text\n", "[[[[ 6.51672244e-01 -1.85958534e-01 5.19907832e-01]\n", "[ 1.53466597e-01 4.10562098e-01 6.26138210e-01]\n", "[ 6.62892580e-01 3.81776541e-01 4.69261825e-01]\n", "...\n", "[-5.83377600e-01 -3.53377648e-02 -6.01786733e-01]\n", "[ 1.36125124e+00 5.84172308e-02 -6.41442612e-02]\n", "[-9.11651254e-01 -1.19495761e+00 1.96810793e-02]]\n", "\n", "[[ 6.06956100e-03 -3.73778701e-01 1.88935513e-03]\n", "[-1.06859171e+00 2.00272346e+00 1.37180305e+00]\n", "[ 1.69524819e-01 2.90421434e-02 -4.12243098e-01]\n", "...\n", "\n", "[[-2.04489112e-01 2.36615837e-01 1.33802962e+00]\n", "[ 1.08329034e+00 -9.00492966e-01 -8.21497202e-01]\n", "[ 7.54147097e-02 -3.72897685e-01 -2.91040149e-02]\n", "...\n", "[ 1.12317121e+00 8.98950577e-01 4.22795087e-01]\n", "[ 5.13781667e-01 5.12095273e-01 -3.68211865e-01]\n", "[-7.04941899e-02 -1.09924078e+00 6.89047515e-01]]]]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> 以上代码运行于MindSpore的Ascend版本。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 编码运算\n", "\n", "编码运算包括BoundingBox Encoding、BoundingBox Decoding、IOU计算等。\n", "\n", "#### BoundingBoxEncode\n", "\n", "对物体所在区域方框进行编码,得到类似PCA的更精简信息,以便做后续类似特征提取,物体检测,图像恢复等任务。\n", "\n", " 以下代码实现了对anchor_box和groundtruth_box的boundingbox encode:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-1. 0.25 0. 0.40546513]\n", " [-1. 0.25 0. 0.40546513]]\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import mindspore\n", "\n", "anchor_box = Tensor([[2,2,2,3],[2,2,2,3]],mindspore.float32)\n", "groundtruth_box = Tensor([[1,2,1,4],[1,2,1,4]],mindspore.float32)\n", "boundingbox_encode = ops.BoundingBoxEncode(means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0))\n", "res = boundingbox_encode(anchor_box, groundtruth_box)\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### BoundingBoxDecode\n", "\n", "编码器对区域位置信息解码之后,用此算子进行解码。\n", "\n", " 以下代码实现了:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 4.194528 0. 0. 5.194528 ]\n", " [ 2.1408591 0. 3.8591409 60.59815 ]]\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import mindspore\n", "\n", "anchor_box = Tensor([[4,1,2,1],[2,2,2,3]],mindspore.float32)\n", "deltas = Tensor([[3,1,2,2],[1,2,1,4]],mindspore.float32)\n", "boundingbox_decode = ops.BoundingBoxDecode(means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), max_shape=(768, 1280), wh_ratio_clip=0.016)\n", "res = boundingbox_decode(anchor_box, deltas)\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### IOU计算\n", "\n", "计算预测的物体所在方框和真实物体所在方框的交集区域与并集区域的占比大小,常作为一种损失函数,用以优化模型。\n", "\n", " 以下代码实现了计算两个变量`anchor_boxes`和`gt_boxes`之间的IOU,以out输出:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0. -0. 0.]\n", " [ 0. -0. 0.]\n", " [ 0. 0. 0.]]\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "import mindspore\n", "\n", "iou = ops.IOU()\n", "anchor_boxes = Tensor(np.random.randint(1.0, 5.0, [3, 4]), mindspore.float16)\n", "gt_boxes = Tensor(np.random.randint(1.0, 5.0, [3, 4]), mindspore.float16)\n", "out = iou(anchor_boxes, gt_boxes)\n", "print(out)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 调试操作\n", "\n", "调试操作指的是用于调试网络的一些常用算子及其操作,例如HookBackward等, 此操作非常方便,对入门深度学习重要,极大提高学习者的学习体验。\n", "\n", "#### HookBackward\n", "\n", "打印中间变量的梯度,是比较常用的算子,目前仅支持PyNative模式。\n", "\n", " 以下代码实现了打印中间变量(例中x,y)的梯度:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(Tensor(shape=[], dtype=Float32, value= 2),)\n", "(Tensor(shape=[], dtype=Float32, value= 4), Tensor(shape=[], dtype=Float32, value= 4))\n" ] } ], "source": [ "from mindspore import Tensor\n", "import mindspore.ops as ops\n", "import numpy as np\n", "from mindspore import dtype as mstype\n", "\n", "def hook_fn(grad_out):\n", " print(grad_out)\n", "\n", "grad_all = ops.GradOperation(get_all=True)\n", "hook = ops.HookBackward(hook_fn)\n", "\n", "def hook_test(x, y):\n", " z = x * y\n", " z = hook(z)\n", " z = z * y\n", " return z\n", "\n", "def backward(x, y):\n", " return grad_all(hook_test)(Tensor(x, mstype.float32), Tensor(y, mstype.float32))\n", "\n", "print(backward(1, 2))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }