# Performing Inference or Training on an MCU or Small System

[![View Source](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/master/docs/lite/docs/source_zh_cn/use/micro.md)

## Overview

This tutorial introduces an ultra-lightweight AI deployment solution for IoT edge devices.

Unlike mobile devices, IoT devices typically use microcontroller units (MCUs): not only is the system ROM very limited, but memory and compute power are also extremely constrained. AI applications on IoT devices therefore place strict limits on the runtime memory and power consumption of model inference.

For MCU hardware backends, MindSpore Lite provides an ultra-lightweight Micro AI deployment solution: in an offline phase, the model is compiled directly into lightweight code, eliminating online model parsing and graph compilation. The generated Micro code is intuitive and easy to read, uses little runtime memory, and has a small code size.

With the MindSpore Lite converter tool `converter_lite`, it is easy to generate inference or training code that can be deployed on the x86/ARM64/ARM32/Cortex-M platforms.

Deploying a model for inference or training through Micro usually takes four steps: model code generation, obtaining the `Micro` library, code integration, and compilation and deployment.

## Generating Model Inference Code

### Overview

Inference code is generated for an input model by running the MindSpore Lite converter tool `converter_lite` with the Micro options set in the converter's parameter configuration file.

This chapter only introduces the code-generation features of the converter; for its basic usage, see [Converting Inference Models](https://www.mindspore.cn/lite/docs/zh-CN/master/use/converter_tool.html).

### Environment Preparation

Taking the converter on Linux as an example, the following environment preparation is required.

1. System environment required by the converter

   This example uses a Linux system; Ubuntu 18.04.02 LTS is recommended.

2. Obtain the converter

   The converter can be obtained in two ways:

   - Download the [release package](https://www.mindspore.cn/lite/docs/zh-CN/master/use/downloads.html) from the MindSpore website. Choose the release package whose operating system is Linux-x86_64 and whose hardware platform is CPU.
   - [Build from source](https://www.mindspore.cn/lite/docs/zh-CN/master/use/build.html).

3. Unpack the downloaded package

   ```bash
   tar -zxf mindspore-lite-${version}-linux-x64.tar.gz
   ```

   `${version}` is the version number of the release package.

4. Add the dynamic libraries required by the converter to the LD_LIBRARY_PATH environment variable

   ```bash
   export LD_LIBRARY_PATH=${PACKAGE_ROOT_PATH}/tools/converter/lib:${LD_LIBRARY_PATH}
   ```

   `${PACKAGE_ROOT_PATH}` is the path of the unpacked folder.

### Generating Inference Code for a Single Model

1. Enter the converter directory

   ```bash
   cd ${PACKAGE_ROOT_PATH}/tools/converter/converter
   ```

2. Set the Micro options

   Create a `micro.cfg` file in the current directory with the following content:

   ```text
   [micro_param]
   # enable code-generation for MCU HW
   enable_micro=true
   # specify HW target, support x86, Cortex-M, ARM32, ARM64 only.
   target=x86
   # enable parallel inference or not.
   support_parallel=false
   ```

   In the configuration file, `[micro_param]` on the first line marks the parameters that follow as belonging to the Micro section `micro_param`; these parameters control code generation, and their meanings are listed in Table 1 below.

   In this example we generate single-model inference code for a Linux system on the x86_64 architecture, so we set `target=x86` to declare that the generated inference code targets x86_64 Linux.

3.
   Prepare the model to generate inference code for

   Download the [MNIST handwritten digit recognition model](https://download.mindspore.cn/model_zoo/official/lite/quick_start/micro/mnist.tar.gz) used in this example. After unpacking you will get `mnist.tflite`, a trained MNIST classification model in TFLite format. Copy `mnist.tflite` into the current converter directory.

4. Run converter_lite to generate the code

   ```bash
   ./converter_lite --fmk=TFLITE --modelFile=mnist.tflite --outputFile=mnist --configFile=micro.cfg
   ```

   On success, the output is:

   ```text
   CONVERT RESULT SUCCESS:0
   ```

   For the converter's parameters, see the [converter parameter description](https://www.mindspore.cn/lite/docs/zh-CN/master/use/converter_tool.html#参数说明).

   After the converter succeeds, the generated code is saved under the path given by `outputFile`; in this example it is the mnist folder in the current converter directory, with the following content:

   ```text
   mnist                          # root directory of the generated code
   ├── benchmark                  # benchmark routine that integrates and calls the inference code
   │   ├── benchmark.c
   │   ├── calib_output.c
   │   ├── calib_output.h
   │   ├── load_input.c
   │   └── load_input.h
   ├── CMakeLists.txt             # cmake project file of the benchmark routine
   └── src                        # model inference code directory
       ├── model0                 # files related to the model
       │   ├── model0.c
       │   ├── net0.bin           # model weights in binary form
       │   ├── net0.c
       │   ├── net0.h
       │   ├── weight0.c
       │   └── weight0.h
       ├── CMakeLists.txt
       ├── allocator.c
       ├── allocator.h
       ├── net.cmake
       ├── model.c
       ├── model.h
       ├── context.c
       ├── context.h
       ├── tensor.c
       └── tensor.h
   ```

The `src` directory contains the model inference code; `benchmark` is just a sample routine that integrates and calls the code in `src`. For more details on integration, see [Code Integration, Compilation, and Deployment](#code-integration-compilation-and-deployment).

Table 1: micro_param parameter definitions

| Parameter | Required | Description | Value range | Default |
|------|------|------|------|------|
| enable_micro | Yes | Whether to generate code for the model; if false, an .ms model is produced instead | true, false | false |
| target | Yes | Platform that the generated code targets | x86, Cortex-M, ARM32, ARM64 | x86 |
| support_parallel | No | Whether to generate multi-threaded inference code; may be set to true only on x86, ARM32, ARM64 | true, false | false |
| save_path | No (multi-model parameter) | Path of the generated multi-model code | None | None |
| project_name | No (multi-model parameter) | Project name of the generated multi-model code | None | None |
| inputs_shape | No (dynamic-shape parameter) | Input shape information of the model in dynamic-shape scenarios | None | None |
| dynamic_dim_params | No (dynamic-shape parameter) | Value ranges of the variable dimensions in dynamic-shape scenarios | None | None |

### Generating Inference Code for Multiple Models

1. Enter the converter directory

   ```bash
   cd ${PACKAGE_ROOT_PATH}/tools/converter/converter
   ```

2.
   Set the Micro options

   Create a `micro.cfg` file in the current directory with the following content:

   ```text
   [micro_param]
   # enable code-generation for MCU HW
   enable_micro=true
   # specify HW target, support x86, Cortex-M, ARM32, ARM64 only.
   target=x86
   # enable parallel inference or not.
   support_parallel=false
   # save generated code path.
   save_path=workpath/
   # set project name.
   project_name=mnist

   [model_param]
   # input model type.
   fmk=TFLITE
   # path of input model file.
   modelFile=mnist.tflite

   [model_param]
   # input model type.
   fmk=TFLITE
   # path of input model file.
   modelFile=mnist.tflite
   ```

   In the configuration file, `[micro_param]` marks the parameters that follow as belonging to the Micro section `micro_param`; they control code generation and are described in Table 1. Each `[model_param]` section marks the parameters that follow as belonging to a Model section `model_param`; they control the conversion of the corresponding model, and their scope covers the required parameters supported by `converter_lite`.

   In this example we generate multi-model inference code for a Linux system on the x86_64 architecture, so we set `target=x86` to declare that the generated inference code targets x86_64 Linux.

3. Prepare the models to generate inference code for

   Download the [MNIST handwritten digit recognition model](https://download.mindspore.cn/model_zoo/official/lite/quick_start/micro/mnist.tar.gz) used in this example. After unpacking you will get `mnist.tflite`, a trained MNIST classification model in TFLite format. Copy `mnist.tflite` into the current converter directory.

4.
   Run converter_lite, supplying only the config file, to generate the code

   ```bash
   ./converter_lite --configFile=micro.cfg
   ```

   On success, the output is:

   ```text
   CONVERT RESULT SUCCESS:0
   ```

   For the converter's parameters, see the [converter parameter description](https://www.mindspore.cn/lite/docs/zh-CN/master/use/converter_tool.html#参数说明).

   After the converter succeeds, the generated code is saved under `save_path` + `project_name`; in this example it is the mnist folder in the current converter directory, with the following content:

   ```text
   mnist                          # root directory (project) of the generated code
   ├── benchmark                  # benchmark routine that integrates and calls the inference code
   │   ├── benchmark.c
   │   ├── calib_output.c
   │   ├── calib_output.h
   │   ├── load_input.c
   │   └── load_input.h
   ├── CMakeLists.txt             # cmake project file of the benchmark routine
   ├── include
   │   └── model_handle.h         # external interface file of the models
   └── src                        # model inference code directory
       ├── model0                 # files related to the first model
       │   ├── model0.c
       │   ├── net0.bin           # model weights in binary form
       │   ├── net0.c
       │   ├── net0.h
       │   ├── weight0.c
       │   └── weight0.h
       ├── model1                 # files related to the second model
       │   ├── model1.c
       │   ├── net1.bin
       │   ├── net1.c
       │   ├── net1.h
       │   ├── weight1.c
       │   └── weight1.h
       ├── CMakeLists.txt
       ├── allocator.c
       ├── allocator.h
       ├── net.cmake
       ├── model.c
       ├── model.h
       ├── context.c
       ├── context.h
       ├── tensor.c
       └── tensor.h
   ```

The `src` directory contains the model inference code; `benchmark` is just a sample routine that integrates and calls the code in `src`. In multi-model scenarios, users need to adapt `benchmark` to their own needs. For more details on integration, see [Code Integration, Compilation, and Deployment](#code-integration-compilation-and-deployment).

### Configuring the Model Input Shape (Optional)

When generating code, configuring the model input shape to the shape used in actual inference usually reduces the chance of errors during deployment. When the model contains a `Shape` operator, or the input shape of the original model is not a fixed value, the input shape must be configured to support the related shape optimizations and code generation. The input shape of the generated code can be configured with the converter's `--inputShape=` option; for the parameter's meaning, see the [converter usage instructions](https://www.mindspore.cn/lite/docs/zh-CN/master/use/converter_tool.html).

### Dynamic Shape Configuration (Optional)

In some inference scenarios, such as running a target recognition network after detecting targets, the input batch size of the recognition network is not fixed because the number of targets varies. Regenerating and redeploying code for every required batch size or resolution wastes memory and slows development. Micro therefore supports dynamic shapes: configure the dynamic parameters under `[micro_param]` in the configFile during conversion, and call [MSModelResize](#calling-interfaces-of-the-inference-code) at inference time to change the input shape.

`inputs_shape` configures the shape information of all model inputs: fixed dimensions are written as real numbers and dynamic dimensions as placeholders; currently at most 2 variable dimensions are supported. `dynamic_dim_params` gives the value ranges of the variable dimensions and must correspond to the placeholders in `inputs_shape`; discrete values are separated by `,`, and a continuous range is written with `~`. Write all parameters compactly, with no spaces. If there are multiple inputs, the dimension options of the different inputs must be consistent and are separated by `;`; otherwise parsing fails.

```text
[micro_param]
# the name and shapes of model's all inputs.
# the format is 'input1_name:[d0,d1];input2_name:[1,d0]'
inputs_shape=input1:[d0,d1];input2:[1,d0]
# the value range of dynamic dims.
dynamic_dim_params=d0:[1,3];d1:[1~8]
```

### Generating Multi-Threaded Parallel Inference Code (Optional)

Typical Linux-x86/Android environments have multi-core CPUs; enabling Micro multi-threaded inference exploits the device's performance and speeds up model inference.

#### Configuration File

Setting support_parallel to true in the configuration file generates code that supports multi-threaded inference; see Table 1 for the meaning of each option.

An example configuration for generating multi-threaded code for `x86`:

```text
[micro_param]
# enable code-generation for MCU HW
enable_micro=true
# specify HW target, support x86, Cortex-M, ARM32, ARM64 only.
target=x86
# enable parallel inference or not.
support_parallel=true
```

#### Related Calling Interfaces

By integrating the code and calling the interfaces below, users can configure multi-threaded inference for the model; for details on the parameters, see the [API documentation](https://www.mindspore.cn/lite/api/zh-CN/master/index.html).

Table 2: multi-threading configuration APIs

| Function | Prototype |
|------|------|
| Set the number of inference threads | void MSContextSetThreadNum(MSContextHandle context, int32_t thread_num) |
| Set the thread affinity mode | void MSContextSetThreadAffinityMode(MSContextHandle context, int mode) |
| Get the number of inference threads | int32_t MSContextGetThreadNum(const MSContextHandle context) |
| Get the thread affinity mode | int MSContextGetThreadAffinityMode(const MSContextHandle context) |

#### Integration Notes

After generating multi-threaded code, users must link the `pthread` standard library and the `libwrapper.a` static library from the `Micro` package; see the `CMakeLists.txt` in the generated code for details.

#### Limitations

This feature is currently enabled only when `target` is x86/ARM32/ARM64, and at most 4 inference threads can be configured.

### Generating int8 Quantized Inference Code (Optional)

On MCUs such as Cortex-M, limited device memory and compute power usually require deploying with int8 quantized operators, which reduce runtime memory and speed up computation.

If you already have a fully int8-quantized model, you can try to generate int8 quantized inference code directly as described in [Running converter_lite to Generate Inference Code](https://www.mindspore.cn/lite/docs/zh-CN/master/use/micro.html#执行converter-lite生成推理代码) without reading this section.

Usually, however, users only have a trained float32 model. Generating int8 quantized inference code then requires the converter's post-training quantization, as described below.

#### Configuration File

int8 quantized inference code generation is enabled by adding the quantization parameters to the configuration file. For the quantization parameters (the common quantization parameters `common_quant_param` and the full quantization parameters `full_quant_param`), see the converter's [post-training quantization documentation](https://www.mindspore.cn/lite/docs/zh-CN/master/use/post_training_quantization.html).

An example configuration for generating int8 quantized inference code for the `Cortex-M` platform:

```text
[micro_param]
# enable code-generation for MCU HW
enable_micro=true
# specify HW target, support x86, Cortex-M, ARM32, ARM64 only.
target=Cortex-M
# code generation for Inference or Train
codegen_mode=Inference
# enable parallel inference or not
support_parallel=false

[common_quant_param]
# Supports WEIGHT_QUANT or FULL_QUANT
quant_type=FULL_QUANT
# Weight quantization supports the number of bits [0,16]. Set to 0 for mixed-bit quantization, otherwise fixed-bit quantization.
# Full quantization supports the number of bits [1,8].
bit_num=8

[data_preprocess_param]
calibrate_path=inputs:/home/input_dir
calibrate_size=100
input_type=BIN

[full_quant_param]
activation_quant_method=MAX_MIN
bias_correction=true
target_device=DSP
```

##### Limitations

- Currently only full quantization is supported for inference code generation.
- The target_device in the full quantization section `full_quant_param` usually needs to be set to DSP so that more operators can be post-quantized.
- Micro currently supports 34 int8 quantized operators. If a quantized operator is not supported during code generation, it can be skipped via the `skip_quant_node` option of the common quantization section `common_quant_param`; skipped operator nodes are still executed in float32.

## Generating Model Training Code

### Overview

Training code is generated for an input model by running the MindSpore Lite converter tool `converter_lite` with the Micro options set in the converter's parameter configuration file.

This chapter only introduces the code-generation features of the converter; for its basic usage, see [Converting Training Models](https://www.mindspore.cn/lite/docs/zh-CN/master/use/converter_train.html).

### Environment Preparation

See [Environment Preparation](#environment-preparation) above; the steps are not repeated here.

### Running converter_lite to Generate Training Code

1. Enter the converter directory

   ```bash
   cd ${PACKAGE_ROOT_PATH}/tools/converter/converter
   ```

2. Set the Micro options

   Create a `micro.cfg` file in the current directory with the following content:

   ```text
   [micro_param]
   # enable code-generation for MCU HW
   enable_micro=true
   # specify HW target, support x86, Cortex-M, ARM32, ARM64 only.
   target=x86
   # code generation for Inference or Train. Cortex-M is unsupported when codegen_mode is Train.
   codegen_mode=Train
   ```

3.
   Run converter_lite to generate the code

   ```bash
   ./converter_lite --fmk=MINDIR --trainModel=True --modelFile=my_model.mindir --outputFile=my_model --configFile=micro.cfg
   ```

   On success, the output is:

   ```text
   CONVERT RESULT SUCCESS:0
   ```

   After the converter succeeds, the generated code is saved under the path given by `outputFile`; in this example it is the my_model folder in the current converter directory, with the following content:

   ```text
   my_model                       # root directory of the generated code
   ├── benchmark                  # benchmark routine that integrates and calls the training code
   │   ├── benchmark.c
   │   ├── calib_output.c
   │   ├── calib_output.h
   │   ├── load_input.c
   │   └── load_input.h
   ├── CMakeLists.txt             # cmake project file of the benchmark routine
   └── src                        # model code directory
       ├── CMakeLists.txt
       ├── net.bin                # model weights in binary form
       ├── net.c
       ├── net.cmake
       ├── net.h
       ├── model.c
       ├── context.c
       ├── context.h
       ├── tensor.c
       ├── tensor.h
       ├── weight.c
       └── weight.h
   ```

For the APIs involved in the training flow, see [Calling Interfaces of the Training Code](#calling-interfaces-of-the-training-code).

## Obtaining the `Micro` Library

After generating the model inference code, and before integrating it, users need to obtain the `Micro` library that the generated inference code depends on.

Inference code for a given platform depends on the `Micro` library of that platform. Specify the platform through the Micro option `target` when generating code, and obtain the `Micro` library for the same platform.

Users can download the [release package](https://www.mindspore.cn/lite/docs/zh-CN/master/use/downloads.html) for the corresponding platform from the MindSpore website.

In [Generating Model Inference Code](#generating-model-inference-code) we obtained model inference code for x86_64 Linux; the `Micro` library it depends on is inside the release package used by the converter. Within the release package, the libraries and headers that the inference code depends on are:

```text
mindspore-lite-{version}-linux-x64
├── runtime
│   └── include
│       └── c_api              # C API headers for MindSpore Lite integration
└── tools
    └── codegen                # include and lib that the generated source code depends on
        ├── include            # inference framework headers
        │   ├── nnacl          # nnacl operator headers
        │   └── wrapper        # wrapper operator headers
        ├── lib
        │   ├── libwrapper.a   # static library of some operators that MindSpore Lite codegen output depends on
        │   └── libnnacl.a     # static library of nnacl operators that MindSpore Lite codegen output depends on
        └── third_party
            ├── include
            │   └── CMSIS      # ARM CMSIS NN operator headers
            └── lib
                └── libcmsis_nn.a  # ARM CMSIS NN operator static library
```

## Code Integration, Compilation, and Deployment

The `benchmark` directory of the generated code contains examples of calling the inference interfaces. Users can refer to the benchmark routine and integrate the `src` inference code to build their own applications.

### Calling Interfaces of the Inference Code

The following are the general calling interfaces of the inference code; for details, see the [API documentation](https://www.mindspore.cn/lite/api/zh-CN/master/index.html).

Table 3: general inference APIs

| Function | Prototype |
|------|------|
| Create a model | MSModelHandle MSModelCreate() |
| Destroy a model | void MSModelDestroy(MSModelHandle *model) |
| Calculate the workspace size required at runtime (Cortex-M only) | size_t MSModelCalcWorkspaceSize(MSModelHandle model) |
| Set the runtime workspace (Cortex-M only) | void MSModelSetWorkspace(MSModelHandle model, void *workspace, size_t workspace_size) |
| Build the model | MSStatus MSModelBuild(MSModelHandle model, const void *model_data, size_t data_size, MSModelType model_type, const MSContextHandle model_context) |
| Resize the model input shapes | MSStatus MSModelResize(MSModelHandle model, const MSTensorHandleArray inputs, MSShapeInfo *shape_infos, size_t shape_info_num) |
| Run model inference | MSStatus MSModelPredict(MSModelHandle model, const MSTensorHandleArray inputs, MSTensorHandleArray *outputs, const MSKernelCallBackC before, const MSKernelCallBackC after) |
| Get all input tensors | MSTensorHandleArray MSModelGetInputs(const MSModelHandle model) |
| Get all output tensors | MSTensorHandleArray MSModelGetOutputs(const MSModelHandle model) |
| Get an input tensor by name | MSTensorHandle MSModelGetInputByTensorName(const MSModelHandle model, const char *tensor_name) |
| Get an output tensor by name | MSTensorHandle MSModelGetOutputByTensorName(const MSModelHandle model, const char *tensor_name) |

### Calling Interfaces of the Training Code

The following are the general calling interfaces of the training code.

Table 4: general training APIs (only training-specific interfaces are listed)

| Function | Prototype |
|------|------|
| Run a single training step | MSStatus MSModelRunStep(MSModelHandle model, const MSKernelCallBackC before, const MSKernelCallBackC after) |
| Set the execution mode (train/eval) | MSStatus MSModelSetTrainMode(MSModelHandle model, bool train) |
| Export the model weights | MSStatus MSModelExportWeight(MSModelHandle model, const char *export_path) |
### Integration Differences Across Platforms

Code integration and compilation/deployment differ from platform to platform:

- For Cortex-M MCUs, see [Performing Inference on an MCU](#performing-inference-on-an-mcu)
- For the x86_64 Linux platform, see [Compiling and deploying on Linux_x86_64](https://gitee.com/mindspore/mindspore/tree/master/mindspore/lite/examples/quick_start_micro/mnist_x86)
- For arm32/arm64 Android platforms, see [Compiling and deploying on Android](https://gitee.com/mindspore/mindspore/tree/master/mindspore/lite/examples/quick_start_micro/mobilenetv2_arm64)
- For the OpenHarmony platform, see [Performing Inference on Light HarmonyOS Devices](#在轻鸿蒙设备上执行推理)

### Multi-Model Inference Integration

Multi-model integration is similar to the single-model case, with one difference: in the single-model scenario the user creates the model through the `MSModelCreate` interface, whereas in the multi-model scenario `MSModelHandle` handles are provided. By operating on the handles of the different models, the user calls the same single-model inference APIs to integrate each model. The `MSModelHandle` handles can be found in the `model_handle.h` file in the multi-model code directory.

## Performing Inference on an MCU

### Overview

This tutorial uses the deployment of the [MNIST model](https://download.mindspore.cn/model_zoo/official/lite/quick_start/micro/mnist.tar.gz) on an STM32F767 chip as an example to demonstrate how to deploy an inference model on a Cortex-M MCU. It consists of the following steps:

- Generate model inference code adapted to the Cortex-M architecture with the converter_lite tool
- Download the `Micro` library for the corresponding Cortex-M architecture
- Integrate the obtained inference code with the `Micro` library, then compile, deploy, and verify

On Windows, we demonstrate integrating the inference code with IAR; on Linux, we demonstrate code integration via Makefile cross-compilation.

### Generating MCU Inference Code

To generate inference code for the MCU, follow [Generating Model Inference Code](#generating-model-inference-code) and simply change `target=x86` to `target=Cortex-M` in the Micro options.

After generation succeeds, the folder content is as follows:

```text
mnist                          # root directory of the generated code
├── benchmark                  # benchmark routine that integrates and calls the inference code
│   ├── benchmark.c
│   ├── calib_output.c
│   ├── calib_output.h
│   ├── data.c
│   ├── data.h
│   ├── load_input.c
│   └── load_input.h
├── build.sh                   # one-click build script
├── CMakeLists.txt             # cmake project file of the benchmark routine
├── cortex-m7.toolchain.cmake  # cross-compilation cmake file for cortex-m7
└── src                        # model inference code directory
    ├── CMakeLists.txt
    ├── context.c
    ├── context.h
    ├── model.c
    ├── net.c
    ├── net.cmake
    ├── net.h
    ├── tensor.c
    ├── tensor.h
    ├── weight.c
    └── weight.h
```

### Downloading the Cortex-M `Micro` Library

The STM32F767 chip has a Cortex-M7 core. The `Micro` library for this architecture can be obtained in two ways:

- Download the [release package](https://www.mindspore.cn/lite/docs/zh-CN/master/use/downloads.html) from the MindSpore website. Choose the release package whose operating system is None and whose hardware platform is Cortex-M7.
- [Build from source](https://www.mindspore.cn/lite/docs/zh-CN/master/use/build.html). The `Cortex-M7` release package can be built with the command `MSLITE_MICRO_PLATFORM=cortex-m7 bash build.sh -I x86_64`.
For other Cortex-M platforms for which no release package is provided yet, users can follow the source-build approach, modify the MindSpore source code, and build a release package manually.

### Code Integration, Compilation, and Deployment on Windows: Integrating via IAR

This example uses IAR for code integration and flashing and demonstrates how to integrate the generated inference code on Windows. The main steps are:

- Download the required software and prepare the integration environment
- Generate the required MCU startup code and demo project with `STM32CubeMX`
- Integrate the model inference code and the `Micro` library in `IAR`
- Compile and run in the simulator

#### Environment Preparation

- [STM32CubeMX Windows version](https://www.st.com/en/development-tools/stm32cubemx.html) >= 6.0.1
    - `STM32CubeMX` is STMicroelectronics' graphical configuration tool for STM32 chips, used to generate startup code and projects for STM chips.
- [IAR EWARM](https://www.iar.com/ewarm) >= 9.1
    - `IAR EWARM` is an integrated development environment for ARM microprocessors from IAR Systems.

#### Getting the MCU Startup Code and Project

Skip this section if you already have your own MCU project.

This chapter uses the startup project for the STM32F767 chip as an example to demonstrate how to generate an MCU project for an STM32 chip with `STM32CubeMX`:

- Start `STM32CubeMX` and choose `New Project` from the `File` menu.
- In the `MCU/MPU Selector` window, search for and select `STM32F767IGT6`, then click `Start Project` to create a project for this chip.
- In `Project Manager`, set the project name and output path, and choose `EWARM` under `Toolchain / IDE` to generate an IAR project.
- Click `GENERATE CODE` at the top to generate the code.
- On a PC with `IAR` installed, double-click `Project.eww` in the `EWARM` directory of the generated project to open the IAR project.

#### Integrating the Model Inference Code and the `Micro` Library

- Copy the generated inference code into the project, and unpack the package obtained in [Downloading the Cortex-M `Micro` Library](#downloading-the-cortex-m-micro-library) into the generated code directory, as shown below:

```text
test_stm32f767                 # MCU project directory
├── Core
│   ├── Inc
│   └── Src
│       ├── main.c
│       └── ...
├── Drivers
├── EWARM                      # IAR project files
└── mnist                      # root directory of the generated code
    ├── benchmark              # benchmark routine that integrates and calls the inference code
    │   ├── benchmark.c
    │   ├── data.c
    │   ├── data.h
    │   └── ...
    ├── mindspore-lite-1.8.0-none-cortex-m7   # downloaded Cortex-M7 `Micro` library
    ├── src                    # model inference code directory
    └── ...
```

- Import the source files into the IAR project

  Open the IAR project. In the `Workspace` view, right-click the project and choose `Add -> Add Group` to add an `mnist` group; right-click that group and repeat the operation to create `src` and `benchmark` groups. In each group, choose `Add -> Add Files` and add the source files from the `src` and `benchmark` directories of the `mnist` folder to the corresponding group.

- Add the required header paths and static libraries

  In the `Workspace` view, right-click the project and choose `Options` to open the project options window. Select `C/C++ Compiler` on the left, open the `Preprocessor` tab on the right, and add the header paths that the inference code depends on. In this example:

```text
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/runtime
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/runtime/include
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/include
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/include/CMSIS/Core
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/include/CMSIS/DSP
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/include/CMSIS/NN
$PROJ_DIR$/../mnist
```

  Select `Linker` on the left, open the `Library` tab, and add the operator static libraries that the inference code depends on. In this example:

```text
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/lib/libwrapper.a
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/lib/libnnacl.a
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/lib/libcmsis_nn.a
```

- Modify main.c to call the benchmark function

  Add the header include at the top of `main.c` and call the `benchmark` function from `benchmark.c` inside main. The program in the benchmark folder is a sample that runs the generated inference code in `src` and compares the outputs; users are free to modify it.

```c++
#include "benchmark/benchmark.h"
...
int main(void) {
  ...
  if (benchmark() == 0) {
    printf("\nrun success.\n");
  } else {
    printf("\nrun failed.\n");
  }
  ...
} ``` - 修改`mnist/benchmark/data.c`文件,将标杆输入输出数据存放在程序内以进行对比 在benchmark例程内,会设置模型的输入数据,并将推理结果和设定的期望结果进行对比,得到误差偏移值。 在本例中,通过修改`data.c`的`calib_input0_data`数组,设置模型的输入数据,通过修改`calib_output0_data`,设定期望结果。 ```c++ float calib_input0_data[NET_INPUT0_SIZE] = {0.54881352186203,0.7151893377304077,0.6027633547782898,0.5448831915855408,0.42365479469299316,0.6458941102027893,0.4375872015953064,0.891772985458374,0.9636627435684204,0.3834415078163147,0.7917250394821167,0.5288949012756348,0.5680445432662964,0.9255966544151306,0.07103605568408966,0.08712930232286453,0.020218396559357643,0.832619845867157,0.7781567573547363,0.8700121641159058,0.978618323802948,0.7991585731506348,0.4614793658256531,0.7805292010307312,0.11827442795038223,0.6399210095405579,0.14335328340530396,0.9446688890457153,0.5218483209609985,0.4146619439125061,0.26455560326576233,0.7742336988449097,0.4561503231525421,0.568433940410614,0.018789799883961678,0.6176354885101318,0.6120957136154175,0.6169340014457703,0.9437480568885803,0.681820273399353,0.35950788855552673,0.43703195452690125,0.6976311802864075,0.0602254718542099,0.6667667031288147,0.670637845993042,0.21038256585597992,0.12892629206180573,0.31542834639549255,0.36371076107025146,0.5701967477798462,0.4386015236377716,0.9883738160133362,0.10204481333494186,0.20887675881385803,0.16130951046943665,0.6531082987785339,0.25329160690307617,0.4663107693195343,0.24442559480667114,0.15896958112716675,0.11037514358758926,0.6563295722007751,0.13818295300006866,0.1965823620557785,0.3687251806259155,0.8209932446479797,0.09710127860307693,0.8379449248313904,0.0960984081029892,0.9764594435691833,0.4686512053012848,0.9767611026763916,0.6048455238342285,0.7392635941505432,0.03918779268860817,0.28280696272850037,0.12019655853509903,0.296140193939209,0.11872772127389908,0.3179831802845001,0.414262980222702,0.06414749473333359,0.6924721002578735,0.5666014552116394,0.26538950204849243,0.5232480764389038,0.09394051134586334,0.5759465098381042,0.9292961955070496,0.3185689449310303,0.66
74103736877441,0.13179786503314972,0.7163271903991699,0.28940609097480774,0.18319135904312134,0.5865129232406616,0.02010754682123661,0.8289400339126587,0.004695476032793522,0.6778165102005005,0.2700079679489136,0.7351940274238586,0.9621885418891907,0.2487531453371048,0.5761573314666748,0.5920419096946716,0.5722519159317017,0.22308163344860077,0.9527490139007568,0.4471253752708435,0.8464086651802063,0.6994792819023132,0.2974369525909424,0.8137978315353394,0.396505743265152,0.8811032176017761,0.5812729001045227,0.8817353844642639,0.6925315856933594,0.7252542972564697,0.5013243556022644,0.9560836553573608,0.6439902186393738,0.4238550364971161,0.6063932180404663,0.019193198531866074,0.30157482624053955,0.6601735353469849,0.2900775969028473,0.6180154085159302,0.42876869440078735,0.1354740709066391,0.29828232526779175,0.5699648857116699,0.5908727645874023,0.5743252635002136,0.6532008051872253,0.6521032452583313,0.43141844868659973,0.8965466022491455,0.36756187677383423,0.4358649253845215,0.8919233679771423,0.806194007396698,0.7038885951042175,0.10022688657045364,0.9194825887680054,0.7142413258552551,0.9988470077514648,0.14944830536842346,0.8681260347366333,0.16249293088912964,0.6155595779418945,0.1238199844956398,0.8480082154273987,0.8073189854621887,0.5691007375717163,0.40718328952789307,0.06916699558496475,0.6974287629127502,0.45354267954826355,0.7220556139945984,0.8663823008537292,0.9755215048789978,0.855803370475769,0.011714084073901176,0.359978049993515,0.729990541934967,0.17162968218326569,0.5210366249084473,0.054337989538908005,0.19999653100967407,0.01852179504930973,0.793697714805603,0.2239246815443039,0.3453516662120819,0.9280812740325928,0.704414427280426,0.031838931143283844,0.1646941602230072,0.6214783787727356,0.5772286057472229,0.23789282143115997,0.9342139959335327,0.6139659285545349,0.5356327891349792,0.5899099707603455,0.7301220297813416,0.31194499135017395,0.39822107553482056,0.20984375476837158,0.18619300425052643,0.9443724155426025,0.739550769329071,0.
49045881628990173,0.22741462290287018,0.2543564736843109,0.058029159903526306,0.43441662192344666,0.3117958903312683,0.6963434815406799,0.37775182723999023,0.1796036809682846,0.024678727611899376,0.06724963337182999,0.6793927550315857,0.4536968469619751,0.5365791916847229,0.8966712951660156,0.990338921546936,0.21689698100090027,0.6630781888961792,0.2633223831653595,0.02065099962055683,0.7583786249160767,0.32001715898513794,0.38346388936042786,0.5883170962333679,0.8310484290122986,0.6289818286895752,0.872650682926178,0.27354204654693604,0.7980468273162842,0.18563593924045563,0.9527916312217712,0.6874882578849792,0.21550767123699188,0.9473705887794495,0.7308558225631714,0.2539416551589966,0.21331197023391724,0.518200695514679,0.02566271834075451,0.20747007429599762,0.4246854782104492,0.3741699755191803,0.46357542276382446,0.27762871980667114,0.5867843627929688,0.8638556003570557,0.11753185838460922,0.517379105091095,0.13206811249256134,0.7168596982955933,0.39605969190597534,0.5654212832450867,0.1832798421382904,0.14484776556491852,0.4880562722682953,0.35561272501945496,0.9404319524765015,0.7653252482414246,0.748663604259491,0.9037197232246399,0.08342243731021881,0.5521924495697021,0.5844760537147522,0.961936354637146,0.29214751720428467,0.24082878232002258,0.10029394179582596,0.016429629176855087,0.9295293092727661,0.669916570186615,0.7851529121398926,0.28173011541366577,0.5864101648330688,0.06395526975393295,0.48562759160995483,0.9774951338768005,0.8765052556991577,0.3381589651107788,0.961570143699646,0.23170162737369537,0.9493188261985779,0.9413776993751526,0.799202561378479,0.6304479241371155,0.8742879629135132,0.2930202782154083,0.8489435315132141,0.6178767085075378,0.013236857950687408,0.34723350405693054,0.14814086258411407,0.9818294048309326,0.4783703088760376,0.49739137291908264,0.6394725441932678,0.36858460307121277,0.13690027594566345,0.8221177458763123,0.1898479163646698,0.5113189816474915,0.2243170291185379,0.09784448146820068,0.8621914982795715,0.97291946
41113281,0.9608346819877625,0.9065554738044739,0.774047315120697,0.3331451416015625,0.08110138773918152,0.40724116563796997,0.2322341352701187,0.13248763978481293,0.053427182137966156,0.7255943417549133,0.011427458375692368,0.7705807685852051,0.14694663882255554,0.07952208071947098,0.08960303664207458,0.6720477938652039,0.24536721408367157,0.4205394685268402,0.557368814945221,0.8605511784553528,0.7270442843437195,0.2703278958797455,0.131482794880867,0.05537432059645653,0.3015986382961273,0.2621181607246399,0.45614057779312134,0.6832813620567322,0.6956254243850708,0.28351885080337524,0.3799269497394562,0.18115095794200897,0.7885454893112183,0.05684807524085045,0.6969972252845764,0.7786954045295715,0.7774075865745544,0.25942257046699524,0.3738131523132324,0.5875996351242065,0.27282190322875977,0.3708527982234955,0.19705428183078766,0.4598558843135834,0.044612299650907516,0.7997958660125732,0.07695644348859787,0.5188351273536682,0.3068101108074188,0.5775429606437683,0.9594333171844482,0.6455702185630798,0.03536243736743927,0.4304024279117584,0.5100168585777283,0.5361775159835815,0.6813924908638,0.2775960862636566,0.12886056303977966,0.3926756680011749,0.9564056992530823,0.1871308982372284,0.9039839506149292,0.5438059568405151,0.4569114148616791,0.8820413947105408,0.45860394835472107,0.7241676449775696,0.3990253210067749,0.9040443897247314,0.6900250315666199,0.6996220350265503,0.32772040367126465,0.7567786574363708,0.6360610723495483,0.2400202751159668,0.16053882241249084,0.796391487121582,0.9591665863990784,0.4581388235092163,0.5909841656684875,0.8577226400375366,0.45722344517707825,0.9518744945526123,0.5757511854171753,0.8207671046257019,0.9088436961174011,0.8155238032341003,0.15941447019577026,0.6288984417915344,0.39843425154685974,0.06271295249462128,0.4240322411060333,0.25868406891822815,0.849038302898407,0.03330462798476219,0.9589827060699463,0.35536885261535645,0.3567068874835968,0.01632850244641304,0.18523232638835907,0.40125951170921326,0.9292914271354675,0.099
6149331331253,0.9453015327453613,0.869488537311554,0.4541623890399933,0.326700896024704,0.23274412751197815,0.6144647002220154,0.03307459130883217,0.015606064349412918,0.428795725107193,0.06807407736778259,0.2519409954547882,0.2211609184741974,0.253191202878952,0.13105523586273193,0.012036222964525223,0.11548429727554321,0.6184802651405334,0.9742562174797058,0.9903450012207031,0.40905410051345825,0.1629544198513031,0.6387617588043213,0.4903053343296051,0.9894098043441772,0.06530420482158661,0.7832344174385071,0.28839850425720215,0.24141861498355865,0.6625045537948608,0.24606318771839142,0.6658591032028198,0.5173085331916809,0.4240889847278595,0.5546877980232239,0.2870515286922455,0.7065746784210205,0.414856880903244,0.3605455458164215,0.8286569118499756,0.9249669313430786,0.04600730910897255,0.2326269894838333,0.34851935505867004,0.8149664998054504,0.9854914546012878,0.9689717292785645,0.904948353767395,0.2965562641620636,0.9920112490653992,0.24942004680633545,0.10590615123510361,0.9509525895118713,0.2334202527999878,0.6897682547569275,0.05835635960102081,0.7307090759277344,0.8817201852798462,0.27243688702583313,0.3790569007396698,0.3742961883544922,0.7487882375717163,0.2378072440624237,0.17185309529304504,0.4492916464805603,0.30446839332580566,0.8391891121864319,0.23774182796478271,0.5023894309997559,0.9425836205482483,0.6339976787567139,0.8672894239425659,0.940209686756134,0.7507648468017578,0.6995750665664673,0.9679655432701111,0.9944007992744446,0.4518216848373413,0.07086978107690811,0.29279401898384094,0.15235470235347748,0.41748636960983276,0.13128933310508728,0.6041178107261658,0.38280805945396423,0.8953858613967896,0.96779465675354,0.5468848943710327,0.2748235762119293,0.5922304391860962,0.8967611789703369,0.40673333406448364,0.5520782470703125,0.2716527581214905,0.4554441571235657,0.4017135500907898,0.24841345846652985,0.5058664083480835,0.31038081645965576,0.37303486466407776,0.5249704718589783,0.7505950331687927,0.3335074782371521,0.9241587519645691,0.862
3185753822327,0.048690296709537506,0.2536425292491913,0.4461355209350586,0.10462789237499237,0.34847599267959595,0.7400975227355957,0.6805144548416138,0.6223844289779663,0.7105283737182617,0.20492368936538696,0.3416981101036072,0.676242470741272,0.879234790802002,0.5436780452728271,0.2826996445655823,0.030235258862376213,0.7103368043899536,0.007884103804826736,0.37267908453941345,0.5305371880531311,0.922111451625824,0.08949454873800278,0.40594232082366943,0.024313200265169144,0.3426109850406647,0.6222310662269592,0.2790679335594177,0.2097499519586563,0.11570323258638382,0.5771402716636658,0.6952700018882751,0.6719571352005005,0.9488610029220581,0.002703213831409812,0.6471966505050659,0.60039222240448,0.5887396335601807,0.9627703428268433,0.016871673986315727,0.6964824199676514,0.8136786222457886,0.5098071694374084,0.33396488428115845,0.7908401489257812,0.09724292904138565,0.44203564524650574,0.5199523568153381,0.6939564347267151,0.09088572859764099,0.2277594953775406,0.4103015661239624,0.6232946515083313,0.8869608044624329,0.618826150894165,0.13346147537231445,0.9805801510810852,0.8717857599258423,0.5027207732200623,0.9223479628562927,0.5413808226585388,0.9233060479164124,0.8298973441123962,0.968286395072937,0.919782817363739,0.03603381663560867,0.1747720092535019,0.3891346752643585,0.9521427154541016,0.300028920173645,0.16046763956546783,0.8863046765327454,0.4463944137096405,0.9078755974769592,0.16023047268390656,0.6611174941062927,0.4402637481689453,0.07648676633834839,0.6964631676673889,0.2473987489938736,0.03961552307009697,0.05994429811835289,0.06107853725552559,0.9077329635620117,0.7398838996887207,0.8980623483657837,0.6725823283195496,0.5289399027824402,0.30444636940956116,0.997962236404419,0.36218905448913574,0.47064894437789917,0.37824517488479614,0.979526937007904,0.1746583878993988,0.32798799872398376,0.6803486943244934,0.06320761889219284,0.60724937915802,0.47764649987220764,0.2839999794960022,0.2384132742881775,0.5145127177238464,0.36792758107185364,0.4
565199017524719,0.3374773859977722,0.9704936742782593,0.13343943655490875,0.09680395573377609,0.3433917164802551,0.5910269021987915,0.6591764688491821,0.3972567617893219,0.9992780089378357,0.35189300775527954,0.7214066386222839,0.6375827193260193,0.8130538463592529,0.9762256741523743,0.8897936344146729,0.7645619511604309,0.6982485055923462,0.335498183965683,0.14768557250499725,0.06263600289821625,0.2419017106294632,0.432281494140625,0.521996259689331,0.7730835676193237,0.9587409496307373,0.1173204779624939,0.10700414329767227,0.5896947383880615,0.7453980445861816,0.848150372505188,0.9358320832252502,0.9834262132644653,0.39980170130729675,0.3803351819515228,0.14780867099761963,0.6849344372749329,0.6567619442939758,0.8620625734329224,0.09725799411535263,0.49777689576148987,0.5810819268226624,0.2415570467710495,0.16902540624141693,0.8595808148384094,0.05853492394089699,0.47062090039253235,0.11583399772644043,0.45705875754356384,0.9799623489379883,0.4237063527107239,0.857124924659729,0.11731556057929993,0.2712520658969879,0.40379273891448975,0.39981213212013245,0.6713835000991821,0.3447181284427643,0.713766872882843,0.6391869187355042,0.399161159992218,0.43176013231277466,0.614527702331543,0.0700421929359436,0.8224067091941833,0.65342116355896,0.7263424396514893,0.5369229912757874,0.11047711223363876,0.4050356149673462,0.40537357330322266,0.3210429847240448,0.029950324445962906,0.73725426197052,0.10978446155786514,0.6063081622123718,0.7032175064086914,0.6347863078117371,0.95914226770401,0.10329815745353699,0.8671671748161316,0.02919023483991623,0.534916877746582,0.4042436182498932,0.5241838693618774,0.36509987711906433,0.19056691229343414,0.01912289671599865,0.5181497931480408,0.8427768349647522,0.3732159435749054,0.2228638231754303,0.080532006919384,0.0853109210729599,0.22139644622802734,0.10001406073570251,0.26503971219062805,0.06614946573972702,0.06560486555099487,0.8562761545181274,0.1621202677488327,0.5596824288368225,0.7734555602073669,0.4564095735549927,0.1533688
7538433075,0.19959613680839539,0.43298420310020447,0.52823406457901,0.3494403064250946,0.7814795970916748,0.7510216236114502,0.9272118210792542,0.028952548280358315,0.8956912755966187,0.39256879687309265,0.8783724904060364,0.690784752368927,0.987348735332489,0.7592824697494507,0.3645446300506592,0.5010631680488586,0.37638914585113525,0.364911824464798,0.2609044909477234,0.49597030878067017,0.6817399263381958,0.27734026312828064,0.5243797898292542,0.117380291223526,0.1598452925682068,0.04680635407567024,0.9707314372062683,0.0038603513967245817,0.17857997119426727,0.6128667593002319,0.08136960119009018,0.8818964958190918,0.7196201682090759,0.9663899540901184,0.5076355338096619,0.3004036843776703,0.549500584602356,0.9308187365531921,0.5207614302635193,0.2672070264816284,0.8773987889289856,0.3719187378883362,0.0013833499979227781,0.2476850152015686,0.31823351979255676,0.8587774634361267,0.4585031569004059,0.4445872902870178,0.33610227704048157,0.880678117275238,0.9450267553329468,0.9918903112411499,0.3767412602901459,0.9661474227905273,0.7918795943260193,0.675689160823822,0.24488948285579681,0.21645726263523102,0.1660478264093399,0.9227566123008728,0.2940766513347626,0.4530942440032959,0.49395784735679626,0.7781715989112854,0.8442349433898926,0.1390727013349533,0.4269043505191803,0.842854917049408,0.8180332779884338}; float calib_output0_data[NET_OUTPUT0_SIZE] = {3.5647096e-05,6.824297e-08,0.009327697,3.2340475e-05,1.1117579e-05,1.5117058e-06,4.6314454e-07,5.161628e-11,0.9905911,3.8835238e-10}; ``` #### 编译并仿真运行 在本例中,采用软件仿真的方式对推理结果进行查看分析。 在`Workspace`界面,右键该项目,选择`Options`,打开项目选项窗口。在项目选项窗口左侧选择`Debugger`选项,在右侧的`Setup`子窗口中,`Driver`选择为`Simulator`,使能软件仿真。 关闭项目选项窗口,点击菜单栏`Project -> Download and Debug`,进行项目编译并仿真。通过在`benchmark`调用处加入断点,可以观察到仿真运行的推理结果,以及benchmark()函数的返回值。 ### 在Linux上的代码集成及编译部署:通过MakeFile进行代码集成 本章教程以在Linux平台上,通过MakeFile对生成的模型代码进行集成开发为例,演示了在Linux上进行MCU推理代码集成开发的一般步骤。 主要分为以下几步: - 下载所需要的相关软件,准备好交叉编译及烧录环境 - 通过`STM32CubeMX`软件生成所需要的MCU启动代码及演示工程 - 
修改`Makefile`,集成模型推理代码及`Micro`库 - 编译工程及烧录 - 读取板端运行结果并验证 本例构建完成的完整demo代码,可点击[此处下载](https://download.mindspore.cn/model_zoo/official/lite/quick_start/micro/test_stmf767.tar.gz)。 #### 环境准备 - [CMake](https://cmake.org/download/) >= 3.18.3 - [GNU Arm Embedded Toolchain](https://developer.arm.com/downloads/-/gnu-rm) >= 10-2020-q4-major-x86_64-linux - 该工具为Linux下适用于Cortex-M平台的交叉编译工具。 - 下载x86_64-linux版本的`gcc-arm-none-eabi`,解压缩后,将目录下的bin路径加入到PATH环境变量中:`export PATH=gcc-arm-none-eabi路径/bin:$PATH`。 - [STM32CubeMX-Lin](https://www.st.com/en/development-tools/stm32cubemx.html) >= 6.5.0 - `STM32CubeMX`是`意法半导体`提供的STM32芯片图形化配置工具,该工具用于生成STM32芯片的启动代码及工程。 - [STM32CubePrg-Lin](https://www.st.com/en/development-tools/stm32cubeprog.html) >= 6.5.0 - 该工具是`意法半导体`提供的烧录工具,可用于程序烧录和数据读取。 #### 获取MCU启动代码及工程 如果用户已经有自己的MCU工程,请忽略该章节。 本章以生成STM32F767芯片的启动工程为例,演示如何通过`STM32CubeMX`生成STM32芯片的MCU工程。 - 启动`STM32CubeMX`,在`File`选项中选择`New Project`来新建工程。 - 在`MCU/MPU Selector`窗口,搜索并选择`STM32F767IGT6`,点击`Start Project`创建该芯片的工程。 - 在`Project Manager`界面,配置工程名及生成的工程路径,在`Toolchain / IDE`选项选择`Makefile`,以指定生成`Makefile`工程。 - 点击上方的`GENERATE CODE`生成代码。 - 在生成的工程目录下执行`make`,测试代码是否成功编译。 #### 集成模型推理代码 - 将生成的推理代码拷贝到MCU工程内,并将[下载Cortex-M架构`Micro`库](#下载cortex-m架构micro库)章节获得的压缩包解压后放到生成的推理代码目录内,目录如下所示: ```text stm32f767 # MCU工程目录 ├── Core │   ├── Inc │   └── Src │ ├── main.c │ └── ... ├── Drivers ├── mnist # 生成代码根目录 │   ├── benchmark # 对模型推理代码进行集成调用的benchmark例程 │ │ ├── benchmark.c │ │ ├── data.c │ │ ├── data.h │ │ └── ... │ ├── mindspore-lite-1.8.0-none-cortex-m7 # 下载的Cortex-M7架构`Micro`库 │ ├── src # 模型推理代码目录 │   └── ... 
├── Makefile ├── startup_stm32f767xx.s └── STM32F767IGTx_FLASH.ld ``` - 修改`Makefile`,将模型推理代码及依赖库加入工程 本例中要加入工程的源代码包括`src`下的模型推理代码和`benchmark`目录下的模型推理调用示例代码。 修改`Makefile`中的`C_SOURCES`变量定义,加入源文件路径: ```bash C_SOURCES = \ mnist/src/context.c \ mnist/src/model.c \ mnist/src/net.c \ mnist/src/tensor.c \ mnist/src/weight.c \ mnist/benchmark/benchmark.c \ mnist/benchmark/calib_output.c \ mnist/benchmark/load_input.c \ mnist/benchmark/data.c \ ... ``` 加入依赖的头文件路径:修改`Makefile`中的`C_INCLUDES`变量定义,加入以下路径: ```text LITE_PACK = mindspore-lite-1.8.0-none-cortex-m7 C_INCLUDES = \ -Imnist/$(LITE_PACK)/runtime \ -Imnist/$(LITE_PACK)/runtime/include \ -Imnist/$(LITE_PACK)/tools/codegen/include \ -Imnist/$(LITE_PACK)/tools/codegen/third_party/include/CMSIS/Core \ -Imnist/$(LITE_PACK)/tools/codegen/third_party/include/CMSIS/DSP \ -Imnist/$(LITE_PACK)/tools/codegen/third_party/include/CMSIS/NN \ -Imnist \ ... ``` 加入依赖的算子库(`-lnnacl -lwrapper -lcmsis_nn`),声明算子库文件所在路径,增加链接时编译选项(`-specs=nosys.specs`)。 本例中修改后的相关变量定义如下: ```text LIBS = -lc -lm -lnosys -lnnacl -lwrapper -lcmsis_nn LIBDIR = -Lmnist/$(LITE_PACK)/tools/codegen/lib -Lmnist/$(LITE_PACK)/tools/codegen/third_party/lib LDFLAGS = $(MCU) -specs=nosys.specs -specs=nano.specs -T$(LDSCRIPT) $(LIBDIR) $(LIBS) -Wl,-Map=$(BUILD_DIR)/$(TARGET).map,--cref -Wl,--gc-sections ``` - 修改main.c文件,调用benchmark函数 在main函数中调用`benchmark.c`中的`benchmark`函数。benchmark文件夹内的程序是对生成的`src`内推理代码进行推理调用并比较输出的示范样例程序,用户可以自由对其进行修改。在本例中,我们直接调用`benchmark`函数,并根据返回结果为`run_dnn_flag`变量赋值。 ```c++ run_dnn_flag = '0'; if (benchmark() == 0) { printf("\nrun success.\n"); run_dnn_flag = '1'; } else { printf("\nrun failed.\n"); run_dnn_flag = '2'; } ``` 在`main.c`开头增加头文件引用和`run_dnn_flag`变量的定义。 ```c++ #include "benchmark/benchmark.h" char run_dnn_flag __attribute__((section(".myram"))); // 保存推理结果的标志变量 ``` 在本例中,为方便直接使用烧录器对推理结果进行读取,把变量定义在了自定义的section段(`myram`)中,用户可以使用下面方式设置自定义的section段,或者忽略该声明,采用串口或其他交互方式得到推理结果。 自定义section段的设置方法如下: 
修改`STM32F767IGTx_FLASH.ld`中的`MEMORY`段,增加一个自定义内存段`MYRAM`(在本例中,将`RAM`内存起始地址加4,以腾出内存给`MYRAM`);接着在`SECTIONS`段内增加自定义的`myram`段声明。 ```text MEMORY { MYRAM (xrw) : ORIGIN = 0x20000000, LENGTH = 1 RAM (xrw) : ORIGIN = 0x20000004, LENGTH = 524284 ... } ... SECTIONS { ... .myram (NOLOAD): { . = ALIGN(4); _smyram = .; /* create a global symbol at data start */ *(.sram) /* .data sections */ *(.sram*) /* .data* sections */ . = ALIGN(4); _emyram = .; /* define a global symbol at data end */ } >MYRAM AT> FLASH } ``` - 修改`mnist/benchmark/data.c`文件,将标杆输入输出数据存放在程序内以进行对比 在benchmark例程内,会设置模型的输入数据,并将推理结果和设定的期望结果进行对比,得到误差偏移值。 在本例中,通过修改`data.c`的`calib_input0_data`数组,设置模型的输入数据,通过修改`calib_output0_data`,设定期望结果。 ```c++ float calib_input0_data[NET_INPUT0_SIZE] = {0.54881352186203,0.7151893377304077,0.6027633547782898,0.5448831915855408,0.42365479469299316,0.6458941102027893,0.4375872015953064,0.891772985458374,0.9636627435684204,0.3834415078163147,0.7917250394821167,0.5288949012756348,0.5680445432662964,0.9255966544151306,0.07103605568408966,0.08712930232286453,0.020218396559357643,0.832619845867157,0.7781567573547363,0.8700121641159058,0.978618323802948,0.7991585731506348,0.4614793658256531,0.7805292010307312,0.11827442795038223,0.6399210095405579,0.14335328340530396,0.9446688890457153,0.5218483209609985,0.4146619439125061,0.26455560326576233,0.7742336988449097,0.4561503231525421,0.568433940410614,0.018789799883961678,0.6176354885101318,0.6120957136154175,0.6169340014457703,0.9437480568885803,0.681820273399353,0.35950788855552673,0.43703195452690125,0.6976311802864075,0.0602254718542099,0.6667667031288147,0.670637845993042,0.21038256585597992,0.12892629206180573,0.31542834639549255,0.36371076107025146,0.5701967477798462,0.4386015236377716,0.9883738160133362,0.10204481333494186,0.20887675881385803,0.16130951046943665,0.6531082987785339,0.25329160690307617,0.4663107693195343,0.24442559480667114,0.15896958112716675,0.11037514358758926,0.6563295722007751,0.13818295300006866,0.1965823620557785,0.368
7251806259155,0.8209932446479797,0.09710127860307693,0.8379449248313904,0.0960984081029892,0.9764594435691833,0.4686512053012848,0.9767611026763916,0.6048455238342285,0.7392635941505432,0.03918779268860817,0.28280696272850037,0.12019655853509903,0.296140193939209,0.11872772127389908,0.3179831802845001,0.414262980222702,0.06414749473333359,0.6924721002578735,0.5666014552116394,0.26538950204849243,0.5232480764389038,0.09394051134586334,0.5759465098381042,0.9292961955070496,0.3185689449310303,0.6674103736877441,0.13179786503314972,0.7163271903991699,0.28940609097480774,0.18319135904312134,0.5865129232406616,0.02010754682123661,0.8289400339126587,0.004695476032793522,0.6778165102005005,0.2700079679489136,0.7351940274238586,0.9621885418891907,0.2487531453371048,0.5761573314666748,0.5920419096946716,0.5722519159317017,0.22308163344860077,0.9527490139007568,0.4471253752708435,0.8464086651802063,0.6994792819023132,0.2974369525909424,0.8137978315353394,0.396505743265152,0.8811032176017761,0.5812729001045227,0.8817353844642639,0.6925315856933594,0.7252542972564697,0.5013243556022644,0.9560836553573608,0.6439902186393738,0.4238550364971161,0.6063932180404663,0.019193198531866074,0.30157482624053955,0.6601735353469849,0.2900775969028473,0.6180154085159302,0.42876869440078735,0.1354740709066391,0.29828232526779175,0.5699648857116699,0.5908727645874023,0.5743252635002136,0.6532008051872253,0.6521032452583313,0.43141844868659973,0.8965466022491455,0.36756187677383423,0.4358649253845215,0.8919233679771423,0.806194007396698,0.7038885951042175,0.10022688657045364,0.9194825887680054,0.7142413258552551,0.9988470077514648,0.14944830536842346,0.8681260347366333,0.16249293088912964,0.6155595779418945,0.1238199844956398,0.8480082154273987,0.8073189854621887,0.5691007375717163,0.40718328952789307,0.06916699558496475,0.6974287629127502,0.45354267954826355,0.7220556139945984,0.8663823008537292,0.9755215048789978,0.855803370475769,0.011714084073901176,0.359978049993515,0.729990541934967,0.1716
2968218326569,0.5210366249084473,0.054337989538908005,0.19999653100967407,0.01852179504930973,0.793697714805603,0.2239246815443039,0.3453516662120819,0.9280812740325928,0.704414427280426,0.031838931143283844,0.1646941602230072,0.6214783787727356,0.5772286057472229,0.23789282143115997,0.9342139959335327,0.6139659285545349,0.5356327891349792,0.5899099707603455,0.7301220297813416,0.31194499135017395,0.39822107553482056,0.20984375476837158,0.18619300425052643,0.9443724155426025,0.739550769329071,0.49045881628990173,0.22741462290287018,0.2543564736843109,0.058029159903526306,0.43441662192344666,0.3117958903312683,0.6963434815406799,0.37775182723999023,0.1796036809682846,0.024678727611899376,0.06724963337182999,0.6793927550315857,0.4536968469619751,0.5365791916847229,0.8966712951660156,0.990338921546936,0.21689698100090027,0.6630781888961792,0.2633223831653595,0.02065099962055683,0.7583786249160767,0.32001715898513794,0.38346388936042786,0.5883170962333679,0.8310484290122986,0.6289818286895752,0.872650682926178,0.27354204654693604,0.7980468273162842,0.18563593924045563,0.9527916312217712,0.6874882578849792,0.21550767123699188,0.9473705887794495,0.7308558225631714,0.2539416551589966,0.21331197023391724,0.518200695514679,0.02566271834075451,0.20747007429599762,0.4246854782104492,0.3741699755191803,0.46357542276382446,0.27762871980667114,0.5867843627929688,0.8638556003570557,0.11753185838460922,0.517379105091095,0.13206811249256134,0.7168596982955933,0.39605969190597534,0.5654212832450867,0.1832798421382904,0.14484776556491852,0.4880562722682953,0.35561272501945496,0.9404319524765015,0.7653252482414246,0.748663604259491,0.9037197232246399,0.08342243731021881,0.5521924495697021,0.5844760537147522,0.961936354637146,0.29214751720428467,0.24082878232002258,0.10029394179582596,0.016429629176855087,0.9295293092727661,0.669916570186615,0.7851529121398926,0.28173011541366577,0.5864101648330688,0.06395526975393295,0.48562759160995483,0.9774951338768005,0.8765052556991577,0.3381589651
107788,0.961570143699646,0.23170162737369537,0.9493188261985779,0.9413776993751526,0.799202561378479,0.6304479241371155,0.8742879629135132,0.2930202782154083,0.8489435315132141,0.6178767085075378,0.013236857950687408,0.34723350405693054,0.14814086258411407,0.9818294048309326,0.4783703088760376,0.49739137291908264,0.6394725441932678,0.36858460307121277,0.13690027594566345,0.8221177458763123,0.1898479163646698,0.5113189816474915,0.2243170291185379,0.09784448146820068,0.8621914982795715,0.9729194641113281,0.9608346819877625,0.9065554738044739,0.774047315120697,0.3331451416015625,0.08110138773918152,0.40724116563796997,0.2322341352701187,0.13248763978481293,0.053427182137966156,0.7255943417549133,0.011427458375692368,0.7705807685852051,0.14694663882255554,0.07952208071947098,0.08960303664207458,0.6720477938652039,0.24536721408367157,0.4205394685268402,0.557368814945221,0.8605511784553528,0.7270442843437195,0.2703278958797455,0.131482794880867,0.05537432059645653,0.3015986382961273,0.2621181607246399,0.45614057779312134,0.6832813620567322,0.6956254243850708,0.28351885080337524,0.3799269497394562,0.18115095794200897,0.7885454893112183,0.05684807524085045,0.6969972252845764,0.7786954045295715,0.7774075865745544,0.25942257046699524,0.3738131523132324,0.5875996351242065,0.27282190322875977,0.3708527982234955,0.19705428183078766,0.4598558843135834,0.044612299650907516,0.7997958660125732,0.07695644348859787,0.5188351273536682,0.3068101108074188,0.5775429606437683,0.9594333171844482,0.6455702185630798,0.03536243736743927,0.4304024279117584,0.5100168585777283,0.5361775159835815,0.6813924908638,0.2775960862636566,0.12886056303977966,0.3926756680011749,0.9564056992530823,0.1871308982372284,0.9039839506149292,0.5438059568405151,0.4569114148616791,0.8820413947105408,0.45860394835472107,0.7241676449775696,0.3990253210067749,0.9040443897247314,0.6900250315666199,0.6996220350265503,0.32772040367126465,0.7567786574363708,0.6360610723495483,0.2400202751159668,0.16053882241249084,0.796391
487121582,0.9591665863990784,0.4581388235092163,0.5909841656684875,0.8577226400375366,0.45722344517707825,0.9518744945526123,0.5757511854171753,0.8207671046257019,0.9088436961174011,0.8155238032341003,0.15941447019577026,0.6288984417915344,0.39843425154685974,0.06271295249462128,0.4240322411060333,0.25868406891822815,0.849038302898407,0.03330462798476219,0.9589827060699463,0.35536885261535645,0.3567068874835968,0.01632850244641304,0.18523232638835907,0.40125951170921326,0.9292914271354675,0.0996149331331253,0.9453015327453613,0.869488537311554,0.4541623890399933,0.326700896024704,0.23274412751197815,0.6144647002220154,0.03307459130883217,0.015606064349412918,0.428795725107193,0.06807407736778259,0.2519409954547882,0.2211609184741974,0.253191202878952,0.13105523586273193,0.012036222964525223,0.11548429727554321,0.6184802651405334,0.9742562174797058,0.9903450012207031,0.40905410051345825,0.1629544198513031,0.6387617588043213,0.4903053343296051,0.9894098043441772,0.06530420482158661,0.7832344174385071,0.28839850425720215,0.24141861498355865,0.6625045537948608,0.24606318771839142,0.6658591032028198,0.5173085331916809,0.4240889847278595,0.5546877980232239,0.2870515286922455,0.7065746784210205,0.414856880903244,0.3605455458164215,0.8286569118499756,0.9249669313430786,0.04600730910897255,0.2326269894838333,0.34851935505867004,0.8149664998054504,0.9854914546012878,0.9689717292785645,0.904948353767395,0.2965562641620636,0.9920112490653992,0.24942004680633545,0.10590615123510361,0.9509525895118713,0.2334202527999878,0.6897682547569275,0.05835635960102081,0.7307090759277344,0.8817201852798462,0.27243688702583313,0.3790569007396698,0.3742961883544922,0.7487882375717163,0.2378072440624237,0.17185309529304504,0.4492916464805603,0.30446839332580566,0.8391891121864319,0.23774182796478271,0.5023894309997559,0.9425836205482483,0.6339976787567139,0.8672894239425659,0.940209686756134,0.7507648468017578,0.6995750665664673,0.9679655432701111,0.9944007992744446,0.4518216848373413,0.070869
78107690811,0.29279401898384094,0.15235470235347748,0.41748636960983276,0.13128933310508728,0.6041178107261658,0.38280805945396423,0.8953858613967896,0.96779465675354,0.5468848943710327,0.2748235762119293,0.5922304391860962,0.8967611789703369,0.40673333406448364,0.5520782470703125,0.2716527581214905,0.4554441571235657,0.4017135500907898,0.24841345846652985,0.5058664083480835,0.31038081645965576,0.37303486466407776,0.5249704718589783,0.7505950331687927,0.3335074782371521,0.9241587519645691,0.8623185753822327,0.048690296709537506,0.2536425292491913,0.4461355209350586,0.10462789237499237,0.34847599267959595,0.7400975227355957,0.6805144548416138,0.6223844289779663,0.7105283737182617,0.20492368936538696,0.3416981101036072,0.676242470741272,0.879234790802002,0.5436780452728271,0.2826996445655823,0.030235258862376213,0.7103368043899536,0.007884103804826736,0.37267908453941345,0.5305371880531311,0.922111451625824,0.08949454873800278,0.40594232082366943,0.024313200265169144,0.3426109850406647,0.6222310662269592,0.2790679335594177,0.2097499519586563,0.11570323258638382,0.5771402716636658,0.6952700018882751,0.6719571352005005,0.9488610029220581,0.002703213831409812,0.6471966505050659,0.60039222240448,0.5887396335601807,0.9627703428268433,0.016871673986315727,0.6964824199676514,0.8136786222457886,0.5098071694374084,0.33396488428115845,0.7908401489257812,0.09724292904138565,0.44203564524650574,0.5199523568153381,0.6939564347267151,0.09088572859764099,0.2277594953775406,0.4103015661239624,0.6232946515083313,0.8869608044624329,0.618826150894165,0.13346147537231445,0.9805801510810852,0.8717857599258423,0.5027207732200623,0.9223479628562927,0.5413808226585388,0.9233060479164124,0.8298973441123962,0.968286395072937,0.919782817363739,0.03603381663560867,0.1747720092535019,0.3891346752643585,0.9521427154541016,0.300028920173645,0.16046763956546783,0.8863046765327454,0.4463944137096405,0.9078755974769592,0.16023047268390656,0.6611174941062927,0.4402637481689453,0.07648676633834839,0.696
4631676673889,0.2473987489938736,0.03961552307009697,0.05994429811835289,0.06107853725552559,0.9077329635620117,0.7398838996887207,0.8980623483657837,0.6725823283195496,0.5289399027824402,0.30444636940956116,0.997962236404419,0.36218905448913574,0.47064894437789917,0.37824517488479614,0.979526937007904,0.1746583878993988,0.32798799872398376,0.6803486943244934,0.06320761889219284,0.60724937915802,0.47764649987220764,0.2839999794960022,0.2384132742881775,0.5145127177238464,0.36792758107185364,0.4565199017524719,0.3374773859977722,0.9704936742782593,0.13343943655490875,0.09680395573377609,0.3433917164802551,0.5910269021987915,0.6591764688491821,0.3972567617893219,0.9992780089378357,0.35189300775527954,0.7214066386222839,0.6375827193260193,0.8130538463592529,0.9762256741523743,0.8897936344146729,0.7645619511604309,0.6982485055923462,0.335498183965683,0.14768557250499725,0.06263600289821625,0.2419017106294632,0.432281494140625,0.521996259689331,0.7730835676193237,0.9587409496307373,0.1173204779624939,0.10700414329767227,0.5896947383880615,0.7453980445861816,0.848150372505188,0.9358320832252502,0.9834262132644653,0.39980170130729675,0.3803351819515228,0.14780867099761963,0.6849344372749329,0.6567619442939758,0.8620625734329224,0.09725799411535263,0.49777689576148987,0.5810819268226624,0.2415570467710495,0.16902540624141693,0.8595808148384094,0.05853492394089699,0.47062090039253235,0.11583399772644043,0.45705875754356384,0.9799623489379883,0.4237063527107239,0.857124924659729,0.11731556057929993,0.2712520658969879,0.40379273891448975,0.39981213212013245,0.6713835000991821,0.3447181284427643,0.713766872882843,0.6391869187355042,0.399161159992218,0.43176013231277466,0.614527702331543,0.0700421929359436,0.8224067091941833,0.65342116355896,0.7263424396514893,0.5369229912757874,0.11047711223363876,0.4050356149673462,0.40537357330322266,0.3210429847240448,0.029950324445962906,0.73725426197052,0.10978446155786514,0.6063081622123718,0.7032175064086914,0.6347863078117371,0.95914226
770401,0.10329815745353699,0.8671671748161316,0.02919023483991623,0.534916877746582,0.4042436182498932,0.5241838693618774,0.36509987711906433,0.19056691229343414,0.01912289671599865,0.5181497931480408,0.8427768349647522,0.3732159435749054,0.2228638231754303,0.080532006919384,0.0853109210729599,0.22139644622802734,0.10001406073570251,0.26503971219062805,0.06614946573972702,0.06560486555099487,0.8562761545181274,0.1621202677488327,0.5596824288368225,0.7734555602073669,0.4564095735549927,0.15336887538433075,0.19959613680839539,0.43298420310020447,0.52823406457901,0.3494403064250946,0.7814795970916748,0.7510216236114502,0.9272118210792542,0.028952548280358315,0.8956912755966187,0.39256879687309265,0.8783724904060364,0.690784752368927,0.987348735332489,0.7592824697494507,0.3645446300506592,0.5010631680488586,0.37638914585113525,0.364911824464798,0.2609044909477234,0.49597030878067017,0.6817399263381958,0.27734026312828064,0.5243797898292542,0.117380291223526,0.1598452925682068,0.04680635407567024,0.9707314372062683,0.0038603513967245817,0.17857997119426727,0.6128667593002319,0.08136960119009018,0.8818964958190918,0.7196201682090759,0.9663899540901184,0.5076355338096619,0.3004036843776703,0.549500584602356,0.9308187365531921,0.5207614302635193,0.2672070264816284,0.8773987889289856,0.3719187378883362,0.0013833499979227781,0.2476850152015686,0.31823351979255676,0.8587774634361267,0.4585031569004059,0.4445872902870178,0.33610227704048157,0.880678117275238,0.9450267553329468,0.9918903112411499,0.3767412602901459,0.9661474227905273,0.7918795943260193,0.675689160823822,0.24488948285579681,0.21645726263523102,0.1660478264093399,0.9227566123008728,0.2940766513347626,0.4530942440032959,0.49395784735679626,0.7781715989112854,0.8442349433898926,0.1390727013349533,0.4269043505191803,0.842854917049408,0.8180332779884338}; float calib_output0_data[NET_OUTPUT0_SIZE] = 
{3.5647096e-05,6.824297e-08,0.009327697,3.2340475e-05,1.1117579e-05,1.5117058e-06,4.6314454e-07,5.161628e-11,0.9905911,3.8835238e-10}; ``` #### 编译工程及烧录 - 编译 在MCU工程目录下,执行`make`命令进行编译,编译成功后显示如下,test_stm767为本例的MCU工程名: ```text arm-none-eabi-size build/test_stm767.elf text data bss dec hex filename 120316 3620 87885 211821 33b6d build/test_stm767.elf arm-none-eabi-objcopy -O ihex build/test_stm767.elf build/test_stm767.hex arm-none-eabi-objcopy -O binary -S build/test_stm767.elf build/test_stm767.bin ``` - 烧录运行 我们可以通过`STM32CubePrg`烧录工具进行代码烧录并运行。在PC机上,通过`STLink`连接一块可烧录的开发板,然后在当前MCU工程目录下运行以下命令,执行烧录并运行程序: ```bash bash ${STM32CubePrg_PATH}/bin/STM32_Programmer.sh -c port=SWD -w build/test_stm767.bin 0x08000000 -s 0x08000000 ``` ${STM32CubePrg_PATH}为`STM32CubePrg`安装路径。关于命令中的各参数含义,请参考`STM32CubePrg`的使用手册。 #### 推理结果验证 本例中,我们把benchmark运行结果标志保存在了起始地址为`0x20000000`且大小为1字节的内存段内,故可以直接通过烧录器获取该处地址的数据,以得到程序返回结果。 在PC机上,通过`STLink`连接一块已烧录好程序的开发板,通过执行以下命令读取内存数据: ```bash bash ${STM32CubePrg_PATH}/bin/STM32_Programmer.sh -c port=SWD mode=HOTPLUG --upload 0x20000000 0x1 ret.bin ``` ${STM32CubePrg_PATH}为`STM32CubePrg`安装路径。关于命令中的各参数含义,请参考`STM32CubePrg`的使用手册。 读取的数据被保存在`ret.bin`文件内,运行`cat ret.bin`,如果板端推理成功,`ret.bin`内保存着字符`1`,会显示如下: ```text 1 ``` ## 在轻鸿蒙设备上执行推理 ### 轻鸿蒙编译环境准备 用户可以通过[OpenHarmony官网](https://www.openharmony.cn)来学习轻鸿蒙环境下的编译及烧录。 本教程以Hi3516开发板为例,演示如何在轻鸿蒙环境上使用Micro部署推理模型。 ### 编译模型 使用converter_lite编译[MNIST模型](https://download.mindspore.cn/model_zoo/official/lite/quick_start/micro/mnist.tar.gz),生成对应轻鸿蒙平台的推理代码,命令如下: ```shell ./converter_lite --fmk=TFLITE --modelFile=mnist.tflite --outputFile=${SOURCE_CODE_DIR} --configFile=${CONFIG_FILE} ``` 其中,config配置文件中需设置`target=ARM32`。 ### 编写构建脚本 轻鸿蒙应用程序开发请先参考[运行Hello OHOS](https://device.harmonyos.com/cn/docs/start/introduce/quickstart-lite-steps-board3516-running-0000001151888681)。将上一步生成的mnist目录拷贝到任意鸿蒙源码路径下,假设为applications/sample/,然后新建BUILD.gn文件: ```text /applications/sample/mnist ├── benchmark ├── CMakeLists.txt ├── BUILD.gn 
└── src ``` 下载适用于OpenHarmony的[预编译推理runtime包](https://www.mindspore.cn/lite/docs/zh-CN/master/use/downloads.html),然后将其解压至任意鸿蒙源码路径下。编写BUILD.gn文件: ```text import("//build/lite/config/component/lite_component.gni") import("//build/lite/ndk/ndk.gni") lite_component("mnist_benchmark") { target_type = "executable" sources = [ "benchmark/benchmark.cc", "benchmark/calib_output.cc", "benchmark/load_input.c", "src/net.c", "src/weight.c", "src/session.cc", "src/tensor.cc", ] features = [] include_dirs = [ "/runtime", "/tools/codegen/include", "//applications/sample/mnist/benchmark", "//applications/sample/mnist/src", ] ldflags = [ "-fno-strict-aliasing", "-Wall", "-pedantic", "-std=gnu99", ] libs = [ "/runtime/lib/libmindspore-lite.a", "/tools/codegen/lib/libwrapper.a", ] defines = [ "NOT_USE_STL", "ENABLE_NEON", "ENABLE_ARM", "ENABLE_ARM32" ] cflags = [ "-fno-strict-aliasing", "-Wall", "-pedantic", "-std=gnu99", ] cflags_cc = [ "-fno-strict-aliasing", "-Wall", "-pedantic", "-std=c++17", ] } ``` 其中,`include_dirs`与`libs`里以`/runtime`、`/tools`开头的路径,需要补上解压出来的推理runtime包的实际路径前缀,比如//applications/sample/mnist/mindspore-lite-1.3.0-ohos-aarch32。 修改文件build/lite/components/applications.json,添加组件mnist_benchmark的配置: ```text { "component": "mnist_benchmark", "description": "Communication related samples.", "optional": "true", "dirs": [ "applications/sample/mnist" ], "targets": [ "//applications/sample/mnist:mnist_benchmark" ], "rom": "", "ram": "", "output": [], "adapted_kernel": [ "liteos_a" ], "features": [], "deps": { "components": [], "third_party": [] } }, ``` 修改文件vendor/hisilicon/hispark_taurus/config.json,新增mnist_benchmark组件的条目: ```text { "component": "mnist_benchmark", "features":[] } ``` ### 编译benchmark ```text cd hb set(设置编译路径) .(选择当前路径) 选择ipcamera_hispark_taurus@hisilicon并回车 hb build mnist_benchmark(执行编译) ``` 生成结果文件out/hispark_taurus/ipcamera_hispark_taurus/bin/mnist_benchmark。 ### 执行benchmark 
将mnist_benchmark、权重文件(mnist/src/net.bin)以及[输入文件](https://download.mindspore.cn/model_zoo/official/lite/quick_start/micro/mnist.tar.gz)解压后拷贝到开发板上,然后执行: ```text OHOS # ./mnist_benchmark mnist_input.bin net.bin 1 OHOS # =======run benchmark====== input 0: mnist_input.bin loop count: 1 total time: 10.11800ms, per time: 10.11800ms outputs: name: int8toft32_Softmax-7_post0/output-0, DataType: 43, Elements: 10, Shape: [1 10 ], Data: 0.000000, 0.000000, 0.003906, 0.000000, 0.000000, 0.992188, 0.000000, 0.000000, 0.000000, 0.000000, ========run success======= ``` ## 自定义算子 使用前请先参考[自定义算子](https://www.mindspore.cn/lite/docs/zh-CN/master/use/register.html)了解基本概念。Micro目前仅支持custom类型的自定义算子注册和实现,暂不支持内建算子(比如conv2d、fc等)的注册和自定义实现。下面以海思Hi3516D开发板为例,说明如何在Micro中使用自定义算子。 使用转换工具生成带NNIE自定义算子模型的具体步骤,请参考[集成NNIE使用说明](https://www.mindspore.cn/lite/docs/zh-CN/master/use/nnie.html)。 模型生成代码方式与非自定义算子模型保持一致: ```shell ./converter_lite --fmk=TFLITE --modelFile=mnist.tflite --outputFile=${SOURCE_CODE_DIR} --configFile=${CONFIG_FILE} ``` 其中,config配置文件中需设置`target=ARM32`。 ### 用户实现自定义算子 上一步会在用户指定路径下生成源码目录,其中名为`src/registered_kernel.h`的头文件给出了custom算子的函数声明: ``` C++ int CustomKernel(TensorC *inputs, int input_num, TensorC *outputs, int output_num, CustomParameter *param); ``` 用户需要提供该函数的实现,并将相关源码或者库集成到生成代码的cmake工程中。例如,我们提供了支持海思NNIE的custom kernel示例动态库libmicro_nnie.so,该文件包含在[官网下载页](https://www.mindspore.cn/lite/docs/zh-CN/master/use/downloads.html)“NNIE 推理runtime库、benchmark工具”组件中。用户需要修改生成代码的CMakeLists.txt,添加链接的库名称和路径。例如: ``` shell link_directories(/mindspore-lite-1.8.1-linux-aarch32/providers/Hi3516D) link_directories() target_link_libraries(benchmark net micro_nnie nnie mpi VoiceEngine upvqe dnvqe securec -lm -pthread) ``` 在生成的`benchmark/benchmark.c`文件中,在main函数的调用前后添加[NNIE设备相关初始化代码](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/test/config_level0/micro/svp_sys_init.c),最后进行源码编译: ``` shell mkdir build && cd build cmake -DCMAKE_TOOLCHAIN_FILE=/mindspore/lite/cmake/himix200.toolchain.cmake 
-DPLATFORM_ARM32=ON -DPKG_PATH= .. make ``` ## Micro推理与端侧训练结合 ### 概述 除了面向MCU的部署能力外,Micro推理还是一种模型结构与权重分离的推理模式,而训练一般只改变权重,不会改变模型结构。因此,在训练与推理配合的场景下,可以采用端侧训练+Micro推理的模式,以利用Micro推理运行内存小、功耗低的优势。具体过程包括以下几步: - 基于端侧训练导出推理模型 - 通过converter_lite转换工具,生成与端侧训练相同架构下的模型推理代码 - 下载得到与端侧训练相同架构对应的`Micro`库 - 对得到的推理代码和`Micro`库进行集成,编译并部署 - 基于端侧训练导出推理模型的权重,覆盖原有权重文件,进行验证 接下来我们将详细介绍各个步骤及其注意事项。 ### 训练导出推理模型 用户可以直接参考[端侧训练](https://www.mindspore.cn/lite/docs/zh-CN/master/use/runtime_train_cpp.html)一节。 ### 生成推理代码 用户可以直接参考上述内容,但需要注意两点。第一,训练导出的模型是ms模型,因此在转换时,需设置`fmk`为`MSLITE`;第二,为了将训练与Micro推理结合,需要保证训练导出的权重和Micro导出的权重完全匹配,因此,我们在Micro配置参数中新增了两个属性,以保证权重的一致性。 ```text [micro_param] # false indicates that only the required weights will be saved. Default is false. # If collaborate with lite-train, the parameter must be true. keep_original_weight=false # the names of those weight-tensors whose shape is changeable, only embedding-table supports change now. # the parameter is used to collaborate with lite-train. If set, `keep_original_weight` must be true. changeable_weights_name=name0,name1 ``` `keep_original_weight`是保证权重一致性的关键属性,与训练配合时,此属性必须设为true。`changeable_weights_name`是针对特殊场景的属性,用于某些权重的shape可能发生变化的情况(当前仅支持embedding表的条目数发生变化),一般而言,用户无需设置该属性。 ### 编译部署 用户可以直接参考上述内容。 ### 训练导出推理模型的权重 MindSpore的Serialization类提供了ExportWeightsCollaborateWithMicro函数,其原型如下: ```cpp static Status ExportWeightsCollaborateWithMicro(const Model &model, ModelType model_type, const std::string &weight_file, bool is_inference = true, bool enable_fp16 = false, const std::vector<std::string> &changeable_weights_name = {}); ``` 其中,`is_inference`当前仅支持为true。
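完成权重导出后,前述流程中“覆盖原有权重文件”这一步,就是用新导出的权重文件整体替换生成代码所使用的权重二进制(如前文目录中的`net0.bin`)。作为示意,下面给出一个标准C实现的最小文件覆盖草图;其中`replace_weight_file`函数及文件名均为本文假设,并非生成代码或`Micro`库中的接口:

```c
#include <stdio.h>

/* 示意:将训练端导出的权重文件按字节整体拷贝,覆盖Micro推理使用的权重文件。
 * 成功返回0,失败返回-1。函数名与文件名仅为示例假设。 */
int replace_weight_file(const char *exported_path, const char *target_path) {
  FILE *in = fopen(exported_path, "rb");
  if (in == NULL) {
    return -1;  /* 导出的权重文件不存在或不可读 */
  }
  FILE *out = fopen(target_path, "wb");
  if (out == NULL) {
    fclose(in);
    return -1;
  }
  char buf[4096];
  size_t n;
  int ret = 0;
  while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
    if (fwrite(buf, 1, n, out) != n) {
      ret = -1;  /* 写入失败,例如磁盘空间不足 */
      break;
    }
  }
  fclose(in);
  fclose(out);
  return ret;
}
```

替换完成后,重新运行benchmark即可验证训练后的权重在Micro推理下的效果。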