Model Analysis and Preparation

Reproducing Algorithm Implementation

  1. Obtain the PyTorch reference code.

  2. Analyze the algorithm, network structure, and tricks used in the original code, including the data augmentation methods, learning rate decay policy, optimizer parameters, and the initialization of training parameters.

  3. Reproduce the accuracy of the reference implementation, collect its performance data, and identify potential issues in advance.

Please refer to Details of Reproducing Algorithm Implementation.

Analyzing API Compliance

Before starting migration, it is recommended to analyze the API compliance of the code to be migrated to MindSpore, to avoid blocking the implementation due to missing API support.

The missing-API analysis here covers APIs in the network execution graph, including MindSpore operators and higher-level encapsulated APIs, but excludes APIs used in data processing. For data processing, you are advised to use third-party APIs such as NumPy, OpenCV, pandas, and PIL instead.
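
Where data processing is handed to third-party libraries, a minimal sketch of per-channel image normalization with NumPy in place of a framework API (the mean/std values are hypothetical placeholders, not dataset-specific constants):

```python
import numpy as np

# Hypothetical per-channel statistics; real values depend on the dataset.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_hwc(image):
    """Normalize an HWC uint8 image using NumPy instead of a framework API."""
    image = image.astype(np.float32) / 255.0
    return (image - MEAN) / STD

img = np.full((4, 4, 3), 128, dtype=np.uint8)
out = normalize_hwc(img)
print(out.shape)
```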

There are two methods to analyze API compliance:

  1. Scanning API by MindSpore Dev Toolkit (recommended).

  2. Querying the API Mapping Table.

Scanning API by Toolkit

MindSpore Dev Toolkit is a development kit developed by MindSpore that provides PyCharm and Visual Studio Code plug-ins, which can scan APIs at the file or project level.

Refer to PyCharm API Scanning for the tutorials of Dev Toolkit in PyCharm.

Refer to Visual Studio Code API Scanning for the tutorials of Dev Toolkit in Visual Studio Code.

Querying the API Mapping Table

Take PyTorch code migration as an example. After obtaining the reference implementation, you can filter keywords such as torch, nn, and ops to list the APIs used. If methods from another repository are invoked, you need to analyze those APIs manually. Then check the PyTorch and MindSpore API Mapping Table, or search the API documentation for the corresponding MindSpore implementation.
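
The keyword filtering described above can be sketched as a simple regular-expression scan; the source snippet and the exact pattern are illustrative assumptions, not part of the official tooling:

```python
import re

# Hypothetical excerpt of PyTorch reference code to scan.
source = """
import torch
import torch.nn as nn
layer = nn.Conv2d(3, 16, 3)
loss = torch.nn.functional.cross_entropy(out, target)
"""

# Collect dotted names rooted at torch/nn/ops for lookup in the mapping table.
pattern = re.compile(r"\b(?:torch|nn|ops)(?:\.\w+)+")
used_apis = sorted(set(pattern.findall(source)))
print(used_apis)
```

Each collected name can then be checked against the PyTorch and MindSpore API Mapping Table.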

For the mapping of other frameworks' APIs, see the API names and function descriptions. For APIs with the same function, MindSpore's naming may differ from that of other frameworks, and APIs with the same name may differ in parameters and behavior. For details, see the official description.

Processing Missing API

You can use the following methods to process the missing API:

  1. Use an equivalent replacement.

  2. Use existing APIs to implement equivalent functional logic.

  3. Customize operators.

  4. Seek help from the community.

Refer to Missing API Processing Policy for details.
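
As a sketch of method 2 (implementing equivalent logic with existing APIs), suppose a single-call cosine-similarity API were missing: it can be composed from elementwise multiply, sum, and sqrt primitives. NumPy stands in for the framework operators here, and the function name is an assumption for illustration:

```python
import numpy as np

def cosine_similarity(x, y, eps=1e-8):
    # Compose the missing API from primitives (multiply, sum, sqrt)
    # that the framework already provides.
    dot = np.sum(x * y, axis=-1)
    norm = np.sqrt(np.sum(x * x, axis=-1)) * np.sqrt(np.sum(y * y, axis=-1))
    return dot / (norm + eps)

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(cosine_similarity(a, a), cosine_similarity(a, b))
```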

Analyzing Function Compliance

MindSpore is delivered continuously, and some functions are currently restricted. If restricted functions are involved in network migration, their compliance needs to be analyzed before migration, from the following points:

  1. Dynamic shape.

  2. Sparse.

Dynamic Shape

MindSpore's dynamic shape feature is currently under iterative development and is not yet fully supported. The following describes several scenarios that introduce dynamic shape. If any of them is present during network migration, the network contains dynamic shape.

  • Scenarios that introduce dynamic shape, with corresponding solutions:

    • Input shape is not fixed: Dynamic shape can be converted to static shape through the mask mechanism. Mask mechanism example code is as follows:

      def _convert_ids_and_mask(input_tokens, seq_max_bucket_length):
          # Convert tokens to ids, then pad both the ids and the mask to the
          # fixed bucket length so that the resulting shape is static.
          input_ids = tokenizer.convert_tokens_to_ids(input_tokens)
          input_mask = [1] * len(input_ids)
          assert len(input_ids) <= seq_max_bucket_length
      
          while len(input_ids) < seq_max_bucket_length:
              input_ids.append(0)
              input_mask.append(0)
      
          assert len(input_ids) == seq_max_bucket_length
          assert len(input_mask) == seq_max_bucket_length
      
          return input_ids, input_mask
      
      
    • An API triggers a shape change during network execution: if this scenario introduces a dynamic shape, the essence of the solution is to replace the dynamically changing value with a fixed shape. Take the TopK operator as an example: if K changes during execution, a dynamic shape is introduced. Solution: fix a maximum number of targets, first obtain the confidence of all targets with a static shape, then select the K highest-scoring targets as the output, and remove the other targets through the mask mechanism. For sample code, see the multiclass_nms interface of FasterRCNN.

    • Different branches of the control flow introduce shape changes: you can try to replace the if condition with the equal and select operators. Sample code is as follows:

      # Code example that introduces control flow:
      if ms.ops.reduce_sum(object_masks) == 0:
          stage2_loss = stage2_loss.fill(0.0)
      # Modified code example:
      stage2_loss = ms.ops.select(ms.ops.equal(ms.ops.reduce_sum(object_masks), 0), stage2_loss.fill(0.0), stage2_loss)
      

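The fixed-K masking idea from the TopK scenario above can be sketched with NumPy: score all candidates at a fixed maximum size, keep the K best, and zero out the rest, so every shape stays static (the scores and sizes below are hypothetical):

```python
import numpy as np

MAX_TARGETS = 8  # fixed upper bound keeps every shape static
K = 3

# Hypothetical confidence scores padded to the fixed maximum size.
scores = np.array([0.1, 0.9, 0.05, 0.7, 0.3, 0.0, 0.0, 0.0], dtype=np.float32)

# Indices of the K highest scores, computed on the static-shape array.
topk_idx = np.argsort(scores)[::-1][:K]

# The mask keeps the output at MAX_TARGETS instead of slicing to K.
mask = np.zeros(MAX_TARGETS, dtype=np.float32)
mask[topk_idx] = 1.0
kept = scores * mask
print(kept)
```
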
Sparse

MindSpore now supports the two most commonly used sparse data formats, CSR and COO. However, because sparse operator support is still limited, most sparse features remain restricted. In this case, you are advised to first check whether the corresponding operator supports sparse computation; if it does not, convert the sparse data so that ordinary operators can be used. For details, see Sparse.
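
When a needed operator lacks sparse support, the fallback described above is to convert the sparse data to a dense tensor and apply the ordinary operator. A NumPy sketch of COO-to-dense conversion (the indices and values below are hypothetical):

```python
import numpy as np

# Hypothetical COO representation: coordinates plus nonzero values.
indices = np.array([[0, 1], [1, 2]])   # (row, col) of each nonzero entry
values = np.array([1.0, 2.0], dtype=np.float32)
shape = (3, 4)

# Densify so ordinary (non-sparse) operators can consume the tensor.
dense = np.zeros(shape, dtype=np.float32)
dense[indices[:, 0], indices[:, 1]] = values
print(dense.sum())
```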