Release Notes

MindSpore Lite 2.8.0 Release Notes

Key Features and Enhancements

  • MindSpore Lite supports Python 3.12.

  • MindSpore Lite supports saving the intermediate graphs generated during model conversion. Environment variables control whether these graphs are saved, which helps troubleshoot model conversion issues (see API Changes below).

Cloud-side inference

  • Performance optimization of LoRA weight updates: the latency of a Model.UpdateWeights() call has been reduced from seconds to hundreds of milliseconds; see the sketch below.
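
    A minimal sketch of an online LoRA weight update. The Python method name update_weights is assumed here to mirror the C++ Model.UpdateWeights() named above, and the tensor name, shape, and model path are hypothetical:

        import numpy as np
        import mindspore_lite as mslite

        context = mslite.Context()
        context.target = ["ascend"]

        model = mslite.Model()
        model.build_from_file("llm_with_lora.mindir", mslite.ModelType.MINDIR, context)

        # Pack the new LoRA weights into a Lite tensor (name and shape hypothetical).
        lora = mslite.Tensor()
        lora.shape = [16, 4096]
        lora.dtype = mslite.DataType.FLOAT16
        lora.set_data_from_numpy(np.zeros((16, 4096), np.float16))
        lora.name = "lora_B.weight"

        # Assumed Python spelling of the C++ Model.UpdateWeights() interface.
        model.update_weights([lora])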

  • ACL inference on the MindSpore Lite Ascend backend supports timeout configuration.

  • MindSpore Lite cloud-side inference supports concurrent model loading; see the sketch below.
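
    A minimal sketch of loading several models concurrently, assuming independent MindIR files; the paths and target are placeholders:

        from concurrent.futures import ThreadPoolExecutor
        import mindspore_lite as mslite

        def load(path):
            # Each thread builds its own context and model; loads may now run in parallel.
            context = mslite.Context()
            context.target = ["ascend"]
            model = mslite.Model()
            model.build_from_file(path, mslite.ModelType.MINDIR, context)
            return model

        with ThreadPoolExecutor(max_workers=4) as pool:
            models = list(pool.map(load, ["net_a.mindir", "net_b.mindir",
                                          "net_c.mindir", "net_d.mindir"]))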

  • GE inference on the MindSpore Lite Ascend backend supports zero-copy data in static-shape and dynamic-batching scenarios.

Device-side inference

  • MindSpore Lite supports offline model inference on Android NPU.

  • Removed the MindData data preprocessing module from MindSpore Lite.

  • Removed support for Cortex-M CMSIS from MindSpore Lite Micro.

API Changes

  • Configuration change for LoRA weight-update conversion: the format of entries in the variable_weights_file has changed from:

    weight_name:(shape);node_name
    

    to

    weight_name:shape;node_name
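
    For example, a hypothetical LoRA entry previously written as

        lora_B.weight:(16,4096);Default/network.lora_B

    is now written as

        lora_B.weight:16,4096;Default/network.lora_B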
    
  • Added environment variables that control saving of intermediate graphs during the conversion process:

    MSLITE_DUMP_LEVEL=0 dumps detailed graph structures together with constant Tensor data.
    MSLITE_DUMP_LEVEL=1 dumps graph structures only, without constant Tensor data.
    MSLITE_DUMP_PATH="/xx/xx/" specifies the directory to which the graphs are dumped.
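
    These variables are read by the converter at run time, so they are normally exported in the shell before invoking converter_lite. The sketch below sets them in-process instead, assuming the Python Converter API and a hypothetical ONNX model:

        import os
        import mindspore_lite as mslite

        os.environ["MSLITE_DUMP_LEVEL"] = "1"                  # graph structures only
        os.environ["MSLITE_DUMP_PATH"] = "/tmp/mslite_dump/"   # where graphs are written

        converter = mslite.Converter()
        converter.convert(fmk_type=mslite.FmkType.ONNX,
                          model_file="model.onnx",   # hypothetical input model
                          output_file="model")       # dumps land in MSLITE_DUMP_PATH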
    
  • Removed the high-level Train()/Evaluate() interfaces for on-device training; use the low-level RunStep() interface instead.

  • Added the C++ interface Model.Build and the Python interface Model.build_from_buffer for cloud-side inference, supporting buffer-based model loading in the weight-separation scenario; see the sketch below.
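
    A minimal sketch of buffer-based loading with the new Python interface. The argument order of build_from_buffer is assumed to parallel build_from_file, and the file path is hypothetical:

        import mindspore_lite as mslite

        context = mslite.Context()
        context.target = ["ascend"]

        # Read a (weight-separated) model into memory, e.g. from a file or a remote store.
        with open("network.mindir", "rb") as f:
            buf = f.read()

        model = mslite.Model()
        model.build_from_buffer(buf, mslite.ModelType.MINDIR, context)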

Contributors

YeFeng_24, xiong-pan, jjfeing, liuf9, xu_anyue, yiguangzheng, zxx_xxz, jianghui58, hbhu_bin, chenyihang5, qll1998, yangyingchun1999, liuchengji3, cheng-chao23, gemini524, yangly