mindformers.ModelRunner

class mindformers.ModelRunner(model_path, npu_mem_size, cpu_mem_size, block_size, rank_id=0, world_size=1, npu_device_ids=None, plugin_params=None)[source]

ModelRunner API, which enables MindFormers to serve as a backend of MindIEServer.

Parameters
  • model_path (str) – Path to the model directory, which contains the model config file and the tokenizer file.

  • npu_mem_size (int) – NPU memory size used for the KV cache.

  • cpu_mem_size (int) – CPU memory size used for the KV cache.

  • block_size (int) – Block size used for the KV cache.

  • rank_id (int, optional) – Rank ID used for inference. Default: 0.

  • world_size (int, optional) – World size (total number of ranks) used for inference. Default: 1.

  • npu_device_ids (list[int], optional) – NPU device IDs, taken from the MindIE config. Default: None.

  • plugin_params (str, optional) – A JSON string containing additional plugin parameters. Default: None.
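Since plugin_params must be a JSON string rather than a dict, one way to build it is with Python's json module. A minimal sketch (the key name below is hypothetical, chosen for illustration, and is not part of the documented ModelRunner API):

```python
import json

# Hypothetical plugin configuration -- the actual keys depend on the plugin
# being enabled; this key name is illustrative only.
plugin_config = {"plugin_type": "example_plugin"}

# ModelRunner expects plugin_params as a JSON-encoded string, not a dict.
plugin_params = json.dumps(plugin_config)
```

The resulting string can then be passed as the plugin_params argument when constructing ModelRunner.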

Returns

A MindIEModelRunner object.

Examples

>>> from mindformers import ModelRunner
>>> model_path = "/path/to/model/"  # contains model config file and tokenizer file.
>>> npu_mem_size = 3
>>> cpu_mem_size = 1
>>> block_size = 128
>>> rank_id = 0
>>> world_size = 1
>>> npu_device_ids = [0]
>>> model_runner = ModelRunner(model_path=model_path, npu_mem_size=npu_mem_size, cpu_mem_size=cpu_mem_size,
...                            block_size=block_size, rank_id=rank_id, world_size=world_size,
...                            npu_device_ids=npu_device_ids)
>>> type(model_runner)
<class 'mindformers.model_runner.MindIEModelRunner'>