mindspore_lite.LLMEngine

class mindspore_lite.LLMEngine(role: LLMRole, cluster_id: int, batch_mode='auto')[source]

The LLMEngine class defines a MindSpore Lite LLMEngine, used to load and manage Large Language Models (LLMs), and to schedule and execute inference requests.

Parameters
  • role (LLMRole) – Role of this LLMEngine object.

  • cluster_id (int) – Cluster id of this LLMEngine object.

  • batch_mode (str, optional) – Controls whether request batches are formed automatically by the framework ("auto") or manually by the user ("manual"). Options: "auto", "manual". Default: "auto".

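The batch_mode constraint above can be sketched in plain Python. This is an illustration only; check_batch_mode is a hypothetical helper, not part of mindspore_lite:

```python
def check_batch_mode(batch_mode: str) -> str:
    # Mirror the documented constraint: batch_mode is "auto" or "manual".
    if batch_mode not in ("auto", "manual"):
        raise ValueError(
            f'batch_mode must be "auto" or "manual", got {batch_mode!r}'
        )
    return batch_mode

print(check_batch_mode("auto"))  # a valid mode passes through unchanged
```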
Examples

>>> import os
>>> import mindspore_lite as mslite
>>> cluster_id = 1
>>> llm_engine = mslite.LLMEngine(mslite.LLMRole.Prompt, cluster_id)
>>> model_paths = [os.path.join(model_dir, f"device_{rank}") for rank in range(4)]
>>> options = {}
>>> llm_model = llm_engine.add_model(model_paths, options)  # returns a LLMModel object
>>> llm_engine.init(options)
>>> llm_req = mslite.LLMReq(llm_engine.cluster_id, mslite.LLMReq.next_req_id(), prompt_length=1024)
>>> inputs = [mslite.Tensor(np_input) for np_input in np_inputs]
>>> outputs = llm_model.predict(llm_req, inputs)
>>> for output in outputs:
...     print(f"output is {output.get_data_to_numpy()}")
>>> llm_engine.complete_request(llm_req)
add_model(model_paths: Union[Tuple[str], List[str]], options: Dict[str, str], postprocess_model_path=None)[source]

Add model to LLMEngine.

Parameters
  • model_paths (Union[Tuple[str], List[str]]) – List or tuple of model paths.

  • options (Dict[str, str]) – Other init options of this LLMEngine object.

  • postprocess_model_path (Union[str, None]) – Postprocess model path, default None.

Raises
  • TypeError – model_paths is neither a list nor a tuple.

  • TypeError – model_paths is a list or tuple, but its elements are not all str.

  • TypeError – options is not a dict.

  • RuntimeError – failed to add the model.
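The three TypeError conditions above can be sketched in plain Python. check_add_model_args is a hypothetical helper written for illustration, not the library's actual validation code:

```python
def check_add_model_args(model_paths, options):
    # model_paths must be a list or tuple ...
    if not isinstance(model_paths, (list, tuple)):
        raise TypeError("model_paths is not a list or tuple")
    # ... whose elements are all str.
    if not all(isinstance(p, str) for p in model_paths):
        raise TypeError("elements of model_paths must all be str")
    # options must be a dict.
    if not isinstance(options, dict):
        raise TypeError("options is not a dict")

check_add_model_args(["device_0", "device_1"], {})  # valid arguments pass
```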

property batch_mode

Get the batch mode of this LLMEngine object.

property cluster_id

Get the cluster id of this LLMEngine object.

complete_request(llm_req: LLMReq)[source]

Complete inference request.

Parameters

llm_req (LLMReq) – Request of LLMEngine.

fetch_status()[source]

Get LLMEngine status.

Returns

LLMEngineStatus, LLMEngine status.

Raises

RuntimeError – this LLMEngine object has not been initialized.

finalize()[source]

Finalize LLMEngine.

init(options: Dict[str, str])[source]

Init LLMEngine.

Parameters

options (Dict[str, str]) – Init options of this LLMEngine object.

link_clusters(clusters: Union[List[LLMClusterInfo], Tuple[LLMClusterInfo]], timeout=-1)[source]

Link clusters.

Parameters
  • clusters (Union[List[LLMClusterInfo], Tuple[LLMClusterInfo]]) – Clusters to link.

  • timeout (int, optional) – Timeout in seconds. Default: -1.

Raises
  • TypeError – clusters is not a list or tuple of LLMClusterInfo.

  • RuntimeError – LLMEngine is not initialized or init failed.

Returns

(Status, tuple[Status]), Whether all clusters link normally, and the link status of each cluster.
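The returned pair can be unpacked and inspected roughly as below. Status here is a minimal stand-in exposing only the IsOk() method used in the example that follows, not the real mindspore_lite class:

```python
class Status:
    # Minimal stand-in for the library's Status type.
    def __init__(self, ok: bool):
        self._ok = ok

    def IsOk(self) -> bool:
        return self._ok

# Suppose link_clusters reported an overall failure, with the second
# of two clusters failing to link:
ret, rets = Status(False), (Status(True), Status(False))

# Collect the indices of clusters whose link status is not OK.
failed = [idx for idx, status in enumerate(rets) if not status.IsOk()]
if not ret.IsOk():
    print(f"clusters that failed to link: {failed}")  # -> [1]
```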

Examples

>>> import os
>>> import mindspore_lite as mslite
>>> cluster_id = 1
>>> llm_engine = mslite.LLMEngine(mslite.LLMRole.Prompt, cluster_id)
>>> model_paths = [os.path.join(model_dir, f"device_{rank}") for rank in range(4)]
>>> options = {}
>>> llm_engine.add_model(model_paths, options)
>>> llm_engine.init(options)
>>> cluster = mslite.LLMClusterInfo(mslite.LLMRole.Prompt, 0)
>>> cluster.append_local_ip_info(("*.*.*.*", *))
>>> cluster.append_remote_ip_info(("*.*.*.*", *))
>>> cluster2 = mslite.LLMClusterInfo(mslite.LLMRole.Prompt, 1)
>>> cluster2.append_local_ip_info(("*.*.*.*", *))
>>> cluster2.append_remote_ip_info(("*.*.*.*", *))
>>> ret, rets = llm_engine.link_clusters((cluster, cluster2))
>>> if not ret.IsOk():
...     for ret_item in rets:
...         if not ret_item.IsOk():
...             pass  # handle the cluster whose link failed
property role

Get the LLM role of this LLMEngine object.

unlink_clusters(clusters: Union[List[LLMClusterInfo], Tuple[LLMClusterInfo]], timeout=-1)[source]

Unlink clusters.

Parameters
  • clusters (Union[List[LLMClusterInfo], Tuple[LLMClusterInfo]]) – Clusters to unlink.

  • timeout (int, optional) – Timeout in seconds. Default: -1.

Raises
  • TypeError – clusters is not a list or tuple of LLMClusterInfo.

  • RuntimeError – LLMEngine is not initialized or unlink failed.

Returns

(Status, tuple[Status]), Whether all clusters unlink normally, and the unlink status of each cluster.

Examples

>>> import os
>>> import mindspore_lite as mslite
>>> cluster_id = 1
>>> llm_engine = mslite.LLMEngine(mslite.LLMRole.Prompt, cluster_id)
>>> model_paths = [os.path.join(model_dir, f"device_{rank}") for rank in range(4)]
>>> options = {}
>>> llm_engine.add_model(model_paths, options)
>>> llm_engine.init(options)
>>> cluster = mslite.LLMClusterInfo(mslite.LLMRole.Prompt, 0)
>>> cluster.append_local_ip_info(("*.*.*.*", *))
>>> cluster.append_remote_ip_info(("*.*.*.*", *))
>>> cluster2 = mslite.LLMClusterInfo(mslite.LLMRole.Prompt, 1)
>>> cluster2.append_local_ip_info(("*.*.*.*", *))
>>> cluster2.append_remote_ip_info(("*.*.*.*", *))
>>> ret, rets = llm_engine.unlink_clusters((cluster, cluster2))
>>> if not ret.IsOk():
...     for ret_item in rets:
...         if not ret_item.IsOk():
...             pass  # handle the cluster whose unlink failed