mindspore_lite.LLMRole

class mindspore_lite.LLMRole

The role of an LLMEngine node. When LLMEngine uses a KVCache to accelerate inference, generation consists of one full inference followed by n incremental inferences, and therefore involves both a full model and an incremental model. When the full and incremental models are deployed on different nodes, the node hosting the full model has the role Prompt, and the node hosting the incremental model has the role Decoder.
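The split between the two roles can be illustrated with a minimal sketch. It does not import mindspore_lite: `Role` is a hypothetical stand-in enum that mirrors only the two role names described above, and `inference_schedule` is an illustrative helper, not part of the library. It lists which role would handle each step of a generation that runs one full inference and then n incremental inferences.

```python
from enum import Enum


class Role(Enum):
    """Hypothetical stand-in for the two roles described above.

    This is NOT mindspore_lite.LLMRole; it only mirrors the role names.
    """
    PROMPT = "Prompt"    # node hosting the full model
    DECODER = "Decoder"  # node hosting the incremental model


def inference_schedule(n_incremental):
    """Return the role that runs each step: one full pass, then n incremental passes."""
    if n_incremental < 0:
        raise ValueError("n_incremental must be non-negative")
    return [Role.PROMPT] + [Role.DECODER] * n_incremental


schedule = inference_schedule(3)
print([role.value for role in schedule])
# prints ['Prompt', 'Decoder', 'Decoder', 'Decoder']
```

In a real deployment the role would presumably be chosen once per node when the engine is created, rather than per step as in this illustration; consult the LLMEngine reference for the actual constructor arguments.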