Models
The following table lists models supported by MindSpore Transformers.
| Model | Specifications | Model Type | Model Architecture | Latest Version |
|---|---|---|---|---|
| TeleChat3 | 36B | Dense LLM | Mcore | 1.9.0 |
| TeleChat3-MoE | 105B-A4.7B | Sparse LLM | Mcore | 1.9.0 |
| Qwen3 | 0.6B/1.7B/4B/8B/14B/32B | Dense LLM | Mcore | 1.9.0 |
| Qwen3-MoE | 30B-A3B/235B-A22B | Sparse LLM | Mcore | 1.9.0 |
| DeepSeek-V3 | 671B | Sparse LLM | Mcore/Legacy | 1.9.0 |
| GLM4.5 | 106B-A12B/355B-A32B | Sparse LLM | Mcore | 1.9.0 |
| GLM4 | 9B | Dense LLM | Mcore/Legacy | 1.9.0 |
| Qwen2.5 | 0.5B/1.5B/7B/14B/32B/72B | Dense LLM | Legacy | 1.9.0 |
| TeleChat2 | 7B/35B/115B | Dense LLM | Mcore/Legacy | 1.9.0 |
| Llama3.1 | 8B/70B | Dense LLM | Legacy | 1.7.0 |
| Mixtral | 8x7B | Sparse LLM | Legacy | 1.7.0 |
| CodeLlama | 34B | Dense LLM | Legacy | 1.5.0 |
| CogVLM2-Image | 19B | MM | Legacy | 1.5.0 |
| CogVLM2-Video | 13B | MM | Legacy | 1.5.0 |
| DeepSeek-V2 | 236B | Sparse LLM | Legacy | 1.5.0 |
| DeepSeek-Coder-V1.5 | 7B | Dense LLM | Legacy | 1.5.0 |
| DeepSeek-Coder | 33B | Dense LLM | Legacy | 1.5.0 |
| GLM3-32K | 6B | Dense LLM | Legacy | 1.5.0 |
| GLM3 | 6B | Dense LLM | Legacy | 1.5.0 |
| InternLM2 | 7B/20B | Dense LLM | Legacy | 1.5.0 |
| Llama3.2 | 3B | Dense LLM | Legacy | 1.5.0 |
| Llama3.2-Vision | 11B | MM | Legacy | 1.5.0 |
| Llama3 | 8B/70B | Dense LLM | Legacy | 1.5.0 |
| Qwen2 | 0.5B/1.5B/7B/57B/57B-A14B/72B | Dense/Sparse LLM | Legacy | 1.5.0 |
| Qwen1.5 | 7B/14B/72B | Dense LLM | Legacy | 1.5.0 |
| Qwen-VL | 9.6B | MM | Legacy | 1.5.0 |
| TeleChat | 7B/12B/52B | Dense LLM | Legacy | 1.5.0 |
| Whisper | 1.5B | MM | Legacy | 1.5.0 |
| Yi | 6B/34B | Dense LLM | Legacy | 1.5.0 |
| YiZhao | 12B | Dense LLM | Legacy | 1.5.0 |
| Llama2 | 7B/13B/70B | Dense LLM | Legacy | 1.3.2 |
| Baichuan2 | 7B/13B | Dense LLM | Legacy | 1.3.2 |
| GLM2 | 6B | Dense LLM | Legacy | 1.3.2 |
| GPT2 | 124M/13B | Dense LLM | Legacy | 1.3.2 |
| InternLM | 7B/20B | Dense LLM | Legacy | 1.3.2 |
| Qwen | 7B/14B | Dense LLM | Legacy | 1.3.2 |
| CodeGeex2 | 6B | Dense LLM | Legacy | 1.1.0 |
| WizardCoder | 15B | Dense LLM | Legacy | 1.1.0 |
| Baichuan | 7B/13B | Dense LLM | Legacy | 1.0 |
| Blip2 | 8.1B | MM | Legacy | 1.0 |
| Bloom | 560M/7.1B/65B/176B | Dense LLM | Legacy | 1.0 |
| Clip | 149M/428M | MM | Legacy | 1.0 |
| CodeGeex | 13B | Dense LLM | Legacy | 1.0 |
| GLM | 6B | Dense LLM | Legacy | 1.0 |
| iFlytekSpark | 13B | Dense LLM | Legacy | 1.0 |
| Llama | 7B/13B | Dense LLM | Legacy | 1.0 |
| MAE | 86M | MM | Legacy | 1.0 |
| Mengzi3 | 13B | Dense LLM | Legacy | 1.0 |
| PanguAlpha | 2.6B/13B | Dense LLM | Legacy | 1.0 |
| SAM | 91M/308M/636M | MM | Legacy | 1.0 |
| Skywork | 13B | Dense LLM | Legacy | 1.0 |
| Swin | 88M | MM | Legacy | 1.0 |
| T5 | 14M/60M | Dense LLM | Legacy | 1.0 |
| VisualGLM | 6B | MM | Legacy | 1.0 |
| Ziya | 13B | Dense LLM | Legacy | 1.0 |
| Bert | 4M/110M | Dense LLM | Legacy | 0.8 |
* ⚠️EOL indicates that the model has been removed from the main branch; it can still be used with its latest supported version (e.g., 1.7.0).