[{"data":1,"prerenderedAt":262},["ShallowReactive",2],{"content-query-Is94lfU1PC":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"date":10,"cover":11,"type":12,"body":13,"_type":256,"_id":257,"_source":258,"_file":259,"_stem":260,"_extension":261},"/news/en/2771","en",false,"","Implementation of the Baichun Model on MindSpore through Deployment, Training, Inference, and Fine-Tuning","The Beijing Ascend AI Computing Center successfully completed the deployment, training, inference, and fine-tuning of the open source foundation model Baichuan based on MindSpore and MindFormers.","2023-08-22","https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/13735ad2e6a94758b0ab9c10ba9c7630.png","news",{"type":14,"children":15,"toc":253},"root",[16,24,30,38,43,52,57,65,70,78,83,88,93,101,106,111,119,124,129,134,139,154,159,170,180,191,196,208,213,218,223,228,233,238,243,248],{"type":17,"tag":18,"props":19,"children":21},"element","h1",{"id":20},"implementation-of-the-baichun-model-on-mindspore-through-deployment-training-inference-and-fine-tuning",[22],{"type":23,"value":8},"text",{"type":17,"tag":25,"props":26,"children":27},"p",{},[28],{"type":23,"value":29},"The Beijing Ascend AI Computing Center successfully completed the deployment, training, inference, and fine-tuning of the open source foundation model Baichuan based on MindSpore and MindFormers. Additionally, the Center assisted an industry client in completing the secondary pre-training of a 160-device cluster with stable loss convergence, resulting in the successful creation of an industry foundation model.",{"type":17,"tag":25,"props":31,"children":32},{},[33],{"type":17,"tag":34,"props":35,"children":37},"img",{"alt":7,"src":36},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/0e7261abdd9d46c4aa9ac4914399e68f.png",[],{"type":17,"tag":25,"props":39,"children":40},{},[41],{"type":23,"value":42},"(Cluster training loss curve and learning rate tested by users)",{"type":17,"tag":25,"props":44,"children":45},{},[46],{"type":17,"tag":47,"props":48,"children":49},"strong",{},[50],{"type":23,"value":51},"MindFormers",{"type":17,"tag":25,"props":53,"children":54},{},[55],{"type":23,"value":56},"MindFormers is a powerful development kit for foundation models, including their training, inference, and deployment. MindFormers offers a wealth of fundamental knowledge and a variety of training, inference, and deployment options. Its user-friendly and scalable design allows for quick customization to meet individual needs. With just a single line of code, users can effortlessly transition from training on a single device to large-scale cluster training, efficiently combining policies of data parallelism and model parallelism.",{"type":17,"tag":25,"props":58,"children":59},{},[60],{"type":17,"tag":47,"props":61,"children":62},{},[63],{"type":23,"value":64},"Baichuan Foundation Models",{"type":17,"tag":25,"props":66,"children":67},{},[68],{"type":23,"value":69},"Baichuan, both open source and commercially available, is a series of pre-trained large language models developed by Baichuan AI. The series includes models with 7 billion, 13 billion, and 53 billion parameters.",{"type":17,"tag":25,"props":71,"children":72},{},[73],{"type":17,"tag":47,"props":74,"children":75},{},[76],{"type":23,"value":77},"The detailed advantages are as follows:",{"type":17,"tag":25,"props":79,"children":80},{},[81],{"type":23,"value":82},"1. 
**Baichuan Foundation Models**

Baichuan is a series of open source, commercially usable pre-trained large language models developed by Baichuan AI. The series includes models with 7 billion, 13 billion, and 53 billion parameters.

**The detailed advantages are as follows:**

1. Trained on massive Chinese and English data, the models exhibit robust language versatility and cross-lingual ability. They handle a wide range of text types and styles, from formal narrative to colloquial expression, in both Chinese and English.

2. Built on the Transformer architecture, they offer strong parallelism and scalability. Large-scale computing resources enable efficient training and inference, and model compression and optimization technologies let them run on a variety of hardware and environments.

3. As autoregressive models, they show powerful generation capabilities and coherence, producing fluent text in response to contextual cues regardless of length. Different generation goals and outcomes can be reached by adjusting decoding parameters and policies (a sketch of typical decoding settings follows this list).
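As an example of such decoding knobs, MindFormers-style model configs commonly expose generation parameters along the following lines. This is a hedged sketch with illustrative values, not Baichuan's actual defaults.

```yaml
# Illustrative generation settings as commonly found in MindFormers
# model configs; values are examples only.
model:
  model_config:
    do_sample: True          # sample instead of greedy decoding
    top_k: 50                # keep only the 50 most likely tokens
    top_p: 0.9               # nucleus sampling probability mass
    temperature: 0.8         # <1.0 sharpens the token distribution
    max_decode_length: 512   # cap on generated sequence length
```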
**Beijing Ascend AI Computing Center**

The Beijing Ascend AI Computing Center, spearheaded by the Mentougou District government and built by ZGC Group, is operated independently by Beijing Ascend Intelligent Technology Co., Ltd. Built on the Ascend AI software and hardware platform, the Center offers 100P of computing power in its initial phase, with plans to expand to 400P by year's end to serve businesses and universities.

The Center is committed to helping clients across sectors make effective use of foundation models and large-scale computing power. It has successfully trained, fine-tuned, run inference with, and deployed a variety of models, including Baichuan, GLM, Llama, Bloom, T5, BERT, GPT2, PanGuAlpha, MAE, ViT, Swin, and CLIP. Looking ahead, the Center plans to bring up more foundation models to accelerate the development of industry-specific models for its clients.

**Appendix: Training Instructions for Baichuan in Single-Server Multi-Device Mode**

1. Open ModelArts, go to the image management page, and register the following image:

```
swr.cn-north-309.mtgascendic.cn/bj-aicc/mindformers_0.6rc1_mindspore_2_0_update1:mindformers_0.6rc1_mindspore_2_0_modelarts
```

2. On the development environment page, select the registered image and create a notebook.

3. Pull the MindFormers repository and install it together with its dependencies:

```shell
git clone -b dev https://gitee.com/mindspore/mindformers.git
cd mindformers
bash build.sh
pip3 install -r requirements.txt
```

4. Download the training data, tokenizer, and pre-trained weights:

```shell
wget https://baichuan.obs.cn-north-309.mtgascendic.cn/wiki.train.tokens
wget https://baichuan.obs.cn-north-309.mtgascendic.cn/baichuan/tokenizer.model
wget https://baichuan.obs.cn-north-309.mtgascendic.cn/baichuan/pytorch_model.bin
```

5. Download the configuration files:

```shell
wget https://baichuan.obs.cn-north-309.mtgascendic.cn/run_baichuan7b.yaml
wget https://baichuan.obs.cn-north-309.mtgascendic.cn/run_baichuan7b_lora.yaml
```

6. Use the pre-processing script to generate MindRecord training data:

```shell
cd mindformers/tools/dataset_preprocess/llama
python llama_preprocess.py --input_glob /home/ma-user/work/mindformers/wiki.train.tokens --model_file /home/ma-user/work/mindformers/tokenizer.model --seq_length 2048 --output_file /home/ma-user/work/mindformers/wiki2048.mindrecord
```

7. In run_baichuan7b.yaml, change the training data path to the generated MindRecord file (the relevant config section is sketched below):

```
/home/ma-user/work/mindformers/wiki2048.mindrecord
```
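The exact location of this path depends on the config layout; in typical MindFormers run YAMLs it sits under the training dataset's data loader, roughly as follows (field names assumed from common MindFormers configs, not verified against this specific file):

```yaml
# Hedged sketch: where the MindRecord path is typically set in a
# MindFormers run YAML such as run_baichuan7b.yaml.
train_dataset: &train_dataset
  data_loader:
    type: MindDataset
    dataset_dir: "/home/ma-user/work/mindformers/wiki2048.mindrecord"
    shuffle: True
```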
8. Start the single-server multi-device training:

```shell
cp /user/config/nbstart_hccl.json /home/ma-user/work/mindformers
cd scripts
bash run_distribute.sh /home/ma-user/work/mindformers/nbstart_hccl.json /home/ma-user/work/mindformers/run_baichuan7b.yaml [0,8] train
```

9. Check the running result:

```shell
tail -f ../../mindformers/output/log/rank_0/info.log
```
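The run_baichuan7b_lora.yaml file downloaded in step 5 is presumably intended for LoRA fine-tuning with the same launcher. Assuming the script accepts finetune as its run mode (as in standard MindFormers usage) and that the LoRA config already references the pre-trained checkpoint, the invocation would look like:

```shell
# Hypothetical LoRA fine-tuning launch; assumes run_baichuan7b_lora.yaml
# points at the pre-trained weights downloaded in step 4.
bash run_distribute.sh /home/ma-user/work/mindformers/nbstart_hccl.json /home/ma-user/work/mindformers/run_baichuan7b_lora.yaml [0,8] finetune
```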