The goal of the MindSpore Transformers suite is to build a full-process development suite for large model pre-training, fine-tuning, evaluation, inference, and deployment. It provides mainstream Transformer-based Large Language Models (LLMs) and Multimodal Models (MMs), and aims to help users easily complete every stage of large model development.
The suite is built on MindSpore's built-in parallel technology and a component-based design.
For MindSpore Transformers tutorials and API documentation, see the MindSpore Transformers Documentation.
If you have any suggestions for MindSpore Transformers, please submit an issue and we will address it promptly.
The following table lists models supported by MindSpore Transformers.
| Model | Specifications | Model Type | Latest Version |
|---|---|---|---|
| DeepSeek-V3 | 671B | Sparse LLM | In-development version, 1.5.0 |
| GLM4 | 9B | Dense LLM | In-development version, 1.5.0 |
| Llama3.1 | 8B/70B | Dense LLM | In-development version, 1.5.0 |
| Qwen2.5 | 0.5B/1.5B/7B/14B/32B/72B | Dense LLM | In-development version, 1.5.0 |
| TeleChat2 | 7B/35B/115B | Dense LLM | In-development version, 1.5.0 |
| CodeLlama | 34B | Dense LLM | 1.5.0 |
| CogVLM2-Image | 19B | MM | 1.5.0 |
| CogVLM2-Video | 13B | MM | 1.5.0 |
| DeepSeek-V2 | 236B | Sparse LLM | 1.5.0 |
| DeepSeek-Coder-V1.5 | 7B | Dense LLM | 1.5.0 |
| DeepSeek-Coder | 33B | Dense LLM | 1.5.0 |
| GLM3-32K | 6B | Dense LLM | 1.5.0 |
| GLM3 | 6B | Dense LLM | 1.5.0 |
| InternLM2 | 7B/20B | Dense LLM | 1.5.0 |
| Llama3.2 | 3B | Dense LLM | 1.5.0 |
| Llama3.2-Vision | 11B | MM | 1.5.0 |
| Llama3 | 8B/70B | Dense LLM | 1.5.0 |
| Llama2 | 7B/13B/70B | Dense LLM | 1.5.0 |
| Mixtral | 8x7B | Sparse LLM | 1.5.0 |
| Qwen2 | 0.5B/1.5B/7B/57B/57B-A14B/72B | Dense/Sparse LLM | 1.5.0 |
| Qwen1.5 | 7B/14B/72B | Dense LLM | 1.5.0 |
| Qwen-VL | 9.6B | MM | 1.5.0 |
| TeleChat | 7B/12B/52B | Dense LLM | 1.5.0 |
| Whisper | 1.5B | MM | 1.5.0 |
| Yi | 6B/34B | Dense LLM | 1.5.0 |
| YiZhao | 12B | Dense LLM | 1.5.0 |
| Baichuan2 | 7B/13B | Dense LLM | 1.3.2 |
| GLM2 | 6B | Dense LLM | 1.3.2 |
| GPT2 | 124M/13B | Dense LLM | 1.3.2 |
| InternLM | 7B/20B | Dense LLM | 1.3.2 |
| Qwen | 7B/14B | Dense LLM | 1.3.2 |
| CodeGeex2 | 6B | Dense LLM | 1.1.0 |
| WizardCoder | 15B | Dense LLM | 1.1.0 |
| Baichuan | 7B/13B | Dense LLM | 1.0 |
| Blip2 | 8.1B | MM | 1.0 |
| Bloom | 560M/7.1B/65B/176B | Dense LLM | 1.0 |
| Clip | 149M/428M | MM | 1.0 |
| CodeGeex | 13B | Dense LLM | 1.0 |
| GLM | 6B | Dense LLM | 1.0 |
| iFlytekSpark | 13B | Dense LLM | 1.0 |
| Llama | 7B/13B | Dense LLM | 1.0 |
| MAE | 86M | MM | 1.0 |
| Mengzi3 | 13B | Dense LLM | 1.0 |
| PanguAlpha | 2.6B/13B | Dense LLM | 1.0 |
| SAM | 91M/308M/636M | MM | 1.0 |
| Skywork | 13B | Dense LLM | 1.0 |
| Swin | 88M | MM | 1.0 |
| T5 | 14M/60M | Dense LLM | 1.0 |
| VisualGLM | 6B | MM | 1.0 |
| Ziya | 13B | Dense LLM | 1.0 |
| Bert | 4M/110M | Dense LLM | 0.8 |
Model maintenance follows the Life Cycle and Version Matching Strategy of the latest version in which the model is supported.
Currently, the Atlas 800T A2 training server is supported.
Python 3.11.4 is recommended for the current suite.
| MindSpore Transformers | MindSpore | CANN | Driver/Firmware |
|---|---|---|---|
| In-development version | In-development version | In-development version | In-development version |
Version compatibility of historical releases:
| MindSpore Transformers | MindSpore | CANN | Driver/Firmware |
|---|---|---|---|
| 1.5.0 | 2.6.0-rc1 | 8.1.RC1 | 25.0.RC1 |
| 1.3.2 | 2.4.10 | 8.0.0 | 24.1.0 |
| 1.3.0 | 2.4.0 | 8.0.RC3 | 24.1.RC3 |
| 1.2.0 | 2.3.0 | 8.0.RC2 | 24.1.RC2 |
Currently, MindSpore Transformers can be compiled and installed from source. Run the following commands to install it:
```shell
git clone -b dev https://gitee.com/mindspore/mindformers.git
cd mindformers
bash build.sh
```
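After the build completes, a quick import check can confirm that the package is visible to the current Python environment. This is a minimal sketch: it assumes the installed package exposes a `__version__` attribute, and `mindspore.run_check()` is MindSpore's own installation self-check, whose output depends on your environment.

```shell
# Confirm that mindformers is importable from the active Python environment
# (assumes the package exposes a __version__ attribute).
python -c "import mindformers; print(mindformers.__version__)"

# Optionally run MindSpore's built-in installation self-check as well.
python -c "import mindspore; mindspore.run_check()"
```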
MindSpore Transformers supports one-click launching of distributed pre-training, supervised fine-tuning, and inference tasks for large models. Click the link of each model in the model list above to see its documentation.
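As an illustration only, a single-device launch through `run_mindformer.py` might look like the sketch below. The YAML path and flag values are placeholders, not the suite's documented interface; take the exact configuration file, run mode, and any model-specific options from the documentation of the model you are running.

```shell
# Hypothetical invocation: the config path and run mode below are placeholders;
# see the chosen model's documentation for the supported configs and options.
python run_mindformer.py \
  --config configs/<model>/predict_<model>.yaml \
  --run_mode predict
```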
For more information about the features of MindSpore Transformers, refer to the MindSpore Transformers Documentation.
Each MindSpore Transformers version goes through the following five maintenance phases:
| Status | Duration | Description |
|---|---|---|
| Plan | 1-3 months | Plan the features. |
| Develop | 3 months | Develop the features. |
| Maintained | 6 months | Incorporate resolved issues and release new versions. |
| Unmaintained | 0-3 months | Resolved issues may still be incorporated, but there is no full-time maintenance team and no plan to release a new version. |
| End of Life (EOL) | N/A | The branch is closed and no longer accepts any modifications. |
Maintenance status of released MindSpore Transformers versions:
| MindSpore Transformers Version | Tag | Current Status | Release Date | Subsequent Status | EOL Date |
|---|---|---|---|---|---|
| 1.5.0 | v1.5.0 | Maintained | 2025/04/29 | Expected to become unmaintained on 2025/10/29 | 2026/01/29 |
| 1.3.2 | v1.3.2 | Maintained | 2024/12/20 | Expected to become unmaintained on 2025/06/20 | 2025/09/20 |
| 1.2.0 | v1.2.0 | End of Life | 2024/07/12 | - | 2025/04/12 |
| 1.1.0 | v1.1.0 | End of Life | 2024/04/15 | - | 2025/01/15 |
The contents of the scripts/examples directory are provided as reference examples only and are not part of the commercially released product. Users who need them should transform them into products suitable for commercial use and take responsibility for security hardening; MindSpore Transformers assumes no responsibility for security problems arising from such use.

We welcome contributions to the community. For details, see the MindSpore Transformers Contribution Guidelines.