Molecular Foundation Model

In fields such as biological computing and drug design, labeling training data is very expensive for most tasks, so the datasets available for model training are small. Limited data prevents researchers from developing more effective models, resulting in poor accuracy. Based on the theory of biochemistry and transfer learning, a molecular foundation model pre-trained on a related task with a large amount of data can achieve accurate results on the target task after fine-tuning with only a small amount of data. MindSpore SPONGE provides a series of molecular foundation models, together with checkpoints trained on large-scale datasets. Users can fine-tune these models directly according to their needs, easily achieving high-precision model development.
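The pretrain-then-fine-tune workflow described above can be sketched in a few lines. This is a minimal illustration using plain NumPy and a linear model, not the MindSpore SPONGE API: the model, datasets, and `train` helper below are all hypothetical stand-ins. A model pre-trained on a large dataset from a related task is used as the starting point for fine-tuning on a small target-task dataset, which typically beats training from scratch on the small dataset alone.

```python
# Minimal sketch of pre-training followed by fine-tuning (transfer learning).
# All names and data here are illustrative; MindSpore SPONGE's real API differs.
import numpy as np

rng = np.random.default_rng(0)

def train(w, X, y, lr=0.01, steps=200):
    """Plain gradient descent on mean-squared error."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# "Pre-training": abundant data from a related (source) task.
true_w = rng.normal(size=3)
X_large = rng.normal(size=(1000, 3))
y_large = X_large @ true_w + 0.01 * rng.normal(size=1000)
w_pretrained = train(np.zeros(3), X_large, y_large)

# "Fine-tuning": only 20 samples from the target task,
# whose weights are slightly shifted from the source task.
X_small = rng.normal(size=(20, 3))
y_small = X_small @ (true_w + 0.1) + 0.01 * rng.normal(size=20)
w_finetuned = train(w_pretrained, X_small, y_small, steps=50)

# Fine-tuning from the pre-trained weights fits the target task
# better than the pre-trained weights alone.
print(mse(w_pretrained, X_small, y_small), mse(w_finetuned, X_small, y_small))
```

The key design point mirrors the document: the expensive optimization happens once on the large related dataset, and only a short, cheap fine-tuning pass is needed per downstream task.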

Supported Networks

| Function | Model | Training | Inferring | Back-end |
| --- | --- | --- | --- | --- |
| Molecular Compound Pre-training Model | GROVER | ✓ | ✓ | GPU/Ascend |
| Molecular Compound Pre-training Model | MGBERT | ✓ | ✓ | GPU/Ascend |

In the future, foundation models such as protein pre-training models will be provided. Please stay tuned.