mindformers.AutoModelForCausalLM

class mindformers.AutoModelForCausalLM[source]

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the AutoModelForCausalLM.from_pretrained class method or the AutoModelForCausalLM.from_config class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config, **kwargs)

Instantiates one of the model classes of the library (with a causal language modeling head) from a model config or a YAML file.

Warning

The API is experimental and may have some slight breaking changes in the next releases.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use AutoModelForCausalLM.from_pretrained to load the model weights.

Parameters
  • config (Union[MindFormerConfig, PretrainedConfig, str]) –

    MindFormerConfig, YAML file, or a model config inherited from PretrainedConfig (experimental feature). The model class to instantiate is selected based on the configuration class:

    • [BertConfig] configuration class: [BertForMaskedLM] (BertModel model)

    • [BloomConfig] configuration class: [BloomLMHeadModel] (BloomModel model)

    • [ChatGLM2Config] configuration class: [ChatGLM2ForConditionalGeneration] (ChatGLM2Model model)

    • [GLMConfig] configuration class: [GLMChatModel] (GLMChatModel model)

    • [GPT2Config] configuration class: [GPT2LMHeadModel] (GPT2Model model)

    • [LlamaConfig] configuration class: [LlamaForCausalLM] (LlamaModel model)

    • [PanguAlphaConfig] configuration class: [PanguAlphaHeadModel] (PanguAlphaModel model)

  • kwargs (Dict[str, Any], optional) – The values in kwargs of any keys which are configuration attributes will be used to override the config values.

Returns

A model instance that inherits from PreTrainedModel.

Examples

>>> from mindformers import AutoConfig, AutoModelForCausalLM
>>> # Download configuration from openmind and cache.
>>> config = AutoConfig.from_pretrained("bert_tiny_uncased")
>>> model = AutoModelForCausalLM.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_dir, *model_args, **kwargs)

Instantiates one of the model classes of the library (with a causal language modeling head) from a local directory or a model_id from modelers.cn.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_dir if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_dir:

  • [BertConfig] configuration class: [BertForMaskedLM] (BertModel model)

  • [BloomConfig] configuration class: [BloomLMHeadModel] (BloomModel model)

  • [ChatGLM2Config] configuration class: [ChatGLM2ForConditionalGeneration] (ChatGLM2Model model)

  • [GLMConfig] configuration class: [GLMChatModel] (GLMChatModel model)

  • [GPT2Config] configuration class: [GPT2LMHeadModel] (GPT2Model model)

  • [LlamaConfig] configuration class: [LlamaForCausalLM] (LlamaModel model)

  • [PanguAlphaConfig] configuration class: [PanguAlphaHeadModel] (PanguAlphaModel model)
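The selection logic above can be sketched in plain Python. This is an illustrative sketch of the dispatch mechanism, not mindformers internals; the function name and the abbreviated mapping are assumptions for the example:

```python
# Illustrative sketch of how a model class is chosen from config.model_type,
# with a pattern-matching fallback on the pretrained name when it is missing.
CONFIG_TO_MODEL = {
    "bert": "BertForMaskedLM",
    "bloom": "BloomLMHeadModel",
    "gpt2": "GPT2LMHeadModel",
    "llama": "LlamaForCausalLM",
}

def select_model_class(model_type=None, name_or_dir=""):
    if model_type is not None:
        return CONFIG_TO_MODEL[model_type]
    # Fallback: pattern-match the pretrained name against known model types.
    for key, model_cls in CONFIG_TO_MODEL.items():
        if key in name_or_dir.lower():
            return model_cls
    raise ValueError(f"cannot infer model type from {name_or_dir!r}")

print(select_model_class(model_type="llama"))        # LlamaForCausalLM
print(select_model_class(name_or_dir="gpt2_small"))  # GPT2LMHeadModel
```

The real implementation resolves to the model classes themselves rather than strings, but the dispatch order is the same: an explicit model_type wins, and the name pattern is only consulted as a fallback.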

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Warning

The API is experimental and may have some slight breaking changes in the next releases.

Parameters
  • pretrained_model_name_or_dir (str) – A folder containing a YAML file and ckpt file, a folder containing config.json and ckpt file, or a model_id from modelers.cn. The last two are experimental features.

  • model_args (Any, optional) – Will be passed along to the underlying model __init__() method. Only works in experimental mode.

  • kwargs (Dict[str, Any], optional) –

    Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). **kwargs will be passed directly to the underlying model's __init__ method when config is provided or automatically loaded; otherwise **kwargs will first be passed to PretrainedConfig.from_pretrained to create a configuration, and any keys that do not correspond to a configuration attribute will be passed to the underlying model's __init__ method. Some of the available keys are shown below:

    • config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Default: None. Configuration can be automatically loaded when:

      • The model is provided by the library (loaded with the model_id string of a pretrained model).

      • The model was saved using PreTrainedModel.save_pretrained and is reloaded by supplying the save directory.

      • The model is loaded by supplying a local directory as pretrained_model_name_or_dir and a configuration JSON file named 'config.json' is found in the directory.

    • cache_dir (Union[str, os.PathLike], optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. Default: None.

    • force_download (bool, optional): Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. Default: False.

    • resume_download (bool, optional): Whether to delete incompletely received files. Will attempt to resume the download if such a file exists. Default: False.

    • proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. Default: None.

    • local_files_only (bool, optional): Whether to only look at local files (i.e., not try downloading the model). Default: False.

    • revision (str, optional): The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier allowed by git. Default: "main".

    • trust_remote_code (bool, optional): Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. Default: False.

    • code_revision (str, optional): The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, a commit id, or any identifier allowed by git. Default: "main".

Returns

A model instance that inherits from PreTrainedModel.

Examples

>>> from mindformers import AutoConfig, AutoModelForCausalLM
>>> # Download model and configuration from openmind and cache.
>>> model = AutoModelForCausalLM.from_pretrained("bert_tiny_uncased")
>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("bert_tiny_uncased", output_attentions=True)
>>> model.config.output_attentions
True
classmethod register(config_class, model_class, exist_ok=False)

Register a new model for this class.

Warning

The API is experimental and may have some slight breaking changes in the next releases.

Parameters
  • config_class (PretrainedConfig) – The model config class.

  • model_class (PreTrainedModel) – The model class.

  • exist_ok (bool, optional) – If set to True, no error will be raised even if config_class already exists. Default: False.
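The behavior of register can be sketched with a minimal registry in plain Python. This is an illustrative sketch of the config-class to model-class mapping that the auto class maintains, not mindformers code; the _AutoRegistry, MyConfig, and MyModel names are assumptions for the example:

```python
# Minimal sketch of a config-class -> model-class registry with the same
# register(config_class, model_class, exist_ok) contract described above.
class _AutoRegistry:
    def __init__(self):
        self._mapping = {}

    def register(self, config_class, model_class, exist_ok=False):
        # Re-registering an existing config class is an error unless exist_ok=True.
        if config_class in self._mapping and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered")
        self._mapping[config_class] = model_class

    def model_for(self, config):
        # Look up the model class by the concrete type of the config instance.
        return self._mapping[type(config)]

class MyConfig: ...

class MyModel:
    def __init__(self, config):
        self.config = config

registry = _AutoRegistry()
registry.register(MyConfig, MyModel)
cfg = MyConfig()
model_cls = registry.model_for(cfg)  # -> MyModel
```

With mindformers itself, the same pattern would pair a user-defined PretrainedConfig subclass with a PreTrainedModel subclass via AutoModelForCausalLM.register, after which from_config can resolve the custom config to the custom model.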