Environment Variable Descriptions
The following environment variables are supported by MindSpore Transformers.
Debugging Variables
Variable Name | Default | Interpretation | Description | Application Scenario
---|---|---|---|---|
HCCL_DETERMINISTIC | false | Whether to enable deterministic computation for reduction communication operators (AllReduce, ReduceScatter, Reduce). | | Enabling deterministic computation eliminates the randomness introduced by inconsistent ordering of multi-card computations, at the cost of some performance compared with the disabled state. Recommended in scenarios that require reproducible results.
LCCL_DETERMINISTIC | 0 | Whether to enable the LCCL deterministic operator AllReduce (order-preserving addition). | | Enabling deterministic computation eliminates the randomness introduced by inconsistent ordering of multi-card computations, at the cost of some performance compared with the disabled state. Recommended in scenarios that require reproducible results.
CUSTOM_MATMUL_SHUFFLE | on | Whether to enable shuffle operations for custom matrix multiplication. | | The shuffle operation is optimized for specific matrix sizes and memory-access patterns. If the matrix size does not match the shuffle-optimized size, disabling shuffle may yield better performance. Set according to actual usage.
ASCEND_LAUNCH_BLOCKING | 0 | Controls whether operators execute synchronously during training or online inference. | | NPU operators execute asynchronously by default during model training, so the error stack printed when an operator fails is not the actual call stack. Setting this variable to 1 forces synchronous execution, making the printed stack reflect the real call site.
TE_PARALLEL_COMPILER | 8 | Number of threads used for parallel operator compilation. Parallel compilation is enabled when the value is greater than 1. | A positive integer in the range 1–32, with an upper bound of (number of CPU cores × 80%) / number of Ascend AI Processors. Default: 8. | When the network model is large, parallel operator compilation can be enabled by configuring this variable.
CPU_AFFINITY | 0 | Whether to enable CPU affinity, binding each process or thread to a single CPU core to improve performance. | | CPU affinity is disabled by default to optimize resource utilization and save energy.
MS_MEMORY_STATISTIC | 0 | Whether to collect memory statistics. | | During memory analysis, basic memory usage can be counted. See the Optimization Guide for details.
MINDSPORE_DUMP_CONFIG | NA | Specifies the path to the configuration file on which the cloud-side or device-side Dump function depends. | A file path; both relative and absolute paths are supported. |
GLOG_v | 3 | Controls the MindSpore log level. | 0: DEBUG; 1: INFO; 2: WARNING; 3: ERROR. |
ASCEND_GLOBAL_LOG_LEVEL | 3 | Controls the CANN log level. | 0: DEBUG; 1: INFO; 2: WARNING; 3: ERROR. |
ASCEND_SLOG_PRINT_TO_STDOUT | 0 | Whether to print logs to the screen. When enabled, logs are displayed directly on the screen instead of being saved to the log file. | |
ASCEND_GLOBAL_EVENT_ENABLE | 0 | Whether to enable event logging. | |
HCCL_EXEC_TIMEOUT | 1836 | Controls how long devices wait for each other during execution; each device process waits the configured amount of time for the other devices to complete communication synchronization. | Range: (0, 17340], in seconds. Default: 1836. |
HCCL_CONNECT_TIMEOUT | 120 | Limits the timeout for establishing sockets between devices in distributed training or inference. | An integer in the range [120, 7200], in seconds. Default: 120. |
MS_NODE_ID | NA | Specifies the process rank id in dynamic cluster scenarios. | The rank_id of the process, unique within the cluster. |
MS_ALLOC_CONF | NA | Sets the memory allocation policy. | Configuration items in key:value format, with multiple items separated by commas, e.g. export MS_ALLOC_CONF=enable_vmm:true,memory_tracker:true. |
MS_INTERNAL_DISABLE_CUSTOM_KERNEL_LIST | PagedAttention | Disables the listed internal custom operators. An experimental configuration item, generally not required; it will be removed in the future. | A string of operator names separated by commas. |
TRANSFORMERS_OFFLINE | 0 | Forces the Auto interfaces to read only offline local files. | |
MDS_ENDPOINT | https://modelers.cn | Sets the endpoint for openMind Hub. | A URL in string format. |
OM_MODULES_CACHE | ~/.cache/openmind/modules | Cache path for openMind modules. | A directory path in string format. |
OPENMIND_CACHE | ~/.cache/openmind/hub | Cache path for openMind Hub. | A directory path in string format. |
openmind_IS_CI | | Indicates whether openMind is operating within a CI access-control environment. | |
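As a worked example, the debugging variables above are all plain shell exports set before launching a job. A minimal sketch for a reproducible-debugging session, using only values taken from the table (the combination itself is illustrative, not a recommended production setup):

```shell
# Reproducible-debugging setup: deterministic reduction operators,
# synchronous operator launch (accurate error stacks), and
# INFO-level MindSpore logging.
export HCCL_DETERMINISTIC=true     # deterministic AllReduce/ReduceScatter/Reduce
export LCCL_DETERMINISTIC=1        # order-preserving LCCL AllReduce
export ASCEND_LAUNCH_BLOCKING=1    # synchronous execution for debuggable stacks
export GLOG_v=1                    # 1 = INFO level

# Confirm the settings before starting the training/inference command.
echo "deterministic=${HCCL_DETERMINISTIC} blocking=${ASCEND_LAUNCH_BLOCKING}"
```

Expect some performance degradation with these settings enabled; unset them once the inconsistency or operator failure has been isolated.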
Other Variables
Variable Name | Default | Interpretation | Description | Application Scenario
---|---|---|---|---|
RUN_MODE | predict | Sets the running mode. | |
USE_ROPE_SELF_DEFINE | true | Whether to enable the RoPE fusion operator. | | The RoPE fusion operator is enabled by default to improve computation efficiency. Disable it only as needed, e.g. in debugging scenarios; no special setting is usually required.
MS_ENABLE_INTERNAL_BOOST | on | Whether to enable MindSpore framework internal acceleration. | | Enabled by default for high-performance inference. Turn it off when debugging or comparing different acceleration strategies to observe the impact on performance.
MF_LOG_SUFFIX | NA | Sets a custom suffix for all log folders. | Suffix for the log folder. Default: no suffix. | Adding a consistent suffix keeps logs from different tasks from overwriting each other.
PLOG_REDIRECT_TO_OUTPUT | False | Controls whether plog logs are stored under a different path. | | Makes it easier to query plog logs.
MS_ENABLE_FA_FLATTEN | on | Controls whether the FlashAttention flatten optimization is supported. | | Provides a fallback for models that have not yet been adapted to the FlashAttention flatten optimization.
EXPERIMENTAL_KERNEL_LAUNCH_GROUP | NA | Controls whether batch parallel submission of operators is supported; if so, enables parallel submission and configures the number of parallel submissions. | | This feature will continue to evolve, and its behavior may change. Currently, only the inference scenario is supported.
ENFORCE_EAGER | False | Controls whether jit mode is disabled. | | Jit compiles functions into a callable MindSpore graph. Setting ENFORCE_EAGER to False enables jit mode, which can yield performance benefits. Currently, only inference mode is supported.
MS_ENABLE_TFT | NA | Enables the Training Fault Tolerance (TFT) feature, most of whose functionality relies on MindIO TFT. | The value takes the form "{TTP:1,UCE:1,HCCE:1,ARF:1,TRE:1,TSP:1}"; to use a given feature, set the corresponding field to "1". | See High Availability for usage.
MS_WORKER_NUM | NA | Number of processes assigned the MS_WORKER role. | An integer greater than 0. | Distributed scenarios.
RANK_ID | NA | Specifies the logical ID for invoking the NPU. | 0–7. In multi-machine parallelism, DEVICE_ID may be duplicated across servers; using RANK_ID avoids this issue (RANK_ID = SERVER_ID * DEVICE_NUM + DEVICE_ID, where DEVICE_ID is the Ascend AI Processor number on the current machine). |
RANK_SIZE | NA | Specifies the number of NPUs to invoke. | An integer greater than 1. |
LD_PRELOAD | NA | Specifies the shared library to preload. | The path to the shared library. |
DEVICE_ID | 0 | Specifies the device ID of the NPU to invoke. | An integer from 0 to the number of NPUs on the server minus 1. |
MS_SCHED_PORT | NA | Specifies the port number for the Scheduler to bind. | A port number in the range 1024–65535. |
NPU_ASD_ENABLE | 0 | Whether to enable feature value detection. | |
MS_SDC_DETECT_ENABLE | 0 | Enables/disables CheckSum detection for silent data corruption. | |
ASCEND_HOME_PATH | NA | Installation path of the Ascend software package. | Set to the specified path. |
ENABLE_LAZY_INLINE | 1 | Whether to enable Lazy Inline mode. This environment variable will be deprecated and removed in the next version. | |
LOCAL_DEFAULT_PATH | ./output | Sets the default path for logs. | Set to the specified path. |
STDOUT_DEVICES | NA | Sets the list of device IDs whose output goes to standard output. | A numeric list, with multiple IDs separated by commas. |
REGISTER_PATH | | Directory path containing the plug-in code to be registered. | Set to the specified path. |
LOG_MF_PATH | ./output/log | Log path for MindSpore Transformers. | Set to the specified path. |
DEVICE_NUM_PER_NODE | 8 | Number of NPUs on the server. | An integer greater than 0. |
SHARED_PATHS | | Paths for shared storage. | Set to the specified path. |
ASCEND_PROCESS_LOG_PATH | NA | Log path for the Ascend process. | Set to the specified path. |
ENABLE_LAZY_INLINE_NO_PIPELINE | 0 | Whether to enable Lazy Inline mode under non-pipeline parallelism. This environment variable will be deprecated and removed in the next version. | |
REMOTE_SAVE_URL | None | URL used when saving training results on ModelArts. Currently deprecated and will be removed in the future. | Enter the URL for saving results. |
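To illustrate how the rank and device variables fit together, the sketch below computes RANK_ID with the formula given in the table (RANK_ID = SERVER_ID * DEVICE_NUM + DEVICE_ID). The two-server topology and the SERVER_ID/DEVICE_ID values are hypothetical; a real launcher script would derive them per process.

```shell
# Hypothetical two-server job with 8 NPUs per server:
# compute the rank of the process running device 3 on server 1.
SERVER_ID=1
DEVICE_NUM_PER_NODE=8
DEVICE_ID=3

# Formula from the RANK_ID row of the table above.
RANK_ID=$((SERVER_ID * DEVICE_NUM_PER_NODE + DEVICE_ID))
RANK_SIZE=$((2 * DEVICE_NUM_PER_NODE))   # two servers in total

export DEVICE_NUM_PER_NODE DEVICE_ID RANK_ID RANK_SIZE
echo "rank ${RANK_ID} of ${RANK_SIZE}"   # prints "rank 11 of 16"
```

Because RANK_ID folds the server index into the device index, it stays unique across the cluster even though DEVICE_ID repeats on every server.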