MindSpore Made Easy: Collecting Profile Data in a ModelArts Development Environment
August 12, 2022
This blog describes how to collect profile data in a ModelArts development environment and how to enable the profiler on demand for performance debugging.
1. Collecting Profile Data by Using the Training Script in the Development Environment
To collect the profile data of a neural network, you need to add MindSpore Profiler APIs to the training script.
(1) After set_context is executed and before the network and HCCL are initialized, initialize the MindSpore Profiler object.
(2) After the training is complete, call Profiler.analyse() to stop profile data collection and generate the profiling results.
The sample code is as follows:
import os

from mindspore import Model, context
from mindspore.profiler import Profiler

# args holds the parsed command-line arguments of the training script.
context.set_context(mode=context.GRAPH_MODE,
                    device_target=args.device_target)
SAVE_PATH = "./profile"
# Init Profiler.
# Data directory should be placed under SAVE_PATH.
profiler_output_path = os.path.join(SAVE_PATH, "mindspore_profile")
profiler = Profiler(output_path=profiler_output_path)
# Train Model (model is a mindspore.Model instance; arguments elided).
model.train(...)
# Profiler end
profiler.analyse()
Note: output_path indicates the path where the profile data is generated. If this path is not specified, the profile data is automatically saved in the data folder (automatically generated) under the current directory.
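The note above can be sketched as a small lookup that tells you where to expect the profile data. This is only an illustration of the rule as described in this blog, not MindSpore's own implementation; effective_profile_dir and the /work directory are hypothetical names introduced for the example.

```python
import os


def effective_profile_dir(output_path=None, cwd=None):
    """Hypothetical helper: directory in which to look for profile data.

    Mirrors the note above: with no output_path, the data is expected in a
    "data" folder under the current directory.
    """
    cwd = os.getcwd() if cwd is None else cwd
    if output_path is None:
        return os.path.join(cwd, "data")
    # Resolve a relative output_path against the working directory.
    return os.path.abspath(os.path.join(cwd, output_path))


# "/work" is a made-up working directory used only for illustration.
default_dir = effective_profile_dir(cwd="/work")
custom_dir = effective_profile_dir("./profile/mindspore_profile", cwd="/work")
print(default_dir)
print(custom_dir)
```

Resolving the path up front makes it easier to find the results later, because a relative output_path depends on the directory from which the script was started.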
2. Collecting Profile Data for Performance Debugging as Required
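Profile data collection can be switched on only for selected steps or epochs: create the Profiler with start_profile=False, then call start() and stop() from a callback.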
(1) The sample code for collecting profile data by step is as follows:
from mindspore.profiler import Profiler
from mindspore.train.callback import Callback

class StopAtStep(Callback):
    def __init__(self, start_step, stop_step):
        super(StopAtStep, self).__init__()
        self.start_step = start_step
        self.stop_step = stop_step
        # Do not start collecting until start() is called.
        self.profiler = Profiler(start_profile=False)

    def step_begin(self, run_context):
        cb_params = run_context.original_args()
        step_num = cb_params.cur_step_num
        if step_num == self.start_step:
            self.profiler.start()  # Enable profile data collection as required.

    def step_end(self, run_context):
        cb_params = run_context.original_args()
        step_num = cb_params.cur_step_num
        if step_num == self.stop_step:
            self.profiler.stop()  # Disable profile data collection as required.

    def end(self, run_context):
        self.profiler.analyse()
...
...
start_step = 2
stop_step = 5
profiler_data = StopAtStep(start_step, stop_step)
model.train(..., callbacks=[..., profiler_data])
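To see which steps such a callback actually covers, the start/stop logic can be replayed without MindSpore. The sketch below is a simulation under stated assumptions: RecordingProfiler is a hypothetical stand-in that only records calls, simulate is an illustrative driver, and step numbers are assumed to start at 1.

```python
class RecordingProfiler:
    """Hypothetical stand-in for the real profiler; it only records calls."""

    def __init__(self):
        self.active = False
        self.events = []

    def start(self):
        self.active = True
        self.events.append("start")

    def stop(self):
        self.active = False
        self.events.append("stop")

    def analyse(self):
        self.events.append("analyse")


def simulate(total_steps, start_step, stop_step):
    """Replay the step_begin / step_end / end logic of a step-window callback."""
    profiler = RecordingProfiler()
    profiled = []
    for step in range(1, total_steps + 1):  # assumption: steps are 1-based
        if step == start_step:  # step_begin
            profiler.start()
        if profiler.active:  # the step body runs while collection is on
            profiled.append(step)
        if step == stop_step:  # step_end
            profiler.stop()
    profiler.analyse()  # end of training
    return profiled, profiler.events


profiled_steps, events = simulate(total_steps=8, start_step=2, stop_step=5)
print(profiled_steps)  # [2, 3, 4, 5]
print(events)          # ['start', 'stop', 'analyse']
```

Because collection starts in step_begin and stops in step_end, both boundary steps are included in the window.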
(2) The sample code for collecting profile data by epoch is as follows:
class StopAtEpoch(Callback):
    def __init__(self, start_epoch, stop_epoch):
        super(StopAtEpoch, self).__init__()
        self.start_epoch = start_epoch
        self.stop_epoch = stop_epoch
        self.profiler = Profiler(start_profile=False)

    def epoch_begin(self, run_context):
        cb_params = run_context.original_args()
        epoch_num = cb_params.cur_epoch_num
        if epoch_num == self.start_epoch:
            self.profiler.start()  # Enable profile data collection as required.

    def epoch_end(self, run_context):
        cb_params = run_context.original_args()
        epoch_num = cb_params.cur_epoch_num
        if epoch_num == self.stop_epoch:
            self.profiler.stop()  # Disable profile data collection as required.

    def end(self, run_context):
        self.profiler.analyse()
...
...
start_epoch = 2
stop_epoch = 5
profiler_data = StopAtEpoch(start_epoch, stop_epoch)
model.train(..., callbacks=[..., profiler_data])
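When choosing between a step window and an epoch window, it can help to estimate how many steps an epoch window spans. The arithmetic below is a rough sketch that assumes 1-based epoch and step numbers, a step counter that keeps increasing across epochs, and a fixed number of steps per epoch; epoch_window_to_steps and the value 100 are hypothetical and chosen only for illustration.

```python
def epoch_window_to_steps(start_epoch, stop_epoch, steps_per_epoch):
    """Hypothetical helper: global step range covered by an epoch window."""
    first_step = (start_epoch - 1) * steps_per_epoch + 1
    last_step = stop_epoch * steps_per_epoch
    return first_step, last_step


# steps_per_epoch=100 is a made-up value for illustration.
first, last = epoch_window_to_steps(2, 5, steps_per_epoch=100)
print(first, last)       # 101 500
print(last - first + 1)  # 400 steps are profiled
```

An epoch window usually covers far more steps than a step window with the same numbers, so the amount of collected data grows accordingly.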
3. Running the Script
Startup command:
python MindSpore_1P_profiler.py data_path=xxx
Run the script in the Terminal of the development environment. When the script finishes, the generated profile data is stored under SAVE_PATH.
Note: On-demand profiler performance debugging does not support user-defined data storage paths. Therefore, after the program finishes, the profile data is saved by default in the data folder in the current directory.
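Once the script exits, one quick way to confirm that data was written is to list the files under the output directory (SAVE_PATH, or the data folder for on-demand collection). The following is a minimal sketch that uses only the standard library; list_profile_files is a hypothetical helper, and the file names in the demo are placeholders, not names that MindSpore produces.

```python
import os
import tempfile


def list_profile_files(root):
    """Return the sorted paths, relative to root, of all files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.append(os.path.relpath(full_path, root))
    return sorted(found)


# Demo on a throwaway directory with placeholder file names.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "profiler"))
    for name in ("a.txt", os.path.join("profiler", "b.csv")):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("placeholder")
    demo_files = list_profile_files(tmp)

print(demo_files)
```

In a real session you would call list_profile_files("./profile") or list_profile_files("./data") from the directory where the training script was started; an empty list suggests that no profile data was generated.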
Precautions:
1. Currently, performance debugging is not supported when inference is run during training. It is supported when training or inference runs on its own.
2. Ascend performance debugging does not support dynamic shape, multi-subgraph, or control flow scenarios.