# MindSpore Error Information Tricks (3): Discriminating Two Running Modes

![](https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2022/03/18/43c6fd8be545463f95da126bc9d81bdb.png)

In most cases, a deep learning framework supports two running modes: static graph mode and dynamic graph mode. In static graph mode, the network is compiled into a graph, and operations are performed according to the graph's structure. In dynamic graph mode, code is executed sequentially. MindSpore likewise supports two running modes: PyNative mode (dynamic graph mode) and Graph mode (static graph mode). In PyNative mode, the operators in a neural network are dispatched and executed one by one, which makes it easy to build and debug a neural network model. In Graph mode, the neural network model is compiled into a graph and then dispatched for execution as a whole. This mode uses technologies such as graph optimization to improve running performance and facilitates large-scale deployment and cross-platform operation.

Normally, the running mode is selected according to the requirements of the network to be trained. This blog, however, takes a different approach: telling the two modes apart by analyzing error information produced by the **Cell** source code. As the **Cell** introduction below shows, the MindSpore **Cell** class is the basis for building networks and the basic unit of a network. To customize a network, inherit the **Cell** class and override its **__init__** and **construct** methods.

![](https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2022/03/18/24a64c031f5d445bb27b9e6c6c25e0f0.png)

The following are three ways to distinguish the two modes from error messages related to the **Cell** source code.

1. **self.compile_and_run(*args)**

Example:

```python
def __call__(self, *args, **kwargs):
    ...
    # Run in Graph mode.
    if context._get_mode() == context.GRAPH_MODE:
        ...
        out = self.compile_and_run(*args)
        return out
    ...
    # Run in PyNative mode.
    ...
    with self.CellGuard():
        try:
            output = self.run_construct(cast_inputs, kwargs)
        except Exception as err:
            _pynative_executor.clear_res()
            raise err
    ...
```

The **__call__** method is always invoked when the network is executed, and the Graph and PyNative modes take different code branches inside it. Therefore, if an error traceback shows that **self.compile_and_run(*args)** was invoked, the code took the Graph-mode branch, and the error was reported in Graph mode.

2. **context.get_context("mode")**

Example:

```
Traceback (most recent call last):
  File "test1.py", line 26, in <module>
    output = net(x)
  File "/root/anaconda3/envs/test/lib/python3.7/site-packages/mindspore/nn/cell.py", line 479, in __call__
    out = self.compile_and_run(*args)
  File "/root/anaconda3/envs/test/lib/python3.7/site-packages/mindspore/nn/cell.py", line 802, in compile_and_run
...
    arg_name, prim_name, rel_str, arg_value, type(arg_value).__name__))
ValueError: `axis` in `ReduceMean` should be in range of [-4, 4), but got 5.000e+00 with type `int`.
```

By default, MindSpore runs in PyNative mode. You can switch to Graph mode by calling **context.set_context(mode=context.GRAPH_MODE)**, and switch back to PyNative mode with **context.set_context(mode=context.PYNATIVE_MODE)**. Accordingly, you can call **context.get_context("mode")** to check which mode the network is currently running in.

3. **The function call stack**

Example:

```
The function call stack (See file '/root/mindspore_test/rank_0/om/analyze_fail.dat' for more details):
# 0 In file /root/anaconda3/envs/test/lib/python3.7/site-packages/mindspore/nn/layer/math.py(1003)
        if tensor_dtype == mstype.float16:
# 1 In file /root/anaconda3/envs/test/lib/python3.7/site-packages/mindspore/nn/layer/math.py(1007)
        if not self.keep_dims:
# 2 In file /root/anaconda3/envs/test/lib/python3.7/site-packages/mindspore/nn/layer/math.py(1005)
        mean = self.reduce_mean(x, self.axis)
               ^
```

As mentioned earlier, in Graph mode the network is compiled into a graph structure before it is executed. If a node in the graph fails to execute, how can we find the line of code that generated that node? To solve this problem, MindSpore provides a tracing mechanism for static graph nodes: **the function call stack** records how each node was converted from source code. In other words, we can trace back through the stack to find the code that generated the operator and thus locate the faulty line. As the error output above shows, the failing **reduce_mean** operator was generated from line 1005 of **mindspore/nn/layer/math.py**. Whenever an error is reported while the network is executing as a static graph, we can use **the function call stack** to pinpoint the offending line of code.

Error messages mainly indicate that a desired operation has failed, or feed back important warnings. I hope this blog has helped you understand how error messages can be used to discriminate the running modes of a deep learning framework. However, error messages can do more than that. For more details, please subscribe to MindSpore News at https://www.mindspore.cn/news/en.
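To make method 1 concrete, the branch-name heuristic can be sketched as a small helper. This is a hypothetical illustration, not part of MindSpore: the function name is made up, and it simply looks for the branch-specific call names that **Cell.__call__** uses in each mode.

```python
def guess_mode_from_traceback(tb_text: str) -> str:
    """Hypothetical helper: infer the running mode from a traceback string."""
    # Cell.__call__ calls `compile_and_run` only on the Graph-mode branch
    # and `run_construct` only on the PyNative-mode branch, so whichever
    # name appears in the traceback reveals the mode.
    if "compile_and_run" in tb_text:
        return "GRAPH_MODE"
    if "run_construct" in tb_text:
        return "PYNATIVE_MODE"
    return "UNKNOWN"
```

For instance, feeding it the traceback shown under method 2 above would return `"GRAPH_MODE"`, because the stack passes through `compile_and_run`.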
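The **ValueError** in the traceback under method 2 comes from an argument range check on the reduction axis. Below is a minimal, simplified sketch of that kind of validation; the helper name is hypothetical and the code is not MindSpore's actual validator.

```python
def check_axis_in_range(axis: int, rank: int, prim_name: str = "ReduceMean") -> int:
    """Hypothetical sketch: validate a reduction axis for a rank-`rank` tensor."""
    # For a rank-R input tensor, a valid reduction axis lies in [-R, R).
    if not -rank <= axis < rank:
        raise ValueError(
            f"`axis` in `{prim_name}` should be in range of "
            f"[{-rank}, {rank}), but got {axis} with type "
            f"`{type(axis).__name__}`.")
    # Normalize negative axes to their non-negative equivalent.
    return axis % rank
```

With a rank-4 input, `axis=5` fails this check, which is exactly the situation the sample traceback reports.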
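The function call stack printed in Graph mode (method 3) is regular enough to parse mechanically. As a rough sketch, a helper (hypothetical name, assuming frame lines of the form `# N In file <path>(<line>)` as in the sample above) could extract the last listed frame, which in that sample is the one marked with `^`:

```python
import re

def last_stack_frame(call_stack_text: str):
    """Hypothetical helper: return (path, line) of the last frame, or None."""
    # Each frame line looks like: "# 2 In file /path/to/math.py(1005)".
    frames = re.findall(r"# \d+ In file (.+?)\((\d+)\)", call_stack_text)
    if not frames:
        return None
    path, line = frames[-1]
    return path, int(line)
```

On the sample output above this would yield `mindspore/nn/layer/math.py` at line 1005, matching the operator the article traces the failure to.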