[{"data":1,"prerenderedAt":502},["ShallowReactive",2],{"content-query-lEiSo8A8sp":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"date":10,"cover":11,"type":12,"body":13,"_type":496,"_id":497,"_source":498,"_file":499,"_stem":500,"_extension":501},"/news/en/2770","en",false,"","Idea Sharing: Physics-informed Convolutional-Recurrent Network for Solving Spatiotemporal PDEs","PhyCRNet combines the strengths of a ConvLSTM, a global residual connection, and finite-difference-based spatiotemporal filtering into a unified approach that can handle inverse problems as well as sparse and noisy data.","2023-08-28","https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/4f442a4cc6c84408bc3ee6e15d5de47e.png","news",{"type":14,"children":15,"toc":493},"root",[16,24,30,39,44,49,54,62,67,75,115,123,131,136,143,153,158,165,197,205,210,217,222,227,232,237,245,285,292,297,336,343,348,356,367,374,381,400,408,413,420,425,432,445,450,455,462,467,475,480],{"type":17,"tag":18,"props":19,"children":21},"element","h1",{"id":20},"idea-sharing-physics-informed-convolutional-recurrent-network-for-solving-spatiotemporal-pdes",[22],{"type":23,"value":8},"text",{"type":17,"tag":25,"props":26,"children":27},"p",{},[28],{"type":23,"value":29},"Author: Yu Fan. Source: Zhihu.",{"type":17,"tag":25,"props":31,"children":32},{},[33],{"type":17,"tag":34,"props":35,"children":36},"strong",{},[37],{"type":23,"value":38},"Background",{"type":17,"tag":25,"props":40,"children":41},{},[42],{"type":23,"value":43},"Complex spatiotemporal systems modeled by partial differential equations (PDEs) are common in many disciplines, such as applied mathematics, physics, biology, chemistry, and engineering. In most cases, we cannot obtain analytical solutions to the PDEs used to describe these complex physical systems. 
Therefore, numerical methods have been extensively studied, including finite element analysis, finite difference analysis, and isogeometric analysis (IGA). Although classic numerical approaches that approximate the exact solutions with basis functions can achieve remarkable accuracy for forward analysis, their computing overhead remains a major difficulty in applications such as data assimilation and inverse problems.",{"type":17,"tag":25,"props":45,"children":46},{},[47],{"type":23,"value":48},"In recent years, a variety of deep learning methods for solving forward and inverse problems in nonlinear systems have emerged. Research on physical system modeling by using deep neural networks (DNNs) falls into two categories: continuous and discretized networks. The representative work on continuous networks is physics-informed neural networks (PINNs), where the PDE residuals are used as soft constraints of DNNs and fully connected layers are used to approximate the equation solutions, facilitating DNN training on small or unlabeled sampling datasets. Despite these advantages, PINNs are generally limited to low-dimensional parameterizations and less capable of handling PDE systems whose solutions exhibit sharp gradients or complex local morphology. A few recent pilot studies show that discretized networks have better scalability and faster convergence. For example, CNNs can be used as surrogate models for time-independent systems, and PhyGeoNet can geometry-adaptively solve steady-state PDEs through coordinate transformation between the physical and reference domains. 
On the other hand, for time-dependent systems, most NN-based solutions still focus on data-driven approaches and rely on meshing.",{"type":17,"tag":25,"props":50,"children":51},{},[52],{"type":23,"value":53},"PhyCRNet, proposed by Professor Sun Hao's team at the Gaoling School of Artificial Intelligence, Renmin University of China, together with Northeastern University (US) and the University of Notre Dame, is an unsupervised learning method for solving multi-dimensional spatiotemporal PDEs using a physics-informed convolutional-recurrent network structure. PhyCRNet combines the strengths of a convolutional long short-term memory network (ConvLSTM), which extracts low-dimensional spatial features and learns their temporal evolution; a global residual connection, which strictly maps the time-marching dynamics of the PDE solution; and finite-difference-based spatiotemporal filtering, which determines the essential PDE derivatives for constructing the residual loss function. The resulting approach can handle inverse problems as well as sparse and noisy data.",{"type":17,"tag":25,"props":55,"children":56},{},[57],{"type":17,"tag":34,"props":58,"children":59},{},[60],{"type":23,"value":61},"1. 
Problem Definition",{"type":17,"tag":25,"props":63,"children":64},{},[65],{"type":23,"value":66},"The general form of a multi-dimensional nonlinear parametric PDE system is as follows:",{"type":17,"tag":25,"props":68,"children":69},{},[70],{"type":17,"tag":71,"props":72,"children":74},"img",{"alt":7,"src":73},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/97bd13cb49e243c29f33596c076baeaa.png",[],{"type":17,"tag":25,"props":76,"children":77},{},[78,80,85,87,92,94,99,101,106,108,113],{"type":23,"value":79},"where ",{"type":17,"tag":34,"props":81,"children":82},{},[83],{"type":23,"value":84},"u(x, t)",{"type":23,"value":86}," represents the solution of the equation in the temporal domain ",{"type":17,"tag":34,"props":88,"children":89},{},[90],{"type":23,"value":91},"T",{"type":23,"value":93}," and the spatial domain ",{"type":17,"tag":34,"props":95,"children":96},{},[97],{"type":23,"value":98},"Ω",{"type":23,"value":100},", and ",{"type":17,"tag":34,"props":102,"children":103},{},[104],{"type":23,"value":105},"F",{"type":23,"value":107}," is the nonlinear functional parameterized by ",{"type":17,"tag":34,"props":109,"children":110},{},[111],{"type":23,"value":112},"λ",{"type":23,"value":114},".",{"type":17,"tag":25,"props":116,"children":117},{},[118],{"type":17,"tag":34,"props":119,"children":120},{},[121],{"type":23,"value":122},"2. Modeling Method",{"type":17,"tag":25,"props":124,"children":125},{},[126],{"type":17,"tag":34,"props":127,"children":128},{},[129],{"type":23,"value":130},"ConvLSTM",{"type":17,"tag":25,"props":132,"children":133},{},[134],{"type":23,"value":135},"ConvLSTM is a spatiotemporal sequence-to-sequence learning framework extended from LSTM and its variant LSTM encoder-decoder prediction architecture, which has the advantage of modeling long-period dependencies that evolve over time. 
Essentially, the memory cell is updated by accessing, accumulating, and removing information through a delicate design of controlling gates. This setup alleviates the vanishing-gradient problem of recurrent neural networks (RNNs). ConvLSTM inherits the basic structure of LSTM (such as cells and gates) for controlling the information flow, but replaces the fully connected NNs (FC-NNs) with CNNs, which better represent spatial connections in the gated operations. As a special class of RNNs, LSTM can be used as an implicit numerical scheme for solving time-dependent PDEs. The following figure shows the structure of a single ConvLSTM cell.",{"type":17,"tag":25,"props":137,"children":138},{},[139],{"type":17,"tag":71,"props":140,"children":142},{"alt":7,"src":141},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/d709788ca164443c83c4c22be39d5f7c.png",[],{"type":17,"tag":25,"props":144,"children":145},{},[146,148],{"type":23,"value":147},"Figure 1: Single ConvLSTM cell at time ",{"type":17,"tag":34,"props":149,"children":150},{},[151],{"type":23,"value":152},"t",{"type":17,"tag":25,"props":154,"children":155},{},[156],{"type":23,"value":157},"The mathematical formulation for updating a ConvLSTM cell is as follows:",{"type":17,"tag":25,"props":159,"children":160},{},[161],{"type":17,"tag":71,"props":162,"children":164},{"alt":7,"src":163},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/ffa5135323be4c9c9851d80e1f230cb7.png",[],{"type":17,"tag":25,"props":166,"children":167},{},[168,169,174,176,181,183,188,190,195],{"type":23,"value":79},{"type":17,"tag":34,"props":170,"children":171},{},[172],{"type":23,"value":173},"*",{"type":23,"value":175}," represents the convolutional operation, ",{"type":17,"tag":34,"props":177,"children":178},{},[179],{"type":23,"value":180},"⊙",{"type":23,"value":182}," represents the Hadamard product, 
",{"type":17,"tag":34,"props":184,"children":185},{},[186],{"type":23,"value":187},"W",{"type":23,"value":189},"s are the weight parameters of the corresponding filters, and ",{"type":17,"tag":34,"props":191,"children":192},{},[193],{"type":23,"value":194},"b",{"type":23,"value":196},"s represent bias vectors.",{"type":17,"tag":25,"props":198,"children":199},{},[200],{"type":17,"tag":34,"props":201,"children":202},{},[203],{"type":23,"value":204},"Pixel Shuffle",{"type":17,"tag":25,"props":206,"children":207},{},[208],{"type":23,"value":209},"Pixel shuffle is an efficient sub-pixel convolution operation for upscaling low-resolution (LR) feature maps into high-resolution (HR) outputs. Consider an LR feature tensor of shape (C × r², H, W), where C indicates the number of channels, H and W are the height and width respectively, and r is the upscaling factor. We can obtain an HR tensor of shape (C, H × r, W × r) by realigning the elements in the LR tensor.",{"type":17,"tag":25,"props":211,"children":212},{},[213],{"type":17,"tag":71,"props":214,"children":216},{"alt":7,"src":215},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/ba90ae464d74434fac90e768473724e5.png",[],{"type":17,"tag":25,"props":218,"children":219},{},[220],{"type":23,"value":221},"Figure 2: Pixel shuffle layer",{"type":17,"tag":25,"props":223,"children":224},{},[225],{"type":23,"value":226},"The efficiency of pixel shuffle is reflected in the following aspects:",{"type":17,"tag":25,"props":228,"children":229},{},[230],{"type":23,"value":231},"• Upscaling is performed only in the last convolutional layer, unlike other methods such as deconvolution, where more convolutional layers are needed to reach the target resolution.",{"type":17,"tag":25,"props":233,"children":234},{},[235],{"type":23,"value":236},"• The feature extraction layers before the upsampling layer can use smaller filters to process the LR 
tensors.",{"type":17,"tag":25,"props":238,"children":239},{},[240],{"type":17,"tag":34,"props":241,"children":242},{},[243],{"type":23,"value":244},"PhyCRNet",{"type":17,"tag":25,"props":246,"children":247},{},[248,250,255,257,262,264,269,271,276,278,283],{"type":23,"value":249},"PhyCRNet consists of an encoder-decoder module, a residual connection, an autoregressive (AR) process, and filtering-based differentiation. The encoder includes three convolutional layers for learning low-dimensional latent features from the input state variable Ui and uses ConvLSTM layers to evolve the learned features over time. Because the transformation is performed on low-dimensional variables, the memory burden is alleviated. In addition, inspired by the forward Euler scheme, a global residual connection is added between the input state variable ",{"type":17,"tag":34,"props":251,"children":252},{},[253],{"type":23,"value":254},"Ui",{"type":23,"value":256}," and the output variable ",{"type":17,"tag":34,"props":258,"children":259},{},[260],{"type":23,"value":261},"Ui+1",{"type":23,"value":263},". Then, the single-step learning process can be represented as ",{"type":17,"tag":34,"props":265,"children":266},{},[267],{"type":23,"value":268},"Ui+1 = Ui + δt · N[Ui; θ]",{"type":23,"value":270},", where ",{"type":17,"tag":34,"props":272,"children":273},{},[274],{"type":23,"value":275},"N[·]",{"type":23,"value":277}," indicates the trained NN operator and ",{"type":17,"tag":34,"props":279,"children":280},{},[281],{"type":23,"value":282},"δt",{"type":23,"value":284}," is the time interval. 
Therefore, this recursive relationship can be regarded as a simple AR process.",{"type":17,"tag":25,"props":286,"children":287},{},[288],{"type":17,"tag":71,"props":289,"children":291},{"alt":7,"src":290},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/6528cc6a8514477da38051e82a6da4e3.png",[],{"type":17,"tag":25,"props":293,"children":294},{},[295],{"type":23,"value":296},"Figure 3: PhyCRNet network structure",{"type":17,"tag":25,"props":298,"children":299},{},[300,302,307,309,314,316,321,323,328,330,334],{"type":23,"value":301},"Here, ",{"type":17,"tag":34,"props":303,"children":304},{},[305],{"type":23,"value":306},"U0",{"type":23,"value":308}," is the initial condition, and ",{"type":17,"tag":34,"props":310,"children":311},{},[312],{"type":23,"value":313},"U1",{"type":23,"value":315}," to ",{"type":17,"tag":34,"props":317,"children":318},{},[319],{"type":23,"value":320},"UT",{"type":23,"value":322}," are discrete solutions to be predicted by the model. Compared with traditional numerical methods, ConvLSTM can use larger time intervals. For the calculation of each derivative term, we use a fixed convolutional kernel to represent the numerical differentiation. PhyCRNet uses second-order and fourth-order difference schemes to calculate temporal and spatial derivatives of ",{"type":17,"tag":34,"props":324,"children":325},{},[326],{"type":23,"value":327},"U",{"type":23,"value":329},". 
To further improve computational efficiency, we can skip the encoder within each period of ",{"type":17,"tag":34,"props":331,"children":332},{},[333],{"type":23,"value":91},{"type":23,"value":335},", applying it only at the first time instant of each period, as shown in the following figure:",{"type":17,"tag":25,"props":337,"children":338},{},[339],{"type":17,"tag":71,"props":340,"children":342},{"alt":7,"src":341},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/d6ba712ce95f4b3bb65f8c8592dba16e.png",[],{"type":17,"tag":25,"props":344,"children":345},{},[346],{"type":23,"value":347},"Figure 4: PhyCRNet-s network structure",{"type":17,"tag":25,"props":349,"children":350},{},[351],{"type":17,"tag":34,"props":352,"children":353},{},[354],{"type":23,"value":355},"I/BC Hard Imposition",{"type":17,"tag":25,"props":357,"children":358},{},[359,361,365],{"type":23,"value":360},"Unlike PINNs, which impose the physical initial/boundary conditions (I/BCs) softly (their residuals are optimized as part of the loss function), PhyCRNet strictly integrates I/BCs into the model (the IC is used as the input ",{"type":17,"tag":34,"props":362,"children":363},{},[364],{"type":23,"value":306},{"type":23,"value":366}," of ConvLSTM, and the BC is imposed through padding). In this way, the physical conditions become hard rather than soft impositions, improving the accuracy and facilitating convergence. For Dirichlet BCs, the known boundary constants can be incorporated into the state variables through spatial padding. 
For Neumann BCs, a layer of ghost elements may be added around the spatial domain, and their values are approximated by finite differences during training.",{"type":17,"tag":25,"props":368,"children":369},{},[370],{"type":17,"tag":71,"props":371,"children":373},{"alt":7,"src":372},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/abbb348f523b47eda800ecc269aeb1f1.png",[],{"type":17,"tag":25,"props":375,"children":376},{},[377],{"type":17,"tag":71,"props":378,"children":380},{"alt":7,"src":379},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/3402a43ae9f546a5b4cf8896f99012bf.png",[],{"type":17,"tag":25,"props":382,"children":383},{},[384,386,391,393,398],{"type":23,"value":385},"where ",{"type":17,"tag":34,"props":387,"children":388},{},[389],{"type":23,"value":390},"Nτ",{"type":23,"value":392}," is the number of time steps within [0, τ], and ",{"type":17,"tag":34,"props":394,"children":395},{},[396],{"type":23,"value":397},"u*(x, t)",{"type":23,"value":399}," is the reference solution.",{"type":17,"tag":25,"props":401,"children":402},{},[403],{"type":17,"tag":34,"props":404,"children":405},{},[406],{"type":23,"value":407},"2D Burgers' Equations",{"type":17,"tag":25,"props":409,"children":410},{},[411],{"type":23,"value":412},"Consider a classical problem in fluid dynamics, the 2D Burgers' equation given in the following form:",{"type":17,"tag":25,"props":414,"children":415},{},[416],{"type":17,"tag":71,"props":417,"children":419},{"alt":7,"src":418},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/a7b2bf249cc9484da6006e2a1fca3c68.png",[],{"type":17,"tag":25,"props":421,"children":422},{},[423],{"type":23,"value":424},"Four representative time instants, two in the training phase (t = 1.0, 2.0) and two in the extrapolation phase (t = 3.0, 4.0), are chosen to compare the solution accuracy and extrapolability of PhyCRNet and 
PINN.",{"type":17,"tag":25,"props":426,"children":427},{},[428],{"type":17,"tag":71,"props":429,"children":431},{"alt":7,"src":430},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/6c606482d0dc439e8340391fc6865292.png",[],{"type":17,"tag":25,"props":433,"children":434},{},[435,439,441],{"type":17,"tag":71,"props":436,"children":438},{"alt":7,"src":437},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/24576961a1644a8ab49411f4a6cddf5f.png",[],{"type":23,"value":440}," ",{"type":17,"tag":71,"props":442,"children":444},{"alt":7,"src":443},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/f87e39e1a66344899732a682170a2162.png",[],{"type":17,"tag":25,"props":446,"children":447},{},[448],{"type":23,"value":449},"Figure 7: Comparison of solution accuracy and extrapolability between PhyCRNet and PINN for the λ-ω RD equations",{"type":17,"tag":25,"props":451,"children":452},{},[453],{"type":23,"value":454},"The following figure shows the error propagation curves of PhyCRNet and PINN during training and extrapolation in the preceding two PDE systems. It can be seen that PhyCRNet has better performance in both phases, especially extrapolation.",{"type":17,"tag":25,"props":456,"children":457},{},[458],{"type":17,"tag":71,"props":459,"children":461},{"alt":7,"src":460},"https://obs-mindspore-file.obs.cn-north-4.myhuaweicloud.com/file/2023/09/14/0c47f2fd7f654dee9ba305032d40e6d5.png",[],{"type":17,"tag":25,"props":463,"children":464},{},[465],{"type":23,"value":466},"Figure 8: Comparison of error propagation between PhyCRNet and PINN",{"type":17,"tag":25,"props":468,"children":469},{},[470],{"type":17,"tag":34,"props":471,"children":472},{},[473],{"type":23,"value":474},"References",{"type":17,"tag":25,"props":476,"children":477},{},[478],{"type":23,"value":479},"[1] Ren P, Rao C, Liu Y, et al. PhyCRNet: Physics-informed convolutional-recurrent network for solving spatiotemporal PDEs[J]. 
Computer Methods in Applied Mechanics and Engineering, 2022, 389: 114399.",{"type":17,"tag":25,"props":481,"children":482},{},[483,485],{"type":23,"value":484},"[2]",{"type":17,"tag":486,"props":487,"children":491},"a",{"href":488,"rel":489},"https://www.sciencedirect.com/science/article/abs/pii/S0045782521006514?via%3Dihub",[490],"nofollow",[492],{"type":23,"value":488},{"title":7,"searchDepth":494,"depth":494,"links":495},4,[],"markdown","content:news:en:2770.md","content","news/en/2770.md","news/en/2770","md",1776506046158]