Function Differences with torch.nn.Module.parameters()

torch.nn.Module.parameters

torch.nn.Module.parameters(recurse=True)

For more information, see torch.nn.Module.parameters.

mindspore.nn.Cell.get_parameters

mindspore.nn.Cell.get_parameters(expand=True)

For more information, see mindspore.nn.Cell.get_parameters.

Usage

In PyTorch, a network involves three concepts: parameter, buffer, and state, where state is the union of parameter and buffer. A parameter's requires_grad attribute indicates whether it should be optimized; buffers are usually defined as the invariants of the network, for example, running_mean and running_var in BN are automatically registered as buffers when the network is defined. Users can also register parameters and buffers themselves through the corresponding interfaces (a short sketch follows the interface list below).

  • torch.nn.Module.parameters: obtains the parameters of the network; the return type is an iterator.

  • torch.nn.Module.named_parameters: obtains both the names and the parameters of the network; the return type is an iterator.
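A minimal sketch of how these concepts behave (the module name Demo and its tensors are illustrative, not part of the example further below):

import torch
import torch.nn as nn

class Demo(nn.Module):
  def __init__(self):
    super(Demo, self).__init__()
    # Registered as a parameter: returned by parameters() and meant to be optimized.
    self.weight = nn.Parameter(torch.ones(3))
    # Registered as a buffer: part of the state, but skipped by parameters().
    self.register_buffer("scale", torch.zeros(3))

demo = Demo()
print([n for n, _ in demo.named_parameters()])  # ['weight']
print([n for n, _ in demo.named_buffers()])     # ['scale']
print(list(demo.state_dict().keys()))           # ['weight', 'scale'], i.e. state = parameter + buffer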

MindSpore currently has only the concept of parameter, whose requires_grad attribute distinguishes whether it should be optimized; for example, moving_mean and moving_variance in BN are defined as parameters with requires_grad=False when the network is defined (see the sketch after the list below).

  • mindspore.nn.Cell.get_parameters: obtains the parameters of the network; the return type is an iterator.

  • mindspore.nn.Cell.trainable_params: obtains the parameters that need to be optimized (i.e. those with requires_grad=True); the return type is a list.
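The difference between these two interfaces can be seen on a single BN layer; a minimal sketch (expected names shown as comments, assuming a standalone cell with default initialization):

from mindspore import nn

bn = nn.BatchNorm2d(4)
# get_parameters() yields every Parameter, including the requires_grad=False statistics.
print(sorted(p.name for p in bn.get_parameters()))
# ['beta', 'gamma', 'moving_mean', 'moving_variance']
# trainable_params() keeps only the parameters with requires_grad=True.
print(sorted(p.name for p in bn.trainable_params()))
# ['beta', 'gamma']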

Therefore, due to this difference in concept definitions, although torch.nn.Module.parameters and mindspore.nn.Cell.get_parameters both obtain the parameters of a network, the returned contents differ slightly: for example, the BN statistics moving_mean and moving_variance are registered as buffers in PyTorch, so they are not returned by torch.nn.Module.parameters, whereas in MindSpore they are still parameters and are therefore returned by mindspore.nn.Cell.get_parameters.

Code Examples

from mindspore import nn

class ConvBN(nn.Cell):
  def __init__(self):
    super(ConvBN, self).__init__()
    self.conv = nn.Conv2d(3, 64, 3)
    self.bn = nn.BatchNorm2d(64)
  def construct(self, x):
    x = self.conv(x)
    x = self.bn(x)
    return x

class MyNet(nn.Cell):
  def __init__(self):
    super(MyNet, self).__init__()
    self.build_block = nn.SequentialCell(ConvBN(), nn.ReLU())
  def construct(self, x):
    return self.build_block(x)

# The following implements mindspore.nn.Cell.get_parameters() with MindSpore.
net = MyNet()

print(type(net.get_parameters()), "\n")
for params in net.get_parameters():
  print("Name: ", params.name)
  print("params: ", params)
# Out:
<class 'generator'>

Name:  build_block.0.conv.weight
params:  Parameter (name=build_block.0.conv.weight, shape=(64, 3, 3, 3), dtype=Float32, requires_grad=True)
Name:  build_block.0.bn.moving_mean
params:  Parameter (name=build_block.0.bn.moving_mean, shape=(64,), dtype=Float32, requires_grad=False)
Name:  build_block.0.bn.moving_variance
params:  Parameter (name=build_block.0.bn.moving_variance, shape=(64,), dtype=Float32, requires_grad=False)
Name:  build_block.0.bn.gamma
params:  Parameter (name=build_block.0.bn.gamma, shape=(64,), dtype=Float32, requires_grad=True)
Name:  build_block.0.bn.beta
params:  Parameter (name=build_block.0.bn.beta, shape=(64,), dtype=Float32, requires_grad=True)

import torch.nn as nn

class ConvBN(nn.Module):
  def __init__(self):
    super(ConvBN, self).__init__()
    self.conv = nn.Conv2d(3, 64, 3)
    self.bn = nn.BatchNorm2d(64)
  def forward(self, x):
    x = self.conv(x)
    x = self.bn(x)
    return x

class MyNet(nn.Module):
  def __init__(self):
    super(MyNet, self).__init__()
    self.build_block = nn.Sequential(ConvBN(), nn.ReLU())
  def forward(self, x):
    return self.build_block(x)

# The following implements torch.nn.Module.parameters() with torch.
net = MyNet()
print(type(net.parameters()), "\n")
for name, params in net.named_parameters():
  print("Name: ", name)
  print("params: ", params.size())
# Out:
<class 'generator'>

Name:  build_block.0.conv.weight
params:  torch.Size([64, 3, 3, 3])
Name:  build_block.0.conv.bias
params:  torch.Size([64])
Name:  build_block.0.bn.weight
params:  torch.Size([64])
Name:  build_block.0.bn.bias
params:  torch.Size([64])
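
Note that the moving statistics are absent from the torch output above: running_mean and running_var are buffers, so parameters() skips them. They remain accessible through named_buffers() (or state_dict()); a short sketch continuing from the torch example, with the expected output shown as comments:

# Buffers are excluded from parameters() but are still part of the module state.
for name, buf in net.named_buffers():
  print("Name: ", name)
  print("buffer: ", buf.size())
# Expected out:
# Name:  build_block.0.bn.running_mean
# buffer:  torch.Size([64])
# Name:  build_block.0.bn.running_var
# buffer:  torch.Size([64])
# Name:  build_block.0.bn.num_batches_tracked
# buffer:  torch.Size([])

(As an aside, conv.bias appears only in the torch output because mindspore.nn.Conv2d defaults to has_bias=False; this is unrelated to the parameter/buffer distinction.)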