# Function Differences with torch.optim.Adagrad

## torch.optim.Adagrad

```python
class torch.optim.Adagrad(
    params,
    lr=0.01,
    lr_decay=0,
    weight_decay=0,
    initial_accumulator_value=0,
    eps=1e-10
)
```

For more information, see [torch.optim.Adagrad](https://pytorch.org/docs/1.5.0/optim.html#torch.optim.Adagrad).

## mindspore.nn.Adagrad

```python
class mindspore.nn.Adagrad(
    params,
    accum=0.1,
    learning_rate=0.001,
    update_slots=True,
    loss_scale=1.0,
    weight_decay=0.0
)(grads)
```

For more information, see [mindspore.nn.Adagrad](https://mindspore.cn/docs/en/r2.0.0-alpha/api_python/nn/mindspore.nn.Adagrad.html#mindspore.nn.Adagrad).

## Differences

PyTorch: The parameters to be optimized are packed into one iterable (or a list of parameter-group dicts) and passed as a whole. A `step` method performs a single optimization step and returns the loss when a closure is passed.

MindSpore: Both a single learning rate for all parameters and different values for different parameter groups are supported; a sketch of the group format is given at the end of this section. Gradients are passed in explicitly when the optimizer instance is called.

| Categories | Subcategories | PyTorch | MindSpore | Differences |
| --- | --- | --- | --- | --- |
| Parameters | Parameter 1 | params | params | - |
| | Parameter 2 | lr | learning_rate | Same function, different parameter names and default values |
| | Parameter 3 | lr_decay | - | Learning rate decay. MindSpore does not have this parameter |
| | Parameter 4 | weight_decay | weight_decay | - |
| | Parameter 5 | initial_accumulator_value | accum | Same function, different parameter names and default values |
| | Parameter 6 | eps | - | Term added to the denominator to maintain numerical stability. MindSpore does not have this parameter |
| | Parameter 7 | - | update_slots | If the value is True, the accumulator is updated. PyTorch does not have this parameter |
| | Parameter 8 | - | loss_scale | Gradient scaling factor, default value: 1.0. PyTorch does not have this parameter |
| Input | Single input | - | grads | The gradients of `params`, passed when the optimizer instance is called (see the sketch after the code example). In PyTorch, gradients are produced by `loss.backward()` and applied by `step` |

### Code Example

> The two APIs basically achieve the same function.

```python
# PyTorch
import torch

var = torch.tensor([1.0], requires_grad=True)
opt = torch.optim.Adagrad([var], lr=0.1, initial_accumulator_value=0.1)

def closure():
    # one forward/backward pass over the quadratic loss
    opt.zero_grad()
    loss = (var ** 2) / 2.0
    loss.backward()
    return loss

opt.step(closure)
print(var.detach().numpy())
# [0.9046537]
opt.step(closure)
print(var.detach().numpy())
# [0.8393387]

# MindSpore
import numpy as np
import mindspore.nn as nn
import mindspore as ms
from mindspore.dataset import NumpySlicesDataset

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.w = ms.Parameter(ms.Tensor(np.array([1.0], np.float32)), name='w')

    def construct(self, x):
        f = self.w * x
        return f

class MyLoss(nn.LossBase):
    def __init__(self, reduction='none'):
        super(MyLoss, self).__init__(reduction)

    def construct(self, y, y_pred):
        return (y - y_pred) ** 2 / 2.0

net = Net()
loss = MyLoss()
optim = nn.Adagrad(params=net.trainable_params(), accum=0.1, learning_rate=0.1)
model = ms.Model(net, loss_fn=loss, optimizer=optim)
data_x = np.array([1.0], dtype=np.float32)
data_y = np.array([0.0], dtype=np.float32)
data = NumpySlicesDataset((data_x, data_y), ["x", "y"])
input_x = ms.Tensor(np.array([1.0], np.float32))
y0 = net(input_x)  # output before training
model.train(1, data)
y1 = net(input_x)
print(y1)
# [0.9046537]
model.train(1, data)
y2 = net(input_x)
print(y2)
# [0.8393387]
```
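The `grads` input in the table above means the MindSpore optimizer instance itself is called with gradients, rather than reading them from the parameters the way `step` does in PyTorch. The following is a minimal sketch of that calling convention, reusing the `Net` class defined in the example above; it assumes the functional `mindspore.grad` interface of MindSpore 2.0 and is illustrative rather than part of the original example.

```python
# Sketch: one manual optimization step by calling the optimizer with `grads`.
# Assumes `Net` from the example above and MindSpore 2.0's functional `mindspore.grad`.
import numpy as np
import mindspore as ms
import mindspore.nn as nn

net = Net()
opt = nn.Adagrad(params=net.trainable_params(), accum=0.1, learning_rate=0.1)

def forward_fn(x, y):
    # same quadratic loss as above, reduced to a scalar
    return (((y - net(x)) ** 2) / 2.0).sum()

# gradients with respect to the trainable parameters only
grad_fn = ms.grad(forward_fn, grad_position=None, weights=net.trainable_params())

x = ms.Tensor(np.array([1.0], np.float32))
y = ms.Tensor(np.array([0.0], np.float32))
grads = grad_fn(x, y)
opt(grads)  # one step: applies `grads` to the parameters
print(net.w.asnumpy())
# [0.9046537]  (matches the first training step above)
```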
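As noted under Differences, both APIs also accept per-group settings. Below is a hedged sketch of the parameter-group format on each side, following the respective API documentation; the layers and values are illustrative only. In both libraries a group dict may carry its own `lr` key, and a group without one falls back to the optimizer-level default (`lr` in PyTorch, `learning_rate` in MindSpore).

```python
# Sketch: per-parameter-group learning rates (layer names and values are illustrative).
import torch
import mindspore.nn as nn

# PyTorch: list of dicts; the bias group falls back to the default lr.
model = torch.nn.Linear(2, 2)
opt_pt = torch.optim.Adagrad(
    [{'params': [model.weight], 'lr': 0.05},
     {'params': [model.bias]}],
    lr=0.01)

# MindSpore: the same idea; the bias group falls back to learning_rate.
dense = nn.Dense(2, 2)
group_params = [{'params': [dense.weight], 'lr': 0.05},
                {'params': [dense.bias]}]
opt_ms = nn.Adagrad(group_params, learning_rate=0.01)
```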