Function Differences with torchvision.transforms.ToTensor

torchvision.transforms.ToTensor

class torchvision.transforms.ToTensor

For more information, see torchvision.transforms.ToTensor.

mindspore.dataset.vision.ToTensor

class mindspore.dataset.vision.ToTensor(
    output_type=np.float32
    )

For more information, see mindspore.dataset.vision.ToTensor.

Usage

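PyTorch: Converts a PIL Image or a numpy array to a torch Tensor. The input numpy array is usually in <H, W, C> format with values in the range [0, 255]; the output is a torch Tensor in <C, H, W> format with values in the range [0.0, 1.0].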

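MindSpore: The input is a PIL Image or a numpy array in <H, W, C> format with values in the range [0, 255]; the output is a numpy array in <C, H, W> format with values in the range [0.0, 1.0]. This is equivalent to applying two operations to the original input image: channel conversion (HWC to CHW) and pixel value normalization.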
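The two operations described above (channel conversion and pixel value normalization) can be sketched in plain NumPy. This is only an illustration of the described behavior, assuming a uint8 input in <H, W, C> layout; `to_tensor_sketch` is a hypothetical helper name, not part of either library, and it is not the actual implementation used by MindSpore or torchvision.

```python
import numpy as np


def to_tensor_sketch(img_hwc: np.ndarray) -> np.ndarray:
    """Hypothetical helper mimicking the behavior described for ToTensor.

    Assumes a uint8 array with values in [0, 255], laid out as <H, W, C>
    (or <H, W> for a single-channel image).
    """
    if img_hwc.ndim == 2:
        # Treat a grayscale <H, W> image as <H, W, 1>.
        img_hwc = img_hwc[:, :, None]
    # Channel conversion: <H, W, C> -> <C, H, W>.
    img_chw = np.transpose(img_hwc, (2, 0, 1))
    # Pixel normalization: [0, 255] -> [0.0, 1.0], stored as float32.
    return img_chw.astype(np.float32) / 255.0


# A tiny 1x2 RGB image used only for illustration.
img = np.array([[[0, 128, 255], [255, 0, 64]]], dtype=np.uint8)
out = to_tensor_sketch(img)
print(out.shape, out.dtype)
```

For a real image, the leading dimension of the result is the channel count, so an <H, W, 3> input yields a <3, H, W> output.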

Code Example

import numpy as np
from PIL import Image
from torchvision import transforms
import mindspore.dataset.vision as vision

# In MindSpore, ToTensor converts a PIL Image into a numpy array.
img_path = "/path/to/test/1.jpg"

img = Image.open(img_path)
to_tensor = vision.ToTensor()
img_data = to_tensor(img)
print("img_data type:", type(img_data))
print("img_data dtype:", img_data.dtype)

# Out:
# img_data type: <class 'numpy.ndarray'>
# img_data dtype: float32

# In torch, ToTensor converts the input (here a numpy array) into a torch Tensor.
img_path = "/path/to/test/1.jpg"

image_transform = transforms.Compose([transforms.ToTensor()])
img = np.array(Image.open(img_path))
img_data = image_transform(img)
print(img_data.shape)
# Out:
# torch.Size([3, 2268, 4032])