Function differences with torchvision.transforms.ToTensor

torchvision.transforms.ToTensor

class torchvision.transforms.ToTensor

For more information, see torchvision.transforms.ToTensor.

mindspore.dataset.vision.ToTensor

class mindspore.dataset.vision.ToTensor(
    output_type=np.float32
    )

For more information, see mindspore.dataset.vision.ToTensor.

Differences

PyTorch: Converts a PIL Image or numpy array to a tensor. The input numpy array is usually in <H, W, C> format with values in the range [0, 255]; the output is a torch.Tensor in <C, H, W> format with values in the range [0.0, 1.0].

MindSpore: The input is a PIL Image or a numpy array in <H, W, C> format with values in the range [0, 255]; the output is a numpy array in <C, H, W> format with values in the range [0.0, 1.0]. This is equivalent to applying two operations to the input image: channel conversion and pixel value normalization.
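The two operations described for MindSpore (channel conversion followed by pixel value normalization) can be sketched with plain NumPy. This is only an illustration of the described behavior, not the actual implementation of either library; the helper name `to_tensor_like` and the small random test image are assumptions made for the example.

```python
import numpy as np


def to_tensor_like(image_hwc, output_type=np.float32):
    """Illustrative stand-in (hypothetical helper) for the two steps described above."""
    arr = np.asarray(image_hwc)
    # Treat a grayscale <H, W> image as <H, W, 1> so the transpose below applies.
    if arr.ndim == 2:
        arr = arr[:, :, None]
    # Step 1: channel conversion, <H, W, C> -> <C, H, W>.
    chw = arr.transpose(2, 0, 1)
    # Step 2: pixel value normalization, [0, 255] -> [0.0, 1.0].
    return (chw / 255.0).astype(output_type)


# A small random uint8 image standing in for a decoded picture (H=4, W=6, C=3).
image = np.random.randint(0, 256, size=(4, 6, 3), dtype=np.uint8)
result = to_tensor_like(image)
print(result.shape, result.dtype)
```

For the 4 x 6 RGB input above, the result has shape (3, 4, 6) and dtype float32, with every value between 0.0 and 1.0.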

Code Exampleimport numpy as np
from PIL import Image
from torchvision import transforms
import mindspore.dataset.vision as vision

# In MindSpore, ToTensor converts a PIL Image into a numpy array.
img_path = "/path/to/test/1.jpg"

img = Image.open(img_path)
to_tensor = vision.ToTensor()
img_data = to_tensor(img)
print("img_data type:", type(img_data))
print("img_data dtype:", img_data.dtype)

# Out:
# img_data type: <class 'numpy.ndarray'>
# img_data dtype: float32

# In PyTorch, ToTensor converts the input into a torch.Tensor.
img_path = "/path/to/test/1.jpg"

image_transform = transforms.Compose([transforms.ToTensor()])
img = np.array(Image.open(img_path))
img_data = image_transform(img)
print(img_data.shape)
# Out:
# torch.Size([3, 2268, 4032])