mindspore.dataset.SubsetSampler

class mindspore.dataset.SubsetSampler(indices, num_samples=None)

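Samples the elements of a dataset at the positions given by a sequence of indices.

Parameters:

indices (Iterable) - A sequence of indices (any iterable Python object except a string).
num_samples (int, optional) - Number of samples to obtain; can be used to retrieve only part of the sampled elements. Default: None, which returns all sampled elements.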

Raises:

TypeError - If the elements of indices are not of type number.
TypeError - If num_samples is not of type int.
ValueError - If num_samples is a negative value.

Examples:

>>> import mindspore.dataset as ds
>>> indices = [0, 1, 2, 3, 4, 5]
>>>
>>> # Placeholder path: replace with a real image folder directory
>>> image_folder_dataset_dir = "/path/to/image_folder_dataset_directory"
>>>
>>> # Create a SubsetSampler that samples from the provided indices
>>> sampler = ds.SubsetSampler(indices)
>>> dataset = ds.ImageFolderDataset(image_folder_dataset_dir,
...                                 num_parallel_workers=8,
...                                 sampler=sampler)
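
The effect of indices and num_samples can be illustrated without MindSpore. The following is a minimal pure-Python sketch, not the library's implementation: subset_sample is a hypothetical helper, and it assumes that indices are integers, that they are visited in the given order, and that num_samples keeps only the first num_samples of them.

```python
import numbers


def subset_sample(indices, num_samples=None):
    """Toy model (hypothetical helper): list the indices a subset sampler visits."""
    if isinstance(indices, str):
        raise TypeError("indices must be a non-string iterable")
    picked = list(indices)
    # Assumption of this sketch: every index is an integer
    if not all(isinstance(i, numbers.Integral) for i in picked):
        raise TypeError("every element of indices must be a number")
    if num_samples is None:
        # Default: keep every sampled index
        return picked
    if not isinstance(num_samples, int):
        raise TypeError("num_samples must be an int")
    if num_samples < 0:
        raise ValueError("num_samples must not be negative")
    # Assumption of this sketch: keep the first num_samples indices, in order
    return picked[:num_samples]


# Stand-in for a dataset with six rows
dataset = ["a", "b", "c", "d", "e", "f"]
visited = subset_sample([4, 0, 2], num_samples=2)
rows = [dataset[i] for i in visited]
print(visited, rows)
```

Under these assumptions, only the rows at positions 4 and 0 are read, in that order.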

add_child(sampler)

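Adds a child sampler to the given sampler. The parent sampler receives the output of the child sampler as its input and applies its own sampling logic to return the new sampling result.

Note

If the child sampler being added has a shuffle attribute, its value must not be Shuffle.PARTIAL, and the shuffle attribute of the parent sampler must be Shuffle.GLOBAL.

Parameters:

sampler (Sampler) - The object used to choose samples from the dataset. Only built-in samplers are supported ( mindspore.dataset.DistributedSampler , mindspore.dataset.PKSampler , mindspore.dataset.RandomSampler , mindspore.dataset.SequentialSampler , mindspore.dataset.SubsetRandomSampler , mindspore.dataset.WeightedRandomSampler ).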

Examples:

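>>> import mindspore.dataset as ds
>>> # Placeholder path: replace with a real CIFAR-10 dataset directory
>>> cifar10_dataset_dir = "/path/to/cifar10_dataset_directory"
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> sampler.add_child(ds.RandomSampler(num_samples=4))
>>> dataset = ds.Cifar10Dataset(cifar10_dataset_dir, sampler=sampler)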
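
The parent/child data flow described above can be modeled in plain Python. This is a conceptual sketch only, not MindSpore code: random_child and sequential_parent are hypothetical stand-ins for a random child sampler and a sequential parent sampler, and the dataset size of 10 and the seed are illustrative assumptions.

```python
import random


def random_child(dataset_size, num_samples, seed=0):
    """Hypothetical child: draw num_samples distinct indices at random."""
    rng = random.Random(seed)
    return rng.sample(range(dataset_size), num_samples)


def sequential_parent(child_output, start_index=0, num_samples=None):
    """Hypothetical parent: walk the child's output sequentially."""
    stop = None if num_samples is None else start_index + num_samples
    return child_output[start_index:stop]


# The child picks 4 indices out of a 10-element dataset (assumed size)
child_ids = random_child(dataset_size=10, num_samples=4)
# The parent takes the child's output as input and keeps the first 3 entries
parent_ids = sequential_parent(child_ids, start_index=0, num_samples=3)
print(child_ids, parent_ids)
```

In this model the parent never sees the dataset directly; it only selects from what the child produced, which is why the result holds three randomly chosen indices rather than 0, 1, 2.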

get_child()

Gets the child sampler of the given sampler.

Returns: Sampler, the child sampler of the given sampler.

Examples:

>>> import mindspore.dataset as ds
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> sampler.add_child(ds.RandomSampler(num_samples=2))
>>> child_sampler = sampler.get_child()
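
Because get_child returns the sampler that was attached with add_child, a chain of samplers can be walked from parent to child. The sketch below uses a hypothetical ToySampler class rather than a MindSpore sampler; returning None when no child is attached is an assumption of this sketch, not a documented guarantee.

```python
class ToySampler:
    """Hypothetical stand-in that mimics only add_child and get_child."""

    def __init__(self, name):
        self.name = name
        self._child = None  # Assumption: no child is represented as None

    def add_child(self, sampler):
        self._child = sampler

    def get_child(self):
        return self._child


def chain_names(sampler):
    """Collect sampler names from the parent down to the last child."""
    names = []
    while sampler is not None:
        names.append(sampler.name)
        sampler = sampler.get_child()
    return names


parent = ToySampler("sequential")
parent.add_child(ToySampler("random"))
order = chain_names(parent)
print(order)
```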