mindspore.dataset.Dataset.filter

Dataset.filter(predicate, input_columns=None, num_parallel_workers=None)

Filter the dataset by a predicate.

Parameters
  • predicate (callable) – Python callable that returns a boolean value. Elements for which it returns False are filtered out.

  • input_columns (Union[str, list[str]], optional) – Name or list of names of the input columns. If None, the predicate is applied to all columns in the dataset. Default: None.

  • num_parallel_workers (int, optional) – Number of workers to process the dataset in parallel. Default: None.

Returns

Dataset, a new dataset with the above operation applied.

Examples

>>> # generator data (0 ~ 19)
>>> # filter out the data greater than or equal to 11
>>> import mindspore.dataset as ds
>>> dataset = ds.GeneratorDataset([i for i in range(20)], "data")
>>> dataset = dataset.filter(predicate=lambda data: data < 11, input_columns=["data"])
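
The keep-or-drop rule described under Parameters can be illustrated without MindSpore installed. The following is a minimal plain-Python sketch of the semantics only, not MindSpore's implementation. The helper filter_rows and the "label" column are hypothetical, and the sketch assumes that each selected column is passed to the predicate as a separate positional argument and that a kept row retains all of its columns.

```python
def filter_rows(rows, predicate, input_columns=None):
    """Hypothetical plain-Python model of row filtering (not MindSpore code)."""
    kept = []
    for row in rows:
        # None means "use every column"; a single name is treated as a one-item list.
        if input_columns is None:
            names = list(row)
        elif isinstance(input_columns, str):
            names = [input_columns]
        else:
            names = list(input_columns)
        # Assumption: one positional argument per selected column.
        args = [row[name] for name in names]
        # Rows for which the predicate returns False are dropped.
        if predicate(*args):
            kept.append(row)
    return kept


# Rows with two columns: "data" (0 ~ 19) and a hypothetical "label" column.
rows = [{"data": i, "label": i % 2} for i in range(20)]

# Same predicate as the example above, applied to the "data" column only.
kept = filter_rows(rows, lambda data: data < 11, input_columns=["data"])

# With input_columns=None, the predicate receives every column.
both = filter_rows(rows, lambda data, label: data < 11 and label == 1)
```

Here kept holds the rows whose "data" value is below 11, and both narrows that further to rows whose "label" is 1. In both cases the predicate decides only whether a row survives; it does not change the row's contents.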