mindspore.dataset.config

The configuration manager.

mindspore.dataset.config.get_monitor_sampling_interval()

Get the default interval of performance monitor sampling.

Returns

Interval (ms) of performance monitor sampling.

Return type

int

mindspore.dataset.config.get_num_parallel_workers()

Get the default number of parallel workers.

Returns

int, number of parallel workers to be used as a default for each operation.

mindspore.dataset.config.get_prefetch_size()

Get the prefetch size in number of rows.

Returns

int, total number of rows to be prefetched.

mindspore.dataset.config.get_seed()

Get the seed.

Returns

int, the seed.

mindspore.dataset.config.load(file)

Load configuration from a file.

Parameters

file (str) – path to the config file to be loaded.

Raises

RuntimeError – If file is invalid and parsing fails.

Examples

>>> import mindspore.dataset as ds
>>> # Set the default values according to the values in the configuration file.
>>> ds.config.load("path/to/config/file")
>>> # example config file:
>>> # {
>>> #     "logFilePath": "/tmp",
>>> #     "rowsPerBuffer": 32,
>>> #     "numParallelWorkers": 4,
>>> #     "workerConnectorSize": 16,
>>> #     "opConnectorSize": 16,
>>> #     "seed": 5489,
>>> #     "monitorSamplingInterval": 30
>>> # }
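
The config file in the example above can be produced programmatically. The following is a minimal sketch, assuming the file is plain JSON as the example suggests. The helper `write_config` and the file name `dataset_config.json` are hypothetical, and the call to `ds.config.load` is left as a comment so the sketch runs without MindSpore installed.

```python
import json
import os
import tempfile

# Keys and values copied from the example config file above.
config = {
    "logFilePath": "/tmp",
    "rowsPerBuffer": 32,
    "numParallelWorkers": 4,
    "workerConnectorSize": 16,
    "opConnectorSize": 16,
    "seed": 5489,
    "monitorSamplingInterval": 30,
}


def write_config(settings, directory):
    """Write settings as JSON and return the file path (hypothetical helper)."""
    path = os.path.join(directory, "dataset_config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(settings, f, indent=4)
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = write_config(config, tmp_dir)
    # Read the file back to confirm it is well-formed JSON before loading it.
    with open(config_path, encoding="utf-8") as f:
        round_tripped = json.load(f)
    # With MindSpore installed, the file could then be loaded with:
    #     import mindspore.dataset as ds
    #     ds.config.load(config_path)
```

Reading the file back before loading it is a cheap way to catch a malformed file early, since load raises RuntimeError when parsing fails.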

mindspore.dataset.config.set_monitor_sampling_interval(interval)

Set the default interval (ms) of performance monitor sampling.

Parameters

interval (int) – interval (ms) to be used for performance monitor sampling.

Raises

ValueError – If interval is invalid (<= 0 or > MAX_INT_32).

Examples

>>> import mindspore.dataset as ds
>>> # sets the new interval value.
>>> ds.config.set_monitor_sampling_interval(100)
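
The range rule stated under Raises (<= 0 or > MAX_INT_32) can be checked before calling a setter. The sketch below is an illustration only, not the library's own validation code. It assumes MAX_INT_32 is the largest signed 32-bit integer, and `check_positive_int32` and `is_valid` are hypothetical helpers.

```python
# Assumption: MAX_INT_32 is the largest signed 32-bit integer.
MAX_INT_32 = 2**31 - 1


def check_positive_int32(value, name):
    """Hypothetical helper: reject values that are <= 0 or > MAX_INT_32."""
    if value <= 0 or value > MAX_INT_32:
        raise ValueError(f"{name} must be in (0, {MAX_INT_32}], got {value}")
    return value


def is_valid(value):
    """Return True if the value passes the documented range rule."""
    try:
        check_positive_int32(value, "interval")
        return True
    except ValueError:
        return False


# Probe both boundaries of the allowed range.
results = {v: is_valid(v) for v in (100, 0, -5, MAX_INT_32, MAX_INT_32 + 1)}
```

The same rule is documented for set_num_parallel_workers and set_prefetch_size. set_seed differs: it accepts 0 and uses MAX_UINT_32 as its upper bound.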

mindspore.dataset.config.set_num_parallel_workers(num)

Set the default number of parallel workers.

Parameters

num (int) – number of parallel workers to be used as a default for each operation.

Raises

ValueError – If num is invalid (<= 0 or > MAX_INT_32).

Examples

>>> import mindspore.dataset as ds
>>> # Set the new default; parallel dataset operators will now run with 8 workers.
>>> ds.config.set_num_parallel_workers(8)

mindspore.dataset.config.set_prefetch_size(size)

Set the number of rows to be prefetched.

Parameters

size (int) – total number of rows to be prefetched.

Raises

ValueError – If size is invalid (<= 0 or > MAX_INT_32).

Examples

>>> import mindspore.dataset as ds
>>> # sets the new prefetch value.
>>> ds.config.set_prefetch_size(1000)

mindspore.dataset.config.set_seed(seed)

Set the seed to be used in any random generator. This is used to produce deterministic results.

Note

This function sets the seed in Python's random library and in the numpy.random library, so that Python augmentations that use randomness are deterministic. Call set_seed each time an iterator is created to reset the random seed. Deterministic results are not guaranteed in the pipeline when num_parallel_workers > 1.

Parameters

seed (int) – seed to be set.

Raises

ValueError – If seed is invalid (< 0 or > MAX_UINT_32).

Examples

>>> import mindspore.dataset as ds
>>> # Set the new seed value; operators that use a random seed will now use it.
>>> ds.config.set_seed(1000)
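
The advice in the Note to reset the seed for every iterator can be illustrated with Python's random module alone. The sketch below does not call MindSpore. It only shows the general behavior of a seeded generator, and the helper `draw` is hypothetical.

```python
import random


def draw(seed, count=5):
    """Seed Python's random module, then draw `count` integers."""
    random.seed(seed)
    return [random.randint(0, 99) for _ in range(count)]


first_pass = draw(1000)
second_pass = draw(1000)  # re-seeded, so the sequence repeats

# Without re-seeding, the generator continues from its current state
# instead of starting over.
continued = [random.randint(0, 99) for _ in range(5)]
```

Here `first_pass` and `second_pass` are identical because the generator is re-seeded before each pass, whereas `continued` picks up where the second pass left off.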