mindspore.dataset.config.set_prefetch_size

mindspore.dataset.config.set_prefetch_size(size)

Set the buffer queue size between dataset operations in the pipeline.

The buffer queue lets an operation continue processing subsequent data before the next operation has fetched its earlier output, so adjacent operations can execute asynchronously and concurrently.

A larger buffer queue size reduces the overall processing latency when neighboring operations have unbalanced throughput rates, but also consumes more system memory.
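The role of the buffer queue can be illustrated with a minimal, standalone sketch in plain Python. It does not use MindSpore and is not how the library implements its pipeline. The function name `run_pipeline` and the two toy stages are hypothetical, chosen only to show a bounded queue decoupling a producer stage from a consumer stage.

```python
import queue
import threading


def run_pipeline(items, prefetch_size):
    """Pass items from a producer stage to a consumer stage via a bounded queue."""
    # Bounded buffer: put() blocks once prefetch_size items are waiting.
    buffer = queue.Queue(maxsize=prefetch_size)
    sentinel = object()  # marks the end of the stream
    results = []

    def producer():
        # Upstream operation: may run ahead of the consumer by up to
        # prefetch_size items before it has to wait.
        for item in items:
            buffer.put(item * 2)
        buffer.put(sentinel)

    def consumer():
        # Downstream operation: fetches items as they become available.
        while True:
            item = buffer.get()
            if item is sentinel:
                break
            results.append(item + 1)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline(range(5), prefetch_size=2))  # [1, 3, 5, 7, 9]
```

With a larger `prefetch_size`, the producer blocks less often when the consumer is temporarily slower, at the cost of holding more items in memory at once.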

Parameters

size (int) – The size of the buffer queue. Must be greater than 0.

Raises

TypeError – If size is not of type int.
ValueError – If size is not greater than 0.

Note

The total memory consumed by the buffer queue is proportional to the number of worker threads. To avoid overuse of memory, when the number of worker threads is greater than 4, the actual buffer queue size used will be adjusted to the greater of (size * 4 / number of worker threads) and 1.
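The adjustment rule in this note can be worked through with a small helper. This is a sketch only: `effective_prefetch_size` is a hypothetical name, not a MindSpore function, and the use of integer (floor) division is an assumption, since the note does not say how the quotient is rounded.

```python
def effective_prefetch_size(size, num_workers):
    """Return the buffer queue size actually used, following the rule in the note.

    Assumption: the quotient is truncated with integer division.
    """
    if num_workers > 4:
        # Scale the queue down as workers increase, but never below 1.
        return max(size * 4 // num_workers, 1)
    # With 4 or fewer workers the configured size is used unchanged.
    return size


print(effective_prefetch_size(16, 4))   # 16
print(effective_prefetch_size(16, 8))   # 8
print(effective_prefetch_size(1, 8))    # 1
```

Under this reading, doubling the worker count from 4 to 8 halves the per-queue size, so the total memory used by the queues stays roughly constant.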

Examples

>>> # Set a new global configuration value for the prefetch size.
>>> import mindspore.dataset as ds
>>> ds.config.set_prefetch_size(1000)