Class BucketBatchByLengthDataset
- Defined in File datasets.h 
Inheritance Relationships
Base Type
- public mindspore::dataset::Dataset (Class Dataset)
Class Documentation
- 
class BucketBatchByLengthDataset : public mindspore::dataset::Dataset
- The result of applying the BucketBatchByLength operator to the input dataset.
Public Functions
- Constructor of BucketBatchByLengthDataset.
Note
- Buckets elements according to their lengths. Each bucket is padded and batched when it is full.
Parameters
- input – [in] The dataset to which the bucket-batch-by-length operation is applied. 
- column_names – [in] Columns passed to element_length_function. 
- bucket_boundaries – [in] A list of the upper boundaries of the buckets. Must be strictly increasing. If there are n boundaries, n+1 buckets are created: one bucket for [0, bucket_boundaries[0]), one bucket for [bucket_boundaries[i], bucket_boundaries[i+1]) for each 0<=i<n-1, and one bucket for [bucket_boundaries[n-1], inf). 
- bucket_batch_sizes – [in] A list of the batch sizes for each bucket. Must contain exactly one more element than bucket_boundaries. 
- element_length_function – [in] A function pointer that takes an MSTensorVec and returns an MSTensorVec. The output must contain a single tensor holding a single int32_t. If no value is provided, the size of column_names must be 1, and the size of the first dimension of that column is taken as the length (default=nullptr). 
- pad_info – [in] Represents how to pad each column when batching. The key is the column name, and the value must be a tuple of 2 elements. The first element is the shape to pad to, and the second element is the value to pad with. If a column is not specified, that column is padded to the longest element in the current batch, with 0 as the padding value. Any unspecified dimensions are padded to the longest in the current batch, unless pad_to_bucket_boundary is true. If no padding is wanted, set pad_info to None (default=empty dictionary). 
- pad_to_bucket_boundary – [in] If true, pads each unspecified dimension in pad_info to the bucket boundary minus 1. If any element falls into the last bucket, an error occurs (default=false). 
- drop_remainder – [in] If true, will drop the last batch for each bucket if it is not a full batch (default=false). 
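The bucketing rules described by bucket_boundaries, bucket_batch_sizes, and pad_to_bucket_boundary can be illustrated with a small standalone sketch. This is not MindSpore's implementation and does not call the MindSpore API. The helper names BucketIndex, ValidBucketSpec, and PadLengthToBoundary are hypothetical and exist only for illustration. The sketch assumes that an element whose length equals a boundary belongs to the bucket above it, as the half-open intervals in the parameter description imply.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of MindSpore): index of the bucket that an
// element of the given length falls into. With n boundaries b[0..n-1]:
//   bucket 0 is [0, b[0]), bucket i is [b[i-1], b[i]), bucket n is [b[n-1], inf).
std::size_t BucketIndex(const std::vector<int32_t> &boundaries, int32_t length) {
  // upper_bound returns the first boundary strictly greater than length,
  // so a length equal to a boundary lands in the next bucket up.
  auto it = std::upper_bound(boundaries.begin(), boundaries.end(), length);
  return static_cast<std::size_t>(it - boundaries.begin());
}

// Hypothetical check of the documented constraints: boundaries must be
// strictly increasing, and there must be one batch size per bucket (n + 1).
bool ValidBucketSpec(const std::vector<int32_t> &boundaries,
                     const std::vector<int32_t> &batch_sizes) {
  // adjacent_find with greater_equal locates the first pair where the left
  // value is >= the right value; finding none means strictly increasing.
  bool strictly_increasing =
      std::adjacent_find(boundaries.begin(), boundaries.end(),
                         std::greater_equal<int32_t>()) == boundaries.end();
  return strictly_increasing && batch_sizes.size() == boundaries.size() + 1;
}

// Hypothetical helper: target length when pad_to_bucket_boundary is true,
// i.e. the bucket's upper boundary minus 1. The last bucket has no upper
// boundary, which is why elements falling into it produce an error.
int32_t PadLengthToBoundary(const std::vector<int32_t> &boundaries,
                            std::size_t bucket) {
  if (bucket >= boundaries.size()) {
    throw std::out_of_range("last bucket has no upper boundary to pad to");
  }
  return boundaries[bucket] - 1;
}
```

With boundaries {5, 10, 15}, this sketch places lengths 0 to 4 in bucket 0, lengths 5 to 9 in bucket 1, lengths 10 to 14 in bucket 2, and everything from 15 upward in bucket 3, so four batch sizes are required.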
 
 
 - 
~BucketBatchByLengthDataset() override = default
- Destructor of BucketBatchByLengthDataset.