mindspore.dataset.text.Truncate

class mindspore.dataset.text.Truncate(max_seq_len)

Truncate the input sequence so that it does not exceed the maximum length.

Parameters

max_seq_len (int) – Maximum allowable length of the sequence. Must be greater than or equal to 0.

Raises
  • TypeError – If max_seq_len is not of type int.

  • ValueError – If max_seq_len is less than 0.

  • RuntimeError – If the input tensor is not of dtype bool, int, float, double or str.
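
The behavior described above can be illustrated with a small pure-Python sketch. This is not MindSpore's implementation: `truncate_sketch` is a hypothetical helper that only mirrors the documented argument checks and the "keep at most max_seq_len elements" rule on a plain list. It does not model the dtype check on the input tensor.

```python
def truncate_sketch(sequence, max_seq_len):
    """Hypothetical stand-in for the documented Truncate semantics (not the MindSpore code)."""
    # Mirror the documented TypeError: max_seq_len must be an int.
    if not isinstance(max_seq_len, int):
        raise TypeError("max_seq_len must be of type int")
    # Mirror the documented ValueError: max_seq_len must be >= 0.
    if max_seq_len < 0:
        raise ValueError("max_seq_len must be greater than or equal to 0")
    # Keep the first max_seq_len elements; shorter inputs pass through unchanged.
    return list(sequence[:max_seq_len])


print(truncate_sketch(["a", "b", "c", "d", "e"], 4))  # ['a', 'b', 'c', 'd']
print(truncate_sketch(["to", "you"], 4))              # ['to', 'you']
```

Under this sketch, a sequence already within the limit is returned as-is, so only over-length inputs are shortened.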

Supported Platforms:

CPU

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.text as text
>>>
>>> # Use the transform in dataset pipeline mode
>>> numpy_slices_dataset = ds.NumpySlicesDataset(data=[['a', 'b', 'c', 'd', 'e']], column_names=["text"],
...                                              shuffle=False)
>>> # Data before
>>> # |           text            |
>>> # +---------------------------+
>>> # | ['a', 'b', 'c', 'd', 'e'] |
>>> # +---------------------------+
>>> truncate = text.Truncate(4)
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=truncate, input_columns=["text"])
>>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["text"])
['a' 'b' 'c' 'd']
>>> # Data after
>>> # |          text          |
>>> # +------------------------+
>>> # |  ['a', 'b', 'c', 'd']  |
>>> # +------------------------+
>>>
>>> # Use the transform in eager mode
>>> data = ["happy", "birthday", "to", "you"]
>>> output = text.Truncate(2)(data)
>>> print(output)
['happy' 'birthday']