mindspore.dataset.text.TruncateSequencePair

class mindspore.dataset.text.TruncateSequencePair(max_length)

Truncate a pair of 1-D string inputs so that their total length does not exceed the specified maximum length.

Parameters: max_length (int) – The maximum total length of the output strings. If it is greater than or equal to the total length of the original pair of strings, no truncation is performed; otherwise, the longer of the two input strings is truncated until the total length of the pair equals this value.
Raises: TypeError – If max_length is not of type int.
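
The truncation rule described above can be sketched in plain Python. This is only an illustration and not the MindSpore implementation: the function name truncate_pair_sketch is hypothetical, and the choice to shorten the second sequence when both have equal length is an assumption that the description above does not specify.

```python
def truncate_pair_sketch(first, second, max_length):
    """Hypothetical illustration of the truncation rule; not MindSpore code."""
    if not isinstance(max_length, int):
        raise TypeError("max_length must be of type int")

    len_first, len_second = len(first), len(second)

    # Drop one trailing element at a time from the longer sequence
    # until the combined length fits within max_length.
    # Assumption: when both lengths are equal, the second one is shortened.
    while len_first + len_second > max_length and len_first + len_second > 0:
        if len_first > len_second:
            len_first -= 1
        else:
            len_second -= 1

    return list(first[:len_first]), list(second[:len_second])


# Same inputs as the eager-mode example in this page.
print(truncate_pair_sketch(["1", "2", "3"], ["4", "5"], 4))
# (['1', '2'], ['4', '5'])
```

With a max_length of 5 or more, the same inputs are returned unchanged, since their combined length is already within the limit.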

Supported Platforms: CPU

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.text as text
>>>
>>> # Use the transform in dataset pipeline mode
>>> numpy_slices_dataset = ds.NumpySlicesDataset(data=([[1, 2, 3]], [[4, 5]]), column_names=["col1", "col2"])
>>> # Data before
>>> # |   col1    |   col2    |
>>> # +-----------+-----------+
>>> # | [1, 2, 3] |  [4, 5]   |
>>> # +-----------+-----------+
>>> truncate_sequence_pair_op = text.TruncateSequencePair(max_length=4)
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=truncate_sequence_pair_op,
...                                                 input_columns=["col1", "col2"])
>>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["col1"], item["col2"])
[1 2] [4 5]
>>> # Data after
>>> # |   col1    |   col2    |
>>> # +-----------+-----------+
>>> # |  [1, 2]   |  [4, 5]   |
>>> # +-----------+-----------+
>>>
>>> # Use the transform in eager mode
>>> data = [["1", "2", "3"], ["4", "5"]]
>>> output = text.TruncateSequencePair(4)(*data)
>>> print(output)
(array(['1', '2'], dtype='<U1'), array(['4', '5'], dtype='<U1'))

Tutorial Examples:

Illustration of text transforms