mindspore.dataset.text.CaseFold

class mindspore.dataset.text.CaseFold

Apply the case fold operation to a UTF-8 string tensor. Case folding is more aggressive than str.lower and converts more characters to lower case. For supported normalization forms, please refer to ICU_Normalizer2.
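As a rough illustration of why case folding is described as more aggressive than lowercasing, the sketch below compares Python's built-in str.lower with str.casefold. It assumes str.casefold is a reasonable stand-in for the general idea; it does not call MindSpore, and the helper name compare is hypothetical. The output of text.CaseFold may differ on some characters.

```python
# Plain-Python comparison of lowercasing and case folding.
# Assumption: str.casefold() only illustrates the concept behind
# text.CaseFold; it is not the MindSpore implementation.

def compare(text):
    """Return the (lowercased, case-folded) forms of text."""
    return text.lower(), text.casefold()

for sample in ["Welcome To BeiJing!", "Straße"]:
    lowered, folded = compare(sample)
    # For plain ASCII both forms agree; for the German sharp s they differ,
    # because case folding expands it to "ss" while lower() leaves it as is.
    print(repr(sample), repr(lowered), repr(folded), lowered == folded)
```

For ASCII-only input the two results match, which is why the example strings in this page look the same as they would after str.lower.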

Note

CaseFold is not yet supported on the Windows platform.

Supported Platforms:

CPU

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.text as text
>>>
>>> # Use the transform in dataset pipeline mode
>>> numpy_slices_dataset = ds.NumpySlicesDataset(data=['Welcome     To   BeiJing!'], column_names=["text"])
>>> case_op = text.CaseFold()
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=case_op)
>>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["text"])
welcome     to   beijing!
>>>
>>> # Use the transform in eager mode
>>> data = 'Welcome     To   BeiJing!'
>>> output = text.CaseFold()(data)
>>> print(output)
welcome     to   beijing!