mindspore.dataset.text.ToVectors
- class mindspore.dataset.text.ToVectors(vectors, unk_init=None, lower_case_backup=False)
Look up a token and return its vector according to the input vector table.
- Parameters
vectors (Vectors) – A vectors object.
unk_init (sequence, optional) – Sequence used to initialize the vector for out-of-vocabulary (OOV) tokens. Default: None, which means OOV tokens are initialized with zero vectors.
lower_case_backup (bool, optional) – Whether to fall back to looking up tokens in lower case. If False, each token is looked up in its original case only. If True, each token is looked up in its original case first; if it is not found in the keys of the property stoi, its lower-case form is looked up. Default: False.
- Supported Platforms:
CPU
Examples
>>> import mindspore.dataset as ds
>>> import mindspore.dataset.text as text
>>>
>>> # Use the transform in dataset pipeline mode
>>> numpy_slices_dataset = ds.NumpySlicesDataset(data=["happy", "birthday", "to", "you"], column_names=["text"])
>>> # Load vectors from file
>>> # The vectors file can be downloaded directly from the mindspore repository. Refer to
>>> # https://atomgit.com/mindspore/mindspore/blob/master/tests/ut/data/dataset/testVectors/vectors.txt
>>> vectors_file = "tests/ut/data/dataset/testVectors/vectors.txt"
>>> vectors = text.Vectors.from_file(vectors_file)
>>> # Use ToVectors operation to map tokens to vectors
>>> to_vectors = text.ToVectors(vectors)
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=[to_vectors])
>>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["text"])
...     break
[0. 0. 0. 0. 0. 0.]
>>>
>>> # Use the transform in eager mode
>>> data = ["happy"]
>>> output = text.ToVectors(vectors)(data)
>>> print(output)
[0. 0. 0. 0. 0. 0.]
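The lookup order described under Parameters can be easier to follow as plain Python. The sketch below is only an illustration of those rules and is not MindSpore's implementation: the function lookup_vector, its dim argument, and the two-entry table are hypothetical names introduced here, and a plain dict stands in for the stoi mapping of a Vectors object.

```python
from typing import Dict, List, Optional, Sequence


def lookup_vector(
    token: str,
    table: Dict[str, List[float]],
    dim: int,
    unk_init: Optional[Sequence[float]] = None,
    lower_case_backup: bool = False,
) -> List[float]:
    """Hypothetical stand-in for the documented lookup rules (not MindSpore code)."""
    # The token in its original case is always tried first.
    if token in table:
        return list(table[token])
    # With lower_case_backup=True, fall back to the lower-cased token.
    if lower_case_backup and token.lower() in table:
        return list(table[token.lower()])
    # Out-of-vocabulary: use unk_init if given, otherwise a zero vector.
    if unk_init is None:
        return [0.0] * dim
    return list(unk_init)


# Hypothetical table with 3-dimensional vectors (assumed data, for illustration only).
table = {"ok": [0.1, 0.2, 0.3], "Home": [0.4, 0.5, 0.6]}

exact = lookup_vector("ok", table, dim=3)
no_backup = lookup_vector("OK", table, dim=3)
with_backup = lookup_vector("OK", table, dim=3, lower_case_backup=True)
custom_unk = lookup_vector("none", table, dim=3, unk_init=[-1.0, -1.0, -1.0])
```

In this sketch, "OK" resolves to a zero vector unless lower_case_backup is True, in which case the entry for "ok" is returned. Note that the fallback only lower-cases the query token, so "home" would still miss the mixed-case key "Home".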