Class Ngram
- Defined in File text.h 
Inheritance Relationships
Base Type
- public mindspore::dataset::TensorTransform (Class TensorTransform)
Class Documentation
- 
class Ngram : public mindspore::dataset::TensorTransform
- Generates n-grams from a 1-D string Tensor.
Public Functions
- 
inline explicit Ngram(const std::vector<int32_t> &ngrams, const std::pair<std::string, int32_t> &left_pad = {"", 0}, const std::pair<std::string, int32_t> &right_pad = {"", 0}, const std::string &separator = " ")
- Constructor.
- Parameters
- ngrams – [in] A vector of positive integers. For example, if ngrams={4, 3}, the result is the 4-grams followed by the 3-grams in the same tensor. If there are not enough words to make up an n-gram, an empty string is returned.
- left_pad – [in] {"pad_token", pad_width}. Padding applied to the left side of the sequence. pad_width is capped at n-1. For example, left_pad={"_", 2} pads the left side of the sequence with "__" (default={"", 0}).
- right_pad – [in] {"pad_token", pad_width}. Padding applied to the right side of the sequence. pad_width is capped at n-1. For example, right_pad={"-", 2} pads the right side of the sequence with "--" (default={"", 0}).
- separator – [in] Symbol used to join the strings together (default=" ").
 Example
    /* Define operations */
    auto ngram_op = text::Ngram({2, 3}, {"&", 2}, {"&", 2}, "-");

    /* dataset is an instance of Dataset object */
    dataset = dataset->Map({ngram_op},  // operations
                           {"text"});   // input columns
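The parameter descriptions above can be made concrete with a small standalone sketch. It does not call MindSpore and is not its implementation: `NgramSketch` is a hypothetical helper that follows one reading of the documented rules (pad_width capped at n-1 per n, tokens joined with the separator, a single empty string when the sequence is too short). The real operator's behavior may differ in edge cases.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of MindSpore): builds n-grams from a list of
// words according to the parameter descriptions in this page.
std::vector<std::string> NgramSketch(
    const std::vector<std::string> &words,
    const std::vector<int32_t> &ngrams,
    const std::pair<std::string, int32_t> &left_pad = {"", 0},
    const std::pair<std::string, int32_t> &right_pad = {"", 0},
    const std::string &separator = " ") {
  std::vector<std::string> out;
  for (int32_t n : ngrams) {
    if (n <= 0) continue;  // the documented contract requires positive integers

    // Cap each pad width at n-1, as the parameter descriptions state.
    const int32_t lw = std::min<int32_t>(std::max<int32_t>(left_pad.second, 0), n - 1);
    const int32_t rw = std::min<int32_t>(std::max<int32_t>(right_pad.second, 0), n - 1);

    // Padded sequence: lw pad tokens, the words, then rw pad tokens.
    std::vector<std::string> seq(static_cast<std::size_t>(lw), left_pad.first);
    seq.insert(seq.end(), words.begin(), words.end());
    seq.insert(seq.end(), static_cast<std::size_t>(rw), right_pad.first);

    // Too few tokens for an n-gram: emit one empty string.
    if (seq.size() < static_cast<std::size_t>(n)) {
      out.emplace_back("");
      continue;
    }

    // Slide a window of n tokens over the padded sequence.
    for (std::size_t i = 0; i + static_cast<std::size_t>(n) <= seq.size(); ++i) {
      std::string gram = seq[i];
      for (int32_t j = 1; j < n; ++j) {
        gram += separator + seq[i + static_cast<std::size_t>(j)];
      }
      out.push_back(gram);
    }
  }
  return out;
}
```

Under these assumptions, calling `NgramSketch({"a", "b", "c"}, {2}, {"_", 2}, {"-", 2})` pads with only one token on each side, because the width is capped at n-1 = 1, and yields "_ a", "a b", "b c", "c -".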
 
 - 
Ngram(const std::vector<int32_t> &ngrams, const std::pair<std::vector<char>, int32_t> &left_pad, const std::pair<std::vector<char>, int32_t> &right_pad, const std::vector<char> &separator)
- Constructor.
- Parameters
- ngrams – [in] A vector of positive integers. For example, if ngrams={4, 3}, the result is the 4-grams followed by the 3-grams in the same tensor. If there are not enough words to make up an n-gram, an empty string is returned.
- left_pad – [in] {"pad_token", pad_width}. Padding applied to the left side of the sequence. pad_width is capped at n-1. For example, left_pad={"_", 2} pads the left side of the sequence with "__" (default={"", 0}).
- right_pad – [in] {"pad_token", pad_width}. Padding applied to the right side of the sequence. pad_width is capped at n-1. For example, right_pad={"-", 2} pads the right side of the sequence with "--" (default={"", 0}).
- separator – [in] Symbol used to join the strings together (default=" ").
 
 
 - 
~Ngram() override = default
- Destructor. 
 