Class TedliumDataset

Inheritance Relationships

Base Type

  • public mindspore::dataset::Dataset

Class Documentation

class TedliumDataset : public mindspore::dataset::Dataset

A source dataset for reading and parsing the TEDLIUM dataset.

Public Functions

TedliumDataset(const std::vector<char> &dataset_dir, const std::vector<char> &release, const std::vector<char> &usage, const std::vector<char> &extensions, const std::shared_ptr<Sampler> &sampler, const std::shared_ptr<DatasetCache> &cache)

Constructor of TedliumDataset.

Parameters
  • dataset_dir[in] Path to the root directory that contains the dataset.

  • release[in] Release of the dataset. Can be “release1”, “release2” or “release3”.

  • usage[in] Part of the TEDLIUM dataset to read. For release3, only “all” is allowed; for release1 and release2, it can be “train”, “test” or “all”.

  • extensions[in] Extension of the audio files. Only “.sph” is currently supported.

  • sampler[in] Shared pointer to a sampler object used to choose samples from the dataset. If sampler is not given, a RandomSampler will be used to randomly iterate the entire dataset.

  • cache[in] Tensor cache to use.
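The constraints on release, usage and extensions listed above can be sketched as a small standalone check. This is a minimal illustration only: `IsValidTedliumConfig` is a hypothetical helper, not part of MindSpore, and it encodes only the combinations documented on this page. The library performs its own validation, which may differ.

```cpp
#include <string>

// Hypothetical helper (not a MindSpore API): returns true when the
// release/usage/extensions combination matches the rules documented above.
bool IsValidTedliumConfig(const std::string &release, const std::string &usage,
                          const std::string &extensions) {
  // Only ".sph" audio files are documented as supported.
  if (extensions != ".sph") {
    return false;
  }
  // release3 is documented as accepting only "all".
  if (release == "release3") {
    return usage == "all";
  }
  // release1 and release2 are documented as accepting "train", "test" or "all".
  if (release == "release1" || release == "release2") {
    return usage == "train" || usage == "test" || usage == "all";
  }
  // Any other release string is not listed in the documentation.
  return false;
}
```

A check like this could run before constructing the dataset so that an unsupported combination, for example release3 with “train”, is rejected early.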

TedliumDataset(const std::vector<char> &dataset_dir, const std::vector<char> &release, const std::vector<char> &usage, const std::vector<char> &extensions, const Sampler *sampler, const std::shared_ptr<DatasetCache> &cache)

Constructor of TedliumDataset.

Parameters
  • dataset_dir[in] Path to the root directory that contains the dataset.

  • release[in] Release of the dataset. Can be “release1”, “release2” or “release3”.

  • usage[in] Part of the TEDLIUM dataset to read. For release3, only “all” is allowed; for release1 and release2, it can be “train”, “test” or “all”.

  • extensions[in] Extension of the audio files. Only “.sph” is currently supported.

  • sampler[in] Raw pointer to a sampler object used to choose samples from the dataset.

  • cache[in] Tensor cache to use.

TedliumDataset(const std::vector<char> &dataset_dir, const std::vector<char> &release, const std::vector<char> &usage, const std::vector<char> &extensions, const std::reference_wrapper<Sampler> &sampler, const std::shared_ptr<DatasetCache> &cache)

Constructor of TedliumDataset.

Parameters
  • dataset_dir[in] Path to the root directory that contains the dataset.

  • release[in] Release of the dataset. Can be “release1”, “release2” or “release3”.

  • usage[in] Part of the TEDLIUM dataset to read. For release3, only “all” is allowed; for release1 and release2, it can be “train”, “test” or “all”.

  • extensions[in] Extension of the audio files. Only “.sph” is currently supported.

  • sampler[in] Sampler object used to choose samples from the dataset.

  • cache[in] Tensor cache to use.
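All three constructors take their string-like arguments (dataset_dir, release, usage, extensions) as `std::vector<char>` rather than `std::string`. The sketch below shows one way such a conversion could be written. `ToCharVector` and `ToStdString` are hypothetical names introduced here for illustration; MindSpore may provide its own conversion helpers, which should be preferred when available.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a MindSpore API): copies the characters of a
// std::string into a std::vector<char>, without a trailing null terminator.
std::vector<char> ToCharVector(const std::string &s) {
  return std::vector<char>(s.begin(), s.end());
}

// Hypothetical helper (not a MindSpore API): the inverse conversion,
// rebuilding a std::string from the character vector.
std::string ToStdString(const std::vector<char> &c) {
  return std::string(c.begin(), c.end());
}
```

With helpers of this kind, a value such as “release1” could be passed as `ToCharVector("release1")` to any of the constructors above.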

~TedliumDataset() override = default

Destructor of TedliumDataset.