mindspore.dataset.transforms

General

This module supports common data augmentations. Some operations are implemented in C++ for high performance; others are implemented in Python, including with NumPy.

Common imported modules in corresponding API examples are as follows:

import mindspore.dataset as ds
import mindspore.dataset.transforms as transforms

Note: Legacy c_transforms and py_transforms are deprecated but can still be imported as follows:

from mindspore.dataset.transforms import c_transforms
from mindspore.dataset.transforms import py_transforms

See Common Transforms tutorial for more details.

Descriptions of common data processing terms are as follows:

  • TensorOperation, the base class of all data processing operations implemented in C++.

  • PyTensorOperation, the base class of all data processing operations implemented in Python.

Note: In eager mode, non-NumPy input is implicitly converted to NumPy format and sent to MindSpore.

Transforms

mindspore.dataset.transforms.Compose

Compose a list of transforms into a single transform.
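Conceptually, composing transforms means feeding each transform's output into the next one, in list order. The plain-Python sketch below illustrates only that idea; `compose_sketch` and the two toy transforms are hypothetical names, not part of MindSpore.

```python
from functools import reduce


def compose_sketch(transforms):
    """Return one callable that applies each transform in list order."""
    def composed(x):
        return reduce(lambda value, fn: fn(value), transforms, x)
    return composed


# Two toy transforms standing in for real data processing operations.
def add_one(x):
    return x + 1


def double(x):
    return x * 2


pipeline = compose_sketch([add_one, double])
result = pipeline(3)  # add_one runs first, then double
print(result)
```

Order matters here: swapping the two toy transforms would give a different result.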

mindspore.dataset.transforms.Concatenate

Tensor operation that concatenates all columns into a single tensor. Only 1-D tensors are supported.

mindspore.dataset.transforms.Duplicate

Duplicate the input tensor to output. Only one column can be transformed at a time.

mindspore.dataset.transforms.Fill

Tensor operation to fill all elements in the tensor with the specified value.

mindspore.dataset.transforms.Mask

Mask content of the input tensor with the given predicate.
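As a rough illustration of masking with a predicate, the sketch below compares each element against a constant and returns booleans. It is plain Python, not the MindSpore implementation; `mask_sketch` and the `RELATIONS` table are hypothetical, with keys chosen to mirror common relational operator names.

```python
import operator

# Hypothetical lookup table from relation names to comparison functions.
RELATIONS = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "GT": operator.gt,
    "GE": operator.ge,
    "LT": operator.lt,
    "LE": operator.le,
}


def mask_sketch(values, relation, constant):
    """Return True where `value <relation> constant` holds, else False."""
    compare = RELATIONS[relation]
    return [compare(v, constant) for v in values]


mask = mask_sketch([1, 2, 3], "GE", 2)
print(mask)
```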

mindspore.dataset.transforms.OneHot

Tensor operation to apply one hot encoding.
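One-hot encoding maps an integer class label to a vector with a single 1 at the label's index. A minimal plain-Python sketch of that mapping follows; `one_hot_sketch` is a hypothetical helper used only for illustration.

```python
def one_hot_sketch(label, num_classes):
    """Return a list of length num_classes with a 1 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    return [1 if i == label else 0 for i in range(num_classes)]


encoded = one_hot_sketch(2, 5)
print(encoded)
```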

mindspore.dataset.transforms.PadEnd

Pad input tensor according to pad_shape, input tensor needs to have same rank.
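For a 1-D input, padding at the end can be pictured as below. This is a plain-Python sketch under two assumptions: padding values are appended after the existing elements, and a target length shorter than the input truncates it. `pad_end_1d_sketch` is a hypothetical helper, not the library's code.

```python
def pad_end_1d_sketch(values, length, pad_value=0):
    """Pad (or truncate) a 1-D sequence to exactly `length` elements."""
    trimmed = list(values[:length])  # assumption: longer inputs are cut
    return trimmed + [pad_value] * (length - len(trimmed))


padded = pad_end_1d_sketch([1, 2, 3], 5, pad_value=-1)
print(padded)
```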

mindspore.dataset.transforms.RandomApply

Randomly perform a series of transforms with a given probability.

mindspore.dataset.transforms.RandomChoice

Randomly select one transform from a list of transforms to perform operation.

mindspore.dataset.transforms.RandomOrder

Perform a series of transforms to the input image in a random order.

mindspore.dataset.transforms.Slice

Slice operation to extract a tensor out using the given n slices.

mindspore.dataset.transforms.TypeCast

Tensor operation to cast to a given MindSpore data type or NumPy data type.

mindspore.dataset.transforms.Unique

Perform the unique operation on the input tensor. Only one column can be transformed at a time.

Utilities

mindspore.dataset.transforms.Relational

Relationship operator.

Vision

This module supports vision augmentations. Some image augmentations are implemented with C++ OpenCV for high performance; others are developed with Python PIL.

Common imported modules in corresponding API examples are as follows:

import mindspore.dataset as ds
import mindspore.dataset.vision as vision
import mindspore.dataset.vision.utils as utils

Note: Legacy c_transforms and py_transforms are deprecated but can still be imported as follows:

import mindspore.dataset.vision.c_transforms as c_vision
import mindspore.dataset.vision.py_transforms as py_vision

See Vision Transforms tutorial for more details.

Descriptions of common data processing terms are as follows:

  • TensorOperation, the base class of all data processing operations implemented in C++.

  • ImageTensorOperation, the base class of all image processing operations. It is a derived class of TensorOperation.

  • PyTensorOperation, the base class of all data processing operations implemented in Python.

The data transform operation can be executed in the data processing pipeline or in the eager mode:

  • Pipeline mode is generally used to process datasets. For examples, please refer to the introduction to the data processing pipeline.

  • Eager mode is generally used for scattered samples. Examples of image preprocessing are as follows:

    import mindspore.dataset.vision as vision
    from PIL import Image, ImageDraw

    # Draw a red circle on a white canvas and save it as a JPEG file
    img = Image.new("RGB", (300, 300), (255, 255, 255))
    draw = ImageDraw.Draw(img)
    draw.ellipse(((0, 0), (100, 100)), fill=(255, 0, 0), outline=(255, 0, 0), width=5)
    img.save("./1.jpg")

    # Read the encoded JPEG bytes back from disk
    with open("./1.jpg", "rb") as f:
        data = f.read()

    # Apply each transform eagerly, feeding one output into the next
    data_decoded = vision.Decode()(data)
    data_cropped = vision.RandomCrop(size=(250, 250))(data_decoded)
    data_resized = vision.Resize(size=(224, 224))(data_cropped)
    data_normalized = vision.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
                                       std=[0.229 * 255, 0.224 * 255, 0.225 * 255])(data_resized)
    data_hwc2chw = vision.HWC2CHW()(data_normalized)
    print("data: {}, shape: {}".format(data_hwc2chw, data_hwc2chw.shape), flush=True)
    

Transforms

mindspore.dataset.vision.AdjustBrightness

Adjust the brightness of the input image.

mindspore.dataset.vision.AdjustContrast

Adjust the contrast of the input image.

mindspore.dataset.vision.AdjustGamma

Apply gamma correction on input image.

mindspore.dataset.vision.AdjustHue

Adjust the hue of the input image.

mindspore.dataset.vision.AdjustSaturation

Adjust the saturation of the input image.

mindspore.dataset.vision.AdjustSharpness

Adjust the sharpness of the input image.

mindspore.dataset.vision.Affine

Apply Affine transformation to the input image, keeping the center of the image unchanged.

mindspore.dataset.vision.AutoAugment

Apply the AutoAugment data augmentation method based on AutoAugment: Learning Augmentation Strategies from Data.

mindspore.dataset.vision.AutoContrast

Apply automatic contrast on input image.

mindspore.dataset.vision.BoundingBoxAugment

Apply a given image processing operation on a random selection of bounding box regions of a given image.

mindspore.dataset.vision.CenterCrop

Crop the input image at the center to the given size.

mindspore.dataset.vision.ConvertColor

Change the color space of the image.

mindspore.dataset.vision.Crop

Crop the input image at a specific location.

mindspore.dataset.vision.CutMixBatch

Apply CutMix transformation on input batch of images and labels.

mindspore.dataset.vision.CutOut

Randomly cut (mask) out a given number of square patches from the input image array.

mindspore.dataset.vision.Decode

Decode the input image in RGB mode.

mindspore.dataset.vision.Equalize

Apply histogram equalization on input image.

mindspore.dataset.vision.Erase

Erase the input image with given value.

mindspore.dataset.vision.FiveCrop

Crop the given image into one central crop and four corners.

mindspore.dataset.vision.GaussianBlur

Blur input image with the specified Gaussian kernel.

mindspore.dataset.vision.Grayscale

Convert the input PIL Image to grayscale.

mindspore.dataset.vision.HorizontalFlip

Flip the input image horizontally.

mindspore.dataset.vision.HsvToRgb

Convert the input numpy.ndarray images from HSV to RGB.

mindspore.dataset.vision.HWC2CHW

Transpose the input image from shape <H, W, C> to <C, H, W>.
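The layout change from <H, W, C> to <C, H, W> only reorders axes; no pixel values change. The sketch below shows the index mapping on nested Python lists. `hwc_to_chw_sketch` is a hypothetical helper for illustration, not how the library performs the transpose.

```python
def hwc_to_chw_sketch(image):
    """Reorder a nested list indexed [h][w][c] into one indexed [c][h][w]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[h][w][c] for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# One row, two pixels, three channels: H=1, W=2, C=3.
image = [[[1, 2, 3], [4, 5, 6]]]
chw = hwc_to_chw_sketch(image)
print(chw)
```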

mindspore.dataset.vision.Invert

Apply invert on input image in RGB mode.

mindspore.dataset.vision.LinearTransformation

Linearly transform the input numpy.ndarray image with a square transformation matrix and a mean vector.

mindspore.dataset.vision.MixUp

Randomly mix up a batch of numpy.ndarray images together with its labels.

mindspore.dataset.vision.MixUpBatch

Apply MixUp transformation on input batch of images and labels.

mindspore.dataset.vision.Normalize

Normalize the input image with respect to mean and standard deviation.
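Per channel, normalization amounts to subtracting the mean and dividing by the standard deviation. The sketch below applies that formula to a single pixel in plain Python; `normalize_pixel_sketch` and the sample values are illustrative assumptions.

```python
def normalize_pixel_sketch(pixel, mean, std):
    """Apply (value - mean) / std independently to each channel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


pixel = [255.0, 0.0, 127.5]
normalized = normalize_pixel_sketch(pixel, mean=[127.5] * 3, std=[127.5] * 3)
print(normalized)
```

With pixel values in [0, 255], the mean and std must be on the same scale, which is why ImageNet-style statistics are often multiplied by 255.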

mindspore.dataset.vision.NormalizePad

Normalize the input image with respect to mean and standard deviation then pad an extra channel with value zero.

mindspore.dataset.vision.Pad

Pad the image according to padding parameters.

mindspore.dataset.vision.PadToSize

Pad the image to a fixed size.

mindspore.dataset.vision.Perspective

Apply perspective transformation on input image.

mindspore.dataset.vision.Posterize

Reduce the bit depth of the color channels of image to create a high contrast and vivid color effect, similar to that seen in posters or printed materials.

mindspore.dataset.vision.RandAugment

Apply RandAugment data augmentation method on the input image.

mindspore.dataset.vision.RandomAdjustSharpness

Randomly adjust the sharpness of the input image with a given probability.

mindspore.dataset.vision.RandomAffine

Apply Random affine transformation to the input image.

mindspore.dataset.vision.RandomAutoContrast

Automatically adjust the contrast of the image with a given probability.

mindspore.dataset.vision.RandomColor

Adjust the color of the input image by a fixed or random degree.

mindspore.dataset.vision.RandomColorAdjust

Randomly adjust the brightness, contrast, saturation, and hue of the input image.

mindspore.dataset.vision.RandomCrop

Crop the input image at a random location.

mindspore.dataset.vision.RandomCropDecodeResize

A combination of Crop, Decode and Resize.

mindspore.dataset.vision.RandomCropWithBBox

Crop the input image at a random location and adjust bounding boxes accordingly.

mindspore.dataset.vision.RandomEqualize

Apply histogram equalization on the input image with a given probability.

mindspore.dataset.vision.RandomErasing

Randomly erase pixels within a randomly selected rectangular area on the input numpy.ndarray image.

mindspore.dataset.vision.RandomGrayscale

Randomly convert the input PIL Image to grayscale.

mindspore.dataset.vision.RandomHorizontalFlip

Randomly flip the input image horizontally with a given probability.

mindspore.dataset.vision.RandomHorizontalFlipWithBBox

Flip the input image horizontally randomly with a given probability and adjust bounding boxes accordingly.

mindspore.dataset.vision.RandomInvert

Randomly invert the colors of image with a given probability.

mindspore.dataset.vision.RandomLighting

Add AlexNet-style PCA-based noise to an image.

mindspore.dataset.vision.RandomPerspective

Randomly apply perspective transformation to the input PIL Image with a given probability.

mindspore.dataset.vision.RandomPosterize

Reduce the bit depth of the color channels of image with a given probability to create a high contrast and vivid color image.

mindspore.dataset.vision.RandomResizedCrop

Crop the input image randomly, and resize the cropped image using a selected interpolation mode mindspore.dataset.vision.Inter.

mindspore.dataset.vision.RandomResizedCropWithBBox

Crop the input image to a random size and aspect ratio and adjust bounding boxes accordingly.

mindspore.dataset.vision.RandomResize

Resize the input image using mindspore.dataset.vision.Inter, a randomly selected interpolation mode.

mindspore.dataset.vision.RandomResizeWithBBox

Tensor operation to resize the input image using a randomly selected interpolation mode mindspore.dataset.vision.Inter and adjust bounding boxes accordingly.

mindspore.dataset.vision.RandomRotation

Rotate the input image randomly within a specified range of degrees.

mindspore.dataset.vision.RandomSelectSubpolicy

Choose a random sub-policy from a policy list to be applied on the input image.

mindspore.dataset.vision.RandomSharpness

Adjust the sharpness of the input image by a fixed or random degree.

mindspore.dataset.vision.RandomSolarize

Randomly selects a subrange within the specified threshold range and sets the pixel value within the subrange to (255 - pixel).

mindspore.dataset.vision.RandomVerticalFlip

Randomly flip the input image vertically with a given probability.

mindspore.dataset.vision.RandomVerticalFlipWithBBox

Flip the input image vertically, randomly with a given probability and adjust bounding boxes accordingly.

mindspore.dataset.vision.Rescale

Rescale the input image with the given rescale and shift.
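Rescaling is a per-pixel linear map, output = pixel * rescale + shift. The plain-Python sketch below shows the arithmetic on a few values; `rescale_sketch` is a hypothetical helper used only to illustrate the formula.

```python
def rescale_sketch(pixels, rescale, shift):
    """Apply pixel * rescale + shift to every value."""
    return [p * rescale + shift for p in pixels]


# Map 8-bit pixel values from [0, 255] into roughly [0.0, 1.0].
scaled = rescale_sketch([0, 128, 255], rescale=1.0 / 255.0, shift=0.0)
print(scaled)
```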

mindspore.dataset.vision.Resize

Resize the input image to the given size with a given interpolation mode mindspore.dataset.vision.Inter.

mindspore.dataset.vision.ResizedCrop

Crop the input image at a specific region and resize it to desired size.

mindspore.dataset.vision.ResizeWithBBox

Resize the input image to the given size and adjust bounding boxes accordingly.

mindspore.dataset.vision.RgbToHsv

Convert the input numpy.ndarray images from RGB to HSV.

mindspore.dataset.vision.Rotate

Rotate the input image by specified degrees.

mindspore.dataset.vision.SlicePatches

Slice Tensor to multiple patches in horizontal and vertical directions.

mindspore.dataset.vision.Solarize

Solarize the image by inverting all pixel values within the threshold.

mindspore.dataset.vision.TenCrop

Crop the given image into one central crop and four corners with the flipped version of these.

mindspore.dataset.vision.ToNumpy

Convert the PIL input image to numpy.ndarray image.

mindspore.dataset.vision.ToPIL

Convert the input decoded numpy.ndarray image to PIL Image.

mindspore.dataset.vision.ToTensor

Convert the input PIL Image or numpy.ndarray to numpy.ndarray of the desired dtype, rescale the pixel value range from [0, 255] to [0.0, 1.0] and change the shape from <H, W, C> to <C, H, W>.

mindspore.dataset.vision.ToType

Cast the input to a given MindSpore data type or NumPy data type.

mindspore.dataset.vision.TrivialAugmentWide

Apply TrivialAugmentWide data augmentation method on the input image.

mindspore.dataset.vision.UniformAugment

Uniformly select a number of transformations from a sequence and apply them sequentially and randomly, which means that there is a chance that a chosen transformation will not be applied.

mindspore.dataset.vision.VerticalFlip

Flip the input image vertically.

Utilities

mindspore.dataset.vision.AutoAugmentPolicy

AutoAugment policy for different datasets.

mindspore.dataset.vision.Border

Padding Mode, Border Type.

mindspore.dataset.vision.ConvertMode

The color conversion mode.

mindspore.dataset.vision.ImageBatchFormat

Data Format of images after batch operation.

mindspore.dataset.vision.ImageReadMode

The read mode used for the image file.

mindspore.dataset.vision.Inter

Interpolation Modes.

mindspore.dataset.vision.SliceMode

Mode to Slice Tensor into multiple parts.

mindspore.dataset.vision.encode_jpeg

Encode the input image as JPEG data.

mindspore.dataset.vision.encode_png

Encode the input image as PNG data.

mindspore.dataset.vision.get_image_num_channels

Get the number of input image channels.

mindspore.dataset.vision.get_image_size

Get the size of input image as [height, width].

mindspore.dataset.vision.read_file

Read a file in binary mode.

mindspore.dataset.vision.read_image

Read an image file and decode it into one-channel grayscale data or RGB color data.

mindspore.dataset.vision.write_file

Write the one-dimensional uint8 data into a file using binary mode.

mindspore.dataset.vision.write_jpeg

Write the image data into a JPEG file.

mindspore.dataset.vision.write_png

Write the image into a PNG file.

Text

This module supports text processing for NLP. It includes two parts: text transforms and utils. The text transforms part is a high-performance NLP text processing module developed with ICU4C and cppjieba. The utils part provides some general methods for NLP text processing.

Common imported modules in corresponding API examples are as follows:

import mindspore.dataset as ds
import mindspore.dataset.text as text

See Text Transforms tutorial for more details.

Descriptions of common data processing terms are as follows:

  • TensorOperation, the base class of all data processing operations implemented in C++.

  • TextTensorOperation, the base class of all text processing operations. It is a derived class of TensorOperation.

The data transform operation can be executed in the data processing pipeline or in the eager mode:

  • Pipeline mode is generally used to process datasets. For examples, please refer to the introduction to the data processing pipeline.

  • Eager mode is generally used for scattered samples. Examples of text preprocessing are as follows:

    import mindspore.dataset.text as text
    from mindspore.dataset.text import NormalizeForm
    
    # construct vocab
    vocab_list = {"music": 1, "Opera": 2, "form": 3, "theatre": 4, "which": 5, "in": 6,
                  "fundamental": 7, "dramatic": 8, "component": 9, "taken": 10, "roles": 11, "singers": 12,
                  "is": 13, "are": 14, "of": 15, "[UNK]": 16}
    vocab = text.Vocab.from_dict(vocab_list)
    tokenizer_op = text.BertTokenizer(vocab=vocab, suffix_indicator='##', max_bytes_per_token=100,
                                      unknown_token='[UNK]', lower_case=False, keep_whitespace=False,
                                      normalization_form=NormalizeForm.NONE, preserve_unused_token=True,
                                      with_offsets=False)
    # tokenizer
    tokens = tokenizer_op("Opera is a form of theatre in which music is a fundamental "
                          "component and dramatic roles are taken by singers.")
    print("token: {}".format(tokens), flush=True)
    
    # token to ids
    ids = vocab.tokens_to_ids(tokens)
    print("token to id: {}".format(ids), flush=True)
    
    # ids to token
    tokens_from_ids = vocab.ids_to_tokens([15, 3, 7])
    print("id to token: {}".format(tokens_from_ids), flush=True)
    

Note: In eager mode, non-NumPy input is implicitly converted to NumPy format and sent to MindSpore.

Transforms

Each entry below lists the API name, followed by its description and a note (None if there is no note).

mindspore.dataset.text.AddToken

Add token to beginning or end of sequence.

None

mindspore.dataset.text.BasicTokenizer

Tokenize the input UTF-8 encoded string by specific rules.

BasicTokenizer is not supported on Windows platform yet.

mindspore.dataset.text.BertTokenizer

Tokenizer used for Bert text process.

BertTokenizer is not supported on Windows platform yet.

mindspore.dataset.text.CaseFold

Apply case fold operation on UTF-8 string tensor. It is more aggressive than str.lower and can convert more characters into lower case.

CaseFold is not supported on Windows platform yet.

mindspore.dataset.text.FilterWikipediaXML

Filter Wikipedia XML dumps to "clean" text consisting only of lowercase letters (a-z, converted from A-Z), and spaces (never consecutive).

FilterWikipediaXML is not supported on Windows platform yet.

mindspore.dataset.text.JiebaTokenizer

Tokenize Chinese string into words based on dictionary.

The integrity of the HMMSegment algorithm and MPSegment algorithm files must be confirmed.

mindspore.dataset.text.Lookup

Look up a word into an id according to the input vocabulary table.

None

mindspore.dataset.text.Ngram

Generate n-gram from a 1-D string Tensor.

None

mindspore.dataset.text.NormalizeUTF8

Apply normalize operation on UTF-8 string tensor.

NormalizeUTF8 is not supported on Windows platform yet.

mindspore.dataset.text.PythonTokenizer

Class that applies user-defined string tokenizer into input string.

None

mindspore.dataset.text.RegexReplace

Replace a part of UTF-8 string tensor with given text according to regular expressions.

RegexReplace is not supported on Windows platform yet.

mindspore.dataset.text.RegexTokenizer

Tokenize a scalar tensor of UTF-8 string by a regular expression pattern.

RegexTokenizer is not supported on Windows platform yet.

mindspore.dataset.text.SentencePieceTokenizer

Tokenize a scalar token or 1-D tokens into subword tokens with SentencePiece.

None

mindspore.dataset.text.SlidingWindow

Construct a tensor from given data (only support 1-D for now), where each element in the dimension axis is a slice of data starting at the corresponding position, with a specified width.

None

mindspore.dataset.text.ToNumber

Tensor operation to convert every element of a string tensor to a number.

None

mindspore.dataset.text.ToVectors

Look up a token into vectors according to the input vector table.

None

mindspore.dataset.text.Truncate

Truncate the input sequence so that it does not exceed the maximum length.

None

mindspore.dataset.text.TruncateSequencePair

Truncate a pair of rank-1 tensors such that the total length is less than max_length.

None

mindspore.dataset.text.UnicodeCharTokenizer

Tokenize a scalar tensor of UTF-8 string to Unicode characters.

None

mindspore.dataset.text.UnicodeScriptTokenizer

Tokenize a scalar tensor of UTF-8 string based on Unicode script boundaries.

UnicodeScriptTokenizer is not supported on Windows platform yet.

mindspore.dataset.text.WhitespaceTokenizer

Tokenize a scalar tensor of UTF-8 string on ICU4C defined whitespaces, such as: ' ', '\t', '\r', '\n'.

WhitespaceTokenizer is not supported on Windows platform yet.

mindspore.dataset.text.WordpieceTokenizer

Tokenize the input text to subword tokens.

None
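To give a feel for subword tokenization as done by a wordpiece tokenizer, the sketch below uses the greedy longest-match-first strategy commonly associated with BERT. It is a plain-Python illustration under that assumption, not the MindSpore implementation; `wordpiece_sketch`, its parameter names, and the toy vocabulary are hypothetical.

```python
def wordpiece_sketch(word, vocab, suffix_indicator="##", unknown_token="[UNK]"):
    """Split `word` into the longest vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = suffix_indicator + candidate  # mark continuations
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unknown_token]  # no piece fits: the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


toy_vocab = {"book", "##keep", "##er", "un"}
pieces = wordpiece_sketch("bookkeeper", toy_vocab)
print(pieces)
```

A word with any span that cannot be matched collapses to the single unknown token in this sketch.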

Utilities

Each entry below lists the API name, followed by its description and a note (None if there is no note).

mindspore.dataset.text.CharNGram

CharNGram object that is used to map tokens into pre-trained vectors.

None

mindspore.dataset.text.FastText

FastText object that is used to map tokens into vectors.

None

mindspore.dataset.text.GloVe

GloVe object that is used to map tokens into vectors.

None

mindspore.dataset.text.JiebaMode

An enumeration for mindspore.dataset.text.JiebaTokenizer.

None

mindspore.dataset.text.NormalizeForm

Enumeration class for Unicode normalization forms.

None

mindspore.dataset.text.SentencePieceModel

An enumeration for SentencePieceModel.

None

mindspore.dataset.text.SentencePieceVocab

SentencePiece object that is used to do words segmentation.

None

mindspore.dataset.text.SPieceTokenizerLoadType

An enumeration for loading type of mindspore.dataset.text.SentencePieceTokenizer.

None

mindspore.dataset.text.SPieceTokenizerOutType

An enumeration for mindspore.dataset.text.SentencePieceTokenizer.

None

mindspore.dataset.text.Vectors

Vectors object that is used to map tokens into vectors.

None

mindspore.dataset.text.Vocab

Vocab object that is used to save pairs of words and ids.

None

mindspore.dataset.text.to_bytes

Convert NumPy array of str to array of bytes by encoding each element based on charset encoding.

None

mindspore.dataset.text.to_str

Convert NumPy array of bytes to array of str by decoding each element based on charset encoding.

None

Audio

This module supports audio augmentations. It includes two parts: audio transforms and utils. The audio transforms part is a high-performance processing module with common audio operations. The utils part provides some general methods for audio processing.

Common imported modules in corresponding API examples are as follows:

import mindspore.dataset as ds
import mindspore.dataset.audio as audio
from mindspore.dataset.audio import utils

An alternative, equivalent way to import the audio module is as follows:

import mindspore.dataset.audio.transforms as audio

Descriptions of common data processing terms are as follows:

  • TensorOperation, the base class of all data processing operations implemented in C++.

  • AudioTensorOperation, the base class of all audio processing operations. It is a derived class of TensorOperation.

The data transform operation can be executed in the data processing pipeline or in the eager mode:

  • Pipeline mode is generally used to process datasets. For examples, please refer to the introduction to the data processing pipeline.

  • Eager mode is generally used for scattered samples. Examples of audio preprocessing are as follows:

    import numpy as np
    import mindspore.dataset.audio as audio
    from mindspore.dataset.audio import ResampleMethod
    
    # audio sample
    waveform = np.random.random([1, 30])
    
    # transform
    resample_op = audio.Resample(orig_freq=48000, new_freq=16000,
                                 resample_method=ResampleMethod.SINC_INTERPOLATION,
                                 lowpass_filter_width=6, rolloff=0.99, beta=None)
    waveform_resampled = resample_op(waveform)
    print("waveform resampled: {}".format(waveform_resampled), flush=True)
    

Transforms

mindspore.dataset.audio.AllpassBiquad

Design two-pole all-pass filter with central frequency and bandwidth for audio waveform.

mindspore.dataset.audio.AmplitudeToDB

Turn the input audio waveform from the amplitude/power scale to decibel scale.
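The decibel conversion is conventionally 10·log10 for power values and 20·log10 for amplitude values, relative to a reference. The sketch below works on a single value under that convention; `to_decibel_sketch` and its parameter names are illustrative assumptions, and it omits any clamping of the output range that a full implementation might apply.

```python
import math


def to_decibel_sketch(value, is_power=True, ref_value=1.0, amin=1e-10):
    """Convert one power or amplitude value to decibels relative to ref_value."""
    multiplier = 10.0 if is_power else 20.0
    # amin guards against taking the logarithm of zero.
    level = math.log10(max(value, amin)) - math.log10(max(ref_value, amin))
    return multiplier * level


db_from_power = to_decibel_sketch(100.0, is_power=True)
db_from_amplitude = to_decibel_sketch(100.0, is_power=False)
print(db_from_power, db_from_amplitude)
```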

mindspore.dataset.audio.Angle

Calculate the angle of complex number sequence.

mindspore.dataset.audio.BandBiquad

Design two-pole band-pass filter for audio waveform.

mindspore.dataset.audio.BandpassBiquad

Design two-pole Butterworth band-pass filter for audio waveform.

mindspore.dataset.audio.BandrejectBiquad

Design two-pole Butterworth band-reject filter for audio waveform.

mindspore.dataset.audio.BassBiquad

Design a bass tone-control effect, also known as two-pole low-shelf filter for audio waveform.

mindspore.dataset.audio.Biquad

Perform a biquad filter of input audio.

mindspore.dataset.audio.ComplexNorm

Compute the norm of complex number sequence.

mindspore.dataset.audio.ComputeDeltas

Compute delta coefficients, also known as differential coefficients, of a spectrogram.

mindspore.dataset.audio.Contrast

Apply contrast effect for audio waveform.

mindspore.dataset.audio.DBToAmplitude

Turn a waveform from the decibel scale to the power/amplitude scale.

mindspore.dataset.audio.DCShift

Apply a DC shift to the audio.

mindspore.dataset.audio.DeemphBiquad

Apply Compact Disc (IEC 60908) de-emphasis (a treble attenuation shelving filter) to the audio waveform.

mindspore.dataset.audio.DetectPitchFrequency

Detect pitch frequency.

mindspore.dataset.audio.Dither

Dither increases the perceived dynamic range of audio stored at a particular bit-depth by eliminating nonlinear truncation distortion.

mindspore.dataset.audio.EqualizerBiquad

Design biquad equalizer filter and perform filtering.

mindspore.dataset.audio.Fade

Add a fade in and/or fade out to a waveform.

mindspore.dataset.audio.Filtfilt

Apply an IIR filter forward and backward to a waveform.

mindspore.dataset.audio.Flanger

Apply a flanger effect to the audio.

mindspore.dataset.audio.FrequencyMasking

Apply masking to a spectrogram in the frequency domain.

mindspore.dataset.audio.Gain

Apply amplification or attenuation to the whole waveform.

mindspore.dataset.audio.GriffinLim

Compute waveform from a linear scale magnitude spectrogram using the Griffin-Lim transformation.

mindspore.dataset.audio.HighpassBiquad

Design biquad highpass filter and perform filtering.

mindspore.dataset.audio.InverseMelScale

Solve for a normal STFT from a mel frequency STFT, using a conversion matrix.

mindspore.dataset.audio.InverseSpectrogram

Create an inverse spectrogram to recover an audio signal from a spectrogram.

mindspore.dataset.audio.LFCC

Create LFCC for a raw audio signal.

mindspore.dataset.audio.LFilter

Perform an IIR filter by evaluating a difference equation.

mindspore.dataset.audio.LowpassBiquad

Design two-pole low-pass filter for audio waveform.

mindspore.dataset.audio.Magphase

Separate a complex-valued spectrogram with shape (..., 2) into its magnitude and phase.

mindspore.dataset.audio.MaskAlongAxis

Apply a mask along axis.

mindspore.dataset.audio.MaskAlongAxisIID

Apply a mask along axis.

mindspore.dataset.audio.MelScale

Convert normal STFT to STFT at the Mel scale.

mindspore.dataset.audio.MelSpectrogram

Create MelSpectrogram for a raw audio signal.

mindspore.dataset.audio.MFCC

Create MFCC for a raw audio signal.

mindspore.dataset.audio.MuLawDecoding

Decode a mu-law encoded signal; refer to the mu-law algorithm.

mindspore.dataset.audio.MuLawEncoding

Encode signal based on mu-law companding.
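Mu-law companding compresses a sample x in [-1, 1] with F(x) = sign(x) · ln(1 + μ|x|) / ln(1 + μ) and then quantizes the result to integer levels. The sketch below applies this to one sample, assuming μ equals the number of quantization channels minus one and that quantization rounds to the nearest level; `mu_law_encode_sketch` is a hypothetical helper, not the library's code.

```python
import math


def mu_law_encode_sketch(x, quantization_channels=256):
    """Encode one sample in [-1, 1] to an integer in [0, quantization_channels - 1]."""
    mu = quantization_channels - 1
    # Compress: small magnitudes get more resolution than large ones.
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer level.
    return int((compressed + 1) / 2 * mu + 0.5)


silence_level = mu_law_encode_sketch(0.0)
print(silence_level)
```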

mindspore.dataset.audio.Overdrive

Apply an overdrive effect to the audio waveform.

mindspore.dataset.audio.Phaser

Apply a phasing effect to the audio.

mindspore.dataset.audio.PhaseVocoder

Given a STFT spectrogram, speed up in time without modifying pitch by a factor of rate.

mindspore.dataset.audio.PitchShift

Shift the pitch of a waveform by n_steps steps.

mindspore.dataset.audio.Resample

Resample a signal from one frequency to another.

mindspore.dataset.audio.RiaaBiquad

Apply RIAA vinyl playback equalization.

mindspore.dataset.audio.SlidingWindowCmn

Apply sliding-window cepstral mean (and optionally variance) normalization per utterance.

mindspore.dataset.audio.SpectralCentroid

Compute the spectral centroid for each channel along the time axis.

mindspore.dataset.audio.Spectrogram

Create a spectrogram from an audio signal.

mindspore.dataset.audio.TimeMasking

Apply masking to a spectrogram in the time domain.

mindspore.dataset.audio.TimeStretch

Stretch Short Time Fourier Transform (STFT) in time without modifying pitch for a given rate.

mindspore.dataset.audio.TrebleBiquad

Design a treble tone-control effect.

mindspore.dataset.audio.Vad

Voice activity detector.

mindspore.dataset.audio.Vol

Adjust volume of waveform.

Utilities

mindspore.dataset.audio.BorderType

Padding mode.

mindspore.dataset.audio.DensityFunction

Density function type.

mindspore.dataset.audio.FadeShape

Fade Shapes.

mindspore.dataset.audio.GainType

Gain Types.

mindspore.dataset.audio.Interpolation

Interpolation Type.

mindspore.dataset.audio.MelType

Mel scale implementation type.

mindspore.dataset.audio.Modulation

Modulation Type.

mindspore.dataset.audio.NormMode

Normalization mode.

mindspore.dataset.audio.NormType

Normalization type.

mindspore.dataset.audio.ResampleMethod

Resample method.

mindspore.dataset.audio.ScaleType

Scale Types.

mindspore.dataset.audio.WindowType

Window function type.

mindspore.dataset.audio.create_dct

Create a DCT transformation matrix with shape (n_mels, n_mfcc), normalized depending on norm.

mindspore.dataset.audio.linear_fbanks

Create a linear triangular filterbank.

mindspore.dataset.audio.melscale_fbanks

Create a frequency transformation matrix.