mindspore.dataset.Schema

class mindspore.dataset.Schema(schema_file=None)[source]

Class to represent a schema of a dataset.

Parameters

schema_file (str) – Path of the schema file. Default: None.

Returns

Schema object, schema information about the dataset.

Raises

RuntimeError – If the schema file fails to load.

Examples

>>> import mindspore.dataset as ds
>>> from mindspore import dtype as mstype
>>>
>>> # Create schema; specify column name, mindspore.dtype and shape of the column
>>> schema = ds.Schema()
>>> schema.add_column(name='col1', de_type=mstype.int64, shape=[2])

add_column(name, de_type, shape=None)[source]

Add new column to the schema.

Parameters
  • name (str) – Name of the new column.

  • de_type (str) – Data type of the column. The example for this method passes a mindspore.dtype value (mstype.int64).

  • shape (list[int], optional) – Shape of the column. Default: None, which is treated as [-1], an unknown shape of rank 1.

Raises

ValueError – If column type is unknown.

Examples

>>> import mindspore.dataset as ds
>>> from mindspore import dtype as mstype
>>>
>>> schema = ds.Schema()
>>> schema.add_column('col_1d', de_type=mstype.int64, shape=[2])
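
The shape convention described for add_column, where -1 marks a dimension of unknown size, can be illustrated without MindSpore. The following is a minimal sketch in plain Python. `shape_matches` is a hypothetical helper written for illustration and is not part of the mindspore.dataset API; it assumes that a declared dimension of -1 accepts any size.

```python
def shape_matches(declared, actual):
    """Return True if `actual` is compatible with `declared`.

    Hypothetical helper, not a MindSpore function. A declared
    dimension of -1 is treated as "unknown" and matches any size.
    """
    if len(declared) != len(actual):
        return False  # ranks differ, so the shapes cannot match
    return all(d == -1 or d == a for d, a in zip(declared, actual))


print(shape_matches([2], [2]))      # fixed shape, same size
print(shape_matches([-1], [7]))     # unknown rank-1 shape, any length
print(shape_matches([-1], [2, 3]))  # rank 2 against a rank-1 declaration
```

Under this assumption, a column declared with shape=[2] only fits rows of exactly two elements, while the default [-1] fits any rank-1 row.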

from_json(json_obj)[source]

Load the schema from a parsed JSON object.

Parameters

json_obj (dict) – Parsed JSON object.

Raises

Examples

>>> import json
>>>
>>> from mindspore.dataset import Schema
>>>
>>> with open("/path/to/schema_file") as file:
...     json_obj = json.load(file)
>>> schema = Schema()
>>> schema.from_json(json_obj)

parse_columns(columns)[source]

Parse the columns and add them to the schema.

Parameters

columns (Union[dict, list[dict], tuple[dict]]) –

Dataset attribute information, decoded from the schema file.

  • list[dict]: each dict must contain the keys name and type; shape is optional.

  • dict: each key of columns is a column name, and each value is a dict that must contain type; shape is optional.
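
The two layouts above carry the same information. As a minimal sketch of how they relate, the plain-Python helper below converts either layout into a list of column dicts and checks the required keys. `normalize_columns` is a hypothetical name used only for illustration; it is not MindSpore's implementation of parse_columns, and the exceptions it raises are choices made for this sketch.

```python
def normalize_columns(columns):
    """Convert either accepted layout into a list of column dicts.

    Hypothetical helper for illustration only; it is not how
    MindSpore implements parse_columns.
    """
    if isinstance(columns, dict):
        # dict layout: the key is the column name, the value holds type/shape
        items = [{"name": name, **spec} for name, spec in columns.items()]
    elif isinstance(columns, (list, tuple)):
        # list/tuple layout: each entry already carries its own name
        items = [dict(col) for col in columns]
    else:
        raise TypeError("columns must be a dict, list, or tuple")

    for col in items:
        if "name" not in col:
            raise ValueError("column is missing the 'name' field")
        if "type" not in col:
            raise ValueError(f"column {col['name']!r} is missing the 'type' field")
    return items


as_list = [{"name": "image", "type": "int8", "shape": [3, 3]},
           {"name": "label", "type": "int8", "shape": [1]}]
as_dict = {"image": {"shape": [3, 3], "type": "int8"},
           "label": {"shape": [1], "type": "int8"}}

# Both layouts describe the same two columns.
print(normalize_columns(as_list) == normalize_columns(as_dict))
```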

Raises

Examples

>>> from mindspore.dataset import Schema
>>> schema = Schema()
>>> columns1 = [{'name': 'image', 'type': 'int8', 'shape': [3, 3]},
...             {'name': 'label', 'type': 'int8', 'shape': [1]}]
>>> schema.parse_columns(columns1)
>>> columns2 = {'image': {'shape': [3, 3], 'type': 'int8'}, 'label': {'shape': [1], 'type': 'int8'}}
>>> schema.parse_columns(columns2)

to_json()[source]

Get a JSON string of the schema.

Returns

str, JSON string of the schema.

Examples

>>> from mindspore.dataset import Schema
>>>
>>> schema1 = Schema()
>>> schema2 = schema1.to_json()
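
Since to_json returns a string while from_json takes a parsed object, moving between the two goes through Python's standard json module. The sketch below shows only that string-to-dict step and does not call MindSpore. The JSON content is a stand-in, and its "columns" key and layout are assumptions made for illustration, not a statement of the exact output of to_json.

```python
import json

# Stand-in for the string returned by to_json(); the exact keys are an
# assumption made for this sketch.
json_str = json.dumps({
    "columns": [
        {"name": "image", "type": "int8", "shape": [3, 3]},
        {"name": "label", "type": "int8", "shape": [1]},
    ]
})

# json.loads turns the string back into a dict, the kind of parsed
# JSON object that from_json(json_obj) takes as input.
json_obj = json.loads(json_str)
column_names = [col["name"] for col in json_obj["columns"]]
print(column_names)
```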