mindspore_gl.dataset.IMDBBinary

class mindspore_gl.dataset.IMDBBinary(root)[source]

IMDBBinary Dataset, a source dataset for reading and parsing IMDBBinary dataset.

About IMDBBinary dataset:

IMDB-BINARY is a movie collaboration dataset that consists of the ego-networks of 1,000 actors/actresses who played roles in movies in IMDB. In each graph, nodes represent actors/actresses, and there is an edge between two nodes if the corresponding people appear in the same movie. The graphs are derived from the Action and Romance genres.

Statistics:

Nodes: 19773
Edges: 193062
Number of Graphs: 1000
Number of Classes: 2
Label split:
- Train: 800
- Valid: 200

The dataset can be downloaded here: <https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/IMDB-BINARY.zip>. Organize the dataset files into the following directory structure before reading them.

.
├── IMDB-BINARY_A.txt
├── IMDB-BINARY_graph_indicator.txt
└── IMDB-BINARY_graph_labels.txt
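To make the role of the three files concrete, the following is a minimal sketch of how they could be parsed by hand. It assumes the usual layout of these graph-kernel text files: each line of `_A.txt` is a `src, dst` pair of 1-based node ids, line i of `_graph_indicator.txt` is the graph id of node i, and line g of `_graph_labels.txt` is the label of graph g. The helper `read_tu_files` and the toy file contents are hypothetical and are not part of mindspore_gl; IMDBBinary does its own loading.

```python
import os
import tempfile


def read_tu_files(root, name="IMDB-BINARY"):
    """Parse the three raw text files into plain Python structures.

    Hypothetical helper for illustration only, not the mindspore_gl loader.
    """
    # Line i of the indicator file holds the graph id of node i (ids start at 1).
    with open(os.path.join(root, f"{name}_graph_indicator.txt")) as f:
        node_graph = [int(line) for line in f if line.strip()]
    # Line g of the labels file holds the class label of graph g.
    with open(os.path.join(root, f"{name}_graph_labels.txt")) as f:
        labels = [int(line) for line in f if line.strip()]
    # Each line of the adjacency file is "src, dst" with 1-based node ids.
    edges = []
    with open(os.path.join(root, f"{name}_A.txt")) as f:
        for line in f:
            if line.strip():
                src, dst = (int(tok) for tok in line.split(","))
                edges.append((src - 1, dst - 1))  # shift to 0-based ids
    return node_graph, labels, edges


# Toy data: two graphs with two nodes each and one undirected edge per graph.
with tempfile.TemporaryDirectory() as root:
    files = {
        "IMDB-BINARY_A.txt": "1, 2\n2, 1\n3, 4\n4, 3\n",
        "IMDB-BINARY_graph_indicator.txt": "1\n1\n2\n2\n",
        "IMDB-BINARY_graph_labels.txt": "0\n1\n",
    }
    for fname, text in files.items():
        with open(os.path.join(root, fname), "w") as f:
            f.write(text)
    node_graph, labels, edges = read_tu_files(root)
```

Each undirected edge appears once per direction in the toy adjacency file, which is why four lines describe only two edges.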

Parameters

root (str) – path to the root directory that contains imdb_binary_with_mask.npz

Raises

TypeError – if root is not a str.
RuntimeError – if root does not contain data files.

Examples

>>> from mindspore_gl.dataset.imdb_binary import IMDBBinary
>>> root = "path/to/imdb_binary"
>>> dataset = IMDBBinary(root)

property edge_feat_size

Feature size of each edge

Returns: int, the feature size of each edge

Examples

>>> #dataset is an instance object of Dataset
>>> edge_feat_size = dataset.edge_feat_size

property graph_count

Total number of graphs

Returns: int, the number of graphs

Examples

>>> #dataset is an instance object of Dataset
>>> graph_count = dataset.graph_count

property graph_edges

Cumulative edge count of the graphs

Returns: numpy.ndarray, array of cumulative edge counts

Examples

>>> #dataset is an instance object of Dataset
>>> graph_edges = dataset.graph_edges
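A cumulative count array can be read as a table of offsets into one flat edge list. The sketch below illustrates the idea with plain Python lists. It assumes the array starts with a leading 0, so that graph i owns the half-open range from entry i to entry i + 1; that convention, the toy counts, and the helper `edge_range` are assumptions made for illustration and are not confirmed by this page.

```python
from itertools import accumulate

# Hypothetical per-graph edge counts for three small graphs.
edges_per_graph = [4, 2, 6]

# Cumulative counts with a leading 0 (assumed convention).
graph_edges = [0] + list(accumulate(edges_per_graph))


def edge_range(offsets, graph_idx):
    """Return the (start, end) positions of one graph's edges in the flat list."""
    return offsets[graph_idx], offsets[graph_idx + 1]


start, end = edge_range(graph_edges, 1)
```

With offsets stored this way, the number of edges in a graph is the difference between two neighbouring entries, and the last entry is the total edge count.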

property graph_label

Graph label

Returns: numpy.ndarray, array of graph label

Examples

>>> #dataset is an instance object of Dataset
>>> graph_label = dataset.graph_label

graph_node_feat(graph_idx)[source]

Node features of a single graph.

Parameters

graph_idx (int) – index of graph.

Returns

numpy.ndarray, node feature of graph.

Examples

>>> #dataset is an instance object of Dataset
>>> graph_idx = 0  # index of the graph to query
>>> graph_node_feat = dataset.graph_node_feat(graph_idx)
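One plausible way to picture this method is as a slice of the flat node feature array, bounded by the cumulative node counts. The sketch below is written under that assumption with plain Python lists. The helper `slice_node_feat`, the toy features, and the leading-0 offset convention are hypothetical, and the actual implementation may differ.

```python
def slice_node_feat(node_feat, graph_nodes, graph_idx):
    """Return the feature rows belonging to graph `graph_idx`.

    `graph_nodes` is assumed to hold cumulative node counts with a leading 0.
    """
    start, end = graph_nodes[graph_idx], graph_nodes[graph_idx + 1]
    return node_feat[start:end]


# Toy data: five nodes with one feature each, split over two graphs (2 + 3 nodes).
node_feat = [[1.0], [2.0], [3.0], [4.0], [5.0]]
graph_nodes = [0, 2, 5]

first = slice_node_feat(node_feat, graph_nodes, 0)
second = slice_node_feat(node_feat, graph_nodes, 1)
```

The per-graph slices do not overlap and together cover every row of the flat feature list.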

property graph_nodes

Cumulative node count of the graphs

Returns: numpy.ndarray, array of cumulative node counts

Examples

>>> #dataset is an instance object of Dataset
>>> graph_nodes = dataset.graph_nodes

property node_feat

Node features

Returns: numpy.ndarray, array of node feature

Examples

>>> #dataset is an instance object of Dataset
>>> node_feat = dataset.node_feat

property node_feat_size

Feature size of each node

Returns: int, the feature size of each node

Examples

>>> #dataset is an instance object of Dataset
>>> node_feat_size = dataset.node_feat_size

property num_classes

Number of label classes

Returns: int, the number of classes

Examples

>>> #dataset is an instance object of Dataset
>>> num_classes = dataset.num_classes

property train_graphs

IDs of the training graphs

Returns: numpy.ndarray, array of training graph IDs

Examples

>>> #dataset is an instance object of Dataset
>>> train_graphs = dataset.train_graphs

property train_mask

Mask of training nodes

Returns: numpy.ndarray, array of mask

Examples

>>> #dataset is an instance object of Dataset
>>> train_mask = dataset.train_mask

property val_graphs

IDs of the validation graphs

Returns: numpy.ndarray, array of validation graph IDs

Examples

>>> #dataset is an instance object of Dataset
>>> val_graphs = dataset.val_graphs

property val_mask

Mask of validation nodes

Returns: numpy.ndarray, array of mask

Examples

>>> #dataset is an instance object of Dataset
>>> val_mask = dataset.val_mask
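A mask array is commonly turned into a list of indices before it is used for sampling. The sketch below shows that conversion on toy data. It assumes each mask is a 0/1 array with one entry per item, where 1 marks membership in the split; this page does not state the exact layout of `train_mask` and `val_mask`, so the toy masks and the helper `mask_to_ids` are hypothetical.

```python
def mask_to_ids(mask):
    """Return the positions where the mask is set (non-zero)."""
    return [i for i, flag in enumerate(mask) if flag]


# Toy masks over five items; each item belongs to exactly one split.
train_mask = [1, 1, 0, 1, 0]
val_mask = [0, 0, 1, 0, 1]

train_ids = mask_to_ids(train_mask)
val_ids = mask_to_ids(val_mask)
```

In this toy setup the two index lists are disjoint and together cover every position, which is a useful sanity check on any train/validation split.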