mindspore_gl.dataset.IMDBBinary

class mindspore_gl.dataset.IMDBBinary(root)

IMDBBinary dataset, a source dataset for reading and parsing the IMDB-BINARY dataset.

About the IMDBBinary dataset:

IMDB-BINARY is a movie collaboration dataset that consists of the ego-networks of 1,000 actors and actresses who played roles in movies listed in IMDB. In each graph, nodes represent actors or actresses, and an edge connects two of them if they appear in the same movie. The graphs are derived from the Action and Romance genres.

Statistics:

  • Nodes: 19773

  • Edges: 193062

  • Number of Graphs: 1000

  • Number of Classes: 2

  • Label split:

    • Train: 800

    • Valid: 200

The dataset can be downloaded from <https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/IMDB-BINARY.zip>. Organize the dataset files into the following directory structure before reading them:

.
├── IMDB-BINARY_A.txt
├── IMDB-BINARY_graph_indicator.txt
└── IMDB-BINARY_graph_labels.txt
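
To make the role of the three files concrete, here is a minimal sketch of how they could be parsed with NumPy. It is not the loader used by IMDBBinary. It assumes the usual layout of this dataset collection: `_A.txt` holds one `src, dst` pair per line with 1-based node ids, `_graph_indicator.txt` holds the 1-based graph id of each node, and `_graph_labels.txt` holds one label per graph. The in-memory file contents and the helper `read_ints` are made up for illustration.

```python
import io
import numpy as np

def read_ints(handle):
    """Parse one row of comma-separated integers per non-empty line."""
    return [[int(tok) for tok in line.split(",")] for line in handle if line.strip()]

# Tiny stand-ins for the three files (two graphs, five nodes); not real data.
a_file = io.StringIO("1, 2\n2, 1\n2, 3\n3, 2\n4, 5\n5, 4\n")
indicator_file = io.StringIO("1\n1\n1\n2\n2\n")
labels_file = io.StringIO("0\n1\n")

# Edge list, shifted to 0-based node ids (assumed 1-based on disk).
edges = np.array(read_ints(a_file), dtype=np.int64) - 1
# Graph id of every node, shifted to 0-based.
node_graph = np.array(read_ints(indicator_file), dtype=np.int64).ravel() - 1
# One class label per graph.
labels = np.array(read_ints(labels_file), dtype=np.int64).ravel()

# Count nodes per graph, and edges per graph via each edge's source node.
nodes_per_graph = np.bincount(node_graph, minlength=len(labels))
edges_per_graph = np.bincount(node_graph[edges[:, 0]], minlength=len(labels))
```

With real files, the `io.StringIO` objects would be replaced by `open(...)` calls on the paths shown in the tree above.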
Parameters

root (str) – path to the root directory that contains imdb_binary_with_mask.npz

Raises

Examples

>>> from mindspore_gl.dataset.imdb_binary import IMDBBinary
>>> root = "path/to/imdb_binary"
>>> dataset = IMDBBinary(root)
property edge_feat_size

Feature size of each edge

Returns

int, the feature size of each edge

Examples

>>> #dataset is an instance object of Dataset
>>> edge_feat_size = dataset.edge_feat_size
property graph_count

Total number of graphs

Returns

int, the number of graphs

Examples

>>> #dataset is an instance object of Dataset
>>> graph_count = dataset.graph_count
property graph_edges

Cumulative edge count of the graphs

Returns

numpy.ndarray, array of cumulative edge counts

Examples

>>> # dataset is an instance object of Dataset
>>> graph_edges = dataset.graph_edges
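
A cumulative count array is typically used to find where one graph's edges start and end inside a flat edge list. The sketch below illustrates that idea with synthetic numbers. It assumes the array starts with 0 and has one more entry than there are graphs, which this page does not confirm. `edge_counts` and `edge_range` are hypothetical names.

```python
import numpy as np

# Hypothetical per-graph edge counts for three small graphs.
edge_counts = np.array([4, 2, 6])
# Assumed layout: a leading 0 followed by the running total.
graph_edges = np.concatenate(([0], np.cumsum(edge_counts)))

def edge_range(graph_idx):
    """Return the half-open [start, end) edge index range of one graph."""
    start, end = graph_edges[graph_idx], graph_edges[graph_idx + 1]
    return int(start), int(end)

start, end = edge_range(1)  # edges of the second graph
```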
property graph_label

Graph labels

Returns

numpy.ndarray, array of graph labels

Examples

>>> #dataset is an instance object of Dataset
>>> graph_label = dataset.graph_label
graph_node_feat(graph_idx)

Node features of a single graph.

Parameters

graph_idx (int) – index of graph.

Returns

numpy.ndarray, node features of the graph.

Examples

>>> # dataset is an instance object of Dataset
>>> graph_idx = 0
>>> graph_node_feat = dataset.graph_node_feat(graph_idx)
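
One plausible way to picture a per-graph lookup like this is as a row slice of a stacked feature matrix, bounded by cumulative node counts. The sketch below shows that pattern on synthetic data. It is an assumption about the general technique, not a description of how IMDBBinary implements the method. `node_feat_of` is a hypothetical helper.

```python
import numpy as np

# Hypothetical feature matrix: 5 nodes in total, 2 features per node.
node_feat = np.arange(10, dtype=np.float32).reshape(5, 2)
# Assumed cumulative node counts for two graphs with 3 and 2 nodes.
graph_nodes = np.array([0, 3, 5])

def node_feat_of(graph_idx):
    """Return the rows of node_feat that belong to graph graph_idx."""
    start, end = graph_nodes[graph_idx], graph_nodes[graph_idx + 1]
    return node_feat[start:end]

second = node_feat_of(1)  # rows 3 and 4 of node_feat
```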
property graph_nodes

Cumulative node count of the graphs

Returns

numpy.ndarray, array of cumulative node counts

Examples

>>> # dataset is an instance object of Dataset
>>> graph_nodes = dataset.graph_nodes
property node_feat

Node features

Returns

numpy.ndarray, array of node features

Examples

>>> #dataset is an instance object of Dataset
>>> node_feat = dataset.node_feat
property node_feat_size

Feature size of each node

Returns

int, the feature size of each node

Examples

>>> #dataset is an instance object of Dataset
>>> node_feat_size = dataset.node_feat_size
property num_classes

Number of label classes

Returns

int, the number of classes

Examples

>>> #dataset is an instance object of Dataset
>>> num_classes = dataset.num_classes
property train_graphs

IDs of the training graphs

Returns

numpy.ndarray, array of training graph IDs

Examples

>>> #dataset is an instance object of Dataset
>>> train_graphs = dataset.train_graphs
property train_mask

Mask of training nodes

Returns

numpy.ndarray, array of mask

Examples

>>> #dataset is an instance object of Dataset
>>> train_mask = dataset.train_mask
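
The relationship between an ID array and a boolean mask can be sketched as follows. The example assumes the mask has one entry per graph, which the description above does not confirm, and uses a synthetic 8/2 split that mirrors the 800/200 ratio in the statistics. All arrays are made up for illustration.

```python
import numpy as np

graph_count = 10  # hypothetical, scaled-down dataset size
train_graphs = np.arange(0, 8)   # ids of the training graphs
val_graphs = np.arange(8, 10)    # ids of the validation graphs

# Build boolean masks that are True at the listed ids.
train_mask = np.zeros(graph_count, dtype=bool)
train_mask[train_graphs] = True
val_mask = np.zeros(graph_count, dtype=bool)
val_mask[val_graphs] = True

# Recover the ids from a mask.
recovered_train = np.flatnonzero(train_mask)
```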
property val_graphs

IDs of the validation graphs

Returns

numpy.ndarray, array of validation graph IDs

Examples

>>> #dataset is an instance object of Dataset
>>> val_graphs = dataset.val_graphs
property val_mask

Mask of validation nodes

Returns

numpy.ndarray, array of mask

Examples

>>> #dataset is an instance object of Dataset
>>> val_mask = dataset.val_mask