mindspore_gl.sampling

Sampling APIs for graph data.

mindspore_gl.sampling.k_hop_subgraph(node_idx, num_hops, adj_coo, node_count, relabel_nodes=False, flow='source_to_target')[source]

K-hop sampling on HomoGraph

Parameters
  • node_idx (int, list, tuple or numpy.ndarray) – the node or nodes around which the subgraph is sampled.

  • num_hops (int) – the number of hops to sample.

  • adj_coo (numpy.ndarray) – adjacency of the input graph in COO format.

  • node_count (int) – the number of nodes.

  • relabel_nodes (bool) – whether to relabel the node indices. Default: False.

  • flow (str, optional) –

    the visit direction. Default: ‘source_to_target’.

    • ‘source_to_target’: from source node to target node.

    • ‘target_to_source’: from target node to source node.

Returns

res (dict), with 4 keys: ‘subset’, ‘adj_coo’, ‘inv’ and ‘edge_mask’, where:

  • subset (numpy.ndarray) - node indices of the sampled K-hop subgraph.

  • adj_coo (numpy.ndarray) - adjacency (COO) of the sampled K-hop subgraph.

  • inv (list) - the mapping from node indices in node_idx to their new location.

  • edge_mask (numpy.ndarray) - the edge mask indicating which edges were preserved.
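
To make the four returned keys concrete, the following is a minimal NumPy sketch of a K-hop extraction. It is not the library's implementation: the function name `k_hop_subgraph_sketch` is hypothetical, and the handling of ‘flow’ (with ‘source_to_target’, the nodes of the current frontier are treated as targets and their source neighbors are collected) is an assumption. On the graph used in the example further down it yields the same values as the documented output.

```python
import numpy as np

def k_hop_subgraph_sketch(node_idx, num_hops, adj_coo, node_count,
                          relabel_nodes=False, flow="source_to_target"):
    """Illustrative only; hypothetical helper, not part of mindspore_gl."""
    adj_coo = np.asarray(adj_coo)
    seeds = np.atleast_1d(np.asarray(node_idx, dtype=np.int64))
    # Assumption: with 'source_to_target' the frontier nodes are matched
    # against edge targets and the corresponding sources are gathered.
    if flow == "source_to_target":
        gather, frontier_side = adj_coo[0], adj_coo[1]
    elif flow == "target_to_source":
        gather, frontier_side = adj_coo[1], adj_coo[0]
    else:
        raise ValueError("flow must be 'source_to_target' or 'target_to_source'")

    collected = [seeds]
    for _ in range(num_hops):
        in_frontier = np.zeros(node_count, dtype=bool)
        in_frontier[collected[-1]] = True
        # Neighbors reached by one more hop from the current frontier.
        collected.append(gather[in_frontier[frontier_side]])

    # 'subset' is the sorted set of visited nodes; 'inv' gives the position
    # of each seed node inside 'subset'.
    subset, inverse = np.unique(np.concatenate(collected), return_inverse=True)
    inv = inverse[:seeds.size]

    # Keep an edge only if both of its endpoints were visited.
    in_subset = np.zeros(node_count, dtype=bool)
    in_subset[subset] = True
    edge_mask = in_subset[adj_coo[0]] & in_subset[adj_coo[1]]
    sub_adj = adj_coo[:, edge_mask]

    if relabel_nodes:
        # Map global node ids to 0..len(subset)-1.
        new_id = np.full(node_count, -1, dtype=np.int64)
        new_id[subset] = np.arange(subset.size)
        sub_adj = new_id[sub_adj]
    return {"subset": subset, "adj_coo": sub_adj, "inv": inv, "edge_mask": edge_mask}

coo_array = np.array([[0, 1, 1, 2, 3, 0, 3, 4, 2, 5],
                      [1, 0, 2, 1, 0, 3, 4, 3, 5, 2]])
res = k_hop_subgraph_sketch([0, 3], 2, coo_array, 6, relabel_nodes=True)
print(res["subset"], res["inv"], res["edge_mask"])
```

Node 5 is three hops away from both seeds, so the two edges between nodes 2 and 5 are masked out and every other edge is kept.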

Raises
  • TypeError – If ‘num_hops’ or ‘node_count’ is not a positive int.

  • TypeError – If ‘relabel_nodes’ is not a bool.

  • ValueError – If ‘flow’ is neither ‘source_to_target’ nor ‘target_to_source’.

Supported Platforms:

Ascend GPU

Examples

>>> import numpy as np
>>> from mindspore_gl.graph import MindHomoGraph
>>> from mindspore_gl.sampling import k_hop_subgraph
>>> graph = MindHomoGraph()
>>> coo_array = np.array([[0, 1, 1, 2, 3, 0, 3, 4, 2, 5],
...                       [1, 0, 2, 1, 0, 3, 4, 3, 5, 2]])
>>> graph.set_topo_coo(coo_array)
>>> graph.node_count = 6
>>> graph.edge_count = 10
>>> res = k_hop_subgraph([0, 3], 2, graph.adj_coo, graph.node_count,
... relabel_nodes=True)
>>> print(res)
{'subset': array([0, 1, 2, 3, 4]), 'adj_coo': array([[0, 1, 1, 2, 3, 0, 3, 4],
[1, 0, 2, 1, 0, 3, 4, 3]]), 'inv': array([0, 3]), 'edge_mask': array([ True,  True,  True,  True,  True,  True,
True,  True, False, False])}
mindspore_gl.sampling.negative_sample(positive, node, num_neg_samples, mode='undirected', re='more')[source]

Takes the full set of positive sample edges and the requested number of negative samples, and returns a negative sample edge set of that length that contains none of the positive samples. Self-loops and directed or undirected graphs can optionally be taken into account.

Parameters
  • positive (list or numpy.ndarray) – All positive sample edges.

  • node (int) – number of nodes.

  • num_neg_samples (int) – Negative sample length.

  • mode (str, optional) –

    type of operation matrix. Default: ‘undirected’.

    • undirected: undirected graph.

    • bipartite: bipartite graph.

    • other: other type graph.

  • re (str, optional) –

    type of input data. Default: ‘more’.

    • more: positive array shape \((data\_length, 2)\).

    • other: positive array shape \((2, data\_length)\).

Returns

  • array - Negative sample edge set, shape is \((num\_neg\_samples, 2)\).
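
The idea can be sketched with plain rejection sampling in NumPy. This is an illustration under stated assumptions, not the library's algorithm: the helper name `negative_sample_sketch` and its `seed` argument are hypothetical, only the undirected case with input shape \((data\_length, 2)\) is covered, and self-loops and duplicate negatives are excluded.

```python
import numpy as np

def negative_sample_sketch(positive, node, num_neg_samples, seed=None):
    """Illustrative only; hypothetical helper, not part of mindspore_gl."""
    rng = np.random.default_rng(seed)
    # Undirected: store every edge with sorted endpoints so (u, v) == (v, u).
    known = {tuple(sorted((int(u), int(v)))) for u, v in positive}
    # Number of distinct non-loop node pairs that are not positive edges.
    available = node * (node - 1) // 2 - len({e for e in known if e[0] != e[1]})
    if num_neg_samples > available:
        raise ValueError("not enough distinct negative edges available")

    negatives = []
    seen = set()
    while len(negatives) < num_neg_samples:
        u, v = (int(x) for x in rng.integers(0, node, size=2))
        if u == v:
            continue  # assumption: self-loops are never returned
        edge = (min(u, v), max(u, v))
        if edge in known or edge in seen:
            continue  # reject positive edges and repeated draws
        seen.add(edge)
        negatives.append(edge)
    return np.array(negatives, dtype=np.int64).reshape(-1, 2)

neg = negative_sample_sketch([[1, 2], [2, 3]], 4, 4, seed=0)
print(neg)
```

With 4 nodes there are 6 undirected node pairs; two of them are positive, so asking for 4 negatives returns exactly the remaining pairs, in random order.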

Raises
  • TypeError – If ‘positive’ is not a list or numpy.ndarray.

  • TypeError – If ‘node’ is not a positive int.

  • TypeError – If ‘re’ is not in ‘more’ or ‘other’.

  • ValueError – If mode is not in ‘bipartite’, ‘undirected’ or ‘other’.

Supported Platforms:

Ascend GPU

Examples

>>> from mindspore_gl.sampling import negative_sample
>>> positive = [[1,2],[2,3]]
>>> neg_len = 4
>>> neg = negative_sample(positive, 4, neg_len)
>>> print(neg)
[[0 3]
 [0 2]
 [1 3]
 [0 1]]
mindspore_gl.sampling.random_walk_unbias_on_homo(homo_graph: mindspore_gl.graph.MindHomoGraph, seeds: numpy.ndarray, walk_length: int)[source]

Unbiased random walk sampling on a homogeneous graph.

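Parameters
  • homo_graph (mindspore_gl.graph.MindHomoGraph) – the graph to sample from.

  • seeds (numpy.ndarray) – start nodes of the walks, dtype int32.

  • walk_length (int) – length of each walk; must be a positive integer.

Returns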

  • array - sampled nodes, shape \((len(seeds), walk\_length)\).
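
As a rough illustration of what an unbiased walk does, the NumPy sketch below walks over a CSR adjacency and picks the next node uniformly among the current node's neighbors. The helper name `random_walk_sketch`, its `seed` argument and the rule that a walk stays in place at a node without neighbors are assumptions, not library behavior. Each row holds the seed followed by `walk_length` steps, which is how the rows in the example output further down appear to be laid out.

```python
import numpy as np

def random_walk_sketch(indptr, indices, seeds, walk_length, seed=None):
    """Illustrative only; hypothetical helper, not part of mindspore_gl."""
    rng = np.random.default_rng(seed)
    walks = np.empty((len(seeds), walk_length + 1), dtype=np.int64)
    walks[:, 0] = seeds
    for i, start in enumerate(seeds):
        cur = int(start)
        for step in range(1, walk_length + 1):
            begin, end = int(indptr[cur]), int(indptr[cur + 1])
            if end > begin:
                # Unbiased: every neighbor is equally likely.
                cur = int(indices[rng.integers(begin, end)])
            # Assumption: a node without neighbors keeps the walk in place.
            walks[i, step] = cur
    return walks

# Directed ring 0 -> 1 -> 2 -> 0: each node has one neighbor, so the
# walk is deterministic here.
indptr = np.array([0, 1, 2, 3], dtype=np.int32)
indices = np.array([1, 2, 0], dtype=np.int32)
walks = random_walk_sketch(indptr, indices, np.array([0, 1], dtype=np.int32), 4)
print(walks)
```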

Raises
  • TypeError – If walk_length is not a positive integer.

  • TypeError – If seeds is not a numpy.ndarray of dtype int32.

Supported Platforms:

Ascend GPU

Examples

>>> import numpy as np
>>> import networkx
>>> from scipy.sparse import csr_matrix
>>> from mindspore_gl.graph import MindHomoGraph, CsrAdj
>>> from mindspore_gl.sampling.randomwalks import random_walk_unbias_on_homo
>>> node_count = 10000
>>> edge_prob = 0.1
>>> graph = networkx.generators.random_graphs.fast_gnp_random_graph(node_count, edge_prob)
>>> edge_array = np.transpose(np.array(list(graph.edges)))
>>> row = edge_array[0]
>>> col = edge_array[1]
>>> data = np.zeros(row.shape)
>>> csr_mat = csr_matrix((data, (row, col)), shape=(node_count, node_count))
>>> generated_graph = MindHomoGraph()
>>> node_dict = {idx: idx for idx in range(node_count)}
>>> edge_count = col.shape[0]
>>> edge_ids = np.array(list(range(edge_count))).astype(np.int32)
>>> generated_graph.set_topo(CsrAdj(csr_mat.indptr.astype(np.int32), csr_mat.indices.astype(np.int32)),
... node_dict, edge_ids)
>>> nodes = np.arange(0, node_count)
>>> out = random_walk_unbias_on_homo(homo_graph=generated_graph, seeds=nodes[:5].astype(np.int32),
... walk_length=10)
>>> print(out)
# results will be random because of shuffling
[[   0 9493 8272 1251 3922 4180  211 1083 4198 9981 7669]
 [   1 1585 1308 4703 1115 4989 9365 1098 1618 5987 8312]
 [   2 2352 7214 5956 2184 1573 1352 7005 2325 6211 8667]
 [   3 8723 5645 3691 4857 5501  113 4140 6666 2282 1248]
 [   4 4354 9551 5224 3156 8693  346 8899 6046 6011 5310]]
mindspore_gl.sampling.sage_sampler_on_homo(homo_graph: mindspore_gl.graph.MindHomoGraph, seeds, neighbor_nums: List[int])[source]

GraphSage sampling on MindHomoGraph.

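Parameters
  • homo_graph (mindspore_gl.graph.MindHomoGraph) – the graph to sample from.

  • seeds (numpy.ndarray) – seed nodes for neighbor sampling.

  • neighbor_nums (List[int]) – number of neighbors to sample at each hop.

Returns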

  • layered_edges_{idx} (numpy.array) - edge reindex array for hop idx.

  • layered_eids_{idx} (numpy.array) - edge reindex id array for hop idx.

  • all_nodes - sampling all nodes’ global ids.

  • seeds_idx - seeds local reindex ids.
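
The relation between the per-hop edge arrays, ‘all_nodes’ and ‘seeds_idx’ can be sketched in NumPy as follows. This is an illustration under assumptions, not the library's code: the helper name `sage_sampler_sketch` and its `seed` argument are hypothetical, the edge id arrays are omitted, and each hop is assumed to expand from the neighbors sampled in the previous hop.

```python
import numpy as np

def sage_sampler_sketch(indptr, indices, seeds, neighbor_nums, seed=None):
    """Illustrative only; hypothetical helper, not part of mindspore_gl."""
    rng = np.random.default_rng(seed)
    seeds = np.asarray(seeds, dtype=np.int64)
    frontier = np.unique(seeds)
    global_layers = []
    for fanout in neighbor_nums:
        src, dst = [], []
        for node in frontier:
            neigh = indices[indptr[node]:indptr[node + 1]]
            if neigh.size > fanout:
                # Keep at most `fanout` neighbors, sampled without replacement.
                neigh = rng.choice(neigh, size=fanout, replace=False)
            src.extend([int(node)] * len(neigh))
            dst.extend(int(n) for n in neigh)
        global_layers.append(np.array([src, dst], dtype=np.int64))
        # Assumption: the next hop expands from the nodes just sampled.
        frontier = np.unique(np.array(dst, dtype=np.int64))

    # Global ids of every node touched, sorted; local ids index into this array.
    all_nodes = np.unique(np.concatenate([seeds] + [l.ravel() for l in global_layers]))
    res = {"seeds_idx": np.searchsorted(all_nodes, seeds), "all_nodes": all_nodes}
    for hop, edges in enumerate(global_layers):
        res[f"layered_edges_{hop}"] = np.searchsorted(all_nodes, edges)
    return res

# 6-node graph with edges 0->2, 0->3, 2->5, 3->5 (nodes 1 and 4 are isolated).
indptr = np.array([0, 2, 2, 3, 4, 4, 4])
indices = np.array([2, 3, 5, 5])
out = sage_sampler_sketch(indptr, indices, np.array([0]), [2, 2])
print(out)
```

Because nodes 1 and 4 are never reached, the local ids differ from the global ones: global node 5 becomes local id 3.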

Raises
  • TypeError – If homo_graph is not a MindHomoGraph class.

  • TypeError – If seeds is not a numpy.ndarray.

  • TypeError – If neighbor_nums is not a list.

Supported Platforms:

Ascend GPU

Examples

>>> import networkx
>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> from mindspore_gl.graph import MindHomoGraph, CsrAdj
>>> from mindspore_gl.sampling.neighbor import sage_sampler_on_homo
>>> node_count = 10
>>> edge_prob = 0.3
>>> graph = networkx.generators.random_graphs.fast_gnp_random_graph(node_count, edge_prob, seed=1)
>>> edge_array = np.transpose(np.array(list(graph.edges)))
>>> row = edge_array[0]
>>> col = edge_array[1]
>>> data = np.ones(row.shape)
>>> csr_mat = csr_matrix((data, (row, col)), shape=(node_count, node_count))
>>> generated_graph = MindHomoGraph()
>>> node_dict = {idx: idx for idx in range(node_count)}
>>> edge_count = col.shape[0]
>>> edge_ids = np.array(list(range(edge_count))).astype(np.int32)
>>> generated_graph.set_topo(CsrAdj(csr_mat.indptr.astype(np.int32), csr_mat.indices.astype(np.int32)),
... node_dict, edge_ids)
>>> nodes = np.arange(0, node_count)
>>> res = sage_sampler_on_homo(homo_graph=generated_graph, seeds=nodes[:3].astype(np.int32),
... neighbor_nums=[2, 2])
>>> print(res)
{'seeds_idx': array([0, 1, 2], dtype=int32), 'all_nodes': array([0, 1, 2, 4, 5, 6, 7, 8, 9], dtype=int32),
'layered_edges_0': array([[0, 0, 1, 1, 2], [1, 3, 4, 5, 4]], dtype=int32),
'layered_eids_0': array([[0, 0, 1, 1, 2], [1, 3, 4, 5, 4]], dtype=int32),
'layered_edges_1': array([[1, 1, 3, 3, 4, 5, 5], [4, 5, 7, 8, 6, 7, 8]], dtype=int32),
'layered_eids_1': array([[1, 1, 3, 3, 4, 5, 5], [4, 5, 7, 8, 6, 7, 8]], dtype=int32)}