Proximity Methods#

Examples in this module use nx.les_miserables_graph(). Methods return scored full graphs; typical extraction uses fraction_filter() on similarity attributes. Complexity classes are provided in each function docstring.

Proximity-based edge scoring methods.

These methods score each existing edge by the topological proximity (similarity) of its endpoints. They are useful for identifying structurally embedded edges (high proximity) versus bridge edges (low proximity) and can feed into any of the filtering utilities in networkx_backbone.filters.

Local methods#

neighborhood_overlap

Raw common-neighbor count.

jaccard_backbone

Jaccard coefficient of endpoint neighborhoods.

dice_backbone

Dice / Sorensen coefficient of endpoint neighborhoods.

cosine_backbone

Cosine / Salton index of endpoint neighborhoods.

hub_promoted_index

Common neighbors normalised by the smaller degree.

hub_depressed_index

Common neighbors normalised by the larger degree.

lhn_local_index

Leicht-Holme-Newman local similarity index.

preferential_attachment_score

Product of endpoint degrees.

adamic_adar_index

Adamic-Adar index (inverse-log-degree-weighted common neighbors).

resource_allocation_index

Resource-allocation index (inverse-degree-weighted common neighbors).

Quasi-local methods#

graph_distance_proximity

Reciprocal shortest-path distance.

local_path_index

Local path index (second- and third-order path counts).

References

Lu, L. & Zhou, T. (2011). “Link prediction in complex networks: A survey.” Physica A, 390(6), 1150-1170.

Liben-Nowell, D. & Kleinberg, J. (2007). “The link-prediction problem for social networks.” JASIST, 58(7), 1019-1031.

networkx_backbone.neighborhood_overlap(G)[source]#

Score each edge by the raw neighborhood overlap of its endpoints.

For each edge (u, v), the overlap is the number of common neighbors: |N(u) & N(v)|. The score is stored as the "overlap" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph. For directed graphs, neighborhoods are defined by successors (out-neighbors).

Returns:

H – Copy of G with the "overlap" edge attribute added.

Return type:

graph

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, neighborhood_overlap
>>> G = nx.les_miserables_graph()
>>> H = neighborhood_overlap(G)
>>> backbone = fraction_filter(H, "overlap", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.jaccard_backbone(G)[source]#

Score each edge by the Jaccard similarity of its endpoint neighborhoods.

For each edge (u, v):

\[J_{uv} = \frac{|N(u) \cap N(v)|}{|N(u) \cup N(v)|}\]

The score is stored as the "jaccard" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "jaccard" edge attribute added.

Return type:

graph

References

[1]

Jaccard, P. (1901). Distribution de la flore alpine dans le bassin des Dranses et dans quelques regions voisines. Bulletin de la Societe Vaudoise des Sciences Naturelles, 37, 241-272.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, jaccard_backbone
>>> G = nx.les_miserables_graph()
>>> H = jaccard_backbone(G)
>>> backbone = fraction_filter(H, "jaccard", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.dice_backbone(G)[source]#

Score each edge by the Dice similarity of its endpoint neighborhoods.

For each edge (u, v):

\[D_{uv} = \frac{2 \, |N(u) \cap N(v)|}{k_u + k_v}\]

The score is stored as the "dice" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "dice" edge attribute added.

Return type:

graph

References

[1]

Dice, L. R. (1945). Measures of the amount of ecologic association between species. Ecology, 26(3), 297-302.

Examples

>>> import networkx as nx
>>> from networkx_backbone import dice_backbone, fraction_filter
>>> G = nx.les_miserables_graph()
>>> H = dice_backbone(G)
>>> backbone = fraction_filter(H, "dice", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.cosine_backbone(G)[source]#

Score each edge by the cosine similarity of its endpoint neighborhoods.

For each edge (u, v):

\[C_{uv} = \frac{|N(u) \cap N(v)|}{\sqrt{k_u \, k_v}}\]

The score is stored as the "cosine" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "cosine" edge attribute added.

Return type:

graph

References

[1]

Salton, G. & McGill, M. J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill.

Examples

>>> import networkx as nx
>>> from networkx_backbone import cosine_backbone, fraction_filter
>>> G = nx.les_miserables_graph()
>>> H = cosine_backbone(G)
>>> backbone = fraction_filter(H, "cosine", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.hub_promoted_index(G)[source]#

Score each edge by the Hub Promoted Index of its endpoint neighborhoods.

For each edge (u, v):

\[\mathrm{HPI}_{uv} = \frac{|N(u) \cap N(v)|}{\min(k_u,\, k_v)}\]

The score is stored as the "hpi" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "hpi" edge attribute added.

Return type:

graph

References

[1]

Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabasi, A.-L. (2002). Hierarchical organization of modularity in metabolic networks. Science, 297(5586), 1551-1555.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, hub_promoted_index
>>> G = nx.les_miserables_graph()
>>> H = hub_promoted_index(G)
>>> backbone = fraction_filter(H, "hpi", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.hub_depressed_index(G)[source]#

Score each edge by the Hub Depressed Index of its endpoint neighborhoods.

For each edge (u, v):

\[\mathrm{HDI}_{uv} = \frac{|N(u) \cap N(v)|}{\max(k_u,\, k_v)}\]

The score is stored as the "hdi" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "hdi" edge attribute added.

Return type:

graph

References

[1]

Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabasi, A.-L. (2002). Hierarchical organization of modularity in metabolic networks. Science, 297(5586), 1551-1555.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, hub_depressed_index
>>> G = nx.les_miserables_graph()
>>> H = hub_depressed_index(G)
>>> backbone = fraction_filter(H, "hdi", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.lhn_local_index(G)[source]#

Score each edge by the Leicht-Holme-Newman local similarity index.

For each edge (u, v):

\[\mathrm{LHN}_{uv} = \frac{|N(u) \cap N(v)|}{k_u \cdot k_v}\]

The score is stored as the "lhn_local" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "lhn_local" edge attribute added.

Return type:

graph

References

[1]

Leicht, E. A., Holme, P. & Newman, M. E. J. (2006). Vertex similarity in networks. Physical Review E, 73(2), 026120.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, lhn_local_index
>>> G = nx.les_miserables_graph()
>>> H = lhn_local_index(G)
>>> backbone = fraction_filter(H, "lhn_local", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.preferential_attachment_score(G)[source]#

Score each edge by the preferential attachment index.

For each edge (u, v):

\[\mathrm{PA}_{uv} = k_u \cdot k_v\]

The score is stored as the "pa" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "pa" edge attribute added.

Return type:

graph

References

[1]

Barabasi, A.-L. & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509-512.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, preferential_attachment_score
>>> G = nx.les_miserables_graph()
>>> H = preferential_attachment_score(G)
>>> backbone = fraction_filter(H, "pa", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(n + m) time and O(n + m) space.

networkx_backbone.adamic_adar_index(G)[source]#

Score each edge by the Adamic-Adar index.

For each edge (u, v):

\[\mathrm{AA}_{uv} = \sum_{z \,\in\, N(u) \cap N(v)} \frac{1}{\log k_z}\]

The score is stored as the "adamic_adar" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "adamic_adar" edge attribute added.

Return type:

graph

References

[1]

Adamic, L. A. & Adar, E. (2003). Friends and neighbors on the Web. Social Networks, 25(3), 211-230.

Examples

>>> import networkx as nx
>>> from networkx_backbone import adamic_adar_index, fraction_filter
>>> G = nx.les_miserables_graph()
>>> H = adamic_adar_index(G)
>>> backbone = fraction_filter(H, "adamic_adar", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.resource_allocation_index(G)[source]#

Score each edge by the resource-allocation index.

For each edge (u, v):

\[\mathrm{RA}_{uv} = \sum_{z \,\in\, N(u) \cap N(v)} \frac{1}{k_z}\]

The score is stored as the "resource_allocation" edge attribute.

Parameters:

G (networkx.Graph or networkx.DiGraph) – Input graph.

Returns:

H – Copy of G with the "resource_allocation" edge attribute added.

Return type:

graph

References

[1]

Zhou, T., Lu, L. & Zhang, Y.-C. (2009). Predicting missing links via local information. European Physical Journal B, 71(4), 623-630.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, resource_allocation_index
>>> G = nx.les_miserables_graph()
>>> H = resource_allocation_index(G)
>>> backbone = fraction_filter(H, "resource_allocation", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(mn) time and O(n + m) space.

networkx_backbone.graph_distance_proximity(G, all_pairs=False)[source]#

Score edges by reciprocal shortest-path distance.

By default, scores are computed for existing edges only. With all_pairs=True, scores are computed for every node pair (all unordered pairs in undirected graphs; all ordered pairs in directed graphs, excluding self-pairs).

Parameters:
  • G (networkx.Graph or networkx.DiGraph) – Input graph.

  • all_pairs (bool, optional (default=False)) – If False, annotate only existing edges in a copy of G. If True, return a graph containing all node pairs as edges and annotate each with "dist".

Returns:

H – Graph with the "dist" edge attribute added.

Return type:

graph

References

[1]

Liben-Nowell, D. & Kleinberg, J. (2007). The link-prediction problem for social networks. JASIST, 58(7), 1019-1031.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, graph_distance_proximity
>>> G = nx.les_miserables_graph()
>>> H = graph_distance_proximity(G)
>>> backbone = fraction_filter(H, "dist", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True
>>> H_all = graph_distance_proximity(G, all_pairs=True)
>>> H_all.number_of_edges() >= H.number_of_edges()
True

Computational complexity estimate is O(n(n + m)) time and O(n^2 + m) space.

networkx_backbone.local_path_index(G, epsilon=0.01)[source]#

Score each edge by the local path index.

For each edge (u, v):

\[\mathrm{LP}_{uv} = (A^2)_{uv} + \varepsilon \, (A^3)_{uv}\]

where A is the adjacency matrix. The score is stored as the "lp" edge attribute.

Parameters:
Returns:

H – Copy of G with the "lp" edge attribute added.

Return type:

graph

References

[1]

Lu, L., Jin, C.-H. & Zhou, T. (2009). Similarity index based on local random walk and superposed with the AA index. EPL (Europhysics Letters), 89(1), 18001.

Examples

>>> import networkx as nx
>>> from networkx_backbone import fraction_filter, local_path_index
>>> G = nx.les_miserables_graph()
>>> H = local_path_index(G)
>>> backbone = fraction_filter(H, "lp", 0.3, ascending=False)
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(n^3) time and O(n^2) space.