Unweighted Graph Sparsification#
This tutorial demonstrates methods for sparsifying unweighted graphs using the
unweighted module.
Setup#
import networkx as nx
import networkx_backbone as nb
G_full = nx.les_miserables_graph()
G = nx.Graph()
G.add_nodes_from(G_full.nodes(data=True))
G.add_edges_from(G_full.edges())
print(f"Original: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges")
The sparsify pipeline#
The sparsify() function implements a four-step pipeline:
Score: Assign a score to each edge based on a chosen metric
Normalize: Optionally rank-transform scores within each node’s neighborhood
Filter: Select edges based on the normalized scores
Connect (optional): Add a union of maximum spanning trees to ensure connectivity
Edge scoring methods#
The escore parameter controls how edges are scored:
"jaccard": Jaccard similarity of endpoint neighborhoods (default)"degree": Sum of endpoint degrees"triangles": Number of triangles containing the edge"quadrangles": Number of quadrangles containing the edge"random": Random scores (baseline)
# Jaccard scoring (default)
scored_jaccard = nb.sparsify(G, escore="jaccard", s=0.5)
backbone_jaccard = nb.boolean_filter(scored_jaccard, "sparsify_keep")
print(f"Jaccard: {backbone_jaccard.number_of_edges()} edges")
# Degree scoring
scored_degree = nb.sparsify(G, escore="degree", s=0.5)
backbone_degree = nb.boolean_filter(scored_degree, "sparsify_keep")
print(f"Degree: {backbone_degree.number_of_edges()} edges")
# Triangle scoring
scored_tri = nb.sparsify(G, escore="triangles", s=0.5)
backbone_tri = nb.boolean_filter(scored_tri, "sparsify_keep")
print(f"Triangles: {backbone_tri.number_of_edges()} edges")
Normalization#
The normalize parameter controls normalization:
"rank": Rank-transform scores within each node’s neighborhood (default)None: Skip normalization
scored_ranked = nb.sparsify(G, normalize="rank", s=0.5)
backbone_ranked = nb.boolean_filter(scored_ranked, "sparsify_keep")
scored_raw = nb.sparsify(G, normalize=None, filter="threshold", s=0.5)
backbone_raw = nb.boolean_filter(scored_raw, "sparsify_keep")
Filtering#
The filter parameter controls how edges are selected:
"degree": Keep the topceil(d^s)edges per node, wheredis the node’s degree andscontrols sparsity (default)"threshold": Keep edges with score >=s
The s parameter controls the level of sparsification. For degree filtering,
lower values produce sparser backbones:
# More aggressive (sparser)
sparse = nb.boolean_filter(nb.sparsify(G, s=0.3), "sparsify_keep")
print(f"s=0.3: {sparse.number_of_edges()} edges")
# Less aggressive (denser)
dense = nb.boolean_filter(nb.sparsify(G, s=0.7), "sparsify_keep")
print(f"s=0.7: {dense.number_of_edges()} edges")
UMST connectivity guarantee#
Set umst=True to add a union of maximum spanning trees, guaranteeing
that the backbone is connected (if the original graph is connected):
backbone = nb.boolean_filter(nb.sparsify(G, s=0.3, umst=True), "sparsify_keep")
print(f"Connected: {nx.is_connected(backbone)}")
Convenience wrappers#
Two pre-configured wrappers are available:
LSpar (Satuluri et al., 2011): Uses Jaccard scoring, rank normalization, and degree filtering:
backbone = nb.boolean_filter(nb.lspar(G, s=0.5), "sparsify_keep")
print(f"LSpar: {backbone.number_of_edges()} edges")
Local degree (Hamann et al., 2016): Uses degree scoring, rank normalization, and degree filtering:
backbone = nb.boolean_filter(nb.local_degree(G, s=0.3), "sparsify_keep")
print(f"Local degree: {backbone.number_of_edges()} edges")
Comparing unweighted methods#
backbones = {
"lspar_0.3": nb.boolean_filter(nb.lspar(G, s=0.3), "sparsify_keep"),
"lspar_0.5": nb.boolean_filter(nb.lspar(G, s=0.5), "sparsify_keep"),
"local_degree_0.3": nb.boolean_filter(nb.local_degree(G, s=0.3), "sparsify_keep"),
"local_degree_0.5": nb.boolean_filter(nb.local_degree(G, s=0.5), "sparsify_keep"),
}
for name, bb in backbones.items():
ef = nb.edge_fraction(G, bb)
reach = nb.reachability(bb)
print(f"{name:20s}: {bb.number_of_edges()} edges ({ef:.1%}), reachability={reach:.1%}")