Concepts

This page provides a conceptual overview of backbone extraction and how the methods in networkx-backbone are organized.

What is backbone extraction?

Real-world networks are often dense and noisy. Backbone extraction identifies the most important substructure of a network by removing edges that are redundant, statistically insignificant, or structurally unimportant. The result is a sparser graph that preserves the essential structure of the original.

Taxonomy of methods

The 65 functions in networkx-backbone are organized into nine modules based on the approach they take. The method taxonomy aligns with the categories used in netbone (Yassin et al., 2023; https://gitlab.liris.cnrs.fr/coregraphie/netbone).

Statistical methods

The statistical module provides six methods that test whether each edge’s weight is statistically significant under a null model. These methods produce a p-value or z-score for each edge.

disparity_filter() – uniform null model (Serrano et al., 2009)
noise_corrected_filter() – binomial null model (Coscia & Neffke, 2017)
marginal_likelihood_filter() – binomial null considering both endpoints (Dianati, 2016)
ecm_filter() – maximum-entropy null model (Gemmetto et al., 2017)
lans_filter() – nonparametric empirical CDF (Foti et al., 2011)
multiple_linkage_analysis() – local linkage significance (Van Nuffel et al., 2010; Yassin et al., 2023)
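
To make the idea of an edge-level p-value concrete, the following is a minimal pure-Python sketch of the disparity filter's test as described by Serrano et al. (2009): for a node with degree k and strength s, an edge of weight w receives the p-value (1 - w/s)^(k-1). This is not the networkx-backbone implementation. The function name `disparity_pvalues`, the toy edge list, and the choice to give each undirected edge the smaller of its two endpoint p-values are illustrative assumptions.

```python
from collections import defaultdict


def disparity_pvalues(edges):
    """Return {(u, v): p-value} for a list of undirected (u, v, weight) edges.

    Hypothetical helper for illustration; weights must be positive.
    """
    strength = defaultdict(float)  # sum of incident weights per node
    degree = defaultdict(int)      # number of incident edges per node
    for u, v, w in edges:
        strength[u] += w
        strength[v] += w
        degree[u] += 1
        degree[v] += 1

    def endpoint_pvalue(node, w):
        k = degree[node]
        if k <= 1:
            return 1.0  # a node's only edge cannot stand out under a uniform null
        return (1.0 - w / strength[node]) ** (k - 1)

    # Assumption: an undirected edge takes the smaller of its two endpoint p-values.
    return {
        (u, v): min(endpoint_pvalue(u, w), endpoint_pvalue(v, w))
        for u, v, w in edges
    }


toy_edges = [("a", "b", 10.0), ("a", "c", 1.0), ("a", "d", 1.0), ("b", "c", 1.0)]
pvalues = disparity_pvalues(toy_edges)
```

In this toy graph the heavy edge a-b carries most of node a's strength, so it is the only edge whose p-value falls below 0.05.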

Structural methods

The structural module provides fourteen methods that use topological properties of the network directly, without hypothesis testing.

Simple filters: global_threshold_filter(), strongest_n_ties(), global_sparsification()
Linkage/centrality filters: primary_linkage_analysis(), edge_betweenness_filter(), node_degree_filter()
Shortest-path methods: high_salience_skeleton(), metric_backbone(), ultrametric_backbone()
Normalization: doubly_stochastic_filter()
Index-based: h_backbone()
Community-based: modularity_backbone()
Constrained: planar_maximally_filtered_graph(), maximum_spanning_tree_backbone()

Proximity methods

The proximity module provides twelve methods that score each edge based on how similar the neighborhoods of its endpoints are. High-scoring edges are structurally embedded within communities, while low-scoring edges tend to be bridges.

Methods include jaccard_backbone(), dice_backbone(), cosine_backbone(), adamic_adar_index(), resource_allocation_index(), and more.
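
As a rough illustration of neighborhood-based scoring, the sketch below computes the Jaccard similarity of the endpoints' neighbor sets for every edge of a small graph. It uses plain dictionaries rather than the library, and `jaccard_edge_scores` is a hypothetical helper. Whether jaccard_backbone() uses open neighborhoods in exactly this way is an assumption.

```python
def jaccard_edge_scores(adjacency):
    """Score each undirected edge by the Jaccard similarity of its endpoints' neighborhoods.

    adjacency: dict mapping node -> set of neighbors (must be symmetric).
    """
    scores = {}
    for u, neighbors in adjacency.items():
        for v in neighbors:
            if (v, u) in scores:
                continue  # this undirected edge was already scored from the other side
            shared = adjacency[u] & adjacency[v]
            union = adjacency[u] | adjacency[v]
            scores[(u, v)] = len(shared) / len(union) if union else 0.0
    return scores


# Two triangles (a-b-c and d-e-f) joined by the bridge c-d.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
scores = jaccard_edge_scores(adjacency)
```

The bridge c-d has no common neighbors and receives the lowest score, while the edges inside each triangle score higher.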

Hybrid methods

The hybrid module contains glab_filter(), which combines global betweenness information with local degree information (Zhang et al., 2014).

Bipartite methods

The bipartite module provides methods for extracting significant edges from bipartite graph projections:

simple_projection(), hyper_projection(), probs_projection(), ycn_projection() – weighted projection schemes (Coscia & Neffke, 2017)
sdsm() – Stochastic Degree Sequence Model (analytical; Neal, 2014)
fdsm() – Fixed Degree Sequence Model (Monte Carlo; Neal et al., 2021)
fixedfill(), fixedrow(), fixedcol() – fixed null-model variants
backbone_from_projection() / backbone() – high-level wrappers
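
The starting point for these methods is a weighted projection of the bipartite graph. A minimal sketch of the simplest scheme, co-occurrence counting, is shown below: two agents are linked with a weight equal to the number of artifacts they share. The helper name `co_occurrence_projection` and the toy data are hypothetical, and it is an assumption that simple_projection() computes exactly this quantity.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_projection(memberships):
    """Project a bipartite graph onto its agent side.

    memberships: dict mapping each artifact to the set of agents attached to it.
    Returns a Counter {(agent_1, agent_2): number of shared artifacts}, with
    each pair stored in sorted order.
    """
    weights = Counter()
    for agents in memberships.values():
        # Every pair of agents attached to the same artifact co-occurs once.
        for u, v in combinations(sorted(agents), 2):
            weights[(u, v)] += 1
    return weights


memberships = {
    "paper1": {"ann", "bob", "cat"},
    "paper2": {"ann", "bob"},
    "paper3": {"bob", "dan"},
}
projection = co_occurrence_projection(memberships)
```

Null-model methods such as sdsm() and fdsm() then ask whether a projected weight is larger than expected given the degrees of the bipartite graph, rather than taking the raw counts at face value.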

Unweighted methods

The unweighted module provides sparsification methods for unweighted graphs using a four-step pipeline of scoring, normalization, filtering, and optional connectivity restoration:

sparsify() – generic framework
lspar() – Local Sparsification (Satuluri et al., 2011)
local_degree() – Degree-based (Hamann et al., 2016)
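
The pipeline can be sketched in a few lines of plain Python. The example below follows the Local Degree idea as commonly described (each node keeps its edges to its highest-degree neighbors, with a quota of floor(deg^alpha)); it is a simplified illustration, not the library's local_degree(). The helper name `local_degree_sketch`, the `alpha` default, the tie-breaking rule, and the minimum quota of one edge per node are assumptions. The optional connectivity-restoration step is omitted.

```python
import math


def local_degree_sketch(adjacency, alpha=0.5):
    """Keep, for every node, its edges to the highest-degree neighbors.

    adjacency: dict mapping node -> set of neighbors (symmetric, unweighted).
    alpha: exponent in [0, 1]; each node keeps floor(deg ** alpha) edges (at least 1).
    Returns a set of frozenset({u, v}) edges.
    """
    kept = set()
    for u, neighbors in adjacency.items():
        if not neighbors:
            continue
        # Scoring: a neighbor's score is its degree.
        # Normalization: ranking within u's neighborhood makes scores
        # comparable between hubs and low-degree nodes.
        ranked = sorted(neighbors, key=lambda v: (-len(adjacency[v]), str(v)))
        # Filtering: keep only the top-ranked neighbors of u.
        quota = max(1, math.floor(len(neighbors) ** alpha))
        for v in ranked[:quota]:
            kept.add(frozenset((u, v)))
    return kept


# Hub "h" with four neighbors, plus one extra edge a-b between two of them.
adjacency = {
    "h": {"a", "b", "c", "d"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h"},
    "d": {"h"},
}
backbone_edges = local_degree_sketch(adjacency, alpha=0.5)
```

Here every node selects the hub as its top-ranked neighbor, so the four hub edges survive and the peripheral edge a-b is dropped. An edge is kept if either endpoint selects it.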

The score-then-filter pattern

Most backbone methods follow a two-step pattern:

Score: Apply a backbone method to annotate each edge with a score (p-value, similarity, salience, etc.). The method returns a copy of the graph with the score as an edge attribute.
Filter: Use a function from the filters module to extract the backbone by selecting edges based on their score.

For example:

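import networkx as nx
import networkx_backbone as nb  # import name assumed from the "nb." prefix

G = nx.les_miserables_graph()  # example weighted graph shipped with NetworkX

# Step 1: Score edges
H = nb.disparity_filter(G)  # adds "disparity_pvalue" attribute

# Step 2: Filter edges
backbone = nb.threshold_filter(H, "disparity_pvalue", 0.05)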

Many structural and unweighted methods annotate edges with a boolean flag (for example metric_keep or sparsify_keep) rather than a numeric score. They still follow the score-then-filter pattern; the filter step simply selects the flagged edges:

scored = nb.metric_backbone(G)               # adds "metric_keep"
backbone = nb.boolean_filter(scored, "metric_keep")

Evaluation

The measures module provides metrics for evaluating how well a backbone preserves the properties of the original graph:

Fraction metrics: node_fraction(), edge_fraction(), weight_fraction()
Connectivity: reachability()
Distribution preservation: ks_degree(), ks_weight()
Comparison: compare_backbones()
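
To show what the fraction metrics measure, the sketch below computes them by hand on edge dictionaries. The definitions used here (share of nodes that keep at least one edge, share of edges retained, share of total weight retained) are assumptions about what node_fraction(), edge_fraction(), and weight_fraction() report; the library functions may differ, for example in how they treat isolated nodes. `fraction_metrics` is a hypothetical helper.

```python
def fraction_metrics(original_edges, backbone_edges):
    """Compare a backbone with its original graph.

    Both arguments are dicts {(u, v): weight}; the backbone's keys are
    expected to be a subset of the original's keys.
    """
    def nodes(edges):
        # Collect every endpoint that appears in at least one edge.
        return {node for edge in edges for node in edge}

    return {
        "node_fraction": len(nodes(backbone_edges)) / len(nodes(original_edges)),
        "edge_fraction": len(backbone_edges) / len(original_edges),
        "weight_fraction": sum(backbone_edges.values()) / sum(original_edges.values()),
    }


original = {("a", "b"): 10.0, ("a", "c"): 1.0, ("a", "d"): 1.0, ("b", "c"): 4.0}
backbone = {("a", "b"): 10.0, ("b", "c"): 4.0}
metrics = fraction_metrics(original, backbone)
```

In this example the backbone keeps half of the edges but most of the total weight, at the cost of dropping node d.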