Concepts

This page provides a conceptual overview of backbone extraction and how the methods in networkx-backbone are organized.

What is backbone extraction?

Real-world networks are often dense and noisy. Backbone extraction identifies the most important substructure of a network by removing edges that are redundant, statistically insignificant, or structurally unimportant. The result is a sparser graph that preserves the essential structure of the original.

Taxonomy of methods

The 65 functions in networkx-backbone are organized into nine modules based on the approach they take. The method taxonomy aligns with the categories used in netbone (Yassin et al., 2023; https://gitlab.liris.cnrs.fr/coregraphie/netbone).

Statistical methods

The statistical module provides six methods that test whether each edge’s weight is statistically significant under a null model. These methods produce a p-value or z-score for each edge.
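
To make the idea of a per-edge p-value concrete, here is a minimal pure-Python sketch of the disparity filter's null model, in which an edge of weight w at a node with degree k and strength s gets the p-value (1 - w/s)^(k-1). It is not the library's implementation: the helper name disparity_pvalues is hypothetical, and taking the smaller of the two endpoint p-values is an assumed convention.

```python
def disparity_pvalues(edges):
    """Return {(u, v): p} for an undirected weighted edge list.

    Hypothetical helper for illustration only. For an endpoint with
    degree k and strength s, an edge of weight w gets
    p = (1 - w / s) ** (k - 1); the smaller endpoint p-value is kept
    (an assumed convention).
    """
    strength, degree = {}, {}
    for u, v, w in edges:
        for node in (u, v):
            strength[node] = strength.get(node, 0.0) + w
            degree[node] = degree.get(node, 0) + 1

    pvalues = {}
    for u, v, w in edges:
        candidates = []
        for node in (u, v):
            k = degree[node]
            if k > 1:  # a degree-1 node cannot be tested under this null model
                candidates.append((1.0 - w / strength[node]) ** (k - 1))
        pvalues[(u, v)] = min(candidates) if candidates else 1.0
    return pvalues


edges = [("a", "b", 10.0), ("a", "c", 1.0), ("a", "d", 1.0), ("b", "c", 1.0)]
pvalues = disparity_pvalues(edges)

# Only the heavy a-b edge is significant at the 0.05 level.
significant = [e for e, p in pvalues.items() if p < 0.05]
```

In this toy graph the a-b edge carries most of node a's strength, so it is the only edge whose weight is unlikely under the null model.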

Structural methods

The structural module provides fourteen methods that use topological properties of the network directly, without hypothesis testing.
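
As one illustration of a purely topological criterion, the sketch below flags an edge as kept when no indirect path between its endpoints is shorter than the edge itself, which is the idea behind a metric backbone. It is a hedged, pure-Python sketch and not the library's code: the helper names are hypothetical, and edge values are assumed to be distances (lower means closer) rather than weights.

```python
import heapq


def shortest_distances(adj, source):
    """Dijkstra over an adjacency dict {node: {neighbor: distance}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, length in adj[node].items():
            nd = d + length
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def metric_keep_flags(adj):
    """Flag each undirected edge True if no indirect path is shorter.

    Hypothetical helper; assumes edge values are distances.
    """
    flags = {}
    for u in adj:
        dist = shortest_distances(adj, u)
        for v, length in adj[u].items():
            if u < v:  # visit each undirected edge once
                flags[(u, v)] = dist[v] >= length
    return flags


adj = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 3.0},
    "c": {"a": 1.0, "b": 3.0},
}
flags = metric_keep_flags(adj)
```

Here the b-c edge (distance 3) is flagged False because the path through a has total distance 2.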

Proximity methods

The proximity module provides twelve methods that score each edge based on how similar the neighborhoods of its endpoints are. High-scoring edges are structurally embedded within communities, while low-scoring edges tend to be bridges.

Methods include jaccard_backbone(), dice_backbone(), cosine_backbone(), adamic_adar_index(), resource_allocation_index(), and more.
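
The neighborhood-similarity idea can be sketched with a Jaccard score computed by hand. This is an illustrative sketch, not the code behind jaccard_backbone(): the helper name is hypothetical, and excluding the two endpoints from each other's neighborhoods is an assumed convention.

```python
def jaccard_edge_scores(adj):
    """Score each undirected edge by the Jaccard similarity of the
    neighborhoods of its endpoints (hypothetical helper).

    adj maps each node to the set of its neighbors.
    """
    scores = {}
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                nu = adj[u] - {v}  # assumed: endpoints are excluded
                nv = adj[v] - {u}
                union = nu | nv
                scores[(u, v)] = len(nu & nv) / len(union) if union else 0.0
    return scores


# Two triangles (a-b-c and d-e-f) joined by the bridge c-d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
scores = jaccard_edge_scores(adj)
```

The bridge c-d scores 0.0 because its endpoints share no neighbors, while edges inside each triangle score 0.5 or higher.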

Hybrid methods

The hybrid module contains glab_filter(), which combines global betweenness information with local degree information (Zhang et al., 2014).

Bipartite methods

The bipartite module provides methods for extracting significant edges from bipartite graph projections.

Unweighted methods

The unweighted module provides sparsification methods for unweighted graphs using a four-step pipeline of scoring, normalization, filtering, and optional connectivity restoration.
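
The four steps can be sketched in plain Python as follows. This is a sketch under stated assumptions rather than the module's actual pipeline: the function name sparsify and its parameters are hypothetical, step 1 is represented by precomputed scores, normalization is assumed to be min-max scaling, and connectivity is restored by re-adding the best dropped edges that join separate components.

```python
def sparsify(edges, scores, threshold=0.5, restore_connectivity=True):
    """Normalize, filter, and optionally reconnect (hypothetical helper).

    edges is a list of (u, v) pairs; scores maps each pair to its
    step-1 score (for example, a neighborhood similarity).
    """
    # Step 2: min-max normalize the scores to [0, 1] (assumed scheme).
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    norm = {e: (scores[e] - lo) / span for e in edges}

    # Step 3: keep edges whose normalized score reaches the threshold.
    kept = {e for e in edges if norm[e] >= threshold}

    # Step 4 (optional): re-add the best dropped edges that merge components.
    if restore_connectivity:
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for u, v in kept:
            parent[find(u)] = find(v)
        dropped = sorted(
            (e for e in edges if e not in kept),
            key=lambda e: norm[e],
            reverse=True,
        )
        for u, v in dropped:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                kept.add((u, v))
    return kept


# Step 1 (assumed done): scores for two triangles joined by the bridge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
scores = {("a", "b"): 1.0, ("a", "c"): 0.5, ("b", "c"): 0.5, ("c", "d"): 0.0,
          ("d", "e"): 0.5, ("d", "f"): 0.5, ("e", "f"): 1.0}

filtered_only = sparsify(edges, scores, threshold=0.75, restore_connectivity=False)
backbone = sparsify(edges, scores, threshold=0.75)
```

Without the restoration step only the two highest-scoring edges survive and the graph falls apart. With it, the low-scoring bridge c-d is re-added because it is the only edge that reconnects the two triangles.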

The score-then-filter pattern

Most backbone methods follow a two-step pattern:

  1. Score: Apply a backbone method to annotate each edge with a score (p-value, similarity, salience, etc.). The method returns a copy of the graph with the score as an edge attribute.

  2. Filter: Use a function from the filters module to extract the backbone by selecting edges based on their score.

For example:

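import networkx as nx
import networkx_backbone as nb  # import name assumed from the package name

G = nx.les_miserables_graph()  # any weighted graph works here

# Step 1: Score edges
H = nb.disparity_filter(G)  # adds "disparity_pvalue" attribute

# Step 2: Filter edges
backbone = nb.threshold_filter(H, "disparity_pvalue", 0.05)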

Many structural and unweighted methods expose boolean edge flags (for example, metric_keep or sparsify_keep) instead of numeric scores. These methods still follow the score-then-filter pattern:

scored = nb.metric_backbone(G)               # adds "metric_keep"
backbone = nb.boolean_filter(scored, "metric_keep")
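
To show what the filter step does conceptually, the following pure-Python sketch applies both kinds of filter to a dictionary of edge attributes. The helpers keep_below and keep_flagged are hypothetical stand-ins, not the library's threshold_filter() or boolean_filter(). Keeping scores strictly below the threshold is an assumption that suits p-values; the source does not state which side of the threshold the library keeps.

```python
def keep_below(edge_attrs, key, threshold):
    """Keep edges whose attribute `key` is strictly below `threshold`."""
    return {e: a for e, a in edge_attrs.items() if a[key] < threshold}


def keep_flagged(edge_attrs, key):
    """Keep edges whose boolean attribute `key` is True."""
    return {e: a for e, a in edge_attrs.items() if a[key]}


# Illustrative scored edges; attribute names are made up for this sketch.
scored = {
    ("a", "b"): {"pvalue": 0.01, "keep": True},
    ("a", "c"): {"pvalue": 0.40, "keep": False},
    ("b", "c"): {"pvalue": 0.03, "keep": True},
}

significant = keep_below(scored, "pvalue", 0.05)
flagged = keep_flagged(scored, "keep")
```

Separating scoring from filtering means the same scored graph can be filtered at several thresholds without recomputing the scores.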

Evaluation

The measures module provides metrics for evaluating how well a backbone preserves the properties of the original graph.
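
As a rough illustration of this kind of comparison, the sketch below computes three simple retention ratios between an original edge set and its backbone. The function name and the metric names are hypothetical and are not taken from the measures module.

```python
def retention_summary(original, backbone):
    """Compare a backbone with its original graph (hypothetical helper).

    Both arguments map undirected edges (u, v) to weights; the backbone
    is assumed to be a subset of the original edges.
    """
    def nodes(edges):
        return {n for e in edges for n in e}

    return {
        "edge_fraction": len(backbone) / len(original),
        "node_fraction": len(nodes(backbone)) / len(nodes(original)),
        "weight_fraction": sum(backbone.values()) / sum(original.values()),
    }


original = {("a", "b"): 10.0, ("a", "c"): 1.0, ("a", "d"): 1.0, ("b", "c"): 4.0}
backbone = {("a", "b"): 10.0, ("b", "c"): 4.0}
summary = retention_summary(original, backbone)
```

In this example the backbone keeps half of the edges, three of the four nodes, and 87.5% of the total weight, the trade-off such metrics are meant to expose.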