Statistical Methods#

Examples in this module use nx.les_miserables_graph(). All functions return scored full graphs; apply threshold_filter() or boolean_filter() as the second step. Complexity classes are provided in each function docstring.

Statistical backbone extraction methods.

These methods evaluate edge significance using hypothesis testing against a null model. Each method computes a p-value (or score) for every edge, which can then be filtered using functions in networkx_backbone.filters.

networkx_backbone.statistical.disparity_filter()[source]#

Serrano et al. (2009) – uniform distribution null model.

networkx_backbone.statistical.noise_corrected_filter()[source]#

Coscia & Neffke (2017) – Bayesian binomial framework.

networkx_backbone.statistical.marginal_likelihood_filter()[source]#

Dianati (2016) – binomial null considering both endpoints.

networkx_backbone.statistical.ecm_filter()[source]#

Gemmetto et al. (2017) – enhanced configuration model.

networkx_backbone.statistical.lans_filter()[source]#

Foti et al. (2011) – nonparametric, empirical CDF-based.

networkx_backbone.statistical.multiple_linkage_analysis()[source]#

Local linkage significance backbone extraction.

networkx_backbone.disparity_filter(G, weight='weight')[source]#

Compute disparity filter p-values for each edge.

The disparity filter [1] tests whether an edge’s weight is statistically significant compared to a null model where each node’s total strength is uniformly distributed across its edges.

For each node u with degree k_u and strength s_u, the normalised weight of edge (u, v) is p_uv = w_uv / s_u. Under the null the probability of observing a normalised weight >= p_uv is:

alpha_uv = (1.0 - p_uv) ** (k_u - 1)

For an undirected edge the p-value is the minimum of the values computed from each endpoint. For a directed edge the p-value is computed from the source node only (out-strength, out-degree).

Parameters:
  • G (networkx.Graph or networkx.DiGraph) – A NetworkX graph.

  • weight (string, optional (default="weight")) – Edge attribute key for weights. All weights must be positive.

Returns:

H – A copy of G (same type) with "disparity_pvalue" added as an edge attribute.

Return type:

graph

Raises:

NetworkXError – If any edge has a non-positive or missing weight.

References

[1]

Serrano, M. A., Boguna, M., & Vespignani, A. (2009). Extracting the multiscale backbone of complex weighted networks. PNAS, 106(16), 6483-6488.

Examples

>>> import networkx as nx
>>> from networkx_backbone import disparity_filter, threshold_filter
>>> G = nx.les_miserables_graph()
>>> H = disparity_filter(G)
>>> backbone = threshold_filter(H, "disparity_pvalue", 0.05, mode="below")
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(n + m) time and O(n + m) space.

networkx_backbone.noise_corrected_filter(G, weight='weight')[source]#

Compute noise-corrected edge significance scores.

Uses a Bayesian framework where each edge weight is modelled as the outcome of a binomial process [1]. The expected weight of edge (u, v) given a total network weight W is:

E[w_uv] = (s_u * s_v) / W

The score measures how many standard deviations the observed weight lies above the expectation (a z-score). Higher scores indicate more significant edges.

For directed graphs, out-strength of u and in-strength of v are used.

Parameters:
  • G (networkx.Graph or networkx.DiGraph) – A NetworkX graph.

  • weight (string, optional (default="weight")) – Edge attribute key for weights. All weights must be positive.

Returns:

H – A copy of G (same type) with "nc_score" edge attribute (z-score; higher means more significant).

Return type:

graph

Raises:

NetworkXError – If any edge has a non-positive or missing weight.

References

[1]

Coscia, M. & Neffke, F. M. (2017). Network backboning with noisy data. Proc. IEEE ICDE, 425-436.

Examples

>>> import networkx as nx
>>> from networkx_backbone import noise_corrected_filter, threshold_filter
>>> G = nx.les_miserables_graph()
>>> H = noise_corrected_filter(G)
>>> backbone = threshold_filter(H, "nc_score", 2.0, mode="above")
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(n + m) time and O(n + m) space.

networkx_backbone.marginal_likelihood_filter(G, weight='weight')[source]#

Compute marginal likelihood p-values for each edge.

The marginal likelihood filter [1] considers edge weights as integer counts under a binomial null model that accounts for both endpoints’ strengths.

Parameters:
  • G (networkx.Graph or networkx.DiGraph) – A NetworkX graph. Integer weights are recommended.

  • weight (string, optional (default="weight")) – Edge attribute key for weights. All weights must be positive.

Returns:

H – A copy of G (same type) with "ml_pvalue" edge attribute.

Return type:

graph

Raises:

NetworkXError – If any edge has a non-positive or missing weight.

References

[1]

Dianati, N. (2016). Unwinding the hairball graph: Pruning algorithms for weighted complex networks. Physical Review E, 93, 012304.

Examples

>>> import networkx as nx
>>> from networkx_backbone import marginal_likelihood_filter, threshold_filter
>>> G = nx.les_miserables_graph()
>>> H = marginal_likelihood_filter(G)
>>> backbone = threshold_filter(H, "ml_pvalue", 0.05, mode="below")
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(n + m) time and O(n + m) space.

networkx_backbone.ecm_filter(G, weight='weight', max_iter=1000, tol=1e-06)[source]#

Compute ECM (Enhanced Configuration Model) p-values.

The ECM [1] is a maximum-entropy null model for weighted networks that preserves the expected degree and strength sequence. Lagrange multipliers are assigned to each node and solved iteratively.

Parameters:
  • G (networkx.Graph or networkx.DiGraph) – A NetworkX graph.

  • weight (string, optional (default="weight")) – Edge attribute key for weights. All weights must be positive.

  • max_iter (int, optional (default=1000)) – Maximum number of iterations for the fixed-point solver.

  • tol (float, optional (default=1e-6)) – Convergence tolerance for the Lagrange multipliers.

Returns:

H – A copy of G (same type) with "ecm_pvalue" edge attribute.

Return type:

graph

Raises:

NetworkXError – If any edge has a non-positive or missing weight.

References

[1]

Gemmetto, V., Cardillo, A., & Garlaschelli, D. (2017). Irreducible network backbones: unbiased graph filtering via maximum entropy. arXiv:1706.00230.

Examples

>>> import networkx as nx
>>> from networkx_backbone import ecm_filter, threshold_filter
>>> G = nx.les_miserables_graph()
>>> H = ecm_filter(G)
>>> backbone = threshold_filter(H, "ecm_pvalue", 0.05, mode="below")
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(I * n^2 + m) time and O(n + m) space.

networkx_backbone.lans_filter(G, weight='weight')[source]#

Compute LANS (Locally Adaptive Network Sparsification) p-values.

LANS [1] is a nonparametric method that makes no distributional assumptions. For each edge (u, v), it computes the fraction of edges at node u (and node v) that have weight <= w_uv — the empirical CDF value. The edge p-value is one minus the maximum of the two eCDF values (so lower p means more significant).

For directed graphs, only the source node’s out-edges are used.

Parameters:
  • G (networkx.Graph or networkx.DiGraph) – A NetworkX graph.

  • weight (string, optional (default="weight")) – Edge attribute key for weights. All weights must be positive.

Returns:

H – A copy of G (same type) with "lans_pvalue" edge attribute.

Return type:

graph

Raises:

NetworkXError – If any edge has a non-positive or missing weight.

References

[1]

Foti, N. J., Hughes, J. M., & Rockmore, D. N. (2011). Nonparametric sparsification of complex multiscale networks. PLoS ONE, 6(2), e16431.

Examples

>>> import networkx as nx
>>> from networkx_backbone import lans_filter, threshold_filter
>>> G = nx.les_miserables_graph()
>>> H = lans_filter(G)
>>> backbone = threshold_filter(H, "lans_pvalue", 0.05, mode="below")
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(m log n) time and O(n + m) space.

networkx_backbone.multiple_linkage_analysis(G, alpha=0.05, weight='weight')[source]#

Extract a backbone using Multiple Linkage Analysis (MLA).

MLA is a local-significance method that selects edges whose weights are unusually high relative to neighboring edges. This implementation uses edge-level empirical CDF p-values from lans_filter() and retains edges with p-value <= alpha.

Parameters:
  • G (networkx.Graph or networkx.DiGraph) – A NetworkX graph.

  • alpha (float, optional (default=0.05)) – Significance threshold in [0, 1].

  • weight (string, optional (default="weight")) – Edge attribute key for weights. All weights must be positive.

Returns:

H – A copy of G with "lans_pvalue", "mla_pvalue", and boolean "mla_keep" edge attributes.

Return type:

graph

Raises:
  • ValueError – If alpha is outside [0, 1].

  • NetworkXError – If any edge has a non-positive or missing weight.

References

[1]

Van Nuffel, N., Heyndrickx, C., & Wets, G. (2010). Measuring hierarchy and reciprocity in networks.

[2]

Yassin, A., Haidar, A., Cherifi, H., Seba, H., & Togni, O. (2023). An evaluation tool for backbone extraction techniques in weighted complex networks. Scientific Reports, 13, 17000.

Examples

>>> import networkx as nx
>>> from networkx_backbone import boolean_filter, multiple_linkage_analysis
>>> G = nx.les_miserables_graph()
>>> H = multiple_linkage_analysis(G, alpha=0.5)
>>> backbone = boolean_filter(H, "mla_keep")
>>> backbone.number_of_edges() <= H.number_of_edges()
True

Computational complexity estimate is O(m log n) time and O(n + m) space.