Les Miserables Benchmark
========================

This benchmark uses NetworkX's built-in ``nx.les_miserables_graph()`` and
applies a strict **score-then-filter** workflow for every method:

1. Run a scoring method that returns the full graph with edge attributes.
2. Apply a filter (`threshold_filter`, `fraction_filter`, or `boolean_filter`).
3. Report edge counts only from the filtered graph.

If the filtered graph has the same edge count as the original graph, issue a
warning and re-test the method parameters.

Setup
-----

::

    import warnings
    import networkx as nx
    import networkx_backbone as nb

    G = nx.les_miserables_graph()
    G_unweighted = nx.Graph()
    G_unweighted.add_nodes_from(G.nodes(data=True))
    G_unweighted.add_edges_from(G.edges())

    ORIGINAL_EDGES = G.number_of_edges()  # 254

    def warn_if_no_reduction(name, filtered):
        if filtered.number_of_edges() == ORIGINAL_EDGES:
            warnings.warn(
                f"{name}: filtered graph has the same number of edges as "
                f"the original ({ORIGINAL_EDGES}). Re-test and validate.",
                UserWarning,
            )

Filtered Edge Counts
--------------------

All counts below are after explicit filtering.

.. list-table::
   :header-rows: 1
   :widths: 48 26 26

   * - Function
     - Filter Applied
     - Edges After Filter
   * - :func:`~networkx_backbone.disparity_filter`
     - ``disparity_pvalue < 0.05``
     - 247
   * - :func:`~networkx_backbone.noise_corrected_filter`
     - ``nc_score >= 2.0``
     - 98
   * - :func:`~networkx_backbone.marginal_likelihood_filter`
     - ``ml_pvalue < 0.05``
     - 70
   * - :func:`~networkx_backbone.ecm_filter`
     - ``ecm_pvalue < 0.05``
     - 254
   * - :func:`~networkx_backbone.lans_filter`
     - ``lans_pvalue < 0.05``
     - 109
   * - :func:`~networkx_backbone.multiple_linkage_analysis`
     - ``mla_keep`` (boolean)
     - 109
   * - :func:`~networkx_backbone.global_threshold_filter`
     - ``global_threshold_keep`` (boolean)
     - 157
   * - :func:`~networkx_backbone.strongest_n_ties`
     - ``strongest_n_ties_keep`` (boolean)
     - 113
   * - :func:`~networkx_backbone.global_sparsification`
     - ``global_sparsification_keep`` (boolean)
     - 127
   * - :func:`~networkx_backbone.primary_linkage_analysis`
     - ``primary_linkage_keep`` (boolean)
     - 69
   * - :func:`~networkx_backbone.edge_betweenness_filter`
     - ``edge_betweenness_keep`` (boolean)
     - 127
   * - :func:`~networkx_backbone.node_degree_filter`
     - ``node_degree_keep`` (boolean)
     - 237
   * - :func:`~networkx_backbone.high_salience_skeleton`
     - ``salience >= 0.5``
     - 76
   * - :func:`~networkx_backbone.metric_backbone`
     - ``metric_keep`` (boolean)
     - 163
   * - :func:`~networkx_backbone.ultrametric_backbone`
     - ``ultrametric_keep`` (boolean)
     - 118
   * - :func:`~networkx_backbone.doubly_stochastic_filter`
     - ``ds_weight >= 0.1``
     - 109
   * - :func:`~networkx_backbone.h_backbone`
     - ``h_backbone_keep`` (boolean)
     - 22
   * - :func:`~networkx_backbone.modularity_backbone`
     - ``modularity_keep`` (boolean)
     - 72
   * - :func:`~networkx_backbone.planar_maximally_filtered_graph`
     - ``pmfg_keep`` (boolean)
     - 162
   * - :func:`~networkx_backbone.maximum_spanning_tree_backbone`
     - ``mst_keep`` (boolean)
     - 76
   * - :func:`~networkx_backbone.neighborhood_overlap`
     - top 30% by ``overlap``
     - 76
   * - :func:`~networkx_backbone.jaccard_backbone`
     - top 30% by ``jaccard``
     - 76
   * - :func:`~networkx_backbone.dice_backbone`
     - top 30% by ``dice``
     - 76
   * - :func:`~networkx_backbone.cosine_backbone`
     - top 30% by ``cosine``
     - 76
   * - :func:`~networkx_backbone.hub_promoted_index`
     - top 30% by ``hpi``
     - 76
   * - :func:`~networkx_backbone.hub_depressed_index`
     - top 30% by ``hdi``
     - 76
   * - :func:`~networkx_backbone.lhn_local_index`
     - top 30% by ``lhn_local``
     - 76
   * - :func:`~networkx_backbone.preferential_attachment_score`
     - top 30% by ``pa``
     - 76
   * - :func:`~networkx_backbone.adamic_adar_index`
     - top 30% by ``adamic_adar``
     - 76
   * - :func:`~networkx_backbone.resource_allocation_index`
     - top 30% by ``resource_allocation``
     - 76
   * - :func:`~networkx_backbone.graph_distance_proximity`
     - top 30% by ``dist``
     - 76
   * - :func:`~networkx_backbone.local_path_index`
     - top 30% by ``lp``
     - 76
   * - :func:`~networkx_backbone.glab_filter`
     - ``glab_pvalue < 0.05``
     - 5
   * - :func:`~networkx_backbone.sparsify`
     - ``sparsify_keep`` (boolean)
     - 136
   * - :func:`~networkx_backbone.lspar`
     - ``sparsify_keep`` (boolean)
     - 136
   * - :func:`~networkx_backbone.local_degree`
     - ``sparsify_keep`` (boolean)
     - 135

Reproducibility Script
----------------------

::

    import warnings
    import networkx as nx
    import networkx_backbone as nb

    G = nx.les_miserables_graph()
    G_unweighted = nx.Graph()
    G_unweighted.add_nodes_from(G.nodes(data=True))
    G_unweighted.add_edges_from(G.edges())

    original_edges = G.number_of_edges()

    methods = [
        ("disparity_filter", lambda: nb.threshold_filter(nb.disparity_filter(G), "disparity_pvalue", 0.05, mode="below")),
        ("noise_corrected_filter", lambda: nb.threshold_filter(nb.noise_corrected_filter(G), "nc_score", 2.0, mode="above")),
        ("marginal_likelihood_filter", lambda: nb.threshold_filter(nb.marginal_likelihood_filter(G), "ml_pvalue", 0.05, mode="below")),
        ("ecm_filter", lambda: nb.threshold_filter(nb.ecm_filter(G), "ecm_pvalue", 0.05, mode="below")),
        ("lans_filter", lambda: nb.threshold_filter(nb.lans_filter(G), "lans_pvalue", 0.05, mode="below")),
        ("multiple_linkage_analysis(alpha=0.05)", lambda: nb.boolean_filter(nb.multiple_linkage_analysis(G, alpha=0.05), "mla_keep")),
        ("global_threshold_filter(threshold=2)", lambda: nb.boolean_filter(nb.global_threshold_filter(G, threshold=2), "global_threshold_keep")),
        ("strongest_n_ties(n=2)", lambda: nb.boolean_filter(nb.strongest_n_ties(G, n=2), "strongest_n_ties_keep")),
        ("global_sparsification(s=0.5)", lambda: nb.boolean_filter(nb.global_sparsification(G, s=0.5), "global_sparsification_keep")),
        ("primary_linkage_analysis", lambda: nb.boolean_filter(nb.primary_linkage_analysis(G), "primary_linkage_keep")),
        ("edge_betweenness_filter(s=0.5)", lambda: nb.boolean_filter(nb.edge_betweenness_filter(G, s=0.5), "edge_betweenness_keep")),
        ("node_degree_filter(min_degree=2)", lambda: nb.boolean_filter(nb.node_degree_filter(G, min_degree=2), "node_degree_keep")),
        ("high_salience_skeleton", lambda: nb.threshold_filter(nb.high_salience_skeleton(G), "salience", 0.5, mode="above")),
        ("metric_backbone", lambda: nb.boolean_filter(nb.metric_backbone(G), "metric_keep")),
        ("ultrametric_backbone", lambda: nb.boolean_filter(nb.ultrametric_backbone(G), "ultrametric_keep")),
        ("doubly_stochastic_filter", lambda: nb.threshold_filter(nb.doubly_stochastic_filter(G), "ds_weight", 0.1, mode="above")),
        ("h_backbone", lambda: nb.boolean_filter(nb.h_backbone(G), "h_backbone_keep")),
        ("modularity_backbone", lambda: nb.boolean_filter(nb.modularity_backbone(G), "modularity_keep")),
        ("planar_maximally_filtered_graph", lambda: nb.boolean_filter(nb.planar_maximally_filtered_graph(G), "pmfg_keep")),
        ("maximum_spanning_tree_backbone", lambda: nb.boolean_filter(nb.maximum_spanning_tree_backbone(G), "mst_keep")),
        ("neighborhood_overlap", lambda: nb.fraction_filter(nb.neighborhood_overlap(G), "overlap", 0.3, ascending=False)),
        ("jaccard_backbone", lambda: nb.fraction_filter(nb.jaccard_backbone(G), "jaccard", 0.3, ascending=False)),
        ("dice_backbone", lambda: nb.fraction_filter(nb.dice_backbone(G), "dice", 0.3, ascending=False)),
        ("cosine_backbone", lambda: nb.fraction_filter(nb.cosine_backbone(G), "cosine", 0.3, ascending=False)),
        ("hub_promoted_index", lambda: nb.fraction_filter(nb.hub_promoted_index(G), "hpi", 0.3, ascending=False)),
        ("hub_depressed_index", lambda: nb.fraction_filter(nb.hub_depressed_index(G), "hdi", 0.3, ascending=False)),
        ("lhn_local_index", lambda: nb.fraction_filter(nb.lhn_local_index(G), "lhn_local", 0.3, ascending=False)),
        ("preferential_attachment_score", lambda: nb.fraction_filter(nb.preferential_attachment_score(G), "pa", 0.3, ascending=False)),
        ("adamic_adar_index", lambda: nb.fraction_filter(nb.adamic_adar_index(G), "adamic_adar", 0.3, ascending=False)),
        ("resource_allocation_index", lambda: nb.fraction_filter(nb.resource_allocation_index(G), "resource_allocation", 0.3, ascending=False)),
        ("graph_distance_proximity", lambda: nb.fraction_filter(nb.graph_distance_proximity(G), "dist", 0.3, ascending=False)),
        ("local_path_index", lambda: nb.fraction_filter(nb.local_path_index(G), "lp", 0.3, ascending=False)),
        ("glab_filter", lambda: nb.threshold_filter(nb.glab_filter(G), "glab_pvalue", 0.05, mode="below")),
        ("sparsify", lambda: nb.boolean_filter(nb.sparsify(G_unweighted), "sparsify_keep")),
        ("lspar", lambda: nb.boolean_filter(nb.lspar(G_unweighted), "sparsify_keep")),
        ("local_degree", lambda: nb.boolean_filter(nb.local_degree(G_unweighted), "sparsify_keep")),
    ]

    print("original_edges", original_edges)
    for name, fn in methods:
        filtered = fn()
        if filtered.number_of_edges() == original_edges:
            warnings.warn(
                f"{name}: filtered graph has the same number of edges as "
                f"the original ({original_edges}). Re-test and validate.",
                UserWarning,
            )
        print(name, filtered.number_of_edges())