Unweighted Graph Sparsification
================================

This tutorial demonstrates methods for sparsifying unweighted graphs using the
:mod:`~networkx_backbone.unweighted` module.

Setup
-----

::

    import networkx as nx
    import networkx_backbone as nb

    G_full = nx.les_miserables_graph()
    G = nx.Graph()
    G.add_nodes_from(G_full.nodes(data=True))
    G.add_edges_from(G_full.edges())
    print(f"Original: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges")

The sparsify pipeline
---------------------

The :func:`~networkx_backbone.sparsify` function implements a four-step pipeline:

1. **Score**: Assign a score to each edge based on a chosen metric
2. **Normalize**: Optionally rank-transform scores within each node's neighborhood
3. **Filter**: Select edges based on the normalized scores
4. **Connect** (optional): Add a union of maximum spanning trees to ensure connectivity

Edge scoring methods
^^^^^^^^^^^^^^^^^^^^

The ``escore`` parameter controls how edges are scored:

- ``"jaccard"``: Jaccard similarity of endpoint neighborhoods (default)
- ``"degree"``: Sum of endpoint degrees
- ``"triangles"``: Number of triangles containing the edge
- ``"quadrangles"``: Number of quadrangles containing the edge
- ``"random"``: Random scores (baseline)

::

    # Jaccard scoring (default)
    scored_jaccard = nb.sparsify(G, escore="jaccard", s=0.5)
    backbone_jaccard = nb.boolean_filter(scored_jaccard, "sparsify_keep")
    print(f"Jaccard: {backbone_jaccard.number_of_edges()} edges")

    # Degree scoring
    scored_degree = nb.sparsify(G, escore="degree", s=0.5)
    backbone_degree = nb.boolean_filter(scored_degree, "sparsify_keep")
    print(f"Degree: {backbone_degree.number_of_edges()} edges")

    # Triangle scoring
    scored_tri = nb.sparsify(G, escore="triangles", s=0.5)
    backbone_tri = nb.boolean_filter(scored_tri, "sparsify_keep")
    print(f"Triangles: {backbone_tri.number_of_edges()} edges")

Normalization
^^^^^^^^^^^^^

The ``normalize`` parameter controls normalization:

- ``"rank"``: Rank-transform scores within each node's neighborhood (default)
- ``None``: Skip normalization

::

    scored_ranked = nb.sparsify(G, normalize="rank", s=0.5)
    backbone_ranked = nb.boolean_filter(scored_ranked, "sparsify_keep")
    scored_raw = nb.sparsify(G, normalize=None, filter="threshold", s=0.5)
    backbone_raw = nb.boolean_filter(scored_raw, "sparsify_keep")

Filtering
^^^^^^^^^

The ``filter`` parameter controls how edges are selected:

- ``"degree"``: Keep the top ``ceil(d^s)`` edges per node, where ``d`` is the
  node's degree and ``s`` controls sparsity (default)
- ``"threshold"``: Keep edges with score >= ``s``

The ``s`` parameter controls the level of sparsification. For degree filtering,
lower values produce sparser backbones::

    # More aggressive (sparser)
    sparse = nb.boolean_filter(nb.sparsify(G, s=0.3), "sparsify_keep")
    print(f"s=0.3: {sparse.number_of_edges()} edges")

    # Less aggressive (denser)
    dense = nb.boolean_filter(nb.sparsify(G, s=0.7), "sparsify_keep")
    print(f"s=0.7: {dense.number_of_edges()} edges")

UMST connectivity guarantee
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Set ``umst=True`` to add a union of maximum spanning trees, guaranteeing
that the backbone is connected (if the original graph is connected)::

    backbone = nb.boolean_filter(nb.sparsify(G, s=0.3, umst=True), "sparsify_keep")
    print(f"Connected: {nx.is_connected(backbone)}")

Convenience wrappers
--------------------

Two pre-configured wrappers are available:

**LSpar** (Satuluri et al., 2011): Uses Jaccard scoring, rank normalization,
and degree filtering::

    backbone = nb.boolean_filter(nb.lspar(G, s=0.5), "sparsify_keep")
    print(f"LSpar: {backbone.number_of_edges()} edges")

**Local degree** (Hamann et al., 2016): Uses degree scoring, rank
normalization, and degree filtering::

    backbone = nb.boolean_filter(nb.local_degree(G, s=0.3), "sparsify_keep")
    print(f"Local degree: {backbone.number_of_edges()} edges")

Comparing unweighted methods
-----------------------------

::

    backbones = {
        "lspar_0.3": nb.boolean_filter(nb.lspar(G, s=0.3), "sparsify_keep"),
        "lspar_0.5": nb.boolean_filter(nb.lspar(G, s=0.5), "sparsify_keep"),
        "local_degree_0.3": nb.boolean_filter(nb.local_degree(G, s=0.3), "sparsify_keep"),
        "local_degree_0.5": nb.boolean_filter(nb.local_degree(G, s=0.5), "sparsify_keep"),
    }

    for name, bb in backbones.items():
        ef = nb.edge_fraction(G, bb)
        reach = nb.reachability(bb)
        print(f"{name:20s}: {bb.number_of_edges()} edges ({ef:.1%}), reachability={reach:.1%}")