Skip to content

[Model] HighlyConnectedDeletion #1022

@zhangjieqingithub

Description

@zhangjieqingithub

Motivation

Highly Connected Deletion is a graph-clustering optimization problem from the biological-network literature. It asks for deleting as few edges as possible so that the remaining graph decomposes into dense clusters.

This is not the same as:

  • ClusterDeletion, which requires every cluster to be a clique
  • generic cut / partition problems, which do not encode the "highly connected" cluster condition

A dedicated model is useful because the objective and feasibility semantics are specific, and the paper's notion explicitly allows isolated leftover vertices.

Associated rule:

Definition

Name: HighlyConnectedDeletion
Reference:

  • Falk H�cffner, Christian Komusiewicz, Adrian Liebtrau, Rolf Niedermeier, "Partitioning Biological Networks into Highly Connected Clusters with Maximum Edge Coverage," IEEE/ACM Transactions on Computational Biology and Bioinformatics 11(3):455-467, 2014. https://doi.org/10.1109/TCBB.2013.177
  • Erez Hartuv, Ron Shamir, "A clustering algorithm based on graph connectivity," Information Processing Letters 76(4-6):175-181, 2000. https://doi.org/10.1016/S0020-0190(00)00142-3

Given a simple undirected graph G = (V, E), find a minimum-cardinality edge set F subseteq E such that every connected component of G - F is either:

  • an isolated vertex, or
  • a highly connected graph on at least 3 vertices.

A graph H is highly connected if
lambda(H) > |V(H)| / 2,
equivalently,
delta(H) > |V(H)| / 2.

Paper-specific nuance adopted here:

  • isolated vertices may remain as unclustered leftovers
  • isolated edges / 2-vertex components are not valid clusters

Variables

  • Count: |E| binary variables, one per edge
  • Per-variable domain: {0,1}
  • Meaning: x_e = 1 iff edge e is deleted

A configuration is feasible iff, after deleting exactly those edges with x_e = 1, every connected component is either an isolated vertex or a highly connected graph of size at least 3.

Schema (data type)

Type name: HighlyConnectedDeletion
Variants: none initially; simple undirected graph only

Field Type Description
graph SimpleGraph The input graph G = (V, E)

Complexity

A conservative exact baseline is brute-force over all subsets of edges to delete.

This gives the worst-case complexity
O(2^num_edges * poly(num_vertices, num_edges))

so the registry complexity string should be
2^num_edges.

Literature-backed remarks:

  • the problem is NP-hard
  • the paper also gives parameterized algorithms and kernelization with respect to the number of deleted edges
  • since this optimization model has no deletion budget in the input, the clean registry complexity is the conservative brute-force bound above

References:

Extra Remark

This model is weaker than clique-based clustering:

  • every clique of size at least 3 is highly connected
  • not every highly connected graph is a clique

So HighlyConnectedDeletion should not be folded into a clique-deletion model.

Reduction Rule Crossref

How to solve

Example Instance

Let

  • V = {0,1,2,3}
  • E = {{0,1}, {0,2}, {1,2}, {2,3}}

This is a triangle on {0,1,2} with one leaf vertex 3 attached to 2.

Expected Outcome

An optimal solution is to delete the single edge {2,3}.

Then the remaining graph has:

  • one component G[{0,1,2}] = K3, which is highly connected
  • one isolated vertex {3}, which is allowed as an unclustered leftover

Thus the optimal value is 1, with deletion configuration
x_(2,3) = 1
and all other deletion variables equal to 0.

Zero deletions are infeasible because the whole graph is connected on 4 vertices and has minimum degree 1, which is not greater than 4/2 = 2.

BibTeX

@article{HueffnerKomusiewiczLiebtrauNiedermeier2014,
  author  = {Falk H{"u}ffner and Christian Komusiewicz and Adrian Liebtrau and Rolf Niedermeier},
  title   = {Partitioning Biological Networks into Highly Connected Clusters with Maximum Edge Coverage},
  journal = {IEEE/ACM Transactions on Computational Biology and Bioinformatics},
  volume  = {11},
  number  = {3},
  pages   = {455--467},
  year    = {2014},
  doi     = {10.1109/TCBB.2013.177},
  url     = {https://doi.org/10.1109/TCBB.2013.177}
}

@article{HartuvShamir2000,
  author  = {Erez Hartuv and Ron Shamir},
  title   = {A clustering algorithm based on graph connectivity},
  journal = {Information Processing Letters},
  volume  = {76},
  number  = {4--6},
  pages   = {175--181},
  year    = {2000},
  doi     = {10.1016/S0020-0190(00)00142-3},
  url     = {https://doi.org/10.1016/S0020-0190(00)00142-3}
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    modelA model problem to be implemented.

    Type

    No type

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions