Motivation
Maximum common subgraph search already appears in the RNA-structure workflow used by the PLOS Computational Biology paper on recurrent interaction networks. The paper itself enumerates maximal common subgraphs rather than optimizing a single best one, but the edge-maximizing optimization surrogate is still the clean formal model to add to the reduction graph.
Definition
Name: MaximumCommonEdgeSubgraph
Reference: https://doi.org/10.1109/TC.1981.1675756 ; https://doi.org/10.1016/j.dam.2012.01.026 ; https://doi.org/10.1371/journal.pcbi.1008990
Given two finite directed edge-labelled graphs
G1 = (V1, E1) and G2 = (V2, E2), with Ei subseteq Vi x Sigma x Vi,
find a partial injective map f: U1 -> V2, where U1 subseteq V1, maximizing the number of labelled arcs (u, lambda, v) in E1 such that u, v in U1 and (f(u), lambda, f(v)) in E2.
Equivalently, the objective is to maximize the number of preserved labelled directed arcs under a partial one-to-one mapping from graph_1 into graph_2.
Important modeling choices for this proposal:
- edge labels must match exactly
- vertex labels are not part of the formal feasibility definition
- disconnected common subgraphs are allowed
- the model uses set semantics, not multiplicities: a labelled arc contributes
1 if preserved and 0 otherwise
- the objective is exactly
max |E(H)| with no secondary tie-break on |V(H)|
Variables
- Count:
|V1| decision variables, one per vertex of graph_1
- Per-variable domain:
{0, 1, ..., |V2|-1} union {bottom}
- Meaning: variable
x_u records which vertex of graph_2 the source vertex u maps to, or bottom if u is left unmatched
Feasibility additionally requires injectivity on the matched vertices.
Schema (data type)
Type name: MaximumCommonEdgeSubgraph
Variants: none initially
| Field |
Type |
Description |
| graph_1 |
Directed edge-labelled graph |
Source graph whose vertices are mapped |
| graph_2 |
Directed edge-labelled graph |
Target graph receiving the partial injective map |
A minimal helper graph schema is:
| Field |
Type |
Description |
| num_vertices |
integer |
Number of vertices |
| arcs |
set of triples (u, label, v) |
Directed labelled arcs |
This keeps the formal model general over any finite alphabet Sigma; the RNA paper is then one application-specific instantiation of Sigma.
Complexity
A conservative exact baseline is obtained by enumerating every assignment graph_1 -> graph_2 or unmatched, then filtering to injective assignments and counting preserved arcs.
This yields the worst-case bound
O((|V2| + 1)^|V1| * poly(|V1|, |V2|, |E1|, |E2|))
so the registry complexity string should be
(num_vertices_2 + 1)^num_vertices_1
References: https://doi.org/10.1109/TC.1981.1675756 ; https://doi.org/10.1016/j.dam.2012.01.026
Extra Remark
This proposal is the edge-maximizing optimization surrogate, so the precise formal name should be MaximumCommonEdgeSubgraph (MCES). The broader phrase "maximum common subgraph" can still be used informally.
Reduction Rule Crossref
How to solve
Example Instance
Let
Sigma = {a, b, c, d}
V1 = {0, 1, 2, 3, 4}
E1 = {(0,a,1), (1,b,2), (0,c,2), (2,a,3), (1,d,3), (3,b,4)}
V2 = {0, 1, 2, 3}
E2 = {(0,a,1), (1,b,2), (0,c,2), (2,a,3), (1,d,3), (0,b,3)}
Expected Outcome
One optimal configuration is
x = (0, 1, 2, 3, 4)
where the value 4 denotes bottom because |V2| = 4.
This means
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> bottom
The preserved labelled arcs are
(0,a,1)
(1,b,2)
(0,c,2)
(2,a,3)
(1,d,3)
so the objective value is 5.
No solution can preserve all 6 source arcs, because doing so would require all five vertices of V1 to be matched injectively into the four vertices of V2.
BibTeX
@article{Bokhari1981Mapping,
title={On the Mapping Problem},
author={Bokhari, Shahid H.},
journal={IEEE Transactions on Computers},
volume={30},
number={3},
pages={207--214},
year={1981},
doi={10.1109/TC.1981.1675756},
url={https://doi.org/10.1109/TC.1981.1675756}
}
@article{Bahiense2012MCES,
title={The maximum common edge subgraph problem: A polyhedral investigation},
author={Bahiense, Luciana and Manic, Gustavo and Piva, Humberto and de Souza, Cid C.},
journal={Discrete Applied Mathematics},
volume={160},
number={18},
pages={2523--2541},
year={2012},
doi={10.1016/j.dam.2012.01.026},
url={https://doi.org/10.1016/j.dam.2012.01.026}
}
@article{Soule2021RNA,
title={Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs},
author={Soule, Audrey and Aguirre-Hernandez, Romain and Pardo, Amandine and Denise, Alain and de la Higuera, Colin and Touzet, H{\'e}l{\`e}ne and Ponty, Yann},
journal={PLOS Computational Biology},
volume={17},
number={7},
pages={e1008990},
year={2021},
doi={10.1371/journal.pcbi.1008990},
url={https://doi.org/10.1371/journal.pcbi.1008990}
}
Motivation
Maximum common subgraph search already appears in the RNA-structure workflow used by the PLOS Computational Biology paper on recurrent interaction networks. The paper itself enumerates maximal common subgraphs rather than optimizing a single best one, but the edge-maximizing optimization surrogate is still the clean formal model to add to the reduction graph.
Definition
Name: MaximumCommonEdgeSubgraph
Reference: https://doi.org/10.1109/TC.1981.1675756 ; https://doi.org/10.1016/j.dam.2012.01.026 ; https://doi.org/10.1371/journal.pcbi.1008990
Given two finite directed edge-labelled graphs
G1 = (V1, E1)andG2 = (V2, E2), withEi subseteq Vi x Sigma x Vi,find a partial injective map
f: U1 -> V2, whereU1 subseteq V1, maximizing the number of labelled arcs(u, lambda, v) in E1such thatu, v in U1and(f(u), lambda, f(v)) in E2.Equivalently, the objective is to maximize the number of preserved labelled directed arcs under a partial one-to-one mapping from
graph_1intograph_2.Important modeling choices for this proposal:
1if preserved and0otherwisemax |E(H)|with no secondary tie-break on|V(H)|Variables
|V1|decision variables, one per vertex ofgraph_1{0, 1, ..., |V2|-1} union {bottom}x_urecords which vertex ofgraph_2the source vertexumaps to, orbottomifuis left unmatchedFeasibility additionally requires injectivity on the matched vertices.
Schema (data type)
Type name:
MaximumCommonEdgeSubgraphVariants: none initially
A minimal helper graph schema is:
(u, label, v)This keeps the formal model general over any finite alphabet
Sigma; the RNA paper is then one application-specific instantiation ofSigma.Complexity
A conservative exact baseline is obtained by enumerating every assignment
graph_1 -> graph_2 or unmatched, then filtering to injective assignments and counting preserved arcs.This yields the worst-case bound
O((|V2| + 1)^|V1| * poly(|V1|, |V2|, |E1|, |E2|))so the registry complexity string should be
(num_vertices_2 + 1)^num_vertices_1References: https://doi.org/10.1109/TC.1981.1675756 ; https://doi.org/10.1016/j.dam.2012.01.026
Extra Remark
This proposal is the edge-maximizing optimization surrogate, so the precise formal name should be
MaximumCommonEdgeSubgraph(MCES). The broader phrase "maximum common subgraph" can still be used informally.Reduction Rule Crossref
How to solve
[Rule] MaximumCommonEdgeSubgraph to ILP).Example Instance
Let
Sigma = {a, b, c, d}V1 = {0, 1, 2, 3, 4}E1 = {(0,a,1), (1,b,2), (0,c,2), (2,a,3), (1,d,3), (3,b,4)}V2 = {0, 1, 2, 3}E2 = {(0,a,1), (1,b,2), (0,c,2), (2,a,3), (1,d,3), (0,b,3)}Expected Outcome
One optimal configuration is
x = (0, 1, 2, 3, 4)where the value
4denotesbottombecause|V2| = 4.This means
0 -> 01 -> 12 -> 23 -> 34 -> bottomThe preserved labelled arcs are
(0,a,1)(1,b,2)(0,c,2)(2,a,3)(1,d,3)so the objective value is
5.No solution can preserve all
6source arcs, because doing so would require all five vertices ofV1to be matched injectively into the four vertices ofV2.BibTeX