Commit a4b8656

isPANN and claude authored
feat: add KthLargestMTuple model (#405) (#805)
* feat: add KthLargestMTuple model (issue #405)

  Add the Kth Largest m-Tuple counting problem (Garey & Johnson MP10). This is the first aggregate-only model using Value = Sum<u64>, which required a fix to the example_db model_specs_are_optimal test to gracefully handle models without witness support.

  Closes #405

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix formatting after merge conflict resolution

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix paper: correct PP-completeness claim and broken solve command

  - Replace false NP-completeness claim with accurate PP-completeness description citing Haase & Kiefer (2016)
  - Fix `pred solve` command to use `--solver brute-force` (no ILP path)
  - Add haase2016 BibTeX entry

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Remove unused K field from KthLargestMTuple

  K was stored but never used in evaluate(); the model is a pure counting problem. The G&J decision version (count >= K?) is noted in the paper but not part of the computational model.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Simplify paper paragraph: remove K references from counting model

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Revert K removal: keep K field for G&J decision formulation

  The K threshold is needed for the standard PARTITION → KthLargestMTuple reduction (G&J R86). Without K, the counting version has no known many-one reductions; only Turing reductions exist. Retains the paper fixes: PP-completeness claim, --solver brute-force.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 7657a08 commit a4b8656
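
The commit message distinguishes the counting value (how many tuples reach the bound) from the Garey & Johnson decision question (is that count at least K?). A minimal standalone sketch of the two, using only the standard library; the function names `count_qualifying` and `decide` are illustrative and are not part of the crate:

```rust
/// Count tuples (one element per set) whose sum is at least `bound`.
/// Plain recursion over the Cartesian product, so the running time is
/// proportional to the number of tuples.
fn count_qualifying(sets: &[Vec<u64>], bound: u64) -> u64 {
    // `partial` is the sum of the elements chosen from the sets consumed so far.
    fn go(sets: &[Vec<u64>], partial: u64, bound: u64) -> u64 {
        match sets.split_first() {
            // No sets left: the tuple is complete, so test it against the bound.
            None => u64::from(partial >= bound),
            // Otherwise try every element of the next set.
            Some((first, rest)) => first
                .iter()
                .map(|&x| go(rest, partial + x, bound))
                .sum(),
        }
    }
    go(sets, 0, bound)
}

/// Decision version: is the number of qualifying tuples at least `k`?
fn decide(sets: &[Vec<u64>], k: u64, bound: u64) -> bool {
    count_qualifying(sets, bound) >= k
}
```

On the commit's example instance (sets {2,5,8}, {3,6}, {1,4,7} with B = 12), `count_qualifying` returns 14, so `decide` is true for K = 14 and false for K = 15.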

9 files changed, +480 -24 lines changed

docs/paper/reductions.typ

Lines changed: 27 additions & 0 deletions
@@ -199,6 +199,7 @@
   "SumOfSquaresPartition": [Sum of Squares Partition],
   "TimetableDesign": [Timetable Design],
   "TwoDimensionalConsecutiveSets": [2-Dimensional Consecutive Sets],
+  "KthLargestMTuple": [$K$th Largest $m$-Tuple],
 )

 // Definition label: "def:<ProblemName>" — each definition block must have a matching label
@@ -4705,6 +4706,32 @@ A classical NP-complete problem from Garey and Johnson @garey1979[Ch.~3, p.~76],
   ]
 }

+#{
+  let x = load-model-example("KthLargestMTuple")
+  let sets = x.instance.sets
+  let k = x.instance.k
+  let bound = x.instance.bound
+  let config = x.optimal_config
+  let m = sets.len()
+  // Count qualifying tuples by enumerating the Cartesian product
+  let total = sets.fold(1, (acc, s) => acc * s.len())
+  [
+    #problem-def("KthLargestMTuple")[
+      Given $m$ finite sets $X_1, dots, X_m$ of positive integers, a bound $B in ZZ^+$, and a threshold $K in ZZ^+$, count the number of distinct $m$-tuples $(x_1, dots, x_m) in X_1 times dots.c times X_m$ satisfying $sum_(i=1)^m x_i >= B$. The answer is _yes_ iff this count is at least $K$.
+    ][
+      The $K$th Largest $m$-Tuple problem is MP10 in Garey and Johnson's appendix @garey1979. It is _not known to be in NP_, because a "yes" certificate may need to exhibit $K$ qualifying tuples and $K$ can be exponentially large. The problem is PP-complete under polynomial-time Turing reductions @haase2016, though the special case $m = 2$, $K = 1$ is NP-complete via reduction from Subset Sum. In the general case, the only known exact approach is brute-force enumeration of all $product_(i=1)^m |X_i|$ tuples, so the registered catalog complexity is `total_tuples * num_sets`#footnote[No algorithm improving on brute-force is known for the general $K$th Largest $m$-Tuple problem.].
+
+      *Example.* Let $m = #m$, $B = #bound$, and $K = #k$ with sets #sets.enumerate().map(((i, s)) => [$X_#(i+1) = {#s.map(str).join(", ")}$]).join([, ]). The Cartesian product has $#total$ tuples. For instance, the tuple $(#config.enumerate().map(((i, c)) => str(sets.at(i).at(c))).join(", "))$ has sum $#config.enumerate().map(((i, c)) => sets.at(i).at(c)).sum() >= #bound$, contributing 1 to the count. In total, #k of the #total tuples satisfy the bound, so the answer is _yes_ (count $= K$).
+
+      #pred-commands(
+        "pred create --example KthLargestMTuple -o kth-largest-m-tuple.json",
+        "pred solve kth-largest-m-tuple.json --solver brute-force",
+        "pred evaluate kth-largest-m-tuple.json --config " + config.map(str).join(","),
+      )
+    ]
+  ]
+}
+
 #{
   let x = load-model-example("SequencingWithReleaseTimesAndDeadlines")
   let n = x.instance.lengths.len()
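
The paper example above reads a tuple off a configuration (one index per set) and says the tuple contributes 1 to the count when its sum reaches the bound. The sketch below mirrors that representation in standalone Rust; `decode`, `evaluate`, and `aggregate` are illustrative free functions written for this note, not the crate's `Problem` trait or `BruteForce` solver:

```rust
/// Map a configuration (one index per set) to the tuple it selects.
/// Returns `None` if the length is wrong or any index is out of range.
fn decode(sets: &[Vec<u64>], config: &[usize]) -> Option<Vec<u64>> {
    if config.len() != sets.len() {
        return None;
    }
    sets.iter()
        .zip(config)
        .map(|(set, &i)| set.get(i).copied())
        .collect()
}

/// 1 if the selected tuple meets the bound, 0 otherwise (including invalid configs).
fn evaluate(sets: &[Vec<u64>], bound: u64, config: &[usize]) -> u64 {
    match decode(sets, config) {
        Some(tuple) if tuple.iter().sum::<u64>() >= bound => 1,
        _ => 0,
    }
}

/// Sum `evaluate` over every configuration, stepping the indices like an odometer.
fn aggregate(sets: &[Vec<u64>], bound: u64) -> u64 {
    if sets.is_empty() || sets.iter().any(|s| s.is_empty()) {
        return 0;
    }
    let mut config = vec![0usize; sets.len()];
    let mut count = 0;
    loop {
        count += evaluate(sets, bound, &config);
        // Increment the last index; on overflow reset it and carry to the left.
        let mut pos = sets.len();
        loop {
            if pos == 0 {
                // The carry ran off the left end: every configuration was visited.
                return count;
            }
            pos -= 1;
            config[pos] += 1;
            if config[pos] < sets[pos].len() {
                break;
            }
            config[pos] = 0;
        }
    }
}
```

For the example instance, the configuration `[2, 1, 2]` decodes to the tuple (8, 6, 7) with sum 21, so it evaluates to 1, and the aggregate over all 18 configurations is 14.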

docs/paper/references.bib

Lines changed: 10 additions & 0 deletions
@@ -1455,6 +1455,16 @@ @techreport{plaisted1976
   year = {1976}
 }

+@article{haase2016,
+  author = {Haase, Christoph and Kiefer, Stefan},
+  title = {The Complexity of the {K}th Largest Subset Problem and Related Problems},
+  journal = {Information Processing Letters},
+  volume = {116},
+  number = {2},
+  pages = {111--115},
+  year = {2016}
+}
+
 @article{Murty1972,
   author = {Murty, Katta G.},
   title = {A fundamental problem in linear inequalities with applications to the travelling salesman problem},

problemreductions-cli/src/cli.rs

Lines changed: 1 addition & 0 deletions
@@ -249,6 +249,7 @@ Flags by problem type:
   ProductionPlanning --num-periods, --demands, --capacities, --setup-costs, --production-costs, --inventory-costs, --cost-bound
   SubsetSum --sizes, --target
   ThreePartition --sizes, --bound
+  KthLargestMTuple --sets, --k, --bound
   QuadraticDiophantineEquations --coeff-a, --coeff-b, --coeff-c
   SumOfSquaresPartition --sizes, --num-groups
   ExpectedRetrievalCost --probabilities, --num-sectors

problemreductions-cli/src/commands/create.rs

Lines changed: 38 additions & 3 deletions
@@ -24,9 +24,9 @@ use problemreductions::models::misc::{
     AdditionalKey, BinPacking, BoyceCoddNormalFormViolation, CapacityAssignment, CbqRelation,
     ConjunctiveBooleanQuery, ConsistencyOfDatabaseFrequencyTables, EnsembleComputation,
     ExpectedRetrievalCost, FlowShopScheduling, FrequencyTable, GroupingBySwapping,
-    JobShopScheduling, KnownValue, LongestCommonSubsequence, MinimumTardinessSequencing,
-    MultiprocessorScheduling, PaintShop, PartiallyOrderedKnapsack, ProductionPlanning, QueryArg,
-    RectilinearPictureCompression, ResourceConstrainedScheduling,
+    JobShopScheduling, KnownValue, KthLargestMTuple, LongestCommonSubsequence,
+    MinimumTardinessSequencing, MultiprocessorScheduling, PaintShop, PartiallyOrderedKnapsack,
+    ProductionPlanning, QueryArg, RectilinearPictureCompression, ResourceConstrainedScheduling,
     SchedulingWithIndividualDeadlines, SequencingToMinimizeMaximumCumulativeCost,
     SequencingToMinimizeWeightedCompletionTime, SequencingToMinimizeWeightedTardiness,
     SequencingWithReleaseTimesAndDeadlines, SequencingWithinIntervals, ShortestCommonSupersequence,
@@ -732,6 +732,7 @@ fn example_for(canonical: &str, graph_type: Option<&str>) -> &'static str {
         "IntegerKnapsack" => "--sizes 3,4,5,2,7 --values 4,5,7,3,9 --capacity 15",
         "SubsetSum" => "--sizes 3,7,1,8,2,4 --target 11",
         "ThreePartition" => "--sizes 4,5,6,4,6,5 --bound 15",
+        "KthLargestMTuple" => "--sets \"2,5,8;3,6;1,4,7\" --k 14 --bound 12",
         "QuadraticDiophantineEquations" => "--coeff-a 3 --coeff-b 5 --coeff-c 53",
         "BoyceCoddNormalFormViolation" => {
             "--n 6 --sets \"0,1:2;2:3;3,4:5\" --target 0,1,2,3,4,5"
@@ -2423,6 +2424,40 @@ pub fn create(args: &CreateArgs, out: &OutputConfig) -> Result<()> {
             )
         }

+        // KthLargestMTuple
+        "KthLargestMTuple" => {
+            let sets_str = args.sets.as_deref().ok_or_else(|| {
+                anyhow::anyhow!(
+                    "KthLargestMTuple requires --sets, --k, and --bound\n\n\
+                     Usage: pred create KthLargestMTuple --sets \"2,5,8;3,6;1,4,7\" --k 14 --bound 12"
+                )
+            })?;
+            let k_val = args.k.ok_or_else(|| {
+                anyhow::anyhow!(
+                    "KthLargestMTuple requires --k\n\n\
+                     Usage: pred create KthLargestMTuple --sets \"2,5,8;3,6;1,4,7\" --k 14 --bound 12"
+                )
+            })?;
+            let bound = args.bound.ok_or_else(|| {
+                anyhow::anyhow!(
+                    "KthLargestMTuple requires --bound\n\n\
+                     Usage: pred create KthLargestMTuple --sets \"2,5,8;3,6;1,4,7\" --k 14 --bound 12"
+                )
+            })?;
+            let bound = u64::try_from(bound).map_err(|_| {
+                anyhow::anyhow!("KthLargestMTuple requires a positive integer --bound")
+            })?;
+            let sets: Vec<Vec<u64>> = sets_str
+                .split(';')
+                .map(|group| util::parse_comma_list(group))
+                .collect::<Result<_, _>>()?;
+            (
+                ser(KthLargestMTuple::try_new(sets, k_val as u64, bound)
+                    .map_err(anyhow::Error::msg)?)?,
+                resolved_variant.clone(),
+            )
+        }
+
         // QuadraticDiophantineEquations
         "QuadraticDiophantineEquations" => {
             let a = args.coeff_a.ok_or_else(|| {
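
The new `create` arm splits the `--sets` argument on `;` and parses each group as a comma-separated list. The diff does not show `util::parse_comma_list`, so the following is only a standard-library sketch of the same idea; `parse_sets` is a hypothetical helper, and its error text is made up for illustration:

```rust
/// Parse a string such as "2,5,8;3,6;1,4,7" into one `Vec<u64>` per
/// `;`-separated group. Any token that is not a valid `u64` (including the
/// empty token produced by an empty group) yields an error.
fn parse_sets(input: &str) -> Result<Vec<Vec<u64>>, String> {
    input
        .split(';')
        .map(|group| {
            group
                .split(',')
                .map(|tok| {
                    tok.trim()
                        .parse::<u64>()
                        .map_err(|e| format!("invalid size {:?}: {}", tok, e))
                })
                // Collecting into `Result` stops at the first bad token.
                .collect::<Result<Vec<u64>, String>>()
        })
        .collect()
}
```

Parsing only checks syntax here. Semantic checks such as "every size is positive" are left to the model's constructor, which matches how the arm above forwards the parsed sets to `KthLargestMTuple::try_new`.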

src/models/misc/kth_largest_m_tuple.rs

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
+//! Kth Largest m-Tuple problem implementation.
+//!
+//! Given m sets of positive integers and thresholds K and B, count how many
+//! distinct m-tuples (one element per set) have total size at least B.
+//! The answer is YES iff the count is at least K. Garey & Johnson MP10.
+
+use crate::registry::{FieldInfo, ProblemSchemaEntry, ProblemSizeFieldEntry};
+use crate::traits::Problem;
+use crate::types::Sum;
+use serde::de::Error as _;
+use serde::{Deserialize, Deserializer, Serialize};
+
+inventory::submit! {
+    ProblemSchemaEntry {
+        name: "KthLargestMTuple",
+        display_name: "Kth Largest m-Tuple",
+        aliases: &[],
+        dimensions: &[],
+        module_path: module_path!(),
+        description: "Count m-tuples whose total size meets a bound and compare against a threshold K",
+        fields: &[
+            FieldInfo { name: "sets", type_name: "Vec<Vec<u64>>", description: "m sets, each containing positive integer sizes" },
+            FieldInfo { name: "k", type_name: "u64", description: "Threshold K (answer YES iff count >= K)" },
+            FieldInfo { name: "bound", type_name: "u64", description: "Lower bound B on tuple sum" },
+        ],
+    }
+}
+
+inventory::submit! {
+    ProblemSizeFieldEntry {
+        name: "KthLargestMTuple",
+        fields: &["num_sets", "total_tuples"],
+    }
+}
+
+/// The Kth Largest m-Tuple problem.
+///
+/// Given sets `X_1, ..., X_m` of positive integers, a threshold `K`, and a
+/// bound `B`, count how many distinct m-tuples `(x_1, ..., x_m)` in
+/// `X_1 x ... x X_m` satisfy `sum(x_i) >= B`. The answer is YES iff the
+/// count is at least `K`.
+///
+/// # Representation
+///
+/// Variable `i` selects an element from set `X_i`, ranging over `{0, ..., |X_i|-1}`.
+/// `evaluate` returns `Sum(1)` if the tuple sum >= B, else `Sum(0)`.
+/// The aggregate over all configurations gives the total count of qualifying tuples.
+///
+/// # Example
+///
+/// ```
+/// use problemreductions::models::misc::KthLargestMTuple;
+/// use problemreductions::{Problem, Solver, BruteForce};
+///
+/// let problem = KthLargestMTuple::new(
+///     vec![vec![2, 5, 8], vec![3, 6], vec![1, 4, 7]],
+///     14,
+///     12,
+/// );
+/// let solver = BruteForce::new();
+/// let value = solver.solve(&problem);
+/// // 14 of the 18 tuples have sum >= 12
+/// assert_eq!(value, problemreductions::types::Sum(14));
+/// ```
+#[derive(Debug, Clone, Serialize)]
+pub struct KthLargestMTuple {
+    sets: Vec<Vec<u64>>,
+    k: u64,
+    bound: u64,
+}
+
+impl KthLargestMTuple {
+    fn validate(sets: &[Vec<u64>], k: u64, bound: u64) -> Result<(), String> {
+        if sets.is_empty() {
+            return Err("KthLargestMTuple requires at least one set".to_string());
+        }
+        if sets.iter().any(|s| s.is_empty()) {
+            return Err("Every set must be non-empty".to_string());
+        }
+        if sets.iter().any(|s| s.contains(&0)) {
+            return Err("All sizes must be positive (> 0)".to_string());
+        }
+        if k == 0 {
+            return Err("Threshold K must be positive".to_string());
+        }
+        if bound == 0 {
+            return Err("Bound B must be positive".to_string());
+        }
+        Ok(())
+    }
+
+    /// Try to create a new KthLargestMTuple instance.
+    pub fn try_new(sets: Vec<Vec<u64>>, k: u64, bound: u64) -> Result<Self, String> {
+        Self::validate(&sets, k, bound)?;
+        Ok(Self { sets, k, bound })
+    }
+
+    /// Create a new KthLargestMTuple instance.
+    ///
+    /// # Panics
+    ///
+    /// Panics if the inputs are invalid.
+    pub fn new(sets: Vec<Vec<u64>>, k: u64, bound: u64) -> Self {
+        Self::try_new(sets, k, bound).unwrap_or_else(|msg| panic!("{msg}"))
+    }
+
+    /// Returns the sets.
+    pub fn sets(&self) -> &[Vec<u64>] {
+        &self.sets
+    }
+
+    /// Returns the threshold K.
+    pub fn k(&self) -> u64 {
+        self.k
+    }
+
+    /// Returns the bound B.
+    pub fn bound(&self) -> u64 {
+        self.bound
+    }
+
+    /// Returns the number of sets (m).
+    pub fn num_sets(&self) -> usize {
+        self.sets.len()
+    }
+
+    /// Returns the total number of m-tuples (product of set sizes).
+    pub fn total_tuples(&self) -> usize {
+        self.sets.iter().map(|s| s.len()).product()
+    }
+}
+
+#[derive(Deserialize)]
+struct KthLargestMTupleDef {
+    sets: Vec<Vec<u64>>,
+    k: u64,
+    bound: u64,
+}
+
+impl<'de> Deserialize<'de> for KthLargestMTuple {
+    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+    where
+        D: Deserializer<'de>,
+    {
+        let data = KthLargestMTupleDef::deserialize(deserializer)?;
+        Self::try_new(data.sets, data.k, data.bound).map_err(D::Error::custom)
+    }
+}
+
+impl Problem for KthLargestMTuple {
+    const NAME: &'static str = "KthLargestMTuple";
+    type Value = Sum<u64>;
+
+    fn variant() -> Vec<(&'static str, &'static str)> {
+        crate::variant_params![]
+    }
+
+    fn dims(&self) -> Vec<usize> {
+        self.sets.iter().map(|s| s.len()).collect()
+    }
+
+    fn evaluate(&self, config: &[usize]) -> Sum<u64> {
+        if config.len() != self.num_sets() {
+            return Sum(0);
+        }
+        for (i, &choice) in config.iter().enumerate() {
+            if choice >= self.sets[i].len() {
+                return Sum(0);
+            }
+        }
+        let total: u64 = config
+            .iter()
+            .enumerate()
+            .map(|(i, &choice)| self.sets[i][choice])
+            .sum();
+        if total >= self.bound {
+            Sum(1)
+        } else {
+            Sum(0)
+        }
+    }
+}
+
+// Best known: brute-force enumeration of all tuples, O(total_tuples * num_sets).
+// No sub-exponential exact algorithm is known for the general case.
+crate::declare_variants! {
+    default KthLargestMTuple => "total_tuples * num_sets",
+}
+
+#[cfg(feature = "example-db")]
+pub(crate) fn canonical_model_example_specs() -> Vec<crate::example_db::specs::ModelExampleSpec> {
+    // m=3, X_1={2,5,8}, X_2={3,6}, X_3={1,4,7}, B=12, K=14.
+    // 14 of 18 tuples have sum >= 12. The config [2,1,2] picks (8,6,7) with sum=21 >= 12.
+    vec![crate::example_db::specs::ModelExampleSpec {
+        id: "kth_largest_m_tuple",
+        instance: Box::new(KthLargestMTuple::new(
+            vec![vec![2, 5, 8], vec![3, 6], vec![1, 4, 7]],
+            14,
+            12,
+        )),
+        optimal_config: vec![2, 1, 2],
+        optimal_value: serde_json::json!(1),
+    }]
+}
+
+#[cfg(test)]
+#[path = "../../unit_tests/models/misc/kth_largest_m_tuple.rs"]
+mod tests;
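
The model above routes both `new` and deserialization through one validating `try_new`, and computes `total_tuples` with `Iterator::product`. As a possible hardening that is not part of this commit, the product could be computed with checked arithmetic: for primitive integers, `product` panics on overflow when debug assertions are enabled and wraps otherwise. The sketch below is a standalone illustration with a hypothetical `Instance` type (no serde, no crate traits) and a hypothetical `checked_total_tuples` method:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Instance {
    sets: Vec<Vec<u64>>,
    k: u64,
    bound: u64,
}

impl Instance {
    /// Fallible constructor mirroring the checks in the diff above:
    /// at least one set, no empty sets, and positive sizes, K, and B.
    fn try_new(sets: Vec<Vec<u64>>, k: u64, bound: u64) -> Result<Self, String> {
        if sets.is_empty() {
            return Err("at least one set is required".to_string());
        }
        if sets.iter().any(|s| s.is_empty()) {
            return Err("every set must be non-empty".to_string());
        }
        if sets.iter().any(|s| s.contains(&0)) {
            return Err("all sizes must be positive".to_string());
        }
        if k == 0 || bound == 0 {
            return Err("K and B must be positive".to_string());
        }
        Ok(Self { sets, k, bound })
    }

    /// Product of the set sizes, or `None` if it does not fit in `usize`.
    fn checked_total_tuples(&self) -> Option<usize> {
        self.sets
            .iter()
            .try_fold(1usize, |acc, s| acc.checked_mul(s.len()))
    }
}
```

Returning `Option<usize>` lets a caller report "instance too large to enumerate" instead of panicking or silently using a wrapped count. Whether that trade-off suits the crate's size-field registry is a design question this sketch does not settle.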

src/models/misc/mod.rs

Lines changed: 3 additions & 0 deletions
@@ -77,6 +77,7 @@ mod flow_shop_scheduling;
 mod grouping_by_swapping;
 mod job_shop_scheduling;
 mod knapsack;
+mod kth_largest_m_tuple;
 mod longest_common_subsequence;
 mod minimum_tardiness_sequencing;
 mod multiprocessor_scheduling;
@@ -119,6 +120,7 @@ pub use flow_shop_scheduling::FlowShopScheduling;
 pub use grouping_by_swapping::GroupingBySwapping;
 pub use job_shop_scheduling::JobShopScheduling;
 pub use knapsack::Knapsack;
+pub use kth_largest_m_tuple::KthLargestMTuple;
 pub use longest_common_subsequence::LongestCommonSubsequence;
 pub use minimum_tardiness_sequencing::MinimumTardinessSequencing;
 pub use multiprocessor_scheduling::MultiprocessorScheduling;
@@ -186,5 +188,6 @@ pub(crate) fn canonical_model_example_specs() -> Vec<crate::example_db::specs::M
     specs.extend(subset_sum::canonical_model_example_specs());
     specs.extend(three_partition::canonical_model_example_specs());
     specs.extend(cosine_product_integration::canonical_model_example_specs());
+    specs.extend(kth_largest_m_tuple::canonical_model_example_specs());
     specs
 }

src/models/mod.rs

Lines changed: 9 additions & 8 deletions
@@ -40,14 +40,15 @@ pub use misc::{
     AdditionalKey, BinPacking, CapacityAssignment, CbqRelation, ConjunctiveBooleanQuery,
     ConjunctiveQueryFoldability, ConsistencyOfDatabaseFrequencyTables, CosineProductIntegration,
     EnsembleComputation, ExpectedRetrievalCost, Factoring, FlowShopScheduling, GroupingBySwapping,
-    JobShopScheduling, Knapsack, LongestCommonSubsequence, MinimumTardinessSequencing,
-    MultiprocessorScheduling, PaintShop, Partition, PrecedenceConstrainedScheduling,
-    ProductionPlanning, QueryArg, RectilinearPictureCompression, ResourceConstrainedScheduling,
-    SchedulingWithIndividualDeadlines, SequencingToMinimizeMaximumCumulativeCost,
-    SequencingToMinimizeWeightedCompletionTime, SequencingToMinimizeWeightedTardiness,
-    SequencingWithReleaseTimesAndDeadlines, SequencingWithinIntervals, ShortestCommonSupersequence,
-    StackerCrane, StaffScheduling, StringToStringCorrection, SubsetSum, SumOfSquaresPartition,
-    Term, ThreePartition, TimetableDesign,
+    JobShopScheduling, Knapsack, KthLargestMTuple, LongestCommonSubsequence,
+    MinimumTardinessSequencing, MultiprocessorScheduling, PaintShop, Partition,
+    PrecedenceConstrainedScheduling, ProductionPlanning, QueryArg, RectilinearPictureCompression,
+    ResourceConstrainedScheduling, SchedulingWithIndividualDeadlines,
+    SequencingToMinimizeMaximumCumulativeCost, SequencingToMinimizeWeightedCompletionTime,
+    SequencingToMinimizeWeightedTardiness, SequencingWithReleaseTimesAndDeadlines,
+    SequencingWithinIntervals, ShortestCommonSupersequence, StackerCrane, StaffScheduling,
+    StringToStringCorrection, SubsetSum, SumOfSquaresPartition, Term, ThreePartition,
+    TimetableDesign,
 };
 pub use set::{
     ComparativeContainment, ConsecutiveSets, ExactCoverBy3Sets, IntegerKnapsack, MaximumSetPacking,
