HOL Light: Prove correctness of AVX2 poly_compress_d{4,5,10,11}#1545
HOL Light: Prove correctness of AVX2 poly_compress_d{4,5,10,11}#1545mkannwischer merged 6 commits intomainfrom
poly_compress_d{4,5,10,11}#1545Conversation
poly_compress_d{4,5,10,11}
e6c882e to
3656b1a
Compare
There was a problem hiding this comment.
Mac Mini (M1, 2020) benchmarks
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
12320 cycles |
12321 cycles |
1.00 |
ML-KEM-512 encaps |
14997 cycles |
14998 cycles |
1.00 |
ML-KEM-512 decaps |
19552 cycles |
19558 cycles |
1.00 |
ML-KEM-768 keypair |
21085 cycles |
21085 cycles |
1 |
ML-KEM-768 encaps |
23871 cycles |
23871 cycles |
1 |
ML-KEM-768 decaps |
30413 cycles |
30414 cycles |
1.00 |
ML-KEM-1024 keypair |
30331 cycles |
30332 cycles |
1.00 |
ML-KEM-1024 encaps |
34576 cycles |
34576 cycles |
1 |
ML-KEM-1024 decaps |
44183 cycles |
44186 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
ppc64le (POWER10) benchmarks
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
59347 cycles |
59750 cycles |
0.99 |
ML-KEM-512 encaps |
71831 cycles |
72459 cycles |
0.99 |
ML-KEM-512 decaps |
91670 cycles |
92376 cycles |
0.99 |
ML-KEM-768 keypair |
99543 cycles |
98711 cycles |
1.01 |
ML-KEM-768 encaps |
116314 cycles |
115024 cycles |
1.01 |
ML-KEM-768 decaps |
142257 cycles |
140921 cycles |
1.01 |
ML-KEM-1024 keypair |
148861 cycles |
150328 cycles |
0.99 |
ML-KEM-1024 encaps |
167254 cycles |
168943 cycles |
0.99 |
ML-KEM-1024 decaps |
198649 cycles |
200353 cycles |
0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
28266 cycles |
28233 cycles |
1.00 |
ML-KEM-512 encaps |
34111 cycles |
34106 cycles |
1.00 |
ML-KEM-512 decaps |
44383 cycles |
44329 cycles |
1.00 |
ML-KEM-768 keypair |
47678 cycles |
47626 cycles |
1.00 |
ML-KEM-768 encaps |
53935 cycles |
53954 cycles |
1.00 |
ML-KEM-768 decaps |
68367 cycles |
68377 cycles |
1.00 |
ML-KEM-1024 keypair |
70328 cycles |
70254 cycles |
1.00 |
ML-KEM-1024 encaps |
78732 cycles |
78807 cycles |
1.00 |
ML-KEM-1024 decaps |
98431 cycles |
98438 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'ppc64le (POWER10) benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 3656b1a | Previous: 09c31cb | Ratio |
|---|---|---|---|
ML-KEM-768 keypair |
102020 cycles |
98898 cycles |
1.03 |
ML-KEM-768 encaps |
119213 cycles |
115162 cycles |
1.04 |
ML-KEM-768 decaps |
145398 cycles |
140953 cycles |
1.03 |
ML-KEM-1024 encaps |
172323 cycles |
167223 cycles |
1.03 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Intel Xeon 4th gen (c7i)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
9729 cycles |
9523 cycles |
1.02 |
ML-KEM-512 encaps |
11253 cycles |
11346 cycles |
0.99 |
ML-KEM-512 decaps |
15412 cycles |
15424 cycles |
1.00 |
ML-KEM-768 keypair |
16159 cycles |
16308 cycles |
0.99 |
ML-KEM-768 encaps |
17686 cycles |
17956 cycles |
0.98 |
ML-KEM-768 decaps |
23300 cycles |
23550 cycles |
0.99 |
ML-KEM-1024 keypair |
22328 cycles |
22080 cycles |
1.01 |
ML-KEM-1024 encaps |
24724 cycles |
24538 cycles |
1.01 |
ML-KEM-1024 decaps |
32167 cycles |
32304 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Arm Cortex-A55 (Snapdragon 888) benchmarks
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
59739 cycles |
59741 cycles |
1.00 |
ML-KEM-512 encaps |
67387 cycles |
67378 cycles |
1.00 |
ML-KEM-512 decaps |
86118 cycles |
86085 cycles |
1.00 |
ML-KEM-768 keypair |
97313 cycles |
97361 cycles |
1.00 |
ML-KEM-768 encaps |
110826 cycles |
110891 cycles |
1.00 |
ML-KEM-768 decaps |
137504 cycles |
138131 cycles |
1.00 |
ML-KEM-1024 keypair |
154503 cycles |
154743 cycles |
1.00 |
ML-KEM-1024 encaps |
171090 cycles |
171946 cycles |
1.00 |
ML-KEM-1024 decaps |
207552 cycles |
208210 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Intel Xeon 4th gen (c7i) (no-opt)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
28228 cycles |
28177 cycles |
1.00 |
ML-KEM-512 encaps |
36504 cycles |
36564 cycles |
1.00 |
ML-KEM-512 decaps |
45112 cycles |
45038 cycles |
1.00 |
ML-KEM-768 keypair |
46171 cycles |
46259 cycles |
1.00 |
ML-KEM-768 encaps |
55570 cycles |
55708 cycles |
1.00 |
ML-KEM-768 decaps |
69905 cycles |
69786 cycles |
1.00 |
ML-KEM-1024 keypair |
70418 cycles |
70313 cycles |
1.00 |
ML-KEM-1024 encaps |
82573 cycles |
82568 cycles |
1.00 |
ML-KEM-1024 decaps |
98950 cycles |
99391 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
AMD EPYC 4th gen (c7a)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
11718 cycles |
11789 cycles |
0.99 |
ML-KEM-512 encaps |
13192 cycles |
13239 cycles |
1.00 |
ML-KEM-512 decaps |
18077 cycles |
17987 cycles |
1.01 |
ML-KEM-768 keypair |
20535 cycles |
20555 cycles |
1.00 |
ML-KEM-768 encaps |
21202 cycles |
21567 cycles |
0.98 |
ML-KEM-768 decaps |
28659 cycles |
28570 cycles |
1.00 |
ML-KEM-1024 keypair |
27469 cycles |
27727 cycles |
0.99 |
ML-KEM-1024 encaps |
29515 cycles |
29561 cycles |
1.00 |
ML-KEM-1024 decaps |
38943 cycles |
39002 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 3656b1a | Previous: ab4cead | Ratio |
|---|---|---|---|
ML-KEM-512 encaps |
13628 cycles |
13179 cycles |
1.03 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
AMD EPYC 3rd gen (c6a)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
16869 cycles |
16917 cycles |
1.00 |
ML-KEM-512 encaps |
18563 cycles |
18577 cycles |
1.00 |
ML-KEM-512 decaps |
24072 cycles |
24112 cycles |
1.00 |
ML-KEM-768 keypair |
28441 cycles |
28537 cycles |
1.00 |
ML-KEM-768 encaps |
29595 cycles |
29608 cycles |
1.00 |
ML-KEM-768 decaps |
37495 cycles |
37585 cycles |
1.00 |
ML-KEM-1024 keypair |
41277 cycles |
41664 cycles |
0.99 |
ML-KEM-1024 encaps |
43642 cycles |
44061 cycles |
0.99 |
ML-KEM-1024 decaps |
54086 cycles |
54503 cycles |
0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Intel Xeon 3rd gen (c6i)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
16205 cycles |
16153 cycles |
1.00 |
ML-KEM-512 encaps |
18499 cycles |
18553 cycles |
1.00 |
ML-KEM-512 decaps |
25069 cycles |
25091 cycles |
1.00 |
ML-KEM-768 keypair |
28068 cycles |
27480 cycles |
1.02 |
ML-KEM-768 encaps |
28904 cycles |
29770 cycles |
0.97 |
ML-KEM-768 decaps |
39260 cycles |
39310 cycles |
1.00 |
ML-KEM-1024 keypair |
37414 cycles |
37426 cycles |
1.00 |
ML-KEM-1024 encaps |
40747 cycles |
40378 cycles |
1.01 |
ML-KEM-1024 decaps |
55628 cycles |
54074 cycles |
1.03 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
AMD EPYC 4th gen (c7a) (no-opt)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
36703 cycles |
36593 cycles |
1.00 |
ML-KEM-512 encaps |
43046 cycles |
43062 cycles |
1.00 |
ML-KEM-512 decaps |
55682 cycles |
55696 cycles |
1.00 |
ML-KEM-768 keypair |
58587 cycles |
58622 cycles |
1.00 |
ML-KEM-768 encaps |
67459 cycles |
67562 cycles |
1.00 |
ML-KEM-768 decaps |
84385 cycles |
84375 cycles |
1.00 |
ML-KEM-1024 keypair |
88903 cycles |
88935 cycles |
1.00 |
ML-KEM-1024 encaps |
99096 cycles |
99124 cycles |
1.00 |
ML-KEM-1024 decaps |
120572 cycles |
120521 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Graviton4
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
17676 cycles |
17657 cycles |
1.00 |
ML-KEM-512 encaps |
20595 cycles |
20643 cycles |
1.00 |
ML-KEM-512 decaps |
27072 cycles |
27057 cycles |
1.00 |
ML-KEM-768 keypair |
29929 cycles |
29931 cycles |
1.00 |
ML-KEM-768 encaps |
32733 cycles |
32801 cycles |
1.00 |
ML-KEM-768 decaps |
41981 cycles |
41976 cycles |
1.00 |
ML-KEM-1024 keypair |
43755 cycles |
43750 cycles |
1.00 |
ML-KEM-1024 encaps |
48731 cycles |
48657 cycles |
1.00 |
ML-KEM-1024 decaps |
61388 cycles |
61393 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
AMD EPYC 3rd gen (c6a) (no-opt)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
40167 cycles |
40245 cycles |
1.00 |
ML-KEM-512 encaps |
48363 cycles |
48414 cycles |
1.00 |
ML-KEM-512 decaps |
62462 cycles |
62512 cycles |
1.00 |
ML-KEM-768 keypair |
63791 cycles |
63653 cycles |
1.00 |
ML-KEM-768 encaps |
74835 cycles |
74828 cycles |
1.00 |
ML-KEM-768 decaps |
93545 cycles |
93492 cycles |
1.00 |
ML-KEM-1024 keypair |
95086 cycles |
95144 cycles |
1.00 |
ML-KEM-1024 encaps |
109123 cycles |
109324 cycles |
1.00 |
ML-KEM-1024 decaps |
131844 cycles |
132082 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Graviton3
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
18659 cycles |
18662 cycles |
1.00 |
ML-KEM-512 encaps |
21885 cycles |
21886 cycles |
1.00 |
ML-KEM-512 decaps |
28878 cycles |
28869 cycles |
1.00 |
ML-KEM-768 keypair |
31590 cycles |
31571 cycles |
1.00 |
ML-KEM-768 encaps |
34743 cycles |
34808 cycles |
1.00 |
ML-KEM-768 decaps |
44810 cycles |
44796 cycles |
1.00 |
ML-KEM-1024 keypair |
46072 cycles |
46156 cycles |
1.00 |
ML-KEM-1024 encaps |
51506 cycles |
51415 cycles |
1.00 |
ML-KEM-1024 decaps |
65017 cycles |
65009 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Intel Xeon 3rd gen (c6i) (no-opt)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
45677 cycles |
45683 cycles |
1.00 |
ML-KEM-512 encaps |
54449 cycles |
54200 cycles |
1.00 |
ML-KEM-512 decaps |
69776 cycles |
69776 cycles |
1 |
ML-KEM-768 keypair |
74187 cycles |
74192 cycles |
1.00 |
ML-KEM-768 encaps |
86051 cycles |
86015 cycles |
1.00 |
ML-KEM-768 decaps |
106633 cycles |
106640 cycles |
1.00 |
ML-KEM-1024 keypair |
112097 cycles |
112026 cycles |
1.00 |
ML-KEM-1024 encaps |
124606 cycles |
124528 cycles |
1.00 |
ML-KEM-1024 decaps |
150640 cycles |
150656 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Graviton4 (no-opt)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
35447 cycles |
35449 cycles |
1.00 |
ML-KEM-512 encaps |
40090 cycles |
40183 cycles |
1.00 |
ML-KEM-512 decaps |
51102 cycles |
51185 cycles |
1.00 |
ML-KEM-768 keypair |
56747 cycles |
56701 cycles |
1.00 |
ML-KEM-768 encaps |
64546 cycles |
64956 cycles |
0.99 |
ML-KEM-768 decaps |
79383 cycles |
79034 cycles |
1.00 |
ML-KEM-1024 keypair |
88002 cycles |
87863 cycles |
1.00 |
ML-KEM-1024 encaps |
96984 cycles |
96878 cycles |
1.00 |
ML-KEM-1024 decaps |
116076 cycles |
115822 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Graviton2
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
28266 cycles |
28304 cycles |
1.00 |
ML-KEM-512 encaps |
34180 cycles |
34176 cycles |
1.00 |
ML-KEM-512 decaps |
44384 cycles |
44456 cycles |
1.00 |
ML-KEM-768 keypair |
47627 cycles |
47692 cycles |
1.00 |
ML-KEM-768 encaps |
53950 cycles |
54036 cycles |
1.00 |
ML-KEM-768 decaps |
68359 cycles |
68554 cycles |
1.00 |
ML-KEM-1024 keypair |
70331 cycles |
70252 cycles |
1.00 |
ML-KEM-1024 encaps |
78776 cycles |
78788 cycles |
1.00 |
ML-KEM-1024 decaps |
98509 cycles |
98488 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Graviton3 (no-opt)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
38940 cycles |
38939 cycles |
1.00 |
ML-KEM-512 encaps |
44527 cycles |
44634 cycles |
1.00 |
ML-KEM-512 decaps |
56584 cycles |
56707 cycles |
1.00 |
ML-KEM-768 keypair |
62337 cycles |
62345 cycles |
1.00 |
ML-KEM-768 encaps |
71069 cycles |
71769 cycles |
0.99 |
ML-KEM-768 decaps |
87343 cycles |
87100 cycles |
1.00 |
ML-KEM-1024 keypair |
96356 cycles |
96161 cycles |
1.00 |
ML-KEM-1024 encaps |
106137 cycles |
106143 cycles |
1.00 |
ML-KEM-1024 decaps |
126884 cycles |
126598 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Graviton2 (no-opt)
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
59076 cycles |
59084 cycles |
1.00 |
ML-KEM-512 encaps |
68596 cycles |
68639 cycles |
1.00 |
ML-KEM-512 decaps |
87335 cycles |
87370 cycles |
1.00 |
ML-KEM-768 keypair |
95506 cycles |
95804 cycles |
1.00 |
ML-KEM-768 encaps |
110331 cycles |
109900 cycles |
1.00 |
ML-KEM-768 decaps |
134533 cycles |
135444 cycles |
0.99 |
ML-KEM-1024 keypair |
147674 cycles |
147985 cycles |
1.00 |
ML-KEM-1024 encaps |
163740 cycles |
163868 cycles |
1.00 |
ML-KEM-1024 decaps |
195321 cycles |
195550 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
SpacemiT K1 8 (Banana Pi F3) benchmarks
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
155372 cycles |
155465 cycles |
1.00 |
ML-KEM-512 encaps |
163233 cycles |
163351 cycles |
1.00 |
ML-KEM-512 decaps |
206469 cycles |
206588 cycles |
1.00 |
ML-KEM-768 keypair |
249714 cycles |
249883 cycles |
1.00 |
ML-KEM-768 encaps |
270195 cycles |
270397 cycles |
1.00 |
ML-KEM-768 decaps |
332474 cycles |
332134 cycles |
1.00 |
ML-KEM-1024 keypair |
395444 cycles |
395687 cycles |
1.00 |
ML-KEM-1024 encaps |
423271 cycles |
423463 cycles |
1.00 |
ML-KEM-1024 decaps |
506841 cycles |
507256 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks
Details
| Benchmark suite | Current: 7d9af85 | Previous: 4e44546 | Ratio |
|---|---|---|---|
ML-KEM-512 keypair |
50770 cycles |
51003 cycles |
1.00 |
ML-KEM-512 encaps |
59441 cycles |
58881 cycles |
1.01 |
ML-KEM-512 decaps |
76264 cycles |
75041 cycles |
1.02 |
ML-KEM-768 keypair |
86971 cycles |
85748 cycles |
1.01 |
ML-KEM-768 encaps |
96215 cycles |
93667 cycles |
1.03 |
ML-KEM-768 decaps |
119562 cycles |
117498 cycles |
1.02 |
ML-KEM-1024 keypair |
130182 cycles |
130677 cycles |
1.00 |
ML-KEM-1024 encaps |
142627 cycles |
143022 cycles |
1.00 |
ML-KEM-1024 decaps |
173949 cycles |
173967 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
CBMC Results (ML-KEM-512)Full Results (185 proofs)
|
CBMC Results (ML-KEM-768)
Full Results (185 proofs)
|
CBMC Results (ML-KEM-1024)Full Results (185 proofs)
|
3e666b0 to
e67fb4b
Compare
e67fb4b to
c6fd37e
Compare
poly_compress_d{4,5,10,11}poly_compress_d{4,5,10,11}
c6fd37e to
56b827d
Compare
56b827d to
7d9af85
Compare
5511a29 to
10d1f9c
Compare
b44f1b7 to
3736083
Compare
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
…sembly Replace the C intrinsics implementation of the four x86_64 AVX2 poly_compress functions with handwritten assembly, enabling verification in HOL Light. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Add functional correctness proofs for the four x86_64 AVX2 poly_compress functions. Each proof verifies the assembly against the compress_d specification from FIPS 203. Co-authored-by: Hanno Becker <beckphan@amazon.co.uk> Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Add subroutine-level constant-time proofs for poly_compress_d{4,5,10,11}.
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
All x86_64 ML-KEM arithmetic intrinsics have been rewritten in assembly and verified, so update the verification claims accordingly. Note that the x86_64 FIPS-202 (Keccak) backend remains in C intrinsics and is not yet covered by proof. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
3736083 to
602e37e
Compare
hanno-becker
left a comment
There was a problem hiding this comment.
Nice teamwork 😃 LGTM.
No description provided.