Commit a808079

adding an XOR/Fuse benchmark
1 parent b1b347d commit a808079

3 files changed: 232 additions, 0 deletions

README.md

Lines changed: 67 additions & 0 deletions
@@ -87,5 +87,72 @@ and with less than 1% probability "Found" or "Found; common".

Internally, the tool uses a xor+ filter (see above) with 8 bits per fingerprint. Actually, 1024 smaller filters (segments) are made, the segment id being the highest 10 bits of the key. The lowest bit of the key is set to either 0 (regular) or 1 (common), and so two lookups are made per password. Because of that, the false positive rate is twice what it would be with just one lookup (0.0078 instead of 0.0039). A regular Bloom filter with the same guarantees would be ~760 MB. For each lookup, one filter segment (so, less than 1 MB) is read from the file.
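
The arithmetic in the paragraph above can be sketched as follows. This is only an illustration, not the tool's actual code: the class and method names are hypothetical, and it assumes 64-bit keys and a single-lookup false positive rate of 1/2^8 for 8-bit fingerprints. Doubling that rate for two lookups is an upper bound that is very close to the exact value.

```java
import java.util.Locale;

// Hypothetical sketch of the segment/key layout described above; not the tool's real code.
final class SegmentedLookupSketch {
    static final int FINGERPRINT_BITS = 8; // bits per fingerprint
    static final int SEGMENT_BITS = 10;    // 2^10 = 1024 segments

    // Segment id: the highest 10 bits of the (assumed 64-bit) key.
    static int segmentId(long key) {
        return (int) (key >>> (Long.SIZE - SEGMENT_BITS));
    }

    // Lowest bit cleared: the "regular" variant of the key.
    static long regularKey(long key) {
        return key & ~1L;
    }

    // Lowest bit set: the "common" variant of the key.
    static long commonKey(long key) {
        return key | 1L;
    }

    // Assumed false positive rate of a single lookup: 1 / 2^bits.
    static double singleLookupRate() {
        return 1.0 / (1 << FINGERPRINT_BITS);
    }

    // Two lookups per password: at most twice the single-lookup rate.
    static double twoLookupBound() {
        return 2.0 * singleLookupRate();
    }

    public static void main(String[] args) {
        long key = 0xFFC0_0000_0000_1234L; // arbitrary example key
        System.out.println("segment id:  " + segmentId(key));
        System.out.println("regular key: " + Long.toHexString(regularKey(key)));
        System.out.println("common key:  " + Long.toHexString(commonKey(key)));
        System.out.println(String.format(Locale.ROOT,
                "one lookup: %.4f, two lookups: at most %.4f",
                singleLookupRate(), twoLookupBound()));
    }
}
```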

## Benchmarks

The project includes JMH (Java Microbenchmark Harness) benchmarks that measure the lookup performance of the filters.

### Running Benchmarks

#### Option 1: Run via Maven (recommended)

To run the benchmarks directly from Maven with a quick configuration (one fork, one warmup iteration, one measurement iteration):

    mvn -pl jmh clean package exec:exec@run-benchmarks

This compiles and executes the JMH benchmarks for the XOR filters (XOR_8, XOR_16, XOR_BINARY_FUSE_8, XOR_BINARY_FUSE_16). For full benchmarks, modify the pom.xml or run the JAR manually with custom parameters.

#### Option 2: Run the JAR manually

First, build the project:

    mvn clean package

Then run the benchmarks:

    java -jar jmh/target/benchmarks.jar org.fastfilter.FilterBenchmark

To run benchmarks for a specific filter type:

    java -jar jmh/target/benchmarks.jar org.fastfilter.FilterBenchmark -p filterType=XOR_BINARY_FUSE_8

Available filter types: `XOR_8`, `XOR_16`, `XOR_BINARY_FUSE_8`, `XOR_BINARY_FUSE_16`.

### Benchmark Details

The benchmarks measure:
- Average time per operation (nanoseconds) for lookups of existing and non-existing keys
- Throughput (operations per second) for the same operations
- False positive rate validation
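
To make the false positive validation concrete: the benchmark in this commit probes 1,000,000 keys that are not in the filter and fails if more than 10,000 of them are reported as present. The sketch below estimates how many false positives to expect, assuming a rate of about 1/2^k for k-bit fingerprints. That rate is an approximation, and the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch: compare the expected false positive count with the benchmark's limit.
final class FalsePositiveBudget {
    static final int NUM_PROBES = 1_000_000;       // non-existing keys probed per run
    static final int MAX_FALSE_POSITIVES = 10_000; // limit used by the benchmark (1% of the probes)

    // Expected false positives, assuming a rate of 1 / 2^fingerprintBits.
    static double expectedFalsePositives(int fingerprintBits, int probes) {
        return probes / Math.pow(2, fingerprintBits);
    }

    // The benchmark throws only when the observed count exceeds the limit.
    static boolean withinBudget(int observed) {
        return observed <= MAX_FALSE_POSITIVES;
    }

    public static void main(String[] args) {
        for (int bits : new int[] {8, 16}) {
            double expected = expectedFalsePositives(bits, NUM_PROBES);
            System.out.println(String.format(Locale.ROOT,
                    "%d-bit fingerprints: about %.0f expected false positives (limit %d)",
                    bits, expected, MAX_FALSE_POSITIVES));
        }
    }
}
```

Under this assumption the 8-bit filters should stay well below the limit, and the 16-bit filters should produce only a handful of false positives.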


Sample results (the scores use a decimal comma, so `2,475 ns/op` means about 2.5 ns per operation):

```
Benchmark                                                     (filterType)   Mode  Cnt          Score  Error  Units
FilterBenchmark.benchmarkContainsExistingThroughput                  XOR_8  thrpt       412364492,755         ops/s
FilterBenchmark.benchmarkContainsExistingThroughput                 XOR_16  thrpt       397627818,837         ops/s
FilterBenchmark.benchmarkContainsExistingThroughput      XOR_BINARY_FUSE_8  thrpt       516262004,459         ops/s
FilterBenchmark.benchmarkContainsExistingThroughput     XOR_BINARY_FUSE_16  thrpt       489256453,340         ops/s
FilterBenchmark.benchmarkContainsNonExistingThroughput               XOR_8  thrpt       429856367,135         ops/s
FilterBenchmark.benchmarkContainsNonExistingThroughput              XOR_16  thrpt       441042890,257         ops/s
FilterBenchmark.benchmarkContainsNonExistingThroughput   XOR_BINARY_FUSE_8  thrpt       533609392,046         ops/s
FilterBenchmark.benchmarkContainsNonExistingThroughput  XOR_BINARY_FUSE_16  thrpt       540058414,150         ops/s
FilterBenchmark.benchmarkContainsExisting                            XOR_8   avgt               2,475         ns/op
FilterBenchmark.benchmarkContainsExisting                           XOR_16   avgt               2,522         ns/op
FilterBenchmark.benchmarkContainsExisting                XOR_BINARY_FUSE_8   avgt               1,965         ns/op
FilterBenchmark.benchmarkContainsExisting               XOR_BINARY_FUSE_16   avgt               2,060         ns/op
FilterBenchmark.benchmarkContainsNonExisting                         XOR_8   avgt               2,347         ns/op
FilterBenchmark.benchmarkContainsNonExisting                        XOR_16   avgt               2,295         ns/op
FilterBenchmark.benchmarkContainsNonExisting             XOR_BINARY_FUSE_8   avgt               1,892         ns/op
FilterBenchmark.benchmarkContainsNonExisting            XOR_BINARY_FUSE_16   avgt               1,903         ns/op
```

These results indicate that the filters can answer roughly 400 to 540 million queries per second, which corresponds to about 2 ns per query.
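
The two units are reciprocals of each other: a rate in ops/s equals 10^9 divided by the time in ns/op. The following sketch shows the conversion; the class and method names are hypothetical, and the sample values are taken from the results listed here.

```java
import java.util.Locale;

// Hypothetical helper for converting between JMH's avgt (ns/op) and thrpt (ops/s) units.
final class RateConversion {
    static final double NANOS_PER_SECOND = 1e9;

    // Operations per second implied by an average time per operation.
    static double opsPerSecond(double nanosPerOp) {
        return NANOS_PER_SECOND / nanosPerOp;
    }

    // Average nanoseconds per operation implied by a throughput.
    static double nanosPerOp(double opsPerSecond) {
        return NANOS_PER_SECOND / opsPerSecond;
    }

    public static void main(String[] args) {
        // 1.965 ns/op (XOR_BINARY_FUSE_8, existing keys) expressed as a rate.
        System.out.println(String.format(Locale.ROOT,
                "1.965 ns/op is about %.0f million ops/s", opsPerSecond(1.965) / 1e6));
        // 412364492.755 ops/s (XOR_8, existing keys) expressed as a time per operation.
        System.out.println(String.format(Locale.ROOT,
                "412364492.755 ops/s is about %.3f ns/op", nanosPerOp(412364492.755)));
    }
}
```

The average-time and throughput runs are separate measurements, so the converted figures need not match the measured ones exactly.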

The benchmarks use 1,000,000 keys by default. To use a smaller or larger test set, change the `NUM_KEYS` constant in `FilterBenchmark.java`.


jmh/pom.xml

Lines changed: 35 additions & 0 deletions
@@ -39,6 +39,13 @@
                    <fork>true</fork>
                    <source>${maven.compiler.source}</source>
                    <target>${maven.compiler.target}</target>
                    <annotationProcessorPaths>
                        <path>
                            <groupId>org.openjdk.jmh</groupId>
                            <artifactId>jmh-generator-annprocess</artifactId>
                            <version>${jmh.version}</version>
                        </path>
                    </annotationProcessorPaths>
                    <showDeprecation>true</showDeprecation>
                    <failOnError>true</failOnError>
                    <showWarnings>true</showWarnings>
@@ -67,6 +74,34 @@
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>3.1.0</version>
                <executions>
                    <execution>
                        <id>run-benchmarks</id>
                        <goals>
                            <goal>exec</goal>
                        </goals>
                        <configuration>
                            <executable>java</executable>
                            <!-- Quick run: 1 fork (-f), 1 warmup iteration (-wi), 1 measurement iteration (-i) -->
                            <arguments>
                                <argument>-jar</argument>
                                <argument>${project.build.directory}/benchmarks.jar</argument>
                                <argument>org.fastfilter.FilterBenchmark</argument>
                                <argument>-f</argument>
                                <argument>1</argument>
                                <argument>-wi</argument>
                                <argument>1</argument>
                                <argument>-i</argument>
                                <argument>1</argument>
                            </arguments>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
package org.fastfilter;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import org.fastfilter.Filter;
import org.fastfilter.xor.Xor8;
import org.fastfilter.xor.Xor16;
import org.fastfilter.xor.XorBinaryFuse8;
import org.fastfilter.xor.XorBinaryFuse16;

import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@State(Scope.Benchmark)
public class FilterBenchmark {

    @Param({"XOR_8", "XOR_16", "XOR_BINARY_FUSE_8", "XOR_BINARY_FUSE_16"})
    public String filterType;

    private Filter filter;
    private long[] testKeys;
    // Number of keys in the filter; also the operation count reported to JMH per invocation.
    private static final int NUM_KEYS = 1_000_000;

    @Setup
    public void setup() {
        // Create NUM_KEYS keys (even numbers), so that odd numbers are never members.
        testKeys = new long[NUM_KEYS];
        for (int i = 0; i < testKeys.length; i++) {
            testKeys[i] = (long) i * 2L; // even numbers
        }

        try {
            switch (filterType) {
                case "XOR_8":
                    filter = Xor8.construct(testKeys);
                    break;
                case "XOR_16":
                    filter = Xor16.construct(testKeys);
                    break;
                case "XOR_BINARY_FUSE_8":
                    filter = XorBinaryFuse8.construct(testKeys);
                    break;
                case "XOR_BINARY_FUSE_16":
                    filter = XorBinaryFuse16.construct(testKeys);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown filter type: " + filterType);
            }
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }

    @TearDown
    public void tearDown() {
        filter = null;
        testKeys = null;
    }

    // Average time per lookup of keys that are in the filter; a miss would be a false negative.
    @Benchmark
    @OperationsPerInvocation(NUM_KEYS)
    public void benchmarkContainsExisting(Blackhole blackhole) throws Throwable {
        for (long key : testKeys) {
            if (!filter.mayContain(key)) {
                throw new RuntimeException("Key should exist: " + key);
            }
        }
    }

    // Average time per lookup of keys that are not in the filter; every hit is a false positive.
    @Benchmark
    @OperationsPerInvocation(NUM_KEYS)
    public void benchmarkContainsNonExisting(Blackhole blackhole) throws Throwable {
        int fp = 0;
        for (int i = 0; i < testKeys.length; i++) {
            long key = (long) i * 2L + 1L; // odd numbers
            if (filter.mayContain(key)) {
                fp++;
            }
        }
        if (fp > 10000) {
            throw new RuntimeException("Too many false positives: " + fp);
        }
    }

    // Same workload as benchmarkContainsExisting, reported as operations per second.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.SECONDS)
    @OperationsPerInvocation(NUM_KEYS)
    public void benchmarkContainsExistingThroughput(Blackhole blackhole) throws Throwable {
        for (long key : testKeys) {
            if (!filter.mayContain(key)) {
                throw new RuntimeException("Key should exist: " + key);
            }
        }
    }

    // Same workload as benchmarkContainsNonExisting, reported as operations per second.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.SECONDS)
    @OperationsPerInvocation(NUM_KEYS)
    public void benchmarkContainsNonExistingThroughput(Blackhole blackhole) throws Throwable {
        int fp = 0;
        for (int i = 0; i < testKeys.length; i++) {
            long key = (long) i * 2L + 1L; // odd numbers
            if (filter.mayContain(key)) {
                fp++;
            }
        }
        if (fp > 10000) {
            throw new RuntimeException("Too many false positives: " + fp);
        }
    }

    // Allows running the benchmarks from an IDE without the benchmarks.jar launcher.
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(FilterBenchmark.class.getSimpleName())
                .build();

        new Runner(opt).run();
    }
}
