feat(ingest): local Maple binary (maple start) — OTLP ingest + embedded chDB #64
Draft
Makisuo wants to merge 2 commits into
Ingest Rust Test + Benchmark Results
Commit:
Load Benchmark

| Metric | main (median) | PR (median) | Delta |
|---|---|---|---|
| Requests/sec | 1915.22 | 2101.60 | +9.7% better |
| Rows/sec | 19152.17 | 21016.05 | +9.7% better |
| p50 latency | 32.29 ms | 29.88 ms | -7.5% better |
| p95 latency | 39.04 ms | 34.26 ms | -12.2% better |
| p99 latency | 40.24 ms | 42.02 ms | +4.4% worse |
| Export catch-up | 0.026 s | 0.026 s | -0.5% better |
| Max RSS | 103.05 MiB | 98.38 MiB | -4.5% better |
| Failures | 0 | 0 | same |
Same code path on both sides (same LOAD_TEST_INGEST_MODE), so the delta column is meaningful. Numbers come from ubuntu-latest, which is noisy — treat single-digit-percent deltas as noise.
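As a rough illustration of how the median and delta columns can be reproduced from the per-iteration JSON, here is a minimal sketch. The helper names `median` and `delta_percent` are made up for this example and are not taken from the benchmark harness; the input values are the `request_rps` figures reported in this comment.

```rust
/// Median of a small sample; assumes a non-empty slice with no NaN values.
fn median(samples: &[f64]) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("no NaN in samples"));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    }
}

/// Relative change of `pr` against `main`, in percent.
fn delta_percent(main: f64, pr: f64) -> f64 {
    (pr - main) / main * 100.0
}

fn main() {
    // request_rps values copied from the per-iteration JSON in this comment.
    let main_rps = [1840.7570958095873, 1915.217056432809, 1997.8490457925832];
    let pr_rps = [1903.799394556001, 2125.221615187852, 2101.604688876771];

    let (m, p) = (median(&main_rps), median(&pr_rps));
    println!("main median: {m:.2} req/s");
    println!("PR median:   {p:.2} req/s");
    println!("delta:       {:+.1}%", delta_percent(m, p));
}
```

With three iterations per side, the median is simply the middle run, which is why a single slow iteration (such as the 54.175 ms p99 on main) does not move the table.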
PR load benchmark JSON (per-iteration)
[
{
"ingest_mode": "tinybird",
"requests": 2000,
"successes": 2000,
"failures": 0,
"rows_sent": 20000,
"rows_exported": 20000,
"imports": 27,
"duration_seconds": 1.050530852,
"export_catchup_seconds": 0.026313127,
"request_rps": 1903.799394556001,
"row_rps": 19037.99394556001,
"p50_ms": 32.175,
"p95_ms": 39.655,
"p99_ms": 42.022,
"max_rss_mb": 101.7421875,
"max_cpu_percent": 62.5,
"avg_cpu_percent": 47.03333333333333
},
{
"ingest_mode": "tinybird",
"requests": 2000,
"successes": 2000,
"failures": 0,
"rows_sent": 20000,
"rows_exported": 20000,
"imports": 24,
"duration_seconds": 0.941078326,
"export_catchup_seconds": 0.025968481,
"request_rps": 2125.221615187852,
"row_rps": 21252.21615187852,
"p50_ms": 28.951,
"p95_ms": 33.652,
"p99_ms": 43.952,
"max_rss_mb": 98.375,
"max_cpu_percent": 69.6,
"avg_cpu_percent": 43.099999999999994
},
{
"ingest_mode": "tinybird",
"requests": 2000,
"successes": 2000,
"failures": 0,
"rows_sent": 20000,
"rows_exported": 20000,
"imports": 24,
"duration_seconds": 0.951653758,
"export_catchup_seconds": 0.026092371,
"request_rps": 2101.604688876771,
"row_rps": 21016.046888767713,
"p50_ms": 29.88,
"p95_ms": 34.262,
"p99_ms": 39.114,
"max_rss_mb": 97.703125,
"max_cpu_percent": 67.8,
"avg_cpu_percent": 42.2
}
]
main load benchmark JSON (per-iteration)
[
{
"ingest_mode": "tinybird",
"requests": 2000,
"successes": 2000,
"failures": 0,
"rows_sent": 20000,
"rows_exported": 20000,
"imports": 25,
"duration_seconds": 1.086509461,
"export_catchup_seconds": 0.025871174,
"request_rps": 1840.7570958095873,
"row_rps": 18407.570958095876,
"p50_ms": 34.35,
"p95_ms": 39.041,
"p99_ms": 40.245,
"max_rss_mb": 103.05078125,
"max_cpu_percent": 60.7,
"avg_cpu_percent": 45.5
},
{
"ingest_mode": "tinybird",
"requests": 2000,
"successes": 2000,
"failures": 0,
"rows_sent": 20000,
"rows_exported": 20000,
"imports": 26,
"duration_seconds": 1.04426806,
"export_catchup_seconds": 0.026219733,
"request_rps": 1915.217056432809,
"row_rps": 19152.170564328087,
"p50_ms": 32.294,
"p95_ms": 39.253,
"p99_ms": 54.175,
"max_rss_mb": 104.3359375,
"max_cpu_percent": 66.0,
"avg_cpu_percent": 48.0
},
{
"ingest_mode": "tinybird",
"requests": 2000,
"successes": 2000,
"failures": 0,
"rows_sent": 20000,
"rows_exported": 20000,
"imports": 25,
"duration_seconds": 1.001076635,
"export_catchup_seconds": 0.027458026,
"request_rps": 1997.8490457925832,
"row_rps": 19978.490457925833,
"p50_ms": 31.151,
"p95_ms": 35.031,
"p99_ms": 35.913,
"max_rss_mb": 102.89453125,
"max_cpu_percent": 67.8,
"avg_cpu_percent": 54.96666666666666
}
]
WAL-acked microbench (`cargo bench --bench ingest_bench`)
Compiling maple-ingest v0.1.0 (/home/runner/work/maple/maple/apps/ingest)
Finished `bench` profile [optimized] target(s) in 33.16s
Running benches/ingest_bench.rs (target/release/deps/ingest_bench-56e0fe315b7f3811)
Gnuplot not found, using plotters backend
test ingest_accept/logs_10_rows_wal_ack ... bench: 553592 ns/iter (+/- 8891)
test ingest_accept/traces_10_spans_wal_ack ... bench: 582302 ns/iter (+/- 22359)
cargo test
Updating crates.io index
Compiling maple-ingest v0.1.0 (/home/runner/work/maple/maple/apps/ingest)
Finished `test` profile [unoptimized + debuginfo] target(s) in 5.92s
Running unittests src/lib.rs (target/debug/deps/maple_ingest-8b9e9fc61a910385)
running 22 tests
test otel::tests::build_resource_sets_runtime_and_sdk_type ... ok
test telemetry::tests::apply_attribute_mappings_rewrites_span_attributes ... ok
test telemetry::tests::hex_empty_for_zero_ids ... ok
test telemetry::tests::log_encoder_matches_tinybird_row_shape ... ok
test telemetry::tests::logs_emit_exactly_the_jsonpaths_declared_in_datasources_ts ... ok
test telemetry::tests::custom_datasource_names_propagate_to_frames ... ok
test telemetry::tests::logs_severity_text_falls_back_to_mapped_number ... ok
test telemetry::tests::logs_use_observed_time_when_time_unix_nano_is_zero ... ok
test telemetry::tests::metric_encoder_matches_all_tinybird_datasource_shapes ... ok
test telemetry::tests::metrics_summary_data_points_are_dropped ... ok
test telemetry::tests::metrics_emit_exactly_the_jsonpaths_declared_in_datasources_ts ... ok
test telemetry::tests::sampling_keeps_errors_even_when_ratio_low ... ok
test telemetry::tests::timestamp_has_nano_precision ... ok
test telemetry::tests::timestamps_match_clickhouse_datetime64_nine_format ... ok
test telemetry::tests::trace_encoder_matches_tinybird_row_shape ... ok
test telemetry::tests::traces_emit_exactly_the_jsonpaths_declared_in_datasources_ts ... ok
test telemetry::tests::wal_partial_drain_advances_cursor_without_truncating ... ok
test telemetry::tests::wal_round_trips_frame ... ok
test telemetry::tests::wal_truncates_after_full_drain_allowing_further_appends ... ok
test telemetry::tests::pipeline_e2e_exports_traces_to_fake_tinybird ... ok
test telemetry::tests::pipeline_e2e_exports_gzip_ndjson_to_fake_tinybird ... ok
test telemetry::tests::pipeline_e2e_exports_metrics_to_fake_tinybird ... ok
test result: ok. 22 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.14s
Running unittests src/bin/load_test.rs (target/debug/deps/load_test-3ae74910c06cd17d)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running unittests src/main.rs (target/debug/deps/maple_ingest-c2f428c94e6b99e8)
running 18 tests
test tests::cloudflare_log_record_maps_body_severity_and_attributes ... ok
test tests::cloudflare_timestamps_support_rfc3339_unix_and_unix_nano ... ok
test tests::cloudflare_validation_payload_is_detected ... ok
test tests::cloudflare_ndjson_payload_parses_multiple_records ... ok
test tests::d1_response_parses_empty_results_as_no_match ... ok
test tests::d1_response_parses_failure_with_errors ... ok
test tests::d1_truthy_accepts_int_and_bool_self_managed ... ok
test tests::enrichment_overwrites_tenant_fields ... ok
test tests::extract_ingest_key_returns_sentinel_literal_unchanged ... ok
test tests::d1_response_parses_success_with_rows ... ok
test tests::hash_is_deterministic ... ok
test tests::non_self_managed_goes_to_shared_pool ... ok
test tests::resolve_ingest_key_returns_none_when_hash_missing ... ok
test tests::self_managed_degrades_to_shared_when_endpoint_unset ... ok
test tests::resolve_ingest_key_returns_self_managed_false_when_no_settings_row ... ok
test tests::self_managed_goes_to_self_managed_pool_when_configured ... ok
test tests::sentinel_token_matches_only_exact_literal ... ok
test tests::resolve_ingest_key_returns_self_managed_true_when_active_settings_row ... ok
test result: ok. 18 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s
Doc-tests maple_ingest
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Adds `maple start`, a standalone single-binary local mode: OTLP/HTTP
ingest into an embedded in-process ClickHouse (chDB), reusing the
production OTLP→NDJSON encoders and the generated ClickHouse schema so
local rows are shaped identically to cloud. Single-tenant (OrgId="local").
- chdb module: one dedicated writer thread owns the chDB session; all
bootstrap/insert/query is funneled through it (chDB is single-owner).
- new `maple` bin gated behind the `local` cargo feature so the
production maple-ingest build never links libchdb; clap CLI, Axum
routes, rust-embed SPA fallback.
- telemetry::encode_local_{traces,logs,metrics} wrap the private encoders
for zero row-mapping divergence with the Tinybird path.
- schema codegen: emit local-schema.sql + local-inserts.json from the
Tinybird manifest, wired into the clickhouse:schema task.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
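The single-owner arrangement described in this commit (one writer thread owns the session, everything else is funneled through it) can be sketched with the standard library alone. This is only an illustration of the pattern: `FakeSession`, `Command`, and `Writer` are hypothetical names, and `FakeSession` is a stand-in that does not use the chDB API or reflect the actual `chdb` module.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for an embedded database session that must only
/// ever be touched from one thread. It is NOT the chDB API.
struct FakeSession {
    rows: Vec<String>,
}

impl FakeSession {
    fn insert(&mut self, row: String) {
        self.rows.push(row);
    }
    fn count(&self) -> usize {
        self.rows.len()
    }
}

/// Work items sent to the writer thread. Each carries a reply channel so the
/// caller can wait for the result.
enum Command {
    Insert { row: String, reply: mpsc::Sender<()> },
    Count { reply: mpsc::Sender<usize> },
}

/// Cloneable handle; any thread may hold one, but only the writer thread
/// owns the session itself.
#[derive(Clone)]
struct Writer {
    tx: mpsc::Sender<Command>,
}

impl Writer {
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel::<Command>();
        thread::spawn(move || {
            // The session is created on, and never leaves, this thread.
            let mut session = FakeSession { rows: Vec::new() };
            // The loop ends once every `Writer` handle has been dropped.
            for cmd in rx {
                match cmd {
                    Command::Insert { row, reply } => {
                        session.insert(row);
                        let _ = reply.send(());
                    }
                    Command::Count { reply } => {
                        let _ = reply.send(session.count());
                    }
                }
            }
        });
        Writer { tx }
    }

    fn insert(&self, row: &str) {
        let (reply, done) = mpsc::channel();
        let cmd = Command::Insert { row: row.to_owned(), reply };
        self.tx.send(cmd).expect("writer thread alive");
        done.recv().expect("writer thread replied");
    }

    fn count(&self) -> usize {
        let (reply, done) = mpsc::channel();
        self.tx.send(Command::Count { reply }).expect("writer thread alive");
        done.recv().expect("writer thread replied")
    }
}

fn main() {
    let writer = Writer::spawn();
    // Several producer threads share the handle; inserts are serialized
    // by the channel, so the session never sees concurrent access.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let w = writer.clone();
            thread::spawn(move || w.insert(&format!("{{\"span\":{i}}}")))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("rows stored: {}", writer.count());
}
```

The per-command reply channel is one possible design choice: it lets callers block until their insert has been applied. In an async server a oneshot channel would be the more usual substitute.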
2566c68 to
ba7db11
Pre-existing regression on `main` (introduced by ac723de "feat: fix some react stuff", which rewrote this provider to use createElement). main's CI is red on the same error; this branch inherits it via rebase onto main.

AutocompleteValuesProvider's `children` prop is required, so React 19's createElement overload requires it in the props object; passing it as the variadic 3rd arg leaves the required prop unsatisfied (TS2769). Move children into the props object. Functionally identical; unblocks @maple/web typecheck.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Phase 2 of the lightweight local Maple effort: a standalone single-binary local mode started with `maple start`. It runs an OTLP/HTTP ingest endpoint backed by an embedded in-process ClickHouse (chDB), with no separate server and no Docker, and reuses the production OTLP→NDJSON encoders and the generated ClickHouse schema, so local rows are shaped identically to cloud rows. Single-tenant: every row is pinned to `OrgId = "local"`.

This PR is backend only: Phases 3–5 (component extraction, the local SPA, and cross-platform packaging) are deferred. The bundled UI is a placeholder page.
What's here
- `chdb` module (`apps/ingest/src/chdb.rs`): chDB is single-owner (one OS thread per data dir), so a dedicated writer thread owns the session and all bootstrap / insert / query work is funneled to it over a channel. Schema bootstrap via `Arg::MultiQuery`; inserts via `INSERT … SELECT … FROM format(JSONEachRow, …)` with the org pinned.
- `maple` bin (`apps/ingest/src/bin/local.rs`): gated behind a new `local` cargo feature so the production `maple-ingest` build never links `libchdb` (~319 MB). clap CLI (`start --port --data-dir`), Axum routes (`/v1/{traces,logs,metrics}`, `/local/query`, `/health`), `rust-embed` SPA fallback.
- `telemetry::encode_local_{traces,logs,metrics}`: thin wrappers over the existing private encoders for zero row-mapping divergence with the Tinybird path.
- `generate-clickhouse-schema-sql.ts` + `generate-clickhouse-insert-mappings.ts` emit `local-schema.sql` + `local-inserts.json` from the Tinybird manifest, wired into the `clickhouse:schema` task. (Includes the `db2debb5` revision bump the artifacts embed.)

Validated
- … (`trace_list_mv`, `service_map_spans`, `traces_aggregates_hourly` AggregatingMergeTree) and computed DEFAULT columns populate (`IsEntryPoint`, `SampleRate`).
- `/local/query` returns JSON arrays; empty → `[]`, bad SQL → 500.
- `cargo check --lib` stays clean (no libchdb without `--features local`).

Not yet (follow-ups)
- `apps/web` → `packages/ui`
- `apps/local-ui` SPA, bundled into the binary
- `libchdb` linking / rpath (`DYLD_LIBRARY_PATH` currently required at runtime) + macOS codesigning

Test plan
- `cargo build --features local -p maple-ingest --bin maple`
- `maple start`, send OTLP via an instrumented app, `POST /local/query` with `SELECT count() FROM traces`
- `bun run clickhouse:schema:check` passes

🤖 Generated with Claude Code
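The `/local/query` behavior listed under Validated (a JSON array on success, `[]` for an empty result, 500 for bad SQL) can be sketched as a plain mapping from a query outcome to a status and body. This is a standalone illustration, not the handler in `local.rs`: `QueryOutcome` and `to_response` are hypothetical names, and the 200 success status and plain-text error body are assumptions the description does not state.

```rust
/// Hypothetical result of running a query against the embedded database.
/// Each row is assumed to be an already-serialized JSON object.
enum QueryOutcome {
    Rows(Vec<String>),
    SqlError(String),
}

/// Maps an outcome to an (HTTP status, body) pair.
/// The 200 status and the plain-text error body are assumptions.
fn to_response(outcome: QueryOutcome) -> (u16, String) {
    match outcome {
        // Joining zero rows yields "[]", so an empty result is still a JSON array.
        QueryOutcome::Rows(rows) => (200, format!("[{}]", rows.join(","))),
        QueryOutcome::SqlError(message) => (500, message),
    }
}

fn main() {
    // Made-up rows and error text, purely for illustration.
    let cases = vec![
        QueryOutcome::Rows(vec![r#"{"count()":42}"#.to_string()]),
        QueryOutcome::Rows(Vec::new()),
        QueryOutcome::SqlError("syntax error near 'SELEC'".to_string()),
    ];
    for case in cases {
        let (status, body) = to_response(case);
        println!("{status} {body}");
    }
}
```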