Reusable composable metrics #23

Merged: dwerner merged 8 commits into main from metrics on Nov 4, 2025

dwerner (Collaborator) commented on Oct 30, 2025

No description provided.

- Add SegmentWorkerMetrics base with common metrics
- Add SegmentMetrics trait for composition pattern
- Add BridgeMetrics with gRPC-specific metrics
- Add QueryMetrics with sync-specific metrics
- Types only implement base() to get all methods via trait
- Add WorkerStage enum for worker state tracking
- Add gather_metrics() wrapper to avoid leaking prometheus types
- Add unit tests for metrics registration and duplication detection

This provides composable metrics infrastructure without coupling
phaser-bridge to prometheus, allowing bridge implementations to
choose their observability approach.
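
The composition pattern described above can be sketched as follows. This is a minimal illustration using only the standard library, not the actual phaser-metrics code: the field names, the atomic counters standing in for prometheus collectors, and every method other than `base()` are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Common metrics shared by every worker type.
/// Field names are hypothetical; atomics stand in for prometheus collectors.
#[derive(Default)]
pub struct SegmentWorkerMetrics {
    active_workers: AtomicI64,
    items_processed: AtomicU64,
}

/// Implementors supply only `base()`; every other method is a default
/// method that delegates to the shared base.
pub trait SegmentMetrics {
    fn base(&self) -> &SegmentWorkerMetrics;

    fn active_workers_inc(&self) {
        self.base().active_workers.fetch_add(1, Ordering::Relaxed);
    }

    fn active_workers_dec(&self) {
        self.base().active_workers.fetch_sub(1, Ordering::Relaxed);
    }

    fn items_processed_inc(&self, n: u64) {
        self.base().items_processed.fetch_add(n, Ordering::Relaxed);
    }

    fn active_workers(&self) -> i64 {
        self.base().active_workers.load(Ordering::Relaxed)
    }

    fn items_processed(&self) -> u64 {
        self.base().items_processed.load(Ordering::Relaxed)
    }
}

/// Bridge metrics: the shared base plus a gRPC-specific gauge (hypothetical).
#[derive(Default)]
pub struct BridgeMetrics {
    base: SegmentWorkerMetrics,
    grpc_streams_active: AtomicI64,
}

impl SegmentMetrics for BridgeMetrics {
    fn base(&self) -> &SegmentWorkerMetrics {
        &self.base
    }
}

impl BridgeMetrics {
    pub fn grpc_stream_inc(&self) {
        self.grpc_streams_active.fetch_add(1, Ordering::Relaxed);
    }

    pub fn grpc_streams_active(&self) -> i64 {
        self.grpc_streams_active.load(Ordering::Relaxed)
    }
}

/// Query metrics: the shared base plus a sync-specific counter (hypothetical).
#[derive(Default)]
pub struct QueryMetrics {
    base: SegmentWorkerMetrics,
    sync_errors: AtomicU64,
}

impl SegmentMetrics for QueryMetrics {
    fn base(&self) -> &SegmentWorkerMetrics {
        &self.base
    }
}

impl QueryMetrics {
    pub fn sync_error_inc(&self) {
        self.sync_errors.fetch_add(1, Ordering::Relaxed);
    }

    pub fn sync_errors(&self) -> u64 {
        self.sync_errors.load(Ordering::Relaxed)
    }
}
```

With this shape, adding a new metrics type costs one field and a one-line `base()` implementation, and the common methods stay defined in a single place.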

- Replace lazy_static metrics with QueryMetrics instances
- Pass metrics through SyncWorkerConfig instead of globals
- Import SegmentMetrics trait from phaser-metrics
- Update service.rs to create metrics with service_name
- Update worker.rs to use instance methods instead of statics
- Track segment_total_duration for per-segment timing
- Use scopeguard with cloned metrics for phase tracking
- Remove prometheus and lazy_static dependencies

Removes 50+ lines of static metric definitions and unnecessary
dependencies. Consumers now use phaser-metrics' gather_metrics()
instead of importing prometheus directly.
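
A rough sketch of passing metrics through the worker config and cleaning up with a scope guard is shown below. It assumes a simplified `QueryMetrics` with a single gauge and a `SyncWorkerConfig` with only a `metrics` field; `PhaseGuard` and `run_phase` are hypothetical names. The commit uses the `scopeguard` crate, whereas this example substitutes a hand-rolled `Drop` guard so that it needs nothing beyond the standard library.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

/// Simplified stand-in for the real QueryMetrics (one hypothetical gauge).
#[derive(Default)]
pub struct QueryMetrics {
    active_workers: AtomicI64,
}

impl QueryMetrics {
    pub fn active_workers_inc(&self) {
        self.active_workers.fetch_add(1, Ordering::Relaxed);
    }

    pub fn active_workers_dec(&self) {
        self.active_workers.fetch_sub(1, Ordering::Relaxed);
    }

    pub fn active_workers(&self) -> i64 {
        self.active_workers.load(Ordering::Relaxed)
    }
}

/// Metrics travel with the config instead of living in a global static.
/// The real SyncWorkerConfig presumably carries more fields.
pub struct SyncWorkerConfig {
    pub metrics: Arc<QueryMetrics>,
}

/// Hand-rolled stand-in for a scopeguard: holds a cloned handle to the
/// metrics and decrements the gauge when it goes out of scope.
pub struct PhaseGuard {
    metrics: Arc<QueryMetrics>,
}

impl PhaseGuard {
    pub fn enter(metrics: &Arc<QueryMetrics>) -> Self {
        metrics.active_workers_inc();
        Self {
            metrics: Arc::clone(metrics),
        }
    }
}

impl Drop for PhaseGuard {
    fn drop(&mut self) {
        self.metrics.active_workers_dec();
    }
}

/// Hypothetical phase body: the guard decrements on every exit path,
/// including the early error return.
pub fn run_phase(config: &SyncWorkerConfig, fail: bool) -> Result<(), String> {
    let _guard = PhaseGuard::enter(&config.metrics);
    if fail {
        return Err("phase failed".to_string());
    }
    Ok(())
}
```

Because the guard owns its own `Arc` clone, it can be moved into a spawned task or closure without borrowing from the config.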

The example code referenced methods that don't exist in the
current API. Remove it rather than maintaining outdated examples.

- Add phaser-metrics to workspace dependencies
- Replace lazy_static global metrics with Arc-wrapped BridgeMetrics
- Update ErigonFlightBridge to initialize and hold metrics instance
- Thread metrics through SegmentWorker constructors and async streams
- Update all metric call sites to use trait methods instead of statics
- Handle Result type from gather_metrics in main.rs metrics endpoint
- Maintains same metric names and functionality, just cleaner architecture

Part of metrics refactor: consolidating metrics into trait-based composable pattern
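
Threading an Arc-wrapped metrics instance through worker constructors might look roughly like the sketch below. The struct bodies, the `segments_completed` counter, and `process_segments` are assumptions for illustration, and plain threads stand in for the async streams used in the bridge.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Simplified stand-in for BridgeMetrics (one hypothetical counter).
#[derive(Default)]
pub struct BridgeMetrics {
    segments_completed: AtomicU64,
}

impl BridgeMetrics {
    pub fn segment_completed_inc(&self) {
        self.segments_completed.fetch_add(1, Ordering::Relaxed);
    }

    pub fn segments_completed(&self) -> u64 {
        self.segments_completed.load(Ordering::Relaxed)
    }
}

/// Each worker receives its own handle to the shared metrics at construction.
pub struct SegmentWorker {
    metrics: Arc<BridgeMetrics>,
}

impl SegmentWorker {
    pub fn new(metrics: Arc<BridgeMetrics>) -> Self {
        Self { metrics }
    }

    pub fn run(&self) {
        // Real work would happen here; the sketch only records completion.
        self.metrics.segment_completed_inc();
    }
}

/// The bridge creates the metrics once and holds them for its lifetime.
pub struct ErigonFlightBridge {
    metrics: Arc<BridgeMetrics>,
}

impl ErigonFlightBridge {
    pub fn new() -> Self {
        Self {
            metrics: Arc::new(BridgeMetrics::default()),
        }
    }

    /// Spawn one worker per segment, cloning the Arc into each of them.
    pub fn process_segments(&self, count: u64) {
        let handles: Vec<_> = (0..count)
            .map(|_| {
                let worker = SegmentWorker::new(Arc::clone(&self.metrics));
                thread::spawn(move || worker.run())
            })
            .collect();
        for handle in handles {
            handle.join().expect("worker thread panicked");
        }
    }

    pub fn metrics(&self) -> &BridgeMetrics {
        &self.metrics
    }
}
```

Since no global state is involved, two bridge instances in the same process (or in a test) keep independent counters.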

- Replace hyper dependency with axum
- Replace hyper service boilerplate with clean axum Router
- Return proper HTTP 500 errors when metrics gathering fails
- Use standard axum patterns from official examples
- Simplify server binding with format string

Benefits:
- Much simpler and more readable code (~15 lines, down from ~30)
- Better error handling with proper status codes
- Follows axum best practices
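
The error-handling change can be illustrated without the web framework. The sketch below assumes a hypothetical `MetricsError` type and a helper `metrics_response` that maps the result of gathering metrics to a status code and body. In an axum handler, the same mapping would typically be returned as a `(StatusCode, String)` tuple.

```rust
use std::fmt;

/// Hypothetical error type returned when gathering metrics fails.
#[derive(Debug)]
pub struct MetricsError(pub String);

impl fmt::Display for MetricsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Map the gather result to an HTTP status code and response body:
/// 200 with the exposition text on success, 500 with a message on failure.
pub fn metrics_response(gathered: Result<String, MetricsError>) -> (u16, String) {
    match gathered {
        Ok(body) => (200, body),
        Err(err) => (500, format!("failed to gather metrics: {err}")),
    }
}
```

Returning a real 500 lets a scraper distinguish a failed gather from a service that simply has no metrics yet.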

- Replace hyper service boilerplate with axum Router
- Return proper HTTP 500 errors when metrics gathering fails
- Make sync::metrics module public for metrics server
- Re-export gather_metrics function
- Add metrics_port to PhaserConfig (default: 9092)

Benefits:
- Consistent with erigon-bridge metrics server
- Simpler and more maintainable code
- Proper error handling with status codes
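
As a small illustration of the new config field, the sketch below gives `metrics_port` its default of 9092 and derives a bind address from it with a format string. The real `PhaserConfig` has other fields that are omitted here, and both the `0.0.0.0` host and the `metrics_bind_addr` helper are assumptions.

```rust
use std::net::SocketAddr;

/// Trimmed-down stand-in for PhaserConfig; only the new field is shown.
pub struct PhaserConfig {
    pub metrics_port: u16,
}

impl Default for PhaserConfig {
    fn default() -> Self {
        Self { metrics_port: 9092 }
    }
}

/// Build the metrics server bind address from the configured port.
/// Binding to all interfaces is an assumption of this sketch.
pub fn metrics_bind_addr(config: &PhaserConfig) -> SocketAddr {
    format!("0.0.0.0:{}", config.metrics_port)
        .parse()
        .expect("host and port form a valid socket address")
}
```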

Tracks worker lifecycle across all three phases:

Active workers by phase:
- active_workers_inc/dec for blocks, transactions, logs phases
- Proper increment on phase start, decrement on phase end
- Cleanup on error paths to prevent metric leaks

Phase durations:
- segment_duration for each phase (blocks, transactions, logs)
- Measured from phase start to completion

Stream lifecycle:
- grpc_stream_inc/dec for blocks, transactions, receipts streams
- Proper cleanup on stream errors

Items processed:
- Track blocks, transactions, receipts processed per segment

This provides complete observability of:
- Current worker distribution across phases
- Phase processing times for performance analysis
- Active gRPC stream counts
- Data throughput per segment
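
The per-phase bookkeeping listed above could be sketched as follows. `Phase`, `PhaseMetrics`, and `track_phase` are hypothetical names, and mutex-guarded maps stand in for labeled prometheus gauges and histograms. The point of the example is that the decrement and the duration recording run on both the success and the error path.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// The three worker phases named in this commit.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Phase {
    Blocks,
    Transactions,
    Logs,
}

/// Hypothetical per-phase metrics keyed by phase.
#[derive(Default)]
pub struct PhaseMetrics {
    active: Mutex<HashMap<Phase, i64>>,
    durations: Mutex<HashMap<Phase, Vec<Duration>>>,
    items: Mutex<HashMap<Phase, u64>>,
}

impl PhaseMetrics {
    fn add_active(&self, phase: Phase, delta: i64) {
        *self.active.lock().unwrap().entry(phase).or_insert(0) += delta;
    }

    pub fn active_workers(&self, phase: Phase) -> i64 {
        self.active.lock().unwrap().get(&phase).copied().unwrap_or(0)
    }

    pub fn items_processed(&self, phase: Phase) -> u64 {
        self.items.lock().unwrap().get(&phase).copied().unwrap_or(0)
    }

    pub fn recorded_durations(&self, phase: Phase) -> usize {
        self.durations.lock().unwrap().get(&phase).map_or(0, Vec::len)
    }

    /// Run one phase. `work` returns the number of items it processed.
    pub fn track_phase<F>(&self, phase: Phase, work: F) -> Result<u64, String>
    where
        F: FnOnce() -> Result<u64, String>,
    {
        self.add_active(phase, 1);
        let started = Instant::now();
        let result = work();
        // Cleanup runs whether `work` succeeded or failed, so the gauge
        // does not leak on error paths.
        self.add_active(phase, -1);
        self.durations
            .lock()
            .unwrap()
            .entry(phase)
            .or_default()
            .push(started.elapsed());
        // Only successful phases contribute to the items-processed count.
        if let Ok(count) = &result {
            *self.items.lock().unwrap().entry(phase).or_insert(0) += *count;
        }
        result
    }
}
```

This explicit version does not cover panics inside `work`; a drop-based guard would be needed for that case.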

- check-quality.sh is a local development script
- .claude/ contains local Claude configuration

dwerner merged commit e796213 into main on Nov 4, 2025
5 checks passed