-
Notifications
You must be signed in to change notification settings - Fork 245
feat(tracing): Add Store, P2P and Config tracing #2972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add OpenTelemetry tracing for all RPC server methods:
- ExecutorService: InitChain, GetTxs, ExecuteTxs, SetFinal
- StoreService: GetBlock, GetState, GetMetadata, GetGenesisDaHeight, GetP2PStoreInfo
- P2PService: GetPeerInfo, GetNetInfo
- ConfigService: GetNamespace, GetSignerInfo
All implementations follow decorator pattern with comprehensive test coverage.
Tracer names: ev-node/{service}-service
Summary of ChangesHello @chatton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the observability of the system by integrating OpenTelemetry for distributed tracing. It introduces tracing capabilities for the Store, P2P, and Config RPC services, allowing for detailed insights into their execution flow and performance. The tracing can be toggled via configuration, providing flexibility for different environments. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Claude finished @chatton's task in 2m 27s —— View job Code Review CompleteI've performed a comprehensive code review of this PR, which adds OpenTelemetry tracing to the Store, P2P, and Config RPC services. Overall, this is a high-quality implementation with good test coverage and a clean decorator pattern. However, there are several issues that need attention. Critical Issues1. Missing Per-Store Attributes in
|
|
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces OpenTelemetry tracing for the Store, P2P, and Config services. The implementation uses a decorator pattern to wrap the existing service handlers, which is a good approach. The changes are well-tested, with a new test file covering the tracing logic. I've identified a few areas for improvement: a bug in attribute key generation, an opportunity to simplify the code in one of the traced methods, a suggestion to reduce code duplication across the tracing decorators, and a gap in the test coverage. Overall, this is a great addition for observability.
pkg/rpc/server/tracing.go
Outdated
| ) | ||
| for i, store := range res.Msg.Stores { | ||
| span.SetAttributes( | ||
| attribute.Int64("store_"+string(rune(i))+"_height", int64(store.Height)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a bug in how the attribute key is constructed. string(rune(i)) will convert the integer index i into its corresponding Unicode character, not its string representation. For example, for i=0, it will produce a null character (\x00), which is likely not what you intend and may cause issues with tracing backends.
You should use strconv.Itoa(i) to convert the integer to a string. This will require adding an import for the strconv package.
attribute.Int64("store_"+strconv.Itoa(i)+"_height", int64(store.Height)),
pkg/rpc/server/tracing.go
Outdated
| var height uint64 | ||
| switch identifier := req.Msg.Identifier.(type) { | ||
| case *pb.GetBlockRequest_Height: | ||
| height = identifier.Height | ||
| case *pb.GetBlockRequest_Hash: | ||
| // for hash-based queries, we'll add the hash as an attribute | ||
| } | ||
|
|
||
| ctx, span := t.tracer.Start(ctx, "StoreService.GetBlock", | ||
| trace.WithAttributes( | ||
| attribute.Int64("height", int64(height)), | ||
| ), | ||
| ) | ||
| defer span.End() | ||
|
|
||
| // add hash attribute if hash-based query | ||
| if hashIdentifier, ok := req.Msg.Identifier.(*pb.GetBlockRequest_Hash); ok { | ||
| span.SetAttributes(attribute.String("hash", hex.EncodeToString(hashIdentifier.Hash))) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for setting span attributes based on the request identifier can be simplified. Currently, you're using a switch to get the height, and then a separate if to get the hash, which involves checking the type of req.Msg.Identifier twice. This can be done more cleanly in a single switch statement that builds the attributes before starting the span. This will make the code more readable and efficient.
var attrs []attribute.KeyValue
switch identifier := req.Msg.Identifier.(type) {
case *pb.GetBlockRequest_Height:
attrs = append(attrs, attribute.Int64("height", int64(identifier.Height)))
case *pb.GetBlockRequest_Hash:
if identifier.Hash != nil {
attrs = append(attrs, attribute.String("hash", hex.EncodeToString(identifier.Hash)))
}
}
ctx, span := t.tracer.Start(ctx, "StoreService.GetBlock", trace.WithAttributes(attrs...))
defer span.End()| res, err := t.inner.GetState(ctx, req) | ||
| if err != nil { | ||
| span.RecordError(err) | ||
| span.SetStatus(codes.Error, err.Error()) | ||
| return nil, err | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block of code for handling errors from the inner service call and updating the span is repeated in almost every traced method in this file. To improve maintainability and reduce boilerplate, consider extracting this logic into a generic helper function.
For example, you could create a function that takes the context, request, the inner function call, and returns the response and error, handling the span creation and error recording internally. Here's a conceptual example:
func traceUnary[Req, Res any](
tracer trace.Tracer,
ctx context.Context,
spanName string,
req *connect.Request[Req],
call func(context.Context, *connect.Request[Req]) (*connect.Response[Res], error),
// ... other params for attributes
) (*connect.Response[Res], error) {
ctx, span := tracer.Start(ctx, spanName)
defer span.End()
res, err := call(ctx, req)
if err != nil {
span.RecordError(err)
span.SetStatus(codes.Error, err.Error())
return nil, err
}
// ... set success attributes
return res, nil
}Applying this pattern would make the tracing decorators much more concise.
| attrs := span.Attributes() | ||
| requireAttribute(t, attrs, "store_count", 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is missing assertions for the per-store attributes (store_0_height, store_1_height). The current test only checks for store_count. Adding assertions for individual store heights would have caught the attribute key generation bug in GetP2PStoreInfo and would make the test more robust.
attrs := span.Attributes()
requireAttribute(t, attrs, "store_count", 2)
requireAttribute(t, attrs, "store_0_height", int64(100))
requireAttribute(t, attrs, "store_1_height", int64(99))
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2972 +/- ##
==========================================
+ Coverage 57.82% 58.19% +0.37%
==========================================
Files 98 99 +1
Lines 9394 9543 +149
==========================================
+ Hits 5432 5554 +122
- Misses 3358 3379 +21
- Partials 604 610 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to delete
Overview
Adding additional tracing within
NewServiceHandler