refactor(libdd-data-pipeline): health metrics #1433
base: main
Benchmarks
Comparison
Benchmark execution time: 2026-01-22 13:22:23
Comparing candidate commit 4f55a0d in PR branch.
Found 0 performance improvements and 2 performance regressions! Performance is the same for 55 metrics, 2 unstable metrics.
scenario:normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて
Candidate
Candidate benchmark details: Group 1 through Group 19
Baseline
Omitted due to size.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1433      +/-   ##
==========================================
+ Coverage   71.11%   71.18%   +0.07%
==========================================
  Files         417      418       +1
  Lines       67037    67239     +202
==========================================
+ Hits        47671    47863     +192
- Misses      19366    19376      +10
Artifact Size Benchmark Report
aarch64-alpine-linux-musl
aarch64-apple-darwin
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-apple-darwin
x86_64-unknown-linux-gnu
    .as_ref()
    .and_then(|v| Tag::new("type", v).ok());
let custom_tags = type_tag.as_ref().map(|t| vec![t]);
self.emit(metric, custom_tags);
Instead of emitting each metric individually, can we batch them? You'd need to refactor emit to support this, but the DogStatsD client should support it just fine.
If I'm not mistaken, we'd need to go through DogStatsDActionOwned to have a single iterator type to collect (required by the generic V of DogStatsDAction). So we either do more copies/allocations, or more sends. Given that we are talking about 3 to 6 metrics per SendResult, I tend to think the simplicity of this code is preferable to what amounts to premature optimization. But I'm fine making the change if you think it's preferable o/
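To make the trade-off in this thread concrete, here is a minimal sketch. It does not use the real DogStatsD client or `HealthMetric`: `Metric`, `RecordingSink`, `emit_each`, and `emit_batched` are hypothetical stand-ins, and the sink only counts how many send calls it receives.

```rust
/// Hypothetical stand-in for a health metric; not the real `HealthMetric` type.
#[derive(Debug, Clone, PartialEq)]
pub enum Metric {
    Count(&'static str, i64),
    Distribution(&'static str, f64),
}

/// Hypothetical sink that records how many send calls were made.
/// It stands in for a statsd-like client and is an assumption of this sketch.
#[derive(Default)]
pub struct RecordingSink {
    pub sends: usize,
    pub received: Vec<Metric>,
}

impl RecordingSink {
    /// Accepts any iterator of owned metrics in a single call.
    pub fn send<I: IntoIterator<Item = Metric>>(&mut self, batch: I) {
        self.sends += 1;
        self.received.extend(batch);
    }
}

/// One send per metric: no intermediate buffer, but N send calls.
pub fn emit_each(sink: &mut RecordingSink, metrics: &[Metric]) {
    for m in metrics {
        sink.send(std::iter::once(m.clone()));
    }
}

/// One send for the whole batch: a single call, at the cost of first
/// copying the metrics into an owned buffer.
pub fn emit_batched(sink: &mut RecordingSink, metrics: &[Metric]) {
    let owned: Vec<Metric> = metrics.to_vec();
    sink.send(owned);
}
```

With 3 to 6 metrics per result, the difference is a handful of send calls versus one extra small allocation, which is the balance being weighed above.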
/// - The second element is an optional tag value for error classification
pub(crate) fn collect_metrics(&self) -> Vec<(HealthMetric, Option<String>)> {
    // Max capacity: 3 base + 1 outcome + 2 dropped
    let mut metrics = Vec::with_capacity(6);
Do we need a Vec? Can we use the visitor pattern and only do a vec allocation at the very end in emit?
I agree it would be ever so slightly better, but at the cost of a lot of complexity. My take on this kind of thing is, more often than not, to leave it alone unless we know it is significant. But as with the other comment, if you feel strongly about it, I'm fine with implementing it o/
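For reference, the visitor-style alternative suggested here could look roughly like the sketch below. `SendOutcome`, its fields, the `Metric` enum, and the metric names are all hypothetical simplifications, not the PR's actual `SendResult` or `HealthMetric`. The visitor takes a closure, and the `Vec`-returning method is layered on top of it.

```rust
/// Hypothetical, simplified result type; the real `SendResult` differs.
pub struct SendOutcome {
    pub requests: i64,
    pub bytes_sent: i64,
    pub chunks_dropped: i64,
    pub error_type: Option<&'static str>,
}

/// Hypothetical stand-in for `HealthMetric`.
#[derive(Debug, PartialEq)]
pub enum Metric {
    Count(&'static str, i64),
}

impl SendOutcome {
    /// Visitor form: calls `visit` once per metric and allocates nothing itself.
    pub fn visit_metrics<F>(&self, mut visit: F)
    where
        F: FnMut(Metric, Option<&'static str>),
    {
        visit(Metric::Count("requests", self.requests), None);
        visit(Metric::Count("bytes_sent", self.bytes_sent), None);
        if let Some(kind) = self.error_type {
            // The failure metric carries the error classification as a tag value.
            visit(Metric::Count("failed", 1), Some(kind));
        }
        if self.chunks_dropped > 0 {
            visit(Metric::Count("chunks_dropped", self.chunks_dropped), None);
        }
    }

    /// Vec form, built on the visitor for callers that want a collection.
    pub fn collect_metrics(&self) -> Vec<(Metric, Option<&'static str>)> {
        let mut out = Vec::with_capacity(4);
        self.visit_metrics(|metric, tag| out.push((metric, tag)));
        out
    }
}
```

A caller that only wants to emit can pass a closure that forwards each metric directly, so the only allocation happens wherever the caller chooses to buffer.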
// Emit failed metric with type tag
metrics.push((
    HealthMetric::Count(TRANSPORT_TRACES_FAILED, 1),
    Some(error_type.as_tag_value().into_owned()),
does this have to be into_owned() at this point?
Done o/
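As background for the `into_owned()` question, here is a small sketch of why deferring it can save an allocation. It assumes `as_tag_value` returns a `Cow<'static, str>`, which the `into_owned()` call suggests but the snippet does not confirm. `ErrorType`, its variants, and `is_borrowed` are hypothetical.

```rust
use std::borrow::Cow;

/// Hypothetical error classification; the variants are illustrative only.
pub enum ErrorType {
    Timeout,
    StatusCode(u16),
}

impl ErrorType {
    /// Borrowed for fixed names; owned only when the value must be formatted.
    pub fn as_tag_value(&self) -> Cow<'static, str> {
        match self {
            ErrorType::Timeout => Cow::Borrowed("timeout"),
            ErrorType::StatusCode(code) => Cow::Owned(code.to_string()),
        }
    }
}

/// Returns true when the `Cow` still points at borrowed data,
/// meaning no heap allocation has happened for it yet.
pub fn is_borrowed(value: &Cow<'_, str>) -> bool {
    matches!(value, Cow::Borrowed(_))
}
```

Storing the `Cow` itself in the collected metrics and converting only when a `String` is really required keeps the static cases allocation-free, while calling `into_owned()` up front allocates for every variant.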
    .unwrap_or_else(|_| tag!("type", "unknown"));
self.emit_metric(
    HealthMetric::Count(health_metrics::TRANSPORT_TRACES_FAILED, 1),
    Some(vec![&resp_tag, &type_tag]),
Why were we sending the status code in two different tags, and why is it ok that we're only sending it in one tag now?
I don't know. I didn't catch that when refactoring; I just saw there was a code path for the status code and handled it in the refactored types. I can't immediately tell whether it's relevant, so I'll look for some kind of spec.
What does this PR do?
Refactor health metrics emission via common API.
Motivation
The logic was duplicated and would have had to be duplicated again if another use case arose. Now the core logic can be integrated elsewhere more easily.
Additional Notes
I don't have the full context of this part of the code, so I may have blind spots or problems in the implementation.
How to test the change?
I (mostly AI) added a bunch of tests (maybe even too many?) to validate the changes.