HBASE-30134: Improve CacheAwareLoadBalancer to consider low cache ratio when calculating imbalance#8197
Open
wchevreuil wants to merge 4 commits intoapache:masterfrom
Open
HBASE-30134: Improve CacheAwareLoadBalancer to consider low cache ratio when calculating imbalance#8197wchevreuil wants to merge 4 commits intoapache:masterfrom
wchevreuil wants to merge 4 commits intoapache:masterfrom
Conversation
…io when calculating imbalance Change-Id: I0880f137e2e9efe950de021a9e832e240378f9c7
Change-Id: Ie87a60c1a822682786ca6b2bc9ce4b0a86af54ee
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Improves cache-aware balancing by incorporating reported block cache free space and “cold data” region metrics into CacheAwareLoadBalancer’s imbalance calculation, with accompanying protocol/client/server wiring and tests.
Changes:
- Extend ServerLoad/ServerMetrics to report block cache free bytes and per-region cold data size.
- Update CacheAwareLoadBalancer cost computation to consider low cache ratios alongside free cache capacity and cold-data adjustments.
- Add/adjust unit tests and constructors to pass the new metrics through BalancerClusterState.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestCacheAwareLoadBalancerCostFunctions.java | Adds tests and helpers for needsBalance behavior with low cache ratio + free cache space. |
| hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRegionsRecoveryChore.java | Updates ServerMetrics test stub to implement new cold-data API. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java | Populates ServerLoad with cacheFreeSize and regionColdData. |
| hbase-protocol-shaded/src/main/protobuf/server/ClusterStatus.proto | Adds new ServerLoad fields for cacheFreeSize and regionColdData. |
| hbase-client/src/main/java/org/apache/hadoop/hbase/ServerMetricsBuilder.java | Parses/serializes new metrics into ServerMetrics and ServerLoad. |
| hbase-client/src/main/java/org/apache/hadoop/hbase/ServerMetrics.java | Adds new ServerMetrics APIs for cacheFreeSize and regionColdDataSize. |
| hbase-balancer/src/test/java/org/apache/hadoop/hbase/master/balancer/TestStoreFileTableSkewCostFunction.java | Updates BalancerClusterState ctor usage for new parameter. |
| hbase-balancer/src/test/java/org/apache/hadoop/hbase/master/balancer/CandidateGeneratorTestUtil.java | Updates BalancerClusterState ctor usage for new parameter. |
| hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java | Introduces overridable createState() factory for BalancerClusterState. |
| hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/CacheAwareLoadBalancer.java | Uses cacheFreeSize + cold-data metrics in cache cost heuristics and state creation. |
| hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java | Updates BalancerClusterState ctor usage for new parameter. |
| hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerConditionals.java | Updates BalancerClusterState ctor usage for new parameter. |
| hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerClusterState.java | Stores per-server cache free bytes and exposes observed region cache ratio helper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+354
to
+357
| private static List<ServerName> | ||
| getServersInInsertionOrder(Map<ServerName, List<RegionInfo>> cluster) { | ||
| return new ArrayList<>(cluster.keySet()); | ||
| } |
Contributor
Author
There was a problem hiding this comment.
Well, we have entire control over the map implementation instance passed from our tests, which is a LinkedHashMap, so no need to worry about this here.
Comment on lines
+337
to
+339
| * Unallocated block cache capacity on this RegionServer, in bytes. | ||
| * Used by the master for cache-aware load balancing (optional). | ||
| */ |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Change-Id: I2d35abc09cae1c0e9a7d758b90772d4a0c2abb93
Comment on lines
+597
to
+600
| * If this region is cold in metrics and at least one RS (including its current host) reports | ||
| * enough free block cache to hold it, return an optimistic weighted cache score ({@link | ||
| * #potentialCacheRatioAfterMove} * region MB) so placement is not considered optimal solely | ||
| * from low ratios when capacity exists somewhere in the cluster. |
Comment on lines
+337
to
+339
| * Unallocated block cache capacity on this RegionServer, in bytes. | ||
| * Used by the master for cache-aware load balancing (optional). | ||
| */ |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.