TEZ-4371 Implement ClientServiceDelegate.getJobCounters#464

Open
lewismc wants to merge 1 commit into apache:master from lewismc:TEZ-4371

Conversation

@lewismc
Member

lewismc commented Mar 13, 2026

After a number of years, I decided to revisit TEZ-4371.
When the Tez client is used as the MapReduce client (via YARNRunner), ClientProtocol.getJobCounters(JobID) was returning empty Counters. This change implements counter retrieval so Tez DAG Counters are exposed as MapReduce Counters, unblocking NUTCH-2839 (e.g. Nutch Injector job counters on Tez).
I added some unit tests to this PR. My local testing continues against the Nutch master branch (1.23-SNAPSHOT).
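
To illustrate the idea of exposing one counter set as another, here is a minimal sketch. It does not use the actual Tez or MapReduce counter classes and is not the code in this patch: counters are modeled as plain nested maps (group name to counter name to value), and the class name, method name, and the sample group and counter names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the conversion described above.
// Counters are modeled as: group name -> counter name -> value.
// The real patch works with Tez and MapReduce counter types, which are not used here.
final class CounterCopySketch {

  /** Copies every counter of every group into a fresh, independent structure. */
  static Map<String, Map<String, Long>> copyCounters(Map<String, Map<String, Long>> dagCounters) {
    Map<String, Map<String, Long>> mrCounters = new LinkedHashMap<>();
    if (dagCounters == null) {
      // Nothing reported yet: return an empty result rather than null.
      return mrCounters;
    }
    for (Map.Entry<String, Map<String, Long>> group : dagCounters.entrySet()) {
      Map<String, Long> target =
          mrCounters.computeIfAbsent(group.getKey(), k -> new LinkedHashMap<>());
      for (Map.Entry<String, Long> counter : group.getValue().entrySet()) {
        // Preserve group and counter names; add the value to any existing entry.
        target.merge(counter.getKey(), counter.getValue(), Long::sum);
      }
    }
    return mrCounters;
  }

  public static void main(String[] args) {
    // Sample names below are made up for illustration only.
    Map<String, Long> group = new LinkedHashMap<>();
    group.put("urls_injected", 42L);
    Map<String, Map<String, Long>> dag = new LinkedHashMap<>();
    dag.put("injector", group);
    System.out.println(copyCounters(dag));
  }
}
```

The point of the sketch is only that group and counter names survive the copy and that a missing source yields an empty result instead of a failure.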

@tez-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 5m 5s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 8m 40s | | master passed |
| +1 💚 | compile | 0m 23s | | master passed |
| +1 💚 | checkstyle | 1m 6s | | master passed |
| +1 💚 | javadoc | 0m 30s | | master passed |
| +0 🆗 | spotbugs | 0m 58s | | tez-mapreduce in master has 124 extant spotbugs warnings. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 13s | | the patch passed |
| +1 💚 | codespell | 0m 25s | | No new issues. |
| +1 💚 | compile | 0m 13s | | the patch passed |
| +1 💚 | javac | 0m 13s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 8s | | the patch passed |
| +1 💚 | javadoc | 0m 11s | | the patch passed |
| +1 💚 | spotbugs | 0m 34s | | the patch passed |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 1m 3s | /patch-unit-tez-mapreduce.txt | tez-mapreduce in the patch passed. |
| +1 💚 | asflicense | 0m 12s | | The patch does not generate ASF License warnings. |
| | | 20m 31s | | |

| Reason | Tests |
|---|---|
| Failed junit tests | tez.mapreduce.client.TestClientServiceDelegate |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-464/1/artifact/out/Dockerfile |
| GITHUB PR | #464 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs checkstyle codespell detsecrets compile |
| uname | Linux e529f4865a5b 5.15.0-141-generic #151-Ubuntu SMP Sun May 18 21:35:19 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-home/workspace/tez-multibranch_PR-464/src/.yetus/personality.sh |
| git revision | master / 093177b |
| Default Java | Ubuntu-21.0.10+7-Ubuntu-124.04 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-464/1/testReport/ |
| Max. process+thread count | 239 (vs. ulimit of 5500) |
| modules | C: tez-mapreduce U: tez-mapreduce |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-464/1/console |
| versions | git=2.43.0 maven=3.8.7 spotbugs=4.9.3 codespell=2.4.1 |
| Powered by | Apache Yetus 0.15.1 https://yetus.apache.org |

This message was automatically generated.

@abstractdog
Contributor

yay! happy to see you working on this, let me find some time to review this next week

tezClient = new MRTezClient("MapReduce", dagAMConf, false, jobLocalResources, ts);
tezClient.start();
dagClient = new MRDAGClient(tezClient.submitDAGApplication(appId, dag));
dagClientMap.put(jobId, dagClient);
Contributor

I was worried that items are never evicted from dagClientMap, but the same is already true of ClientCache:

//TODO: evict from the cache on some threshold

Maybe we can take care of this in a follow-up ticket.
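
For reference, one common way to bound such a map is a size-limited, least-recently-used cache built on `java.util.LinkedHashMap`. The sketch below is only an illustration of that general technique, not a proposal for the actual fix: the class name `BoundedClientMap` and the threshold are hypothetical, the map is not thread-safe on its own, and a real change would probably also need to close evicted clients.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a map that evicts its least recently used entry
// once it grows past a fixed threshold. Not thread-safe by itself; wrap it
// with Collections.synchronizedMap(...) if it is shared between threads.
final class BoundedClientMap<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;

  private final int maxEntries;

  BoundedClientMap(int maxEntries) {
    // accessOrder = true: iteration order runs from least to most recently accessed.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called by put/putAll after an insert; returning true drops the eldest entry.
    return size() > maxEntries;
  }

  public static void main(String[] args) {
    Map<String, String> clients = new BoundedClientMap<>(2);
    clients.put("job_1", "client_1");
    clients.put("job_2", "client_2");
    clients.get("job_1");              // job_1 becomes the most recently used
    clients.put("job_3", "client_3");  // evicts job_2, the least recently used
    System.out.println(clients.keySet());
  }
}
```

A design note on this approach: eviction by size alone silently discards the value, so any resource held by the evicted entry has to be released inside `removeEldestEntry` or by the caller.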

@abstractdog
Contributor

@lewismc: this patch looks good and neat! Can you share any evidence of manual testing? I ask because back in 2020, when NUTCH-2839 was created, I discovered the empty-counters problem with an MR job submitted by YARNRunner, and I wish I could easily repeat that experiment now (of course, I didn't take notes back then :D). If you have already experimented with this, that would help a lot with approving and merging. Thanks in advance!

@lewismc
Member Author

lewismc commented Mar 18, 2026

Hi @abstractdog, thanks for taking a look. Re: testing, yes, that is ongoing and I will follow up with an end-to-end example that exercises individual Nutch MapReduce jobs running as dual-vertex Tez DAGs.

My test environment (I'll link the docker compose project soon) uses the official Apache Hadoop 3.4.3 Docker image, which is both interesting and challenging for a few reasons.

  1. Hadoop 3.4.3 cluster with a Java 11 runtime
  2. Tez 1.0.0-SNAPSHOT as a Java 21 compile-time dependency (with the Java 11 runtime flag); this literally required reintroducing some Java 11 logic into the codebase. I will go into more detail in a follow-up patch
  3. Nutch 1.23-SNAPSHOT, which was also compiled with Java 21 with the Java 11 runtime flag.

This is somewhat messy but it will be cleared up soon when Hadoop 3.5.0 is released and we can upgrade in both Tez and Nutch and drop Java 11 support.
