TEZ-4371 Implement ClientServiceDelegate.getJobCounters #464
lewismc wants to merge 1 commit into apache:master
💔 -1 overall
This message was automatically generated.
yay! happy to see you working on this, let me find some time to review this next week
tezClient = new MRTezClient("MapReduce", dagAMConf, false, jobLocalResources, ts);
tezClient.start();
dagClient = new MRDAGClient(tezClient.submitDAGApplication(appId, dag));
dagClientMap.put(jobId, dagClient);
I was worried that items are never evicted from dagClientMap, but something similar already happens in ClientCache too, so maybe we can take care of this in a follow-up ticket.
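For that follow-up, one possible direction is a size-bounded map that drops its least recently used entry. The sketch below is only an illustration using the JDK's LinkedHashMap: the class name BoundedClientMap, the create helper, and the String values standing in for DAG clients are all hypothetical and are not part of this patch or of Tez.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: a bounded, access-ordered map.
public class BoundedClientMap {

    /**
     * Returns a synchronized map that removes its least recently used
     * entry as soon as it holds more than maxEntries entries.
     */
    public static <K, V> Map<K, V> create(final int maxEntries) {
        // accessOrder = true makes get() move an entry to the "newest" end.
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        });
    }

    public static void main(String[] args) {
        // Strings stand in for the real client objects (assumption).
        Map<String, String> dagClientMap = create(2);
        dagClientMap.put("job_1", "client_1");
        dagClientMap.put("job_2", "client_2");
        dagClientMap.get("job_1");             // touch job_1, so job_2 is now eldest
        dagClientMap.put("job_3", "client_3"); // exceeds the bound, job_2 is dropped
        System.out.println(dagClientMap.keySet());
    }
}
```

In a real change the evicted client would probably also need to be closed, for example inside removeEldestEntry, which this sketch leaves out.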
@lewismc: this patch looks good and neat! Can you show any proof of manually testing this? I ask because back in 2020, when NUTCH-2839 was created, I somehow discovered the empty-counters problem with an MR job submitted by YARNRunner, and I wish I could easily repeat that experiment now (of course, I didn't take notes back then :D). If you have already experimented with this, that would help a lot with approving and merging it. Thanks in advance!
Hi @abstractdog, thanks for taking a look. Re: testing, yes, that is ongoing, and I will follow up with an end-to-end example that exercises individual Nutch MapReduce jobs running as dual-vertex Tez DAGs. My test environment (I'll link the Docker Compose project soon) uses the official Apache Hadoop 3.4.3 Docker image, which is both interesting and challenging for a few reasons.
This is somewhat messy, but it will be cleared up soon when Hadoop 3.5.0 is released and we can upgrade in both Tez and Nutch and drop Java 11 support.
After a number of years, I decided to revisit TEZ-4371.
When the Tez client is used as the MapReduce client (via YARNRunner), ClientProtocol.getJobCounters(JobID) was returning empty Counters. This change implements counter retrieval so that Tez DAG counters are exposed as MapReduce Counters, unblocking NUTCH-2839 (e.g. Nutch Injector job counters on Tez).
I added some unit tests to this PR. My local testing continues with the Nutch master branch (1.23-SNAPSHOT).
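To make the idea of exposing DAG counters as MapReduce counters concrete, here is a rough, dependency-free sketch of the copy step. It is not the code in this PR: nested maps stand in for the Tez and MapReduce counter classes, and the class name CounterCopySketch, the method copyCounters, and the group and counter names are hypothetical placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: group name -> (counter name -> value).
public class CounterCopySketch {

    /**
     * Copies every (group, counter, value) triple from source into target,
     * creating groups on demand and overwriting counters that already exist.
     */
    public static void copyCounters(Map<String, Map<String, Long>> source,
                                    Map<String, Map<String, Long>> target) {
        for (Map.Entry<String, Map<String, Long>> group : source.entrySet()) {
            Map<String, Long> targetGroup =
                target.computeIfAbsent(group.getKey(), k -> new LinkedHashMap<>());
            for (Map.Entry<String, Long> counter : group.getValue().entrySet()) {
                targetGroup.put(counter.getKey(), counter.getValue());
            }
        }
    }

    public static void main(String[] args) {
        // Illustrative DAG-side counters; names and values are made up.
        Map<String, Map<String, Long>> dagCounters = new LinkedHashMap<>();
        Map<String, Long> injector = new LinkedHashMap<>();
        injector.put("urls_injected", 42L);
        dagCounters.put("injector", injector);

        // The MapReduce-side view starts empty, as in the bug being fixed.
        Map<String, Map<String, Long>> mrCounters = new LinkedHashMap<>();
        copyCounters(dagCounters, mrCounters);
        System.out.println(mrCounters);
    }
}
```

With the real classes, the same loop would walk the DAG's counter groups and set the matching counter on the MapReduce side; the exact calls depend on the Tez and Hadoop versions in use.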