Tag recognizer pagination #25537

edg956 · 2026-01-26T18:52:26Z

Describe your changes:

This PR includes an endpoint that returns a tag's recognizers with pagination

Type of change:

Checklist:

I have read the CONTRIBUTING document.
My PR title is Fixes <issue-number>: <short explanation>
I have commented on my code, particularly in hard-to-understand areas.
For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

I have added tests around the new logic.
For connector/ingestion changes: I updated the documentation.

Summary by Gitar

New API endpoints:
- GET /v1/tags/{id}/recognizers and GET /v1/tags/name/{fqn}/recognizers enable paginated retrieval of tag recognizers
Cursor-based pagination:
- TagRepository.getRecognizersOfTag() implements bidirectional cursor navigation with before/after parameters
New exception handling:
- BadCursorException validates and handles malformed pagination cursors returning HTTP 400

_{This will update automatically on new commits.}

openmetadata-service/src/main/java/org/openmetadata/service/resources/tags/TagResource.java

edg956 · 2026-01-27T07:12:23Z

@gitar-bot

🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561

The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

The beforeId that was parsed and validated is never used

listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.

Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:
return listRecognizersAfter(recognizers.reversed(), null, limit);
Then handle the cursor-based starting position differently in the calling code.

This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.

Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548

In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.

This means:

The cursor parsing at line 548 is effectively dead code

beforeId is extracted but never referenced again

The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing

gitar-bot · 2026-01-27T07:16:46Z

@gitar-bot
🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561

The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

The beforeId that was parsed and validated is never used

listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.

Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:
return listRecognizersAfter(recognizers.reversed(), null, limit);
Then handle the cursor-based starting position differently in the calling code.
This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.

Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548

In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.

This means:

The cursor parsing at line 548 is effectively dead code

beforeId is extracted but never referenced again

The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing

Thank you for the clarification! I understand the design intent now - validating the cursor early in listRecognizersBefore to provide clearer error messages about which parameter (before vs after) caused the issue.

However, the CI test failures (TagResourceIT.test_recognizerPaginationEndpoint) indicate the backward pagination is not working correctly - tests are expecting Recognizer_19 but getting Recognizer_29. This suggests the cursor might not be functioning as intended.

Regarding the refactor: Yes, I'd recommend consolidating the cursor validation logic to avoid the duplication and make the flow clearer. You could:

Extract cursor parsing/validation to a shared method
Have both methods call it with appropriate error context
This would maintain the clear error messages while reducing duplication

But first, it would be good to investigate why the test is failing - the pagination logic may need adjustment regardless of the validation approach.

edg956 · 2026-01-27T07:20:05Z

@gitar-bot
🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561
The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

The beforeId that was parsed and validated is never used

listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.
Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:
return listRecognizersAfter(recognizers.reversed(), null, limit);
Then handle the cursor-based starting position differently in the calling code.
This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.
Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548
In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.
This means:

The cursor parsing at line 548 is effectively dead code

beforeId is extracted but never referenced again

The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing
Thank you for the clarification! I understand the design intent now - validating the cursor early in listRecognizersBefore to provide clearer error messages about which parameter (before vs after) caused the issue.

However, the CI test failures (TagResourceIT.test_recognizerPaginationEndpoint) indicate the backward pagination is not working correctly - tests are expecting Recognizer_19 but getting Recognizer_29. This suggests the cursor might not be functioning as intended.

Regarding the refactor: Yes, I'd recommend consolidating the cursor validation logic to avoid the duplication and make the flow clearer. You could:

Extract cursor parsing/validation to a shared method

Have both methods call it with appropriate error context

This would maintain the clear error messages while reducing duplication

But first, it would be good to investigate why the test is failing - the pagination logic may need adjustment regardless of the validation approach.

That test doesn't fail anymore

gitar-bot · 2026-01-27T07:25:12Z

@gitar-bot
🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561
The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

The beforeId that was parsed and validated is never used

listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.
Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:
return listRecognizersAfter(recognizers.reversed(), null, limit);
Then handle the cursor-based starting position differently in the calling code.
This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.
Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548
In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.
This means:

The cursor parsing at line 548 is effectively dead code

beforeId is extracted but never referenced again

The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing
Thank you for the clarification! I understand the design intent now - validating the cursor early in listRecognizersBefore to provide clearer error messages about which parameter (before vs after) caused the issue.

However, the CI test failures (TagResourceIT.test_recognizerPaginationEndpoint) indicate the backward pagination is not working correctly - tests are expecting Recognizer_19 but getting Recognizer_29. This suggests the cursor might not be functioning as intended.

Regarding the refactor: Yes, I'd recommend consolidating the cursor validation logic to avoid the duplication and make the flow clearer. You could:

Extract cursor parsing/validation to a shared method

Have both methods call it with appropriate error context

This would maintain the clear error messages while reducing duplication

But first, it would be good to investigate why the test is failing - the pagination logic may need adjustment regardless of the validation approach.
That test doesn't fail anymore

Great to hear the test is passing now! I can confirm that both integration test jobs (mysql-elasticsearch and postgres-opensearch) are showing SUCCESS.

Since the validation approach is intentional and the tests are passing, the code review findings about the cursor validation can be dismissed. The design makes sense for providing clearer error messages.

Regarding the refactor suggestion: It's optional. If you feel the validation duplication adds clarity for error reporting (as you explained), you can keep it as is. A refactor would be beneficial if you want to reduce code duplication, but it's not necessary if the current approach serves the error messaging goal well.

gitar-bot · 2026-01-27T07:57:11Z

🔍 CI failure analysis for 25faf52: Nine CI jobs failed. playwright-ci-postgresql (2, 6) shows massive infrastructure failure with 207 failed tests - all toBeVisible() timing issues, indicating severe environment/infrastructure problem.

Issue

Nine CI jobs failed, all due to infrastructure issues or test problems unrelated to the PR's tag pagination changes.

Failures 1-8: Previous Issues (Still Applicable)

integration-tests-mysql-elasticsearch: Elasticsearch infrastructure failure
maven-collate-ci: Workflow orchestration timeout (workflow succeeded)
3-4. playwright-ci-postgresql (4, 6) and (3, 6): Flaky E2E tests with timing issues (1 failed + 9-10 flaky each)
5-6. py-run-tests (3.10 and 3.11): Deterministic S3 storage failures (0% retry success)
maven-postgresql-ci: 1 failure + 3 errors (Apps/AWS issues)
maven-sonarcloud-ci: 2 failures + 3 errors (Apps/AWS/Glossary issues)

Failure 9: playwright-ci-postgresql (2, 6) (Job 61570693936) [NEW - CRITICAL]

Test Results: 207 failed tests + 7 flaky tests

Critical Severity:

This is a massive infrastructure failure - the highest test failure count across all CI jobs. For comparison:

Shard (3, 6): 1 failed + 10 flaky
Shard (4, 6): 1 failed + 9 flaky
Shard (2, 6): 207 failed + 7 flaky ⚠️

Error Pattern:

All 207 failures show the same error:

Error: expect(locator).toBeVisible() failed

Repeated consistently across all failing tests - this indicates a systematic infrastructure or environment problem, not test-specific issues.

Root Cause Analysis:

Infrastructure Failure - UI Rendering/Browser Environment:

Scope: Nearly all tests in shard (2, 6) are affected (207 failures)
Pattern: 100% of failures are toBeVisible() errors - elements not appearing
Comparison: Other shards (3, 6) and (4, 6) have only 1-10 failures each
Implication: This shard's test environment has a severe issue:
- Browser instance may not have started correctly
- UI server may not be responding
- Network connectivity issues between test runner and UI
- Resource exhaustion (memory/CPU) preventing UI rendering
- Test database/backend not available for this shard

Analysis:

The 207 failed tests represent approximately 73% of tests in this shard (assuming ~280 tests per shard based on 6 total shards). This is not flakiness - this is a complete environment failure where the test infrastructure for shard (2, 6) did not function correctly.

Root Cause Summary

All nine failures are infrastructure/test issues:

Elasticsearch service failure - external dependency
GitHub Actions workflow timeout - orchestration issue
3-4. Playwright test flakiness (shards 3, 4) - Minor UI timing issues (1-10 failures)
5-6. Python S3 storage tests - Deterministic STS API mismatch
7-8. Maven tests - Apps/AWS/Glossary validation/timing issues
Playwright shard (2, 6) - CRITICAL: Complete environment failure (207 failures)

Details

PR Changes Are Unrelated:

This PR modifies only tag pagination Java backend:

TagRepository.java, TagResource.java, TagResourceIT.java, BadCursorException.java, TagRepositoryUnitTest.java

The failures are in completely unrelated components. Zero overlap between PR changes (Tag pagination) and failing components.

Code Review ✅ Approved 2 resolved / 2 findings

Both previous issues have been addressed. The operationIds are now unique (listATagsRecognizersById and listATagsRecognizersByFQN), and the backward pagination now properly extracts and passes the cursor ID. Good implementation with comprehensive test coverage.

✅ 2 resolved

✅ Quality: Duplicate OpenAPI operationId will cause spec conflicts

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/tags/TagResource.java:811-814 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/tags/TagResource.java:857-860
Both REST endpoints (listRecognizersByTagId and listRecognizersByTagFQN) use the same operationId = "listATagsRecognizers". OpenAPI specification requires unique operation IDs across all endpoints. This will cause issues with:

OpenAPI spec validation

SDK code generation (client generators use operationId to generate method names)

API documentation tools

Suggested fix: Use unique operation IDs for each endpoint:

For the ID-based endpoint:
operationId = "listTagRecognizersById"
For the FQN-based endpoint:
operationId = "listTagRecognizersByFQN"

✅ Bug: Backward pagination cursor not passed correctly

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548-558 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561
In listRecognizersBefore, the beforeId is parsed and validated but never used. The method then passes the raw before cursor string to listRecognizersAfter, which will re-parse it.

While the cursor format is the same, this creates:

Dead code (beforeId assigned but unused)

Duplicate cursor parsing

The CI test failures (expecting Recognizer_19 but getting Recognizer_29) suggest the pagination logic may have issues

Consider using the already-parsed beforeId by passing it directly, or refactoring to share cursor parsing logic.

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

`Auto-apply`	`Compact`
`gitar auto-apply:on`	`gitar display:verbose`

_{Was this helpful? React with 👍 / 👎 | Gitar}

* Add pagination for tag's recognizers * Add tests * Apply Gitar comments

edg956 self-assigned this Jan 26, 2026

edg956 requested a review from a team as a code owner January 26, 2026 18:52

edg956 added safe to test Add this label to run secure Github workflows on PRs governance labels Jan 26, 2026

edg956 temporarily deployed to test January 26, 2026 18:52 — with GitHub Actions Inactive

github-actions bot added the Ingestion label Jan 26, 2026

edg956 force-pushed the feat/tag-recognizer-pagination branch from e2dca3b to dce5dac Compare January 26, 2026 18:56

harshach previously approved these changes Jan 26, 2026

View reviewed changes

gitar-bot bot reviewed Jan 26, 2026

View reviewed changes

openmetadata-service/src/main/java/org/openmetadata/service/resources/tags/TagResource.java Show resolved Hide resolved

edg956 temporarily deployed to test January 26, 2026 20:21 — with GitHub Actions Inactive

edg956 had a problem deploying to test January 26, 2026 20:21 — with GitHub Actions Failure

edg956 temporarily deployed to test January 26, 2026 20:21 — with GitHub Actions Inactive

edg956 had a problem deploying to test January 26, 2026 20:21 — with GitHub Actions Failure

edg956 added 3 commits January 27, 2026 08:44

Add pagination for tag's recognizers

1c4ebb0

Add tests

5d09826

Apply Gitar comments

25faf52

edg956 dismissed harshach’s stale review via 25faf52 January 27, 2026 07:45

edg956 force-pushed the feat/tag-recognizer-pagination branch from dce5dac to 25faf52 Compare January 27, 2026 07:45

edg956 had a problem deploying to test January 27, 2026 07:45 — with GitHub Actions Failure

edg956 temporarily deployed to test January 27, 2026 07:45 — with GitHub Actions Inactive

edg956 had a problem deploying to test January 27, 2026 07:45 — with GitHub Actions Failure

edg956 temporarily deployed to test January 27, 2026 07:45 — with GitHub Actions Inactive

edg956 had a problem deploying to test January 27, 2026 07:45 — with GitHub Actions Failure

edg956 temporarily deployed to test January 27, 2026 07:45 — with GitHub Actions Inactive

pmbrull merged commit f074266 into main Jan 27, 2026
20 of 31 checks passed

pmbrull deleted the feat/tag-recognizer-pagination branch January 27, 2026 13:50

edg956 added a commit that referenced this pull request Jan 27, 2026

Tag recognizer pagination (#25537)

c0dd7bb

* Add pagination for tag's recognizers * Add tests * Apply Gitar comments

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tag recognizer pagination #25537

Tag recognizer pagination #25537

Uh oh!

edg956 commented Jan 26, 2026 •

edited by gitar-bot bot

Loading

Uh oh!

Uh oh!

edg956 commented Jan 27, 2026

Uh oh!

gitar-bot bot commented Jan 27, 2026

Uh oh!

edg956 commented Jan 27, 2026

Uh oh!

gitar-bot bot commented Jan 27, 2026

Uh oh!

gitar-bot bot commented Jan 27, 2026 •

edited

Loading

Issue

Failures 1-8: Previous Issues (Still Applicable)

Failure 9: playwright-ci-postgresql (2, 6) (Job 61570693936) [NEW - CRITICAL]

Critical Severity:

Error Pattern:

Root Cause Analysis:

Analysis:

Root Cause Summary

Details

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Tag recognizer pagination #25537

Tag recognizer pagination #25537

Uh oh!

Conversation

edg956 commented Jan 26, 2026 • edited by gitar-bot bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe your changes:

Type of change:

Checklist:

Summary by Gitar

Uh oh!

Uh oh!

edg956 commented Jan 27, 2026

Uh oh!

gitar-bot bot commented Jan 27, 2026

Uh oh!

edg956 commented Jan 27, 2026

Uh oh!

gitar-bot bot commented Jan 27, 2026

Uh oh!

gitar-bot bot commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue

Failures 1-8: Previous Issues (Still Applicable)

Failure 9: playwright-ci-postgresql (2, 6) (Job 61570693936) [NEW - CRITICAL]

Critical Severity:

Error Pattern:

Root Cause Analysis:

Analysis:

Root Cause Summary

Details

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

edg956 commented Jan 26, 2026 •

edited by gitar-bot bot

Loading

gitar-bot bot commented Jan 27, 2026 •

edited

Loading