Skip to content

Conversation

@edg956
Copy link
Contributor

@edg956 edg956 commented Jan 26, 2026

Describe your changes:

This PR includes an endpoint that returns a tag's recognizers with pagination

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.
  • I have added tests around the new logic.
  • For connector/ingestion changes: I updated the documentation.

Summary by Gitar

  • New API endpoints:
    • GET /v1/tags/{id}/recognizers and GET /v1/tags/name/{fqn}/recognizers enable paginated retrieval of tag recognizers
  • Cursor-based pagination:
    • TagRepository.getRecognizersOfTag() implements bidirectional cursor navigation with before/after parameters
  • New exception handling:
    • BadCursorException validates and handles malformed pagination cursors returning HTTP 400

This will update automatically on new commits.


@edg956 edg956 self-assigned this Jan 26, 2026
@edg956 edg956 requested a review from a team as a code owner January 26, 2026 18:52
@edg956 edg956 added safe to test Add this label to run secure Github workflows on PRs governance labels Jan 26, 2026
@edg956 edg956 force-pushed the feat/tag-recognizer-pagination branch from e2dca3b to dce5dac Compare January 26, 2026 18:56
harshach
harshach previously approved these changes Jan 26, 2026
@edg956
Copy link
Contributor Author

edg956 commented Jan 27, 2026

@gitar-bot

🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561

The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

  1. The beforeId that was parsed and validated is never used
  2. listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.

Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:

return listRecognizersAfter(recognizers.reversed(), null, limit);

Then handle the cursor-based starting position differently in the calling code.

This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.

Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548

In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.

This means:

  1. The cursor parsing at line 548 is effectively dead code
  2. beforeId is extracted but never referenced again
  3. The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing

@gitar-bot
Copy link

gitar-bot bot commented Jan 27, 2026

@gitar-bot

🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561

The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

  1. The beforeId that was parsed and validated is never used
  2. listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.

Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:

return listRecognizersAfter(recognizers.reversed(), null, limit);

Then handle the cursor-based starting position differently in the calling code.

This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.

Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548

In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.

This means:

  1. The cursor parsing at line 548 is effectively dead code
  2. beforeId is extracted but never referenced again
  3. The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing

Thank you for the clarification! I understand the design intent now - validating the cursor early in listRecognizersBefore to provide clearer error messages about which parameter (before vs after) caused the issue.

However, the CI test failures (TagResourceIT.test_recognizerPaginationEndpoint) indicate the backward pagination is not working correctly - tests are expecting Recognizer_19 but getting Recognizer_29. This suggests the cursor might not be functioning as intended.

Regarding the refactor: Yes, I'd recommend consolidating the cursor validation logic to avoid the duplication and make the flow clearer. You could:

  1. Extract cursor parsing/validation to a shared method
  2. Have both methods call it with appropriate error context
  3. This would maintain the clear error messages while reducing duplication

But first, it would be good to investigate why the test is failing - the pagination logic may need adjustment regardless of the validation approach.

@edg956
Copy link
Contributor Author

edg956 commented Jan 27, 2026

@gitar-bot

🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561
The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

  1. The beforeId that was parsed and validated is never used
  2. listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.
Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:

return listRecognizersAfter(recognizers.reversed(), null, limit);

Then handle the cursor-based starting position differently in the calling code.

This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.
Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548
In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.
This means:

  1. The cursor parsing at line 548 is effectively dead code
  2. beforeId is extracted but never referenced again
  3. The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing

Thank you for the clarification! I understand the design intent now - validating the cursor early in listRecognizersBefore to provide clearer error messages about which parameter (before vs after) caused the issue.

However, the CI test failures (TagResourceIT.test_recognizerPaginationEndpoint) indicate the backward pagination is not working correctly - tests are expecting Recognizer_19 but getting Recognizer_29. This suggests the cursor might not be functioning as intended.

Regarding the refactor: Yes, I'd recommend consolidating the cursor validation logic to avoid the duplication and make the flow clearer. You could:

  1. Extract cursor parsing/validation to a shared method
  2. Have both methods call it with appropriate error context
  3. This would maintain the clear error messages while reducing duplication

But first, it would be good to investigate why the test is failing - the pagination logic may need adjustment regardless of the validation approach.

That test doesn't fail anymore

@gitar-bot
Copy link

gitar-bot bot commented Jan 27, 2026

@gitar-bot

🚨 Bug: Backward pagination passes wrong cursor parameter
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561
The listRecognizersBefore method has a critical bug where it validates the before cursor but then passes the same raw before string to listRecognizersAfter as the after parameter (line 561). This causes two problems:

  1. The beforeId that was parsed and validated is never used
  2. listRecognizersAfter receives before as its second param, which it will then try to parse again as an after cursor

The backward pagination logic is fundamentally broken because the cursor string is passed through unchanged.
Suggested fix: Pass null as the after cursor and rely on the reversed list to handle pagination naturally, OR refactor to encode the cursor properly for the reversed list:

return listRecognizersAfter(recognizers.reversed(), null, limit);

Then handle the cursor-based starting position differently in the calling code.

This is by design. Cursors are nothing but a base64-encoded json containing an id and a name of the next recognizer. By validating in listRecognizersBefore I'm ensuring that if there's any error then it correctly says that it comes from before. Because the validation passed, I know it won't result in an error in listRecognizersAfter.
Would you prefer a refactor to avoid duplication?

⚠️ Bug: Unused beforeId variable - parsed cursor value is discarded
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548
In the listRecognizersBefore method, the beforeId variable is parsed from the cursor at line 548, but is never used. The method then calls listRecognizersAfter(recognizers.reversed(), before, limit) at line 558, passing the original raw before cursor string instead of the validated beforeId.
This means:

  1. The cursor parsing at line 548 is effectively dead code
  2. beforeId is extracted but never referenced again
  3. The validation is performed twice (once here, once in listRecognizersAfter)

The method needs restructuring to properly use the parsed cursor value.

As mentioned before, this is on purpose. A refactor could be performed to do such thing

Thank you for the clarification! I understand the design intent now - validating the cursor early in listRecognizersBefore to provide clearer error messages about which parameter (before vs after) caused the issue.

However, the CI test failures (TagResourceIT.test_recognizerPaginationEndpoint) indicate the backward pagination is not working correctly - tests are expecting Recognizer_19 but getting Recognizer_29. This suggests the cursor might not be functioning as intended.

Regarding the refactor: Yes, I'd recommend consolidating the cursor validation logic to avoid the duplication and make the flow clearer. You could:

  1. Extract cursor parsing/validation to a shared method
  2. Have both methods call it with appropriate error context
  3. This would maintain the clear error messages while reducing duplication

But first, it would be good to investigate why the test is failing - the pagination logic may need adjustment regardless of the validation approach.

That test doesn't fail anymore

Great to hear the test is passing now! I can confirm that both integration test jobs (mysql-elasticsearch and postgres-opensearch) are showing SUCCESS.

Since the validation approach is intentional and the tests are passing, the code review findings about the cursor validation can be dismissed. The design makes sense for providing clearer error messages.

Regarding the refactor suggestion: It's optional. If you feel the validation duplication adds clarity for error reporting (as you explained), you can keep it as is. A refactor would be beneficial if you want to reduce code duplication, but it's not necessary if the current approach serves the error messaging goal well.

@gitar-bot
Copy link

gitar-bot bot commented Jan 27, 2026

🔍 CI failure analysis for 25faf52: Nine CI jobs failed. playwright-ci-postgresql (2, 6) shows massive infrastructure failure with 207 failed tests - all toBeVisible() timing issues, indicating severe environment/infrastructure problem.

Issue

Nine CI jobs failed, all due to infrastructure issues or test problems unrelated to the PR's tag pagination changes.


Failures 1-8: Previous Issues (Still Applicable)

  1. integration-tests-mysql-elasticsearch: Elasticsearch infrastructure failure
  2. maven-collate-ci: Workflow orchestration timeout (workflow succeeded)
    3-4. playwright-ci-postgresql (4, 6) and (3, 6): Flaky E2E tests with timing issues (1 failed + 9-10 flaky each)
    5-6. py-run-tests (3.10 and 3.11): Deterministic S3 storage failures (0% retry success)
  3. maven-postgresql-ci: 1 failure + 3 errors (Apps/AWS issues)
  4. maven-sonarcloud-ci: 2 failures + 3 errors (Apps/AWS/Glossary issues)

Failure 9: playwright-ci-postgresql (2, 6) (Job 61570693936) [NEW - CRITICAL]

Test Results: 207 failed tests + 7 flaky tests

Critical Severity:

This is a massive infrastructure failure - the highest test failure count across all CI jobs. For comparison:

  • Shard (3, 6): 1 failed + 10 flaky
  • Shard (4, 6): 1 failed + 9 flaky
  • Shard (2, 6): 207 failed + 7 flaky ⚠️

Error Pattern:

All 207 failures show the same error:

Error: expect(locator).toBeVisible() failed

Repeated consistently across all failing tests - this indicates a systematic infrastructure or environment problem, not test-specific issues.

Root Cause Analysis:

Infrastructure Failure - UI Rendering/Browser Environment:

  1. Scope: Nearly all tests in shard (2, 6) are affected (207 failures)
  2. Pattern: 100% of failures are toBeVisible() errors - elements not appearing
  3. Comparison: Other shards (3, 6) and (4, 6) have only 1-10 failures each
  4. Implication: This shard's test environment has a severe issue:
    • Browser instance may not have started correctly
    • UI server may not be responding
    • Network connectivity issues between test runner and UI
    • Resource exhaustion (memory/CPU) preventing UI rendering
    • Test database/backend not available for this shard

Analysis:

The 207 failed tests represent approximately 73% of tests in this shard (assuming ~280 tests per shard based on 6 total shards). This is not flakiness - this is a complete environment failure where the test infrastructure for shard (2, 6) did not function correctly.

Root Cause Summary

All nine failures are infrastructure/test issues:

  1. Elasticsearch service failure - external dependency
  2. GitHub Actions workflow timeout - orchestration issue
    3-4. Playwright test flakiness (shards 3, 4) - Minor UI timing issues (1-10 failures)
    5-6. Python S3 storage tests - Deterministic STS API mismatch
    7-8. Maven tests - Apps/AWS/Glossary validation/timing issues
  3. Playwright shard (2, 6) - CRITICAL: Complete environment failure (207 failures)

Details

PR Changes Are Unrelated:

This PR modifies only tag pagination Java backend:

  • TagRepository.java, TagResource.java, TagResourceIT.java, BadCursorException.java, TagRepositoryUnitTest.java

The failures are in completely unrelated components. Zero overlap between PR changes (Tag pagination) and failing components.

Code Review ✅ Approved 2 resolved / 2 findings

Both previous issues have been addressed. The operationIds are now unique (listATagsRecognizersById and listATagsRecognizersByFQN), and the backward pagination now properly extracts and passes the cursor ID. Good implementation with comprehensive test coverage.

✅ 2 resolved
Quality: Duplicate OpenAPI operationId will cause spec conflicts

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/tags/TagResource.java:811-814 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/tags/TagResource.java:857-860
Both REST endpoints (listRecognizersByTagId and listRecognizersByTagFQN) use the same operationId = "listATagsRecognizers". OpenAPI specification requires unique operation IDs across all endpoints. This will cause issues with:

  1. OpenAPI spec validation
  2. SDK code generation (client generators use operationId to generate method names)
  3. API documentation tools

Suggested fix: Use unique operation IDs for each endpoint:

For the ID-based endpoint:

operationId = "listTagRecognizersById"

For the FQN-based endpoint:

operationId = "listTagRecognizersByFQN"
Bug: Backward pagination cursor not passed correctly

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548-558 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:548 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java:561
In listRecognizersBefore, the beforeId is parsed and validated but never used. The method then passes the raw before cursor string to listRecognizersAfter, which will re-parse it.

While the cursor format is the same, this creates:

  1. Dead code (beforeId assigned but unused)
  2. Duplicate cursor parsing
  3. The CI test failures (expecting Recognizer_19 but getting Recognizer_29) suggest the pagination logic may have issues

Consider using the already-parsed beforeId by passing it directly, or refactoring to share cursor parsing logic.

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@pmbrull pmbrull merged commit f074266 into main Jan 27, 2026
20 of 31 checks passed
@pmbrull pmbrull deleted the feat/tag-recognizer-pagination branch January 27, 2026 13:50
edg956 added a commit that referenced this pull request Jan 27, 2026
* Add pagination for tag's recognizers

* Add tests

* Apply Gitar comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

governance Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants