Skip to content

Integ test cases for field-level security#5008

Open
Swiddis wants to merge 12 commits intoopensearch-project:mainfrom
Swiddis:fix/background-io-thread-context
Open

Integ test cases for field-level security#5008
Swiddis wants to merge 12 commits intoopensearch-project:mainfrom
Swiddis:fix/background-io-thread-context

Conversation

@Swiddis
Copy link
Copy Markdown
Collaborator

@Swiddis Swiddis commented Dec 30, 2025

Description

Adds some tests to validate various scenarios for document-level security (DLS), since there was an internal scare for a bit on if DLS is working with background IO. Turns out we're protected implicitly by ContextPreservingRunnable. Just pushing some work from that to keep verifying this in the future.

Related Issues

N/A

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 30, 2025

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Summary by CodeRabbit

  • Tests

    • Added comprehensive FGAC integration tests covering index-, field-, and row-level security across large datasets and multiple engine configurations
    • Introduced a shared security-focused test base with utilities for role/user setup, DLS/FLS, bulk data population, and authenticated query execution
    • Updated existing permission tests to use the new security-centric test base
  • Chores

    • Added junit-jupiter-params test dependency

Walkthrough

Adds a SecurityTestBase providing role/user/DLS/FLS helpers, authenticated query and bulk utilities; introduces FGACIndexScanningIT to test index-, field-, and row-level FGAC with large datasets across two engine modes; refactors PPLPermissionsIT to use the new base; and updates integ-test build deps for JUnit parameterized tests.

Changes

Cohort / File(s) Summary
Test build config
integ-test/build.gradle
Adds org.junit.jupiter:junit-jupiter-params:5.9.3 to testImplementation.
Security test base
integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
New abstract test base adding helpers for creating roles (index access, permissions, DLS, FLS), creating users and role-mappings, executing authenticated PPL queries (Basic Auth), building/performing bulk inserts, creating indices with mappings, and an inner BulkDocumentBuilder.
New FGAC integration test
integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
New parameterized integration test that creates indices, roles, and users; bulk-populates large datasets; and verifies index-, field-, and row-level FGAC behaviors across users and engine modes.
Permissions tests refactor
integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
Class now extends SecurityTestBase (was PPLIntegTestCase); removes in-class password/user creation and legacy helpers; replaces role/user/index setup with SecurityTestBase utilities like createRoleWithIndexAccess and createRoleWithPermissions.

Sequence Diagram(s)

sequenceDiagram
    actor Test
    participant Base as SecurityTestBase
    participant OpenSearch
    participant Auth as Auth/Authorization

    rect rgba(230,240,255,0.5)
        Note over Test,Base: Setup Phase
        Test->>Base: createRoleWithIndexAccess(role, indexPattern)
        Base->>OpenSearch: HTTP role creation (index/DLS/FLS/permissions)
        OpenSearch-->>Base: 200 OK
        Test->>Base: createUser(username, role)
        Base->>OpenSearch: HTTP user creation & role-mapping
        OpenSearch-->>Base: 200 OK
    end

    rect rgba(240,255,230,0.5)
        Note over Test,Base: Bulk Load Phase
        Test->>Base: performBulkInsert(bulkBody)
        Base->>OpenSearch: Bulk API request (with refresh)
        OpenSearch-->>Base: Bulk response
    end

    rect rgba(255,240,230,0.5)
        Note over Test,Base: Authenticated Query Phase
        Test->>Base: executeQueryAsUser(query, username)
        Base->>Base: createBasicAuthHeader(username, password)
        Base->>OpenSearch: PPL query + Basic Auth header
        OpenSearch->>Auth: Validate credentials & evaluate roles
        Auth-->>OpenSearch: Authorization decision
        OpenSearch-->>Base: Query results (filtered by FGAC)
        Base-->>Test: JSONObject response
    end

    rect rgba(245,235,255,0.5)
        Note over Test: Verification
        Test->>Test: Assert results match expected FGAC behavior
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

testing

Suggested reviewers

  • kavithacm
  • derek-ho
  • joshuali925
  • anirudha
  • penghuo
  • ps48
  • ykmr1224
  • yuancu
  • mengweieric
  • GumpacG
  • MaxKsyunz
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.98% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive Title focuses on field-level security testing, but PR primarily adds document-level security (DLS) integration tests with a new security test base class and parameterized tests across multiple security scenarios. Clarify whether the title should emphasize DLS (the main focus per the description) or broaden to include the new SecurityTestBase utility class and comprehensive security testing framework.
✅ Passed checks (1 passed)
Check name Status Explanation
Description check ✅ Passed Description clearly explains the purpose: adding DLS integration tests to verify security with background I/O, references ContextPreservingRunnable, and indicates future verification intent.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
@Swiddis Swiddis force-pushed the fix/background-io-thread-context branch from 1b1ca7b to ded19e9 Compare December 30, 2025 19:00
@Swiddis Swiddis added the maintenance Improves code quality, but not the product label Dec 30, 2025
@Swiddis
Copy link
Copy Markdown
Collaborator Author

Swiddis commented Dec 30, 2025

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 30, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java (2)

53-64: Static initialization flag may cause issues with test isolation.

The static boolean initialized flag combined with @BeforeEach could cause issues if tests run in parallel or if the test class is instantiated multiple times. Consider using @BeforeAll with a static method instead, which is the idiomatic JUnit 5 pattern for one-time setup.

🔎 Proposed refactor to use @BeforeAll
- private static boolean initialized = false;
-
- @SneakyThrows
- @BeforeEach
- public void initialize() {
-   if (!initialized) {
-     setUpIndices(); // Initialize client if needed
-     setupTestIndices();
-     createSecurityRolesAndUsers();
-     initialized = true;
-   }
- }
+ @SneakyThrows
+ @BeforeAll
+ public static void initializeOnce() {
+   // Note: This requires making the setup methods static or 
+   // using a different initialization approach
+ }

Alternatively, if @BeforeAll isn't compatible with the parent class, document why the static flag pattern is necessary.


74-85: Consider adding test cleanup for created indices.

The test creates multiple indices (public_logs_fgac, sensitive_logs_fgac, secure_logs_fgac, employee_records_fgac) but there's no @AfterAll or @AfterEach cleanup method. While using unique _fgac suffix provides isolation, these indices will persist after tests complete.

As per coding guidelines, integration tests should validate that tests clean up resources after execution.

🔎 Consider adding cleanup
@AfterAll
public static void cleanup() throws IOException {
  // Delete test indices
  String[] indices = {PUBLIC_LOGS, SENSITIVE_LOGS, SECURE_LOGS, EMPLOYEE_RECORDS};
  for (String index : indices) {
    try {
      client().performRequest(new Request("DELETE", "/" + index));
    } catch (Exception e) {
      // Index may not exist, ignore
    }
  }
}
integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java (2)

56-98: Consider using JSONArray for permission arrays to avoid manual string building.

The manual StringBuilder approach for building JSON arrays works but is error-prone. Using JSONArray would be cleaner and safer.

🔎 Optional refactor using JSONArray
import org.json.JSONArray;

// In createRoleWithPermissions:
JSONArray clusterPermsArray = new JSONArray(clusterPermissions);
JSONArray indexPermsArray = new JSONArray(indexPermissions);

// Then use clusterPermsArray.toString() and indexPermsArray.toString() in the template

341-353: Remove unused BulkDocumentBuilder helper class or document its intended purpose.

The BulkDocumentBuilder is defined but never instantiated or used anywhere in the codebase. Either remove it if not needed for this PR, or add a comment documenting its intended usage in future tests.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08be6f9 and ded19e9.

📒 Files selected for processing (4)
  • integ-test/build.gradle
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
🧰 Additional context used
📓 Path-based instructions (4)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify changes with `./gradlew :integ-test:integTest` before merge
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify changes with `./gradlew :integ-test:integTest` before merge

Applied to files:

  • integ-test/build.gradle
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module

Applied to files:

  • integ-test/build.gradle
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Use `./gradlew :integ-test:integTest` for integration testing in OpenSearch SQL

Applied to files:

  • integ-test/build.gradle
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/build.gradle
  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Development requires JDK 21 for the OpenSearch SQL project

Applied to files:

  • integ-test/build.gradle
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/build.gradle
  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java (1)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
  • PPLIntegTestCase (36-415)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (12)
integ-test/build.gradle (2)

211-212: LGTM - Dependency addition for parameterized tests.

The junit-jupiter-params dependency aligns with the existing JUnit 5 version (5.9.3) used in the project.


449-453: LGTM - Test filter inclusion follows existing pattern.

The FGACIndexScanningIT test is correctly added to the security test filter alongside existing security integration tests. Based on learnings, the *IT.java naming convention is properly followed.

integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java (4)

197-223: LGTM - Bulk insert implementation follows best practices.

The method correctly uses StringBuilder for string concatenation in loops (as per coding guidelines), sets the proper Content-Type: application/x-ndjson header, and verifies the response status code.


363-405: LGTM - Index-level security test is comprehensive.

The test properly validates both positive cases (user can access their index) and negative cases (user cannot access unauthorized index) with meaningful assertions. The exception handling verifies appropriate error messages.


407-499: LGTM - Column-level security test covers FLS comprehensively.

The test validates:

  1. HR user can see all fields including sensitive ssn
  2. Manager user cannot see ssn field
  3. Manager cannot even query ssn field
  4. FLS is enforced even with large datasets (background scanning)

Error messages clearly indicate security violations with actionable context.


501-628: LGTM - Row-level security tests cover both V2 and V3 engines.

The tests appropriately:

  • Test V2 (legacy) engine by explicitly calling disableCalcite()
  • Test V3 (Calcite) engine using the default enabled state
  • Verify DLS filtering by counting documents per security level
  • Include detailed error messages explaining ThreadContext implications
integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java (3)

19-26: LGTM - Well-structured base class with clear documentation.

The abstract class properly inherits from PPLIntegTestCase and provides reusable utilities for security integration tests, reducing code duplication across test classes.


198-248: LGTM - User creation with role mapping is convenient for test setup.

The method appropriately combines user creation and role mapping, which is a common pattern for test utilities. Both operations verify success via status codes.


250-277: LGTM - Clean implementation for authenticated query execution.

The method properly constructs the request with Basic Auth headers and validates the response status code before parsing.

integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java (3)

31-31: LGTM - Good refactoring to extend SecurityTestBase.

The class now properly inherits from SecurityTestBase, eliminating duplicate code for role/user creation and authenticated query execution.


83-102: LGTM - Clean usage of inherited SecurityTestBase methods.

The refactored setup properly delegates to the base class methods for creating roles and users, following DRY principles.


104-194: LGTM - Comprehensive negative test setup with inherited utilities.

The helper methods properly use createRoleWithPermissions from SecurityTestBase to create roles with specific missing permissions, enabling thorough negative testing of permission enforcement.

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
@Swiddis Swiddis mentioned this pull request Dec 31, 2025
8 tasks
@opensearch-trigger-bot
Copy link
Copy Markdown
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In
`@integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java`:
- Around line 265-309: bulkInsertDocsWithSecurityLevel currently only inserts
documents with non-null security_level; add a few extra bulk entries that
include one document with "security_level": null and one document that omits the
security_level field entirely (use FGACIndexScanningIT.SECURE_LOGS as the
_index) so DLS behavior for null/missing values is exercised; update any
corresponding assertions in the test that verifies visibility for limited_user
to assert those null/missing-security_level docs are not visible.
- Around line 502-527: The test testFieldLevelSecurityEnforcedWithLargeDataset
currently uses a query that excludes ssn so it doesn't exercise FLS; change the
test to either (A) issue a query that directly requests the restricted field
(e.g., include "ssn" in the fields list) via
executeQueryAsUser(queryLargeDataset, MANAGER_USER) and assert that it fails
with the same ResponseException expectation used in
testManagerUserCannotQueryRestrictedField, or (B) keep the large-dataset query
but assert presence of allowed fields (name, salary, department) in the returned
schema/result to prove the query executed and FLS is not leaking extra fields;
update assertions accordingly and reuse existing constants EMPLOYEE_RECORDS and
MANAGER_USER to locate where to modify the query and assertions.
- Around line 529-600: The testRowLevelSecurity method reads datarows assuming
fixed column positions; instead read the result.getJSONArray("schema") to find
indexes for "count()" and "security_level" (e.g., locate these names in the
schema array), validate those indexes are present, then use those index
variables when extracting values from datarows in the loop; update references to
count and securityLevel accordingly and add an assertion/failure if the schema
does not contain the expected columns so the test fails loudly rather than
silently misreading columns.
- Around line 196-222: The bulk helper functions (bulkInsertDocs,
bulkInsertEmployeeRecords, bulkInsertDocsWithSecurityLevel) only assert HTTP
200, missing per-item failures; after performing the _bulk request read and
parse the response body JSON and assert the top-level "errors" field is false
(and optionally inspect "items" to fail the test with the first item error),
replacing the lone status check with this validation so partial failures cause
the test to fail.

@Swiddis Swiddis requested review from dai-chen and penghuo February 2, 2026 18:44
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java (1)

238-240: ⚠️ Potential issue | 🟡 Minor

Comment has incorrect index name.

The comment states "user should not be able to access the dog index" but this test verifies that dog_user cannot access the bank index.

📝 Suggested fix
     } catch (ResponseException e) {
-      // This is expected - user should not be able to access the dog index
+      // This is expected - user should not be able to access the bank index
       String responseBody =

@opensearch-trigger-bot
Copy link
Copy Markdown
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

ahkcs
ahkcs previously approved these changes Mar 12, 2026
@opensearch-trigger-bot
Copy link
Copy Markdown
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit e2e95ae.

PathLineSeverityDescription
integ-test/build.gradle216highNew dependency added: 'org.junit.jupiter:junit-jupiter-params:5.9.3'. Per mandatory flagging rules, all dependency changes must be flagged regardless of apparent legitimacy. Maintainers should verify this artifact matches the expected JUnit Jupiter artifact from the official repository.
integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java35lowHardcoded credential 'correcthorsebatterystaple' introduced in the new shared base class, replacing the previous per-class value 'StrongPassword123!'. While this is test infrastructure, centralizing a changed password value in a shared base class warrants review to confirm it is intentional and not being used in non-test environments.

The table above displays the top 10 most important findings.

Total: 2 | Critical: 0 | High: 1 | Medium: 0 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@Swiddis
Copy link
Copy Markdown
Collaborator Author

Swiddis commented Apr 1, 2026

this passphrase-style password of ambiguous origin.

i'd expect that reference to have shown up enough times in their training data.

@Swiddis Swiddis added the skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. label Apr 1, 2026
@Swiddis Swiddis requested a review from ahkcs April 1, 2026 20:29
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 Security concerns

Sensitive information exposure:
The STRONG_PASSWORD constant ("correcthorsebatterystaple") is a publicly known example passphrase. While this is test-only code, it is stored as a plain-text constant in the source. If these test credentials were accidentally used against a real environment or if the test infrastructure is shared, this could pose a risk. Additionally, the password is embedded in Basic auth headers constructed via createBasicAuthHeader, which encodes but does not encrypt credentials.

✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Refactor PPLPermissionsIT to use shared SecurityTestBase

Relevant files:

  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java

Sub-PR theme: Add FGAC index scanning integration tests for DLS and FLS

Relevant files:

  • integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java
  • integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java

⚡ Recommended focus areas for review

Weak Password

The STRONG_PASSWORD constant is set to "correcthorsebatterystaple", which is a well-known example passphrase from the XKCD comic. While it meets length requirements, using a publicly known example password in security tests could be misleading. Consider using a more unique password that isn't a famous example.

protected static final String STRONG_PASSWORD = "correcthorsebatterystaple";
Inconsistent Auth

The executeGrammarAsUser method constructs the Basic auth header inline using java.util.Base64, while executeQueryAsUser uses the createBasicAuthHeader helper method. These should be consistent — executeGrammarAsUser should also use createBasicAuthHeader.

protected JSONObject executeGrammarAsUser(String username) throws IOException {
  Request request = new Request("GET", "/_plugins/_ppl/_grammar");

  RequestOptions.Builder restOptionsBuilder = RequestOptions.DEFAULT.toBuilder();
  restOptionsBuilder.addHeader(
      "Authorization",
      "Basic "
          + java.util.Base64.getEncoder()
              .encodeToString((username + ":" + STRONG_PASSWORD).getBytes()));
  request.setOptions(restOptionsBuilder);

  Response response = client().performRequest(request);
  assertEquals(200, response.getStatusLine().getStatusCode());
  return new JSONObject(org.opensearch.sql.legacy.TestUtils.getResponseBody(response, true));
}
DLS Query Escaping

The DLS query string passed to createRoleWithDLS uses manual backslash escaping ({\\\"match\\\":{\\\"security_level\\\":\\\"public\\\"}}) which is fragile and error-prone. If the escaping is incorrect, the DLS filter may silently fail to apply, causing security tests to pass even when DLS is broken. Consider using a structured approach or validating the DLS is actually applied.

  createRoleWithDLS(
      LIMITED_ROLE, SECURE_LOGS, "{\\\"match\\\":{\\\"security_level\\\":\\\"public\\\"}}");
}
Missing Null Row Assertion

Documents with null or missing security_level are inserted into SECURE_LOGS, but testRowLevelSecurity does not assert whether these documents are visible or hidden for limited_user. This could mask a DLS bypass where null/missing field documents leak through the filter.

  for (int i = 0; i < datarows.length(); i++) {
    var row = datarows.getJSONArray(i);
    int count = row.getInt(countIdx);
    String securityLevel = row.getString(levelIdx);
    totalDocs += count;

    if ("confidential".equals(securityLevel)) {
      sawConfidential = true;
    } else if ("internal".equals(securityLevel)) {
      sawInternal = true;
    } else if ("public".equals(securityLevel)) {
      publicDocs = count;
    }
  }

  assertFalse(
      String.format(
          "[%s] SECURITY VIOLATION: limited_user should NOT see 'confidential' documents. "
              + "This indicates ThreadContext is not being copied to async worker threads, "
              + "causing queries to run with admin permissions and bypass row-level security.",
          engineLabel),
      sawConfidential);

  assertFalse(
      String.format(
          "[%s] SECURITY VIOLATION: limited_user should NOT see 'internal' documents. "
              + "This indicates ThreadContext is not being copied to async worker threads, "
              + "causing queries to run with admin permissions and bypass row-level security.",
          engineLabel),
      sawInternal);

  int expectedPublicDocs = LARGE_DATASET_SIZE / 2;
  assertEquals(
      String.format(
          "[%s] limited_user should ONLY see 'public' documents (half of dataset). "
              + "Seeing more indicates row-level security is being bypassed.",
          engineLabel),
      expectedPublicDocs,
      publicDocs);

  assertEquals(
      String.format(
          "[%s] Total visible documents should match public documents only. "
              + "Seeing all documents indicates row-level security is completely bypassed.",
          engineLabel),
      expectedPublicDocs,
      totalDocs);
}
Unused Helpers

The buildBulkInsertRequest, performBulkInsert, createIndexWithMapping, and BulkDocumentBuilder utility methods/class are defined in SecurityTestBase but are not used by any of the test classes in this PR. Dead code in a base class adds maintenance burden and may confuse future contributors.

protected String buildBulkInsertRequest(String indexName, List<Map<String, Object>> documents) {
  StringBuilder bulk = new StringBuilder();
  for (Map<String, Object> doc : documents) {
    bulk.append(String.format(Locale.ROOT, "{ \"index\": { \"_index\": \"%s\" } }\n", indexName));
    bulk.append(new JSONObject(doc).toString());
    bulk.append("\n");
  }
  return bulk.toString();
}

/**
 * Performs a bulk insert operation with automatic refresh.
 *
 * @param bulkRequestBody the bulk request body (NDJSON format)
 * @return the response from the bulk operation
 */
protected Response performBulkInsert(String bulkRequestBody) throws IOException {
  Request request = new Request("POST", "/_bulk");
  request.addParameter("refresh", "true");
  request.setJsonEntity(bulkRequestBody);

  RequestOptions.Builder restOptionsBuilder = RequestOptions.DEFAULT.toBuilder();
  restOptionsBuilder.addHeader("Content-Type", "application/x-ndjson");
  request.setOptions(restOptionsBuilder);

  Response response = client().performRequest(request);
  assertEquals(200, response.getStatusLine().getStatusCode());
  return response;
}

/**
 * Creates an index with a simple mapping.
 *
 * @param indexName the name of the index
 * @param mappingJson the mapping definition as JSON string
 */
protected void createIndexWithMapping(String indexName, String mappingJson) throws IOException {
  Request request = new Request("PUT", "/" + indexName);
  request.setJsonEntity(mappingJson);
  Response response = client().performRequest(request);
  assertEquals(200, response.getStatusLine().getStatusCode());
}

/** Simple builder for creating bulk insert documents. */
protected static class BulkDocumentBuilder {
  private final List<Map<String, Object>> documents = new ArrayList<>();

  public BulkDocumentBuilder addDocument(Map<String, Object> doc) {
    documents.add(doc);
    return this;
  }

  public List<Map<String, Object>> build() {
    return documents;
  }
}

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Use explicit UTF-8 charset for Base64 encoding

The getBytes() call without a charset argument uses the platform's default charset,
which may vary across environments and cause authentication failures. Use
StandardCharsets.UTF_8 explicitly to ensure consistent encoding. Note that
executeGrammarAsUser has the same issue and should be updated too.

integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java [303-306]

 protected String createBasicAuthHeader(String username, String password) {
   return "Basic "
-      + java.util.Base64.getEncoder().encodeToString((username + ":" + password).getBytes());
+      + java.util.Base64.getEncoder().encodeToString((username + ":" + password).getBytes(java.nio.charset.StandardCharsets.UTF_8));
 }
Suggestion importance[1-10]: 5

__

Why: Using getBytes() without a charset can cause platform-dependent behavior. Specifying StandardCharsets.UTF_8 ensures consistent encoding across environments. The same issue exists in executeGrammarAsUser at line 288.

Low
Make shared password constant immutable

The STRONG_PASSWORD field is declared protected but not final, allowing subclasses
to accidentally override it and break authentication in all tests that rely on this
shared constant. It should be declared protected static final.

integ-test/src/test/java/org/opensearch/sql/security/SecurityTestBase.java [25]

-protected String STRONG_PASSWORD = "correcthorsebatterystaple";
+protected static final String STRONG_PASSWORD = "correcthorsebatterystaple";
Suggestion importance[1-10]: 4

__

Why: The STRONG_PASSWORD field is already declared as protected static final in the actual code (line 25 shows protected static final String STRONG_PASSWORD), so the 'existing_code' in the suggestion doesn't match the PR diff. The field is already correctly declared.

Low
Document and verify full dataset distribution in test

The confidential document range is [confidentialStart, LARGE_DATASET_SIZE), which
equals [75, 100) — only 25 documents. However, the test for totalDocs asserts it
equals expectedPublicDocs (50), implying only public docs are visible. This logic is
correct for the DLS assertion, but the variable confidentialStart and the loop
bounds should be verified to match the intended distribution (50 public, 25
internal, 25 confidential = 100 total), which they do. However, the
confidentialStart variable is unused after the loop and the count of confidential
docs is never validated, making the test incomplete for verifying the full document
distribution.

integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java [277-280]

-int confidentialStart = internalStart + internalCount;
+int publicCount = LARGE_DATASET_SIZE / 2;       // 50
+int internalStart = publicCount;                 // 50
+int internalCount = LARGE_DATASET_SIZE / 4;      // 25
+int confidentialStart = internalStart + internalCount; // 75
+int confidentialCount = LARGE_DATASET_SIZE - confidentialStart; // 25
+// confidentialCount is used to document intent; total = 50+25+25 = 100
 
-...
-
-for (int i = confidentialStart; i < LARGE_DATASET_SIZE; i++) {
-
Suggestion importance[1-10]: 2

__

Why: The suggestion only adds a comment variable to document intent without changing any logic. The existing distribution (50 public, 25 internal, 25 confidential) is already correct and the DLS assertions are valid, making this a very low-impact cosmetic change.

Low
Possible issue
Avoid fragile manual JSON escaping in DLS query

The DLS query uses manually escaped JSON with \", which is fragile and
error-prone. The createRoleWithDLS method embeds the dlsQuery string directly into a
String.format template that is already a JSON string value, so the escaping must be
exact. Consider using a JSON library to serialize the query object, or at minimum
add a comment and a test to verify the resulting JSON is valid.

integ-test/src/test/java/org/opensearch/sql/security/FGACIndexScanningIT.java [383-385]

-createRoleWithDLS(
-    LIMITED_ROLE, SECURE_LOGS, "{\\\"match\\\":{\\\"security_level\\\":\\\"public\\\"}}");
+// The DLS query must be a valid JSON string value (double-escaped for embedding in JSON)
+String dlsQuery = new JSONObject()
+    .put("match", new JSONObject().put("security_level", "public"))
+    .toString()
+    .replace("\"", "\\\"");
+createRoleWithDLS(LIMITED_ROLE, SECURE_LOGS, dlsQuery);
Suggestion importance[1-10]: 5

__

Why: The manually escaped DLS query string {\\\"match\\\":{\\\"security_level\\\":\\\"public\\\"}} is fragile and hard to maintain. Using a JSON library to construct and escape the query would be more robust and readable.

Low

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

maintenance Improves code quality, but not the product skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. stalled

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants