NO-ISSUE: [release-4.21] Stabilize e2e-ocl test suite#5667

Open
umohnani8 wants to merge 3 commits into openshift:release-4.21 from umohnani8:4.21-ocl

@umohnani8 umohnani8 commented Feb 17, 2026

This is a manual backport of #5652, #5613, and #5595 to stabilize the e2e-ocl test suite.

This helps complete https://issues.redhat.com/browse/MCO-2130

Three fixes to address intermittent test failures in CI:

1. TestControllerEventuallyReconciles timeout issue:
   - Increased job completion timeout from 10 to 20 minutes
   - Test simulates adverse conditions (scaled down deployments)
   - Image builds can take longer in resource-constrained CI environments

2. Rate limiter exhaustion in log streaming:
   - Increased log streaming retry interval from 2s to 5s
   - Multiple concurrent goroutines were making API calls too frequently
   - 60% reduction in API call rate prevents rate limiter exhaustion

3. HTTP/2 connection errors failing tests:
   - Made log streaming errors non-fatal (log warnings instead)
   - API server closes long-running log streams when pods terminate
   - Log collection is for debugging, not a test requirement
   - Tests now pass/fail based on actual functionality
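
Fixes 2 and 3 above can be illustrated with a minimal sketch. It is not the code from this PR: `streamFunc`, `streamLogsBestEffort`, `callRateReduction`, and `runDemo` are hypothetical names, and the simulated error only stands in for a stream closed by the API server. The sketch retries at a fixed interval, records failures as warnings instead of returning them, and checks the 60% figure for the 2s to 5s change.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Retry intervals taken from the PR description.
const (
	oldRetryInterval = 2 * time.Second
	newRetryInterval = 5 * time.Second
)

// streamFunc is a hypothetical stand-in for one attempt at streaming pod logs.
type streamFunc func(ctx context.Context) error

// streamLogsBestEffort retries stream at a fixed interval until it succeeds or
// ctx is done. Failures are collected as warnings and never returned as
// errors, so the caller's result does not depend on log collection.
func streamLogsBestEffort(ctx context.Context, stream streamFunc, interval time.Duration) []string {
	var warnings []string
	for {
		err := stream(ctx)
		if err == nil {
			return warnings
		}
		warnings = append(warnings, fmt.Sprintf("log streaming failed (non-fatal): %v", err))
		select {
		case <-ctx.Done():
			return warnings
		case <-time.After(interval):
		}
	}
}

// callRateReduction returns the fractional drop in calls per unit time when
// the retry interval grows from oldInterval to newInterval.
func callRateReduction(oldInterval, newInterval time.Duration) float64 {
	return 1 - float64(oldInterval)/float64(newInterval)
}

// runDemo streams from a fake source that fails twice and then succeeds.
// It uses a short interval so the demo finishes quickly.
func runDemo() (warnings, attempts int) {
	flaky := func(ctx context.Context) error {
		attempts++
		if attempts < 3 {
			return errors.New("simulated: stream closed by server")
		}
		return nil
	}
	w := streamLogsBestEffort(context.Background(), flaky, 10*time.Millisecond)
	return len(w), attempts
}

func main() {
	warnings, attempts := runDemo()
	fmt.Println("warnings:", warnings, "attempts:", attempts)
	fmt.Printf("call rate reduction: %.0f%%\n",
		callRateReduction(oldRetryInterval, newRetryInterval)*100)
}
```

Returning warnings rather than an error keeps the pass/fail decision with the assertions that exercise actual functionality.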

External image registries (Docker Hub, GitHub Container Registry) have
changed their API error responses over time:
- Docker.io now returns imageNotFound for nonexistent repos (was accessDenied)
- ghcr.io now returns imageNotFound for nonexistent tags (was accessDenied)

Updated test to accept either error type when both flags are set:
- Modified inspectTestFunc and deleteTestFunc to treat both flags as "accept either"
- Updated Docker.io inspect/nonexistentRepo case to accept both error types
- Updated GitHub registry delete/nonexistentTag case to accept both error types

Both error types are tolerable for ImagePruner functionality, so tests
should not be brittle to registry-specific error response changes.
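
The "accept either" behavior can be sketched roughly as follows. The `expectation` type, its field names, and the sentinel errors are assumptions made for illustration; the real flags used by inspectTestFunc and deleteTestFunc are not shown in this description.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the two registry responses.
var (
	errImageNotFound = errors.New("image not found")
	errAccessDenied  = errors.New("access denied")
)

// expectation is a hypothetical test-case shape: each flag marks an error
// type that the case tolerates.
type expectation struct {
	imageNotFound bool
	accessDenied  bool
}

// matches reports whether err satisfies the expectation. With both flags set,
// either error type is accepted. With no flag set, only a nil error is.
func (e expectation) matches(err error) bool {
	if !e.imageNotFound && !e.accessDenied {
		return err == nil
	}
	if e.imageNotFound && errors.Is(err, errImageNotFound) {
		return true
	}
	if e.accessDenied && errors.Is(err, errAccessDenied) {
		return true
	}
	return false
}

func main() {
	either := expectation{imageNotFound: true, accessDenied: true}
	fmt.Println(either.matches(errImageNotFound))
	fmt.Println(either.matches(fmt.Errorf("inspect: %w", errAccessDenied)))
	fmt.Println(either.matches(nil))
}
```

With this shape, a case that sets only one flag stays strict, so only the registry cases known to have changed become lenient.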

The cleanupEphemeralBuildObjects function was experiencing intermittent
timeout failures during cleanup verification, even though deletions
were succeeding.

Each verification used the default 5-minute timeout with a 1s poll
interval, which could exhaust rate limits and lead to the "context
deadline exceeded" error.

Create a dedicated 2-minute timeout context for cleanup verification
and increase the poll interval from 1s to 3s to reduce the API call
rate, which works out to about 40 attempts per resource.
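
A rough sketch of the polling change, using only the standard library. `waitForDeletion`, `maxAttempts`, and `demoCleanup` are hypothetical helpers, not the functions in this PR, and the demo scales the timings down so it runs quickly. The constants mirror the values stated above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Values from the commit message: old defaults and the new dedicated settings.
const (
	oldTimeout          = 5 * time.Minute
	oldPollInterval     = 1 * time.Second
	cleanupTimeout      = 2 * time.Minute
	cleanupPollInterval = 3 * time.Second
)

// maxAttempts returns how many polls fit in timeout at the given interval.
func maxAttempts(timeout, interval time.Duration) int {
	return int(timeout / interval)
}

// waitForDeletion polls isGone every interval until it reports true, returns
// an error, or ctx expires.
func waitForDeletion(ctx context.Context, interval time.Duration, isGone func() (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		gone, err := isGone()
		if err != nil {
			return err
		}
		if gone {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("cleanup verification: %w", ctx.Err())
		case <-ticker.C:
		}
	}
}

// demoCleanup runs the poll loop under its own timeout context, with timings
// scaled down from minutes and seconds to milliseconds.
func demoCleanup() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()
	polls := 0
	err := waitForDeletion(ctx, 5*time.Millisecond, func() (bool, error) {
		polls++
		return polls >= 3, nil // the fake resource disappears on the third poll
	})
	return polls, err
}

func main() {
	fmt.Println("old max attempts:", maxAttempts(oldTimeout, oldPollInterval))
	fmt.Println("new max attempts:", maxAttempts(cleanupTimeout, cleanupPollInterval))
	polls, err := demoCleanup()
	fmt.Println("polls:", polls, "err:", err)
}
```

Giving verification its own context keeps its deadline independent of whatever time the surrounding test has already consumed.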

Signed-off-by: Urvashi <umohnani@redhat.com>
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 17, 2026

@umohnani8: This pull request explicitly references no jira issue.

In response to this:

This is a manual backport of #5652, #5613, and #5595 to stabilize the e2e-ocl test suite.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


openshift-ci Bot commented Feb 17, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: umohnani8

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 17, 2026

openshift-ci Bot commented Feb 25, 2026

@umohnani8: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
