tests: Add bootc end-to-end validation test #126

martinpitt · 2025-06-03T08:47:01Z

Add a knob to tests/tasks/install_and_check.yml to skip the role run, as we don't want to do this during validation.

This uses a new test, as e.g. tests/tests_default.yml are already just wrappers
around task includes, and modifying them would be messier and harder to
understand than a new test.

The other commits are identical to linux-system-roles/sudo#59 and linux-system-roles/.github#114 .

Summary by Sourcery

Add support for end-to-end validation of bootc container images via QEMU and update related CI, defaults, and test logic to accommodate bootc scenarios

New Features:

Add a bootc end-to-end Ansible test playbook to build and validate container images via QEMU
Introduce a __run_role flag to skip role execution during validation runs

Enhancements:

Disable PostgreSQL server tuning by default when using buildah container connection
Update install_and_check tasks to respect __run_role and skip tuning checks under bootc validation

CI:

Bump tox-lsr to v3.10.0 in all CI workflows
Add a Podman 5.x upgrade step in QEMU integration workflow for bootc-image-builder compatibility
Add a CI job to run bootc validation tests in QEMU

Documentation:

Clarify in README that server tuning is disabled during container builds

Tests:

Add tests/tests_bootc_e2e.yml for bootc validation
Adjust existing test tasks to include conditional logic for bootc validation flags

sourcery-ai · 2025-06-03T08:47:06Z

Reviewer's Guide

Integrates bootc end-to-end validation into CI by adding a dedicated test playbook, gating role execution with a new flag, disabling tuning in buildah containers, and updating GitHub workflows for tox-lsr 3.10.0 and Podman 5.x support.

Sequence Diagram for Bootc End-to-End Validation Test

sequenceDiagram
    participant CI as GitHub Actions
    participant Playbook as tests_bootc_e2e.yml
    participant Builder as bootc-image-builder
    participant TargetEnv as QEMU VM / Deployed Image

    CI->>Playbook: Trigger bootc E2E test
    Playbook->>Playbook: Prepare environment (e.g., Podman 5.x)
    Playbook->>Builder: Build bootc image
    Builder-->>Playbook: Image built
    Playbook->>TargetEnv: Deploy/Run image
    Playbook->>TargetEnv: Validate deployed image
    TargetEnv-->>Playbook: Validation results
    Playbook->>CI: Report test status (pass/fail)

Flow Diagram for `postgresql_server_tuning` Default Logic

graph TD
    subgraph "Setting postgresql_server_tuning Default"
        Start[Load defaults/main.yml] --> CheckConn{ansible_connection == 'buildah'?};
        CheckConn -- "Yes" --> SetFalse[postgresql_server_tuning = false];
        CheckConn -- "No" --> SetTrue[postgresql_server_tuning = true];
        SetFalse --> EndValue[Value Set];
        SetTrue --> EndValue[Value Set];
    end

File-Level Changes

Change	Details	Files
Introduce __run_role flag to optionally skip role execution in validation tasks	Add __run_role variable check around include_role in install_and_check.yml Wrap postgresql role invocation with when: __run_role	d(true)
Disable server tuning by default in buildah container builds	Change postgresql_server_tuning default to '{{ false if ansible_connection == "buildah" else true }}' Annotate rationale for disabling tuning in defaults/main.yml	`defaults/main.yml`
Document buildah-specific tuning behavior in README	Add note under configuration section explaining buildah connection disables tuning by default	`README.md`
Add new bootc end-to-end test playbook	Create tests/tests_bootc_e2e.yml to build bootc image and validate deployed image Use include_role and include_tasks with __run_role and __bootc_validation flags in playbook	`tests/tests_bootc_e2e.yml`
Bump tox-lsr dependency to v3.10.0 in CI workflows	Update pip3 install pin from 3.9.0 to 3.10.0 in ansible-lint.yml Update pip3 install pin from 3.9.0 to 3.10.0 in ansible-managed-var-comment.yml Update pip3 install pin from 3.9.0 to 3.10.0 in ansible-test.yml Update pip3 install pin from 3.9.0 to 3.10.0 in qemu-kvm-integration-tests.yml	`.github/workflows/ansible-lint.yml` `.github/workflows/ansible-managed-var-comment.yml` `.github/workflows/ansible-test.yml` `.github/workflows/qemu-kvm-integration-tests.yml`
Enhance QEMU CI for Podman 5.x compatibility and bootc validation	Add apt source override and package pinning to install Podman 5.x in qemu-kvm-integration-tests.yml Adjust tox invocation to skip tests::bootc-e2e tag Introduce 'Run bootc validation tests in QEMU' step that iterates QCOW2 images and logs PASS/FAIL	`.github/workflows/qemu-kvm-integration-tests.yml`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey @martinpitt - I've reviewed your changes and they look great!

Here's what I looked at during the review

🟢 General issues: all looks good
🟢 Security: all looks good
🟢 Review instructions: all looks good
🟢 Testing: all looks good
🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

martinpitt · 2025-06-03T09:06:49Z

That first version fails as expected, already noticed in https://issues.redhat.com/browse/RHEL-88396:

 postgresql.service - PostgreSQL database server
     Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; preset: disabled)
     Active: failed (Result: exit-code) since Tue 2025-06-03 09:06:07 UTC; 23s ago
    Process: 876 ExecStartPre=/usr/libexec/postgresql-check-db-dir postgresql (code=exited, status=0/SUCCESS)
    Process: 910 ExecStart=/usr/bin/postmaster -D ${PGDATA} (code=exited, status=1/FAILURE)
   Main PID: 910 (code=exited, status=1/FAILURE)
        CPU: 31ms

Jun 03 09:06:07 localhost.localdomain systemd[1]: Starting PostgreSQL database server...
Jun 03 09:06:07 localhost.localdomain postmaster[910]: 2025-06-03 09:06:07.232 UTC [910] FATAL:  could not map anonymous shared memory: Cannot allocate memory
Jun 03 09:06:07 localhost.localdomain postmaster[910]: 2025-06-03 09:06:07.232 UTC [910] HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 4250107904 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
Jun 03 09:06:07 localhost.localdomain postmaster[910]: 2025-06-03 09:06:07.232 UTC [910] LOG:  database system is shut down
Jun 03 09:06:07 localhost.localdomain systemd[1]: postgresql.service: Main process exited, code=exited, status=1/FAILURE
Jun 03 09:06:07 localhost.localdomain systemd[1]: postgresql.service: Failed with result 'exit-code'.
Jun 03 09:06:07 localhost.localdomain systemd[1]: Failed to start PostgreSQL database server.

When running the role during a container build, the shared_buffers auto-detection is done based on the container RAM, which can be significantly higher than the VM's -- and either way it is completely unrelated. This needs to be disabled.

sourcery-ai

Hey @martinpitt - I've reviewed your changes - here's some feedback:

Consolidate the repeated tox-lsr installation steps across workflows into a shared template or composite action to reduce duplication and maintenance overhead.
Define sensible defaults for the __run_role and __bootc_validation variables in role defaults or a common vars file so tests have predictable behavior and avoid false negatives.
Add a clear TODO or expiration note around the ‘plucky’ Podman pinning hack, so it’s removed once Ubuntu 26.04 LTS is in place and the workaround is no longer needed.

Here's what I looked at during the review

🟡 General issues: 1 issue found
🟢 Security: all looks good
🟢 Review instructions: all looks good
🟢 Testing: all looks good
🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

README.md

richm · 2025-06-03T16:21:14Z

tests/tests_bootc_e2e.yml

+          changed_when: true
+
+    - name: Validation of deployed image
+      when: ansible_connection | d("") != "buildah"


Under what conditions can ansible_connection be undefined? Because above you reference it without the | d() in postgresql_server_tuning: "{{ false if ansible_connection == 'buildah' else true }}" ?

ansible_connection is undefined for QEMU runs, i.e. when not using an external connection plugin.

Without the | d() it fails like this:

TASK [Flush handlers] ****************************************************************************************************************** task path: /var/home/martin/upstream/lsr/postgresql/tests/tasks/install_and_check.yml:11 Tuesday 03 June 2025 18:48:02 +0200 (0:00:00.017) 0:00:00.082 ********** ERROR! The conditional check 'ansible_connection != "buildah"' failed. The error was: error while evaluating conditional (ansible_connection != "buildah"): 'ansible_connection' is undefined. 'ansible_connection' is undefined The error appears to be in '/var/home/martin/upstream/lsr/postgresql/tests/tasks/install_and_check.yml': line 11, column 7, but may be elsewhere in the file depending on the exact syntax problem.

Apparently Ansible gets along with testing for equality == against an undefined value, but not with !=. However, that's confusing, and I applied the default everywhere now for consistency.

ansible_connection is undefined for QEMU runs, i.e. when not using an external connection plugin.

That's really strange. I cannot reproduce this behavior. No matter what I try, I get "ansible_connection": "ssh" unless I explicitly specify another connection type.

Without the | d() it fails like this:

TASK [Flush handlers] ****************************************************************************************************************** task path: /var/home/martin/upstream/lsr/postgresql/tests/tasks/install_and_check.yml:11 Tuesday 03 June 2025 18:48:02 +0200 (0:00:00.017) 0:00:00.082 ********** ERROR! The conditional check 'ansible_connection != "buildah"' failed. The error was: error while evaluating conditional (ansible_connection != "buildah"): 'ansible_connection' is undefined. 'ansible_connection' is undefined The error appears to be in '/var/home/martin/upstream/lsr/postgresql/tests/tasks/install_and_check.yml': line 11, column 7, but may be elsewhere in the file depending on the exact syntax problem.

Can you provide your exact tox command line?

Apparently Ansible gets along with testing for equality == against an undefined value, but not with !=.

That is even stranger.

However, that's confusing, and I applied the default everywhere now for consistency.

Prep (works):

LSR_CONTAINER_PROFILE=false LSR_CONTAINER_PRETTY=false tox -e container-ansible-core-2.16 -- --image-name centos-9-bootc tests/tests_bootc_e2e.yml

Validate:

tox -e qemu-ansible-core-2.16 -- --image-file `pwd`/tests/tmp/tests_bootc_e2e/qcow2/disk.qcow2 --log-level=debug -e __bootc_validation=true -- tests/tests_bootc_e2e.yml

Perhaps you forgot the -e __bootc_validation=true?

The variable reference is in a handler - code run in handlers is in a strange variable namespace - I have seen cases where vars defined in the playbook are not available in the handler - but this one doesn't make any sense - if I put a debug var: ansible_connection at the beginning of the playbook, it prints the value - I have no idea why it would be removed before the handler is run - I'll chalk this up to a bug in ansible that we have to workaround.

martinpitt · 2025-06-04T04:44:04Z

The prep commits landed in the meantime in #127, so rebasing.

Cause: Detecting available RAM and basing the shared memory settings of the server on that is futile and wrong during a container image build. The available RAM during that build is independent of the actual runtime, be that a bootc deployment or a system container. Consequence: postgresql.service failed to start when the available RAM during container build was higher than at deployment time. In the opposite case, the shared memory tuning was too small and thus also wrong. Fix: Disable `postgresql_server_tuning` by default for container builds, i.e. when being connected to a buildah instance.

Add a knob to tests/tasks/install_and_check.yml to skip the role run, as we don't want to do this during validation. Skip the memory tuning checks for bootc validation, as running the role during a container build disables the automatic tuning.

martinpitt requested review from richm and spetrosi as code owners June 3, 2025 08:47

sourcery-ai bot reviewed Jun 3, 2025

View reviewed changes

martinpitt force-pushed the qemu-bootc branch from f2aecfc to d959816 Compare June 3, 2025 08:50

martinpitt marked this pull request as draft June 3, 2025 08:55

martinpitt removed request for richm and spetrosi June 3, 2025 08:55

martinpitt force-pushed the qemu-bootc branch 3 times, most recently from 101b9fe to 272e0f4 Compare June 3, 2025 09:56

martinpitt marked this pull request as ready for review June 3, 2025 09:56

martinpitt requested a review from richm June 3, 2025 09:56

sourcery-ai bot reviewed Jun 3, 2025

View reviewed changes

README.md Outdated Show resolved Hide resolved

martinpitt force-pushed the qemu-bootc branch from 272e0f4 to ecfbc77 Compare June 3, 2025 10:01

richm reviewed Jun 3, 2025

View reviewed changes

martinpitt force-pushed the qemu-bootc branch 2 times, most recently from 397646d to 7e74d99 Compare June 3, 2025 17:01

martinpitt requested a review from richm June 3, 2025 17:14

richm approved these changes Jun 3, 2025

View reviewed changes

martinpitt added 2 commits June 4, 2025 06:44

martinpitt force-pushed the qemu-bootc branch from 7e74d99 to 59974c9 Compare June 4, 2025 04:44

martinpitt merged commit 18be022 into linux-system-roles:main Jun 4, 2025
18 checks passed

martinpitt deleted the qemu-bootc branch June 4, 2025 04:57

tests: Add bootc end-to-end validation test #126

tests: Add bootc end-to-end validation test #126

Uh oh!

Conversation

martinpitt commented Jun 3, 2025 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Uh oh!

sourcery-ai bot commented Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence Diagram for Bootc End-to-End Validation Test

Flow Diagram for postgresql_server_tuning Default Logic

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

martinpitt commented Jun 3, 2025

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

richm Jun 3, 2025

Choose a reason for hiding this comment

Uh oh!

martinpitt Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

richm Jun 3, 2025

Choose a reason for hiding this comment

Uh oh!

martinpitt Jun 3, 2025

Choose a reason for hiding this comment

Uh oh!

richm Jun 3, 2025

Choose a reason for hiding this comment

Uh oh!

martinpitt commented Jun 4, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

martinpitt commented Jun 3, 2025 •

edited by sourcery-ai bot

Loading

sourcery-ai bot commented Jun 3, 2025 •

edited

Loading

Flow Diagram for `postgresql_server_tuning` Default Logic

martinpitt Jun 3, 2025 •

edited

Loading