feat(wg-easy): add weekly automated chart validation tests #91

Open
adamancini wants to merge 5 commits into main from add-wg-easy-weekly-tests

@adamancini
Member

Summary

Adds automated weekly testing workflow to ensure the wg-easy Helm chart remains installable and functional, even during periods of low development activity.

Problem

The current GitHub Actions workflows only run when there's active development on the wg-easy chart. This means:

  • Breakages from external dependencies or infrastructure changes go unnoticed
  • Users may encounter installation failures that could have been caught proactively
  • The team isn't automatically notified when the chart stops working

Solution

Created a new weekly test workflow (.github/workflows/wg-easy-weekly-test.yaml) that:

  • Runs automatically every Monday at 9:00 AM UTC
  • Tests the full lifecycle: validation → packaging → deployment → health checks
  • Creates GitHub Issues automatically when tests fail
  • Notifies on recovery by commenting on issues when tests pass again
  • Uses resources efficiently with a stable "weekly-test" channel/customer and a single cluster config
  • Eases debugging with uploaded logs and direct workflow links
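
As a rough illustration of the schedule described above, the sketch below computes when a "every Monday at 9:00 AM UTC" trigger would next fire. The workflow file itself is not shown in this conversation, so the cron expression `0 9 * * 1` and the function name `next_weekly_run` are assumptions made for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Assumed cron expression for "every Monday at 09:00 UTC" (not taken from the PR diff).
ASSUMED_CRON = "0 9 * * 1"


def next_weekly_run(now: datetime) -> datetime:
    """Return the next Monday 09:00 UTC strictly after `now`.

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=9, minute=0, second=0, microsecond=0)
    # datetime.weekday() returns 0 for Monday.
    days_ahead = (0 - candidate.weekday()) % 7
    candidate += timedelta(days=days_ahead)
    if candidate <= now:
        # Already at or past this week's slot, so move to next week.
        candidate += timedelta(days=7)
    return candidate
```

Scheduled workflows on GitHub Actions are evaluated in UTC, which is why the sketch normalizes the input to UTC before comparing.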

What Gets Tested

  1. Chart Validation - Lints and validates all charts using existing validation actions
  2. Chart Packaging - Packages charts for release
  3. Resource Creation - Creates/reuses Replicated channel and customer resources
  4. Deployment - Creates fresh k3s cluster (v1.35) and deploys the chart
  5. Health Checks - Verifies pods are running and application is healthy
  6. Cleanup - Automatically removes test cluster after completion

Changes

  • 📄 New file: .github/workflows/wg-easy-weekly-test.yaml - Weekly test workflow
  • 📝 Updated: applications/wg-easy/README.md - Added "Automated Testing" documentation section

Implementation Details

  • Reuses existing PR validation logic for consistency
  • Tests on latest stable Kubernetes (v1.35) to balance coverage and cost
  • Uses idempotent resource creation (reuses channel/customer when possible)
  • Runs in ~15-20 minutes total
  • Can be triggered manually via workflow_dispatch

Notifications

On Test Failure:

  • Creates GitHub Issue with automated-test, wg-easy, and bug labels
  • Includes direct link to failed workflow run
  • If issue exists for the same day, adds comment instead of creating duplicate

On Test Success (after previous failure):

  • Comments on most recent open test failure issue
  • Suggests closing if problem is resolved
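
The notification rules above can be summarized as a small decision function. This is a minimal sketch, not the workflow's actual implementation: the `Issue` type, the `plan_notification` name, and the choice to identify failure issues by the `automated-test` label and compare them by creation date are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence, Tuple

FAILURE_LABEL = "automated-test"  # label named in the PR description


@dataclass(frozen=True)
class Issue:
    """Hypothetical, simplified view of a GitHub issue."""
    number: int
    created: date
    labels: Tuple[str, ...]
    is_open: bool = True


def plan_notification(
    passed: bool, issues: Sequence[Issue], today: date
) -> Tuple[str, Optional[int]]:
    """Return (action, issue_number); action is 'create', 'comment', or 'none'."""
    # Open failure issues, most recent first.
    open_failures = sorted(
        (i for i in issues if i.is_open and FAILURE_LABEL in i.labels),
        key=lambda i: (i.created, i.number),
        reverse=True,
    )
    if passed:
        # Recovery: comment on the most recent open failure issue, if any.
        if open_failures:
            return ("comment", open_failures[0].number)
        return ("none", None)
    # Failure: reuse an issue opened today instead of creating a duplicate.
    for issue in open_failures:
        if issue.created == today:
            return ("comment", issue.number)
    return ("create", None)
```

Keeping the decision separate from the API calls that create or comment on issues makes the same-day de-duplication rule easy to check without touching GitHub.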

Benefits

🎯 Proactive Monitoring - Catch breakages before users encounter them
🔔 Automatic Notifications - Team is immediately alerted via GitHub Issues
🤖 Zero Maintenance - Fully automated, no manual intervention needed
💰 Cost Efficient - Single cluster per week, reused resources
🔍 Easy Debugging - Full logs and workflow links in issue reports

Testing

  • Workflow syntax validated
  • Reuses existing, tested action components
  • Will validate on first scheduled run (next Monday)

Related

This addresses the need for continuous validation independent of development cycles, as discussed in the project requirements.

Add automated weekly testing workflow to ensure the wg-easy Helm chart
remains installable and functional during periods of low development activity.

Changes:
- Add weekly-test.yaml workflow that runs every Monday at 9:00 AM UTC
- Test chart validation, packaging, deployment, and functionality
- Create GitHub Issues automatically on test failures
- Comment on issues when tests recover to passing state
- Reuse stable "weekly-test" channel/customer for efficiency
- Update README.md with automated testing documentation

Benefits:
- Proactive monitoring catches breakages before users encounter them
- Automatic notifications via GitHub Issues
- Zero maintenance - runs completely automated
- Cost efficient with single cluster configuration
- Easy debugging with uploaded logs and workflow links

This allows testing the workflow on the PR before merge.
Will be removed after validation.

The wg-easy pod deploys to the wg-easy namespace, not default.
Update health check to look in the correct namespace.

The workflow has been validated and works correctly.
Removing the temporary pull_request trigger before merge.

@adamancini
Member Author

✅ Testing Complete

The weekly test workflow has been successfully validated!

Test Results

Workflow Run: https://github.com/replicatedhq/platform-examples/actions/runs/21690452740

All steps completed successfully:

  • ✅ Chart validation and linting
  • ✅ Chart packaging
  • ✅ Replicated resource creation (channel/customer)
  • ✅ Cluster provisioning (k3s v1.35)
  • ✅ Full application deployment
  • ✅ Health checks (fixed namespace bug: default → wg-easy)
  • ✅ Cleanup

Bug Found & Fixed

During testing, I discovered that the health check was looking in the wrong namespace:

  • Issue: Checked default namespace instead of wg-easy
  • Fix: Updated health check to use correct namespace
  • Result: All health checks now pass ✅
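
To illustrate the kind of fix described above, here is a minimal sketch of a health check that always passes the namespace explicitly rather than relying on the default. The workflow's real health check step is not shown in this conversation; the function names and the 300-second timeout are assumptions, and only standard `kubectl wait` flags are used.

```python
import subprocess
from typing import List


def build_wait_command(namespace: str, timeout_seconds: int = 300) -> List[str]:
    """Build a `kubectl wait` command that targets an explicit namespace."""
    return [
        "kubectl", "wait",
        "--for=condition=Ready",
        "pods", "--all",
        "--namespace", namespace,
        f"--timeout={timeout_seconds}s",
    ]


def pods_ready(namespace: str, timeout_seconds: int = 300) -> bool:
    """Run the wait command; True only if kubectl exits successfully."""
    result = subprocess.run(
        build_wait_command(namespace, timeout_seconds),
        capture_output=True,
        text=True,
        check=False,  # inspect the exit code instead of raising
    )
    return result.returncode == 0
```

Calling `pods_ready("wg-easy")` corresponds to the corrected check; the bug amounted to the equivalent of checking the `default` namespace.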

Total Runtime

~8 minutes from start to finish

Ready to Merge

  • Temporary PR trigger has been removed
  • Workflow is production-ready
  • Will run automatically every Monday at 9:00 AM UTC

Add automatic test status tracking to the wg-easy README:

Changes:
- Add test status section to README with markers for automation
- Add update-readme-status job to weekly test workflow
- Automatically updates status table with test results
- Includes status, timestamp, Kubernetes version, and workflow link
- Commits changes directly to main on scheduled runs only

Benefits:
- Users can see test status at a glance in README
- Status is always current (updated weekly)
- Provides direct link to latest test run
- No manual status updates needed

@adamancini
Member Author

🎨 Added: Automated Test Status in README

New Feature

The workflow now automatically updates the README with test results!

What was added:

  1. Test Status Section in README - Added a "Test Status" table at the top of the README that shows:

    • Current test status (✅ Passing / ❌ Failed)
    • Last test date/time
    • Kubernetes version tested
    • Direct link to test run

  2. Automated Updates - New update-readme-status job in the workflow:

    • Runs after all tests complete (on scheduled runs only)
    • Updates the status section in README automatically
    • Commits changes directly to main branch
    • Uses [skip ci] to prevent triggering other workflows

How It Works

<!-- TEST_STATUS_START -->
## Test Status

| Component | Status | Last Tested | Kubernetes Version | Details |
|-----------|--------|-------------|-------------------|---------|
| Chart Installation | ✅ Passing | 2026-02-04 22:22 UTC | v1.35 | [View Run](link) |
<!-- TEST_STATUS_END -->

The markers <!-- TEST_STATUS_START --> and <!-- TEST_STATUS_END --> allow the workflow to locate and replace the status section.
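
The marker-based replacement can be sketched as follows. This is an illustrative approximation under stated assumptions, not the code in the update-readme-status job: the function names `render_status` and `replace_status_section` are hypothetical, and the table layout is copied from the example above.

```python
import re

START = "<!-- TEST_STATUS_START -->"
END = "<!-- TEST_STATUS_END -->"


def render_status(passed: bool, tested_at: str, k8s_version: str, run_url: str) -> str:
    """Render the status table shown in the PR description."""
    status = "✅ Passing" if passed else "❌ Failed"
    return "\n".join([
        "## Test Status",
        "",
        "| Component | Status | Last Tested | Kubernetes Version | Details |",
        "|-----------|--------|-------------|-------------------|---------|",
        f"| Chart Installation | {status} | {tested_at} | {k8s_version} "
        f"| [View Run]({run_url}) |",
    ])


def replace_status_section(readme: str, body: str) -> str:
    """Replace everything between the markers, keeping the markers themselves."""
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    replacement = f"{START}\n{body}\n{END}"
    # A function replacement avoids interpreting backslashes in `body`.
    updated, count = pattern.subn(lambda _match: replacement, readme, count=1)
    if count == 0:
        raise ValueError("status markers not found in README")
    return updated
```

Because the markers are preserved, running the replacement twice with the same input leaves the README unchanged, which keeps repeated scheduled runs from producing spurious diffs.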

Benefits

  • Always current - Status updates automatically every Monday
  • At-a-glance visibility - Users see test status in the README immediately
  • Full transparency - Direct link to test run for detailed results
  • Zero maintenance - No manual updates needed

First Update

The first automated update will occur after the next scheduled run (Monday at 9:00 AM UTC).

@xavpaice
Member

Needs a rebase, but the other question I have is if we can re-use any of the stuff in .github/workflows/wg-easy-pr-validation.yaml? Adding an entirely new workflow looks to be adding a bunch of duplication and given that these are complex workflows with lots of shell, I'd rather keep them as simple as possible.
