Add ci-pipeline-monitor Copilot CLI skill #125809
JulieLeeMSFT wants to merge 50 commits into dotnet:main from
Conversation
Automates monitoring of 20+ CI stress/PGO test pipelines on Azure DevOps, triaging failures, and generating weekly reports with GitHub issue tracking. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds a new ci-pipeline-monitor Copilot CLI skill under .github/skills/ to collect Azure DevOps pipeline/test data into a SQLite DB, download Helix logs, generate a standardized markdown report, and validate results before publishing.
Changes:
- Introduces Python scripts to (1) create/populate monitor.db, (2) extract failed tests via the AzDO Test Results API, (3) download Helix console logs, (4) generate a report, and (5) validate DB/report/log consistency.
- Adds templates and reference docs that define the workflow, report/debug-log formats, and “verbatim copy-paste” rules for failure text.
- Adds a pipeline definition inventory (pipelines.md) and ignores generated artifacts via .gitignore.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/skills/ci-pipeline-monitor/scripts/setup_and_fetch_builds.py | Creates schema and populates pipelines with latest build results via AzDO Builds API. |
| .github/skills/ci-pipeline-monitor/scripts/extract_failed_tests.py | Queries AzDO Test Results API for failing tests and inserts rows into test_results. |
| .github/skills/ci-pipeline-monitor/scripts/fetch_helix_logs.py | Downloads Helix console logs and updates test_results with exit_code/console_log_path. |
| .github/skills/ci-pipeline-monitor/scripts/generate_report.py | Formats a report from DB contents into logs/ following the template. |
| .github/skills/ci-pipeline-monitor/scripts/validate_results.py | Performs DB/report/log validation checks, including GitHub search verification for “NEW” failures. |
| .github/skills/ci-pipeline-monitor/report-template.md | Defines the expected report structure/fields. |
| .github/skills/ci-pipeline-monitor/log-template.md | Defines the expected debug log structure/steps. |
| .github/skills/ci-pipeline-monitor/references/triage-workflow.md | Detailed triage workflow instructions for the LLM/operator. |
| .github/skills/ci-pipeline-monitor/references/verbatim-rules.md | Rules for verbatim extraction of error messages/stack traces from logs. |
| .github/skills/ci-pipeline-monitor/references/validation-checks.md | Documents the validation step and what checks are performed. |
| .github/skills/ci-pipeline-monitor/pipelines.md | Lists monitored pipelines and cached definition IDs. |
| .github/skills/ci-pipeline-monitor/SKILL.md | Skill definition and end-to-end workflow documentation. |
| .github/skills/ci-pipeline-monitor/README.md | User-facing setup and usage instructions for the skill. |
| .github/skills/ci-pipeline-monitor/.gitignore | Ignores generated DB/logs/JSON outputs. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
```python
        f"Error getting token from az cli (exit code {result.returncode}): {result.stderr.strip()}",
        file=sys.stderr,
    )
    return ""
token = result.stdout.strip()
```
This paged AzDO results request also lacks a timeout / exception handling. Consider adding a timeout and handling RequestException (e.g., retry or break out with a warning) so one stalled request doesn’t block the full extraction.
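A minimal sketch of the suggested guard, assuming the script uses stdlib urllib; the timeout, retry count, and backoff are illustrative, and the `opener`/`sleep` parameters exist only to make the retry logic testable:

```python
import time
import urllib.error
import urllib.request

def fetch_page(url, timeout=30, max_retries=3,
               opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch one page of AzDO results, retrying on transient errors
    instead of letting a stalled request block the full extraction."""
    for attempt in range(1, max_retries + 1):
        try:
            with opener(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError) as exc:
            print(f"warning: attempt {attempt}/{max_retries} for {url} failed: {exc}")
            if attempt == max_retries:
                return None  # give up on this page; caller moves on with a warning
            sleep(2 ** attempt)  # simple exponential backoff between retries
```

The caller can treat a `None` return as "skip this page and continue with the remaining pipelines".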
…available. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Remove all gh CLI usage (subprocess calls, auth login fallback) - Use unauthenticated GitHub Search API (urllib) for check 16d - Add rate-limit sleep between requests (10 req/min) to handle >10 NEW failures - Skip sleep after last request - Remove unused subprocess import - Update README auth section to match Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Clarify that these fields are initially populated from the AzDO Test Results API and may be enriched or overwritten by the agent with verbatim console-log snippets during triage. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Match the report-template.md spec: the validation warning banner should appear after the 'End of report' footer, not before it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Count pipelines with 0 test failures as 'monitored' not 'skipped'. Only private pipelines are truly skipped. Fixes header showing '11 monitored' instead of '21 monitored'. - Cap test names per failure group in Pipeline Summary at 5, with '... and N more' for the rest. Prevents 500+ line reports from mass failures like the 3152 ARM64 HW intrinsics regression. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Match fetch_helix_logs.py behavior: collect all exit codes from the console log and prefer the first non-zero value. The previous logic took the last match, which was always XUnitLogChecker's 'exit code 0', causing false mismatches every run. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
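The "prefer the first non-zero exit code" rule can be sketched as follows; the regex is an assumption about the console-log format, not the exact pattern used by fetch_helix_logs.py:

```python
import re

# Assumed marker format, e.g. "exit code: 139" or "exit code 0".
EXIT_CODE_RE = re.compile(r"exit code[:\s]+(-?\d+)", re.IGNORECASE)

def pick_exit_code(console_log: str):
    """Collect all exit codes from the log and return the first non-zero
    one; fall back to 0 if only zeroes were seen, or None if no exit code
    appears at all. Taking the last match instead would always hit
    XUnitLogChecker's trailing 'exit code 0'."""
    codes = [int(m.group(1)) for m in EXIT_CODE_RE.finditer(console_log)]
    if not codes:
        return None
    for code in codes:
        if code != 0:
            return code
    return 0
```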
```
| 5. Generate Report | Generate markdown report from DB | Script (`generate_report.py`) | None (reads DB only) |
| 6. Validate Report | Validate DB completeness and report accuracy | Script (`validate_results.py`) | GitHub Search API (unauthenticated spot-checks) |
```
The step ordering here contradicts SKILL.md: this table says Step 5 generates the report and Step 6 validates, but SKILL.md requires validating the DB before generating the report (and generate_report.py assumes a clean DB). Please swap these rows (or otherwise align README step numbering/description with SKILL.md) so users run validation before report generation.
```diff
-| 5. Generate Report | Generate markdown report from DB | Script (`generate_report.py`) | None (reads DB only) |
-| 6. Validate Report | Validate DB completeness and report accuracy | Script (`validate_results.py`) | GitHub Search API (unauthenticated spot-checks) |
+| 5. Validate Report | Validate DB completeness and report accuracy | Script (`validate_results.py`) | GitHub Search API (unauthenticated spot-checks) |
+| 6. Generate Report | Generate markdown report from DB | Script (`generate_report.py`) | None (reads DB only) |
```
```python
total = cur.execute("SELECT COUNT(*) FROM pipelines").fetchone()[0]
# Only private/skip pipelines are truly skipped; pipelines with
# 0 test failures are still "monitored" even though they got
# re-marked as skipped by extract_failed_tests.py.
truly_skipped = cur.execute(
    "SELECT COUNT(*) FROM pipelines WHERE result = 'skipped' AND skip_reason = 'private'"
).fetchone()[0]
monitored = total - truly_skipped
skipped = truly_skipped
```
The header/summary counts treat all non-private result='skipped' pipelines as “monitored” (skip_reason != 'private'), but setup_and_fetch_builds.py also marks intentionally skipped pipelines (skip, missing_def_id, no builds found, etc.) as skipped. Those should likely count as skipped/not-monitored; otherwise the report overstates monitored pipelines and understates skipped. Consider defining monitored pipelines as those not intentionally skipped (e.g., skip_reason IS NULL OR skip_reason == 'Build reported failed but Test Results API returned 0 test method failures'), and count other skip_reasons as skipped.
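A hedged sketch of that suggestion, using assumed skip_reason values ('private', 'skip', 'missing_def_id') and column names taken from the review comment; the real set of intentional skip reasons may differ:

```python
import sqlite3

# Assumed skip_reason values that mean "intentionally not monitored".
INTENTIONAL_SKIPS = ("private", "skip", "missing_def_id")

def count_monitored(conn: sqlite3.Connection):
    """Return (monitored, skipped), counting every intentional skip_reason
    as skipped rather than only 'private'. Pipelines re-marked as skipped
    for '0 test failures' stay in the monitored count."""
    cur = conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM pipelines").fetchone()[0]
    placeholders = ",".join("?" for _ in INTENTIONAL_SKIPS)
    skipped = cur.execute(
        f"SELECT COUNT(*) FROM pipelines "
        f"WHERE result = 'skipped' AND skip_reason IN ({placeholders})",
        INTENTIONAL_SKIPS,
    ).fetchone()[0]
    return total - skipped, skipped
```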
```python
).fetchone()[0]
# Pipelines re-marked as skipped (0 test failures) are still monitored
skipped_0_tests = cur.execute(
    "SELECT COUNT(*) FROM pipelines WHERE result = 'skipped' AND skip_reason != 'private'"
```
Similarly, skipped_0_tests is computed as all skipped pipelines with skip_reason != 'private', which includes pipelines intentionally skipped in pipelines.md (notes include "skip") or missing def IDs/builds. That inflates the “monitored” denominator in the Pipeline Summary. Consider narrowing this to only the specific “0 test failures” skip_reason set by extract_failed_tests.py (or otherwise explicitly exclude known intentional skip reasons).
```diff
-    "SELECT COUNT(*) FROM pipelines WHERE result = 'skipped' AND skip_reason != 'private'"
+    "SELECT COUNT(*) FROM pipelines WHERE result = 'skipped' AND skip_reason = '0 test failures'"
```
```
Fallback — Build Log Parsing (if needed)
================================================================================
[HH:mm:ss] Test Results API failed for <pipeline> — falling back to log parsing
[HH:mm:ss] GET get_build_log(project="public", buildId=<buildId>)
           → <N> log entries
[HH:mm:ss] Candidate logs (lineCount 100-2000): <list of logIds>
[HH:mm:ss] GET get_build_log_by_id(project="public", buildId=<buildId>, logId=<logId>, startLine=0, endLine=1)
           → "##[section]Starting: <step name>"
           → Match: YES/NO
[HH:mm:ss] GET get_build_log_by_id(project="public", buildId=<buildId>, logId=<logId>)
           → Full log (<N> lines), found <M> errors
================================================================================
```
The debug log template still references get_builds, get_build_log, and get_build_log_by_id calls (MCP-style), but SKILL.md bans the ado-pipelines_* tools and the actual implementation fetches builds via setup_and_fetch_builds.py (direct HTTP). Please update this template to log the real script invocations / REST URLs (and remove the “Fallback — Build Log Parsing” section if those APIs/tools are intentionally disallowed).
```diff
-Fallback — Build Log Parsing (if needed)
-================================================================================
-[HH:mm:ss] Test Results API failed for <pipeline> — falling back to log parsing
-[HH:mm:ss] GET get_build_log(project="public", buildId=<buildId>)
-           → <N> log entries
-[HH:mm:ss] Candidate logs (lineCount 100-2000): <list of logIds>
-[HH:mm:ss] GET get_build_log_by_id(project="public", buildId=<buildId>, logId=<logId>, startLine=0, endLine=1)
-           → "##[section]Starting: <step name>"
-           → Match: YES/NO
-[HH:mm:ss] GET get_build_log_by_id(project="public", buildId=<buildId>, logId=<logId>)
-           → Full log (<N> lines), found <M> errors
-================================================================================
```
Replace 'List ALL failing tests' with cap-at-5 rule in both the report template and the generated report notes section. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Failed tests block in each failure's body also listed all unique test names uncapped. Apply the same cap-at-5 rule. Update template line 119 to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
```python
def _header(self, out):
    cur = self.conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM pipelines").fetchone()[0]
    # Only private/skip pipelines are truly skipped; pipelines with
    # 0 test failures are still "monitored" even though they got
    # re-marked as skipped by extract_failed_tests.py.
    truly_skipped = cur.execute(
        "SELECT COUNT(*) FROM pipelines WHERE result = 'skipped' AND skip_reason = 'private'"
    ).fetchone()[0]
    monitored = total - truly_skipped
    skipped = truly_skipped
    passed = cur.execute(
        "SELECT COUNT(*) FROM pipelines WHERE result = 'succeeded'"
    ).fetchone()[0]
    failed = monitored - passed
```
_header() / _pipeline_summary() treat only skip_reason = 'private' as “truly skipped”, so pipelines intentionally marked skip (and missing_def_id) are counted as monitored. That makes the PASS/FAIL/monitored totals inaccurate and can classify skipped pipelines as failures via failed = monitored - passed. Consider counting skip_reason IN ('private','skip','missing_def_id') (and any other intentional opt-out reasons) as skipped, while keeping the “0 test failures” case separately monitored as intended.
```python
# Compute shared_paths: console_log_path values used by multiple test_results rows
shared_paths = set()
path_counts = {}
for r in rows:
    p = r["console_log_path"]
    path_counts[p] = path_counts.get(p, 0) + 1
for p, c in path_counts.items():
    if c > 1:
        shared_paths.add(p)
skipped_shared = 0
for r in rows:
    log_path = r["console_log_path"]
    if log_path in shared_paths:
        skipped_shared += 1
        continue  # on-disk file was overwritten — can't validate
    if not os.path.isfile(log_path):
```
The exit_code matches console log check skips any console_log_path used by multiple test_results rows, assuming the file was overwritten. But fetch_helix_logs.py intentionally deduplicates by URL and updates multiple rows to point at the same on-disk log file, so this will commonly skip valid rows and reduce validation coverage. Consider validating once per unique console_log_path (or per unique URL) and applying the result to all associated rows, rather than skipping.
```python
conn = sqlite3.connect(args.db)
insert_into_db(args.db, all_failures)
```
main() opens a SQLite connection (conn = sqlite3.connect(args.db)) before calling insert_into_db(), but insert_into_db() opens and commits via a separate connection. Keeping two connections open unnecessarily can be confusing and may contribute to locking/isolation edge cases. Consider either passing the existing conn into insert_into_db() or moving the sqlite3.connect() call until after the inserts are complete (right before the pipeline UPDATEs).
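A sketch of the suggested refactor, with the caveat that the real insert_into_db() takes a DB path and a richer row shape; the trimmed table and columns here are placeholders:

```python
import sqlite3

def insert_into_db(conn: sqlite3.Connection, failures) -> None:
    """Insert failures using the caller's existing connection rather than
    opening a second connection to the same SQLite file."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany(
            "INSERT INTO test_results (test_name, error_message) VALUES (?, ?)",
            failures,
        )
```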
```diff
-conn = sqlite3.connect(args.db)
-insert_into_db(args.db, all_failures)
+insert_into_db(args.db, all_failures)
+conn = sqlite3.connect(args.db)
```
Pipelines where the build failed but the Test Results API returned 0 test failures are now shown as INCONCLUSIVE (⚠️) instead of SKIPPED. This fixes the PASS+FAIL != monitored math (7+4 != 21) by adding INCONCLUSIVE as a third category (7+4+10 = 21). - ⚠️ INCONCLUSIVE: build failed, no test failures detected - ⏭️ SKIPPED: private pipeline (intentional opt-out) - Update report template to match Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce explicit 'inconclusive' result for builds that failed but had 0 test failures from the Test Results API. Previously these were re-marked as 'skipped', conflating them with private pipelines. - setup_and_fetch_builds.py: store all non-succeeded builds as 'failed' (including canceled). Only pipelines never fetched are 'skipped'. - extract_failed_tests.py: refine to 'inconclusive' when 0 test failures found (instead of re-marking as 'skipped'). - generate_report.py: query result directly — no skip_reason string matching. Show ⚠️ for inconclusive, ⏭️ for skipped. - validate_results.py: update queries for new result values. - SKILL.md: update schema docs. Report now shows: 7 PASS, 4 FAIL, 10 INCONCLUSIVE (of 21 monitored) with P+F+I = M. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tems fetch_helix_logs.py: - Timeout (Command timed out): set exit_code = NULL (process killed, no meaningful exit code) - Coreclr multi-test (App Exit Code 100 + Command exited with 0): set exit_code = -1 (work item passed but has individual test failures — agent should classify from error messages, not exit code) - All other cases: keep first-non-zero logic validate_results.py: - Remove shared_paths skip — validate once per unique log file path - Skip rows with NULL or -1 exit codes (no meaningful value to check) - Mirror fetch_helix_logs.py logic for timeout and coreclr detection Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
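The exit-code special cases in this commit can be sketched as below; the marker strings ('Command timed out', 'App Exit Code 100', 'Command exited with 0') and the regex are assumptions about the actual Helix console-log text:

```python
import re

def classify_exit_code(log: str):
    """Apply the special-case rules first, then the first-non-zero rule."""
    if "Command timed out" in log:
        return None  # process was killed; no meaningful exit code
    if "App Exit Code 100" in log and "Command exited with 0" in log:
        return -1    # coreclr multi-test work item: classify from error messages
    codes = [int(m.group(1))
             for m in re.finditer(r"exit code[:\s]+(-?\d+)", log, re.IGNORECASE)]
    for code in codes:
        if code != 0:
            return code  # first non-zero wins
    return 0 if codes else None
```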
Pass the existing connection into insert_into_db() instead of opening a second connection to the same SQLite DB. Avoids unnecessary dual connections and potential locking issues. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace MCP-style get_builds() calls with actual script invocation (setup_and_fetch_builds.py) - Remove 'Fallback — Build Log Parsing' section that referenced banned get_build_log/get_build_log_by_id tools Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add Step 0 Prerequisites (references/prerequisites.md) with Python/ADO_TOKEN setup - Add temp/ to .gitignore; prohibit ad-hoc scripts in scripts/ - Allow up to 3 validation retries while failure count decreases - Clarify WARN message: 'Validation error persists after retry' - Redesign Action Items: Issues to File, High Impact Failures, Needs Review - Sync log-template.md with new retry policy and token flow - Fix stale --validation-warnings help text in generate_report.py Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
```sql
CREATE TABLE IF NOT EXISTS failures (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    scope TEXT,
    test_name TEXT NOT NULL,
    work_item TEXT,
    failure_category TEXT,
    exit_codes TEXT,
    failing_since_date TEXT,
    failing_since_build TEXT,
    console_log_url TEXT,
    source_test_result_id INTEGER,
    error_message TEXT,
    stack_trace TEXT,
    summary TEXT,
    analysis TEXT,
    github_issue_number INTEGER,
    github_issue_url TEXT,
    github_issue_state TEXT,
    github_issue_assigned TEXT,
    labels TEXT,
    milestone TEXT DEFAULT '11.0.0',
    FOREIGN KEY (source_test_result_id) REFERENCES test_results(id)
);
```
SCHEMA creates failures with a FOREIGN KEY referencing test_results(id) before test_results is created. In SQLite this can cause executescript() to fail during DB setup. Create test_results before failures, or drop this FK constraint (and rely on validator) / add it via a follow-up ALTER TABLE pattern.
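One safe ordering, sketched with trimmed column lists; note that SQLite only enforces the constraint when `PRAGMA foreign_keys` is ON, which is off by default:

```python
import sqlite3

# Trimmed sketch: create the referenced table (test_results) before the
# table that holds the foreign key (failures).
SCHEMA = """
CREATE TABLE IF NOT EXISTS test_results (
    id INTEGER PRIMARY KEY,
    test_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS failures (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    source_test_result_id INTEGER,
    FOREIGN KEY (source_test_result_id) REFERENCES test_results(id)
);
"""

def setup_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FKs off by default
    conn.executescript(SCHEMA)
    return conn
```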
```
are verified against GitHub Search API (via `gh api`) using the full
test name to confirm no matching issue exists, every line in
error_message and stack_trace appears as a complete line in the console
log (catches mid-line truncation).
```
This doc says NEW failures are verified against GitHub Search API “via gh api”, but validate_results.py actually uses direct urllib requests to https://api.github.com/search/issues (and does not require gh). Please update the documentation to reflect the actual implementation to avoid confusing operators.
```diff
-are verified against GitHub Search API (via `gh api`) using the full
-test name to confirm no matching issue exists, every line in
-error_message and stack_trace appears as a complete line in the console
-log (catches mid-line truncation).
+are verified against the GitHub Search API using the full test name to
+confirm no matching issue exists, every line in error_message and
+stack_trace appears as a complete line in the console log (catches
+mid-line truncation).
```
```python
    "SELECT name FROM pipelines WHERE result = 'skipped' ORDER BY name"
):
    out.append(f" ⏭️ {p['name']}: SKIPPED (private)")

out.append("")
out.append("Notes:")
out.append("- ✅ = all tests passed")
out.append("- ❌ = one or more test failures")
out.append("- ⚠️ = inconclusive (build failed but no test failures detected via Test Results API)")
out.append("- ⏭️ = skipped (private pipeline)")
```
Skipped pipelines are always rendered as SKIPPED (private), but setup_and_fetch_builds.py uses skip_reason for several non-private cases (e.g., skip, missing_def_id, no builds found). Consider selecting and printing skip_reason here so the report correctly explains why each pipeline was skipped.
```diff
-        "SELECT name FROM pipelines WHERE result = 'skipped' ORDER BY name"
-    ):
-        out.append(f" ⏭️ {p['name']}: SKIPPED (private)")
-    out.append("")
-    out.append("Notes:")
-    out.append("- ✅ = all tests passed")
-    out.append("- ❌ = one or more test failures")
-    out.append("- ⚠️ = inconclusive (build failed but no test failures detected via Test Results API)")
-    out.append("- ⏭️ = skipped (private pipeline)")
+        "SELECT name, skip_reason FROM pipelines WHERE result = 'skipped' ORDER BY name"
+    ):
+        reason = p["skip_reason"] or "private"
+        out.append(f" ⏭️ {p['name']}: SKIPPED ({reason})")
+    out.append("")
+    out.append("Notes:")
+    out.append("- ✅ = all tests passed")
+    out.append("- ❌ = one or more test failures")
+    out.append("- ⚠️ = inconclusive (build failed but no test failures detected via Test Results API)")
+    out.append("- ⏭️ = skipped (for example: private pipeline, intentional skip, missing definition, or no builds found)")
```
Automates monitoring of CI test pipelines on Azure DevOps, triaging failures, and generating weekly reports with GitHub issue tracking.
It currently monitors only the 20+ JIT stress/PGO pipelines; coverage should expand to other pipelines over time.