From ce75d938082a5aee3fc51d0456472763b100c2e5 Mon Sep 17 00:00:00 2001
From: Simon Rozsival
Date: Tue, 16 Jun 2026 20:50:44 +0200
Subject: [PATCH 1/7] [skills] Update CI/test skills for public pipeline + NUnit dotnet test

Adapt the local agent skills and workflow docs to two recent CI changes:

- #11578 made dnceng-public `dotnet-android` the single PR pipeline (full
  matrix for every PR) and set `pr: none` on the DevDiv `Xamarin.Android-PR`
  pipeline. Rework the `ci-status` skill around one public build: build-id
  extraction, AZDO test queries via `az rest` (ResultsByBuild), queue-driven
  ETA variance, a gating-vs-flaky verdict (build result + checks are
  authoritative; `continueOnError` device-test failures are non-gating), and
  device-test `logcat-*.txt` artifacts.
- #11224 migrated Mono.Android device tests from NUnitLite/`-t:RunTestApp` to
  stock NUnit via `dotnet test` + Microsoft Testing Platform. Update the
  `tests` skill, `copilot-instructions.md`, and `UnitTests.md` to the new
  `-t:Install` + `dotnet test --no-build --report-trx` flow (TRX output), and
  drop the removed NUnitLite row from the `update-tpn` skill.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md             |  15 +-
 .github/skills/android-reviewer/SKILL.md    |   2 +-
 .github/skills/ci-status/SKILL.md           | 185 +++++++++---------
 .../ci-status/references/binlog-analysis.md |   6 +-
 .github/skills/tests/SKILL.md               |  13 +-
 .../skills/tests/references/test-catalog.md |  14 +-
 .github/skills/update-tpn/SKILL.md          |   1 -
 Documentation/workflow/UnitTests.md         |  34 ++--
 8 files changed, 148 insertions(+), 122 deletions(-)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 2d468c368ad..e6f0e4c6bcb 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -192,9 +192,9 @@ This pattern ensures proper encoding, timestamps, and file attributes are handle

 ## CI / Build Investigation

-**dotnet/android's primary CI runs on Azure DevOps (internal), not GitHub Actions.** When a user asks about CI status, CI failures, why a PR is blocked, or build errors:
+**dotnet/android PR validation runs on a public Azure DevOps pipeline (`dotnet-android` on `dnceng-public`), not GitHub Actions.** As of #11578 it runs the full test matrix for every PR (direct and fork); the old internal `Xamarin.Android-PR` (DevDiv) pipeline no longer runs on PRs. When a user asks about CI status, CI failures, why a PR is blocked, or build errors:

-1. **ALWAYS invoke the `ci-status` skill first** — do NOT rely on `gh pr checks` alone. GitHub checks may all show ✅ while the internal Azure DevOps build is failing.
+1. **ALWAYS invoke the `ci-status` skill first.** The pipeline surfaces as ~39 `dotnet-android (...)` GitHub checks, but the skill adds build progress, ETA, per-stage failures, and failed-test names that `gh pr checks` alone doesn't give you.
 2. The skill auto-detects the current PR from the git branch when no PR number is given.
 3. For deep .binlog analysis, use the `azdo-build-investigator` skill.
 4. Only after the skill confirms no Azure DevOps failures should you report CI as passing.
@@ -203,16 +203,21 @@ This pattern ensures proper encoding, timestamps, and file attributes are handle

 When diagnosing runtime, build, or test failures, follow these practices. They exist because the .NET ↔ JNI ↔ C++ ↔ generated-native stack is loosely coupled and static reasoning alone is unreliable.

-- **Reproduce CI failures locally — do not iterate through CI.** A clean local test cycle is minutes; a CI iteration is hours. Run device tests the same way CI does:
+- **Reproduce CI failures locally — do not iterate through CI.** A clean local test cycle is minutes; a CI iteration is hours. Run device tests the same way CI does (NUnit via `dotnet test` / MTP, see [#11224](https://github.com/dotnet/android/pull/11224)):
   ```bash
   make prepare && make all CONFIGURATION=Release
+  # Build + install the instrumentation APK on a connected device/emulator:
   ./dotnet-local.sh build tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj \
-    -t:RunTestApp -c Release \
+    -t:Install -c Release \
     -p:_AndroidTypeMapImplementation= \
     -p:UseMonoRuntime=
+  # Run the on-device tests via dotnet test (MTP), from the project dir so the
+  # project-local global.json ("runner": "Microsoft.Testing.Platform") applies:
+  ( cd tests/Mono.Android-Tests/Mono.Android-Tests && \
+    ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx )
   ```
   On Windows, use `build.cmd` and `dotnet-local.cmd` instead of `make`/`dotnet-local.sh`.
-  Results land in `TestResult-Mono.Android.NET_Tests-*.xml` at the repo root.
+  Results land as a `.trx` (VSTest format) in the test results directory — not `TestResult-*.xml`.

 - **When the build gets into a weird state, delete `bin/` and `obj/` and rebuild from scratch.** Stale incremental output causes phantom errors. See **Troubleshooting → Build** below.

diff --git a/.github/skills/android-reviewer/SKILL.md b/.github/skills/android-reviewer/SKILL.md
index 3327beb4d34..a337b1f7054 100644
--- a/.github/skills/android-reviewer/SKILL.md
+++ b/.github/skills/android-reviewer/SKILL.md
@@ -57,7 +57,7 @@ Review the CI results. **Never post ✅ LGTM if any required CI check is failing
 - Investigate the failure using the **azdo-build-investigator** skill (for Azure DevOps pipeline failures) or GitHub Actions job logs.
 - If the failure is caused by the PR's code changes, flag it as ❌ error.
 - If the failure is a known infrastructure issue or pre-existing flake unrelated to the PR, note it in the summary but still use ⚠️ Needs Changes — the PR isn't mergeable until CI is green.
-- If **all public CI checks pass** but only the internal `Xamarin.Android-PR` check is failing, still use ⚠️ Needs Changes with a note that the internal pipeline may need a re-run. Do not give ✅ LGTM.
+- All PR checks now come from the single public `dotnet-android` pipeline (dnceng-public). If you see a `Xamarin.Android-PR` check, it's a branch/official build, not PR validation — don't gate the review on it.
 - If the PR description acknowledges the failure and documents a dependency (e.g., "blocked on X"), note it in the summary.

 ### 5. Load review rules
diff --git a/.github/skills/ci-status/SKILL.md b/.github/skills/ci-status/SKILL.md
index 49d08331a6d..7fbb90572d3 100644
--- a/.github/skills/ci-status/SKILL.md
+++ b/.github/skills/ci-status/SKILL.md
@@ -4,7 +4,7 @@ description: >
   Check CI build status and investigate failures for dotnet/android PRs. ALWAYS use this skill when
   the user asks "check CI", "CI status", "why is CI failing", "is CI green", "why is my PR blocked",
   or anything about build status on a PR. Auto-detects the current PR from the git branch when no
-  PR number is given. Covers both GitHub checks and internal Azure DevOps builds.
+  PR number is given. Covers GitHub checks and the public Azure DevOps pipeline (dnceng-public).
   DO NOT USE FOR: GitHub Actions workflow authoring, non-dotnet/android repos.
 ---

@@ -12,7 +12,7 @@ description: >

 Check CI status and investigate build failures for dotnet/android PRs.

-**Key fact:** dotnet/android's primary CI runs on Azure DevOps (internal). GitHub checks alone are insufficient — they may all show ✅ while the internal build is failing.
+**Key fact:** as of [#11578](https://github.com/dotnet/android/pull/11578), dotnet/android PR validation runs on a **single public** Azure DevOps pipeline — **`dotnet-android`** on `dev.azure.com/dnceng-public` (project `public`, definition id `333`), defined by `build-tools/automation/azure-pipelines-public.yaml`. It runs the **full test matrix for every PR** — both direct and fork. The old internal DevDiv pipeline `Xamarin.Android-PR` (`azure-pipelines.yaml`) now has `pr: none` and **no longer runs on PRs**; it only builds `main`/`release/*`/`feature/*` branches and official signed builds. On GitHub the pipeline surfaces as ~39 granular `dotnet-android (...)` checks (plus `license/cla`); querying AZDO directly adds progress, ETA, and failure detail.

 ## Prerequisites

@@ -21,7 +21,7 @@ Check CI status and investigate build failures for dotnet/android PRs.
 | `gh` | `gh --version` | https://cli.github.com/ |
 | `az` + devops ext | `az version` | `az extension add --name azure-devops` then `az login` |

-If `az` is not authenticated, stop and tell the user to run `az login`.
+The pipeline lives in the **public** `dnceng-public` project, so most `build` queries (status, timeline, logs) work without auth. A few `test`-area REST calls need a token — if one returns a sign-in page or 401, tell the user to run `az login` and retry.

 ## Workflow

@@ -47,11 +47,11 @@ If no PR exists for the current branch, tell the user and stop.
 - `true` → **fork PR** (external contributor)
 - `false` → **direct PR** (team member, branch in dotnet/android)

-This matters for CI behavior:
-- **Fork PRs:** `Xamarin.Android-PR` does NOT run. `dotnet-android` runs the full pipeline including tests.
-- **Direct PRs:** `Xamarin.Android-PR` runs the full test suite. `dotnet-android` skips test stages (build-only) since tests run on DevDiv instead.
+Both run the **same** `dotnet-android` pipeline with the **full test matrix** — fork status no longer changes *which* pipeline runs or *whether* tests run. It now only affects **triggering**:
+- **Direct PRs:** the build starts automatically on every push.
+- **Fork PRs:** the public pipeline may wait for a maintainer to approve the run (dnceng-public policy) and may need re-approval after each push. A team member can (re)trigger it by commenting `/azp run` on the PR. Until then the `dotnet-android` checks sit in a pending/expected state.

-Highlight the fork status in the output so the user understands which checks to expect.
+Highlight the fork status in the output so the user understands why a build may not have started yet.

 #### Step 2 — Get GitHub check status

@@ -64,31 +64,37 @@ gh pr checks $PR --repo dotnet/android --json "name,state,link,bucket" 2>&1 \
 gh pr checks $PR --repo dotnet/android --json "name,state,link,bucket" | ConvertFrom-Json
 ```

-Note which checks passed/failed/pending. The `link` field contains the AZDO build URL for internal checks.
+Note which checks passed/failed/pending. Every `dotnet-android (...)` check `link` points at the **same** AZDO build; `license/cla` is a GitHub-side check.

-#### Step 3 — Get Azure DevOps build status (repeat for EACH build)
+#### Step 3 — Get the Azure DevOps build status

-There are typically **two separate AZDO builds** for a dotnet/android PR. They run **independently** — neither waits for the other:
-- **`dotnet-android`** on `dev.azure.com/dnceng-public` — Defined in `azure-pipelines-public.yaml` with an explicit `pr:` trigger.
-  - **Fork PRs:** runs the full pipeline including build + tests (since `Xamarin.Android-PR` won't run for forks).
-  - **Direct PRs:** runs **build-only** — test stages are auto-skipped because those run on DevDiv instead. This means the `dotnet-android` build will be significantly shorter for direct PRs.
-- **`Xamarin.Android-PR`** on `devdiv.visualstudio.com` — full test suite, MAUI integration, compliance. Defined in `azure-pipelines.yaml` but its PR trigger is configured in the AZDO UI, not in YAML.
-  - **Fork PRs:** does NOT run at all (no access to internal resources).
-  - **Direct PRs:** runs the full test matrix. May take a few minutes to start after a push.
+There is now a **single** AZDO build per PR: **`dotnet-android`** on `dev.azure.com/dnceng-public` (project `public`, definition id `333`), defined by `azure-pipelines-public.yaml`. It runs the full matrix for every PR — build (macOS/Windows/Linux) plus test stages (Linux Tests, MSBuild Tests, MSBuild Emulator Tests, Package/APK Tests, MAUI Tests).

-Use the **pipeline definition name** (from the `definitionName` field) as the label in output — do NOT label them "Public" or "Internal".
+> `Xamarin.Android-PR` on `devdiv.visualstudio.com` no longer runs on PRs (`pr: none`). If you ever see a `Xamarin.Android-PR` check, it belongs to a branch or official build, not PR validation — ignore it for PR status.

-When a check shows **"Expected — Waiting for status to be reported"** on GitHub (typically `Xamarin.Android-PR`):
-- **For direct PRs:** the pipeline hasn't been triggered yet — this is normal, it's not waiting for the other build, just for AZDO to pick it up. Report it as: "⏳ Not triggered yet — typically starts within a few minutes of a push."
-- **For fork PRs:** `Xamarin.Android-PR` will NOT run. Report: "⏳ Will not run — fork PRs don't trigger the internal pipeline."
+Set the org/project once:

-Extract AZDO build URLs from the check `link` fields. Parse `{orgUrl}`, `{project}`, and `{buildId}` from patterns:
-- `https://dev.azure.com/{org}/{project}/_build/results?buildId={id}`
-- `https://{org}.visualstudio.com/{project}/_build/results?buildId={id}`
+```bash
+ORG_URL=https://dev.azure.com/dnceng-public
+PROJECT=public
+```
+
+All `dotnet-android (...)` check links share one build id. Extract it from any of them:
+- `https://dev.azure.com/dnceng-public/{project-guid}/_build/results?buildId={id}`
+
+```bash
+BUILD_ID=$(gh pr checks $PR --repo dotnet/android --json name,link \
+  --jq '[.[] | select(.name | startswith("dotnet-android")) | .link][0]' \
+  | grep -oE 'buildId=[0-9]+' | head -1 | cut -d= -f2)
+```

-**Run Steps 3, 3a, and 3b for each AZDO build independently.** The builds have different pipelines, different job counts, and different typical durations — each gets its own progress and ETA.
+If `BUILD_ID` is empty (checks in "Expected — Waiting for status" with no build URL), the pipeline hasn't been picked up yet:
+- **Fork PR:** likely awaiting maintainer approval — report "⏳ Awaiting pipeline approval — a maintainer can start it with `/azp run`."
+- **Direct PR:** report "⏳ Not triggered yet — typically starts within a few minutes of a push."

-For each build, first get the overall status including start time and definition ID:
+Then stop (nothing to query yet).
+
+First get the overall status including start time and definition id:

 ```bash
 az devops invoke --area build --resource builds \
@@ -98,7 +104,7 @@ az devops invoke --area build --resource builds \
   --output json 2>&1
 ```

-**Compute elapsed time:** Subtract `startTime` from the current time (or from `finishTime` if the build is complete). Present as e.g. "Ran for 42 min" or "Running for 42 min".
+**Compute elapsed time:** Subtract `startTime` from the current time (or from `finishTime` if the build is complete). Present as e.g. "Ran for 2h 18m" or "Running for 42 min".

 Then fetch the build timeline for **all jobs** (to get progress counts) and **any failures so far** — even when the build is still in progress:

@@ -126,19 +132,14 @@ az devops invoke --area build --resource timeline \
   --output json 2>&1
 ```

-Check `issues` arrays first — they often contain the root cause directly.
-
-#### Step 3a — Estimate completion time per build (when build is in progress)
+Check `issues` arrays first — they often contain the root cause directly. The granular GitHub checks (e.g. `dotnet-android (Linux Tests Linux > Tests > MSBuild 2)`) also pinpoint which job failed without any AZDO query.

-Use the `definitionId` from the build to query recent successful builds of the **same pipeline definition** and compute the median duration. **Do this separately for each build** — the pipelines have very different durations.
+#### Step 3a — Estimate completion time (when build is in progress)

-**Important:** The `dotnet-android` pipeline duration varies significantly based on whether the PR is from a fork:
-- **Direct PRs:** `dotnet-android` runs build-only (tests skipped) — typically much shorter (~1h 45min)
-- **Fork PRs:** `dotnet-android` runs the full pipeline with tests — typically much longer
-
-To get accurate ETAs, filter historical builds to match the current PR type. You can approximate this by looking at the **job count** of the current build vs historical builds — build-only runs have ~3 jobs while full runs have many more. Alternatively, compare the historical durations and pick the ones that are similar in magnitude to what you'd expect for the current build type.
+Every PR runs the same full matrix (same ~38 jobs across 8 stages), but **wall-clock duration is dominated by hosted-agent queue time** and varies widely — recent green runs range from **~50 min to ~3 h+** (same stages, very different queue waits). Treat any ETA as rough.

 ```bash
+DEF_ID=333
 az devops invoke --area build --resource builds \
   --route-parameters project=$PROJECT \
   --org $ORG_URL \
@@ -149,53 +150,60 @@ az devops invoke --area build --resource builds \

 **Compute ETA:**
 1. For each recent build, calculate `duration = finishTime - startTime`
-2. Filter to builds with similar duration profile (short ~1-2h for build-only, long ~3h+ for full runs) matching the current PR type
-3. Compute the **median** duration of the filtered set (more robust than average against outliers)
-4. `ETA = startTime + medianDuration`
-5. Present as: "ETA: ~14:30 UTC (typical for direct PRs: ~1h 45min)"
+2. Compute the **median** (more robust than average); you may drop obvious outliers (very fast <60 min runs that barely queued, or >4 h stragglers)
+3. `ETA = startTime + medianDuration`
+4. Present as a rough window, e.g. "ETA: ~14:30 UTC (recent runs ≈50 min–3 h, median ~2 h)"

 If `startTime` is null (build hasn't started yet), skip the ETA and say "Build queued, not started yet".
 If the build already completed, skip the ETA and show the actual duration instead.
+If it has been running longer than the median, say "overdue by ~X min — likely agent queue time, not necessarily stuck".

 #### Step 3b — Check for failed tests (always do this, especially when the build is still running)

 **This step is critical when the build is in progress.** Test results are published as jobs complete, so failures may already be visible before the build finishes. Surfacing these early lets the user start fixing them immediately.

-Query test runs for this build:
+> On `dnceng-public`, `az devops invoke --area test --resource runs` (list-by-build) is broken (404). Use `az rest` against the REST API with the Azure DevOps resource token instead:

 ```bash
-az devops invoke --area test --resource runs \
-  --route-parameters project=$PROJECT \
-  --org $ORG_URL \
-  --query-parameters "buildUri=vstfs:///Build/Build/$BUILD_ID" \
-  --query "value[?runStatistics[?outcome=='Failed']] | [].{id:id, name:name, totalTests:totalTests, state:state, stats:runStatistics}" \
+ADO_RESOURCE=499b84ac-1321-427f-aa17-267ca6975798 # Azure DevOps app id, for az rest auth
+```
+
+Get **all failed tests for the build in one call** via `ResultsByBuild`:
+
+```bash
+az rest --method get --resource "$ADO_RESOURCE" \
+  --url "$ORG_URL/$PROJECT/_apis/test/ResultsByBuild?buildId=$BUILD_ID&outcomes=Failed&api-version=7.1-preview" \
+  --query "value[].{test:automatedTestName, testCase:testCaseTitle, runId:runId}" \
   --output json 2>&1
 ```

-For each test run that has failures, fetch the failed test results:
+To list the test runs for a build (e.g. for per-run pass/fail totals):

 ```bash
-az devops invoke --area test --resource results \
-  --route-parameters project=$PROJECT runId=$RUN_ID \
-  --org $ORG_URL \
-  --query-parameters "outcomes=Failed&\$top=20" \
-  --query "value[].{testName:testCaseTitle, outcome:outcome, errorMessage:errorMessage, durationMs:durationInMs}" \
+az rest --method get --resource "$ADO_RESOURCE" \
+  --url "$ORG_URL/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.1" \
+  --query "value[].{id:id, name:name, total:totalTests, passed:passedTests}" \
   --output json 2>&1
 ```

-If the `errorMessage` is truncated or absent, you can fetch a single test result's full details:
+For the full error message / stack trace of the failed tests, list the failed results for the run (use the `runId` from `ResultsByBuild`; repeat per distinct `runId` if failures span multiple runs). Use the **list** form with `outcomes=Failed` — the single-result-by-`testId` route returns null on this org:

 ```bash
 az devops invoke --area test --resource results \
-  --route-parameters project=$PROJECT runId=$RUN_ID testId=$TEST_ID \
+  --route-parameters project=$PROJECT runId=$RUN_ID \
   --org $ORG_URL \
-  --query "{testName:testCaseTitle, errorMessage:errorMessage, stackTrace:stackTrace}" \
+  --query-parameters "outcomes=Failed&\$top=20" \
+  --query "value[].{testName:testCaseTitle, outcome:outcome, errorMessage:errorMessage, stackTrace:stackTrace}" \
   --output json 2>&1
 ```

+> **On-device (Package/APK) test failures:** the `Package Tests` stage runs the Mono.Android instrumentation tests via stock NUnit + `dotnet test`/MTP ([#11224](https://github.com/dotnet/android/pull/11224)) and publishes results as VSTest/TRX — the queries above work unchanged (the failed `automatedTestName` is e.g. `Java.InteropTests.JnienvTest.DoNotLeakWeakReferences`). Native/JNI crashes (`UnsatisfiedLinkError`, `SIGSEGV`, `am instrument` going silent) often appear **only in logcat**: each run uploads a `logcat-.txt` (e.g. `logcat-Mono.Android.NET_Tests-Release.txt`) inside the `Test Results - APKs ...` artifact. Grab it in Phase 2 to diagnose device-test crashes.
+
+> **Distinguish gating failures from flaky/tolerated ones — `ResultsByBuild` alone is NOT a red signal.** The **build `result`** and the **GitHub check states** are authoritative for pass/fail. The device-test lanes run with `continueOnError`, so flaky failures (commonly the network-dependent `System.NetTests.SslTest.*`, or failures only in flavor lanes like `-TrimModePartial` / `-NoAab`) get published as failed test results **without failing the build**. A failure is **gating** only when its job/stage shows `result: failed` in the timeline **and** a matching ❌ GitHub check. So: if the build `result == succeeded` and all checks are green, treat any `ResultsByBuild` failures as **non-gating/flaky** and report them as a brief note — not as red CI.
+
 #### Step 4 — Present summary

-Use this format — **one section per AZDO build**, each with its own progress and ETA:
+Use this format — a single `dotnet-android` build section with its progress and ETA:

 ```
 # CI Status for PR #NNNN — "PR Title"
@@ -208,56 +216,44 @@ Use this format — **one section per AZDO build**, each with its own progress a

 ## dotnet-android [#BuildId](link)
 **Result:** ✅ Succeeded / ❌ Failed / 🟡 In Progress
-ℹ️ Build-only (tests run on Xamarin.Android-PR for direct PRs) — or ℹ️ Full pipeline with tests (fork PR)
-⏱️ Running for **12 min** · ETA: ~15:15 UTC (typical for direct PRs: ~1h 45min)
-📊 Jobs: **0/3 completed** · 1 running · 2 waiting
-
-| Job | Status |
-|-----|--------|
-| macOS > Build | 🟡 In Progress |
-| Linux > Build | ⏳ Waiting |
-| Windows > Build & Smoke Test | ⏳ Waiting |
-
-## Xamarin.Android-PR [#BuildId](link)
-**Result:** ✅ Succeeded / ❌ Failed / 🟡 In Progress
-— or for fork PRs: ⏳ **Will not run** — fork PRs don't trigger this pipeline
-⏱️ Running for **42 min** · ETA: ~15:45 UTC (typical: ~2h 30min)
+⏱️ Running for **42 min** · ETA: ~15:45 UTC (recent runs ≈50 min–3 h, median ~2 h)
 📊 Jobs: **18/56 completed** · 6 running · 32 waiting

+| Stage > Job | Status |
+|-------------|--------|
+| Mac > macOS > Build | ✅ Succeeded |
+| Linux Tests > Linux > Tests > MSBuild 2 | ❌ Failed |
+| MSBuild Emulator Tests > macOS > Tests > MSBuild+Emulator 8 | 🟡 In Progress |
+
 ### Failures (if any)
 ❌ Stage > Job > Task
 Error:

-### Failed Tests (if any — even while build is still running)
-| Test Run | Failed | Total |
-|----------|--------|-------|
-| run-name | N | M |
-
-**Failed test names:**
-- `Namespace.TestClass.TestMethod` — brief error message
-- ...
+### Failed Tests
+- **Gating** (job/stage `result: failed` + ❌ check) — must be fixed:
+  - `Namespace.TestClass.TestMethod` — brief error message
+- **Flaky / non-gating** (build still green; e.g. `SslTest.*` or flavor-lane-only) — note, don't block:
+  - `System.NetTests.SslTest.HttpsShouldWork` (in `-TrimModePartial`, `-NoAab`)

 ## What next?
 1. View full logs / stack traces for a test failure
-2. Download and analyze .binlog artifacts
-3. Retry failed stages
+2. Download and analyze .binlog artifacts (+ `logcat-*.txt` for device tests)
+3. Retry failed stages (re-run with `/azp run` on the PR)
 ```

 **Progress section guidelines:**
-- Always show fork status (🔀 Direct PR / 🍴 Fork PR) at the top — it determines which builds run and their expected durations
-- For `dotnet-android`, note whether it's build-only (direct PR) or full pipeline (fork PR)
-- For `Xamarin.Android-PR` on fork PRs, don't try to query it — just report "Will not run"
+- Always show fork status (🔀 Direct PR / 🍴 Fork PR) at the top — it only affects *triggering* now (fork builds may await approval), not which pipeline runs
+- There is exactly one PR build (`dotnet-android`); do NOT look for or report a `Xamarin.Android-PR` build
 - Always show elapsed time when `startTime` is available
 - Show ETA when the build is in progress and historical data is available. If the build has been running longer than the median, say "overdue by ~X min"
-- Show job counters as "N/Total completed · M running · P waiting"
-- If the build hasn't started yet, show "⏳ Not triggered yet — typically starts within a few minutes of a push"
-- If a check is in "Expected" state with no build URL on a direct PR, the AZDO pipeline hasn't picked it up yet — this is normal and not gated on other builds
-
-**If the build is still running but tests have already failed**, highlight these prominently so the user can start fixing them immediately. Use a note like:
-
-> ⚠️ Build still in progress, but **N tests have already failed** — you can start investigating these now.
+- Show job counters as "N/Total completed · M running · P waiting" (pending may be 0 — stages start in parallel)
+- If the build hasn't started yet: direct PR → "⏳ Not triggered yet — typically starts within a few minutes of a push"; fork PR → "⏳ Awaiting pipeline approval — a maintainer can start it with `/azp run`"

-**If no failures found anywhere**, report CI as green and stop.
+**Pass/fail verdict — use the build `result` + GitHub checks, not the raw test-failure count:**
+- **Build `result: failed` or any ❌ check** → CI is red. Surface the gating failures (the ❌ checks / `result: failed` timeline jobs and their tests).
+- **Build still running with a gating job already `result: failed`** → highlight prominently so the user can start fixing immediately:
+  > ⚠️ Build still in progress, but the **Package Tests** stage has already failed — you can start investigating now.
+- **Build `result: succeeded` and all checks green** → report CI **green**, even if `ResultsByBuild` lists failures. Mention any such failures as a one-line flaky/non-gating note (e.g. "2 flaky `SslTest` failures in `continueOnError` lanes — not blocking").

 ### Phase 2: Deep Investigation (only if user requests)

@@ -294,14 +290,17 @@ See [references/error-patterns.md](references/error-patterns.md) for dotnet/andr

 ## Error Handling

-- **Build in progress:** Still query for failed timeline records AND test runs. Report any early failures alongside the in-progress status. Only offer `gh pr checks --watch` if there are no failures yet.
-- **Check in "Expected" state (no build URL):** The AZDO pipeline hasn't been triggered yet. This is normal — the two pipelines (`dotnet-android` and `Xamarin.Android-PR`) run independently, not sequentially. Report: "⏳ Not triggered yet — typically starts within a few minutes of a push." Do NOT say it's waiting for the other build.
-- **Auth expired:** Tell user to run `az login` and retry.
+- **Build in progress:** Still query failed timeline records AND `ResultsByBuild`. Report early **gating** failures (timeline jobs with `result: failed`) alongside the in-progress status; treat `ResultsByBuild`-only failures cautiously (they may be flaky/non-gating — see the gating note in Step 3b). Only offer `gh pr checks --watch` if there are no gating failures yet.
+- **Checks in "Expected" state (no build URL):** The `dotnet-android` pipeline hasn't started. For a **fork PR** it's likely awaiting maintainer approval — report: "⏳ Awaiting pipeline approval — a maintainer can start it with `/azp run`." For a **direct PR** it usually starts within a few minutes of a push — report: "⏳ Not triggered yet — typically starts within a few minutes of a push."
+- **A `Xamarin.Android-PR` check appears:** That pipeline no longer runs on PRs (`pr: none`); if present it belongs to a branch or official build — ignore it for PR status.
+- **Sign-in page / 401 on a `test`-area `az rest` call:** Tell the user to run `az login` and retry.
 - **Build not found:** Verify the PR number/build ID is correct.
 - **No test runs yet:** The build may not have reached the test phase. Report what's available and note that tests haven't started.

 ## Tips

+- The **build `result` + GitHub check states** are the source of truth for pass/fail — the test API (`ResultsByBuild`) lists failures even on green builds (flaky `continueOnError` device-test lanes)
 - Focus on the **first** error chronologically — later errors often cascade
 - `.binlog` has richer detail than text logs when logs show only "Build FAILED"
 - `issues` in timeline records often contain the root cause without needing to download logs
+- For on-device (Package/APK) test crashes, the `logcat-*.txt` artifact is usually more informative than the test error message
diff --git a/.github/skills/ci-status/references/binlog-analysis.md b/.github/skills/ci-status/references/binlog-analysis.md
index 320b7fb9853..7066e8d522b 100644
--- a/.github/skills/ci-status/references/binlog-analysis.md
+++ b/.github/skills/ci-status/references/binlog-analysis.md
@@ -20,7 +20,11 @@ az pipelines runs artifact list --run-id $BUILD_ID --org $ORG_URL --project $PRO
 az pipelines runs artifact list --run-id $BUILD_ID --org $ORG_URL --project $PROJECT --output json
 ```

-Look for artifact names containing `binlog`, `msbuild`, or `build-log`.
+Look for artifact names that contain build logs. On the `dotnet-android` (dnceng-public) pipeline the relevant ones are:
+- `Build Results - macOS` / `Build Results - Windows` / `Build Results - Linux` — contain the `.binlog` files (published mainly when a build stage fails or when `XA.PublishAllLogs` is set).
+- `Test Results - ...` — per-test-stage logs and artifacts. For the on-device `Package Tests` (APKs) stage these also include each device test's `build-.binlog`, `run-.binlog`, the `.trx`, and `logcat-.txt` (essential for native/JNI crash diagnosis).
+
+If a green build has no `Build Results - *` artifact, the binlogs weren't published; re-run with `XA.PublishAllLogs` or rely on the timeline/test queries instead.

 ### Download

diff --git a/.github/skills/tests/SKILL.md b/.github/skills/tests/SKILL.md
index 957a5255655..e910ae9fe9d 100644
--- a/.github/skills/tests/SKILL.md
+++ b/.github/skills/tests/SKILL.md
@@ -71,13 +71,18 @@ dotnet test .csproj -v minimal --filter "Name~TestName"
 ./dotnet-local.sh test bin/TestDebug/MSBuildDeviceIntegration/${TFM}/MSBuildDeviceIntegration.dll --filter "Name~InstallAndRunTests"
 ```

-### On-device runtime tests (NUnitLite, full-build + device)
+### On-device runtime tests (NUnit via `dotnet test` / MTP, full-build + device)

-These do NOT use `dotnet test`. Use the `RunTestApp` MSBuild target:
+As of [#11224](https://github.com/dotnet/android/pull/11224) these run **stock NUnit** through `dotnet test` with the Microsoft Testing Platform (MTP) — NUnitLite and the `-t:RunTestApp` target are gone. Build + install the instrumentation APK, then run `dotnet test` **from the project directory** so the project-local `global.json` (`"runner": "Microsoft.Testing.Platform"`) is picked up:
 ```bash
-./dotnet-local.sh build -t:RunTestApp tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj
+# 1. Build + install on a connected device/emulator:
+./dotnet-local.sh build -t:Install -c Release tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj
+
+# 2. Run the on-device tests (MTP):
+( cd tests/Mono.Android-Tests/Mono.Android-Tests && \
+  ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx )
 ```
-Results appear in `TestResult-*.xml` in the repo root.
+Results are a **`.trx`** (VSTest format) in the test results directory — not `TestResult-*.xml`. Restrict to specific NUnit `[Category]` names with `-p:IncludeCategories=Intune` at **build** time (the old `am instrument -e` category args are gone; exclusions now flow through runtimeconfig — see `TestInstrumentation.ExcludedCategories`/`IncludedCategories`).
### Java.Interop tests Tooling tests are standalone (`dotnet test` on `.csproj`). JVM tests require the local SDK: diff --git a/.github/skills/tests/references/test-catalog.md b/.github/skills/tests/references/test-catalog.md index 87f6b599387..691c4edcf84 100644 --- a/.github/skills/tests/references/test-catalog.md +++ b/.github/skills/tests/references/test-catalog.md @@ -114,7 +114,7 @@ Device: **Yes** (most tests have `[Category("UsesDevice")]`) ## On-Device Runtime Tests (full-build — requires local SDK + device) -These use NUnitLite and run directly on the device via `-t:RunTestApp`. They do NOT use `dotnet test`. +As of [#11224](https://github.com/dotnet/android/pull/11224), `Mono.Android.NET-Tests` runs **stock NUnit** via `dotnet test` + Microsoft Testing Platform (MTP) — NUnitLite and the `-t:RunTestApp` target are gone. (The older `locales` / `embedded DSOs` apps below were out of scope of that PR and may still use the legacy path.) Build: Full-build + the test project itself Device: **Yes** @@ -136,19 +136,21 @@ Device: **Yes** ### On-device test categories -The `Mono.Android.NET-Tests.csproj` dynamically excludes categories based on runtime: +The `Mono.Android.NET-Tests.csproj` dynamically excludes categories based on runtime (now via runtimeconfig read by `TestInstrumentation`, not `am instrument -e`): - **CoreCLR runtime**: Excludes `CoreCLRIgnore`, `NTLM` - **NativeAOT runtime**: Excludes `NativeAOTIgnore`, `SSL`, `NTLM`, `Export`, `NativeTypeMap` - **LLVM**: Excludes `LLVMIgnore`, `InetAccess`, `NetworkInterfaces` -Other categories: `SSL`, `InetAccess`, `JavaList`, `RuntimeConfig`, `Intune`, `NTLM` +Other categories: `SSL`, `InetAccess`, `JavaList`, `RuntimeConfig`, `Intune`, `NTLM`. Restrict a run to specific categories with `-p:IncludeCategories=Intune` at **build** time. 
-Command: +Command (build + install, then `dotnet test` from the project dir so the project-local `global.json` MTP runner is used): ```bash -./dotnet-local.sh build -t:RunTestApp tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj +./dotnet-local.sh build -t:Install -c Release tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj +( cd tests/Mono.Android-Tests/Mono.Android-Tests && \ + ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx ) ``` -Results appear in `TestResult-*.xml` in the repo root. +Results are a `.trx` (VSTest format) in the test results directory — not `TestResult-*.xml`. --- diff --git a/.github/skills/update-tpn/SKILL.md b/.github/skills/update-tpn/SKILL.md index 8bf5cc18b7a..75e060c2078 100644 --- a/.github/skills/update-tpn/SKILL.md +++ b/.github/skills/update-tpn/SKILL.md @@ -73,7 +73,6 @@ List contents of `src-ThirdParty/` directory. Current vendored code and license | `android-platform-tools-base/` | android/platform/tools/base | https://android.googlesource.com/platform/tools/base/+/refs/heads/main/sdk-common/NOTICE (Apache 2.0) | | `bionic/` | google/bionic | https://android.googlesource.com/platform/bionic/ (Apache 2.0) | | `crc32.net/` | force-net/crc32.net | https://github.com/force-net/Crc32.NET (MIT) | -| `NUnitLite/` | nunit/nunitlite | https://github.com/nunit/nunitlite/ (MIT) | | `StrongNameSigner/` | brutaldev/StrongNameSigner | https://github.com/brutaldev/StrongNameSigner/ (Apache 2.0) | Note: `Mono.Security.Cryptography/`, `System.Diagnostics.CodeAnalysis/`, `System.Runtime.CompilerServices/`, and `dotnet/` are Microsoft-owned and do not need TPN entries. 
diff --git a/Documentation/workflow/UnitTests.md b/Documentation/workflow/UnitTests.md index b5c7518adca..89926ff7c5b 100644 --- a/Documentation/workflow/UnitTests.md +++ b/Documentation/workflow/UnitTests.md @@ -393,8 +393,7 @@ public void MyAppShouldRunAndRespondToClick () There are a category of tests which run on the device itself, these test the runtime behaviour. These run `NUnit` tests directly on the device. Some of these are located in the runtime itself. We build them within the repo then run -the tests on the device. They use a custom mobile version of `NUnit` called -`NUnitLite`. For the most part they are the same. +the tests on the device. These tests are generally found in: @@ -402,19 +401,32 @@ These tests are generally found in: * [`tests/EmbeddedDSOs/EmbeddedDSO`](../../tests/EmbeddedDSOs/EmbeddedDSO) * [`tests/locales/Xamarin.Android.Locale-Tests`](../../tests/locales/Xamarin.Android.Locale-Tests) -These tests are run by using the `RunTestApp` target on the appropriate project -file, which includes: - - * `tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj` - -For example: +As of [#11224](https://github.com/dotnet/android/pull/11224), +`Mono.Android.NET-Tests` runs **stock `NUnit`** through `dotnet test` with the +[Microsoft Testing Platform (MTP)](https://learn.microsoft.com/dotnet/core/testing/microsoft-testing-platform-intro); +the previous `NUnitLite` mobile runner and the custom `RunTestApp` target have +been removed. Build and install the instrumentation app with `-t:Install`, then +run `dotnet test` **from the project directory** so the project-local +`global.json` (which sets `"runner": "Microsoft.Testing.Platform"`) is picked +up: ```zsh -./dotnet-local.sh build -t:RunTestApp tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj +# 1. 
Build + install on a connected device/emulator: +./dotnet-local.sh build -t:Install -c Release tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj + +# 2. Run the on-device tests via dotnet test (MTP): +( cd tests/Mono.Android-Tests/Mono.Android-Tests && \ + ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx ) ``` -After running the tests, a `TestResult*.xml` file will be created in the -top checkout directory containing the results of the tests. +After running the tests, a `.trx` file (VSTest format) will be created in the +test results directory containing the results of the tests. Pass +`--results-directory ` to control where it is written, and +`-p:IncludeCategories=` at **build** time to restrict the run to +specific NUnit `[Category]` names. + +> Note: the older `EmbeddedDSO` and `Xamarin.Android.Locale-Tests` apps were out +> of scope of #11224 and may still use the legacy `RunTestApp` path. The following is an example unit test. From b9fe9c4c4d5006dcf4d73b0889a1cb18b954b97e Mon Sep 17 00:00:00 2001 From: Simon Rozsival Date: Tue, 16 Jun 2026 22:28:00 +0200 Subject: [PATCH 2/7] [skills] Simplify and optimize the ci-status skill Apply the skill-creator "concise is key" principle to the ci-status skill: cut content that is obvious to the agent and keep only what is unique to dotnet/android CI. - SKILL.md 306 -> 78 lines (~70% smaller always-loaded body): drop the prerequisites table, PowerShell duplicates, step-by-step arithmetic (elapsed/median/job-counter explanations) and the verbose output template; state the repo-specific facts once (single public pipeline, Xamarin.Android-PR pr:none, test-area 404 -> az rest, gating-vs-flaky continueOnError lanes, queue-time ETA variance, logcat artifact). - Move deep-dive commands (ETA query, per-test error/stack, test-runs list, log fetch) to references/azdo-queries.md, loaded only when needed. 
Validated with skill-creator quick_validate and by re-running the lean workflow against live PRs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/skills/ci-status/SKILL.md | 300 +++--------------- .../ci-status/references/azdo-queries.md | 50 +++ 2 files changed, 86 insertions(+), 264 deletions(-) create mode 100644 .github/skills/ci-status/references/azdo-queries.md diff --git a/.github/skills/ci-status/SKILL.md b/.github/skills/ci-status/SKILL.md index 7fbb90572d3..8d342415389 100644 --- a/.github/skills/ci-status/SKILL.md +++ b/.github/skills/ci-status/SKILL.md @@ -10,297 +10,69 @@ description: > # CI Status -Check CI status and investigate build failures for dotnet/android PRs. +PR validation runs on **one public** Azure DevOps pipeline: **`dotnet-android`** on `dev.azure.com/dnceng-public` (project `public`, definition id `333`, `build-tools/automation/azure-pipelines-public.yaml`), full test matrix for **every** PR. On GitHub it shows as ~39 `dotnet-android (...)` checks plus `license/cla`, all backed by **one** build. -**Key fact:** as of [#11578](https://github.com/dotnet/android/pull/11578), dotnet/android PR validation runs on a **single public** Azure DevOps pipeline — **`dotnet-android`** on `dev.azure.com/dnceng-public` (project `public`, definition id `333`), defined by `build-tools/automation/azure-pipelines-public.yaml`. It runs the **full test matrix for every PR** — both direct and fork. The old internal DevDiv pipeline `Xamarin.Android-PR` (`azure-pipelines.yaml`) now has `pr: none` and **no longer runs on PRs**; it only builds `main`/`release/*`/`feature/*` branches and official signed builds. On GitHub the pipeline surfaces as ~39 granular `dotnet-android (...)` checks (plus `license/cla`); querying AZDO directly adds progress, ETA, and failure detail. 
+Repo-specific things you must know (everything else is standard `gh`/`az`): -## Prerequisites +- **`Xamarin.Android-PR`** (devdiv) has `pr: none` — it does NOT run on PRs. If you see that check it's a branch/official build; ignore it for PR status. +- **Fork status only changes triggering, not which pipeline runs.** Fork builds may wait for a maintainer to approve the run (and re-approve per push) via an `/azp run` comment; direct builds auto-start on push. +- The **`test` area of `az devops invoke` is broken on dnceng-public (404)** — get test results via `az rest` (below). The `build` area works, unauthenticated. `az rest` / log+artifact downloads need `az login` (else a sign-in page / 401). +- **The build `result` + GitHub check states are the source of truth — not the test API.** Device-test lanes run with `continueOnError`, so flaky failures (notably `System.NetTests.SslTest.*`, or failures only in flavor lanes like `-TrimModePartial`/`-NoAab`) appear as failed tests on otherwise-green builds. -| Tool | Check | Setup | -|------|-------|-------| -| `gh` | `gh --version` | https://cli.github.com/ | -| `az` + devops ext | `az version` | `az extension add --name azure-devops` then `az login` | - -The pipeline lives in the **public** `dnceng-public` project, so most `build` queries (status, timeline, logs) work without auth. A few `test`-area REST calls need a token — if one returns a sign-in page or 401, tell the user to run `az login` and retry. 
- -## Workflow - -### Phase 1: Quick Status (always do this first) - -#### Step 1 — Resolve the PR and detect fork status - -**No PR specified** — detect from current branch: - -```bash -gh pr view --json number,title,url,headRefName,isCrossRepository --jq '{number,title,url,headRefName,isCrossRepository}' -``` - -**PR number given** — use it directly: - -```bash -gh pr view $PR --repo dotnet/android --json number,title,url,headRefName,isCrossRepository --jq '{number,title,url,headRefName,isCrossRepository}' -``` - -If no PR exists for the current branch, tell the user and stop. - -**`isCrossRepository`** tells you whether the PR is from a fork: -- `true` → **fork PR** (external contributor) -- `false` → **direct PR** (team member, branch in dotnet/android) - -Both run the **same** `dotnet-android` pipeline with the **full test matrix** — fork status no longer changes *which* pipeline runs or *whether* tests run. It now only affects **triggering**: -- **Direct PRs:** the build starts automatically on every push. -- **Fork PRs:** the public pipeline may wait for a maintainer to approve the run (dnceng-public policy) and may need re-approval after each push. A team member can (re)trigger it by commenting `/azp run` on the PR. Until then the `dotnet-android` checks sit in a pending/expected state. - -Highlight the fork status in the output so the user understands why a build may not have started yet. - -#### Step 2 — Get GitHub check status +## Phase 1 — Status (always) ```bash -gh pr checks $PR --repo dotnet/android --json "name,state,link,bucket" 2>&1 \ - | jq '[.[] | {name, state, bucket, link}]' -``` - -```powershell -gh pr checks $PR --repo dotnet/android --json "name,state,link,bucket" | ConvertFrom-Json -``` - -Note which checks passed/failed/pending. Every `dotnet-android (...)` check `link` points at the **same** AZDO build; `license/cla` is a GitHub-side check. 
- -#### Step 3 — Get the Azure DevOps build status +ORG=https://dev.azure.com/dnceng-public; PROJECT=public -There is now a **single** AZDO build per PR: **`dotnet-android`** on `dev.azure.com/dnceng-public` (project `public`, definition id `333`), defined by `azure-pipelines-public.yaml`. It runs the full matrix for every PR — build (macOS/Windows/Linux) plus test stages (Linux Tests, MSBuild Tests, MSBuild Emulator Tests, Package/APK Tests, MAUI Tests). +# Resolve the PR (drop --repo/$PR to auto-detect from the current branch); stop if none: +gh pr view $PR --repo dotnet/android --json number,title,isCrossRepository -> `Xamarin.Android-PR` on `devdiv.visualstudio.com` no longer runs on PRs (`pr: none`). If you ever see a `Xamarin.Android-PR` check, it belongs to a branch or official build, not PR validation — ignore it for PR status. +# GitHub checks (every dotnet-android link points at the same build): +gh pr checks $PR --repo dotnet/android --json name,state,link -Set the org/project once: - -```bash -ORG_URL=https://dev.azure.com/dnceng-public -PROJECT=public -``` - -All `dotnet-android (...)` check links share one build id. Extract it from any of them: -- `https://dev.azure.com/dnceng-public/{project-guid}/_build/results?buildId={id}` - -```bash +# Shared build id: BUILD_ID=$(gh pr checks $PR --repo dotnet/android --json name,link \ - --jq '[.[] | select(.name | startswith("dotnet-android")) | .link][0]' \ - | grep -oE 'buildId=[0-9]+' | head -1 | cut -d= -f2) + --jq '[.[]|select(.name|startswith("dotnet-android")).link][0]' | grep -oE 'buildId=[0-9]+' | cut -d= -f2 | head -1) ``` -If `BUILD_ID` is empty (checks in "Expected — Waiting for status" with no build URL), the pipeline hasn't been picked up yet: -- **Fork PR:** likely awaiting maintainer approval — report "⏳ Awaiting pipeline approval — a maintainer can start it with `/azp run`." -- **Direct PR:** report "⏳ Not triggered yet — typically starts within a few minutes of a push." 
+Empty `BUILD_ID` (checks "Expected", no build URL) = pipeline not started: fork PR → "awaiting `/azp run` approval"; direct PR → "not triggered yet (starts within minutes of a push)". Report and stop. -Then stop (nothing to query yet). - -First get the overall status including start time and definition id: +Build status, then timeline (job progress + failures so far — both valid mid-build): ```bash -az devops invoke --area build --resource builds \ +az devops invoke --area build --resource builds --org $ORG \ --route-parameters project=$PROJECT buildId=$BUILD_ID \ - --org $ORG_URL \ - --query "{status:status, result:result, startTime:startTime, finishTime:finishTime, definitionId:definition.id, definitionName:definition.name}" \ - --output json 2>&1 -``` - -**Compute elapsed time:** Subtract `startTime` from the current time (or from `finishTime` if the build is complete). Present as e.g. "Ran for 2h 18m" or "Running for 42 min". + --query "{status:status, result:result, startTime:startTime, finishTime:finishTime}" -o json -Then fetch the build timeline for **all jobs** (to get progress counts) and **any failures so far** — even when the build is still in progress: - -```bash -az devops invoke --area build --resource timeline \ +az devops invoke --area build --resource timeline --org $ORG \ --route-parameters project=$PROJECT buildId=$BUILD_ID \ - --org $ORG_URL \ - --query "records[?type=='Job'] | [].{name:name, state:state, result:result}" \ - --output json 2>&1 + --query "records[?type=='Job'].{name:name, state:state, result:result}" -o json ``` -**Compute job progress counters** from the timeline response: -- Count jobs where `state == 'completed'` → **finished** -- Count jobs where `state == 'inProgress'` → **running** -- Count jobs where `state == 'pending'` → **waiting** -- Total = finished + running + waiting +Job `state` is `completed`/`inProgress`/`pending` (pending is often 0 — stages start in parallel). 
`records[?result=='failed']` gives failing stages/jobs/tasks; their `issues[]` usually carry the root cause, and the granular check names (e.g. `dotnet-android (Linux Tests Linux > Tests > MSBuild 2)`) already pinpoint the lane. -Then fetch failures: +Failed tests — `az devops invoke --area test` 404s here, so use `az rest`: ```bash -az devops invoke --area build --resource timeline \ - --route-parameters project=$PROJECT buildId=$BUILD_ID \ - --org $ORG_URL \ - --query "records[?result=='failed'] | [].{name:name, type:type, result:result, issues:issues, errorCount:errorCount, log:log}" \ - --output json 2>&1 -``` - -Check `issues` arrays first — they often contain the root cause directly. The granular GitHub checks (e.g. `dotnet-android (Linux Tests Linux > Tests > MSBuild 2)`) also pinpoint which job failed without any AZDO query. - -#### Step 3a — Estimate completion time (when build is in progress) - -Every PR runs the same full matrix (same ~38 jobs across 8 stages), but **wall-clock duration is dominated by hosted-agent queue time** and varies widely — recent green runs range from **~50 min to ~3 h+** (same stages, very different queue waits). Treat any ETA as rough. - -```bash -DEF_ID=333 -az devops invoke --area build --resource builds \ - --route-parameters project=$PROJECT \ - --org $ORG_URL \ - --query-parameters "definitions=$DEF_ID&statusFilter=completed&resultFilter=succeeded&\$top=10" \ - --query "value[].{startTime:startTime, finishTime:finishTime}" \ - --output json 2>&1 +RES=499b84ac-1321-427f-aa17-267ca6975798 # Azure DevOps app id +az rest --method get --resource $RES \ + --url "$ORG/$PROJECT/_apis/test/ResultsByBuild?buildId=$BUILD_ID&outcomes=Failed&api-version=7.1-preview" \ + --query "value[].{test:automatedTestName, runId:runId}" -o json ``` -**Compute ETA:** -1. For each recent build, calculate `duration = finishTime - startTime` -2. 
Compute the **median** (more robust than average); you may drop obvious outliers (very fast <60 min runs that barely queued, or >4 h stragglers) -3. `ETA = startTime + medianDuration` -4. Present as a rough window, e.g. "ETA: ~14:30 UTC (recent runs ≈50 min–3 h, median ~2 h)" - -If `startTime` is null (build hasn't started yet), skip the ETA and say "Build queued, not started yet". -If the build already completed, skip the ETA and show the actual duration instead. -If it has been running longer than the median, say "overdue by ~X min — likely agent queue time, not necessarily stuck". - -#### Step 3b — Check for failed tests (always do this, especially when the build is still running) - -**This step is critical when the build is in progress.** Test results are published as jobs complete, so failures may already be visible before the build finishes. Surfacing these early lets the user start fixing them immediately. - -> On `dnceng-public`, `az devops invoke --area test --resource runs` (list-by-build) is broken (404). Use `az rest` against the REST API with the Azure DevOps resource token instead: - -```bash -ADO_RESOURCE=499b84ac-1321-427f-aa17-267ca6975798 # Azure DevOps app id, for az rest auth -``` - -Get **all failed tests for the build in one call** via `ResultsByBuild`: - -```bash -az rest --method get --resource "$ADO_RESOURCE" \ - --url "$ORG_URL/$PROJECT/_apis/test/ResultsByBuild?buildId=$BUILD_ID&outcomes=Failed&api-version=7.1-preview" \ - --query "value[].{test:automatedTestName, testCase:testCaseTitle, runId:runId}" \ - --output json 2>&1 -``` - -To list the test runs for a build (e.g. 
for per-run pass/fail totals): - -```bash -az rest --method get --resource "$ADO_RESOURCE" \ - --url "$ORG_URL/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.1" \ - --query "value[].{id:id, name:name, total:totalTests, passed:passedTests}" \ - --output json 2>&1 -``` - -For the full error message / stack trace of the failed tests, list the failed results for the run (use the `runId` from `ResultsByBuild`; repeat per distinct `runId` if failures span multiple runs). Use the **list** form with `outcomes=Failed` — the single-result-by-`testId` route returns null on this org: - -```bash -az devops invoke --area test --resource results \ - --route-parameters project=$PROJECT runId=$RUN_ID \ - --org $ORG_URL \ - --query-parameters "outcomes=Failed&\$top=20" \ - --query "value[].{testName:testCaseTitle, outcome:outcome, errorMessage:errorMessage, stackTrace:stackTrace}" \ - --output json 2>&1 -``` - -> **On-device (Package/APK) test failures:** the `Package Tests` stage runs the Mono.Android instrumentation tests via stock NUnit + `dotnet test`/MTP ([#11224](https://github.com/dotnet/android/pull/11224)) and publishes results as VSTest/TRX — the queries above work unchanged (the failed `automatedTestName` is e.g. `Java.InteropTests.JnienvTest.DoNotLeakWeakReferences`). Native/JNI crashes (`UnsatisfiedLinkError`, `SIGSEGV`, `am instrument` going silent) often appear **only in logcat**: each run uploads a `logcat-.txt` (e.g. `logcat-Mono.Android.NET_Tests-Release.txt`) inside the `Test Results - APKs ...` artifact. Grab it in Phase 2 to diagnose device-test crashes. - -> **Distinguish gating failures from flaky/tolerated ones — `ResultsByBuild` alone is NOT a red signal.** The **build `result`** and the **GitHub check states** are authoritative for pass/fail. 
The device-test lanes run with `continueOnError`, so flaky failures (commonly the network-dependent `System.NetTests.SslTest.*`, or failures only in flavor lanes like `-TrimModePartial` / `-NoAab`) get published as failed test results **without failing the build**. A failure is **gating** only when its job/stage shows `result: failed` in the timeline **and** a matching ❌ GitHub check. So: if the build `result == succeeded` and all checks are green, treat any `ResultsByBuild` failures as **non-gating/flaky** and report them as a brief note — not as red CI. - -#### Step 4 — Present summary - -Use this format — a single `dotnet-android` build section with its progress and ETA: - -``` -# CI Status for PR #NNNN — "PR Title" -🔀 **Direct PR** (branch in dotnet/android) — or 🍴 **Fork PR** (external contributor) - -## GitHub Checks -| Check | Status | -|-------|--------| -| check-name | ✅ / ❌ / 🟡 | - -## dotnet-android [#BuildId](link) -**Result:** ✅ Succeeded / ❌ Failed / 🟡 In Progress -⏱️ Running for **42 min** · ETA: ~15:45 UTC (recent runs ≈50 min–3 h, median ~2 h) -📊 Jobs: **18/56 completed** · 6 running · 32 waiting - -| Stage > Job | Status | -|-------------|--------| -| Mac > macOS > Build | ✅ Succeeded | -| Linux Tests > Linux > Tests > MSBuild 2 | ❌ Failed | -| MSBuild Emulator Tests > macOS > Tests > MSBuild+Emulator 8 | 🟡 In Progress | - -### Failures (if any) -❌ Stage > Job > Task - Error: - -### Failed Tests -- **Gating** (job/stage `result: failed` + ❌ check) — must be fixed: - - `Namespace.TestClass.TestMethod` — brief error message -- **Flaky / non-gating** (build still green; e.g. `SslTest.*` or flavor-lane-only) — note, don't block: - - `System.NetTests.SslTest.HttpsShouldWork` (in `-TrimModePartial`, `-NoAab`) - -## What next? -1. View full logs / stack traces for a test failure -2. Download and analyze .binlog artifacts (+ `logcat-*.txt` for device tests) -3. 
Retry failed stages (re-run with `/azp run` on the PR) -``` - -**Progress section guidelines:** -- Always show fork status (🔀 Direct PR / 🍴 Fork PR) at the top — it only affects *triggering* now (fork builds may await approval), not which pipeline runs -- There is exactly one PR build (`dotnet-android`); do NOT look for or report a `Xamarin.Android-PR` build -- Always show elapsed time when `startTime` is available -- Show ETA when the build is in progress and historical data is available. If the build has been running longer than the median, say "overdue by ~X min" -- Show job counters as "N/Total completed · M running · P waiting" (pending may be 0 — stages start in parallel) -- If the build hasn't started yet: direct PR → "⏳ Not triggered yet — typically starts within a few minutes of a push"; fork PR → "⏳ Awaiting pipeline approval — a maintainer can start it with `/azp run`" - -**Pass/fail verdict — use the build `result` + GitHub checks, not the raw test-failure count:** -- **Build `result: failed` or any ❌ check** → CI is red. Surface the gating failures (the ❌ checks / `result: failed` timeline jobs and their tests). -- **Build still running with a gating job already `result: failed`** → highlight prominently so the user can start fixing immediately: - > ⚠️ Build still in progress, but the **Package Tests** stage has already failed — you can start investigating now. -- **Build `result: succeeded` and all checks green** → report CI **green**, even if `ResultsByBuild` lists failures. Mention any such failures as a one-line flaky/non-gating note (e.g. "2 flaky `SslTest` failures in `continueOnError` lanes — not blocking"). - -### Phase 2: Deep Investigation (only if user requests) - -Only proceed here if the user asks to investigate a specific failure, view logs, or analyze binlogs. 
- -#### Fetch logs - -Get the `log.id` from failed timeline records, then: - -```bash -az devops invoke --area build --resource logs \ - --route-parameters project=$PROJECT buildId=$BUILD_ID logId=$LOG_ID \ - --org $ORG_URL --project $PROJECT \ - --out-file "/tmp/azdo-log-$LOG_ID.log" 2>&1 -tail -40 "/tmp/azdo-log-$LOG_ID.log" -``` - -```powershell -$logFile = Join-Path $env:TEMP "azdo-log-$LOG_ID.log" -az devops invoke --area build --resource logs ` - --route-parameters project=$PROJECT buildId=$BUILD_ID logId=$LOG_ID ` - --org $ORG_URL --project $PROJECT ` - --out-file $logFile -Get-Content $logFile -Tail 40 -``` - -#### Analyze .binlog artifacts - -See [references/binlog-analysis.md](references/binlog-analysis.md) for binlog download and analysis commands. +`ResultsByBuild` returns every failed test across all runs (only `Failed`/`Aborted` are queryable). For per-test error/stack, the ETA query, or the test-runs list, see [references/azdo-queries.md](references/azdo-queries.md). -#### Categorize failures +### Verdict — judge by build `result` + checks, NOT the failed-test count -See [references/error-patterns.md](references/error-patterns.md) for dotnet/android-specific error patterns and categorization. +- **`result: failed` or any ❌ check** → red. Surface the gating failures (the ❌ checks / `result: failed` timeline jobs and their tests). If the build is still running with a job already failed, lead with that so the user can start fixing now. +- **`result: succeeded` and all checks green** → green, even if `ResultsByBuild` lists failures — those are flaky/non-gating `continueOnError` lanes. Mention them in one line; don't block. -## Error Handling +### Report -- **Build in progress:** Still query failed timeline records AND `ResultsByBuild`. Report early **gating** failures (timeline jobs with `result: failed`) alongside the in-progress status; treat `ResultsByBuild`-only failures cautiously (they may be flaky/non-gating — see the gating note in Step 3b). 
Only offer `gh pr checks --watch` if there are no gating failures yet. -- **Checks in "Expected" state (no build URL):** The `dotnet-android` pipeline hasn't started. For a **fork PR** it's likely awaiting maintainer approval — report: "⏳ Awaiting pipeline approval — a maintainer can start it with `/azp run`." For a **direct PR** it usually starts within a few minutes of a push — report: "⏳ Not triggered yet — typically starts within a few minutes of a push." -- **A `Xamarin.Android-PR` check appears:** That pipeline no longer runs on PRs (`pr: none`); if present it belongs to a branch or official build — ignore it for PR status. -- **Sign-in page / 401 on a `test`-area `az rest` call:** Tell the user to run `az login` and retry. -- **Build not found:** Verify the PR number/build ID is correct. -- **No test runs yet:** The build may not have reached the test phase. Report what's available and note that tests haven't started. +Cover: fork badge (🔀 direct / 🍴 fork), the single `dotnet-android` build (result, elapsed, jobs `N/total done · M running`), an ETA if in-progress (rough window — durations swing ~50 min to ~3 h with agent queue time; see references), failing stages/jobs, and gating vs flaky test failures. For `Package Tests` (on-device) crashes — `UnsatisfiedLinkError`, `SIGSEGV`, a silent `am instrument` — the answer is usually in `logcat-.txt` inside the `Test Results - APKs ...` artifact, not the test message. 
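Illustration only, not part of this patch: the job counters in the report can be derived from the timeline's job `state` values. This is a minimal sketch that assumes the states arrive one per line on stdin and use the three values named in the skill (`completed`, `inProgress`, `pending`); `job_progress` is a hypothetical helper name.

```bash
# Hypothetical helper: summarize job states read from stdin (one state per line).
job_progress() {
  awk '
    NF == 0            { next }        # ignore blank lines
                       { total++ }     # every non-blank line is one job
    $1 == "completed"  { finished++ }
    $1 == "inProgress" { running++ }
    $1 == "pending"    { waiting++ }
    END {
      printf "%d/%d done · %d running · %d waiting\n", finished, total, running, waiting
    }
  '
}
```

One way to feed it, assuming the timeline query accepts the global `-o tsv` output option, is to change the JMESPath to `records[?type=='Job'].state` and pipe the result into `job_progress`.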
-## Tips +## Phase 2 — Deep dive (only if asked) -- The **build `result` + GitHub check states** are the source of truth for pass/fail — the test API (`ResultsByBuild`) lists failures even on green builds (flaky `continueOnError` device-test lanes) -- Focus on the **first** error chronologically — later errors often cascade -- `.binlog` has richer detail than text logs when logs show only "Build FAILED" -- `issues` in timeline records often contain the root cause without needing to download logs -- For on-device (Package/APK) test crashes, the `logcat-*.txt` artifact is usually more informative than the test error message +- Logs, per-test error/stack, ETA, test-runs list → [references/azdo-queries.md](references/azdo-queries.md) +- `.binlog` download + analysis → [references/binlog-analysis.md](references/binlog-analysis.md) +- Categorizing a failure (real / flaky / infra) → [references/error-patterns.md](references/error-patterns.md) diff --git a/.github/skills/ci-status/references/azdo-queries.md b/.github/skills/ci-status/references/azdo-queries.md new file mode 100644 index 00000000000..2c0c1c90c31 --- /dev/null +++ b/.github/skills/ci-status/references/azdo-queries.md @@ -0,0 +1,50 @@ +# AZDO queries (dnceng-public) + +Deeper `az` commands for the `dotnet-android` build, beyond the core ones in SKILL.md. Shared setup: + +```bash +ORG=https://dev.azure.com/dnceng-public; PROJECT=public +RES=499b84ac-1321-427f-aa17-267ca6975798 # Azure DevOps app id, for `az rest --resource` +``` + +`build`-area `az devops invoke` works unauthenticated; the `test` area is broken (404) so the test data goes through `az rest`; `az rest` and artifact/log downloads need `az login`. + +## ETA for an in-progress build + +Duration is dominated by hosted-agent queue time (same ~38 jobs every run, yet ~50 min to ~3 h+). Pull recent green runs of def `333`, take the **median** duration, `ETA = startTime + median`; present it as a rough window. 
+ +```bash +az devops invoke --area build --resource builds --org $ORG \ + --route-parameters project=$PROJECT \ + --query-parameters "definitions=333&statusFilter=completed&resultFilter=succeeded&\$top=10" \ + --query "value[].{start:startTime, finish:finishTime}" -o json +``` + +## Failed-test error message / stack trace + +`ResultsByBuild` (SKILL.md) gives the names + `runId`. For messages, list the run's failed results — the single-result-by-`testId` route returns null here. Repeat per distinct `runId`: + +```bash +az devops invoke --area test --resource results --org $ORG \ + --route-parameters project=$PROJECT runId=$RUN_ID \ + --query-parameters "outcomes=Failed&\$top=20" \ + --query "value[].{test:testCaseTitle, error:errorMessage, stack:stackTrace}" -o json +``` + +## Test-runs list (per-run pass/total) + +```bash +az rest --method get --resource $RES \ + --url "$ORG/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.1" \ + --query "value[].{id:id, name:name, total:totalTests, passed:passedTests}" -o json +``` + +## Fetch a failed task's log + +Take `log.id` from a `records[?result=='failed']` timeline entry, then: + +```bash +az devops invoke --area build --resource logs --org $ORG --project $PROJECT \ + --route-parameters project=$PROJECT buildId=$BUILD_ID logId=$LOG_ID \ + --out-file "/tmp/azdo-$LOG_ID.log" +``` From 5e544e7ada5580f04d0cd355528dbd423b70da61 Mon Sep 17 00:00:00 2001 From: Simon Rozsival Date: Tue, 16 Jun 2026 22:36:14 +0200 Subject: [PATCH 3/7] [skills] Restore explicit output template in ci-status MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per review feedback: the explicit "Present summary" template produced predictable, familiar output. 
Bring back an explicit (but lean) report template — PR header + fork badge, the dotnet-android build block (result / elapsed / ETA / job counters), a Stage > Job status table, Failures, gating-vs-flaky Failed tests, Verdict, and What next — in place of the prose-only guidance from the previous commit. SKILL.md is still ~107 lines (vs 306 originally). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/skills/ci-status/SKILL.md | 33 +++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/.github/skills/ci-status/SKILL.md b/.github/skills/ci-status/SKILL.md index 8d342415389..86c42d48026 100644 --- a/.github/skills/ci-status/SKILL.md +++ b/.github/skills/ci-status/SKILL.md @@ -67,9 +67,38 @@ az rest --method get --resource $RES \ - **`result: failed` or any ❌ check** → red. Surface the gating failures (the ❌ checks / `result: failed` timeline jobs and their tests). If the build is still running with a job already failed, lead with that so the user can start fixing now. - **`result: succeeded` and all checks green** → green, even if `ResultsByBuild` lists failures — those are flaky/non-gating `continueOnError` lanes. Mention them in one line; don't block. -### Report +### Report — use this format (omit sections that don't apply) -Cover: fork badge (🔀 direct / 🍴 fork), the single `dotnet-android` build (result, elapsed, jobs `N/total done · M running`), an ETA if in-progress (rough window — durations swing ~50 min to ~3 h with agent queue time; see references), failing stages/jobs, and gating vs flaky test failures. For `Package Tests` (on-device) crashes — `UnsatisfiedLinkError`, `SIGSEGV`, a silent `am instrument` — the answer is usually in `logcat-.txt` inside the `Test Results - APKs ...` artifact, not the test message. 
+``` +# CI Status — PR #NNNN "" +🔀 Direct PR (or 🍴 Fork PR — may await `/azp run` approval) + +## dotnet-android [#<buildId>](<link>) +**Result:** ✅ Succeeded / ❌ Failed / 🟡 In Progress +⏱️ <elapsed> · ETA ~HH:MM UTC (rough — recent runs ≈50 min–3 h) ← only while in progress +📊 Jobs: <done>/<total> done · <running> running · <waiting> waiting + +| Stage > Job | Status | +|-------------|--------| +| Mac > macOS > Build | ✅ | +| Package Tests > macOS > Tests > APKs 2 | ❌ | + +### Failures ← if any +❌ <Stage> > <Job> — <first error from issues[]> + +### Failed tests ← if any +- **Gating** (must fix): `Ns.Class.Test` — <error> +- **Flaky / non-gating** (build still green; e.g. `SslTest.*`, `-TrimModePartial`/`-NoAab` lanes): `...` + +## Verdict: ✅ green / ❌ red — <one-line reason> + +## What next? +1. Logs / stack trace for a failure +2. `.binlog` (+ `logcat-*.txt` for device-test crashes) +3. Re-run a flaky/failed stage with `/azp run` +``` + +Notes: every `dotnet-android (...)` check is one job, so the Stage > Job table *is* the check list (the only non-`dotnet-android` check is `license/cla`). For `Package Tests` (on-device) crashes — `UnsatisfiedLinkError`, `SIGSEGV`, a silent `am instrument` — the cause is usually in `logcat-<testName>.txt` inside the `Test Results - APKs ...` artifact, not the test message. 
## Phase 2 — Deep dive (only if asked) From ab92a96866edea4a7dcbb2b8c254539598954d37 Mon Sep 17 00:00:00 2001 From: Simon Rozsival <simon@rozsival.com> Date: Wed, 17 Jun 2026 10:27:04 +0200 Subject: [PATCH 4/7] [skills] Add deep failure analysis to ci-status Expand the ci-status skill's first summary (Phase 1, Step 3d) with a new scripts/ci_failures.py that turns raw build failures into: - a per-test cross-config matrix: for each test that failed in >=1 config, which flavors/OSes it failed vs passed in, with same-build retries shown as "Failed->Passed (retry)" (a retry that passes => flaky), plus the assembly and the assert/stack trace - crashed / incomplete lane detection: lanes that went red with no usable failed-test list ("Zero tests ran" startup crash, incomplete run, or timeout/hang), excluding normal failed-test lanes; points at the device logcat for the started-but-never-finished culprit - branch cross-reference: PR-changed files whose name matches a failing test's class/namespace/assembly Also document the crash-culprit-from-logcat recipe in references/azdo-queries.md and fix the error-patterns.md job-timeout signal. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/skills/ci-status/SKILL.md | 162 ++++++++++--- .../ci-status/references/azdo-queries.md | 52 ++++- .../ci-status/references/error-patterns.md | 2 +- .../skills/ci-status/scripts/ci_failures.py | 213 ++++++++++++++++++ 4 files changed, 387 insertions(+), 42 deletions(-) create mode 100755 .github/skills/ci-status/scripts/ci_failures.py diff --git a/.github/skills/ci-status/SKILL.md b/.github/skills/ci-status/SKILL.md index 86c42d48026..e734b817cbf 100644 --- a/.github/skills/ci-status/SKILL.md +++ b/.github/skills/ci-status/SKILL.md @@ -10,34 +10,44 @@ description: > # CI Status -PR validation runs on **one public** Azure DevOps pipeline: **`dotnet-android`** on `dev.azure.com/dnceng-public` (project `public`, definition id `333`, `build-tools/automation/azure-pipelines-public.yaml`), full test matrix for **every** PR. On GitHub it shows as ~39 `dotnet-android (...)` checks plus `license/cla`, all backed by **one** build. +Triage CI for a `dotnet/android` PR in two phases: **Phase 1** (always) gathers status and renders the report; **Phase 2** (only when asked) drills in via the references. Run the commands verbatim — the `jq`/`az` queries are exact and fragile. -Repo-specific things you must know (everything else is standard `gh`/`az`): +Every PR runs **one** public Azure DevOps build: pipeline **`dotnet-android`** on `dev.azure.com/dnceng-public` (project `public`, definition `333`), full test matrix. It surfaces on GitHub as ~39 `dotnet-android (...)` checks plus `license/cla`, all backed by that single build. -- **`Xamarin.Android-PR`** (devdiv) has `pr: none` — it does NOT run on PRs. If you see that check it's a branch/official build; ignore it for PR status. -- **Fork status only changes triggering, not which pipeline runs.** Fork builds may wait for a maintainer to approve the run (and re-approve per push) via an `/azp run` comment; direct builds auto-start on push. 
-- The **`test` area of `az devops invoke` is broken on dnceng-public (404)** — get test results via `az rest` (below). The `build` area works, unauthenticated. `az rest` / log+artifact downloads need `az login` (else a sign-in page / 401). -- **The build `result` + GitHub check states are the source of truth — not the test API.** Device-test lanes run with `continueOnError`, so flaky failures (notably `System.NetTests.SslTest.*`, or failures only in flavor lanes like `-TrimModePartial`/`-NoAab`) appear as failed tests on otherwise-green builds. +## Pipeline facts (apply throughout) + +Everything else is standard `gh`/`az`; only these are non-obvious: + +- **Judge pass/fail by the build `result` + GitHub check states — never by the test API.** Device-test lanes run with `continueOnError`, so flaky failures (notably `System.NetTests.SslTest.*`, or failures only in flavor lanes like `-TrimModePartial`/`-NoAab`) show as failed tests on otherwise-green builds. +- **Ignore `Xamarin.Android-PR`** (devdiv): it has `pr: none` and never runs on PRs; if present, it's a branch/official build. +- **Expect a fork PR to await `/azp run` approval** (re-approved per push); direct PRs auto-start on push. Forks change only triggering, not which pipeline runs. +- **Query test results with `az rest`** — `az devops invoke --area test` 404s on dnceng-public. The `build` area works unauthenticated; `az rest` and log/artifact downloads need `az login` (else 401). ## Phase 1 — Status (always) +Run the steps in order; each `jq` reuses a file an earlier fetch saved: + +1. **Resolve the PR** and its build id — stop if none or not yet built. +2. **Fetch the build result** and save the timeline. +3. **Derive** job status (3a), per-job timing (3b), and the failing-job test breakdown (3c). +4. **Decide the verdict**, then **write the report**. 
+ ```bash ORG=https://dev.azure.com/dnceng-public; PROJECT=public +``` -# Resolve the PR (drop --repo/$PR to auto-detect from the current branch); stop if none: -gh pr view $PR --repo dotnet/android --json number,title,isCrossRepository +**Step 1 — Resolve the PR.** Drop `--repo`/`$PR` to auto-detect from the current branch: -# GitHub checks (every dotnet-android link points at the same build): +```bash +gh pr view $PR --repo dotnet/android --json number,title,isCrossRepository gh pr checks $PR --repo dotnet/android --json name,state,link - -# Shared build id: BUILD_ID=$(gh pr checks $PR --repo dotnet/android --json name,link \ --jq '[.[]|select(.name|startswith("dotnet-android")).link][0]' | grep -oE 'buildId=[0-9]+' | cut -d= -f2 | head -1) ``` -Empty `BUILD_ID` (checks "Expected", no build URL) = pipeline not started: fork PR → "awaiting `/azp run` approval"; direct PR → "not triggered yet (starts within minutes of a push)". Report and stop. +If `BUILD_ID` is empty (checks "Expected", no build URL), the pipeline hasn't started — report "awaiting `/azp run` approval" (fork) or "not triggered yet" (direct), then stop. 
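The build-id extraction in Step 1 can also be sketched in Python, which may be easier to unit-test than the `gh | grep | cut` pipeline. This is an illustrative sketch, not part of the patch: `extract_build_id` is a hypothetical helper, and the only thing it relies on is what the shell version already relies on, namely that a `dotnet-android` check link contains `buildId=<digits>`. Unlike the shell one-liner, which inspects only the first `dotnet-android` link, this version moves on to the next check when a link carries no build id.

```python
import re
from typing import Optional


def extract_build_id(checks: list) -> Optional[str]:
    """Return the build id from the first dotnet-android check link that has one.

    `checks` is the parsed output of `gh pr checks --json name,link`.
    Returns None when no link carries a build id (pipeline not started).
    """
    for check in checks:
        if not (check.get("name") or "").startswith("dotnet-android"):
            continue  # e.g. license/cla is not backed by the AZDO build
        match = re.search(r"buildId=([0-9]+)", check.get("link") or "")
        if match:
            return match.group(1)
    return None
```

A `None` result corresponds to the empty-`BUILD_ID` case described above: report that the pipeline has not started, then stop.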
-Build status, then timeline (job progress + failures so far — both valid mid-build): +**Step 2 — Fetch the build result and save the timeline** (both valid mid-build; `/tmp/tl.json` is reused by Steps 3–4): ```bash az devops invoke --area build --resource builds --org $ORG \ @@ -45,29 +55,99 @@ az devops invoke --area build --resource builds --org $ORG \ --query "{status:status, result:result, startTime:startTime, finishTime:finishTime}" -o json az devops invoke --area build --resource timeline --org $ORG \ - --route-parameters project=$PROJECT buildId=$BUILD_ID \ - --query "records[?type=='Job'].{name:name, state:state, result:result}" -o json + --route-parameters project=$PROJECT buildId=$BUILD_ID --query "records[]" -o json > /tmp/tl.json +``` + +**Step 3a — List job status, then failing records.** `state` is `completed`/`inProgress`/`pending` (pending is often 0 — stages start in parallel). Trust failing `issues[]` for the root cause; check names (e.g. `dotnet-android (Linux Tests Linux > Tests > MSBuild 2)`) already name the lane: + +```bash +jq -r '.[]|select(.type=="Job")|[(.result // .state), .name]|@tsv' /tmp/tl.json | sort +jq -r '.[]|select(.result=="failed" or .result=="canceled")|[.type,.name,((.issues//[])|map(.message)|join(" | "))]|@tsv' /tmp/tl.json ``` -Job `state` is `completed`/`inProgress`/`pending` (pending is often 0 — stages start in parallel). `records[?result=='failed']` gives failing stages/jobs/tasks; their `issues[]` usually carry the root cause, and the granular check names (e.g. `dotnet-android (Linux Tests Linux > Tests > MSBuild 2)`) already pinpoint the lane. +**Step 3b — Time every job and spell out its status.** Emit one row per job: `Status` · `Wait` (build start → job start: upstream builds + agent queue) · `Run` (execution) · `Finished` (… ago, or `running`). 
Always spell `Status` out — never a bare icon (this vocabulary is reused in the report): -Failed tests — `az devops invoke --area test` 404s here, so use `az rest`: +- `✅ Passed` · `❌ Failed` · `⏹️ Canceled` +- `⏱️ Timed out (N-min cap)` — a `canceled` job whose `issues[]` says *"ran longer than the maximum time"* (read N from the message) +- `🟡 Running` · `⏳ Queued` + +```bash +jq -r ' + def secs: sub("\\.[0-9]+";"")|fromdateiso8601; + def hms: if .==null then "—" else (./1|floor) as $s|($s/3600|floor) as $h|(($s%3600)/60|floor) as $m|($s%60) as $x| + if $h>0 then "\($h)h\(if $m<10 then "0" else "" end)\($m)m" elif $m>0 then "\($m)m\(if $x<10 then "0" else "" end)\($x)s" else "\($x)s" end end; + def reason: + ((.issues//[])|map(.message)|join(" ")) as $msg + | if .result=="succeeded" then "✅ Passed" + elif .result=="canceled" or .result=="failed" then + (if ($msg|test("maximum time of")) then ($msg|capture("maximum time of (?<m>[0-9]+) minutes")|"⏱️ Timed out (\(.m)-min cap)") + elif .result=="canceled" then "⏹️ Canceled" else "❌ Failed" end) + elif .state=="inProgress" then "🟡 Running" + elif .state=="pending" then "⏳ Queued" + else "· \(.result // .state)" end; + (now) as $now | ([.[]|select(.startTime!=null)|(.startTime|secs)]|min) as $t0 + | .[]|select(.type=="Job") + | [ reason, .name, + (if .startTime then ((.startTime|secs)-$t0|hms) else "—" end), + (if .startTime then (((.finishTime|if .==null then $now else secs end))-(.startTime|secs)|hms) else "—" end), + (if .finishTime then (($now-(.finishTime|secs))|hms)+" ago" elif .state=="inProgress" then "running" else "—" end) ] + | @tsv' /tmp/tl.json | sort -t$'\t' -k2 | column -t -s$'\t' +``` + +The `reason` function detects timeout from each job's own `issues[]`. Refine a bare `❌ Failed` with the Step 3c count: **0 failed tests ⇒ a canceled `Run tests` task or the `fail if any issues occurred` gate, not a real failure** — say so. 
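For readers who find the `jq` `reason` function dense, its status vocabulary can be restated as a small Python function. This is a sketch that mirrors the `jq` logic above and is not code from the patch: `job_status` is a hypothetical name, and the input is assumed to be one timeline record with the `result`, `state`, and `issues[]` fields the `jq` already reads.

```python
import re


def job_status(record: dict) -> str:
    """Map one timeline Job record to the spelled-out Status vocabulary."""
    result = record.get("result")
    state = record.get("state")
    # Join all issue messages, as the jq `reason` function does.
    message = " ".join(i.get("message", "") for i in (record.get("issues") or []))
    if result == "succeeded":
        return "✅ Passed"
    if result in ("canceled", "failed"):
        # A timeout is recognized from the job's own issues[], not from its result.
        cap = re.search(r"maximum time of ([0-9]+) minutes", message)
        if cap:
            return f"⏱️ Timed out ({cap.group(1)}-min cap)"
        return "⏹️ Canceled" if result == "canceled" else "❌ Failed"
    if state == "inProgress":
        return "🟡 Running"
    if state == "pending":
        return "⏳ Queued"
    return f"· {result or state}"
```

As in the `jq`, the timeout check comes before the plain canceled/failed split, so a job that hit its time cap is never reported as a bare `⏹️ Canceled`.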
+ +**Step 3c — Fetch failed tests + per-flavor counts** (two `az rest` calls; `--area test` 404s here): **(a)** failed test names + their `runId`; **(b)** every run's per-flavor counts + its phase (`unanalyzedTests`=failed, `notApplicableTests`=skipped): ```bash RES=499b84ac-1321-427f-aa17-267ca6975798 # Azure DevOps app id az rest --method get --resource $RES \ --url "$ORG/$PROJECT/_apis/test/ResultsByBuild?buildId=$BUILD_ID&outcomes=Failed&api-version=7.1-preview" \ - --query "value[].{test:automatedTestName, runId:runId}" -o json + --query "value[].{test:automatedTestName, runId:runId}" -o json > /tmp/failed.json + +az rest --method get --resource $RES \ + --url "$ORG/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.1&includeRunDetails=true" \ + --query "value[].{id:id, name:name, total:totalTests, passed:passedTests, failed:unanalyzedTests, skipped:notApplicableTests, phase:pipelineReference.phaseReference.phaseName}" -o json > /tmp/runs.json ``` -`ResultsByBuild` returns every failed test across all runs (only `Failed`/`Aborted` are queryable). For per-test error/stack, the ETA query, or the test-runs list, see [references/azdo-queries.md](references/azdo-queries.md). 
+Then build the breakdown — for each failed/canceled job, list its flavors (test runs) with `passed/total · fail · skip`, failed test names nested beneath: -### Verdict — judge by build `result` + checks, NOT the failed-test count +```bash +jq -r --slurpfile failed /tmp/failed.json --slurpfile tl /tmp/tl.json ' + [$tl[0][]|select(.type=="Phase")] as $ph + | ($ph|map(select(.result=="failed" or .result=="canceled"))|map(.refName)) as $bad + | $failed[0] as $ft + | group_by(.phase)[] | select(.[0].phase as $p|$bad|index($p)) + | .[0].phase as $p | ($ph[]|select(.refName==$p)|.name) as $job + | "### \($job) — \(map(.total)|add) tests: \(map(.passed)|add) passed, \(map(.failed)|add) failed, \(map(.skipped)|add) skipped", + (sort_by(-.failed,.name)[] + | (if .failed>0 then "❌" else "✅" end) as $m + | " \($m) \(.name) (\(.passed)/\(.total) pass, \(.failed) fail, \(.skipped) skip)", + (.id as $rid|$ft[]|select(.runId==$rid)|" ↳ \(.test)")) +' /tmp/runs.json +``` -- **`result: failed` or any ❌ check** → red. Surface the gating failures (the ❌ checks / `result: failed` timeline jobs and their tests). If the build is still running with a job already failed, lead with that so the user can start fixing now. -- **`result: succeeded` and all checks green** → green, even if `ResultsByBuild` lists failures — those are flaky/non-gating `continueOnError` lanes. Mention them in one line; don't block. +`ResultsByBuild` returns every failed test across runs (only `Failed`/`Aborted` are queryable). Matrix lanes that share one phase (e.g. `MSBuild+Emulator`) aggregate in the breakdown — use the Step 3b timing table to pinpoint the numbered job that died. For per-test error/stack, the ETA query, and the run→job mapping, see [references/azdo-queries.md](references/azdo-queries.md). 
-### Report — use this format (omit sections that don't apply) +**Step 3d — Deep failure analysis (run whenever the build is red).** From the repo root, run the bundled script — it turns raw failures into the **per-test cross-config matrix**, **crash detection**, and **branch cross-reference** the report needs (makes its own `az`/`gh` calls, needs `az login`, ~15–45 s — scales with the affected test family + retries): + +```bash +python3 .github/skills/ci-status/scripts/ci_failures.py --build-id $BUILD_ID --pr $PR +``` + +It prints three report-ready sections: +- **Cross-config matrix** — per failed test: the flavors/OSes where it **failed** vs **passed**, with same-build retries shown as `Failed→Passed (retry)` (a retry that passes ⇒ flaky), plus the assembly and the assert/stack. Failing in one flavor/OS only localizes the cause; failing across many is systemic. +- **Crashed / incomplete lanes** — lanes that went red with *no* usable failed-test list (`Zero tests ran`, an incomplete run, or a timeout/hang). The culprit (a test that **started but never finished**, or a native crash) lives only in the device **logcat**; the script prints the download+grep command (also in [references/azdo-queries.md](references/azdo-queries.md)). +- **Branch cross-reference** — PR-changed files whose name matches a failing test's class/namespace/assembly: a lead for an obvious cause. Confirm against the diff before asserting causation. + + +### Step 4 — Verdict (decide before writing). Judge by build `result` + checks, NOT the failed-test count: + +- **`result: failed`, or any ❌ check → red.** Lead with the gating failures (their jobs + tests). If the build is still running with a job already failed, surface it so the user can start fixing now. +- **`result: succeeded` and all checks green → green** — even if `ResultsByBuild` lists failures, those are flaky `continueOnError` lanes. Note them in one line; don't block. 
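The verdict rule can be written down as a tiny decision function, which makes it explicit that the failed-test count never changes the color. This is a hedged sketch, not part of the skill: `decide_verdict` and the `"undecided"` label for a still-running build are assumptions introduced here, and the two check booleans are assumed to be derived by the caller from `gh pr checks`.

```python
from typing import Optional, Tuple


def decide_verdict(build_result: Optional[str], any_check_failed: bool,
                   all_checks_green: bool, failed_test_count: int = 0) -> Tuple[str, str]:
    """Return (verdict, one-line reason) from the build result and check states.

    failed_test_count only affects the reason text, never the verdict.
    """
    if build_result == "failed" or any_check_failed:
        return "red", "gating failure: build result or a check is failing"
    if build_result == "succeeded" and all_checks_green:
        if failed_test_count:
            return "green", f"{failed_test_count} non-gating test failure(s) on continueOnError lanes"
        return "green", "build succeeded and all checks are green"
    # Assumption: anything else means the build is still in progress.
    return "undecided", "build still in progress, no gating failure yet"
```

Note that a failed check turns the verdict red even while `build_result` is still empty, which matches the guidance to surface an already-failed job before the build finishes.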
+ +### Report format + +Emit this structure (omit sections that don't apply). Spell out every `Status` per the Step 3b vocabulary, refining `❌ Failed` with the Step 3c count: ``` # CI Status — PR #NNNN "<title>" @@ -78,17 +158,29 @@ az rest --method get --resource $RES \ ⏱️ <elapsed> · ETA ~HH:MM UTC (rough — recent runs ≈50 min–3 h) ← only while in progress 📊 Jobs: <done>/<total> done · <running> running · <waiting> waiting -| Stage > Job | Status | -|-------------|--------| -| Mac > macOS > Build | ✅ | -| Package Tests > macOS > Tests > APKs 2 | ❌ | +| Stage > Job | Status | Wait | Run | Finished | +|-------------|--------|------|-----|----------| +| Mac > macOS > Build | ✅ Passed | 12m | 23m | 8h28m ago | +| Package Tests > macOS > Tests > APKs 2 | ❌ Failed — 1 test (flaky GC) | 1h42m | 1h13m | 6h12m ago | +| Package Tests > macOS > Tests > APKs 1 | ❌ Failed — 0 tests (canceled run / gate) | 1h41m | 26m31s | 7h02m ago | +| MSBuild Emulator Tests > … > MSBuild+Emulator 6 | ⏱️ Timed out (180-min cap) | 1h44m | 3h00m | 4h21m ago | +(List every job, or — for a large matrix — the failed/canceled/timed-out lanes plus the slowest few.) ### Failures ← if any ❌ <Stage> > <Job> — <first error from issues[]> -### Failed tests ← if any -- **Gating** (must fix): `Ns.Class.Test` — <error> -- **Flaky / non-gating** (build still green; e.g. 
`SslTest.*`, `-TrimModePartial`/`-NoAab` lanes): `...` +### Failed tests — cross-config (Step 3d) ← one block per failed test +**`SslWithinTasksShouldWork`** (`System.NetTests.SslTest` · `microsoft.android.run.dll`) +- ❌ failed: `NoAab` (Failed→Passed on retry), `TrimModePartial` (Failed→Passed on retry) +- ✅ passed: `Release`, `CoreCLR`, `Debug`, +4 more +- `System.Net.WebException : 503 Service Unavailable` ⇒ flaky network, non-gating + at System.NetTests.SslTest.SslWithinTasksShouldWork() + +### Crashed / incomplete lanes (Step 3d) ← if any +⚠️ **Mono.Android.NET_Tests-Debug** — `run` task succeededWithIssues, no results published ("Zero tests ran" / native crash). Name the culprit from logcat (Step 3d command). + +### Branch cross-reference (Step 3d) ← if --pr and a name overlaps +🔍 `SomeType.SomeTest` ⟵ `src/.../SomeType.cs` changed in this PR — likely cause; confirm in the diff. ## Verdict: ✅ green / ❌ red — <one-line reason> @@ -98,10 +190,12 @@ az rest --method get --resource $RES \ 3. Re-run a flaky/failed stage with `/azp run` ``` -Notes: every `dotnet-android (...)` check is one job, so the Stage > Job table *is* the check list (the only non-`dotnet-android` check is `license/cla`). For `Package Tests` (on-device) crashes — `UnsatisfiedLinkError`, `SIGSEGV`, a silent `am instrument` — the cause is usually in `logcat-<testName>.txt` inside the `Test Results - APKs ...` artifact, not the test message. +Notes: every `dotnet-android (...)` check is one job, so the Stage > Job table *is* the check list (the only non-`dotnet-android` check is `license/cla`). Step 3d's cross-config matrix is the fastest way to tell a real failure (fails across flavors/OSes, never passes on retry) from a flake (single flavor, or `Failed→Passed` on retry). For a crashed lane with no failed-test list, name the culprit from the device `logcat-<flavor>.txt` (Step 3d's command; recipe in [references/azdo-queries.md](references/azdo-queries.md)) — not the test message. 
+ +## Phase 2 — Deep dive (only when asked) -## Phase 2 — Deep dive (only if asked) +Read the matching reference, then act on it: -- Logs, per-test error/stack, ETA, test-runs list → [references/azdo-queries.md](references/azdo-queries.md) +- Logs, per-test error/stack, ETA, per-flavor breakdown fields + run→job mapping, **crash-culprit from logcat** → [references/azdo-queries.md](references/azdo-queries.md) - `.binlog` download + analysis → [references/binlog-analysis.md](references/binlog-analysis.md) -- Categorizing a failure (real / flaky / infra) → [references/error-patterns.md](references/error-patterns.md) +- Categorize a failure (real / flaky / infra) → [references/error-patterns.md](references/error-patterns.md) diff --git a/.github/skills/ci-status/references/azdo-queries.md b/.github/skills/ci-status/references/azdo-queries.md index 2c0c1c90c31..92c7acccb84 100644 --- a/.github/skills/ci-status/references/azdo-queries.md +++ b/.github/skills/ci-status/references/azdo-queries.md @@ -31,20 +31,58 @@ az devops invoke --area test --resource results --org $ORG \ --query "value[].{test:testCaseTitle, error:errorMessage, stack:stackTrace}" -o json ``` -## Test-runs list (per-run pass/total) +## Per-flavor test breakdown — fields & run → job mapping + +The breakdown in SKILL.md fetches `/tmp/runs.json` from `/_apis/test/runs?...&includeRunDetails=true`. Field meanings per run (one run = one test *flavor*, e.g. `Mono.Android.NET_Tests-NativeAOT`): + +| Field | Source | Meaning | +|-------|--------|---------| +| `total` | `totalTests` | all tests in the run | +| `passed` | `passedTests` | passed | +| `failed` | `unanalyzedTests` | failed/aborted | +| `skipped` | `notApplicableTests` | skipped / inconclusive | +| `phase` | `pipelineReference.phaseReference.phaseName` | the pipeline phase the run belongs to | + +`run.phase` equals a timeline **Phase** record's `refName`; that record's `name` is the human lane — e.g. `mac_apk_tests_net_2` → `macOS > Tests > APKs 2`. 
That join (`runs` × timeline phases) is what the breakdown `jq` does. **Matrix lanes that share one phase** (e.g. all `MSBuild+Emulator N` jobs are phase `mac_dotnetdevice_tests`) aggregate into a single breakdown block — use the per-job timing table to see which numbered job actually failed/timed out. + +Quick per-run counts without the join: ```bash az rest --method get --resource $RES \ - --url "$ORG/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.1" \ - --query "value[].{id:id, name:name, total:totalTests, passed:passedTests}" -o json + --url "$ORG/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.1&includeRunDetails=true" \ + --query "value[].{name:name, total:totalTests, passed:passedTests, failed:unanalyzedTests, skipped:notApplicableTests}" -o json ``` +To enrich the breakdown with the **actual error message** under each failed test, replace `/tmp/failed.json` with per-run results that include `errorMessage` (the "Failed-test error message" query above) — key them by `runId` the same way the breakdown's `$ft` lookup does. + ## Fetch a failed task's log -Take `log.id` from a `records[?result=='failed']` timeline entry, then: +Take `log.id` from a `records[?result=='failed']` timeline entry, then (works unauthenticated via `az rest`): + +```bash +az rest --method get --resource $RES \ + --url "$ORG/$PROJECT/_apis/build/builds/$BUILD_ID/logs/$LOG_ID?api-version=7.1" --output-file "/tmp/azdo-$LOG_ID.log" +``` + +The per-flavor `run <flavor>` task log holds the MTP summary (`Test run summary: Zero tests ran` ⇒ the app crashed at startup); the per-test lifecycle and native crash are **not** here — they are in logcat (below). + +## Crash culprit from logcat + +`scripts/ci_failures.py` flags crashed/incomplete/timed-out lanes, but the culprit test is only in the device **logcat**, published inside that lane's `Test Results - ...` build artifact (100 MB–2 GB — prefer the smaller `Debug` lane). 
Download it, then scan `logcat-<flavor>.txt`: ```bash -az devops invoke --area build --resource logs --org $ORG --project $PROJECT \ - --route-parameters project=$PROJECT buildId=$BUILD_ID logId=$LOG_ID \ - --out-file "/tmp/azdo-$LOG_ID.log" +# list artifacts + sizes to pick the failing lane: +az rest --method get --resource $RES \ + --url "$ORG/$PROJECT/_apis/build/builds/$BUILD_ID/artifacts?api-version=7.1" \ + --query "value[].{name:name, mb:(resource.properties.artifactsize)}" -o json + +az pipelines runs artifact download --run-id $BUILD_ID --org $ORG --project $PROJECT \ + --artifact-name "Test Results - APKs .NET Debug - macOS 1" --path /tmp/cilogs + +# The crasher is the LAST test that logged a start with no matching pass/fail, +# usually right before a native signal: +grep -nE 'Running |\[PASS\]|\[FAIL\]|SIGSEGV|SIGABRT|tombstone|FATAL|art::|JNI DETECTED|Process .* died' \ + /tmp/cilogs/**/logcat-*.txt | tail -60 ``` + +For a `Zero tests ran` lane the crash is at app startup (look for the first `SIGSEGV`/`tombstone`/`JNI DETECTED ERROR`, not a specific test); for a timeout the suspect is the last `Running <test>` with no result. diff --git a/.github/skills/ci-status/references/error-patterns.md b/.github/skills/ci-status/references/error-patterns.md index f705a080548..d3e0efbf764 100644 --- a/.github/skills/ci-status/references/error-patterns.md +++ b/.github/skills/ci-status/references/error-patterns.md @@ -36,7 +36,7 @@ These are CI environment issues, not code problems. 
| Network | `Unable to load the service index`, `Connection refused` | | NuGet feed | `NU1301` (feed connectivity) | | Agent issues | `The agent did not connect`, `##[error] The job was canceled` | -| Timeout (job-level) | Job canceled after 55+ minutes | +| Timeout (job-level) | `result: canceled` + `issues[]` says *"ran longer than the maximum time of N minutes"* | ## Decision Tree diff --git a/.github/skills/ci-status/scripts/ci_failures.py b/.github/skills/ci-status/scripts/ci_failures.py new file mode 100755 index 00000000000..27aac7ca653 --- /dev/null +++ b/.github/skills/ci-status/scripts/ci_failures.py @@ -0,0 +1,213 @@ +#!/usr/bin/env python3 +"""Enriched failure analysis for one dnceng-public `dotnet-android` build: + 1. cross-config matrix per failed test (failed/passed/retried configs) + stack/asserts + 2. crashed / incomplete lanes (started-but-not-finished culprit lives in logcat) + 3. branch cross-reference (PR changes that name a failing test's class/namespace/assembly) + +Needs `az login`. 
Usage: ci_failures.py --build-id N [--pr N] [--repo dotnet/android] +""" +import json, subprocess, sys, argparse, re +from collections import defaultdict +from concurrent.futures import ThreadPoolExecutor + +ORG = "https://dev.azure.com/dnceng-public" +PROJECT = "public" +RES = "499b84ac-1321-427f-aa17-267ca6975798" + + +def az_json(url): + p = subprocess.run(["az", "rest", "--method", "get", "--resource", RES, + "--url", url, "-o", "json"], capture_output=True, text=True) + if p.returncode != 0: + sys.stderr.write(f"az error {url}\n{p.stderr[:300]}\n") + return None + try: + return json.loads(p.stdout) + except json.JSONDecodeError: + return None + + +def run_results(rid): + data = az_json(f"{ORG}/{PROJECT}/_apis/test/Runs/{rid}/results?api-version=7.1&$top=5000") + out = {} + for row in (data or {}).get("value", []): + n = row.get("automatedTestName") + if n: + out[n] = (row.get("outcome"), row.get("errorMessage"), row.get("stackTrace")) + return rid, out + + +def fetch_all(rids, workers=6): + if not rids: + return {} + with ThreadPoolExecutor(max_workers=workers) as ex: + return dict(ex.map(run_results, rids)) + + +def base_of(name): + """Strip flavor/OS/index suffix so sibling configs share one base. 
+ 'Mono.Android.NET_Tests-NativeAOT' -> 'Mono.Android.NET_Tests'; + 'Xamarin.Android.Build.Tests - macOS-7' -> 'Xamarin.Android.Build.Tests'.""" + b = re.sub(r' - (macOS|Windows|Linux)(-\d+)?$', '', name) + b = re.sub(r'-[A-Za-z0-9]+$', '', b) + return b + + +# ---------------- section 1: cross-config matrix ---------------- +def section_matrix(bid, failed, runs, run_by_id): + fail_runs, storage = defaultdict(set), {} + for f in failed: + fail_runs[f["automatedTestName"]].add(f["runId"]) + storage[f["automatedTestName"]] = f.get("automatedTestStorage") + + def first_base(rids): + for r in rids: + if r in run_by_id: + return base_of(run_by_id[r]["name"]) + return "" + fam = {n: first_base(rids) for n, rids in fail_runs.items()} + cand = defaultdict(list) + for fk in set(fam.values()): + for r in runs: + if base_of(r["name"]) == fk: + cand[fk].append(r) + cache = fetch_all(list({r["id"] for fk in fam.values() for r in cand[fk]})) + + print(f"## Failed-test cross-config matrix — {len(fail_runs)} distinct test(s)\n") + for n in sorted(fail_runs): + fk = fam[n] + cfg = defaultdict(list) + for r in cand[fk]: + row = cache.get(r["id"], {}).get(n) + if row: + cfg[r["name"]].append((r.get("completedDate") or "", row[0])) + short, ns = n.rsplit(".", 1)[-1], n.rsplit(".", 1)[0] + print(f"### `{short}` ({ns})") + print(f"- assembly `{storage.get(n)}` · family `{fk}`") + fl, pa, ot = [], [], [] + for name in sorted(cfg): + outs = [o for _, o in sorted(cfg[name])] + label = name[len(fk):].lstrip(" -") or name + disp = "->".join(outs) + " (retry)" if len(set(outs)) > 1 else outs[0] + (fl if "Failed" in outs else pa if set(outs) == {"Passed"} else ot).append( + f"`{label}`" + ("" if disp == "Passed" else f" ({disp})")) + print(f"- FAILED in: {', '.join(fl) or '-'}") + print(f"- passed in: {', '.join(pa) or '-'}") + if ot: + print(f"- other: {', '.join(ot)}") + for rid in fail_runs[n]: + row = cache.get(rid, {}).get(n) + if row and row[1]: + print(f"- assert/error: 
{row[1].strip().splitlines()[0][:300]}") + if row[2]: + print(" ```") + for ln in row[2].strip().splitlines()[:6]: + print(" " + ln[:200]) + print(" ```") + break + print() + + +# ---------------- section 2: crashed / incomplete lanes ---------------- +def section_crashes(bid, runs, timeline): + recs = timeline.get("records", []) + published = {r["name"]: r for r in runs} + crashed = [] + # incomplete test runs (runner died mid-run) + for r in runs: + inc = r.get("incompleteTests") or 0 + if inc > 0: + crashed.append((r["name"], f"{inc} test(s) did not complete - runner died mid-run")) + # "run <flavor>" tasks that did not cleanly succeed AND published no (complete) results = crash/zero-tests + for rec in recs: + if rec.get("type") == "Task" and (rec.get("name") or "").startswith("run ") \ + and rec.get("result") in ("failed", "succeededWithIssues", "canceled"): + flavor = rec["name"][4:].strip() + run = published.get(flavor) + if run is None or (run.get("incompleteTests") or 0) > 0: + crashed.append((flavor, f"`run` task {rec['result']} but no complete test run published - app likely crashed ('Zero tests ran' / native crash)")) + # job-level timeouts (hang) + for rec in recs: + if rec.get("type") == "Job" and rec.get("result") == "canceled": + msg = " ".join(i.get("message", "") for i in (rec.get("issues") or [])) + m = re.search(r"maximum time of (\d+) minutes", msg) + if m: + crashed.append((rec["name"], f"timed out at {m.group(1)}-min cap - likely a hung test; last started test in logcat is the suspect")) + if not crashed: + return + print("## Crashed / incomplete lanes (!)\n") + print("These went red with **no usable failed-test list** - the culprit (a test that **started but never " + "finished**, or a native crash) is only in the device **logcat**, not the test API:\n") + seen = set() + for name, why in crashed: + if (name, why) in seen: + continue + seen.add((name, why)) + print(f"- **{name}** - {why}") + print() + print("To name the culprit, download that 
lane's logs artifact (large: 100MB-2GB - prefer the `Debug` lane) " + "and scan its logcat (see references/azdo-queries.md):\n") + print("```bash") + print(f'az pipelines runs artifact download --run-id {bid} --org {ORG} --project {PROJECT} \\') + print(' --artifact-name "Test Results - APKs .NET Debug - macOS 1" --path /tmp/cilogs') + print(r"grep -nE 'Running |\[PASS\]|\[FAIL\]|SIGSEGV|SIGABRT|tombstone|FATAL|art::|JNI DETECTED|Process .*died' " + "\\") + print(' /tmp/cilogs/**/logcat-*.txt | tail -60 # last test that STARTED with no PASS/FAIL = crasher') + print("```\n") + + +# ---------------- section 3: branch cross-reference ---------------- +def section_xref(failed, repo, pr): + names = sorted({f["automatedTestName"] for f in failed}) + if not names: + return + p = subprocess.run(["gh", "pr", "diff", str(pr), "--repo", repo, "--name-only"], + capture_output=True, text=True) + if p.returncode != 0: + sys.stderr.write(f"gh diff failed: {p.stderr[:200]}\n") + return + files = [f for f in p.stdout.splitlines() if f.strip()] + stems = {f.rsplit("/", 1)[-1].rsplit(".", 1)[0]: f for f in files} + print("## Branch cross-reference\n") + print(f"PR #{pr} changes {len(files)} file(s). Name overlaps with failing tests (judge if causal):\n") + any_hit = False + for n in names: + parts = n.split(".") + cls = parts[-2] if len(parts) >= 2 else "" + method, ns = parts[-1], ".".join(parts[:-2]) + hits = set() + for stem, path in stems.items(): + if stem and (stem == cls or stem == method or stem in ns.split(".") or (cls and cls in path)): + hits.add(path) + if hits: + any_hit = True + print(f"- `{cls}.{method}` <- {', '.join('`'+h+'`' for h in sorted(hits)[:5])}") + if not any_hit: + print("- No direct file-name overlap. 
Check whether changed runtime/build code affects the failing assembly.") + print() + + +def main(): + ap = argparse.ArgumentParser() + ap.add_argument("--build-id", required=True) + ap.add_argument("--pr") + ap.add_argument("--repo", default="dotnet/android") + args = ap.parse_args() + bid = args.build_id + + failed = (az_json(f"{ORG}/{PROJECT}/_apis/test/ResultsByBuild?buildId={bid}&outcomes=Failed&api-version=7.1-preview") or {}).get("value", []) + runs = (az_json(f"{ORG}/{PROJECT}/_apis/test/runs?buildUri=vstfs:///Build/Build/{bid}&api-version=7.1&includeRunDetails=true") or {}).get("value", []) + timeline = az_json(f"{ORG}/{PROJECT}/_apis/build/builds/{bid}/timeline?api-version=7.1") or {} + run_by_id = {r["id"]: r for r in runs} + + print(f"# Failure analysis - build {bid}\n") + if failed: + section_matrix(bid, failed, runs, run_by_id) + else: + print("_No failed tests in the test API (build may still be red via crash/timeout below)._\n") + section_crashes(bid, runs, timeline) + if args.pr: + section_xref(failed, args.repo, args.pr) + + +if __name__ == "__main__": + main() From 88f4f0410ceaacc27cbb3506ca963a330a803f1a Mon Sep 17 00:00:00 2001 From: Simon Rozsival <simon@rozsival.com> Date: Wed, 17 Jun 2026 11:38:12 +0200 Subject: [PATCH 5/7] Scope CI skill update to CI changes Remove duplicated tests skill and workflow documentation updates from this PR, leaving the focused ci-status changes and refreshed Copilot guidance. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/copilot-instructions.md | 20 ++++++----- .github/skills/tests/SKILL.md | 13 +++---- .../skills/tests/references/test-catalog.md | 14 ++++---- .github/skills/update-tpn/SKILL.md | 1 + Documentation/workflow/UnitTests.md | 34 ++++++------------- 5 files changed, 33 insertions(+), 49 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index e6f0e4c6bcb..79ffd6bedab 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -203,21 +203,23 @@ This pattern ensures proper encoding, timestamps, and file attributes are handle When diagnosing runtime, build, or test failures, follow these practices. They exist because the .NET ↔ JNI ↔ C++ ↔ generated-native stack is loosely coupled and static reasoning alone is unreliable. -- **Reproduce CI failures locally — do not iterate through CI.** A clean local test cycle is minutes; a CI iteration is hours. Run device tests the same way CI does (NUnit via `dotnet test` / MTP, see [#11224](https://github.com/dotnet/android/pull/11224)): +- **Reproduce CI failures locally — do not iterate through CI.** A clean local test cycle is minutes; a CI iteration is hours. 
Run device tests the same way CI does: ```bash make prepare && make all CONFIGURATION=Release - # Build + install the instrumentation APK on a connected device/emulator: - ./dotnet-local.sh build tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj \ - -t:Install -c Release \ + ./dotnet-local.sh build -t:Install -c Release \ + tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj \ -p:_AndroidTypeMapImplementation=<llvm-ir|managed|trimmable> \ -p:UseMonoRuntime=<true|false> - # Run the on-device tests via dotnet test (MTP), from the project dir so the - # project-local global.json ("runner": "Microsoft.Testing.Platform") applies: - ( cd tests/Mono.Android-Tests/Mono.Android-Tests && \ - ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx ) + ( + cd tests/Mono.Android-Tests/Mono.Android-Tests + ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release \ + --report-trx --results-directory ../../../bin/TestRelease/TestResults \ + -p:_AndroidTypeMapImplementation=<llvm-ir|managed|trimmable> \ + -p:UseMonoRuntime=<true|false> + ) ``` On Windows, use `build.cmd` and `dotnet-local.cmd` instead of `make`/`dotnet-local.sh`. - Results land as a `.trx` (VSTest format) in the test results directory — not `TestResult-*.xml`. + Results land in `.trx` files under `bin/TestRelease/TestResults`. - **When the build gets into a weird state, delete `bin/` and `obj/` and rebuild from scratch.** Stale incremental output causes phantom errors. See **Troubleshooting → Build** below. 
diff --git a/.github/skills/tests/SKILL.md b/.github/skills/tests/SKILL.md index e910ae9fe9d..957a5255655 100644 --- a/.github/skills/tests/SKILL.md +++ b/.github/skills/tests/SKILL.md @@ -71,18 +71,13 @@ dotnet test <project>.csproj -v minimal --filter "Name~TestName" ./dotnet-local.sh test bin/TestDebug/MSBuildDeviceIntegration/${TFM}/MSBuildDeviceIntegration.dll --filter "Name~InstallAndRunTests" ``` -### On-device runtime tests (NUnit via `dotnet test` / MTP, full-build + device) +### On-device runtime tests (NUnitLite, full-build + device) -As of [#11224](https://github.com/dotnet/android/pull/11224) these run **stock NUnit** through `dotnet test` with the Microsoft Testing Platform (MTP) — NUnitLite and the `-t:RunTestApp` target are gone. Build + install the instrumentation APK, then run `dotnet test` **from the project directory** so the project-local `global.json` (`"runner": "Microsoft.Testing.Platform"`) is picked up: +These do NOT use `dotnet test`. Use the `RunTestApp` MSBuild target: ```bash -# 1. Build + install on a connected device/emulator: -./dotnet-local.sh build -t:Install -c Release tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj - -# 2. Run the on-device tests (MTP): -( cd tests/Mono.Android-Tests/Mono.Android-Tests && \ - ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx ) +./dotnet-local.sh build -t:RunTestApp tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj ``` -Results are a **`.trx`** (VSTest format) in the test results directory — not `TestResult-*.xml`. Restrict to specific NUnit `[Category]` names with `-p:IncludeCategories=Intune` at **build** time (the old `am instrument -e` category args are gone; exclusions now flow through runtimeconfig — see `TestInstrumentation.ExcludedCategories`/`IncludedCategories`). +Results appear in `TestResult-*.xml` in the repo root. 
### Java.Interop tests Tooling tests are standalone (`dotnet test` on `.csproj`). JVM tests require the local SDK: diff --git a/.github/skills/tests/references/test-catalog.md b/.github/skills/tests/references/test-catalog.md index 691c4edcf84..87f6b599387 100644 --- a/.github/skills/tests/references/test-catalog.md +++ b/.github/skills/tests/references/test-catalog.md @@ -114,7 +114,7 @@ Device: **Yes** (most tests have `[Category("UsesDevice")]`) ## On-Device Runtime Tests (full-build — requires local SDK + device) -As of [#11224](https://github.com/dotnet/android/pull/11224), `Mono.Android.NET-Tests` runs **stock NUnit** via `dotnet test` + Microsoft Testing Platform (MTP) — NUnitLite and the `-t:RunTestApp` target are gone. (The older `locales` / `embedded DSOs` apps below were out of scope of that PR and may still use the legacy path.) +These use NUnitLite and run directly on the device via `-t:RunTestApp`. They do NOT use `dotnet test`. Build: Full-build + the test project itself Device: **Yes** @@ -136,21 +136,19 @@ Device: **Yes** ### On-device test categories -The `Mono.Android.NET-Tests.csproj` dynamically excludes categories based on runtime (now via runtimeconfig read by `TestInstrumentation`, not `am instrument -e`): +The `Mono.Android.NET-Tests.csproj` dynamically excludes categories based on runtime: - **CoreCLR runtime**: Excludes `CoreCLRIgnore`, `NTLM` - **NativeAOT runtime**: Excludes `NativeAOTIgnore`, `SSL`, `NTLM`, `Export`, `NativeTypeMap` - **LLVM**: Excludes `LLVMIgnore`, `InetAccess`, `NetworkInterfaces` -Other categories: `SSL`, `InetAccess`, `JavaList`, `RuntimeConfig`, `Intune`, `NTLM`. Restrict a run to specific categories with `-p:IncludeCategories=Intune` at **build** time. 
+Other categories: `SSL`, `InetAccess`, `JavaList`, `RuntimeConfig`, `Intune`, `NTLM` -Command (build + install, then `dotnet test` from the project dir so the project-local `global.json` MTP runner is used): +Command: ```bash -./dotnet-local.sh build -t:Install -c Release tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj -( cd tests/Mono.Android-Tests/Mono.Android-Tests && \ - ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx ) +./dotnet-local.sh build -t:RunTestApp tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj ``` -Results are a `.trx` (VSTest format) in the test results directory — not `TestResult-*.xml`. +Results appear in `TestResult-*.xml` in the repo root. --- diff --git a/.github/skills/update-tpn/SKILL.md b/.github/skills/update-tpn/SKILL.md index 75e060c2078..8bf5cc18b7a 100644 --- a/.github/skills/update-tpn/SKILL.md +++ b/.github/skills/update-tpn/SKILL.md @@ -73,6 +73,7 @@ List contents of `src-ThirdParty/` directory. Current vendored code and license | `android-platform-tools-base/` | android/platform/tools/base | https://android.googlesource.com/platform/tools/base/+/refs/heads/main/sdk-common/NOTICE (Apache 2.0) | | `bionic/` | google/bionic | https://android.googlesource.com/platform/bionic/ (Apache 2.0) | | `crc32.net/` | force-net/crc32.net | https://github.com/force-net/Crc32.NET (MIT) | +| `NUnitLite/` | nunit/nunitlite | https://github.com/nunit/nunitlite/ (MIT) | | `StrongNameSigner/` | brutaldev/StrongNameSigner | https://github.com/brutaldev/StrongNameSigner/ (Apache 2.0) | Note: `Mono.Security.Cryptography/`, `System.Diagnostics.CodeAnalysis/`, `System.Runtime.CompilerServices/`, and `dotnet/` are Microsoft-owned and do not need TPN entries. 
diff --git a/Documentation/workflow/UnitTests.md b/Documentation/workflow/UnitTests.md index 89926ff7c5b..b5c7518adca 100644 --- a/Documentation/workflow/UnitTests.md +++ b/Documentation/workflow/UnitTests.md @@ -393,7 +393,8 @@ public void MyAppShouldRunAndRespondToClick () There are a category of tests which run on the device itself, these test the runtime behaviour. These run `NUnit` tests directly on the device. Some of these are located in the runtime itself. We build them within the repo then run -the tests on the device. +the tests on the device. They use a custom mobile version of `NUnit` called +`NUnitLite`. For the most part they are the same. These tests are generally found in: @@ -401,32 +402,19 @@ These tests are generally found in: * [`tests/EmbeddedDSOs/EmbeddedDSO`](../../tests/EmbeddedDSOs/EmbeddedDSO) * [`tests/locales/Xamarin.Android.Locale-Tests`](../../tests/locales/Xamarin.Android.Locale-Tests) -As of [#11224](https://github.com/dotnet/android/pull/11224), -`Mono.Android.NET-Tests` runs **stock `NUnit`** through `dotnet test` with the -[Microsoft Testing Platform (MTP)](https://learn.microsoft.com/dotnet/core/testing/microsoft-testing-platform-intro); -the previous `NUnitLite` mobile runner and the custom `RunTestApp` target have -been removed. Build and install the instrumentation app with `-t:Install`, then -run `dotnet test` **from the project directory** so the project-local -`global.json` (which sets `"runner": "Microsoft.Testing.Platform"`) is picked -up: +These tests are run by using the `RunTestApp` target on the appropriate project +file, which includes: -```zsh -# 1. Build + install on a connected device/emulator: -./dotnet-local.sh build -t:Install -c Release tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj + * `tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj` -# 2. 
Run the on-device tests via dotnet test (MTP): -( cd tests/Mono.Android-Tests/Mono.Android-Tests && \ - ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release --report-trx ) -``` +For example: -After running the tests, a `.trx` file (VSTest format) will be created in the -test results directory containing the results of the tests. Pass -`--results-directory <dir>` to control where it is written, and -`-p:IncludeCategories=<Category>` at **build** time to restrict the run to -specific NUnit `[Category]` names. +```zsh +./dotnet-local.sh build -t:RunTestApp tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj +``` -> Note: the older `EmbeddedDSO` and `Xamarin.Android.Locale-Tests` apps were out -> of scope of #11224 and may still use the legacy `RunTestApp` path. +After running the tests, a `TestResult*.xml` file will be created in the +top checkout directory containing the results of the tests. The following is an example unit test. From 56318dd6a42f5e4e5323475dd6b96efd60bf5758 Mon Sep 17 00:00:00 2001 From: Simon Rozsival <simon@rozsival.com> Date: Wed, 17 Jun 2026 11:41:02 +0200 Subject: [PATCH 6/7] Keep device test guidance in tests PR Remove the MTP device-test reproduction snippet from this CI-status PR; that documentation now lives in the tests-skill PR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/copilot-instructions.md | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 79ffd6bedab..c8d93db9333 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -206,20 +206,13 @@ When diagnosing runtime, build, or test failures, follow these practices. They e - **Reproduce CI failures locally — do not iterate through CI.** A clean local test cycle is minutes; a CI iteration is hours. 
Run device tests the same way CI does: ```bash make prepare && make all CONFIGURATION=Release - ./dotnet-local.sh build -t:Install -c Release \ - tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj \ + ./dotnet-local.sh build tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj \ + -t:RunTestApp -c Release \ -p:_AndroidTypeMapImplementation=<llvm-ir|managed|trimmable> \ -p:UseMonoRuntime=<true|false> - ( - cd tests/Mono.Android-Tests/Mono.Android-Tests - ../../../dotnet-local.sh test Mono.Android.NET-Tests.csproj --no-build -c Release \ - --report-trx --results-directory ../../../bin/TestRelease/TestResults \ - -p:_AndroidTypeMapImplementation=<llvm-ir|managed|trimmable> \ - -p:UseMonoRuntime=<true|false> - ) ``` On Windows, use `build.cmd` and `dotnet-local.cmd` instead of `make`/`dotnet-local.sh`. - Results land in `.trx` files under `bin/TestRelease/TestResults`. + Results land in `TestResult-Mono.Android.NET_Tests-*.xml` at the repo root. - **When the build gets into a weird state, delete `bin/` and `obj/` and rebuild from scratch.** Stale incremental output causes phantom errors. See **Troubleshooting → Build** below. From 7ce322700f64f7002f9aa61f33ef5742f8df6cd4 Mon Sep 17 00:00:00 2001 From: Simon Rozsival <simon@rozsival.com> Date: Wed, 17 Jun 2026 11:45:53 +0200 Subject: [PATCH 7/7] Remove obsolete pipeline references from agent docs Keep CI guidance focused on the public dotnet-android pipeline without mentioning the retired PR pipeline path. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/copilot-instructions.md | 2 +- .github/skills/android-reviewer/SKILL.md | 1 - .github/skills/ci-status/SKILL.md | 1 - 3 files changed, 1 insertion(+), 3 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index c8d93db9333..03ba90f7706 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -192,7 +192,7 @@ This pattern ensures proper encoding, timestamps, and file attributes are handle ## CI / Build Investigation -**dotnet/android PR validation runs on a public Azure DevOps pipeline (`dotnet-android` on `dnceng-public`), not GitHub Actions.** As of #11578 it runs the full test matrix for every PR (direct and fork); the old internal `Xamarin.Android-PR` (DevDiv) pipeline no longer runs on PRs. When a user asks about CI status, CI failures, why a PR is blocked, or build errors: +**dotnet/android PR validation runs on the public Azure DevOps `dotnet-android` pipeline on `dnceng-public`, not GitHub Actions.** When a user asks about CI status, CI failures, why a PR is blocked, or build errors: 1. **ALWAYS invoke the `ci-status` skill first.** The pipeline surfaces as ~39 `dotnet-android (...)` GitHub checks, but the skill adds build progress, ETA, per-stage failures, and failed-test names that `gh pr checks` alone doesn't give you. 2. The skill auto-detects the current PR from the git branch when no PR number is given. diff --git a/.github/skills/android-reviewer/SKILL.md b/.github/skills/android-reviewer/SKILL.md index a337b1f7054..7f130fa5a68 100644 --- a/.github/skills/android-reviewer/SKILL.md +++ b/.github/skills/android-reviewer/SKILL.md @@ -57,7 +57,6 @@ Review the CI results. **Never post ✅ LGTM if any required CI check is failing - Investigate the failure using the **azdo-build-investigator** skill (for Azure DevOps pipeline failures) or GitHub Actions job logs. 
- If the failure is caused by the PR's code changes, flag it as ❌ error. - If the failure is a known infrastructure issue or pre-existing flake unrelated to the PR, note it in the summary but still use ⚠️ Needs Changes — the PR isn't mergeable until CI is green. -- All PR checks now come from the single public `dotnet-android` pipeline (dnceng-public). If you see a `Xamarin.Android-PR` check, it's a branch/official build, not PR validation — don't gate the review on it. - If the PR description acknowledges the failure and documents a dependency (e.g., "blocked on X"), note it in the summary. ### 5. Load review rules diff --git a/.github/skills/ci-status/SKILL.md b/.github/skills/ci-status/SKILL.md index e734b817cbf..2c7257f5150 100644 --- a/.github/skills/ci-status/SKILL.md +++ b/.github/skills/ci-status/SKILL.md @@ -19,7 +19,6 @@ Every PR runs **one** public Azure DevOps build: pipeline **`dotnet-android`** o Everything else is standard `gh`/`az`; only these are non-obvious: - **Judge pass/fail by the build `result` + GitHub check states — never by the test API.** Device-test lanes run with `continueOnError`, so flaky failures (notably `System.NetTests.SslTest.*`, or failures only in flavor lanes like `-TrimModePartial`/`-NoAab`) show as failed tests on otherwise-green builds. -- **Ignore `Xamarin.Android-PR`** (devdiv): it has `pr: none` and never runs on PRs; if present, it's a branch/official build. - **Expect a fork PR to await `/azp run` approval** (re-approved per push); direct PRs auto-start on push. Forks change only triggering, not which pipeline runs. - **Query test results with `az rest`** — `az devops invoke --area test` 404s on dnceng-public. The `build` area works unauthenticated; `az rest` and log/artifact downloads need `az login` (else 401).
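The "judge pass/fail by the build `result` + GitHub check states, never by the test API" rule can be sketched as a small classifier. This is only an illustration and not part of the skill's scripts: `ci_verdict`, `Verdict`, the normalized lowercase state strings, and the choice to treat anything other than `succeeded` (including still-pending checks) as gating are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    gating: bool   # True when the PR should be treated as blocked
    summary: str   # short human-readable explanation


# Hypothetical normalized check states that count as "passing" in this sketch.
PASSING_CHECK_STATES = ("success", "neutral", "skipped")


def ci_verdict(build_result, check_states, failed_tests):
    """Classify a build as gating-red, green-with-flaky-tests, or green.

    build_result -- the build's `result` value, e.g. "succeeded" or "failed"
    check_states -- normalized GitHub check states (assumed lowercase strings)
    failed_tests -- names the test API reports as failed (advisory only)
    """
    bad_checks = [s for s in check_states if s not in PASSING_CHECK_STATES]
    # The build result and the checks are authoritative.
    if build_result != "succeeded" or bad_checks:
        return Verdict(
            True,
            f"gating: build result={build_result}, "
            f"{len(bad_checks)} non-passing check(s)",
        )
    # Failed tests on an otherwise green build are reported but do not gate.
    if failed_tests:
        return Verdict(
            False,
            f"green with {len(failed_tests)} non-gating test failure(s)",
        )
    return Verdict(False, "green")
```

For example, `ci_verdict("succeeded", ["success"], ["System.NetTests.SslTest.SomeTest"])` is non-gating even though the test API lists a failure, which matches the `continueOnError` behaviour described above (the test name here is a made-up placeholder).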