Commit ec90d3d

Merge pull request #9 from AvdLee/learnings-daily-macos-cocoapods
Incorporate learnings from Daily macOS (ObjC + CocoaPods) test case
2 parents 95f3308 + 728d72a commit ec90d3d

9 files changed

Lines changed: 152 additions & 47 deletions

.github/scripts/sync-readme.js

Lines changed: 5 additions & 36 deletions
@@ -15,43 +15,12 @@ const SKILL_DIRS = [
 const beginMarker = "<!-- BEGIN SKILL STRUCTURE -->";
 const endMarker = "<!-- END SKILL STRUCTURE -->";

-const describeReference = (fileName) => {
-  const descriptions = {
-    "benchmarking-workflow.md": "Benchmark contract, clean vs incremental rules, and artifact expectations",
-    "code-compilation-checks.md": "Swift compile hotspot checks and code-level heuristics",
-    "project-audit-checks.md": "Build setting, script phase, and dependency audit checklist",
-    "spm-analysis-checks.md": "Package graph, plugin overhead, and module variant review guide",
-    "orchestration-report-template.md": "Prioritization, approval, and verification report template",
-    "fix-patterns.md": "Concrete before/after patterns for each fix category",
-  };
-  return descriptions[fileName] || "Reference file";
-};
-
 const buildTree = () => {
-  const lines = [
-    "xcode-build-optimization-agent-skill/",
-    " .claude-plugin/",
-    " marketplace.json",
-    " plugin.json",
-    " references/",
-    " benchmark-artifacts.md",
-    " build-optimization-sources.md",
-    " build-settings-best-practices.md",
-    " recommendation-format.md",
-    " schemas/",
-    " build-benchmark.schema.json",
-    " scripts/",
-    " benchmark_builds.py",
-    " diagnose_compilation.py",
-    " generate_optimization_report.py",
-    " render_recommendations.py",
-    " summarize_build_timing.py",
-    " skills/",
-  ];
+  const lines = ["skills/"];

   for (const skillDir of SKILL_DIRS) {
-    lines.push(` ${skillDir}/`);
-    lines.push(" SKILL.md");
+    lines.push(` ${skillDir}/`);
+    lines.push(" SKILL.md");
     const referencesDir = path.join(SKILLS_ROOT, skillDir, "references");
     if (!fs.existsSync(referencesDir)) {
       continue;
@@ -63,9 +32,9 @@ const buildTree = () => {
     if (references.length === 0) {
       continue;
     }
-    lines.push(" references/");
+    lines.push(" references/");
     for (const fileName of references) {
-      lines.push(` ${fileName} - ${describeReference(fileName)}`);
+      lines.push(` ${fileName}`);
     }
   }

OPTIMIZATION-CHECKS.md

Lines changed: 9 additions & 1 deletion
@@ -45,6 +45,14 @@ The `xcode-project-analyzer` audits project-level and target-level build setting
 - [Xcode Release Notes: Compilation Caching](https://developer.apple.com/documentation/xcode-release-notes/) (feature ID 149700201)
 - [Bitrise: Xcode Compilation Cache FAQ](https://docs.bitrise.io/en/bitrise-build-cache/build-cache-for-xcode/xcode-compilation-cache-faq.html)

+### CocoaPods Projects
+
+CocoaPods is deprecated. When a project uses CocoaPods, do not attempt CocoaPods-specific build optimizations (linkage mode changes, `COCOAPODS_PARALLEL_CODE_SIGN`, Podfile tweaks). These are unreliable and frequently regress build times.
+
+Recommend migrating to Swift Package Manager as the highest-impact long-term improvement. SPM advantages for build time: compilation caching works out of the box, better build parallelism from the dependency graph, no `pod install` xcconfig regeneration overhead, and native Xcode integration with full support for modern features like explicit modules.
+
+Focus analysis on first-party targets and build settings the project controls directly. Do not audit `Pods.xcodeproj` or the Podfile.
+
 ## Script Phase Analysis

 The `xcode-project-analyzer` inspects every Run Script phase in the project for missing metadata and unnecessary execution.
@@ -97,7 +105,7 @@ Even with no source edits, incremental builds incur fixed overhead. The agent me
 | `CopySwiftLibs` | Copies Swift standard libraries | Runs even when nothing changed |
 | `RegisterWithLaunchServices` | Registers the built app | Fast but always present |
 | `ProcessInfoPlistFile` | Re-processes Info.plist files | Scales with target count |
-| `ExtractAppIntentsMetadata` | Extracts App Intents metadata | Unnecessary overhead if the project does not use App Intents |
+| `ExtractAppIntentsMetadata` | Extracts App Intents metadata from all targets including third-party dependencies | Driven by Xcode, not by per-target project settings; unnecessary overhead if the project does not use App Intents but not cleanly suppressible from the repo (classify as `xcode-behavior`) |

 A zero-change build above 5 seconds on Apple Silicon typically indicates script phase overhead or excessive codesigning.

references/recommendation-format.md

Lines changed: 17 additions & 5 deletions
@@ -8,13 +8,23 @@ Each recommendation should include:

 - `title`
 - `wait_time_impact` -- plain-language statement of expected wall-clock impact, e.g. "Expected to reduce your clean build by ~3s", "Reduces parallel compile work but unlikely to reduce build wait time", or "Impact on wait time is uncertain -- re-benchmark to confirm"
+- `actionability` -- classifies how fixable the issue is from the project (see values below)
 - `category`
 - `observed_evidence`
 - `estimated_impact`
 - `confidence`
 - `approval_required`
 - `benchmark_verification_status`

+### Actionability Values
+
+Every recommendation must include an `actionability` classification:
+
+- `repo-local` -- Fix lives entirely in project files, source code, or local configuration. The developer can apply it without side effects outside the repo.
+- `package-manager` -- Requires CocoaPods or SPM configuration changes that may have broad side effects (e.g., linkage mode, dependency restructuring). These should be benchmarked before and after.
+- `xcode-behavior` -- Observed cost is driven by Xcode internals and is not suppressible from the project. Report the finding for awareness but do not promise a fix.
+- `upstream` -- Requires changes in a third-party dependency or external tool. The developer cannot fix it locally.
+
 ## Suggested Optional Fields

 - `scope`
@@ -32,6 +42,7 @@ Each recommendation should include:
 {
   "title": "Guard a release-only symbol upload script",
   "wait_time_impact": "Expected to reduce your incremental build by approximately 6 seconds.",
+  "actionability": "repo-local",
   "category": "project",
   "observed_evidence": [
     "Incremental builds spend 6.3 seconds in a run script phase.",
@@ -54,11 +65,12 @@ When rendering for human review, preserve the same field order:

 1. title
 2. wait-time impact
-3. observed evidence
-4. estimated impact
-5. confidence
-6. approval required
-7. benchmark verification status
+3. actionability
+4. observed evidence
+5. estimated impact
+6. confidence
+7. approval required
+8. benchmark verification status

 That makes it easier for the developer to approve or reject specific items quickly.

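The four `actionability` values introduced above lend themselves to a simple validation step before a recommendation is rendered. The following is a minimal sketch, not part of the commit: the function name `validate_actionability` and the constant `ACTIONABILITY_VALUES` are hypothetical, and only the four value strings come from the format document.

```python
from typing import Any, Dict, List

# The four values listed under "Actionability Values" in recommendation-format.md.
ACTIONABILITY_VALUES = {"repo-local", "package-manager", "xcode-behavior", "upstream"}


def validate_actionability(recommendation: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the field is valid.

    Hypothetical helper for illustration only.
    """
    value = recommendation.get("actionability")
    if value is None:
        return ["missing required field 'actionability'"]
    if not isinstance(value, str) or value not in ACTIONABILITY_VALUES:
        allowed = ", ".join(sorted(ACTIONABILITY_VALUES))
        return [f"unknown actionability {value!r}; expected one of: {allowed}"]
    return []


if __name__ == "__main__":
    print(validate_actionability({"title": "Guard script", "actionability": "repo-local"}))
    print(validate_actionability({"title": "Guard script"}))
```

Returning a list of problems rather than raising keeps the check usable in a report generator that wants to collect every issue before failing.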
scripts/generate_optimization_report.py

Lines changed: 4 additions & 1 deletion
@@ -395,6 +395,7 @@ def _section_recommendations(recommendations: Optional[Dict[str, Any]]) -> str:
         lines.append(f"### {i}. {title}\n")
         for field, label in [
             ("wait_time_impact", "Wait-Time Impact"),
+            ("actionability", "Actionability"),
             ("category", "Category"),
             ("observed_evidence", "Evidence"),
             ("estimated_impact", "Impact"),
@@ -427,8 +428,10 @@ def _section_approval(recommendations: Optional[Dict[str, Any]]) -> str:
         wait_impact = item.get("wait_time_impact", "")
         impact = item.get("estimated_impact", "")
         risk = item.get("risk_level", "")
+        actionability = item.get("actionability", "")
         impact_str = wait_impact if wait_impact else impact
-        lines.append(f"- [ ] **{i}. {title}** -- Impact: {impact_str} | Risk: {risk}")
+        actionability_str = f" | Actionability: {actionability}" if actionability else ""
+        lines.append(f"- [ ] **{i}. {title}** -- Impact: {impact_str}{actionability_str} | Risk: {risk}")
     return "\n".join(lines)


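To see what the changed approval line renders as, the formatting logic from `_section_approval` can be lifted into a standalone function. This is a sketch under assumptions: the wrapper `format_approval_line` is hypothetical, and reading `title` from the item with `item.get("title", "")` is a guess, since the hunk does not show how `title` is obtained.

```python
from typing import Any, Dict


def format_approval_line(index: int, item: Dict[str, Any]) -> str:
    """Build one approval checklist line, mirroring the logic in the hunk above.

    Hypothetical standalone wrapper; the real code appends to a list inside a loop.
    """
    title = item.get("title", "")  # assumption: not shown in the diff
    wait_impact = item.get("wait_time_impact", "")
    impact = item.get("estimated_impact", "")
    risk = item.get("risk_level", "")
    actionability = item.get("actionability", "")
    # Prefer the plain-language wait-time statement over the raw estimate.
    impact_str = wait_impact if wait_impact else impact
    # The actionability segment is omitted entirely when the field is absent,
    # so older recommendation files render exactly as before.
    actionability_str = f" | Actionability: {actionability}" if actionability else ""
    return f"- [ ] **{index}. {title}** -- Impact: {impact_str}{actionability_str} | Risk: {risk}"


if __name__ == "__main__":
    print(format_approval_line(1, {
        "title": "Guard script",
        "wait_time_impact": "~6s faster",
        "risk_level": "low",
        "actionability": "repo-local",
    }))
```

Because the segment is built conditionally, recommendations written before this change still produce the old `Impact: ... | Risk: ...` line.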
skills/xcode-build-fixer/SKILL.md

Lines changed: 79 additions & 0 deletions
@@ -100,6 +100,46 @@ Before applying version pin changes:
 ```
 6. Compare post-change medians to the baseline and report deltas.

+## Evaluating Regressions
+
+Not every slower number is a true regression. The fixer must evaluate the full picture before recommending a revert.
+
+### Compilation caching trade-off
+
+A change like `COMPILATION_CACHING = YES` may make a standard clean build slightly slower (cache population overhead) while making cached clean builds significantly faster. Since cached clean builds reflect the realistic developer workflow (branch switching, pulling changes, Clean Build Folder with a warm cache), a slower standard clean build paired with a faster cached clean build is a net improvement, not a regression. The same logic applies to any change where the first-time cost is higher but subsequent builds benefit.
+
+### Compare all build types
+
+Evaluate standard clean, cached clean (when available), and incremental builds together. A change is only a clear regression if it makes the metrics that matter to the developer's daily workflow worse. If only the cold-start clean build is slower but the developer rarely does truly cold builds, that may be acceptable.
+
+### Per-change vs. cumulative
+
+A single change that regresses on its own may still be worth keeping if it enables a larger cumulative improvement (e.g., enabling compilation caching makes one build type slower but unlocks cache benefits for everything else).
+
+### Best-practice settings are not revert candidates
+
+Some build settings are Apple's recommended modern defaults. These should be applied and kept regardless of whether the immediate benchmark shows a measurable improvement. They are not speculative optimizations -- they align the project with Apple's toolchain direction and will compound in value over future Xcode releases. The fixer must never recommend reverting these settings based on a single benchmark result.
+
+Best-practice settings that should always be kept once applied:
+
+- `COMPILATION_CACHING = YES` -- Apple is actively investing in this; the cache improves with each Xcode release and compounds across real workflows
+- `EAGER_LINKING = YES` (Debug) -- allows the linker to overlap with compilation
+- `SWIFT_USE_INTEGRATED_DRIVER = YES` -- eliminates inter-process scheduling overhead
+- `DEBUG_INFORMATION_FORMAT = dwarf` (Debug) -- avoids unnecessary dSYM generation
+- `SWIFT_COMPILATION_MODE = singlefile` (Debug) -- incremental recompilation
+- `ONLY_ACTIVE_ARCH = YES` (Debug) -- no reason to build all architectures locally
+
+When reporting on these settings, use language like: "Applied recommended build setting. No immediate benchmark improvement measured, but this aligns with Apple's recommended configuration and positions the project for future Xcode improvements."
+
+### When to recommend revert (speculative changes only)
+
+For changes that are not best-practice settings (e.g., source refactors, linkage experiments, script phase modifications, dependency restructuring):
+
+- If the cumulative pass shows wall-clock regression across all measured build types (standard clean, cached clean, and incremental are all slower), recommend reverting all speculative changes unless the developer explicitly asks to keep specific items for non-performance reasons.
+- For each individual speculative change: if it shows no median improvement and no cached/incremental benefit either, flag it with `Recommend revert` and the measured delta.
+- Distinguish between "outlier reduction only" (improved worst-case but not median) and "median improvement" (improved typical developer wait).
+- When a change trades off one build type for another (e.g., slower standard clean but faster cached clean), present both numbers clearly and let the developer decide. Frame it as: "Standard clean builds are X.Xs slower, but cached clean builds (the realistic daily workflow) are Y.Ys faster."
+
 ## Reporting

 Lead with the wall-clock result in plain language:
@@ -124,6 +164,45 @@ For changes valuable for non-benchmark reasons (deterministic package resolution

 Note: `COMPILATION_CACHING` has been measured at 5-14% faster clean builds across tested projects (87 to 1,991 Swift files). The benefit compounds in real developer workflows where the cache persists between builds -- branch switching, pulling changes, and CI with persistent DerivedData. The benchmark script auto-detects this setting and runs a cached clean phase for validation.

+## Execution Report
+
+After the optimization pass is complete, produce a structured execution report. This gives the developer a clear summary of what was attempted, what worked, and what the final state is.
+
+Structure:
+
+```markdown
+## Execution Report
+
+### Baseline
+- Clean build median: X.Xs
+- Cached clean build median: X.Xs (if applicable)
+- Incremental build median: X.Xs
+
+### Changes Applied
+
+| # | Change | Actionability | Measured Result | Status |
+|---|--------|---------------|-----------------|--------|
+| 1 | Description | repo-local | Clean: X.Xs→Y.Ys, Incr: X.Xs→Y.Ys | Kept / Reverted / Blocked |
+| 2 | ... | ... | ... | ... |
+
+### Final Cumulative Result
+- Clean build median: X.Xs (was Y.Ys) -- Z.Zs faster/slower
+- Cached clean build median: X.Xs (was Y.Ys) -- Z.Zs faster/slower
+- Incremental build median: X.Xs (was Y.Ys) -- Z.Zs faster/slower
+- **Net result:** Faster / Slower / Unchanged
+
+### Blocked or Non-Actionable Findings
+- Finding: reason it could not be addressed from the repo
+```
+
+Status values:
+
+- `Kept` -- Change improved or maintained build times and was kept.
+- `Kept (best practice)` -- Change is a recommended build setting; kept regardless of immediate benchmark result.
+- `Reverted` -- Change regressed build times and was reverted.
+- `Blocked` -- Change could not be applied due to project structure, Xcode behavior, or external constraints.
+- `No improvement` -- Change compiled but showed no measurable wall-time benefit. Include whether it was kept (for non-performance reasons) or reverted.
+
 ## Escalation

 If during implementation you discover issues outside this skill's scope:

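The per-change revert rules in "Evaluating Regressions" above can be read as a small decision function. The sketch below is one possible interpretation, not code from the repository: `classify_change`, `BEST_PRACTICE_SETTINGS`, and the `"Developer decision (trade-off)"` label are hypothetical names, and the sketch ignores benchmark noise, which a real implementation would need to account for.

```python
from typing import Dict, Optional

# Setting names taken from the best-practice list in the skill text.
BEST_PRACTICE_SETTINGS = {
    "COMPILATION_CACHING",
    "EAGER_LINKING",
    "SWIFT_USE_INTEGRATED_DRIVER",
    "DEBUG_INFORMATION_FORMAT",
    "SWIFT_COMPILATION_MODE",
    "ONLY_ACTIVE_ARCH",
}


def classify_change(deltas: Dict[str, float], setting: Optional[str] = None) -> str:
    """Classify one change from its per-build-type median deltas.

    `deltas` maps a build type (e.g. "clean", "cached_clean", "incremental")
    to post-change median minus baseline median in seconds; negative is faster.
    """
    if setting in BEST_PRACTICE_SETTINGS:
        # Best-practice settings are never revert candidates.
        return "Kept (best practice)"
    faster = [name for name, delta in deltas.items() if delta < 0]
    slower = [name for name, delta in deltas.items() if delta > 0]
    if faster and slower:
        # One build type improved while another regressed: show both numbers.
        return "Developer decision (trade-off)"
    if faster:
        return "Kept"
    # No median improvement on any measured build type.
    return "Recommend revert"


if __name__ == "__main__":
    print(classify_change({"clean": 1.2, "cached_clean": -3.0}))
    print(classify_change({"clean": 0.4, "incremental": 0.1}))
```

Checking the best-practice list first reflects the rule that those settings are kept even when a single benchmark shows no gain.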
skills/xcode-build-orchestrator/SKILL.md

Lines changed: 2 additions & 1 deletion
@@ -27,6 +27,7 @@ Run this phase in agent mode because the agent needs to execute builds, run benc
 1. Collect the build target context: workspace or project, scheme, configuration, destination, and current pain point. When both `.xcworkspace` and `.xcodeproj` exist, prefer `.xcodeproj` unless the workspace contains sub-projects required for the build. Workspaces that reference external projects may fail if those projects are not checked out.
 2. Run `xcode-build-benchmark` to establish a baseline if no fresh benchmark exists. The benchmark script auto-detects `COMPILATION_CACHING = YES` and includes cached clean builds that measure the realistic developer experience (warm cache). If the build fails to compile, check `git log` for a recent buildable commit. When working in a worktree, cherry-picking a targeted build fix from a feature branch is acceptable to reach a buildable state. If SPM packages reference gitignored directories in their `exclude:` paths (e.g., `__Snapshots__`), create those directories before building -- worktrees do not contain gitignored content and `xcodebuild -resolvePackageDependencies` will crash otherwise.
 3. Verify the benchmark artifact has non-empty `timing_summary_categories`. If empty, the timing summary parser may have failed -- re-parse the raw logs or inspect them manually. If `COMPILATION_CACHING` is enabled, also verify the artifact includes `cached_clean` runs.
+- **Benchmark confidence check**: For each build type (clean, cached clean, incremental), compare the min and max values. If the spread (max - min) exceeds 20% of the median, flag the benchmark as having high variance and recommend running additional repetitions (5+ runs) before drawing conclusions. High variance makes it difficult to distinguish real improvements from noise. After applying changes, only claim an improvement if the post-change median falls outside the baseline's min-max range.
 4. If incremental builds are the primary pain point and Xcode 16.4+ is available, recommend the developer enable **Task Backtraces** (Scheme Editor > Build tab > Build Debugging > "Task Backtraces"). This reveals why each task re-ran, which is critical for diagnosing unexpected replanning or input invalidation. Include any Task Backtrace evidence in the analysis.
 5. Determine whether compile tasks are likely blocking wall-clock progress or just consuming parallel CPU time. Compare the sum of all timing-summary category seconds against the wall-clock median: if the sum is 2x+ the median, most work is parallelized and compile hotspot fixes are unlikely to reduce wait time. If `SwiftCompile`, `CompileC`, `SwiftEmitModule`, or `Planning Swift module` dominate the timing summary **and** appear likely to be on the critical path, run `diagnose_compilation.py` to capture type-checking hotspots. If they are parallelized, still run diagnostics but label findings as "parallel efficiency improvements" rather than "build time improvements."
 6. Run the specialist analyses that fit the evidence by reading each skill's SKILL.md and applying its workflow:
@@ -104,7 +105,7 @@ Lead with the wall-clock result in plain language, e.g.: "Your clean build now t
 - absolute and percentage wall-clock deltas
 - what changed
 - what was intentionally left unchanged
-- confidence notes if noise prevents a strong conclusion
+- confidence notes if noise prevents a strong conclusion -- if benchmark variance is high (min-to-max spread exceeds 20% of median), say so explicitly rather than presenting noisy numbers as definitive improvements or regressions
 - if cumulative task metrics improved but wall-clock did not, say plainly: "Compiler workload decreased but build wait time did not improve. This is expected when Xcode runs these tasks in parallel with other equally long work."
 - a ready-to-paste community results row and a link to open a PR (see the report template)

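The benchmark confidence check added to the orchestrator skill above reduces to two comparisons, sketched here. The function names `has_high_variance` and `is_real_improvement` are hypothetical and not part of the repository's scripts. Reading "falls outside the baseline's min-max range" as "below the baseline minimum" is an interpretation for the improvement case; a post-change median above the baseline maximum would indicate a regression instead.

```python
from statistics import median
from typing import Sequence


def has_high_variance(runs: Sequence[float], threshold: float = 0.20) -> bool:
    """True when the spread (max - min) exceeds `threshold` times the median."""
    if not runs:
        raise ValueError("at least one benchmark run is required")
    return (max(runs) - min(runs)) > threshold * median(runs)


def is_real_improvement(baseline: Sequence[float], post_change: Sequence[float]) -> bool:
    """Claim an improvement only if the post-change median beats every baseline run."""
    if not baseline or not post_change:
        raise ValueError("both run lists must be non-empty")
    return median(post_change) < min(baseline)


if __name__ == "__main__":
    baseline_clean = [10.0, 10.5, 13.0]   # seconds; illustrative numbers only
    post_clean = [9.5, 9.6, 9.8]
    if has_high_variance(baseline_clean):
        print("High variance: run 5+ repetitions before drawing conclusions.")
    print("Improvement:", is_real_improvement(baseline_clean, post_clean))
```

Applying the same two checks separately to clean, cached clean, and incremental runs matches the "for each build type" wording in the skill.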
skills/xcode-build-orchestrator/references/orchestration-report-template.md

Lines changed: 20 additions & 1 deletion
@@ -90,14 +90,33 @@ After implementing approved changes, re-benchmark with the same inputs:
 Compare the new wall-clock medians against the baseline. Report results as:
 "Your [clean/incremental] build now takes X.Xs (was Y.Ys) -- Z.Zs faster/slower."

-## Verification (post-approval)
+## Execution Report (post-approval)

+### Baseline
+- Clean build median: X.Xs
+- Cached clean build median: X.Xs (if applicable)
+- Incremental build median: X.Xs
+
+### Changes Applied
+
+| # | Change | Actionability | Measured Result | Status |
+|---|--------|---------------|-----------------|--------|
+| 1 | Description of change | repo-local | Clean: X.Xs→Y.Ys, Incr: X.Xs→Y.Ys | Kept / Reverted / Blocked |
+| 2 | ... | ... | ... | ... |
+
+Status values: `Kept`, `Kept (best practice)`, `Reverted`, `Blocked`, `No improvement`
+
+### Final Cumulative Result
 - Post-change clean build: X.Xs (was Y.Ys) -- Z.Zs faster/slower
 - Post-change cached clean build: X.Xs (was Y.Ys) -- Z.Zs faster/slower (when COMPILATION_CACHING enabled)
 - Post-change incremental build: X.Xs (was Y.Ys) -- Z.Zs faster/slower
+- **Net result:** Faster / Slower / Unchanged
 - If cumulative task metrics improved but wall-clock did not: "Compiler workload decreased but build wait time did not improve. This is expected when Xcode runs these tasks in parallel with other equally long work."
 - If standard clean builds are slower but cached clean builds are faster: "Standard clean builds show overhead from compilation cache population. Cached clean builds (the realistic developer workflow) are faster, confirming the net benefit."

+### Blocked or Non-Actionable Findings
+- Finding: reason it could not be addressed from the repo
+
 ## Remaining follow-up ideas
 - Item:
 - Why it was deferred:
