| 1 | +# Publisher Full-Scope Enrichment Prioritization Plan |
| 2 | + |
| 3 | +**Date:** 2026-03-12 |
| 4 | +**Author:** Codex (GPT-5) |
| 5 | +**Purpose:** define the recommended one-by-one execution order for full-scope publisher enrichment across the current `OSHConnect-Python` public-data fleet. |
| 6 | +**Scope:** `nws`, `ndbc`, `coops`, `aviation_wx`, `opensky`, `iss`, `usgs_water`, `usgs_nims`, `usgs_eq` |
| 7 | +**Relationship to prior analysis:** this plan operationalizes the findings in `All_Bootstraps_Full_Scope_Gap_Analysis_2026-03-12.md` and its appendices. |
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | +## 1. Executive Decision |
| 12 | + |
| 13 | +Do not enrich the publishers in alphabetical order, by repo age, or solely by which one looks weakest. |
| 14 | + |
| 15 | +The recommended sequence is: |
| 16 | + |
| 17 | +1. `ISS` |
| 18 | +2. `USGS Water` |
| 19 | +3. `NWS` |
| 20 | +4. `USGS EQ` |
| 21 | +5. `OpenSky` |
| 22 | +6. `NDBC` |
| 23 | +7. `USGS NIMS` |
| 24 | +8. `CO-OPS` |
| 25 | +9. `Aviation WX` |
| 26 | + |
| 27 | +This is not a pure maturity ranking. It is a `rework-minimizing, semantics-first execution order` designed to: |
| 28 | + |
| 29 | +- close the biggest current-fleet contradiction first; |
| 30 | +- resolve dependency roots before dependent publishers; |
| 31 | +- establish canonical templates before catch-up work; |
| 32 | +- convert the strongest current ideas into reusable patterns early; |
| 33 | +- delay the lowest-leverage catch-up work until the family conventions are already stable. |
| 34 | + |
| 35 | +--- |
| 36 | + |
| 37 | +## 2. Prioritization Logic |
| 38 | + |
| 39 | +The order above was chosen using six criteria. |
| 40 | + |
| 41 | +### 2.1 Canonical completeness |
| 42 | + |
| 43 | +If a publisher slot is supposed to exist in the active fleet but is materially incomplete, that gap should be closed early because it corrupts the credibility of the whole corpus. |
| 44 | + |
| 45 | +This is why `ISS` is first. |
| 46 | + |
| 47 | +### 2.2 Dependency roots |
| 48 | + |
| 49 | +If one publisher depends on another publisher's systems, semantics, or sidecars, the upstream publisher should usually be enriched first. |
| 50 | + |
| 51 | +This is why `USGS Water` must precede `USGS NIMS`. |
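The dependency-root rule can be checked mechanically. The following is a minimal sketch, not part of the existing repo: the `DEPENDS_ON` map and the `order_violations` helper are hypothetical names, and the edges are an assumed encoding of the "Main Dependency / Leverage" column in section 4.

```python
# Hypothetical dependency map (an assumption for illustration): each publisher
# lists the upstream publishers whose semantics or templates it relies on.
DEPENDS_ON = {
    "USGS NIMS": {"USGS Water"},
    "OpenSky": {"USGS EQ"},
    "NDBC": {"NWS"},
    "CO-OPS": {"NWS", "NDBC"},
}

# The sequence recommended in section 1 of this plan.
RECOMMENDED_ORDER = [
    "ISS", "USGS Water", "NWS", "USGS EQ", "OpenSky",
    "NDBC", "USGS NIMS", "CO-OPS", "Aviation WX",
]


def order_violations(order, depends_on):
    """Return (publisher, upstream) pairs where upstream is not scheduled earlier."""
    position = {name: index for index, name in enumerate(order)}
    violations = []
    for publisher, upstreams in depends_on.items():
        for upstream in sorted(upstreams):
            if position[upstream] >= position[publisher]:
                violations.append((publisher, upstream))
    return violations


if __name__ == "__main__":
    # An empty list means every upstream publisher is enriched first.
    print(order_violations(RECOMMENDED_ORDER, DEPENDS_ON))
```

A check like this could run whenever the sequence is revised, so that a reordering which moves `USGS NIMS` ahead of `USGS Water` is flagged immediately.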
| 52 | + |
| 53 | +### 2.3 Pattern-template leverage |
| 54 | + |
| 55 | +The first few enrichments should define reusable family standards: |
| 56 | + |
| 57 | +- canonical station-family enrichment expectations; |
| 58 | +- canonical Pattern C feed-adapter expectations; |
| 59 | +- canonical Pattern A companion-datastream expectations. |
| 60 | + |
| 61 | +This is why `NWS`, `USGS EQ`, and `OpenSky` are early, and why `CO-OPS` and `Aviation WX` are late. |
| 62 | + |
| 63 | +### 2.4 Artifact-state risk |
| 64 | + |
| 65 | +Publishers whose current artifact state is contradictory, misleading, or absent should be addressed before publishers that are already materially present and inspectable. |
| 66 | + |
| 67 | +This is why `ISS` and `USGS Water` are ahead of already-packaged publishers like `USGS NIMS` and `USGS EQ`. |
| 68 | + |
| 69 | +### 2.5 Marginal fleet value |
| 70 | + |
| 71 | +An enrichment should be prioritized earlier if finishing it gives the rest of the fleet a better template, stronger test discipline, or clearer vocabulary. |
| 72 | + |
| 73 | +This is why `NWS` is ahead of `CO-OPS` and `Aviation WX`, and why `USGS EQ` is ahead of a second-pass refinement on already-strong `USGS NIMS`. |
| 74 | + |
| 75 | +### 2.6 Catch-up should come after template stabilization |
| 76 | + |
| 77 | +The thinnest publishers should not be first if their enrichment would otherwise be designed in a vacuum. |
| 78 | + |
| 79 | +This is why `CO-OPS` and especially `Aviation WX` are late-phase work rather than first-phase work. |
| 80 | + |
| 81 | +--- |
| 82 | + |
| 83 | +## 3. Phase 0 Guardrails |
| 84 | + |
| 85 | +Before the first publisher enrichment begins, define these fleet-level guardrails once and then reuse them for every publisher: |
| 86 | + |
| 87 | +1. a standard `full-scope package contract` |
| 88 | + - required folders |
| 89 | + - required manifests |
| 90 | + - required worked examples |
| 91 | + - required patch candidates |
| 92 | + - required source-corpus notes |
| 93 | +2. a standard `semantic acceptance checklist` |
| 94 | + - procedure |
| 95 | + - system |
| 96 | + - datastream |
| 97 | + - deployment |
| 98 | + - feature-of-interest |
| 99 | + - provenance |
| 100 | + - units and null semantics |
| 101 | +3. a standard `round-trip verification checklist` |
| 102 | + - POST |
| 103 | + - GET back |
| 104 | + - SensorML inspection |
| 105 | + - result-schema inspection |
| 106 | + - observation write-path verification |
| 107 | +4. a standard `runtime hardening checklist` |
| 108 | + - TLS verification |
| 109 | + - auth handling |
| 110 | + - retry and dedupe policy |
| 111 | + - logging and observability |
| 112 | +5. a standard `artifact state label` |
| 113 | + - `metadata pack` |
| 114 | + - `total pack` |
| 115 | + - `source basis` |
| 116 | + - `historical artifact` |
| 117 | + - `migration artifact` |
| 118 | + |
| 119 | +This preflight matters because, without it, the first three enrichments would each reinvent the package shape and acceptance criteria. |
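One way to make the package contract and the artifact state label enforceable is to express them in code. This is a sketch under stated assumptions: the enum values come from the list above, but the folder names in `REQUIRED_FOLDERS` and the `missing_contract_items` helper are hypothetical and do not describe the current repo layout.

```python
from enum import Enum
from pathlib import Path


class ArtifactState(Enum):
    """The five artifact state labels listed in the guardrails."""

    METADATA_PACK = "metadata pack"
    TOTAL_PACK = "total pack"
    SOURCE_BASIS = "source basis"
    HISTORICAL_ARTIFACT = "historical artifact"
    MIGRATION_ARTIFACT = "migration artifact"


# Hypothetical folder names (assumptions), one per required contract item.
REQUIRED_FOLDERS = ("manifests", "worked_examples", "patch_candidates", "source_corpus")


def missing_contract_items(package_root: Path) -> list[str]:
    """List required folders that are absent or empty under package_root."""
    missing = []
    for name in REQUIRED_FOLDERS:
        folder = package_root / name
        # An empty folder does not count as "materially present".
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing
```

Treating an empty folder as missing is a deliberate choice in this sketch: it guards against a package that claims a contract item without providing any content for it.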
| 120 | + |
| 121 | +--- |
| 122 | + |
| 123 | +## 4. Ordered Publisher Plan |
| 124 | + |
| 125 | +| Rank | Publisher | Why This Position | Primary Outcome | Main Dependency / Leverage | |
| 126 | +|---|---|---|---|---| |
| 127 | +| 1 | ISS | The active fleet is currently incomplete; the bootstrap slot is missing even though the runtime and README imply it exists. | Migrate ISS into a canonical current bootstrap and package. | Removes the single clearest architecture contradiction in the fleet. | |
| 128 | +| 2 | USGS Water | It is a dependency root for NIMS and currently has an artifact-state mismatch. | Materialize the missing total package and make water the canonical USGS station reference. | Unblocks clean NIMS follow-on work and repairs repo-state trust. | |
| 129 | +| 3 | NWS | It is the best place to formalize station-family round-trip SensorML acceptance criteria. | Convert strong pack work into a canonical live-plus-package station reference. | Sets the first real station-family enrichment template. | |
| 130 | +| 4 | USGS EQ | It is already one of the strongest full packages and can define the canonical Pattern C total-pack standard. | Harden Pattern C lifecycle, provenance, and event semantics. | Establishes a reusable feed-adapter template. | |
| 131 | +| 5 | OpenSky | It is the other major Pattern C publisher and already has strong metadata foundations. | Graduate OpenSky from metadata-pack maturity to total-pack maturity. | Reuses the Pattern C template established in USGS EQ. | |
| 132 | +| 6 | NDBC | It is the richest station-family multi-stream case and should become the canonical station-plus-imagery reference. | Convert NDBC into the canonical multi-stream station publisher package. | Reuses station-family conventions from NWS and informs imagery semantics for later work. | |
| 133 | +| 7 | USGS NIMS | It is already strong, but its dependency semantics should be refined only after water and family templates are stable. | Formalize companion-datastream dependency semantics and policy constraints. | Depends on clean USGS Water semantics and benefits from earlier imagery-pattern work. | |
| 134 | +| 8 | CO-OPS | It lacks a pack, but its enrichment should borrow already-proven station-family structure rather than invent its own. | Create a full package and sharpen coastal product semantics. | Reuses station-family scaffolding from NWS and NDBC. | |
| 135 | +| 9 | Aviation WX | It is the thinnest current publisher, but also the one with the least leverage on the rest of the fleet. | Bring Aviation WX up to full-package parity without using it as the template setter. | Best done after station-family conventions are already frozen. | |
| 136 | + |
| 137 | +--- |
| 138 | + |
| 139 | +## 5. Recommended Waves |
| 140 | + |
| 141 | +### Wave 1. Canonical blockers and dependency roots |
| 142 | + |
| 143 | +1. `ISS` |
| 144 | +2. `USGS Water` |
| 145 | +3. `NWS` |
| 146 | + |
| 147 | +This wave fixes the biggest fleet contradiction, repairs the most important current artifact mismatch, and establishes the first canonical station-family enrichment method. |
| 148 | + |
| 149 | +### Wave 2. Canonical family references |
| 150 | + |
| 151 | +4. `USGS EQ` |
| 152 | +5. `OpenSky` |
| 153 | +6. `NDBC` |
| 154 | +7. `USGS NIMS` |
| 155 | + |
| 156 | +This wave completes the strongest reusable family references: |
| 157 | + |
| 158 | +- Pattern C total-package reference; |
| 159 | +- second Pattern C implementation; |
| 160 | +- multi-stream station-family reference; |
| 161 | +- Pattern A companion-datastream reference. |
| 162 | + |
| 163 | +### Wave 3. Catch-up and parity |
| 164 | + |
| 165 | +8. `CO-OPS` |
| 166 | +9. `Aviation WX` |
| 167 | + |
| 168 | +This wave uses already-proven conventions to bring the remaining station-family publishers to full-package parity with much lower design risk. |
| 169 | + |
| 170 | +--- |
| 171 | + |
| 172 | +## 6. Publisher-by-Publisher Intent |
| 173 | + |
| 174 | +### 6.1 ISS |
| 175 | + |
| 176 | +**Why first** |
| 177 | + |
| 178 | +- The current fleet is semantically incomplete while ISS remains a runtime without a current bootstrap. |
| 179 | +- The migration path is unusually well defined because `scripts/bootstrap_iss.py` already exists as a strong precedent. |
| 180 | +- The project should not start a long enrichment program while one advertised active publisher is still missing its bootstrap artifact. |
| 181 | + |
| 182 | +**What "full-scope enrichment" means here** |
| 183 | + |
| 184 | +- migrate `bootstrap_iss.py` into `publishers/iss/` |
| 185 | +- modernize it to current helper-layer and env conventions |
| 186 | +- produce a current ISS package, not just a migrated script |
| 187 | +- preserve the dual-product model and rich SensorML |
| 188 | +- remove every legacy credential, TLS, and hardcoded-endpoint anti-pattern |
| 189 | + |
| 190 | +**Exit condition** |
| 191 | + |
| 192 | +ISS is no longer a special-case hole in the active fleet. |
| 193 | + |
| 194 | +### 6.2 USGS Water |
| 195 | + |
| 196 | +**Why second** |
| 197 | + |
| 198 | +- `USGS NIMS` depends on it structurally |
| 199 | +- the current repo has a claimed total pack that is not actually present on disk |
| 200 | +- the water publisher is already semantically stronger than its artifact state suggests |
| 201 | + |
| 202 | +**What "full-scope enrichment" means here** |
| 203 | + |
| 204 | +- materialize the missing total package |
| 205 | +- reconcile research-note claims with actual on-disk artifacts |
| 206 | +- preserve and extend the strong parameter/statistic semantics |
| 207 | +- formalize datum, QC, null, and feature-of-interest semantics more explicitly |
| 208 | + |
| 209 | +**Exit condition** |
| 210 | + |
| 211 | +USGS Water becomes the canonical USGS station-family base that NIMS can depend on without ambiguity. |
| 212 | + |
| 213 | +### 6.3 NWS |
| 214 | + |
| 215 | +**Why third** |
| 216 | + |
| 217 | +- NWS is the best place to institutionalize the lessons from the historical SensorML field-shape failure |
| 218 | +- it already has substantial adjacent metadata-pack work |
| 219 | +- it can become the station-family proof point for round-trip validation discipline |
| 220 | + |
| 221 | +**What "full-scope enrichment" means here** |
| 222 | + |
| 223 | +- converge the best metadata-pack content into a live canonical package |
| 224 | +- codify station-family semantic-contract conventions |
| 225 | +- add explicit round-trip verification expectations |
| 226 | +- deepen QC, null, and feature-of-interest semantics |
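A round-trip expectation could take the form of a field-level comparison between what was sent and what the server returns. The sketch below is illustrative only: `round_trip_diff` and `InMemoryServer` are hypothetical names, the `post` and `get` callables stand in for whatever client the publisher actually uses, and the payload fields are invented examples.

```python
from typing import Any, Callable, Optional


def round_trip_diff(
    payload: dict[str, Any],
    post: Callable[[dict[str, Any]], str],
    get: Callable[[str], dict[str, Any]],
) -> dict[str, tuple[Any, Any]]:
    """POST payload, GET it back, and return fields that changed or vanished."""
    resource_id = post(payload)
    returned = get(resource_id)
    return {
        key: (sent, returned.get(key))
        for key, sent in payload.items()
        if returned.get(key) != sent
    }


class InMemoryServer:
    """Stand-in server for the sketch; it can silently drop one field on write."""

    def __init__(self, dropped_field: Optional[str] = None) -> None:
        self._store: dict[str, dict[str, Any]] = {}
        self._dropped_field = dropped_field

    def post(self, payload: dict[str, Any]) -> str:
        resource_id = str(len(self._store) + 1)
        self._store[resource_id] = {
            key: value for key, value in payload.items() if key != self._dropped_field
        }
        return resource_id

    def get(self, resource_id: str) -> dict[str, Any]:
        return dict(self._store[resource_id])
```

An empty result means the resource survived the round trip intact. A non-empty result names each field that was altered or lost, which is the kind of evidence a station-family acceptance check could require.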
| 227 | + |
| 228 | +**Exit condition** |
| 229 | + |
| 230 | +NWS becomes the reference station-family enrichment against which later station publishers are judged. |
| 231 | + |
| 232 | +### 6.4 USGS EQ |
| 233 | + |
| 234 | +**Why fourth** |
| 235 | + |
| 236 | +- it is already one of the strongest current total packages |
| 237 | +- it can define the canonical Pattern C total-package template with the least uncertainty |
| 238 | +- its lifecycle, provenance, and crosswalk semantics are unusually rich and reusable |
| 239 | + |
| 240 | +**What "full-scope enrichment" means here** |
| 241 | + |
| 242 | +- turn strong current package materials into the authoritative Pattern C template |
| 243 | +- deepen summary/detail/FDSN crosswalk semantics |
| 244 | +- formalize lifecycle, supersession, and quality semantics |
| 245 | +- define the feed-adapter artifact and acceptance pattern other Pattern C publishers should follow |
| 246 | + |
| 247 | +**Exit condition** |
| 248 | + |
| 249 | +USGS EQ becomes the canonical Pattern C reference package. |
| 250 | + |
| 251 | +### 6.5 OpenSky |
| 252 | + |
| 253 | +**Why fifth** |
| 254 | + |
| 255 | +- OpenSky is already a strong Pattern C implementation, but it still trails USGS EQ in package maturity |
| 256 | +- once USGS EQ defines the full Pattern C standard, OpenSky can be upgraded without inventing a second incompatible model |
| 257 | + |
| 258 | +**What "full-scope enrichment" means here** |
| 259 | + |
| 260 | +- graduate from metadata pack to total package |
| 261 | +- formalize auth-aware and budget-aware operational semantics |
| 262 | +- deepen data quality, provenance, and coverage semantics |
| 263 | +- align its package shape to the Pattern C standard frozen in the previous step |
| 264 | + |
| 265 | +**Exit condition** |
| 266 | + |
| 267 | +The fleet has two aligned Pattern C references rather than one strong feed-adapter and one separate event-feed model. |
| 268 | + |
| 269 | +### 6.6 NDBC |
| 270 | + |
| 271 | +**Why sixth** |
| 272 | + |
| 273 | +- NDBC is the richest current station-family multi-stream case |
| 274 | +- it benefits from earlier station-family and Pattern C decisions |
| 275 | +- it is the best place to formalize how imagery-related semantics coexist with a fixed-station observation model |
| 276 | + |
| 277 | +**What "full-scope enrichment" means here** |
| 278 | + |
| 279 | +- convert the existing metadata pack into a total-package-grade artifact |
| 280 | +- deepen the buoy-plus-imagery relationship model |
| 281 | +- sharpen QC, null, and provenance semantics |
| 282 | +- decide whether NDBC becomes the canonical "multi-stream station" reference |
| 283 | + |
| 284 | +**Exit condition** |
| 285 | + |
| 286 | +The station family now has a strong simple reference and a strong multi-stream reference. |
| 287 | + |
| 288 | +### 6.7 USGS NIMS |
| 289 | + |
| 290 | +**Why seventh** |
| 291 | + |
| 292 | +- it is already strong and materially present |
| 293 | +- its next gains depend more on dependency clarification than on raw metadata creation |
| 294 | +- it benefits from earlier USGS Water and imagery-related lessons |
| 295 | + |
| 296 | +**What "full-scope enrichment" means here** |
| 297 | + |
| 298 | +- formalize the dependency contract on USGS Water systems |
| 299 | +- decide whether selected-camera-per-station is permanent policy or transitional implementation |
| 300 | +- deepen coverage, product, and asset provenance semantics |
| 301 | +- align Pattern A packaging to the conventions proven earlier |
| 302 | + |
| 303 | +**Exit condition** |
| 304 | + |
| 305 | +USGS NIMS becomes the canonical Pattern A reference with explicit dependency semantics. |
| 306 | + |
| 307 | +### 6.8 CO-OPS |
| 308 | + |
| 309 | +**Why eighth** |
| 310 | + |
| 311 | +- it is a good publisher, but not a template setter |
| 312 | +- it currently lacks a pack |
| 313 | +- its enrichment can be done faster and more cleanly after station-family conventions are settled |
| 314 | + |
| 315 | +**What "full-scope enrichment" means here** |
| 316 | + |
| 317 | +- create a full package from scratch |
| 318 | +- sharpen product-family, datum, and coastal semantics |
| 319 | +- apply already-proven station-family packaging and validation discipline |
| 320 | + |
| 321 | +**Exit condition** |
| 322 | + |
| 323 | +CO-OPS reaches full-package parity without forcing the project to learn station-family lessons the hard way. |
| 324 | + |
| 325 | +### 6.9 Aviation WX |
| 326 | + |
| 327 | +**Why ninth** |
| 328 | + |
| 329 | +- it is the thinnest current publisher |
| 330 | +- it has the least leverage on the rest of the fleet |
| 331 | +- it will benefit most from arriving after the station-family conventions and package contract are already stable |
| 332 | + |
| 333 | +**What "full-scope enrichment" means here** |
| 334 | + |
| 335 | +- create its first full package |
| 336 | +- deepen airport/system/procedure semantics |
| 337 | +- formalize aviation vocabulary and null/missing-value handling |
| 338 | +- bring it to parity without turning it into the family template |
| 339 | + |
| 340 | +**Exit condition** |
| 341 | + |
| 342 | +Aviation WX is no longer the semantic and artifact outlier of the current fleet. |
| 343 | + |
| 344 | +--- |
| 345 | + |
| 346 | +## 7. Program-Level Checkpoints |
| 347 | + |
| 348 | +Do not run the whole program as nine disconnected tasks. Use checkpoints. |
| 349 | + |
| 350 | +### Checkpoint A: after ISS and USGS Water |
| 351 | + |
| 352 | +Lock down: |
| 353 | + |
| 354 | +- the standard package contract |
| 355 | +- the migration-artifact policy |
| 356 | +- the dependency-root policy |
| 357 | + |
| 358 | +### Checkpoint B: after NWS and USGS EQ |
| 359 | + |
| 360 | +Lock down: |
| 361 | + |
| 362 | +- canonical station-family acceptance criteria |
| 363 | +- canonical Pattern C acceptance criteria |
| 364 | +- round-trip verification expectations |
| 365 | + |
| 366 | +### Checkpoint C: after OpenSky, NDBC, and USGS NIMS |
| 367 | + |
| 368 | +Lock down: |
| 369 | + |
| 370 | +- final Pattern C conventions |
| 371 | +- final multi-stream station conventions |
| 372 | +- final Pattern A dependency conventions |
| 373 | + |
| 374 | +### Checkpoint D: after CO-OPS and Aviation WX |
| 375 | + |
| 376 | +Run a parity review to confirm the lagging station-family publishers now meet the same artifact and semantic baseline as the rest of the fleet. |
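The checkpoint schedule can be expressed as data so that a tracker reports which reviews are due. This is a minimal sketch: the `CHECKPOINTS` map restates the groupings above, while the `due_checkpoints` helper and the rule that checkpoints come due strictly in order are assumptions of the sketch.

```python
# Publishers that must be complete before each checkpoint, per this section.
CHECKPOINTS = {
    "A": {"ISS", "USGS Water"},
    "B": {"NWS", "USGS EQ"},
    "C": {"OpenSky", "NDBC", "USGS NIMS"},
    "D": {"CO-OPS", "Aviation WX"},
}


def due_checkpoints(completed):
    """Return checkpoint labels that are due, in order, stopping at the first gap."""
    done = set(completed)
    due = []
    for label, required in CHECKPOINTS.items():
        if required <= done:
            due.append(label)
        else:
            # Assumption: a later checkpoint is never due before an earlier one.
            break
    return due
```

For example, completing `ISS`, `USGS Water`, and `NWS` makes only Checkpoint A due, because Checkpoint B also requires `USGS EQ`.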
| 377 | + |
| 378 | +--- |
| 379 | + |
| 380 | +## 8. Recommended Operating Rule |
| 381 | + |
| 382 | +For this program, `done` should mean more than "a richer bootstrap file exists." |
| 383 | + |
| 384 | +For each publisher, completion should require: |
| 385 | + |
| 386 | +- a current bootstrap aligned to the target model |
| 387 | +- a materially present package on disk |
| 388 | +- a corrected and verified provenance corpus |
| 389 | +- explicit semantic-contract notes |
| 390 | +- round-trip verification evidence |
| 391 | +- a runtime follow-on list if runtime work remains outside the package scope |
| 392 | + |
| 393 | +If the project follows that rule, the enrichment program will produce a canonical fleet rather than a collection of better comments. |
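The operating rule could be captured as an explicit completion record per publisher. The following is a sketch, not an existing project artifact: `CompletionEvidence`, `unmet_criteria`, and `is_done` are hypothetical names, with one field per completion requirement listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CompletionEvidence:
    """One boolean per completion requirement in the operating rule."""

    current_bootstrap: bool
    package_on_disk: bool
    provenance_verified: bool
    semantic_contract_notes: bool
    round_trip_evidence: bool
    # True when the follow-on list exists, or when no runtime work remains.
    runtime_follow_on_listed: bool


def unmet_criteria(evidence: CompletionEvidence) -> list[str]:
    """Return the names of requirements that are not yet satisfied."""
    return [f.name for f in fields(evidence) if not getattr(evidence, f.name)]


def is_done(evidence: CompletionEvidence) -> bool:
    """A publisher is done only when every requirement is satisfied."""
    return not unmet_criteria(evidence)
```

Reporting the unmet criteria by name, rather than a single pass/fail flag, keeps a richer bootstrap file from being mistaken for a finished enrichment.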
| 394 | + |