Plan next publisher expansion steps

Sam-Bolling · Sam-Bolling · commit d5f230301980 · 2026-05-26T06:55:27.000-04:00
diff --git a/docs/research/new-publisher-source-planning/New_Publisher_Implementation_Checklist_2026-05-26.md b/docs/research/new-publisher-source-planning/New_Publisher_Implementation_Checklist_2026-05-26.md
@@ -0,0 +1,104 @@
+# New Publisher Implementation Checklist
+
+Date: 2026-05-26
+
+## Purpose
+
+This checklist captures the repeatable workflow proven during the Environment Agency Hydrology implementation. Use it as the baseline for the next publisher sources so each integration reaches the same standard: researched, scoped, implemented, documented, visible in Explorer, and verified against the live demo path.
+
+## 1. Source Triage
+
+- Confirm the source is real, public, and machine-readable.
+- Identify official documentation, API root, sample list endpoint, sample data endpoint, and license terms.
+- Probe endpoints directly and record exact working query shapes.
+- Capture quirks early, especially auth, rate limits, pagination, timestamp format, units, and non-standard parameters.
+- Decide whether the source should be implemented now, deferred, or split into multiple publisher opportunities.
+
+## 2. Existing Pattern Selection
+
+- Choose the closest mature publisher as the primary exemplar.
+- Prefer station-network patterns for fixed monitoring sites.
+- Prefer event-feed patterns for earthquakes, alerts, or incident streams.
+- Prefer image/media patterns only when the source actually exposes useful media.
+- Record any server compatibility constraints from prior publishers before coding.
+
+## 3. Curated First Pass
+
+- Start with a small demo-safe sidecar, not a full-network ingestion.
+- Include enough variety to prove the data model.
+- Choose stations/events that are geographically legible in Explorer.
+- Keep runtime polling bounded and predictable.
+- Preserve original source IDs in metadata and observations, even if CSAPI UIDs need sanitized tokens.
+
+## 4. CSAPI Model
+
+- Define one procedure for the source ingestion method.
+- Define systems around physical stations, platforms, or logical event sources.
+- Define one datastream per observed property/product/statistic combination.
+- Define deployments that make map placement and hierarchy explicit.
+- Include source URLs, license links, units, parameter names, quality flags, and provenance in metadata.
+
+## 5. Bootstrap
+
+- Use shared bootstrap helpers where possible.
+- Create minimal GeoJSON stubs first, then richer SensorML through PUT where the server accepts it.
+- Support `--dry-run`, `--clean`, `--clean-only`, and `--force-sml`.
+- Log and recover from known server compatibility failures without hiding them.
+- Compile and run the bootstrap locally before touching the live server.
+
+## 6. Runtime Publisher
+
+- Load curated source definitions from a sidecar file.
+- Fetch only latest or bounded recent readings in normal operation.
+- Normalize timestamps to UTC.
+- Preserve source quality, completeness, revision, and status metadata when available.
+- Skip readings that are unchanged since the last publish within a running process.
+- Support `--dry-run`, `--once`, `--interval`, and source-subset flags.
+
+## 7. Explorer Readiness
+
+- Verify the map can classify the publisher with an appropriate STANAG/MIL-STD-2525 symbol.
+- Add source-specific symbol rules only when generic rules produce a poor result.
+- Make side-card summaries meaningful for the domain.
+- Surface latest observations where the current value is useful to a viewer.
+- Add image/media metadata only when it is accurate, licensed, and clearly attributed.
+- Use explicit representative-image language when exact station imagery is unavailable.
+
+## 8. Validation
+
+- Compile changed Python modules.
+- Run dry-run source fetches.
+- Run bootstrap against the live OSH endpoint.
+- Run one live publish cycle.
+- Verify backend observations directly.
+- Verify Explorer visibility on the correct preset.
+- Verify production bundle content after pushing Explorer changes.
+- Record any server warnings separately from publisher failures.
+
+## 9. Documentation
+
+- Add or update the publisher README.
+- Add an implementation plan before coding.
+- Add a completion report after first working publish.
+- Add a live-demo verification report after Explorer validation.
+- Record server compatibility issues as issue-ready drafts or remote issues.
+- Keep image attribution and license notes close to the implementation and report.
+
+## 10. Commit And Push
+
+- Keep publisher and Explorer commits separate when they live in separate repositories.
+- Verify `git status --short` before each commit.
+- Push immediately when the user expects live-demo behavior.
+- Recheck the deployed production bundle or runtime after push.
+
+## Minimum Done Definition
+
+A new publisher is not done until all of these are true:
+
+- curated bootstrap succeeds, or any remaining failures are isolated to documented server limitations,
+- runtime can publish at least one clean live cycle,
+- observations can be read back from CSAPI,
+- Explorer can find and explain the resources,
+- side-card/popup output is domain-meaningful,
+- docs explain source, model, commands, validation, and limitations,
+- commits are pushed to the relevant repositories.
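
Reviewer sketch for section 6 (Runtime Publisher) of the checklist above: one possible way to normalize timestamps to UTC and skip unchanged readings within a running process. This is a minimal illustration, not the code in `OSHConnect-Python`. The names `to_utc_iso` and `ReadingDeduper` are hypothetical, and treating offset-less timestamps as UTC is an assumption that should be checked against each source's documentation.

```python
from datetime import datetime, timezone


def to_utc_iso(timestamp):
    """Parse an ISO 8601 timestamp and return it in UTC with a 'Z' suffix.

    Assumption: a timestamp without an offset is already in UTC.
    """
    # Older Python versions do not accept a trailing 'Z', so rewrite it.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


class ReadingDeduper:
    """Remember the last reading sent per datastream for the life of the process."""

    def __init__(self):
        # Maps a hypothetical datastream key to (UTC timestamp, value).
        self._last_sent = {}

    def is_new(self, datastream_key, timestamp, value):
        """Return True if this reading differs from the last one recorded."""
        reading = (to_utc_iso(timestamp), value)
        if self._last_sent.get(datastream_key) == reading:
            return False
        self._last_sent[datastream_key] = reading
        return True
```

Comparing on the normalized timestamp means the same instant reported with different offsets is still treated as a duplicate. The state lives only in memory, which matches the "during a running process" scope of the checklist item; a restart would publish the latest reading again.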
diff --git a/docs/research/new-publisher-source-planning/OSH_Server_Compatibility_Issue_Draft_2026-05-26.md b/docs/research/new-publisher-source-planning/OSH_Server_Compatibility_Issue_Draft_2026-05-26.md
@@ -0,0 +1,110 @@
+# OSH Server Compatibility Issue Draft
+
+Date: 2026-05-26
+
+## Proposed Title
+
+System SensorML PUT returns HTTP 500 and prevents publisher media metadata updates
+
+## Summary
+
+Environment Agency Hydrology bootstrap can create systems, datastreams, deployments, and observations successfully on the configured OSH endpoint, but system-level SensorML replacement currently fails with HTTP 500. Procedure SensorML PUT succeeds. The failure prevents new system metadata such as representative side-card image documents from reaching the live CSAPI resources through the normal SensorML path.
+
+This surfaces as a publisher interoperability issue because station/system publishers rely on system SensorML `documents` metadata for rich Explorer cards, including thumbnails, source links, and attribution.
+
+## Affected Endpoint
+
+```text
+https://os4csapi-osh.duckdns.org/sensorhub/api
+```
+
+Representative production preset:
+
+```text
+OSH (OS4CSAPI)
+https://129-80-248-53.sslip.io/sensorhub/api
+```
+
+## Reproduction
+
+From `OSHConnect-Python`:
+
+```powershell
+py -m publishers.environment_agency_hydrology.bootstrap_environment_agency_hydrology --force-sml
+```
+
+Observed output includes one warning per Environment Agency station system:
+
+```text
+[WARN] SML PUT skipped for system urn:os4csapi:system:environment-agency-hydrology:48513a18-e485-4317-ae92-93bf4f7f3e54:v1 (id=05j0): HTTP 500 PUT https://os4csapi-osh.duckdns.org/sensorhub/api/systems/05j0: {
+  "status": 500,
+  "message": "Internal server error"
+}
+```
+
+Additional affected systems during the same run:
+
+```text
+urn:os4csapi:system:environment-agency-hydrology:d52d0eab-1e64-4d76-a1f2-e81c7948d2c0-435510:v1
+urn:os4csapi:system:environment-agency-hydrology:c7e13884-4a02-4df3-b184-09aea28cf8e8-3-020:v1
+urn:os4csapi:system:environment-agency-hydrology:959f3e4f-bb6e-4f4a-8082-0157eea99482:v1
+```
+
+## Expected Behavior
+
+`PUT /systems/{id}` with `Content-Type: application/sml+json` should accept a valid SensorML JSON document for an existing system, consistent with procedure SensorML replacement behavior.
+
+At minimum, the server should return a diagnostic 4xx response explaining which SensorML field is invalid instead of an opaque HTTP 500.
+
+## Actual Behavior
+
+The server returns HTTP 500 for every Environment Agency Hydrology system SensorML PUT attempted during `--force-sml`.
+
+The bootstrap safely logs the failure and continues, so operational publishing still works:
+
+- systems exist,
+- datastreams exist,
+- deployments exist,
+- live observations publish successfully.
+
+The missing system SensorML update still blocks normal rich metadata propagation.
+
+## Demo Impact
+
+Explorer side-card thumbnails normally come from system SensorML image documents. Because the system PUT fails, the Environment Agency Hydrology representative gauge photo cannot be reliably loaded from live system SensorML metadata.
+
+Temporary mitigation implemented in Explorer:
+
+```text
+OS4CSAPI/ogc-csapi-explorer@5323b4d Show hydrology station thumbnail fallback
+```
+
+Publisher metadata and docs update:
+
+```text
+OS4CSAPI/OSHConnect-Python@87a8f77 Add hydrology station thumbnail metadata
+OS4CSAPI/OSHConnect-Python@c6fc2d9 Record hydrology thumbnail live verification
+```
+
+## Related Browser Finding
+
+After reloading the production Explorer, the OSH external URL also produced a browser CORS diagnostic indicating duplicate `Access-Control-Allow-Origin` values:
+
+```text
+The 'Access-Control-Allow-Origin' header contains multiple values '*, https://ogc-csapi-explorer.pages.dev', but only one is allowed.
+```
+
+The app can still work through the configured proxy path, but the duplicate header should be tracked as a separate server/proxy issue if direct browser access is expected to remain supported.
+
+## Suggested Labels
+
+```text
+bug
+server-interop
+sensorml
+publisher-support
+```
+
+## Notes
+
+The GitHub CLI was not available in the current environment and no issue-management tool was exposed, so this file is an issue-ready draft rather than a remotely created GitHub issue.
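
Reviewer sketch for the "logs the failure and continues" behavior described under Actual Behavior above. This is a hedged illustration of the pattern, not the bootstrap's actual code. `put_sml_best_effort` and the `put_sml` callable are hypothetical names, and the warning text is a simplified stand-in for the real log line.

```python
def put_sml_best_effort(system_ids, put_sml):
    """Attempt a SensorML PUT for each system and collect failures.

    `put_sml` is a hypothetical callable that raises on a non-2xx response.
    A failure is logged and recorded, and the remaining systems are still tried.
    """
    failures = {}
    for system_id in system_ids:
        try:
            put_sml(system_id)
        except Exception as exc:
            # Keep the failure visible instead of hiding it.
            print(f"[WARN] SML PUT skipped for system id={system_id}: {exc}")
            failures[system_id] = str(exc)
    return failures
```

Returning the failures lets the caller report server warnings separately from publisher failures, so a run with HTTP 500 responses on system SensorML can still finish creating datastreams and deployments.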
diff --git a/docs/research/new-publisher-source-planning/UK_AIR_Publisher_Implementation_Plan_2026-05-26.md b/docs/research/new-publisher-source-planning/UK_AIR_Publisher_Implementation_Plan_2026-05-26.md