Commit d3cec55

docs(usgs-water): add total bootstrap data model pack
1 parent e73684a commit d3cec55

28 files changed

Lines changed: 1990 additions & 0 deletions
Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
# USGS Water Total Bootstrap, Data Model, and Enrichment Pack

**Date:** 2026-03-11
**Author:** Codex
**Status:** Created, packaged, and intended for repository handoff
**Scope:** `publishers/usgs_water` total package including bootstrap guidance, data-model review, metadata enrichment, and zip artifact

---

## 1. Executive Summary

A new comprehensive USGS water package was created under:

- `publishers/usgs_water/total_bootstrap_data_model_enrichment_pack`

This package is broader than the earlier metadata-only packs. It includes:

- a reviewed resource-model description
- live-source verification notes against the current USGS Water Data OGC API
- metadata sidecars and worked examples
- bootstrap patch candidates for richer procedure, system, datastream, and deployment metadata
- a shareable zip artifact of the full package

The package is designed to keep the current station-centric publisher architecture
while materially improving provenance, semantic clarity, and future maintainability.

---

## 2. Why a Larger Package Was Justified

The USGS water publisher is already stronger than the early NWS/NDBC baselines.
It already has:

- one shared procedure
- one system per monitoring location
- two datastreams per station
- a coherent deployment tree
- a working runtime using the USGS Water Data OGC API

So the right move was not to redesign it. The right move was to package the
current implementation properly:

- make the data model explicit
- anchor the bootstrap to current live USGS semantics
- identify where metadata enrichment is safe now
- record where runtime follow-on improvements should happen later

---

## 3. Live Research Findings That Matter

The package was informed by live verification on 2026-03-11 against the USGS
Water Data OGC API.

The most important findings were:

1. `latest-continuous` is live and is the better latest-only runtime target than `continuous?limit=1`.
2. `time-series-metadata` can return multiple series for one station and parameter code, including daily and instantaneous variants.
3. `combined-metadata` is rich, but it must be filtered carefully or it may bind to the wrong statistic family.
4. `monitoring-locations` exposes more authoritative system metadata than the current bootstrap carries.
5. The active OGC API path is still `v0`, so the package intentionally keeps `v0` URLs.

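Findings 1 and 5 can be sketched as a small URL builder. This is a minimal sketch, not the publisher's code. The base URL, the `monitoring_location_id` and `parameter_code` query parameter names, and the `USGS-` identifier prefix are assumptions that should be checked against the live-source verification notes. Only the collection names, the `v0` path segment, and `limit=1` come from the findings above.

```python
from urllib.parse import urlencode

# Assumption: base URL of the USGS Water Data OGC API. Only the "v0" segment
# is confirmed by the findings above.
BASE_URL = "https://api.waterdata.usgs.gov/ogcapi/v0"


def items_url(collection: str, **query: str) -> str:
    """Build an OGC API Features `items` URL for one collection."""
    url = f"{BASE_URL}/collections/{collection}/items"
    return f"{url}?{urlencode(query)}" if query else url


def latest_only_urls(station_id: str, parameter_code: str) -> dict[str, str]:
    """Return the limit-based and the dedicated latest-only request URLs."""
    # Hypothetical filter names; verify them against the API's queryables.
    filters = {
        "monitoring_location_id": f"USGS-{station_id}",
        "parameter_code": parameter_code,
    }
    return {
        # A latest-only poll expressed against the full collection.
        "continuous": items_url("continuous", **filters, limit="1"),
        # The dedicated latest-only collection named in finding 1.
        "latest-continuous": items_url("latest-continuous", **filters),
    }
```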
The most important semantic conclusion is this:

For the current publisher, datastreams should be documented as the
`statistic_id=00011` instantaneous series. `parameter_code` alone is not a
precise enough description.

---

## 4. Package Contents

### 4.1 Notes

The `notes/` section contains:

- live source verification
- audit and recommendations
- apply order
- runtime and model follow-on guidance

### 4.2 Data model

The `data_model/` section contains:

- a resource-model walkthrough
- machine-readable inventory
- current observation-contract documentation
- upstream-to-CSAPI field mapping

### 4.3 Metadata and examples

The `metadata/` section contains:

- official source URLs
- live worked examples from the current USGS API
- an enriched station template
- a worked enriched station example for `09380000`

### 4.4 Assets

The `assets/` section contains:

- a generic local USGS water station SVG
- a note explaining why no single official station image was bundled

### 4.5 Bootstrap patch candidates

The `patches/` section contains:

- constants and helper URLs
- an enriched procedure body
- an enriched system stub
- an enriched SensorML system body
- enriched datastream schema candidates
- enriched deployment blocks
- an enriched station JSON example
- a compact candidate snippet summary

---

## 5. Design Position

This package makes a deliberate distinction between:

- what should be changed now
- what should be documented now but implemented later

### Recommended now

- richer procedure provenance
- richer station SensorML metadata
- more explicit datastream semantics and collection links
- optional enriched station-config sidecars
- better documentation of the current observation contract

### Recommended later

- move latest-only polling to `latest-continuous`
- decide whether `time_series_id` or `last_modified` should ever become result-body fields
- consider additional parameter families such as `00010`

This keeps the package robust without forcing unnecessary runtime churn.

---

## 6. Bottom Line

The new USGS water package is not just a metadata patch. It is a reviewed handoff
bundle for the current publisher:

- architecture clarified
- live upstream semantics verified
- enrichment candidates prepared
- zip artifact produced for transport and review

That is the right level of packaging for a publisher that is already functional
and now needs to become more explicit, more authoritative, and easier to extend.
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
# USGS Water Total Bootstrap, Data Model, and Enrichment Pack

This package is a comprehensive handoff bundle for the current `publishers/usgs_water`
publisher in `OSHConnect-Python`.

It is intentionally broader than the earlier metadata-only packs. It includes:

- a reviewed data-model section
- live-source verification notes
- a metadata-enrichment pack
- ready-to-apply bootstrap snippet candidates
- a package manifest suitable for zipping and external sharing

The package was assembled after cross-referencing:

- the current local `bootstrap_usgs_water.py`
- the current local `usgs_water_publisher.py`
- the current local `stations.json`
- existing NWS, NDBC, and OpenSky enrichment-pack patterns
- live USGS Water Data OGC API responses verified on 2026-03-11

## Scope

This package is designed to improve and document the USGS water publisher without
forcing risky runtime changes into the current production codepath.

It does three things:

1. documents the current architecture and observation contract
2. provides a richer metadata-enrichment layer for bootstrap resources
3. captures the most important follow-on runtime and modeling recommendations

## Why this package exists

The current USGS water publisher is already a strong Phase 1 implementation:

- one shared observing procedure
- one system per USGS monitoring location
- two datastreams per station
- a clean deployment tree
- a working publisher against the USGS Water Data OGC API

That baseline is sound. The main opportunity is to turn it into a better-documented,
better-evidenced, more semantically explicit publisher package.

The most important findings from live verification are:

- the USGS Water Data OGC API still exposes `monitoring-locations`, `continuous`,
  `latest-continuous`, `time-series-metadata`, and `combined-metadata`
- `latest-continuous` is available and is a better latest-only runtime target than
  `continuous?limit=1`
- `time-series-metadata` can return multiple series for the same station and parameter
  code, including both daily and instantaneous statistics
- `combined-metadata` is rich, but consumers must filter carefully or they may
  accidentally bind to the wrong statistic family

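The filtering concern in the last two findings can be illustrated with a short sketch. It assumes the metadata collections return GeoJSON-style features whose `properties` carry `parameter_code` and `statistic_id`; that layout, the `select_series` helper, and the sample records are hypothetical. The value `00011` is the instantaneous statistic named in the package summary.

```python
from typing import Any

INSTANTANEOUS_STATISTIC_ID = "00011"  # instantaneous series, per the package summary


def select_series(
    features: list[dict[str, Any]],
    parameter_code: str,
    statistic_id: str = INSTANTANEOUS_STATISTIC_ID,
) -> dict[str, Any]:
    """Return the single series matching both parameter code and statistic.

    Raises ValueError when zero or several series match, so a caller never
    binds silently to the wrong statistic family.
    """
    matches = [
        f for f in features
        if f.get("properties", {}).get("parameter_code") == parameter_code
        and f.get("properties", {}).get("statistic_id") == statistic_id
    ]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one series for {parameter_code}/{statistic_id}, "
            f"found {len(matches)}"
        )
    return matches[0]


# Hypothetical sample: one station, one parameter code, two statistic families.
sample = [
    {"id": "daily-series",
     "properties": {"parameter_code": "00010", "statistic_id": "00003"}},
    {"id": "instantaneous-series",
     "properties": {"parameter_code": "00010", "statistic_id": "00011"}},
]
print(select_series(sample, "00010")["id"])  # prints: instantaneous-series
```

Matching on `parameter_code` alone would return both sample records, which is the ambiguity the findings warn about.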
## Package layout

- `bundle_manifest.json`
- `notes/`
- `data_model/`
- `metadata/`
- `assets/`
- `patches/`

## Recommended reading order

1. `notes/LIVE_SOURCE_VERIFICATION_2026-03-11.md`
2. `notes/AUDIT_AND_RECOMMENDATIONS.md`
3. `data_model/RESOURCE_MODEL.md`
4. `metadata/source_urls.json`
5. `patches/bootstrap_usgs_water_metadata_enriched_candidate_snippets.py`

## Implementation stance

This pack is conservative where the current runtime is already good and explicit
where the current metadata is too thin.

It does not assume every recommended runtime improvement should be applied
immediately. In particular:

- metadata enrichment is recommended now
- stronger datastream provenance is recommended now
- `latest-continuous` migration is recommended next
- richer result-body fields are optional and should be adopted only if downstream
  consumers benefit from the added payload size and contract complexity
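
The trade-off in the last bullet can be sketched as an opt-in switch. This is an illustration only: the `build_result` helper and the `value` field are hypothetical, and `time_series_id` and `last_modified` are the candidate fields named in the package summary, not fields of the current observation contract.

```python
from typing import Any, Optional


def build_result(
    value: float,
    *,
    time_series_id: Optional[str] = None,
    last_modified: Optional[str] = None,
    include_provenance: bool = False,
) -> dict[str, Any]:
    """Build an observation result body, adding provenance only on request."""
    result: dict[str, Any] = {"value": value}  # hypothetical minimal contract
    if include_provenance:
        # Emit only the fields that were actually supplied by the caller.
        if time_series_id is not None:
            result["time_series_id"] = time_series_id
        if last_modified is not None:
            result["last_modified"] = last_modified
    return result
```

Keeping `include_provenance=False` as the default leaves the payload unchanged for existing consumers, so the richer body stays a deliberate choice per deployment.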
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
No single official station image is bundled in this package.

Reason:

- USGS Water Data OGC API metadata is authoritative for station identity and
  observation semantics, but it does not provide one stable canonical station
  hero image suitable for packaging here.
- Station imagery is better handled by the separate USGS NIMS imagery track.

This package therefore includes a local generic SVG:

- `assets/usgs_water_station_generic.svg`

Use that asset for documentation, demos, or placeholder rendering unless you
choose to pair this publisher with NIMS camera resources.
Lines changed: 12 additions & 0 deletions
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
{
  "name": "USGS Water Total Bootstrap, Data Model, and Enrichment Pack",
  "packageVersion": "2026-03-11",
  "publisherPath": "publishers/usgs_water",
  "generatedDate": "2026-03-11",
  "purpose": [
    "Document the current USGS water publisher architecture",
    "Provide live-verified source references and examples",
    "Deliver a metadata enrichment pack for bootstrap resources",
    "Package the result as a shareable zip artifact"
  ],
  "artifactGroups": [
    "notes",
    "data_model",
    "metadata",
    "assets",
    "patches"
  ],
  "upstreamVerifiedOn": "2026-03-11",
  "currentLocalInputs": [
    "publishers/usgs_water/bootstrap_usgs_water.py",
    "publishers/usgs_water/usgs_water_publisher.py",
    "publishers/usgs_water/stations.json",
    "docs/research/USGS_API_Reconnaissance_Notes.md",
    "docs/research/USGS_Water_Publisher_Phase1_Report.md"
  ],
  "notes": [
    "Live verification confirms the USGS Water Data OGC API v0 surface is active.",
    "The package distinguishes current runtime contract from recommended future extensions.",
    "The zip archive for this package is created as a sibling artifact under publishers/usgs_water."
  ]
}
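
A manifest like this lends itself to a quick consistency check before zipping. The sketch below is an assumption about how a reviewer might use it, not part of the package. It supposes this file is the `bundle_manifest.json` listed in the package layout and that each `artifactGroups` entry names a directory beside it; `missing_groups` is a hypothetical helper.

```python
import json
from pathlib import Path


def missing_groups(manifest_path: Path) -> list[str]:
    """Return artifact groups listed in the manifest that lack a directory."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    root = manifest_path.parent
    return [g for g in manifest["artifactGroups"] if not (root / g).is_dir()]
```

Calling `missing_groups(Path("bundle_manifest.json"))` from the package root would return an empty list when every listed group directory is present.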
