Commit 8525835

portability: remove hardcoded credentials, add .env config, README, fail-fast guards
- Remove 69 hardcoded references to server hostname/password across 24 files
- All publishers now require OSH_ADDRESS/USER/PASS via env vars (fail-fast on missing)
- bootstrap_helpers.py unified to derive config from same 5 env vars as publishers
- DNS monkey-patch now opt-in via OSH_FORCE_IP env var (was always-on hardcoded IP)
- docker-compose.yml uses variable substitution with error messages
- Dockerfiles cleared of baked-in credentials
- Add publishers/.env.example template
- Add publishers/README.md with setup guide and architecture diagram
- Add docs/research/Publisher_Fleet_Portability_Plan.md
- Update repo-level README.md with full quick-start instructions
1 parent 3dab591 commit 8525835

29 files changed

Lines changed: 710 additions & 124 deletions

README.md

Lines changed: 128 additions & 6 deletions
@@ -1,10 +1,132 @@
11
# OSHConnect-Python
22

3-
Library for communicating with Opensensorhub that provides options for saving configurations, getting visualization
4-
recommendations for data, retrieving data in real time, archival streams, and batch modes, and more.
3+
A Python library + publisher fleet for [OGC Connected Systems API (CSAPI)](https://ogcapi.ogc.org/connectedsystems/) servers such as [OpenSensorHub](https://opensensorhub.org/).
54

6-
API Documentation available [here](https://botts-innovative-research.github.io/OSHConnect-Python/)
5+
## What's in this repo
76

8-
Links:
9-
* [Architecture Doc](https://docs.google.com/document/d/1pIaeQw0ocU6ApNgqTVRZuSwjJAbhCcmweMq6RiVYEic/edit?usp=sharing)
10-
* [UML Diagram](https://drive.google.com/file/d/1FVrnYiuAR8ykqfOUa1NuoMyZ1abXzMPw/view?usp=drive_link)
7+
| Directory | What it does |
8+
|-----------|-------------|
9+
| `src/` | **OSHConnect library** — Python client for CSAPI (systems, datastreams, observations, real-time, batch) |
10+
| `publishers/` | **Publisher fleet** — 9 real-time data publishers that fetch from public APIs and push observations |
11+
| `scripts/` | One-off admin/migration scripts |
12+
| `scenarios/` | Scenario packs for testing |
13+
| `tests/` | Unit tests for the library |
14+
| `docs/` | Research notes, design docs, conformance reports |
15+
16+
## Quick Start — Publisher Fleet
17+
18+
The publisher fleet fetches live data from NWS, NDBC, CO-OPS, AviationWeather,
19+
OpenSky, USGS Water, USGS NIMS, USGS Earthquake, and ISS (CelesTrak) — and
20+
publishes observations to your CSAPI server.
21+
22+
### Prerequisites
23+
24+
- Python 3.12+
25+
- A running CSAPI server (e.g. OpenSensorHub) with admin credentials
26+
- Docker & Docker Compose (for containerised deployment, optional)
27+
28+
### 1. Clone & configure
29+
30+
```bash
31+
git clone https://github.com/OS4CSAPI/OSHConnect-Python.git
32+
cd OSHConnect-Python/publishers
33+
34+
# Create your config from the template
35+
cp .env.example .env
36+
```
37+
38+
Edit `.env` with your server details:
39+
40+
```dotenv
41+
OSH_ADDRESS=myserver.example.com
42+
OSH_PORT=443
43+
OSH_USER=admin
44+
OSH_PASS=my-secret-password
45+
OSH_ROOT=sensorhub
46+
```
47+
48+
### 2. Bootstrap your server
49+
50+
Bootstraps create the procedures, systems, datastreams, and deployment hierarchy
51+
on your server. They are idempotent — safe to re-run.
52+
53+
```bash
54+
# Load env vars
55+
export $(grep -v '^#' .env | xargs)
56+
57+
# Run each bootstrap (order doesn't matter)
58+
python -m publishers.nws.bootstrap_nws
59+
python -m publishers.ndbc.bootstrap_ndbc
60+
python -m publishers.coops.bootstrap_coops
61+
python -m publishers.aviation_wx.bootstrap_aviation_wx
62+
python -m publishers.opensky.bootstrap_opensky
63+
python -m publishers.usgs_water.bootstrap_usgs_water
64+
python -m publishers.usgs_nims.bootstrap_usgs_nims
65+
python -m publishers.usgs_eq.bootstrap_usgs_eq
66+
python -m publishers.iss.bootstrap_iss
67+
```
68+
69+
> **Windows (PowerShell):** Instead of `export`, set each variable:
70+
> ```powershell
71+
> Get-Content publishers\.env | ForEach-Object {
72+
> if ($_ -match '^([^#]\S+?)=(.*)$') {
73+
> [Environment]::SetEnvironmentVariable($matches[1], $matches[2], 'Process')
74+
> }
75+
> }
76+
> ```
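The shell one-liner above splits on whitespace, so a password containing spaces would be mangled. As a cross-platform alternative, the same file could be loaded from Python. This is a minimal sketch, not part of the repository: `parse_env_text` and `load_env_file` are hypothetical helper names, and the parser assumes plain `KEY=VALUE` lines without inline comments.

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes; spaces inside the value are kept.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env", environ=None):
    """Copy parsed values into the environment without overriding existing ones."""
    environ = os.environ if environ is None else environ
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env_text(handle.read()).items():
            environ.setdefault(key, value)
    return environ
```

Using `setdefault` means a variable already exported in the shell takes precedence over the file, which matches how most dotenv loaders behave.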
77+
78+
### 3. Start publishers
79+
80+
**Option A — Docker Compose** (recommended for production):
81+
82+
```bash
83+
cd publishers
84+
docker compose up -d # start all 10 services
85+
docker compose logs -f nws # follow one service
86+
docker compose ps # check status
87+
docker compose down # stop all
88+
```
89+
90+
**Option B — Standalone** (for development/testing):
91+
92+
```bash
93+
export $(grep -v '^#' .env | xargs)
94+
python -m publishers.nws.nws_publisher --interval 3600
95+
python -m publishers.nws.nws_publisher --dry-run # print without publishing
96+
python -m publishers.nws.nws_publisher --once # single cycle then exit
97+
```
98+
99+
### 4. Verify
100+
101+
Open your server's API explorer at `https://<your-server>/sensorhub/api` and
102+
check that systems, datastreams, and observations are appearing.
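
The same check could be scripted. The sketch below is an illustration under stated assumptions, not a tool shipped with the repo: it assumes the server exposes the CSAPI `systems`, `datastreams`, and `observations` collections directly under the API root, accepts HTTP Basic auth and a `limit` query parameter, and returns a JSON body with an `items` array. `collection_url` and `count_items` are hypothetical names.

```python
import base64
import json
import urllib.request

COLLECTIONS = ("systems", "datastreams", "observations")


def collection_url(base_url, collection, limit=1):
    """Build a collection URL under the API root, e.g. .../api/systems?limit=1."""
    return f"{base_url.rstrip('/')}/{collection}?limit={limit}"


def count_items(base_url, user, password, collection):
    """Fetch one page of a collection and return how many items it contains."""
    request = urllib.request.Request(collection_url(base_url, collection))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    with urllib.request.urlopen(request, timeout=30) as response:
        return len(json.load(response).get("items", []))
```

Calling `count_items` for each name in `COLLECTIONS` after a bootstrap and one publish cycle should return a non-zero count if resources were created.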
103+
104+
## Publisher Fleet Summary
105+
106+
| Service | Data Source | Default Cadence |
107+
|---------|-----------|----------------|
108+
| ISS | CelesTrak TLE → SGP4 | 30 s |
109+
| NWS | NOAA NWS Surface Obs | 1 h |
110+
| NDBC | NOAA NDBC Buoy Met | 1 h |
111+
| NDBC BuoyCAM | NOAA NDBC Camera JPEGs | 15 min |
112+
| CO-OPS | NOAA Tide Stations | 6 min |
113+
| Aviation WX | FAA METAR | 5 min |
114+
| OpenSky | ADS-B Aircraft Tracking | 5 min |
115+
| USGS Water | NWIS Water Monitoring | 15 min |
116+
| USGS NIMS | NWIS Camera Imagery | 15 min |
117+
| USGS EQ | Earthquake Hazards | 60 s |
118+
119+
See [publishers/README.md](publishers/README.md) for detailed environment variables,
120+
architecture diagram, and per-publisher notes.
121+
122+
## OSHConnect Library
123+
124+
```bash
125+
pip install git+https://github.com/OS4CSAPI/OSHConnect-Python.git
126+
```
127+
128+
API Documentation: [https://botts-innovative-research.github.io/OSHConnect-Python/](https://botts-innovative-research.github.io/OSHConnect-Python/)
129+
130+
## License
131+
132+
See [LICENSE](LICENSE).
docs/research/Publisher_Fleet_Portability_Plan.md

Lines changed: 235 additions & 0 deletions
@@ -0,0 +1,235 @@
1+
# Publisher Fleet Portability Plan
2+
3+
**Date:** 2025-07-25
4+
**Status:** Complete
5+
**Scope:** Make the entire publisher fleet (9 publishers, 10 bootstraps, Docker Compose) reusable on any CSAPI-compliant server
6+
7+
---
8+
9+
## 1 Problem Statement
10+
11+
The publisher fleet *architecturally* supports portability — env vars, JSON station configs, idempotent bootstraps, and Dockerfiles are all in place. In practice, however, someone cloning this repo and running the publishers without carefully setting every environment variable would **silently authenticate against our production server** and begin writing observations into it.
12+
13+
### Audit Numbers
14+
15+
| Metric | Count |
16+
|---|---|
17+
| Files with hardcoded server/credential defaults | **24** |
18+
| Total hardcoded references (`os4csapi-osh` or `ogc134mm`) | **69** |
19+
| Distinct env var systems (bootstrap vs. publisher) | **2** |
20+
| Setup documentation | **0** |
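
Counts like these could be reproduced with a small scan of the working tree. The following is a sketch and not the script used for the audit: `count_references` and `audit_tree` are hypothetical names, and the list of file suffixes is an assumption about which files matter.

```python
from pathlib import Path

NEEDLES = ("os4csapi-osh", "ogc134mm")  # the two strings counted in the audit


def count_references(text, needles=NEEDLES):
    """Count total occurrences of any needle in a block of text."""
    return sum(text.count(needle) for needle in needles)


def audit_tree(root, suffixes=(".py", ".yml", ".yaml", ".md")):
    """Map each file under root to its reference count, omitting clean files."""
    hits = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and (path.suffix in suffixes or path.name == "Dockerfile"):
            count = count_references(path.read_text(encoding="utf-8", errors="ignore"))
            if count:
                hits[str(path)] = count
    return hits
```

`len(audit_tree("."))` gives the number of affected files and `sum(audit_tree(".").values())` the total references; both should be zero once the portability work is done, apart from documents such as this plan that quote the old values.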
21+
22+
### What's Already Portable (No Changes Needed)
23+
24+
- **Station / sensor JSON configs** — station lists, buoy lists, METAR stations, etc. are data files a user can swap out
25+
- **Idempotent bootstraps**: `bootstrap_*.py` scripts create-or-skip, safe to re-run
26+
- **UIDs** — server-scoped, generated at bootstrap time per-server
27+
- **Dockerfiles** — all use `python:3.12-slim`, env vars are declared (just need safe defaults)
28+
- **Docker Compose structure** — YAML anchors, volume mounts, restart policies
29+
30+
---
31+
32+
## 2 Work Items
33+
34+
### 2.1 HIGH — Replace Dangerous Credential Defaults
35+
36+
**Risk:** A friend who forgets to set env vars silently hits our production server.
37+
38+
Every publisher `__init__` has the same five lines:
39+
40+
```python
41+
self.osh_address = os.environ.get("OSH_ADDRESS", "os4csapi-osh.duckdns.org")
42+
self.osh_port = int(os.environ.get("OSH_PORT", "443"))
43+
self.osh_user = os.environ.get("OSH_USER", "os4csapi")
44+
self.osh_pass = os.environ.get("OSH_PASS", "ogc134mm")
45+
self.osh_root = os.environ.get("OSH_ROOT", "sensorhub")
46+
```
47+
48+
And `bootstrap_helpers.py` `get_config()` at L39-40:
49+
50+
```python
51+
"base_url": os.environ.get(
52+
"BOOTSTRAP_URL",
53+
"https://os4csapi-osh.duckdns.org/sensorhub/api"),
54+
```
55+
56+
Plus every Dockerfile bakes in `ENV OSH_ADDRESS=os4csapi-osh.duckdns.org` etc.
57+
58+
**Fix:**
59+
60+
1. Change all Python defaults to obviously-wrong placeholders:
61+
- `OSH_ADDRESS` → `"your-server.example.com"`
62+
- `OSH_USER` → `"changeme"`
63+
- `OSH_PASS` → `"changeme"`
64+
- `BOOTSTRAP_URL` → `"https://your-server.example.com/sensorhub/api"`
65+
2. Add a fail-fast guard at the top of each publisher `__init__` and in `bootstrap_helpers.get_config()`:
66+
```python
67+
if self.osh_address == "your-server.example.com":
68+
sys.exit("ERROR: OSH_ADDRESS not configured. "
69+
"Copy .env.example → .env and fill in your server details.")
70+
```
71+
3. Remove hardcoded values from all 10 Dockerfiles; leave just `ENV OSH_ADDRESS=` (empty) so Docker Compose or `.env` must supply them.
72+
73+
**Files affected:** All 8 non-ISS publisher `.py` files, `publishers/base.py`, `bootstrap_helpers.py`, 10 Dockerfiles, `docker-compose.yml` `x-osh-env` anchor.
74+
75+
**Estimated changes:** ~69 line edits across 24 files.
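
The guard in step 2 checks only `OSH_ADDRESS`. A variant that covers all three credential variables in one place might look like the sketch below. It is an illustration, not the committed code: `missing_settings` and `require_settings` are hypothetical helpers, and treating `changeme` as a placeholder for every variable is an assumption drawn from the defaults listed above.

```python
import os
import sys

REQUIRED_VARS = ("OSH_ADDRESS", "OSH_USER", "OSH_PASS")  # names from this plan
PLACEHOLDERS = {"your-server.example.com", "changeme"}   # placeholder defaults from step 1


def missing_settings(env):
    """Return required variable names that are unset, empty, or still a placeholder."""
    return [
        name for name in REQUIRED_VARS
        if not env.get(name, "").strip() or env[name].strip() in PLACEHOLDERS
    ]


def require_settings(env=None):
    """Exit with a readable message if any required setting is missing."""
    env = os.environ if env is None else env
    missing = missing_settings(env)
    if missing:
        sys.exit("ERROR: " + ", ".join(missing) + " not configured. "
                 "Copy .env.example to .env and fill in your server details.")
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `require_settings()` once at the top of a publisher `__init__` or in `get_config()` would stop the process before any network request is made.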
76+
77+
---
78+
79+
### 2.2 HIGH — Remove or Guard the DNS Monkey-Patch
80+
81+
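**Risk:** Silently resolves any hostname containing `os4csapi-osh.duckdns.org` to a hardcoded Oracle IP, bypassing DNS. The override is always on and tied to our hostname and IP, so it is wrong for anyone running a different server.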
82+
83+
In `bootstrap_helpers.py` L50-64:
84+
85+
```python
86+
ORACLE_IP = "129.80.248.53"
87+
88+
def _patched_getaddrinfo(host, port, *args, **kwargs):
89+
if isinstance(host, str) and "os4csapi-osh.duckdns.org" in host:
90+
return [(socket.AF_INET, socket.SOCK_STREAM, 6, '', (ORACLE_IP, port or 443))]
91+
return _original_getaddrinfo(host, port, *args, **kwargs)
92+
93+
socket.getaddrinfo = _patched_getaddrinfo
94+
```
95+
96+
**Fix:**
97+
98+
1. Move the monkey-patch behind an opt-in env var:
99+
```python
100+
_FORCE_IP = os.environ.get("OSH_FORCE_IP", "")
101+
if _FORCE_IP:
102+
# DNS override active — used when DuckDNS is unreachable from the host
103+
...
104+
```
105+
2. Remove the hardcoded `ORACLE_IP` constant.
106+
3. Patch condition should match the configured address, not a hardcoded hostname.
107+
108+
**Files affected:** `bootstrap_helpers.py` only (single location, imported by all bootstraps).
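
Taken together, the three fix items could be realised roughly as follows. This is a sketch under assumptions rather than the code that landed in `bootstrap_helpers.py`: `install_dns_override` and `remove_dns_override` are hypothetical function names, and the exact-match comparison against `OSH_ADDRESS` is one possible reading of item 3.

```python
import os
import socket

_original_getaddrinfo = socket.getaddrinfo


def install_dns_override(env=None):
    """Patch socket.getaddrinfo only when OSH_FORCE_IP is set; return True if patched."""
    env = os.environ if env is None else env
    force_ip = env.get("OSH_FORCE_IP", "").strip()
    target_host = env.get("OSH_ADDRESS", "").strip().lower()
    if not force_ip or not target_host:
        return False  # default: leave DNS resolution untouched

    def _patched_getaddrinfo(host, port, *args, **kwargs):
        # Match the configured host exactly instead of a hardcoded hostname.
        if isinstance(host, str) and host.lower() == target_host:
            return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", (force_ip, port or 443))]
        return _original_getaddrinfo(host, port, *args, **kwargs)

    socket.getaddrinfo = _patched_getaddrinfo
    return True


def remove_dns_override():
    """Restore the original resolver (useful in tests)."""
    socket.getaddrinfo = _original_getaddrinfo
```

With no `OSH_FORCE_IP` in the environment the function is a no-op, so a fresh clone never rewrites DNS unless the operator asks for it.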
109+
110+
---
111+
112+
### 2.3 HIGH — Unify Bootstrap and Publisher Config
113+
114+
**Risk:** Two different env var schemes confuse new users and require setting overlapping values.
115+
116+
| Component | Env Var(s) | What It Expects |
117+
|---|---|---|
118+
| Bootstraps | `BOOTSTRAP_URL` | Full URL: `https://host/root/api` |
119+
| Publishers | `OSH_ADDRESS`, `OSH_PORT`, `OSH_ROOT` | Separate parts, assembled at runtime |
120+
121+
**Fix:**
122+
123+
1. Make `bootstrap_helpers.get_config()` derive its `base_url` from the same `OSH_ADDRESS` / `OSH_PORT` / `OSH_ROOT` env vars the publishers use:
124+
```python
125+
def get_config():
126+
addr = os.environ.get("OSH_ADDRESS", "your-server.example.com")
127+
port = int(os.environ.get("OSH_PORT", "443"))
128+
root = os.environ.get("OSH_ROOT", "sensorhub")
129+
scheme = "http" if port == 80 else "https"
130+
base_url = os.environ.get(
131+
"BOOTSTRAP_URL",
132+
f"{scheme}://{addr}/{root}/api"
133+
)
134+
return {
135+
"base_url": base_url,
136+
"user": os.environ.get("OSH_USER", "changeme"),
137+
"password": os.environ.get("OSH_PASS", "changeme"),
138+
}
139+
```
140+
2. Keep `BOOTSTRAP_URL` as an optional override for edge cases, but the default path only requires the standard five vars.
141+
3. Update `docker-compose.yml` `x-osh-env` to document this.
142+
143+
**Files affected:** `bootstrap_helpers.py`, `docker-compose.yml` (comments only).
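
Note that the `get_config()` draft in step 1 reads `OSH_PORT` only to pick the scheme, so a server on a non-default port such as 8443 would get a URL without the port. A port-aware variant could look like this sketch. `build_base_url` is a hypothetical name; the "port 80 means http, anything else means https" rule is carried over from the draft and remains a simplification.

```python
import os


def build_base_url(env=None):
    """Assemble the CSAPI base URL from OSH_* variables, honoring BOOTSTRAP_URL if set."""
    env = os.environ if env is None else env
    override = env.get("BOOTSTRAP_URL", "").strip()
    if override:
        return override.rstrip("/")

    addr = env.get("OSH_ADDRESS", "your-server.example.com")
    port = int(env.get("OSH_PORT", "443"))
    root = env.get("OSH_ROOT", "sensorhub").strip("/")
    scheme = "http" if port == 80 else "https"
    default_port = 80 if scheme == "http" else 443
    # Only spell out the port when it differs from the scheme's default.
    netloc = addr if port == default_port else f"{addr}:{port}"
    return f"{scheme}://{netloc}/{root}/api"
```

For example, `OSH_ADDRESS=myserver.example.com` with `OSH_PORT=8443` yields `https://myserver.example.com:8443/sensorhub/api`, while the default port 443 produces the same URL as the draft.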
144+
145+
---
146+
147+
### 2.4 MEDIUM — Create `.env.example` and Operator Guide
148+
149+
**Risk:** No documentation on what env vars to set, what order to run things, or how the fleet fits together.
150+
151+
**Deliverables:**
152+
153+
1. **`publishers/.env.example`**
154+
```env
155+
# ── OSH Server Connection ──
156+
OSH_ADDRESS=your-server.example.com
157+
OSH_PORT=443
158+
OSH_USER=admin
159+
OSH_PASS=changeme
160+
OSH_ROOT=sensorhub
161+
162+
# ── Optional ──
163+
# OSH_FORCE_IP=10.0.0.5 # Override DNS (useful behind NAT)
164+
# BOOTSTRAP_URL= # Override full bootstrap URL
165+
# BUOYCAM_CACHE_BASE_URL= # NDBC BuoyCAM image proxy base URL
166+
```
167+
168+
2. **`publishers/README.md`** — Getting-started guide:
169+
- Prerequisites (Python 3.12, Docker, target CSAPI server)
170+
- Quick start: copy `.env.example` → `.env`, fill in values, run a bootstrap, start a publisher
171+
- Architecture diagram (bootstraps → server, publishers → server)
172+
- Per-publisher notes (data sources, caveats, refresh intervals)
173+
- Docker Compose usage
174+
175+
**Files affected:** 2 new files.
176+
177+
---
178+
179+
### 2.5 LOW — Document BuoyCAM External Dependency
180+
181+
**Risk:** The NDBC BuoyCAM publisher serves cached images via a URL that must point somewhere accessible. The default refers to our server.
182+
183+
```yaml
184+
BUOYCAM_CACHE_BASE_URL: https://os4csapi-osh.duckdns.org/buoycam
185+
```
186+
187+
**Fix:**
188+
189+
1. Default to placeholder (`https://your-server.example.com/buoycam`).
190+
2. Add a note in `README.md` explaining that the operator must host a static file server (or use the same OSH server with an Nginx location block) to serve the cached BuoyCAM JPEGs.
191+
192+
**Files affected:** `docker-compose.yml`, `ndbc_buoycam_publisher.py`, `README.md`.
193+
194+
---
195+
196+
## 3 Execution Order
197+
198+
| Step | Item | Est. Time |
199+
|---|---|---|
200+
| 1 | 2.4 — Create `.env.example` + `README.md` | 30 min |
201+
| 2 | 2.1 — Replace credential defaults + add fail-fast guards | 60 min |
202+
| 3 | 2.2 — Guard the DNS monkey-patch | 15 min |
203+
| 4 | 2.3 — Unify config (bootstrap derives from publisher vars) | 15 min |
204+
| 5 | 2.5 — BuoyCAM docs | 10 min |
205+
| 6 | Smoke test — bootstrap + publish cycle with only `.env` set | 15 min |
206+
| | **Total** | **~2.5 hrs** |
207+
208+
Step 1 goes first because it establishes the env var contract that steps 2-4 reference.
209+
Step 6 verifies the whole chain works when only the `.env` file supplies credentials.
210+
211+
---
212+
213+
## 4 What This Plan Does NOT Cover
214+
215+
| Topic | Reason |
216+
|---|---|
217+
| Common base class extraction | Assessed separately; recommendation is to park it (no drift, fleet stable) |
218+
| AISHub publisher | Blocked on AISHub membership approval |
219+
| Commercial API publishers | Out of scope (no keys, different licensing) |
220+
| SWE schema improvements | Orthogonal to portability |
221+
| Explorer (csapi-explorer) portability | Separate repo, no hardcoded server defaults |
222+
223+
---
224+
225+
## 5 Success Criteria
226+
227+
A collaborator can:
228+
229+
1. Clone the repo
230+
2. Copy `.env.example` → `.env` and fill in their CSAPI server details
231+
3. Run any bootstrap script — it creates resources on *their* server
232+
4. Run `docker compose up` — all publishers start and write to *their* server
233+
5. At no point does any traffic reach `os4csapi-osh.duckdns.org` unless they explicitly configure it
234+
235+
If any step fails or silently contacts the wrong server, the portability work is incomplete.