diff --git a/docs/setup_installation/admin/airflow3.md b/docs/setup_installation/admin/airflow3.md
index c4de368cf..98cdb90cf 100644
--- a/docs/setup_installation/admin/airflow3.md
+++ b/docs/setup_installation/admin/airflow3.md
@@ -68,6 +68,31 @@ Real-time membership changes are propagated by the Hopsworks backend pushing to
 A 60-second safety-net TTL on the cache catches drift even without an explicit invalidation.
 See [Airflow Security Model](../../user_guides/projects/airflow/security_model.md#token--cookie-behavior) for the full description.
 
+## DAG reconciler
+
+`AirflowDagReconciler` is a Hopsworks-side singleton EJB that runs every 60 s (with a 30 s initial delay) on the Hopsworks admin pod.
+It walks `Projects//Airflow/*.py` for every project, derives the canonical `dag_id` (`p____`), and reconciles `dag_project_index` against the on-disk truth:
+
+- A `.py` present on HopsFS without a matching index row triggers a row insert.
+  This is the path that picks up files uploaded via the Hopsworks File Browser or copied via `DatasetApi.copy`, without needing a backend restart.
+- An index row whose `.py` is gone triggers a row delete plus the same `airflow.api.common.delete_dag.delete_dag` cleanup the explicit delete button uses.
+
+The reconciler runs under `@TransactionAttribute(NOT_SUPPORTED)` because the Airflow auth-manager HTTP calls it makes do not participate in the EJB global transaction.
+Its logs appear in the admin pod under the logger `io.hops.hopsworks.common.airflow.AirflowDagReconciler`.
+
+## Orphan cleanup CronJob
+
+The chart deploys an `airflow-orphan-cleanup` CronJob (gated by `airflow.enabled`, no separate enable flag) that runs the SQL in `cleanup_orphans.sql` against the Airflow metadata DB.
+It deletes orphan rows in `dag_run`, `task_instance`, `task_instance_history`, `xcom`, `log`, `dag_warning`, `asset_dag_run_queue` (`target_dag_id`), `task_outlet_asset_reference` (`dag_id`), and `deadline` (both `dag_id` and `dagrun_id`) that point at a `dag_id` no longer in the `dag` table.
+This is the cleanup path Airflow itself does not run automatically when a DAG is hard-deleted out of band.
+Only the most recent successful run of the CronJob is retained in-namespace; older Pods are reaped by the CronJob's history limit.
+
+## OpenShift compatibility
+
+The airflow image is built to OpenShift's arbitrary-UID + GID-0 contract.
+`/etc/airflow` and `launcher.sh` are group-owned by root (`chown :0`) with the user permission bits mirrored onto the group (`chmod g=u`), so OpenShift's per-namespace UID (which always has GID 0) can read, write, and execute everything the `airflow` UID can on vanilla Kubernetes.
+No `runAsUser` override is needed when deploying on OpenShift; the chart's pod-spec works unchanged.
+
 ## Metrics
 
 The legacy `airflow-exporter` 1.3.0 does not support Airflow 3. Metrics
diff --git a/docs/user_guides/projects/airflow/airflow.md b/docs/user_guides/projects/airflow/airflow.md
index 90a443f74..bbc03a8a5 100644
--- a/docs/user_guides/projects/airflow/airflow.md
+++ b/docs/user_guides/projects/airflow/airflow.md
@@ -31,6 +31,8 @@ See the [security model](security_model.md) for the full surface-by-surface cont
 The Hopsworks UI's Airflow page shows each DAG's most recent runs as colored squares in a **Last runs** column (green = success, red = failed, blue = running, yellow = queued / scheduled, gray = other).
 Clicking anywhere on a DAG row opens the DAG in the Airflow UI.
 The pencil at the row's end opens the generated Python file in an in-app editor.
+The trash icon deletes the DAG: a click-confirm dialog appears, and on confirm the Python file is removed from the project's `Airflow/` HopsFS dataset, the per-DAG `hopsworks_api_key_` Variable is deleted from Airflow, the row in `dag_project_index` is dropped, and `airflow.api.common.delete_dag.delete_dag` is called so the `dag`, `dag_run`, `task_instance`, `xcom`, `log`, and related rows go with it.
+After delete the page reloads to reflect the new state.
 
 #### Hopsworks DAG Builder
 
@@ -44,6 +46,10 @@ Click on _New Workflow_ to create a new Airflow DAG.
 You should provide a name for the DAG as well as a schedule interval.
 You can define the schedule using the dropdown menus or by providing a cron expression.
 
+The schedule `@continuous` is rejected by both the UI form and the backend.
+A continuous DAG re-runs as soon as the previous run finishes, so a DAG that errors at parse time (for example, missing the per-DAG API key Variable) loops at wall-clock speed and OOM-kills the shared scheduler pod, taking every other project's DAGs down with it.
+Use a cron expression for periodic runs, or `@once` for one-shot DAGs.
+
 You can add to the DAG Hopsworks operators and sensors:
 
 - **Operator**: The operator is used to trigger a job execution.
diff --git a/docs/user_guides/projects/airflow/airflow3_upgrade.md b/docs/user_guides/projects/airflow/airflow3_upgrade.md
index e8e38661e..36560ec1a 100644
--- a/docs/user_guides/projects/airflow/airflow3_upgrade.md
+++ b/docs/user_guides/projects/airflow/airflow3_upgrade.md
@@ -40,6 +40,7 @@ Concrete things to change:
 | `from airflow.models import BaseOperator` | `from airflow.sdk.bases.operator import BaseOperator` |
 | Custom Hopsworks operators imported via plugins | Provider package `apache-airflow-providers-hopsworks` |
 | Default `catchup_by_default = True` | Default `catchup=False`; set explicitly |
+| `schedule_interval='@continuous'` | Rejected by Hopsworks; use cron or `@once` |
 
 The Hopsworks-provided operators are now exposed via a standard provider:
 
diff --git a/docs/user_guides/projects/airflow/security_model.md b/docs/user_guides/projects/airflow/security_model.md
index 45859fbec..a7cff74c6 100644
--- a/docs/user_guides/projects/airflow/security_model.md
+++ b/docs/user_guides/projects/airflow/security_model.md
@@ -23,6 +23,14 @@ DAG is composed or deleted, and stored in a table inside Airflow's own
 metadata DB. Editing a DAG file directly (e.g. changing `tags=[...]`) cannot move the DAG to a different project's namespace.
 
+### Active-project scoping
+
+Opening Airflow from a project's UI narrows visibility further to that project alone, even for users who are members of several.
+The Hopsworks proxy forwards the project context via a `?hopsworks_project=` query parameter that the auth manager turns into an `active_project_id` claim on the issued Airflow JWT.
+The DAG list, the per-DAG endpoints, and the Audit Log are all filtered against the active project for the lifetime of the session.
+Switching project in the Hopsworks UI re-mints the Airflow JWT with the new active project; the previous session's cookie remains valid for its TTL but is scoped to the previous project.
+A Hopsworks admin opening Airflow without a project context sees every DAG; opening from a project still scopes to that project.
+
 ## What is **not** isolated
 
 The shared `dag-processor` parses DAGs from all projects.
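
Reviewer note on the `## DAG reconciler` hunk in `airflow3.md`: the two bullets there amount to a two-way set difference between the `dag_id`s derived from the `.py` files on HopsFS and the `dag_id`s in `dag_project_index`. The sketch below is only an illustration of that comparison in Python, not the `AirflowDagReconciler` code (which is a Java EJB); `plan_reconcile` and its parameters are hypothetical names, and the `dag_id` derivation and the actual insert/delete calls are deliberately left out.

```python
from typing import Iterable, List, Tuple


def plan_reconcile(
    on_disk: Iterable[str], indexed: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: compare dag_ids found on disk with indexed dag_ids.

    Returns (to_insert, to_delete):
      to_insert: dag_ids with a .py on disk but no index row
      to_delete: dag_ids with an index row but no .py on disk
    """
    disk_ids = set(on_disk)
    index_ids = set(indexed)
    # Sort so the plan is deterministic from one run to the next.
    to_insert = sorted(disk_ids - index_ids)
    to_delete = sorted(index_ids - disk_ids)
    return to_insert, to_delete


if __name__ == "__main__":
    # Placeholder ids for illustration only, not the real dag_id format.
    inserts, deletes = plan_reconcile(["dag_a", "dag_b"], ["dag_b", "dag_c"])
    print("insert rows for:", inserts)
    print("delete rows (plus delete_dag cleanup) for:", deletes)
```

Computing the full plan before acting keeps each periodic pass idempotent: once disk and index agree, both lists come back empty and the pass does nothing.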