    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "entity_df.parquet")
        entity_df.to_parquet(path, index=False)
        mlflow.log_artifact(path)
🟡 Mixing global mlflow.log_artifact() with explicit-URI client causes artifact/metadata to target different servers
In _auto_log_entity_df_info, tags and params are logged via a locally-created MlflowClient(tracking_uri=tracking_uri) (lines 304, 309-318), but the entity DataFrame artifact is uploaded via the global mlflow.log_artifact(path) (line 327). The global function uses the tracking URI set by mlflow.set_tracking_uri(), not the explicit tracking_uri from the config. If the global tracking URI is changed by another library in the process, or if _init_mlflow_tracking failed silently (caught by except Exception at sdk/python/feast/feature_store.py:249), the artifact would be uploaded to a different server than where the tags/params were logged, splitting metadata across two servers.
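One way to avoid the split is to send the tag and the artifact through the same client object. The following is a minimal sketch, not the PR's code: it assumes `client` is an `mlflow.tracking.MlflowClient` built with the explicit `tracking_uri`, and that `run_id` identifies the active run. The function name `log_entity_df_via_client` is hypothetical, and using `len(entity_df)` as the value of the `feast.entity_count` tag is only illustrative.

```python
import os
import tempfile


def log_entity_df_via_client(client, run_id, entity_df):
    """Log entity-DataFrame metadata and the artifact through one client.

    Assumptions: `client` exposes MlflowClient-style
    `set_tag(run_id, key, value)` and `log_artifact(run_id, local_path)`;
    `entity_df` exposes a pandas-style `to_parquet(path, index=False)`.
    """
    # The tag goes through the explicit client (value shown is illustrative).
    client.set_tag(run_id, "feast.entity_count", str(len(entity_df)))

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "entity_df.parquet")
        entity_df.to_parquet(path, index=False)
        # Same client, same tracking URI: the upload no longer depends on
        # whatever mlflow.set_tracking_uri() was last called with.
        client.log_artifact(run_id, path)
```

Because the function only depends on the two client methods, it can be exercised in a unit test with a stub client and no running MLflow server.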
Signed-off-by: Vanshika Vanshika <vvanshik@redhat.com> rh-pre-commit.version: 2.3.2 rh-pre-commit.check-secrets: ENABLED
    try:
        import mlflow

        tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"
🔴 UI server ignores MLFLOW_TRACKING_URI env var, falls back to hardcoded localhost
In ui_server.py, both the /api/mlflow-runs and /api/mlflow-feature-models endpoints resolve the MLflow tracking URI using mlflow_cfg.tracking_uri or "http://127.0.0.1:5000". This reads the raw tracking_uri field from the config and falls back to a hardcoded localhost URL, completely bypassing the MLFLOW_TRACKING_URI environment variable. In contrast, the rest of the codebase (feature_store.py:243, feature_store.py:325, feature_store.py:1696, feature_store.py:2885) correctly calls mlflow_cfg.get_tracking_uri() which checks the env var via sdk/python/feast/mlflow_integration/config.py:19-29. When a user sets MLFLOW_TRACKING_URI without setting tracking_uri in YAML (which is a very common deployment pattern, and documented in the PR's own docs at docs/reference/mlflow.md:51), the UI endpoints will incorrectly connect to http://127.0.0.1:5000 instead of the env-var-specified server.
Suggested change:

    - tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"
    + tracking_uri = mlflow_cfg.get_tracking_uri() or "http://127.0.0.1:5000"
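To make the intended fallback chain concrete, here is a minimal sketch of a resolver that consults the YAML value, then the `MLFLOW_TRACKING_URI` environment variable, then the localhost default. The function name `resolve_tracking_uri` and the precedence order are assumptions made for illustration; the actual `get_tracking_uri()` in `sdk/python/feast/mlflow_integration/config.py` may order or handle these differently.

```python
import os

DEFAULT_TRACKING_URI = "http://127.0.0.1:5000"


def resolve_tracking_uri(configured_uri=None, environ=None):
    """Hypothetical resolver: YAML value, then env var, then localhost.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to test without mutating the process environment.
    """
    env = os.environ if environ is None else environ
    return configured_uri or env.get("MLFLOW_TRACKING_URI") or DEFAULT_TRACKING_URI
```

With only the environment variable set, the resolver returns the env-var server instead of falling through to localhost, which is the case the UI endpoints currently miss.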
    try:
        import mlflow

        tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"
🔴 Second UI endpoint also ignores MLFLOW_TRACKING_URI env var
Same issue as in the /api/mlflow-runs endpoint: the /api/mlflow-feature-models endpoint at sdk/python/feast/ui_server.py:234 uses mlflow_cfg.tracking_uri or "http://127.0.0.1:5000" instead of mlflow_cfg.get_tracking_uri(), causing the MLFLOW_TRACKING_URI environment variable to be ignored.
Suggested change:

    - tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"
    + tracking_uri = mlflow_cfg.get_tracking_uri() or "http://127.0.0.1:5000"
What this PR does / why we need it:
- Auto-logging: Feature retrieval metadata is tagged on the active MLflow run (feast.feature_refs, feast.feature_views, feast.feature_service, feast.entity_count, etc.)
- Entity DataFrame archival: Optionally saves the training entity DataFrame as an MLflow artifact (entity_df.parquet) for full reproducibility
- Model-to-feature-service resolution: resolve_feature_service_from_model_uri() maps any MLflow model URI back to its Feast feature service enabling serving pipelines to auto-discover which features a model needs
- Entity DataFrame reconstruction: get_entity_df_from_mlflow_run() rebuilds the exact entity DataFrame from a past run's artifacts, enabling training reproducibility
- Configuration: Controlled entirely via feature_store.yaml under a new mlflow: block

Which issue(s) this PR fixes: