feat: Feast-MLflow Integration #6235

Open
Vperiodt wants to merge 11 commits into feast-dev:master from Vperiodt:feast-mlflow

Vperiodt commented Apr 8, 2026

What this PR does / why we need it:

final_mlflow_demo.mp4
  • Auto-logging: Feature retrieval metadata is recorded as tags on the active MLflow run (feast.feature_refs, feast.feature_views, feast.feature_service, feast.entity_count, etc.).
  • Entity DataFrame archival: Optionally saves the training entity DataFrame as an MLflow artifact (entity_df.parquet) for reproducibility.
  • Model-to-feature-service resolution: resolve_feature_service_from_model_uri() maps an MLflow model URI back to its Feast feature service, so serving pipelines can auto-discover which features a model needs.
  • Entity DataFrame reconstruction: get_entity_df_from_mlflow_run() rebuilds the entity DataFrame from a past run's artifacts, so a training run can be reproduced.
  • Configuration: Controlled entirely via feature_store.yaml under a new mlflow: block.
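
For readers unfamiliar with the auto-logging tags listed above, here is a minimal sketch of how such a tag set could be assembled. It is not the code in this PR: the helper name build_feast_tags, the comma-joined value format, and the sorted view list are assumptions made for illustration. Only the tag keys and Feast's "feature_view:feature" reference form come from the description.

```python
from typing import Dict, List, Optional


def build_feast_tags(
    feature_refs: List[str],
    feature_service: Optional[str] = None,
    entity_count: Optional[int] = None,
) -> Dict[str, str]:
    """Build a feast.* tag dict for an MLflow run (hypothetical helper)."""
    # Feast feature references use the "feature_view:feature" form, so the
    # feature view is the part before the first colon.
    views = sorted({ref.split(":", 1)[0] for ref in feature_refs})
    tags = {
        # Assumption: values are stored as comma-joined strings.
        "feast.feature_refs": ",".join(feature_refs),
        "feast.feature_views": ",".join(views),
    }
    if feature_service is not None:
        tags["feast.feature_service"] = feature_service
    if entity_count is not None:
        # MLflow tag values are strings, so the count is stringified.
        tags["feast.entity_count"] = str(entity_count)
    return tags
```

Each key/value pair could then be written to a run with MlflowClient.set_tag(run_id, key, value).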

Which issue(s) this PR fixes:

Checks

  • I've made sure the tests are passing.
  • My commits are signed off (git commit -s)
  • My PR title follows conventional commits format

Testing Strategy

  • Unit tests
  • Integration tests
  • Manual tests
  • Testing is not required for this change

Misc



github-advanced-security[bot]

This comment was marked as resolved.

Vperiodt marked this pull request as ready for review April 9, 2026 12:06
Vperiodt requested a review from a team as a code owner April 9, 2026 12:06
devin-ai-integration[bot]

Five comments were marked as resolved.

Vperiodt changed the title from "Feast-MLflow Integration" to "feat: Feast-MLflow Integration" on Apr 15, 2026
devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 19 additional findings in Devin Review.


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "entity_df.parquet")
    entity_df.to_parquet(path, index=False)
    mlflow.log_artifact(path)
devin-ai-integration bot commented Apr 15, 2026

🟡 Mixing global mlflow.log_artifact() with explicit-URI client causes artifact/metadata to target different servers

In _auto_log_entity_df_info, tags and params are logged via a locally-created MlflowClient(tracking_uri=tracking_uri) (lines 304, 309-318), but the entity DataFrame artifact is uploaded via the global mlflow.log_artifact(path) (line 327). The global function uses the tracking URI set by mlflow.set_tracking_uri(), not the explicit tracking_uri from the config. If the global tracking URI is changed by another library in the process, or if _init_mlflow_tracking failed silently (caught by except Exception at sdk/python/feast/feature_store.py:249), the artifact would be uploaded to a different server than where the tags/params were logged, splitting metadata across two servers.
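
One way to address this, sketched under assumptions, is to upload the artifact through the same explicit client that logs the tags and params. The helper name log_entity_df_artifact and its parameters are hypothetical. In real use, client would be an MlflowClient(tracking_uri=tracking_uri), whose log_artifact(run_id, local_path) method targets the client's own tracking server rather than the global URI.

```python
import os
import tempfile


def log_entity_df_artifact(client, run_id, entity_df, filename="entity_df.parquet"):
    """Upload entity_df via an explicit client (hypothetical helper).

    client: any object exposing log_artifact(run_id, local_path), e.g. an
            mlflow.tracking.MlflowClient built with an explicit tracking_uri.
    entity_df: any object exposing to_parquet(path, index=False), e.g. a
               pandas DataFrame.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, filename)
        entity_df.to_parquet(path, index=False)
        # Goes to the client's tracking server, not the global one set by
        # mlflow.set_tracking_uri().
        client.log_artifact(run_id, path)
```

Passing the run ID explicitly also removes the dependency on whichever run the global mlflow module considers active.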


Vperiodt added 10 commits April 16, 2026 18:16
Signed-off-by: Vanshika Vanshika <vvanshik@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

devin-ai-integration bot left a comment

Devin Review found 2 new potential issues.

View 20 additional findings in Devin Review.


try:
    import mlflow

    tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"

🔴 UI server ignores MLFLOW_TRACKING_URI env var, falls back to hardcoded localhost

In ui_server.py, both the /api/mlflow-runs and /api/mlflow-feature-models endpoints resolve the MLflow tracking URI using mlflow_cfg.tracking_uri or "http://127.0.0.1:5000". This reads the raw tracking_uri field from the config and falls back to a hardcoded localhost URL, completely bypassing the MLFLOW_TRACKING_URI environment variable. In contrast, the rest of the codebase (feature_store.py:243, feature_store.py:325, feature_store.py:1696, feature_store.py:2885) correctly calls mlflow_cfg.get_tracking_uri() which checks the env var via sdk/python/feast/mlflow_integration/config.py:19-29. When a user sets MLFLOW_TRACKING_URI without setting tracking_uri in YAML (which is a very common deployment pattern, and documented in the PR's own docs at docs/reference/mlflow.md:51), the UI endpoints will incorrectly connect to http://127.0.0.1:5000 instead of the env-var-specified server.

Suggested change
- tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"
+ tracking_uri = mlflow_cfg.get_tracking_uri() or "http://127.0.0.1:5000"
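
To illustrate the fallback chain this comment is asking for, here is a minimal sketch. The function name resolve_tracking_uri and the precedence order (YAML value first, then MLFLOW_TRACKING_URI, then the localhost default) are assumptions. The actual order implemented by get_tracking_uri() in config.py is not shown in this thread.

```python
import os
from typing import Optional

DEFAULT_TRACKING_URI = "http://127.0.0.1:5000"


def resolve_tracking_uri(configured_uri: Optional[str]) -> str:
    """Resolve the MLflow tracking URI (hypothetical helper).

    Assumed precedence: explicit config value, then the
    MLFLOW_TRACKING_URI environment variable, then a localhost default.
    """
    return (
        configured_uri
        or os.environ.get("MLFLOW_TRACKING_URI")
        or DEFAULT_TRACKING_URI
    )
```

With this shape, a deployment that sets only the environment variable would no longer fall through to the localhost default.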

try:
    import mlflow

    tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"

🔴 Second UI endpoint also ignores MLFLOW_TRACKING_URI env var

Same issue as in the /api/mlflow-runs endpoint: the /api/mlflow-feature-models endpoint at sdk/python/feast/ui_server.py:234 uses mlflow_cfg.tracking_uri or "http://127.0.0.1:5000" instead of mlflow_cfg.get_tracking_uri(), causing the MLFLOW_TRACKING_URI environment variable to be ignored.

Suggested change
- tracking_uri = mlflow_cfg.tracking_uri or "http://127.0.0.1:5000"
+ tracking_uri = mlflow_cfg.get_tracking_uri() or "http://127.0.0.1:5000"

Signed-off-by: Vanshika Vanshika <vvanshik@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
2 participants