Feature/snowflake s3 stage v1 #19

Open
abhishek-pattern wants to merge 167 commits into main from feature/snowflake-s3-stage-v1
@abhishek-pattern abhishek-pattern commented Feb 27, 2026

Updates:

  • New use_s3_stage parameter in publish_pandas and query_pandas_from_snowflake
  • New BatchInferencePipeline for inference using the S3 stage
  • Added automatic warehouse selection: you specify the warehouse only as "xl", "med", or "xs", and the actual warehouse is derived from tags and current.is_production (backwards compatible)
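
To illustrate the automatic warehouse selection described above, here is a minimal sketch of how a size alias might be expanded. The function name `resolve_warehouse`, the `team` argument (standing in for a value read from tags), and the `<TEAM>_<ENV>_<SIZE>_WH` naming scheme are all assumptions made for illustration; the actual derivation in this PR may differ.

```python
# Hypothetical sketch: names and the naming scheme below are assumptions,
# not the implementation in this PR.
SIZE_ALIASES = {"xs", "med", "xl"}


def resolve_warehouse(warehouse: str, is_production: bool, team: str = "analytics") -> str:
    """Expand a size alias into a full warehouse name; pass other names through."""
    alias = warehouse.lower()
    if alias not in SIZE_ALIASES:
        # Backwards compatible: an explicit warehouse name is used unchanged.
        return warehouse
    # In a flow, is_production would come from current.is_production.
    env = "PROD" if is_production else "DEV"
    return f"{team}_{env}_{alias}_WH".upper()
```

Under these assumptions, `resolve_warehouse("xl", True)` expands the alias for the production environment, while a full warehouse name is returned untouched, which is what keeps existing callers working.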

- Add query_pandas_from_snowflake_via_s3_stage() for efficient large query results (>10M rows)
- Add publish_pandas_via_s3_stage() for efficient large DataFrame writes (>10M rows)
- Add make_batch_predictions_from_snowflake_via_s3_stage() for batch ML predictions
- Support dev/prod environment switching via current.is_production
- Add helper functions for S3 operations and SQL generation
- Add metaflow_s3/utils.py with S3 utility functions
- Add comprehensive functional tests
- Integrate with existing Metaflow card system and cost tracking
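
As a rough illustration of the SQL-generation helpers mentioned above, the sketch below builds a Snowflake `COPY INTO` statement that unloads query results to a stage as Parquet. The function name `build_unload_sql` and its parameters are hypothetical and are not taken from this PR; only the `COPY INTO ... FILE_FORMAT = (TYPE = PARQUET) HEADER = TRUE` statement shape is standard Snowflake syntax.

```python
# Hypothetical helper: the name and signature are assumptions for illustration.
def build_unload_sql(query: str, stage: str, prefix: str) -> str:
    """Return a COPY INTO statement that unloads query results to a stage as Parquet."""
    # Normalise the prefix so the location has exactly one slash on each side.
    location = f"@{stage}/{prefix.strip('/')}/"
    # Drop a trailing semicolon so the query can be wrapped in parentheses.
    inner = query.strip().rstrip(";")
    return (
        f"COPY INTO {location}\n"
        f"FROM ({inner})\n"
        "FILE_FORMAT = (TYPE = PARQUET)\n"
        "HEADER = TRUE"  # keep column names in the Parquet output
    )
```

The resulting files could then be read from S3 into a DataFrame, which is presumably the step that makes this path cheaper than fetching large result sets through the connector.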
…function for improved readability and maintainability
- Introduced new documentation for `make_pydantic_parser_fn`, `publish`, `publish_pandas`, `query_pandas_from_snowflake`, and `restore_step_state` functions.
- Removed the outdated `pandas.md` and `validate_config.md` documentation files.
- Updated the Snowflake utilities README to reflect the integration with Metaflow and emphasize the use of high-level APIs.
…e formatting in publish_pandas documentation
