Commit 2e5eb98

feat: auto-register ObjectStore and accept it in read/register methods
Closes #899. Integrates pyo3-object-store 0.9 to replace hand-rolled store classes and adds two quality-of-life improvements:

1. **Auto-registration**: Every `read_*` / `register_*` call now invokes `try_register_url_store()`, which inspects the URL scheme (s3 / gs / az / http / https) and silently registers an appropriate `ObjectStore`. A guard (`object_store_registry.get_store()`) prevents overwriting a store the user already registered.
2. **Explicit store parameter**: All eight read/register methods on `SessionContext` now accept an optional `object_store` keyword argument. When supplied the store is registered directly (keyed on the path URL) before the operation runs; the auto-registration path is skipped.

Changes:

- `Cargo.toml` / `crates/core/Cargo.toml`: add `pyo3-object_store 0.9` workspace dependency.
- `crates/core/src/lib.rs`: replace hand-rolled store sub-module registration with `pyo3_object_store::register_store_module()`.
- `crates/core/src/context.rs`:
  - `register_object_store(url, store)` rewritten to accept `PyObjectStore` directly (no more `StorageContexts` enum).
  - New `prepare_store_for_path(path, store)` helper centralises the explicit-vs-auto dispatch.
  - `try_register_url_store` gains an early-return guard.
  - All eight read/register methods gain `object_store: Option<PyObjectStore>`.
- `python/datafusion/object_store.py`: rewritten to re-export `S3Store`, `GCSStore`, `AzureStore`, `HTTPStore`, `LocalStore`, `MemoryStore`, and `from_url` from `pyo3-object-store`, plus backward-compat aliases (`AmazonS3`, `GoogleCloud`, `MicrosoftAzure`, `Http`, `LocalFileSystem`).
- `python/datafusion/context.py`: all eight Python-side methods updated with `object_store: Any | None = None` and docstring entries.
- `python/tests/test_sql.py`: new integration tests covering explicit S3 store, auto-registered S3 URL, HTTP CSV, HTTPS CSV, and HTTPS Parquet (using public `coiled-datasets` bucket and GitHub raw URLs).
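The explicit-vs-auto dispatch described in this message can be modelled roughly as below. This is a minimal pure-Python sketch, not the Rust code in `crates/core/src/context.rs`: `StoreRegistry`, `key_for`, the returned status strings, and the `auto:<scheme>` placeholder stores are all hypothetical, and keying the registry on scheme plus host/bucket is an assumption about what "keyed on the path URL" means.

```python
from urllib.parse import urlparse

# Schemes the commit message lists as eligible for auto-registration.
AUTO_SCHEMES = {"s3", "gs", "az", "http", "https"}


class StoreRegistry:
    """Toy stand-in for the session's object store registry (hypothetical)."""

    def __init__(self):
        self._stores = {}

    @staticmethod
    def key_for(path):
        """Return 'scheme://host' for remote URLs, or None for local paths.

        Assumption: stores are keyed on scheme plus host/bucket.
        """
        parts = urlparse(path)
        if not parts.scheme or not parts.netloc:
            return None
        return f"{parts.scheme}://{parts.netloc}"

    def get_store(self, key):
        return self._stores.get(key)

    def register(self, key, store):
        self._stores[key] = store


def prepare_store_for_path(registry, path, object_store=None):
    """Model of the dispatch; returns a label naming the branch taken."""
    key = registry.key_for(path)
    if key is None:
        return "local"  # nothing to register for plain file paths

    if object_store is not None:
        # Explicit store: register it directly and skip auto-registration.
        registry.register(key, object_store)
        return "explicit"

    scheme = key.split("://", 1)[0]
    if scheme not in AUTO_SCHEMES:
        return "unsupported"

    # Early-return guard: never overwrite a store that is already registered.
    if registry.get_store(key) is not None:
        return "kept"

    # Placeholder for building a real store from the URL.
    registry.register(key, f"auto:{scheme}")
    return "auto"
```

In this model, a store passed explicitly for a URL is never replaced by a later auto-registration for the same scheme and host, which is the behaviour the guard is meant to provide.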
1 parent be8dd9d commit 2e5eb98

9 files changed: +395 -68 lines changed

Cargo.lock

Lines changed: 54 additions & 6 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -56,6 +56,7 @@ async-trait = "0.1.89"
 futures = "0.3"
 cstr = "0.2"
 object_store = { version = "0.13.1" }
+pyo3-object_store = { version = "0.9" }
 url = "2"
 log = "0.4.29"
 parking_lot = "0.12"

crates/core/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ async-trait = { workspace = true }
 futures = { workspace = true }
 cstr = { workspace = true }
 object_store = { workspace = true, features = ["aws", "gcp", "azure", "http"] }
+pyo3-object_store = { workspace = true }
 url = { workspace = true }
 log = { workspace = true }
 parking_lot = { workspace = true }
