Bug: REST catalog auth cannot be configured via environment variables unless auth JSON strings are decoded #3422

@kevinjqliu

Apache Iceberg version

None

Please describe the bug 🐞

Summary

RestCatalog._create_session() expects auth to be a dict. When catalog config comes from environment variables, values are strings, so auth is received as a string and auth initialization fails.

This blocks env-var-based configuration for pluggable REST auth (basic, oauth2, google, entra, custom) unless string JSON is explicitly decoded first.

Minimal repro

export PYICEBERG_CATALOG__REST__TYPE=rest
export PYICEBERG_CATALOG__REST__URI=http://localhost:8181
export PYICEBERG_CATALOG__REST__AUTH='{"type":"oauth2","oauth2":{"client_id":"id","client_secret":"secret","token_url":"https://auth.example/token"}}'

# Then, in Python (same shell session so the variables are inherited):
from pyiceberg.catalog import load_catalog
load_catalog("rest")

Expected vs. actual (without this fix)

Expected: catalog initializes and uses the configured auth manager.

Actual: initialization fails because auth is treated as a string and .get(...) is called on it.
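The failure mode can be reproduced without PyIceberg at all. The snippet below is only an illustration of the mechanism described above, not the actual code in RestCatalog._create_session(); the variable names are made up for the example.

```python
# Environment variables always arrive as strings, so the auth config
# reaches the catalog as a str rather than a dict.
auth = '{"type":"oauth2","oauth2":{"client_id":"id"}}'

try:
    # Dict-style access on a str: this is the call pattern the issue describes.
    auth.get("type")
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

Any code path that assumes auth is a mapping will hit this as soon as the value comes from the environment.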

Suggested fix

In REST catalog session setup, if auth is a string, decode it as JSON before reading auth.type and type-specific config.
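A minimal sketch of that decoding step, assuming a standalone helper. The name normalize_auth is hypothetical and does not exist in PyIceberg; where the real fix would live inside the session setup is left open.

```python
import json
from typing import Any


def normalize_auth(auth: Any) -> dict[str, Any]:
    """Return the auth config as a dict, decoding a JSON string if needed.

    Hypothetical helper for illustration; not part of PyIceberg.
    """
    if isinstance(auth, str):
        try:
            auth = json.loads(auth)
        except json.JSONDecodeError as exc:
            raise ValueError(f"auth is not valid JSON: {exc}") from exc
    if not isinstance(auth, dict):
        raise ValueError(f"auth must be a JSON object, got {type(auth).__name__}")
    return auth


# Same shape as the JSON in the repro, shortened.
config = normalize_auth('{"type":"oauth2","oauth2":{"client_id":"id","client_secret":"secret"}}')
auth_type = config.get("type")                # "oauth2"
auth_impl_config = config.get(auth_type, {})  # type-specific settings
```

Passing an already-decoded dict through unchanged keeps YAML-based configuration working as before, so the decoding only affects the env-var path.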

Add regression tests for both:

  • PYICEBERG_CATALOG__<NAME>__AUTH (JSON string) initializes auth manager correctly.
  • PYICEBERG_CATALOG__<NAME>__AUTH__... maps correctly into auth manager configuration.

Alternative fix (follows current env-var standard)

Support flattened auth properties from environment variables instead of requiring a JSON blob in ...__AUTH.

Example:

export PYICEBERG_CATALOG__REST__AUTH__TYPE=oauth2
export PYICEBERG_CATALOG__REST__AUTH__OAUTH2__CLIENT_ID=id
export PYICEBERG_CATALOG__REST__AUTH__OAUTH2__CLIENT_SECRET=secret
export PYICEBERG_CATALOG__REST__AUTH__OAUTH2__TOKEN_URL=https://auth.example/token

This aligns with existing flattened env-var configuration behavior and avoids JSON-in-env quoting/escaping issues.

Verification

Observed with current code path (Config._from_environment_variables):

export PYICEBERG_CATALOG__REST__AUTH__TYPE=oauth2
export PYICEBERG_CATALOG__REST__AUTH__OAUTH2__CLIENT_ID=id
export PYICEBERG_CATALOG__REST__AUTH__OAUTH2__CLIENT_SECRET=secret

Parsed result:

{'catalog': {'rest': {'auth.type': 'oauth2', 'auth.oauth2.client-id': 'id', 'auth.oauth2.client-secret': 'secret'}}}

This confirms flattened AUTH__... env vars are currently stored as dotted keys, not as a nested auth object consumed by RestCatalog._create_session().
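One way the alternative fix could bridge that gap is to fold the dotted keys back into a nested dict before auth setup. The sketch below is an assumption about how that might look: nest_auth_properties is a hypothetical name, and converting hyphens to underscores in the final key segment is a guess made so that client-id lines up with the client_id key used in the JSON form.

```python
from typing import Any


def nest_auth_properties(properties: dict[str, str], prefix: str = "auth.") -> dict[str, Any]:
    """Fold dotted 'auth.*' keys into a nested dict (hypothetical helper)."""
    nested: dict[str, Any] = {}
    for key, value in properties.items():
        if not key.startswith(prefix):
            continue  # leave non-auth catalog properties alone
        parts = key[len(prefix):].split(".")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        # Assumption: only the leaf key is normalized, e.g. client-id -> client_id.
        node[parts[-1].replace("-", "_")] = value
    return nested


# The catalog properties from the parsed result above.
props = {
    "auth.type": "oauth2",
    "auth.oauth2.client-id": "id",
    "auth.oauth2.client-secret": "secret",
}
auth = nest_auth_properties(props)
print(auth)
```

This sketch does not handle a key that is both a leaf and a parent (for example auth.oauth2 set alongside auth.oauth2.client-id); a real fix would need to decide how to treat that conflict.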

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
