Skip to content

Python driver: Add skip_load parameter to skip LOAD 'age' statement#2366

Open
uesleilima wants to merge 2 commits intoapache:masterfrom
uesleilima:master
Open

Python driver: Add skip_load parameter to skip LOAD 'age' statement#2366
uesleilima wants to merge 2 commits intoapache:masterfrom
uesleilima:master

Conversation

@uesleilima
Copy link
Copy Markdown

Fixes #2353

Problem

setUpAge() and Age.connect() unconditionally execute LOAD 'age' (or LOAD '$libdir/plugins/age'). On managed PostgreSQL services like Azure Database for PostgreSQL, the AGE extension is loaded server-side via shared_preload_libraries and the binary is not accessible at the file path expected by LOAD, causing a psycopg.errors.UndefinedFile error. There is no way to use the driver in these environments without bypassing the setup entirely.

Solution

Add a skip_load parameter (default False) to setUpAge(), Age.connect(), and the module-level age.connect(). When True, the LOAD statement is skipped while all other setup is preserved:

  • SET search_path
  • agtype adapter registration
  • graph existence check / creation

This is fully backward-compatible — existing code is unaffected.

Usage

import age

ag = age.connect(
    dsn="host=myserver.postgres.database.azure.com ...",
    graph="my_graph",
    skip_load=True
)

- Add 4 unit tests for setUpAge() skip_load behavior:
  skip_load=True skips LOAD, skip_load=False executes LOAD,
  load_from_plugins integration, and search_path always set.
- Document skip_load in README under new "Managed PostgreSQL Usage"
  section for Azure/AWS RDS/etc. environments.
- Fix syntax in existing load_from_plugins code example.

Made-with: Cursor
Copy link
Copy Markdown

@tcolo tcolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uesleilima thanks a a lot for the PR - please addresse the issue sin the Claude Code Check.


Code Review

Summary

This PR addresses a real and valid problem — managed PostgreSQL environments (Azure, AWS RDS) where LOAD 'age' fails because the extension is pre-loaded via shared_preload_libraries. The implementation is backward-compatible and test coverage is solid.

However, there's a significant design conflict with the already-merged configure_connection() API that should be resolved before merging.


🔴 Critical

skip_load may leak into psycopg.connect() via **kwargs (age/__init__.py)

The current module-level connect() signature is:

def connect(dsn=None, graph=None, connection_factory=None, cursor_factory=ClientCursor, load_from_plugins=False, **kwargs):
    ag.connect(dsn=dsn, ..., load_from_plugins=load_from_plugins, **kwargs)

If skip_load is not explicitly named in this signature, it falls through **kwargsAge.connect(**kwargs)psycopg.connect(**kwargs), which will raise:

ProgrammingError: invalid dsn parameter 'skip_load'

skip_load must be extracted explicitly, the same way load_from_plugins is.


🟡 Suggestions

1. API duplication with configure_connection() (age/age.pysetUpAge())

The already-merged configure_connection(load=False) (see commits 7d64707e, 2a8a7c2d) solves the exact same problem — skipping LOAD 'age' on managed services — with a safer opt-in default. This PR introduces a parallel API for the same use case.

2. Inverted semantics between the two APIs

Function Pattern Safe default
configure_connection() load=False (opt-in) ✅ skips LOAD by default
setUpAge() (this PR) skip_load=False (opt-out) ⚠️ executes LOAD by default

A user reading both APIs won't know which to use for managed services, and the meaning of "safe default" is different for each. If both must coexist, the semantics should be aligned and the distinction documented clearly.

3. Undocumented interaction between skip_load=True and load_from_plugins=True

When both flags are set, LOAD is silently skipped — which is probably correct, but surprising. The docstring should document the precedence, or a ValueError should be raised for the contradictory combination.

4. Call chain not covered by tests

The unit tests mock setUpAge() in isolation, but the path age.connect(skip_load=True)Age.connect()setUpAge() isn't tested end-to-end. A test verifying the parameter is forwarded correctly through the full call chain would catch regressions like the **kwargs leak described above.

5. README should mention configure_connection() as an alternative

The new README section shows age.connect(skip_load=True) as the solution for managed environments. Since configure_connection() also exists and is better suited for pool-based setups, the docs should mention both and clarify the trade-offs.


✅ What Looks Good

  • Fully backward-compatible (skip_load=False default preserves existing behavior)
  • Search path and adapter registration are correctly preserved when skip_load=True
  • Unit tests use mocking cleanly — no live DB required
  • 5 test methods give solid coverage of setUpAge() in isolation
  • The problem (UndefinedFile on Azure PostgreSQL) is well-documented in the PR description

Verdict: Request Changes

The **kwargs leak is a likely runtime error that needs to be addressed. Beyond that, the team should align on whether skip_load on setUpAge() is needed given configure_connection(load=False) already exists, or document a clear distinction between the two. Merging both without resolving this will leave users with two confusingly similar solutions with opposite flag polarities.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants