fix(python-driver): quote-escape column aliases in buildCypher()#2373
Open
uesleilima wants to merge 1 commit intoapache:masterfrom
Open
fix(python-driver): quote-escape column aliases in buildCypher()#2373uesleilima wants to merge 1 commit intoapache:masterfrom
uesleilima wants to merge 1 commit intoapache:masterfrom
Conversation
Always double-quote column names in the AS (...) clause generated by
buildCypher() and _validate_column(). This prevents PostgreSQL parse
errors when column aliases happen to be reserved words (e.g. 'count',
'order', 'type', 'group', 'select').
Before: SELECT * from cypher(NULL,NULL) as (count agtype);
After: SELECT * from cypher(NULL,NULL) as ("count" agtype);
PostgreSQL always accepts double-quoted identifiers, so quoting
unconditionally is safe for all names — no reserved word list needed.
Closes apache#2370
|
@uesleilima many thanks for this. First check with Claude code Code Review: apache/age#2373 —
|
| # | File | Location | Suggestion | Category |
|---|---|---|---|---|
| 1 | age.py |
_validate_column line 261 |
Type name is not quoted. return f'"{name}" {type_name}' only quotes the name, but type_name is also a user-provided identifier. While in practice AGE only uses agtype, it is inconsistent — if a caller passes v mytype, the type name is unquoted. Consider quoting it too, or documenting why it is intentionally left unquoted. |
Correctness |
| 2 | age.py |
buildCypher line 268 |
if graphName == None: should be if graphName is None: — pre-existing issue, but touched in this diff's context. |
Style |
| 3 | test_age_py.py |
test_reserved_word_count |
The assertNotIn("(count agtype", result) check is narrowly crafted. It would miss a regression like ( count agtype (space after paren). A stronger assertion would be a regex like r'\bcount\s+agtype\b' outside of quotes. |
Testing |
| 4 | test_age_py.py |
TestBuildCypher imports |
from age.age import buildCypher, _validate_column imports a private function (_validate_column is prefixed _). Testing private functions directly is acceptable here since the behavior is important, but worth a comment noting it's intentional. |
Maintainability |
| 5 | age.py |
buildCypher |
No test covers the "name type" path where type_name contains a reserved-word-like string (e.g., "order agtype"). Adding one would verify the quoting is applied in the two-token branch as well. |
Testing |
✅ What Looks Good
- Minimal diff — only the three affected lines changed, no scope creep.
- Correct fix — unconditional double-quoting is the right approach; PostgreSQL always accepts double-quoted identifiers, and the identifier regex
^[A-Za-z_][A-Za-z0-9_]*$already prevents injection of embedded quotes. - Updated comment — the design note in
buildCypherwas proactively updated to document the quoting rationale. - Solid test coverage — 11 unit tests with no DB dependency is exactly right for this kind of fix; the reserved-word cases are all explicitly called out.
test_validate_column_quotingdirectly asserts the output format, making future regressions immediately obvious.
Verdict
Approve with minor suggestions. The fix is correct and safe. Suggestion #1 (unquoted type_name) is the only one worth a follow-up conversation — the rest are low-priority polish.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix
execCypher()crashing when column aliases are PostgreSQL reserved words (#2370).Problem
The
buildCypher()function generates SQL like:When the column alias is a reserved word (
count,order,type,group,select, etc.), PostgreSQL raises a parse error because the identifier is unquoted.This commonly happens when Cypher
RETURNclauses use aggregation functions:Fix
Always double-quote column names in
_validate_column():PostgreSQL always accepts double-quoted identifiers, so quoting unconditionally is safe for all names — no reserved word list needed. This is simpler and more maintainable than checking against a list of 60+ reserved words.
Generated SQL after fix:
Tests
Added 11 new unit tests in
TestBuildCypherclass (no DB required):count,order,type,select,group_validate_column()direct quoting verificationCloses