Skip to content

Multiprefix/test#471

Draft
rbs333 wants to merge 10 commits intomainfrom
multiprefix/test
Draft

Multiprefix/test#471
rbs333 wants to merge 10 commits intomainfrom
multiprefix/test

Conversation

@rbs333
Copy link
Collaborator

@rbs333 rbs333 commented Feb 11, 2026

No description provided.

rbs333 and others added 10 commits February 3, 2026 10:35
The previous implementation used str.replace() which had two bugs:
1. Partial matching: :id would incorrectly replace inside :product_id
2. No quote escaping: values like O'Brien produced invalid SQL

The new token-based approach:
- Splits SQL on :param patterns, then rebuilds with substituted values
- Prevents partial matching by design (tokens are complete identifiers)
- Escapes single quotes using SQL standard (' -> '')
- 2-7x faster than regex alternative, scales better with more parameters

Added unit tests covering partial matching, quote escaping,
and edge cases (unicode, backslashes, empty strings, bytes).dit tool working with tests
Copilot AI review requested due to automatic review settings February 11, 2026 09:04
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces two major features to the redisvl library: SQL query support and multi-prefix index capability. The PR bumps the version from 0.13.2 to 0.14.0, reflecting the addition of significant new functionality.

Changes:

  • Adds SQLQuery class to translate SQL SELECT statements into Redis FT.SEARCH/FT.AGGREGATE commands via the sql-redis library
  • Implements multi-prefix index support allowing a single index to monitor multiple key prefixes
  • Adds comprehensive test coverage for both new features

Reviewed changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
pyproject.toml Version bump to 0.14.0, adds sql-redis optional dependency
uv.lock Updates lock file with sql-redis and sqlglot dependencies
redisvl/query/sql.py New SQLQuery class with parameter substitution and Redis command translation
redisvl/query/init.py Exports SQLQuery from the query module
redisvl/index/index.py Adds multi-prefix support and _sql_query method for executing SQL queries
tests/unit/test_sql_parameter_substitution.py Unit tests for SQL parameter substitution edge cases
tests/integration/test_sql_redis_json.py Integration tests for SQL queries against JSON storage
tests/integration/test_sql_redis_hash.py Integration tests for SQL queries against Hash storage
tests/integration/test_multi_prefix.py Integration tests verifying multi-prefix index functionality
examples/multi_prefix_example.py Example demonstrating multi-prefix index usage
docs/api/query.rst API documentation for SQLQuery
docs/user_guide/index.md Adds reference to new SQL query guide
docs/user_guide/01_getting_started.ipynb Updates prefix configuration example
docs/user_guide/02_hybrid_queries.ipynb Re-executed notebook with updated outputs

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


Note:
Requires the optional `sql-redis` package. Install with:
``pip install redisvl[sql]``
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring in line 30 refers to pip install redisvl[sql], but the extra name in pyproject.toml is sql-redis. This inconsistency will confuse users. Update the docstring to use pip install redisvl[sql-redis].

Suggested change
``pip install redisvl[sql]``
``pip install redisvl[sql-redis]``

Copilot uses AI. Check for mistakes.
except ImportError:
raise ImportError(
"sql-redis is required for SQL query support. "
"Install it with: pip install redisvl[sql]"
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message instructs users to install with pip install redisvl[sql], but the actual extra name defined in pyproject.toml is sql-redis. This inconsistency will cause installation failures. The error message should be updated to pip install redisvl[sql-redis] to match the extra name defined in pyproject.toml line 54.

Suggested change
"Install it with: pip install redisvl[sql]"
"Install it with: pip install redisvl[sql-redis]"

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +91
def _substitute_params(self, sql: str, params: Dict[str, Any]) -> str:
"""Substitute parameter placeholders in SQL with actual values.

Uses token-based approach: splits SQL on :param patterns, then rebuilds
with substituted values. This prevents partial matching (e.g., :id
won't match inside :product_id) and is faster than regex at scale.

Args:
sql: The SQL string with :param placeholders.
params: Dictionary mapping parameter names to values.

Returns:
SQL string with parameters substituted.

Note:
- String values are wrapped in single quotes with proper escaping
- Numeric values are converted to strings
- Bytes values (e.g., vectors) are NOT substituted here
"""
if not params:
return sql

# Split SQL on :param patterns, keeping the delimiters
# Pattern matches : followed by valid identifier (letter/underscore, then alphanumeric/underscore)
tokens = re.split(r"(:[a-zA-Z_][a-zA-Z0-9_]*)", sql)

result = []
for token in tokens:
if token.startswith(":"):
key = token[1:] # Remove leading :
if key in params:
value = params[key]
if isinstance(value, (int, float)):
result.append(str(value))
elif isinstance(value, str):
# Escape single quotes using SQL standard: ' -> ''
escaped = value.replace("'", "''")
result.append(f"'{escaped}'")
else:
# Keep placeholder for bytes (vectors handled by Executor)
result.append(token)
else:
# Keep unmatched placeholders as-is
result.append(token)
else:
result.append(token)

return "".join(result)
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter substitution function only escapes single quotes for string values. While this follows SQL standard escaping, the implementation doesn't prevent SQL injection attacks through other means (e.g., backslashes, special characters). Consider whether the sql-redis library performs additional sanitization on the SQL string after substitution. If not, this could be a security vulnerability. Additionally, document clearly what sanitization is expected from downstream components versus what is done here.

Copilot uses AI. Check for mistakes.
" \"index\": {\n",
" \"name\": \"user_simple\",\n",
" \"prefix\": \"user_simple_docs\",\n",
" \"prefix\": [\"prefix_1\", \"prefix_2\"],\n",
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The getting started notebook has been updated to use a multi-prefix configuration ["prefix_1", "prefix_2"] instead of the single prefix "user_simple_docs". However, this change appears to be unintentional for a getting started guide, as:

  1. The table output still shows ['user_simple_docs'] as the prefix (line 296), suggesting the notebook output wasn't re-run after the code change
  2. Multi-prefix configuration is an advanced topic that may confuse beginners in a "getting started" guide
  3. There's no explanation or context for why multiple prefixes are being used

This should either be reverted to the simple single-prefix example, or the notebook should be re-run and additional explanation added.

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +17
"""Simulate the CURRENT buggy implementation for comparison.

This is the exact code from redisvl/query/sql.py lines 105-113.
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring comment on lines 15-18 states "This is the exact code from redisvl/query/sql.py lines 105-113", but those line numbers don't correspond to any code in the actual redisvl/query/sql.py file (which only has 159 lines total). This comment appears to reference a buggy implementation that may have existed during development but doesn't match the current code. The comment should either be removed or updated to clarify that this is a demonstration of what NOT to do, without referencing non-existent line numbers.

Suggested change
"""Simulate the CURRENT buggy implementation for comparison.
This is the exact code from redisvl/query/sql.py lines 105-113.
"""Simulate a naive, intentionally buggy implementation for comparison.
This mirrors the earlier str.replace-based approach used in SQLQuery and
is kept only to demonstrate incorrect behavior; it is not the production
implementation.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +13
import pytest

from redisvl.query.sql import SQLQuery


Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'pytest' is not used.

Suggested change
import pytest
from redisvl.query.sql import SQLQuery
from redisvl.query.sql import SQLQuery

Copilot uses AI. Check for mistakes.
# Clean up any existing index
try:
client.ft(INDEX_NAME).dropindex(delete_documents=True)
except Exception:
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
except Exception:
# The index may not exist yet or may have been removed already; ignore drop errors.

Copilot uses AI. Check for mistakes.
# Clean up any existing index
try:
client.ft(index_name).dropindex(delete_documents=True)
except Exception:
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
except Exception:
# The index may not exist yet or may have already been removed; ignore
# cleanup errors so that test setup can proceed.

Copilot uses AI. Check for mistakes.
# Cleanup
try:
index.delete(drop=True)
except Exception:
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
except Exception:
# Best-effort cleanup: ignore errors if the index is already deleted
# or the Redis connection is unavailable during test teardown.

Copilot uses AI. Check for mistakes.
finally:
try:
index.delete(drop=True)
except Exception:
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
except Exception:
# Best-effort cleanup: ignore errors when deleting the index so tests
# do not fail due to teardown issues.

Copilot uses AI. Check for mistakes.
@tylerhutcherson
Copy link
Collaborator

@rbs333 Looks like this has leaked commits from the SQL work. Can you pare it back to the multi-prefix additions?

Copy link
Collaborator

@vishal-bala vishal-bala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major comments from me aside a question about whether building the registry/executor for SQL translation is expensive. There are a few Copilot comments that I agreed should be addressed (left a 👍 on them) - aside from those, it generally looks good! Super nice to be able to have SQL queries translated 🙌

Comment on lines 945 to 948
registry = SchemaRegistry(self._redis_client)
registry.load_all() # Loads index schemas from Redis

executor = Executor(self._redis_client, registry)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How involved are these to be run on every query? If they're non-negligible, do we want to cache/pre-load/lazy-load them?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants