Multiprefix/test by rbs333 · Pull Request #471 · redis/redis-vl-python

rbs333 · 2026-02-11T09:04:57Z

No description provided.

The previous implementation used str.replace() which had two bugs: 1. Partial matching: :id would incorrectly replace inside :product_id 2. No quote escaping: values like O'Brien produced invalid SQL The new token-based approach: - Splits SQL on :param patterns, then rebuilds with substituted values - Prevents partial matching by design (tokens are complete identifiers) - Escapes single quotes using SQL standard (' -> '') - 2-7x faster than regex alternative, scales better with more parameters Added unit tests covering partial matching, quote escaping, and edge cases (unicode, backslashes, empty strings, bytes).dit tool working with tests

Copilot

Pull request overview

This pull request introduces two major features to the redisvl library: SQL query support and multi-prefix index capability. The PR bumps the version from 0.13.2 to 0.14.0, reflecting the addition of significant new functionality.

Changes:

Adds SQLQuery class to translate SQL SELECT statements into Redis FT.SEARCH/FT.AGGREGATE commands via the sql-redis library
Implements multi-prefix index support allowing a single index to monitor multiple key prefixes
Adds comprehensive test coverage for both new features

Reviewed changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 12 comments.

Show a summary per file

File	Description
pyproject.toml	Version bump to 0.14.0, adds sql-redis optional dependency
uv.lock	Updates lock file with sql-redis and sqlglot dependencies
redisvl/query/sql.py	New SQLQuery class with parameter substitution and Redis command translation
redisvl/query/init.py	Exports SQLQuery from the query module
redisvl/index/index.py	Adds multi-prefix support and _sql_query method for executing SQL queries
tests/unit/test_sql_parameter_substitution.py	Unit tests for SQL parameter substitution edge cases
tests/integration/test_sql_redis_json.py	Integration tests for SQL queries against JSON storage
tests/integration/test_sql_redis_hash.py	Integration tests for SQL queries against Hash storage
tests/integration/test_multi_prefix.py	Integration tests verifying multi-prefix index functionality
examples/multi_prefix_example.py	Example demonstrating multi-prefix index usage
docs/api/query.rst	API documentation for SQLQuery
docs/user_guide/index.md	Adds reference to new SQL query guide
docs/user_guide/01_getting_started.ipynb	Updates prefix configuration example
docs/user_guide/02_hybrid_queries.ipynb	Re-executed notebook with updated outputs

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-11T09:10:25Z

redisvl/query/sql.py

+
+    Note:
+        Requires the optional `sql-redis` package. Install with:
+        ``pip install redisvl[sql]``


The docstring in line 30 refers to pip install redisvl[sql], but the extra name in pyproject.toml is sql-redis. This inconsistency will confuse users. Update the docstring to use pip install redisvl[sql-redis].

Suggested change

``pip install redisvl[sql]``

``pip install redisvl[sql-redis]``

Copilot · 2026-02-11T09:10:25Z

redisvl/query/sql.py

+        except ImportError:
+            raise ImportError(
+                "sql-redis is required for SQL query support. "
+                "Install it with: pip install redisvl[sql]"


The error message instructs users to install with pip install redisvl[sql], but the actual extra name defined in pyproject.toml is sql-redis. This inconsistency will cause installation failures. The error message should be updated to pip install redisvl[sql-redis] to match the extra name defined in pyproject.toml line 54.

Suggested change

"Install it with: pip install redisvl[sql]"

"Install it with: pip install redisvl[sql-redis]"

Copilot · 2026-02-11T09:10:25Z

redisvl/query/sql.py

+    def _substitute_params(self, sql: str, params: Dict[str, Any]) -> str:
+        """Substitute parameter placeholders in SQL with actual values.
+
+        Uses token-based approach: splits SQL on :param patterns, then rebuilds
+        with substituted values. This prevents partial matching (e.g., :id
+        won't match inside :product_id) and is faster than regex at scale.
+
+        Args:
+            sql: The SQL string with :param placeholders.
+            params: Dictionary mapping parameter names to values.
+
+        Returns:
+            SQL string with parameters substituted.
+
+        Note:
+            - String values are wrapped in single quotes with proper escaping
+            - Numeric values are converted to strings
+            - Bytes values (e.g., vectors) are NOT substituted here
+        """
+        if not params:
+            return sql
+
+        # Split SQL on :param patterns, keeping the delimiters
+        # Pattern matches : followed by valid identifier (letter/underscore, then alphanumeric/underscore)
+        tokens = re.split(r"(:[a-zA-Z_][a-zA-Z0-9_]*)", sql)
+
+        result = []
+        for token in tokens:
+            if token.startswith(":"):
+                key = token[1:]  # Remove leading :
+                if key in params:
+                    value = params[key]
+                    if isinstance(value, (int, float)):
+                        result.append(str(value))
+                    elif isinstance(value, str):
+                        # Escape single quotes using SQL standard: ' -> ''
+                        escaped = value.replace("'", "''")
+                        result.append(f"'{escaped}'")
+                    else:
+                        # Keep placeholder for bytes (vectors handled by Executor)
+                        result.append(token)
+                else:
+                    # Keep unmatched placeholders as-is
+                    result.append(token)
+            else:
+                result.append(token)
+
+        return "".join(result)


The parameter substitution function only escapes single quotes for string values. While this follows SQL standard escaping, the implementation doesn't prevent SQL injection attacks through other means (e.g., backslashes, special characters). Consider whether the sql-redis library performs additional sanitization on the SQL string after substitution. If not, this could be a security vulnerability. Additionally, document clearly what sanitization is expected from downstream components versus what is done here.

Copilot · 2026-02-11T09:10:26Z

docs/user_guide/01_getting_started.ipynb

    "    \"index\": {\n",
    "        \"name\": \"user_simple\",\n",
-    "        \"prefix\": \"user_simple_docs\",\n",
+    "        \"prefix\": [\"prefix_1\", \"prefix_2\"],\n",


The getting started notebook has been updated to use a multi-prefix configuration ["prefix_1", "prefix_2"] instead of the single prefix "user_simple_docs". However, this change appears to be unintentional for a getting started guide, as:

The table output still shows ['user_simple_docs'] as the prefix (line 296), suggesting the notebook output wasn't re-run after the code change

Multi-prefix configuration is an advanced topic that may confuse beginners in a "getting started" guide

There's no explanation or context for why multiple prefixes are being used

This should either be reverted to the simple single-prefix example, or the notebook should be re-run and additional explanation added.

Copilot · 2026-02-11T09:10:26Z

tests/unit/test_sql_parameter_substitution.py

+    """Simulate the CURRENT buggy implementation for comparison.
+
+    This is the exact code from redisvl/query/sql.py lines 105-113.


The docstring comment on lines 15-18 states "This is the exact code from redisvl/query/sql.py lines 105-113", but those line numbers don't correspond to any code in the actual redisvl/query/sql.py file (which only has 159 lines total). This comment appears to reference a buggy implementation that may have existed during development but doesn't match the current code. The comment should either be removed or updated to clarify that this is a demonstration of what NOT to do, without referencing non-existent line numbers.

Suggested change

"""Simulate the CURRENT buggy implementation for comparison.

This is the exact code from redisvl/query/sql.py lines 105-113.

"""Simulate a naive, intentionally buggy implementation for comparison.

This mirrors the earlier str.replace-based approach used in SQLQuery and

is kept only to demonstrate incorrect behavior; it is not the production

implementation.

Copilot · 2026-02-11T09:10:27Z

tests/unit/test_sql_parameter_substitution.py

+import pytest
+
+from redisvl.query.sql import SQLQuery
+
+


Import of 'pytest' is not used.

Suggested change

import pytest

from redisvl.query.sql import SQLQuery

from redisvl.query.sql import SQLQuery

Copilot · 2026-02-11T09:10:27Z

examples/multi_prefix_example.py

+# Clean up any existing index
+try:
+    client.ft(INDEX_NAME).dropindex(delete_documents=True)
+except Exception:


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change

except Exception:

except Exception:

# The index may not exist yet or may have been removed already; ignore drop errors.

Copilot · 2026-02-11T09:10:27Z

tests/integration/test_multi_prefix.py

+    # Clean up any existing index
+    try:
+        client.ft(index_name).dropindex(delete_documents=True)
+    except Exception:


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change

except Exception:

except Exception:

# The index may not exist yet or may have already been removed; ignore

# cleanup errors so that test setup can proceed.

Copilot · 2026-02-11T09:10:28Z

tests/integration/test_multi_prefix.py

+    # Cleanup
+    try:
+        index.delete(drop=True)
+    except Exception:


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change

except Exception:

except Exception:

# Best-effort cleanup: ignore errors if the index is already deleted

# or the Redis connection is unavailable during test teardown.

Copilot · 2026-02-11T09:10:28Z

tests/integration/test_multi_prefix.py

+        finally:
+            try:
+                index.delete(drop=True)
+            except Exception:


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change

except Exception:

except Exception:

# Best-effort cleanup: ignore errors when deleting the index so tests

# do not fail due to teardown issues.

tylerhutcherson · 2026-02-15T09:49:21Z

@rbs333 Looks like this has leaked commits from the SQL work. Can you pare it back to the multi-prefix additions?

vishal-bala

No major comments from me aside a question about whether building the registry/executor for SQL translation is expensive. There are a few Copilot comments that I agreed should be addressed (left a 👍 on them) - aside from those, it generally looks good! Super nice to be able to have SQL queries translated 🙌

vishal-bala · 2026-02-16T12:54:32Z

redisvl/index/index.py

        registry = SchemaRegistry(self._redis_client)
        registry.load_all()  # Loads index schemas from Redis

        executor = Executor(self._redis_client, registry)


How involved are these to be run on every query? If they're non-negligible, do we want to cache/pre-load/lazy-load them?

rbs333 and others added 10 commits February 3, 2026 10:35

first working sql tests

cd11eec

test spreadsheet approach

5dbdc05

working primary reducers

d555f8c

support quantile

1c5dcdb

update example

3ee26e2

phrase wip

b0dc8ab

cut for 0.1.1

41e14a1

fix tests

7f3f5ff

prefix init support

ff5fedd

Copilot AI review requested due to automatic review settings February 11, 2026 09:04

Copilot started reviewing on behalf of rbs333 February 11, 2026 09:05 View session

Copilot AI reviewed Feb 11, 2026

View reviewed changes

vishal-bala reviewed Feb 16, 2026

View reviewed changes

	``pip install redisvl[sql]``
	``pip install redisvl[sql-redis]``

	"Install it with: pip install redisvl[sql]"
	"Install it with: pip install redisvl[sql-redis]"

		"""Simulate the CURRENT buggy implementation for comparison.

		This is the exact code from redisvl/query/sql.py lines 105-113.

-    """Simulate the CURRENT buggy implementation for comparison.
-    This is the exact code from redisvl/query/sql.py lines 105-113.
+    """Simulate a naive, intentionally buggy implementation for comparison.
+    This mirrors the earlier str.replace-based approach used in SQLQuery and
+    is kept only to demonstrate incorrect behavior; it is not the production
+    implementation.

	except Exception:
	except Exception:
	# The index may not exist yet or may have been removed already; ignore drop errors.

Conversation

rbs333 commented Feb 11, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

tylerhutcherson commented Feb 15, 2026

Uh oh!

vishal-bala left a comment

Choose a reason for hiding this comment

Uh oh!

vishal-bala Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants