mitre · Manvith03 · Dec 1, 2025 · Dec 7, 2025
diff --git a/python/django-filteredrelation-sqli-cve-2025-57833.md b/python/django-filteredrelation-sqli-cve-2025-57833.md
@@ -0,0 +1,156 @@
+# SQL Injection in Django `FilteredRelation` Aliases (CVE-2025-57833)
+**Author:** M V S Siva Sai Mourya Pasupuleti • **Course:** SWE/ISA-681 • **Institution:** GMU  
+**License:** CC-BY-4.0  
+**Issue:** https://github.com/mitre/secure-coding-case-studies/issues/19
+
+## Introduction
+This case study examines CVE-2025-57833, a SQL injection in Django’s `FilteredRelation` disclosed on September 3, 2025. It affects Django 4.2 (before 4.2.24), 5.1 (before 5.1.12), and 5.2 (before 5.2.6) and is rated High (CVSS 8.1) on NVD. The flaw: `FilteredRelation` column aliases were not validated when a dictionary was expanded via `**kwargs` into `QuerySet.annotate()` or `QuerySet.alias()`, so attacker-controlled keys could shape the generated SQL. It matters because it shows that SQL injection can surface in a mature, security-focused framework not through tainted values but through dynamic identifiers, at an API boundary where developers often assume the ORM is fully protective. The case study emphasizes practical prevention: validate anything that can influence identifiers, default APIs to safe, constant naming, add static checks that flag risky dynamic SQL, use fuzz or property tests around query builders, and have reviewers trace untrusted input into SQL structure so issues like this are caught before production.
+
+## Software
+**Name:** Django (Django Software Foundation)  
+**Language:** Python  
+**URL:** https://github.com/django/django  
+
+**Affected versions:** 4.2 < 4.2.24; 5.1 < 5.1.12; 5.2 < 5.2.6  
+**Fixes shipped:** 2025-09-03
+
+## Weakness
+**CWE-89: Improper Neutralization of Special Elements Used in an SQL Command (SQL Injection)**
+
+SQL injection happens when an attacker slips SQL into inputs the app trusts, so that input gets treated as part of the query instead of data. If it works, they can read sensitive rows, change or delete data, run admin actions on the database, and sometimes even reach the underlying OS. It is one type of injection attack, but the idea is the same: untrusted input ends up controlling SQL structure, not just values, so the database executes something the developer never intended.
+
+This weakness shows up when the app builds part of an SQL statement from outside input and does not neutralize characters that can change the query’s structure. Most people focus on sanitizing values, but identifiers (like column or alias names) are also part of SQL, and if user input controls them, an attacker can slip in tokens (quotes, operators, keywords) that change how the query is parsed. In frameworks that support dynamic parameters, for example expanding a dictionary into a query, missing checks on those identifier-like inputs can bypass the safety of parameterized queries. Parameterization protects values; it does not protect the SQL syntax itself.
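+
+A minimal sketch of that distinction, using Python's standard `sqlite3` module rather than Django: the `?` placeholder binds a value safely, while the alias has to be written into the SQL text itself. The `orders` table and the `total_with_alias` helper are made up for illustration.
+
+```python
+import sqlite3
+
+conn = sqlite3.connect(":memory:")
+conn.execute("CREATE TABLE orders (id INTEGER, price REAL)")
+conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 5.0)])
+
+
+def total_with_alias(alias, order_id):
+    """UNSAFE on purpose: the alias is interpolated, only order_id is bound."""
+    sql = f'SELECT SUM(price) AS "{alias}" FROM orders WHERE id = ?'
+    cur = conn.execute(sql, (order_id,))
+    columns = [col[0] for col in cur.description]
+    return columns, cur.fetchall()
+
+
+# A developer-chosen alias yields a single column, as intended.
+print(total_with_alias("total", 1))
+
+# A crafted alias closes the quoted identifier and adds a second column,
+# even though the value is still bound through the placeholder.
+print(total_with_alias('total", COUNT(*) AS "n', 1))
+```
+
+The second call returns two columns instead of one: the bound value was handled correctly, but the identifier changed the statement's structure.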
+
+### Generic example
+In a typical login, the app builds a query like:
+```sql
+SELECT * FROM Users WHERE Username='$username' AND Password='$password';
+```
+and checks whether any row comes back. If an attacker types `1' OR '1'='1` for both fields, the final SQL becomes:
+
+```sql
+... WHERE Username='1' OR '1'='1' AND Password='1' OR '1'='1'
+```
+Because `AND` binds more tightly than `OR`, the clause parses as `Username='1' OR ('1'='1' AND Password='1') OR '1'='1'`. The last branch, `'1'='1'`, is always true, so the whole `WHERE` clause evaluates to true and the database returns a row, even without a real username or password. That is the core of SQL injection: untrusted input slips into the structure of the query and flips your security check.
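+
+The same login check can be sketched with Python's standard `sqlite3` module to contrast string-built SQL with bound parameters. The table contents and function names are made up for illustration, and the plaintext password column only mirrors the example above.
+
+```python
+import sqlite3
+
+conn = sqlite3.connect(":memory:")
+conn.execute("CREATE TABLE Users (Username TEXT, Password TEXT)")
+conn.execute("INSERT INTO Users VALUES ('alice', 's3cret')")
+
+
+def login_concatenated(username, password):
+    """Vulnerable: the input is pasted into the SQL text."""
+    sql = (
+        "SELECT * FROM Users "
+        f"WHERE Username='{username}' AND Password='{password}'"
+    )
+    return conn.execute(sql).fetchone() is not None
+
+
+def login_parameterized(username, password):
+    """Safer: the input is bound as values and cannot change the query shape."""
+    sql = "SELECT * FROM Users WHERE Username=? AND Password=?"
+    return conn.execute(sql, (username, password)).fetchone() is not None
+
+
+payload = "1' OR '1'='1"
+print(login_concatenated(payload, payload))   # the injected OR makes the check pass
+print(login_parameterized(payload, payload))  # the payload is only an unmatched string
+```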
+## Vulnerability
+### CVE-2025-57833 — SQL injection in Django FilteredRelation column aliases
+
+An issue exists in Django 4.2 (before 4.2.24), 5.1 (before 5.1.12), and 5.2 (before 5.2.6) where `FilteredRelation` is subject to SQL injection in column aliases when a crafted dictionary is expanded via `**kwargs` passed to `QuerySet.annotate()` or `QuerySet.alias()`.
+
+**Where the bug exists (plain English).** Inside the ORM, a queryset can apply a `FilteredRelation` under an alias name taken from caller-supplied `**kwargs`. In the affected versions, that alias was used to build SQL without first being validated. If application code lets untrusted input influence the alias, special characters or keywords in it can change the shape of the generated SQL. Parameter binding does not help here because it protects values, not identifiers.
+
+**Minimal vulnerable code path (pre-fix).**  
+**File:** `django/db/models/sql/query.py`  
+**Function:** `Query.add_filtered_relation(...)`
+```python
+def add_filtered_relation(self, filtered_relation, alias):
+    filtered_relation.alias = alias   # no validation in affected versions
+    lookups = dict(get_children_from_q(filtered_relation.condition))
+    relation_lookup_parts, relation_field_parts, _ = self.solve_lookup_type(
+        filtered_relation.relation_name
+    )
+```
+**Why this is vulnerable.** Assigning `alias` directly lets untrusted keys (from `**kwargs` passed to `annotate()` or `alias()`) become SQL identifiers, which can alter query structure. Parameterization offers no protection here because it covers only values.
+
+## Exploit
+### CAPEC-66: SQL Injection
+
+**How it is exploited.** An attacker sends a crafted request so that a user-controlled key ends up being used as an alias name in the ORM call. When the app expands that dictionary via `**kwargs` into `QuerySet.annotate()` or `QuerySet.alias()`, the alias is stitched into the SQL without validation. Because that input is treated as part of the SQL identifier (syntax) instead of a bound value, normal parameterization does not help; the injected characters change how the database parses the query.
+
+**Minimal, representative input (what the attacker controls).**
+
+HTTP parameter (attacker-supplied):
+
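+```ini
+metric = total_sales' ) /*
+```
+Application pattern (conceptual):
+
+```python
+from django.db.models import FilteredRelation, Q
+
+# `metric` comes from the request; `Orders` and its `items` relation are hypothetical
+alias_map = {metric: FilteredRelation("items", condition=Q(items__price__gt=0))}
+queryset = Orders.objects.annotate(**alias_map)
+```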
+Here, the key of the dict (`metric`) becomes the alias identifier. If that key is not validated against a safe pattern, special characters in it can leak into the generated SQL and change query structure.
+
+## Fix
+**What changed (at a glance).** Django now validates the alias before use. In `django/db/models/sql/query.py`, `add_filtered_relation(...)` now runs a guard on the alias up front; only then is the alias assigned and used to build SQL.
+
+**Fixed source code (minimal).**  
+**File:** `django/db/models/sql/query.py`  
+**Function:** `Query.add_filtered_relation(...)`
+
+```python
+# after the security fix
+def add_filtered_relation(self, filtered_relation, alias):
+    self.check_alias(alias)          # validate alias early
+    filtered_relation.alias = alias  # assign only after validation
+    lookups = dict(get_children_from_q(filtered_relation.condition))
+    relation_lookup_parts, relation_field_parts, _ = self.solve_lookup_type(
+        filtered_relation.relation_name
+    )
+```
+**Walk-through.**
+
+`self.check_alias(alias)` rejects aliases containing characters that could tamper with SQL syntax, such as whitespace, quotation marks, semicolons, and SQL comment markers, and raises an error instead of building the query. Putting it first blocks attacker-controlled keys from ever becoming raw SQL identifiers.
+
+`filtered_relation.alias = alias` is the same assignment as before, but it now runs only after the alias passes validation, so later SQL construction cannot be bent by untrusted characters.
+
+The rest of the function is unchanged and is not part of the fix; by the time it executes, the alias has already been checked.
+
+**Why this fixes the bug.** Previously the code trusted the alias and embedded it into SQL without checks. The new first line adds a validation gate for alias names, so invalid tokens never reach SQL building. That closes the identifier-context injection path while keeping normal behavior for valid, developer-chosen aliases.
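+
+The validate-then-assign ordering can be sketched outside Django as follows. This is a standalone illustration: the pattern, the error message, and the dictionary-based `add_filtered_relation` stand-in are assumptions for the example, not Django's source.
+
+```python
+import re
+
+# Hypothetical pattern for illustration; not copied from Django's source.
+FORBIDDEN_ALIAS_CHARS = re.compile(r"""['"`\[\];\s]|--|/\*|\*/""")
+
+
+def check_alias(alias):
+    """Raise ValueError if the alias contains SQL-significant characters."""
+    if FORBIDDEN_ALIAS_CHARS.search(alias):
+        raise ValueError(f"Unsafe column alias: {alias!r}")
+
+
+def add_filtered_relation(relations, filtered_relation, alias):
+    """Stand-in for the patched method: validate first, then record."""
+    check_alias(alias)
+    relations[alias] = filtered_relation
+
+
+relations = {}
+add_filtered_relation(relations, "placeholder relation", "total_sales")
+
+try:
+    add_filtered_relation(relations, "placeholder relation", "total_sales' ) /*")
+except ValueError as exc:
+    print("rejected:", exc)
+
+print(sorted(relations))  # only the validated alias was recorded
+```
+
+Because the check runs before any state is touched, a rejected alias leaves nothing behind for later SQL construction to pick up.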
+
+## Prevention
+***First***, treat identifiers (alias, column, table names) as untrusted, just like values. Put a strict rule in front of anything that could become an identifier (for example, only letters, digits, and underscores, and keep it length-limited) and reject or normalize everything else. The goal is to validate alias names before they ever reach the ORM so user input cannot shape SQL syntax.
+
+***Second***, avoid dynamic aliasing by default. If you are using `**kwargs` to build queries, make alias names constants or pick them from a small, vetted map you control (for example: `{"total": Sum("price"), "count": Count("id")}`). If you truly need dynamic aliases, route them through the same validator as above, using one central helper so every call goes through the same gate.
+
+***Third***, add automated checks that look specifically for this pattern. In CI, use a linter, CodeQL, or Semgrep rule to flag calls like `annotate(**expr)` or `alias(**expr)` when `expr` comes from request data or free-form dictionaries. A simple team rule also works: if `**kwargs` feed query builders, there must be a nearby call to a `validate_alias()` helper (or a documented constant map). Fail the build when the rule is violated.
+
+***Fourth***, write a small property test or fuzzer for your query layer. Generate alias keys with punctuation, quotes, unicode, reserved words, and comment markers, and assert your code rejects or normalizes them before SQL is produced. Keep a few seeds (`'`, `"`, `--`, `/* */`, `;`, `SELECT`, `FROM`) so this becomes a regression test that proves the fix and stays green.
+
+Most parameterized-statement systems can bind values but cannot bind identifiers (column/alias/table names). If your API accepts arbitrary identifiers, you cannot rely on parameterization to keep you safe; you must validate or map those identifiers before constructing SQL. Prefer mapping user choices to a small allowlist of constant identifiers; if you must accept free-form names, enforce a strict identifier regex (e.g., `^[A-Za-z_][A-Za-z0-9_]{0,63}$`) and reject anything that does not match before concatenation.
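+
+Both options can be sketched in a few lines of plain Python. The helper names `validate_alias` and `resolve_metric` are hypothetical, and the aggregate descriptions in the map are string placeholders standing in for real ORM expressions.
+
+```python
+import re
+
+# Regex from the paragraph above: a letter or underscore, followed by up to
+# 63 letters, digits, or underscores.
+ALIAS_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{0,63}$")
+
+# Fixed map from user-facing choices to developer-chosen aliases.
+# The second element is a placeholder for a real ORM expression.
+ALLOWED_METRICS = {
+    "total": ("total", "Sum(price)"),
+    "count": ("count", "Count(id)"),
+}
+
+
+def validate_alias(alias):
+    """Fallback path: accept a free-form alias only if it is a plain identifier."""
+    # fullmatch() is used so that `$` cannot accept a trailing newline.
+    if not isinstance(alias, str) or not ALIAS_RE.fullmatch(alias):
+        raise ValueError(f"Invalid alias: {alias!r}")
+    return alias
+
+
+def resolve_metric(choice):
+    """Preferred path: user input only selects among constant aliases."""
+    try:
+        return ALLOWED_METRICS[choice]
+    except KeyError:
+        raise ValueError(f"Unknown metric: {choice!r}") from None
+
+
+print(resolve_metric("total"))
+print(validate_alias("total_sales"))
+```
+
+With the map, request data never becomes an identifier at all; the regex path exists only for the cases where a free-form name is unavoidable.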
+
+Finally, lock in process and hygiene: add two checklist lines to code review (“Does any untrusted input influence identifiers?” and “If `**kwargs` are used, are keys validated or chosen from a fixed map?”). Turn on dependency updates and prioritize security releases (for example, Django 4.2.24, 5.1.12, 5.2.6). Together, these steps validate identifiers up front, keep dynamic aliasing under control, catch mistakes automatically, and verify through tests and CI that this class of bug cannot sneak back in.
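+
+The seed-plus-fuzz regression test from the fourth step might look like the sketch below. It assumes a regex-based `validate_alias` helper (hypothetical, defined inline so the example is self-contained) and checks one property: no accepted alias contains a SQL-significant character. Bare reserved words such as `SELECT` pass this particular regex, so a team that wants them rejected would need an additional check.
+
+```python
+import random
+import re
+import string
+
+ALIAS_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")
+
+
+def validate_alias(alias):
+    """Hypothetical validator under test."""
+    if not ALIAS_RE.fullmatch(alias):
+        raise ValueError(f"Invalid alias: {alias!r}")
+    return alias
+
+
+# Seeds drawn from the prose above, plus the exploit input and a unicode case.
+SEEDS = ["'", '"', "--", "/* */", ";", "SELECT", "FROM",
+         "total_sales' ) /*", "na\u00efve", ""]
+
+# Characters that must never appear in an accepted alias.
+DANGEROUS = set("'\"`;[]()/*- \t\n")
+
+
+def random_alias(rng, max_len=20):
+    alphabet = string.ascii_letters + string.digits + "_'\"`;[]()/*- \n"
+    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))
+
+
+def run_fuzz(iterations=2000, seed=681):
+    """Return (accepted, rejected) counts; fail if the property is violated."""
+    rng = random.Random(seed)
+    accepted = rejected = 0
+    for candidate in SEEDS + [random_alias(rng) for _ in range(iterations)]:
+        try:
+            alias = validate_alias(candidate)
+        except ValueError:
+            rejected += 1
+            continue
+        assert not (set(alias) & DANGEROUS), alias
+        accepted += 1
+    return accepted, rejected
+
+
+print(run_fuzz())
+```
+
+A fixed random seed keeps the run reproducible in CI, which is what lets it serve as a regression test rather than an occasional spot check.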
+
+## Conclusion
+This case study walked through a SQL injection in Django’s ORM where alias names from `**kwargs` could flow into `FilteredRelation` and end up as SQL identifiers without any validation. It affected the supported 4.2, 5.1, and 5.2 branches before the September 2025 security releases. The core issue was untrusted input shaping SQL structure through identifiers, which normal parameter binding does not protect.
+
+The fix was simple and effective: validate the alias first, then use it. The takeaways are just as direct: treat identifiers as untrusted, avoid dynamic aliasing unless it is validated, add automated checks, keep a small property test or fuzzer around query builders, use review checklists, and stay current with security releases. Do those consistently, and this class of bug becomes much easier to catch and much harder to ship.
+
+## References
+http://www.openwall.com/lists/oss-security/2025/09/03/3
+
+https://docs.djangoproject.com/en/dev/releases/security/
+
+https://groups.google.com/g/django-announce
+
+https://lists.debian.org/debian-lts-announce/2025/09/msg00017.html
+
+https://www.djangoproject.com/weblog/2025/sep/03/security-releases/
+
+https://nvd.nist.gov/vuln/detail/CVE-2025-57833
+
+### Fix references:
+
+Upstream (main) commit: https://github.com/django/django/commit/51711717098d3f469f795dfa6bc3758b24f69ef7
+
+Backport 5.2: https://github.com/django/django/commit/4c044fcc866ec226f612c475950b690b0139d243
+
+Backport 5.1: https://github.com/django/django/commit/102965ea93072fe3c39a30be437c683ec1106ef5
+
+Backport 4.2: https://github.com/django/django/commit/31334e6965ad136a5e369993b01721499c5d1a92
+
+## Contributions
+Authored by M V S Siva Sai Mourya Pasupuleti for SWE/ISA-681 (GMU).
+
+Researched CVE-2025-57833 and verified details against NVD and Django security advisories.
+
+Analyzed the vulnerable path in django/db/models/sql/query.py (Query.add_filtered_relation), wrote the minimal pre-fix snippet, and explained the mechanism.
+
+Demonstrated a representative exploit input, and mapped it to CWE-89 and CAPEC-66.
+
+Reviewed the upstream patch and branch backports; summarized the exact fix.
+
+Wrote prevention guidance and curated references.
+
+GitHub proposal issue: https://github.com/mitre/secure-coding-case-studies/issues/19
+
+Released under CC-BY-4.0.