gh-143198: fix some crashes in sqlite3.executemany with concurrent mutations#143210
gh-143198: fix some crashes in sqlite3.executemany with concurrent mutations#143210picnixz wants to merge 27 commits intopython:mainfrom
sqlite3.executemany with concurrent mutations#143210Conversation
…nt parameter iterator
Misc/NEWS.d/next/Library/2025-12-27-10-36-18.gh-issue-143198.DdIHyC.rst
Outdated
Show resolved
Hide resolved
sqlite3.execute[many] with re-entrant parameter iteratorsqlite3.execute[many] with concurrent mutations
sqlite3.execute[many] with concurrent mutationssqlite3.executemany with concurrent mutations
|
There were more UAFs that I could find... |
sqlite3.executemany with concurrent mutationssqlite3.executemany with concurrent mutations
|
Ok, let's avoid force-pushing more. I force pushed because I wanted a single commit for the follow-up PR |
This is not always possible and it's more prone to errors in the future. The connection may be closed but we would check it way later. I tried to only add checks when the code in question never uses the sqlite3 API (e.g., I can't make the diff smaller though. It's the minimal diff on the C code I can do. |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Thank you for minimizing the diff, @picnixz.
It seems that most new checks are not covered by tests. At least tests were passed when I commented out them one by one. Maybe more tests are needed?
It seems that neither bind_param(), nor bind_parameters() use self->connection->db, so the check can be added only after the call of bind_parameters(). Maybe much later -- immediately before sqlite3_changes() and sqlite3_last_insert_rowid().
I wonder, what happens when we close the connection in one of numerous callbacks? Then self->connection->db can be set NULL after calling some SQLite C API, not only after calling some Python C API.
Misc/NEWS.d/next/Library/2025-12-27-10-36-18.gh-issue-143198.DdIHyC.rst
Outdated
Show resolved
Hide resolved
Modules/_sqlite/cursor.c
Outdated
| num_params = PySequence_Size(parameters); | ||
| if (num_params == -1) { | ||
| return; | ||
| if (num_params == -1 && PyErr_Occurred()) { |
There was a problem hiding this comment.
Isn't PyErr_Occurred redundant?
There was a problem hiding this comment.
This is what I thought but, in release mode the check is not performed. You can have a __len__ that returns -1 without setting an exception:
PySequenceMethods *m = Py_TYPE(s)->tp_as_sequence;
if (m && m->sq_length) {
Py_ssize_t len = m->sq_length(s);
assert(_Py_CheckSlotResult(s, "__len__", len >= 0));
return len;
}There was a problem hiding this comment.
Actually, you're right. I thought we directly called __len__ but it's warpped as follows:
PyObject *res = vectorcall_method(&_Py_ID(__len__), stack, 1);
Py_ssize_t len;
if (res == NULL)
return -1;
Py_SETREF(res, _PyNumber_Index(res));
if (res == NULL)
return -1;So returning -1 always implies that there is an exception set here. I'll remove that check.
There was a problem hiding this comment.
We can still get -1 for the extension class. But this is a bug in the extension.
There was a problem hiding this comment.
Do you want me to re-add the check though?
There was a problem hiding this comment.
Because on DEBUG builds there's some assert that would be crashing if the extension class does something bad
|
@serhiy-storchaka I've extensively added test cases for the adpater's part. I didn't check all |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Thank you for adding more tests.
I think that all checks in bind_parameters() can be replaced with a single check after bind_parameters(). All tests will pass except the one with broken __getitem__(), which is expected and not related to this issue. It will pass after fixing __getitem__().
| if (current_param == NULL) { | ||
| return -1; | ||
| } | ||
| else if (!pysqlite_check_connection(conn)) { |
There was a problem hiding this comment.
Tests are passed after deleting this check.
| return; | ||
| return -1; | ||
| } | ||
| else if (!pysqlite_check_connection(conn)) { |
|
|
||
| rc = bind_param(state, self, i + 1, adapted); | ||
| Py_DECREF(adapted); | ||
| if (!pysqlite_check_connection(conn)) { |
There was a problem hiding this comment.
Because this check covers them. And it can be moved out of the loop, and even out of bind_parameters().
|
The minimal fix: diff --git a/Modules/_sqlite/cursor.c b/Modules/_sqlite/cursor.c
index 4611c9e5e3e..9a7e3d7164e 100644
--- a/Modules/_sqlite/cursor.c
+++ b/Modules/_sqlite/cursor.c
@@ -881,6 +881,9 @@ _pysqlite_query_execute(pysqlite_Cursor* self, int multiple, PyObject* operation
}
}
+ if (!pysqlite_check_connection(self->connection)) {
+ goto error;
+ }
PyObject *stmt = get_statement_from_cache(self, operation);
Py_XSETREF(self->statement, (pysqlite_Statement *)stmt);
if (!self->statement) {
@@ -930,6 +933,9 @@ _pysqlite_query_execute(pysqlite_Cursor* self, int multiple, PyObject* operation
if (PyErr_Occurred()) {
goto error;
}
+ if (!pysqlite_check_connection(self->connection)) {
+ goto error;
+ }
rc = stmt_step(self->statement->st);
if (rc != SQLITE_DONE && rc != SQLITE_ROW) { |
|
I think I'll need to check with more than 1 parameter as well. I cannot say for sure that checks are all redundant because my table has a single parameter to bind. But if binding doesn't require the connection at all, it could be possible to be safe and reduce the amount of checks. |
|
Ok, I'm now back to business with CPython. I will likely work on this one later today or tomorrow. I have some other shorter PRs that I want to address before this one. |
FTR, the reason why
executeis actually not affected is that the iterator we are taking is built from a list we create ourselves so it's not possible to have "bad" side effects._sqlitecursor cache via re-entrant parameter iterator #143198