Open
Description
The Relevancy Task is being a bad boy right now, routinely erroring with integrity errors. Here is the most recent one:
sqlalchemy.exc.IntegrityError: (sqlalchemy.dialects.postgresql.asyncpg.IntegrityError) <class 'asyncpg.exceptions.UniqueViolationError'>: duplicate key value violates unique constraint "url_task_error_pkey"
DETAIL: Key (url_id, task_type)=(4142, Relevancy) already exists.
[SQL: INSERT INTO url_task_error (task_type, error, url_id, task_id) VALUES ($1::task_type, $2::VARCHAR, $3::INTEGER, $4::INTEGER), ($5::task_type, $6::VARCHAR, $7::INTEGER, $8::INTEGER), ($9::task_type, $10::VARCHAR, $11::INTEGER, $12::INTEGER), ($13::tas ... 66696 characters truncated ... $4000::INTEGER) RETURNING url_task_error.created_at, url_task_error.url_id, url_task_error.task_type]
[parameters: ('Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 4142, 30049, 'Relevancy', 'Server disconnected', 4150, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 4171, 30049, 'Relevancy', 'Server disconnected', 4215, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 4238, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 4708, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 4729, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 4766, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 4792, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 7201, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 7202, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 7208, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'" ... 3900 parameters truncated ... 
12519, 30049, 'Relevancy', 'Server disconnected', 12520, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 12521, 30049, 'Relevancy', 'Server disconnected', 12522, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 12523, 30049, 'Relevancy', 'Server disconnected', 12524, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 12525, 30049, 'Relevancy', 'Server disconnected', 12526, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 12527, 30049, 'Relevancy', 'Server disconnected', 12528, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 12529, 30049, 'Relevancy', 'Server disconnected', 12530, 30049, 'Relevancy', "400, message='Bad Request', url='https://erjp42mzm4k2tyn1.us-east-1.aws.endpoints.huggingface.cloud'", 12531, 30049)]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
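So there are a few things going on here:
- There are a bunch of Bad Requests (and "Server disconnected" errors) coming back from the Hugging Face endpoint, which are sub-optimal!
- There are errors when these URL Task Errors are being inserted: the batch INSERT into url_task_error violates the (url_id, task_type) primary key because a row for that pair already exists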
- Additionally, it is possible that this is producing a memory leak, as described in Investigate App Memory Leak #569
For the moment, I've disabled this task via the URL_AUTO_RELEVANCE_TASK_FLAG environment variable. That should also help us zero in on whether it's the source of the memory leak. [1]
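One possible way to stop the duplicate-key failure is to collapse each batch to one row per (url_id, task_type) and then upsert instead of doing a plain INSERT. The sketch below is only an illustration under assumptions: it uses an in-memory SQLite database from the standard library as a stand-in so it runs on its own, the table definition is reduced to the columns visible in the traceback, and `dedupe_errors`, `record_errors` and task id 30048 are made-up names and values, not anything from the codebase. PostgreSQL accepts the same `ON CONFLICT (...) DO UPDATE` clause, though parameter syntax differs under asyncpg.

```python
import sqlite3

# Stand-in schema: only the columns visible in the traceback (assumption).
SCHEMA = """
CREATE TABLE url_task_error (
    url_id    INTEGER NOT NULL,
    task_type TEXT    NOT NULL,
    error     TEXT    NOT NULL,
    task_id   INTEGER NOT NULL,
    PRIMARY KEY (url_id, task_type)
)
"""

# Upsert: if a row for (url_id, task_type) already exists, overwrite it
# with the newest error instead of raising a unique violation.
UPSERT_SQL = """
INSERT INTO url_task_error (url_id, task_type, error, task_id)
VALUES (:url_id, :task_type, :error, :task_id)
ON CONFLICT (url_id, task_type)
DO UPDATE SET error = excluded.error, task_id = excluded.task_id
"""


def dedupe_errors(rows):
    """Keep only the last error seen for each (url_id, task_type) pair.

    An upsert cannot touch the same key twice in one statement on
    PostgreSQL, so duplicates inside a batch are collapsed first.
    """
    latest = {}
    for row in rows:
        latest[(row["url_id"], row["task_type"])] = row
    return list(latest.values())


def record_errors(conn, rows):
    """Hypothetical helper: write a batch of task errors idempotently."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany(UPSERT_SQL, dedupe_errors(rows))


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# An earlier run already recorded an error for URL 4142 (task id is made up).
record_errors(conn, [
    {"url_id": 4142, "task_type": "Relevancy",
     "error": "Server disconnected", "task_id": 30048},
])

# A later run reports 4142 again, plus 4150 twice within the same batch.
record_errors(conn, [
    {"url_id": 4142, "task_type": "Relevancy",
     "error": "400, message='Bad Request'", "task_id": 30049},
    {"url_id": 4150, "task_type": "Relevancy",
     "error": "400, message='Bad Request'", "task_id": 30049},
    {"url_id": 4150, "task_type": "Relevancy",
     "error": "Server disconnected", "task_id": 30049},
])

stored = conn.execute(
    "SELECT url_id, error, task_id FROM url_task_error ORDER BY url_id"
).fetchall()
```

With SQLAlchemy, the equivalent statement would probably be built with `sqlalchemy.dialects.postgresql.insert(...).on_conflict_do_update(...)`. Whether overwriting the old error is right, as opposed to `DO NOTHING` or clearing old errors before a retry, depends on how url_task_error rows are meant to be used.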
Footnotes
[1] My suspicion is that it's a plausible candidate, as it's a third-party library that we don't know the innards of, and it might not be optimized for this sort of start-stop functionality. We can also look into whether upgrading the library would help.