Optimize re-sync content#7754
Conversation
Assisted By: Claude Opus 4.6
| def __init__(self, repo_version=None, deferred_fields=None, *args, **kwargs): | ||
| super().__init__(*args, **kwargs) | ||
| self._repo_version = repo_version | ||
| self._deferred_fields = deferred_fields or {} |
There was a problem hiding this comment.
Woudn't this be the opposite of "deferred"? It's the list of fields we want to query up-front for each type, right?
There was a problem hiding this comment.
Agree. The name I chose is misleading, changed to extra_fields
| cached = self._content_cache.get(model_type, {}).get(nat_key) | ||
| if cached is not None: | ||
| d_content.content = cached | ||
| cache_hits_by_type[model_type] |= Q(pk=cached.pk) |
There was a problem hiding this comment.
Is there a reason to do it this way instead of just collecting a list of PKs and passing pk__in=...?
(there could be, probably the constructed SQL is different, I just don't know)
There was a problem hiding this comment.
No, I just copied the logic from content_q_by_type. Changed
| cache_hits_by_type = defaultdict(lambda: Q(pk__in=[])) | ||
|
|
||
| for d_content in batch: | ||
| if d_content.content._state.adding: |
There was a problem hiding this comment.
Theoretically content could be passed through already saved, in which case I think we're probably not touch ing it. That might be an existing bug, though not a particularly serious one.
|
|
||
| for model_type, content_q in content_q_by_type.items(): | ||
| try: | ||
| await sync_to_async(model_type.objects.filter(content_q).touch)() |
There was a problem hiding this comment.
This is really an existing issue, but we're executing the content_q query (w/ natural keys) twice (once to touch, and once for the model swap), and also executing a touch query twice (once with cache hits, once with existing packages that were not in the latest version cache).
Is it possible to collect PKs within the loop, combine them with the cache hit PKs, and have one touch block below this loop?
Or (maybe question for @mdellweg), does that touch really need to happen before the swap for a timing-related reason?
|
Looks good! A few things to look at, maybe we can make this even more efficient |
📜 Checklist
See: Pull Request Walkthrough