fix(platform): prevent qq_official duplicate message consumption (#5848) #6519
Open
001VIsir wants to merge 5 commits into AstrBotDevs:master from 001VIsir:visir_fix
+553 −3
Commits
80f8f88 fix(platform): prevent qq_official duplicate message consumption (#5848) (001VIsir)
e35a26b fix(platform):address-dedup-review-feedback-for-qqofficial (001VIsir)
ea2d9aa fix(platform):streamline-dedup-architecture-and-logging (001VIsir)
7bd3ca4 refactor(platform):harden-dedup-and-extract-event-deduplicator (001VIsir)
1851e16 Merge remote-tracking branch 'upstream/master' into visir_fix (001VIsir)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| from astrbot.core import logger | ||
| from astrbot.core.message.utils import ( | ||
| build_content_dedup_key, | ||
| build_message_id_dedup_key, | ||
| ) | ||
| from astrbot.core.utils.ttl_registry import TTLKeyRegistry | ||
| | ||
| from .platform import AstrMessageEvent | ||
| | ||
| | ||
| class EventDeduplicator: | ||
| def __init__(self, ttl_seconds: float = 0.5) -> None: | ||
| self._registry = TTLKeyRegistry(ttl_seconds=ttl_seconds) | ||
| | ||
| def is_duplicate(self, event: AstrMessageEvent) -> bool: | ||
| if self._registry.ttl_seconds == 0: | ||
| return False | ||
| | ||
| message_id_key = self._build_message_id_key(event) | ||
| if message_id_key is not None: | ||
| if self._registry.contains(message_id_key): | ||
| logger.debug( | ||
| "Skip duplicate event in event_bus (by message_id): umo=%s, sender=%s", | ||
| event.unified_msg_origin, | ||
| event.get_sender_id(), | ||
| ) | ||
| return True | ||
| self._registry.add(message_id_key) | ||
| | ||
| content_key = self._build_content_key(event) | ||
| if self._registry.contains(content_key): | ||
| logger.debug( | ||
| "Skip duplicate event in event_bus (by content): umo=%s, sender=%s", | ||
| event.unified_msg_origin, | ||
| event.get_sender_id(), | ||
| ) | ||
| if message_id_key is not None: | ||
| self._registry.discard(message_id_key) | ||
| return True | ||
| | ||
| self._registry.add(content_key) | ||
| return False | ||
| | ||
| @staticmethod | ||
| def _build_content_key(event: AstrMessageEvent) -> str: | ||
| return build_content_dedup_key( | ||
| platform_id=str(event.get_platform_id() or ""), | ||
| unified_msg_origin=str(event.unified_msg_origin or ""), | ||
| sender_id=str(event.get_sender_id() or ""), | ||
| text=str(event.get_message_str() or ""), | ||
| components=event.get_messages(), | ||
| ) | ||
| | ||
| @staticmethod | ||
| def _build_message_id_key(event: AstrMessageEvent) -> str | None: | ||
| message_id = getattr(event.message_obj, "message_id", "") or getattr( | ||
| event.message_obj, | ||
| "id", | ||
| "", | ||
| ) | ||
| return build_message_id_dedup_key( | ||
| platform_id=str(event.get_platform_id() or ""), | ||
| unified_msg_origin=str(event.unified_msg_origin or ""), | ||
| message_id=str(message_id or ""), | ||
| ) | ||
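
`TTLKeyRegistry` is not shown in this view, so the following is only a minimal sketch of a registry exposing the four members `EventDeduplicator` relies on (`ttl_seconds`, `contains`, `add`, `discard`), together with a copy of the two-stage check operating on prebuilt keys. The class name `SketchTTLRegistry`, the injectable `clock` parameter, the lazy-expiry strategy, and the free function `is_duplicate` are all assumptions for illustration, not the PR's implementation.

```python
from __future__ import annotations

import time
from collections.abc import Callable


class SketchTTLRegistry:
    """Hypothetical stand-in for TTLKeyRegistry: keys expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so the sketch can be tested without sleeping
        self._expires_at: dict[str, float] = {}

    def contains(self, key: str) -> bool:
        expires_at = self._expires_at.get(key)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            # Lazily drop the key once its TTL has elapsed.
            del self._expires_at[key]
            return False
        return True

    def add(self, key: str) -> None:
        self._expires_at[key] = self._clock() + self.ttl_seconds

    def discard(self, key: str) -> None:
        self._expires_at.pop(key, None)


def is_duplicate(
    registry: SketchTTLRegistry,
    message_id_key: str | None,
    content_key: str,
) -> bool:
    """Same control flow as EventDeduplicator.is_duplicate, minus logging."""
    if registry.ttl_seconds == 0:
        return False
    # Stage 1: exact message_id match within the TTL window.
    if message_id_key is not None:
        if registry.contains(message_id_key):
            return True
        registry.add(message_id_key)
    # Stage 2: same sender/content fingerprint within the TTL window.
    if registry.contains(content_key):
        if message_id_key is not None:
            # The diff removes the message_id key it just registered.
            registry.discard(message_id_key)
        return True
    registry.add(content_key)
    return False
```

With a 0.5 s TTL, a second event carrying the same message ID is reported as a duplicate, and so is an event with a new message ID but an identical content key; once the window has elapsed, both keys are accepted again.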
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,101 @@ | ||
| """Message utilities for deduplication and component handling.""" | ||
| | ||
| import hashlib | ||
| from collections.abc import Iterable | ||
| | ||
| from astrbot.core.message.components import BaseMessageComponent, File, Image | ||
| | ||
| _MAX_RAW_TEXT_FINGERPRINT_LEN = 256 | ||
| | ||
| | ||
| def build_component_dedup_signature( | ||
| components: Iterable[BaseMessageComponent], | ||
| ) -> str: | ||
| """Build a deduplication signature from message components. | ||
| | ||
| This function extracts unique identifiers from Image and File components | ||
| and creates a hash-based signature for deduplication purposes. | ||
| | ||
| Args: | ||
| components: An iterable of message components to analyze. | ||
| | ||
| Returns: | ||
| A SHA1 hash (16 hex characters) representing the component signatures, | ||
| or an empty string if no valid components are found. | ||
| """ | ||
| parts: list[str] = [] | ||
| for component in components: | ||
| if isinstance(component, Image): | ||
| # Image can have url, file, or file_unique | ||
| ref = component.url or component.file or component.file_unique or "" | ||
| if ref: | ||
| parts.append(f"img:{ref}") | ||
| elif isinstance(component, File): | ||
| # File can have url, file (via property), or name | ||
| ref = component.url or component.file or component.name or "" | ||
| if ref: | ||
| parts.append(f"file:{ref}") | ||
| # Future component types can be added here | ||
| | ||
| if not parts: | ||
| return "" | ||
| | ||
| payload = "|".join(parts) | ||
| return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:16] | ||
| | ||
| | ||
| def build_sender_content_dedup_key(content: str, sender_id: str) -> str | None: | ||
| """Build a sender+content hash key for short-window deduplication.""" | ||
| if not (content and sender_id): | ||
| return None | ||
| content_hash = hashlib.sha1(content.encode("utf-8")).hexdigest()[:16] | ||
| return f"{sender_id}:{content_hash}" | ||
| | ||
| | ||
| def build_content_dedup_key( | ||
| *, | ||
| platform_id: str, | ||
| unified_msg_origin: str, | ||
| sender_id: str, | ||
| text: str, | ||
| components: Iterable[BaseMessageComponent], | ||
| ) -> str: | ||
| """Build a content fingerprint key for event deduplication.""" | ||
| msg_text = str(text or "").strip() | ||
| if len(msg_text) <= _MAX_RAW_TEXT_FINGERPRINT_LEN: | ||
| msg_sig = msg_text | ||
| else: | ||
| msg_hash = hashlib.sha1(msg_text.encode("utf-8")).hexdigest()[:16] | ||
| msg_sig = f"h:{len(msg_text)}:{msg_hash}" | ||
| | ||
| attach_sig = build_component_dedup_signature(components) | ||
| return "|".join( | ||
| [ | ||
| "content", | ||
| str(platform_id or ""), | ||
| str(unified_msg_origin or ""), | ||
| str(sender_id or ""), | ||
| msg_sig, | ||
| attach_sig, | ||
| ] | ||
| ) | ||
| | ||
| | ||
| def build_message_id_dedup_key( | ||
| *, | ||
| platform_id: str, | ||
| unified_msg_origin: str, | ||
| message_id: str, | ||
| ) -> str | None: | ||
| """Build a message_id fingerprint key for event deduplication.""" | ||
| normalized_message_id = str(message_id or "") | ||
| if not normalized_message_id: | ||
| return None | ||
| return "|".join( | ||
| [ | ||
| "message_id", | ||
| str(platform_id or ""), | ||
| str(unified_msg_origin or ""), | ||
| normalized_message_id, | ||
| ] | ||
| ) | ||
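
To illustrate the text-signature branch of `build_content_dedup_key` in isolation, here is a standalone sketch that copies only that logic from the diff. The helper name `text_signature` is hypothetical and does not exist in the PR; the threshold constant is the one shown above.

```python
import hashlib

_MAX_RAW_TEXT_FINGERPRINT_LEN = 256  # same threshold as in the diff


def text_signature(text: str) -> str:
    """Hypothetical helper: raw text when short, length-tagged SHA-1 prefix when long."""
    msg_text = str(text or "").strip()
    if len(msg_text) <= _MAX_RAW_TEXT_FINGERPRINT_LEN:
        return msg_text
    msg_hash = hashlib.sha1(msg_text.encode("utf-8")).hexdigest()[:16]
    return f"h:{len(msg_text)}:{msg_hash}"


short_sig = text_signature("  hello  ")
long_sig = text_signature("a" * 300)

print(short_sig)  # hello
print(long_sig.startswith("h:300:"), len(long_sig))  # True 22
```

Text of up to 256 characters is embedded verbatim after stripping, so any `|` in the message also appears in the joined key. Longer text is reduced to its length plus a 16-character SHA-1 prefix, which bounds the key size.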
suggestion: Event deduplication is disabled only when TTL is exactly 0; it might be safer to treat all non-positive TTLs as disabled.
Here `is_duplicate` short-circuits only when `self._registry.ttl_seconds == 0`. Even though `safe_positive_float` should prevent negative values, using `<= 0` would be more robust against unexpected config or future parsing changes, and would align with `_clean_expired`, which already treats `ttl_seconds <= 0` as disabling cleanup.
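
A minimal sketch of the guard this comment proposes, written as a free function so it runs on its own. The name `dedup_enabled` is hypothetical; the PR keeps the check inline in `is_duplicate`.

```python
def dedup_enabled(ttl_seconds: float) -> bool:
    """Hypothetical helper: treat every non-positive TTL as 'deduplication off'."""
    return ttl_seconds > 0


# With `== 0`, only an exact zero disables dedup; a negative TTL would slip through.
for ttl in (0.5, 0.0, -1.0):
    print(ttl, dedup_enabled(ttl))
```

In `is_duplicate`, the equivalent change would be to return `False` early when `self._registry.ttl_seconds <= 0`.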