
Conversation

@lwangverizon
Contributor

Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

  • N/A (No existing issue)

2. Or, if no issue exists, describe the change:

Problem:
Event compaction was running synchronously and blocking runner.run_async() exit, causing significant delays on the frontend. When compaction was enabled, the async generator would not complete until compaction finished, which could take several seconds because compaction involves:

  • LLM API calls for event summarization (maybe_summarize_events) - typically taking 1-3 seconds per compaction
  • Database writes (append_event) - adding additional latency

Impact:
Even though all agent events had already been yielded to the frontend, the generator would not complete until compaction finished. This meant:

  • Frontend had to wait for compaction to complete before receiving the completion signal
  • User-perceived latency increased by the compaction duration (often 1-3+ seconds)
  • Poor user experience, especially noticeable in interactive applications
  • Compaction, intended as a background maintenance task, was blocking user-facing responses

Additional Issue:
Under high concurrency scenarios, there was no mechanism to limit concurrent compaction tasks, which could lead to:

  • Resource exhaustion (too many concurrent LLM API calls hitting rate limits)
  • Database connection pool exhaustion
  • Unbounded background task accumulation
  • Potential service degradation under load

Solution:
This PR makes event compaction truly non-blocking, eliminating the delay before runner.run_async() exits and improving frontend responsiveness.

  1. Made compaction non-blocking: Changed compaction from a synchronous await to a background task scheduled with asyncio.create_task(). This allows:

    • The generator to complete immediately after yielding all events
    • Frontend to receive the completion signal without waiting for compaction
    • Compaction to run asynchronously in the background without blocking user-facing responses
    • Performance improvement: Eliminates 1-3+ second delays caused by LLM calls during compaction
  2. Added concurrency control: Introduced a configurable max_concurrent_compactions parameter (default: 10) to the Runner class that uses a semaphore to limit concurrent compaction tasks. This prevents:

    • Resource exhaustion under high concurrency
    • LLM API rate limit violations
    • Database connection pool exhaustion
    • Unbounded background task accumulation
  3. Improved error handling: Wrapped compaction in comprehensive error handling so failures:

    • Don't crash the runner
    • Are logged appropriately for debugging
    • Don't affect user responses
  4. Updated documentation: Updated docstrings to accurately reflect that compaction runs asynchronously and no longer blocks generator completion.
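As a rough sketch of steps 1-3, the pattern looks like the following. The identifiers (`_compaction_semaphore`, `_compact_in_background`, the event names) and the sleep standing in for the LLM call are illustrative, not the actual symbols in runners.py:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# One semaphore for the whole process so the concurrency limit is global
# (illustrative; the real implementation attaches this to the Runner class).
_compaction_semaphore = asyncio.Semaphore(10)

# Hold strong references so background tasks are not garbage collected
# before they finish.
_background_tasks: set = set()


async def _compact_in_background() -> None:
    """Run the slow compaction work under the shared semaphore."""
    async with _compaction_semaphore:
        try:
            await asyncio.sleep(0.01)  # stand-in for LLM summarization + DB writes
        except Exception:
            # Log and swallow: a compaction failure must never crash the
            # runner or affect user-facing responses.
            logger.exception('Event compaction failed')


async def run_async():
    """Yield all events, then schedule compaction without awaiting it."""
    for event in ('event_1', 'event_2', 'final'):
        yield event
    # Fire-and-forget: the generator completes immediately after this line.
    task = asyncio.create_task(_compact_in_background())
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
```

The `add_done_callback(discard)` pairing is the standard way to keep a reference to a fire-and-forget task (so the event loop cannot garbage-collect it mid-flight) while still letting it be released once it completes.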

Key Improvement:
The solution transforms compaction from a blocking operation (that delayed frontend responses) into a truly asynchronous background task, significantly improving application responsiveness while maintaining all compaction functionality.

The change is backward compatible (the default behavior suits most scenarios) while providing fine-grained control for production environments with different resource constraints.

Testing Plan

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Test Results:

tests/unittests/test_runners.py::TestRunnerCompaction::test_max_concurrent_compactions_default_value PASSED
tests/unittests/test_runners.py::TestRunnerCompaction::test_max_concurrent_compactions_custom_value PASSED
tests/unittests/test_runners.py::TestRunnerCompaction::test_max_concurrent_compactions_shared_across_instances PASSED
tests/unittests/test_runners.py::TestRunnerCompaction::test_max_concurrent_compactions_validation PASSED
tests/unittests/test_runners.py::TestRunnerCompaction::test_compaction_runs_in_background_non_blocking PASSED
tests/unittests/test_runners.py::TestRunnerCompaction::test_compaction_semaphore_limits_concurrency PASSED
tests/unittests/test_runners.py::TestRunnerCompaction::test_compaction_error_does_not_block_generator PASSED
tests/unittests/test_runners.py::TestRunnerCompaction::test_compaction_not_run_when_config_missing PASSED

======================== 8 passed, 6 warnings in 2.14s =========================

Test Coverage:

  • Configuration: Default values, custom values, validation, shared semaphore
  • Non-blocking behavior: Generator completes before compaction finishes
  • Concurrency control: Semaphore limits concurrent compactions
  • Error handling: Compaction errors don't block generator
  • Edge cases: Missing config, timing verification
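One way to exercise the non-blocking guarantee is a timing-based check along these lines. This is a self-contained stub, not the actual code in test_runners.py; `slow_compaction` stands in for the real compaction work:

```python
import asyncio
import time


async def slow_compaction(record: list) -> None:
    """Stand-in for compaction: slow relative to event streaming."""
    await asyncio.sleep(0.2)
    record.append('compaction_done')


async def run_async(record: list):
    """Stub runner: yields its events, schedules compaction, exits at once."""
    yield 'event'
    task = asyncio.create_task(slow_compaction(record))
    record.append(task)  # keep a reference so the task is not collected


async def check_non_blocking() -> float:
    record: list = []
    start = time.monotonic()
    async for _ in run_async(record):
        pass
    elapsed = time.monotonic() - start
    # The generator has finished, but compaction has not run yet.
    assert 'compaction_done' not in record
    await asyncio.sleep(0.3)  # give the background task time to complete
    assert 'compaction_done' in record
    return elapsed
```

If the runner were still awaiting compaction, `elapsed` would be at least the 0.2 s compaction time; with the background-task version it is effectively zero.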

Manual End-to-End (E2E) Tests:

Setup:

  1. Create an app with event compaction enabled (in-memory services shown here so the snippet is self-contained):
from google.adk import Agent
from google.adk.apps import App
from google.adk.apps.app import EventsCompactionConfig
from google.adk.artifacts import InMemoryArtifactService
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService

# In-memory services are sufficient for local testing.
session_service = InMemorySessionService()
artifact_service = InMemoryArtifactService()

app = App(
    name='test_app',
    root_agent=Agent(model='gemini-2.0-flash', name='test_agent'),
    events_compaction_config=EventsCompactionConfig(
        compaction_interval=2,
        overlap_size=1,
    ),
)

# Test with the default limit
runner = Runner(
    app=app,
    session_service=session_service,
    artifact_service=artifact_service,
)

# Test with a custom limit
runner_custom = Runner(
    app=app,
    session_service=session_service,
    artifact_service=artifact_service,
    max_concurrent_compactions=5,
)

Manual Testing Steps:

  1. Non-blocking behavior: Run multiple invocations and observe that the generator completes immediately (within milliseconds) while compaction runs in the background. Verify frontend receives completion signal without delay.

  2. Concurrency limiting: Under high load (multiple concurrent requests), verify that compaction tasks are limited by the semaphore. Monitor resource usage (LLM API calls, DB connections) to ensure they don't exceed limits.

  3. Error handling: Simulate compaction failures (e.g., network errors) and verify that:

    • Generator still completes successfully
    • Errors are logged but don't crash the runner
    • User responses are not affected
  4. Configuration validation: Test invalid max_concurrent_compactions values (0, -1) and verify ValueError is raised.
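Step 4 amounts to a check like the following (`validate_max_concurrent_compactions` is an illustrative name, not the real API; the described behavior is simply "positive integer or ValueError"):

```python
def validate_max_concurrent_compactions(value: int) -> int:
    """Reject non-positive or non-integer limits, as described in step 4."""
    if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
        raise ValueError(
            f'max_concurrent_compactions must be a positive integer, got {value!r}'
        )
    return value
```

With this, values such as 0 and -1 raise ValueError while any positive integer is accepted.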

Expected Results:

  • Generator completes immediately after yielding all events
  • Compaction runs in background without blocking
  • Frontend receives responses without delay
  • Concurrent compactions are limited by semaphore
  • Errors are handled gracefully

Sample Code:
See contributing/samples/compaction_config_example/ for complete examples demonstrating the new features.

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

Files Changed:

  • src/google/adk/runners.py: Added non-blocking compaction, semaphore-based concurrency control, and max_concurrent_compactions parameter
  • tests/unittests/test_runners.py: Added comprehensive test suite (TestRunnerCompaction class with 8 tests)
  • contributing/samples/compaction_config_example/: Added sample code demonstrating the new features

Key Implementation Details:

  1. Background Task: Compaction runs via asyncio.create_task() with error handling
  2. Semaphore: Class-level semaphore shared across all Runner instances for global concurrency control
  3. Default Value: 10 concurrent compactions (reasonable default for most scenarios)
  4. Validation: Parameter must be positive integer (raises ValueError if invalid)
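A plausible shape for details 2-4 together, with a class-level semaphore created lazily so it is built while an event loop exists (assumed names; the actual runners.py implementation may differ):

```python
import asyncio
from typing import Optional


class Runner:
    """Sketch of the concurrency-control plumbing only."""

    # Class-level: shared by every Runner instance in the process, so the
    # compaction limit is global rather than per instance.
    _compaction_semaphore: Optional[asyncio.Semaphore] = None
    _compaction_limit: int = 10  # default of 10 concurrent compactions

    def __init__(self, max_concurrent_compactions: int = 10):
        # Validation: must be a positive integer.
        if not isinstance(max_concurrent_compactions, int) or max_concurrent_compactions <= 0:
            raise ValueError('max_concurrent_compactions must be a positive integer')
        type(self)._compaction_limit = max_concurrent_compactions

    @classmethod
    def _get_semaphore(cls) -> asyncio.Semaphore:
        # Lazy creation: built on first use, then reused by all instances.
        if cls._compaction_semaphore is None:
            cls._compaction_semaphore = asyncio.Semaphore(cls._compaction_limit)
        return cls._compaction_semaphore
```

Because the semaphore lives on the class rather than the instance, ten Runners in one process still share a single global limit.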

Performance Impact:

  • Positive: Frontend responsiveness significantly improved (generator completes immediately)
  • Positive: Resource usage controlled under high concurrency
  • Neutral: Compaction still happens, just asynchronously
  • Minimal: Semaphore overhead is negligible

Backward Compatibility:

  • Fully backward compatible: Default behavior works for existing code
  • Optional parameter: max_concurrent_compactions is optional with sensible default
  • No API changes: Only adds optional parameter, doesn't change existing behavior

Production Recommendations:

  • Default (10): Works well for most scenarios
  • Low-traffic: 5-10 concurrent compactions
  • High-traffic: 15-20 concurrent compactions (ensure your infrastructure can handle the load)
  • Resource-constrained: 2-5 concurrent compactions

@gemini-code-assist
Contributor

Summary of Changes

Hello @lwangverizon, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the performance and stability of the application by transforming event compaction from a blocking operation into an asynchronous background process. Previously, compaction, which involves LLM API calls and database writes, caused noticeable delays for users. The changes introduce concurrency control to manage resource usage under high load and robust error handling to prevent service disruptions, ultimately leading to a more responsive and resilient user experience.

Highlights

  • Non-blocking Event Compaction: Event compaction now runs as an asynchronous background task using asyncio.create_task(), preventing delays in frontend responses and allowing the generator to complete immediately.
  • Concurrency Control: A new max_concurrent_compactions parameter (default: 10) and a shared asyncio.Semaphore have been introduced to limit concurrent compaction tasks, mitigating resource exhaustion and API rate limit issues.
  • Improved Error Handling: Compaction failures are now gracefully handled and logged, ensuring they do not crash the runner or affect user-facing responses.
  • Updated Documentation: Docstrings for the run and run_async methods have been updated to accurately reflect the non-blocking nature of event compaction.


@adk-bot adk-bot added the core [Component] This issue is related to the core interface and implementation label Jan 30, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request is a great improvement, making event compaction non-blocking and adding concurrency controls. The changes significantly enhance performance and robustness by moving the synchronous compaction process to a background task, managed by a semaphore to prevent resource exhaustion. The implementation is well-thought-out, with good error handling and comprehensive tests. I've included a few suggestions to further improve robustness, such as ensuring background tasks are not prematurely garbage collected and enhancing thread safety. Overall, this is an excellent contribution.

@lwangverizon lwangverizon marked this pull request as ready for review January 30, 2026 20:01
