| 1 | +# IndexSyncAgent |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +The `IndexSyncAgent` is a background maintenance agent that periodically synchronizes project indexing status between the vector database/index storage and the application database. It ensures that the web UI displays accurate and up-to-date information about project indexing progress. |
| 6 | + |
| 7 | +## English Prompt |
| 8 | + |
| 9 | +The agent is guided by the following prompt (embedded in the agent as a constant): |
| 10 | + |
| 11 | +``` |
| 12 | +You are the Index Synchronization Agent for PicoCode. |
| 13 | + |
| 14 | +Your responsibilities: |
| 15 | +1. Monitor all registered projects in the system |
| 16 | +2. Check the actual indexing status by inspecting project databases and storage |
| 17 | +3. Reconcile differences between stored status and actual status |
| 18 | +4. Update project metadata with accurate status information |
| 19 | +5. Track indexing progress, completion times, and version information |
| 20 | +6. Report any errors or inconsistencies found during reconciliation |
| 21 | + |
| 22 | +Guidelines: |
| 23 | +- Run reconciliation at regular intervals (configurable) |
| 24 | +- Be efficient - only update when status has changed |
| 25 | +- Handle errors gracefully and log them appropriately |
| 26 | +- Never block the main application thread |
| 27 | +- Provide clear status information for the web UI |
| 28 | + |
| 29 | +Status values: |
| 30 | +- "created": Project registered but not yet indexed |
| 31 | +- "indexing": Indexing in progress |
| 32 | +- "ready": Indexing completed successfully |
| 33 | +- "error": Indexing failed or error detected |
| 34 | +``` |
| 35 | + |
| 36 | +## Features |
| 37 | + |
| 38 | +- **Automatic Status Reconciliation**: Periodically checks actual indexing status against stored status |
| 39 | +- **Background Operation**: Runs in a daemon thread so it does not block the main application |
| 40 | +- **Configurable Interval**: Adjustable reconciliation interval (default: 30 seconds, minimum: 5 seconds) |
| 41 | +- **Graceful Shutdown**: Clean thread termination on application shutdown |
| 42 | +- **Error Handling**: Robust error handling with logging |
| 43 | +- **Enable/Disable**: Can be disabled via configuration |
| 44 | +- **Status Reporting**: Provides status information via health endpoint |
| 45 | + |
| 46 | +## Configuration |
| 47 | + |
| 48 | +The agent can be configured via environment variables in your `.env` file: |
| 49 | + |
| 50 | +```bash |
| 51 | +# Enable/disable the agent (default: true) |
| 52 | +INDEX_SYNC_ENABLED=true |
| 53 | + |
| 54 | +# Reconciliation interval in seconds (default: 30, minimum: 5) |
| 55 | +INDEX_SYNC_INTERVAL=30 |
| 56 | +``` |
| 57 | + |
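As an illustration of how these two variables could be interpreted, here is a minimal sketch. The function name `load_index_sync_config`, the accepted truthy strings, the fallback on unparsable input, and the choice to clamp (rather than reject) intervals below the minimum are all assumptions, not the project's confirmed behavior.

```python
import os
from typing import Mapping, Tuple

DEFAULT_INTERVAL = 30  # documented default, in seconds
MIN_INTERVAL = 5       # documented minimum, in seconds


def load_index_sync_config(env: Mapping[str, str] = os.environ) -> Tuple[bool, int]:
    """Return (enabled, interval_seconds) read from the given environment mapping."""
    # Assumption: these spellings count as "enabled"; anything else disables the agent.
    raw_enabled = env.get("INDEX_SYNC_ENABLED", "true").strip().lower()
    enabled = raw_enabled in {"1", "true", "yes", "on"}

    try:
        interval = int(env.get("INDEX_SYNC_INTERVAL", DEFAULT_INTERVAL))
    except ValueError:
        # Assumption: fall back to the default when the value is not an integer.
        interval = DEFAULT_INTERVAL

    # Assumption: values below the documented minimum are clamped, not rejected.
    return enabled, max(interval, MIN_INTERVAL)
```

Passing a plain dict instead of `os.environ` makes the parsing easy to exercise in tests without touching the real environment.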
| 58 | +## Usage |
| 59 | + |
| 60 | +### Automatic Start/Stop |
| 61 | + |
| 62 | +The agent automatically starts when the FastAPI application starts (if enabled) and stops when the application shuts down. No manual intervention is required. |
| 63 | + |
| 64 | +```python |
| 65 | +# In main.py, the agent is automatically managed: |
| 66 | +# - Started during application lifespan startup |
| 67 | +# - Stopped during application lifespan shutdown |
| 68 | +``` |
| 69 | + |
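The lifespan wiring described above might look roughly like the sketch below. `StubAgent` and `make_lifespan` are hypothetical names used for illustration; the stub only mirrors the `start()` and `stop(timeout=...)` calls shown in this document, and the actual code in `main.py` may differ.

```python
import asyncio
from contextlib import asynccontextmanager


class StubAgent:
    """Hypothetical stand-in exposing start() and stop(timeout=...) only."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self, timeout=5.0):
        self.events.append("stop")


def make_lifespan(agent):
    """Build a lifespan context manager that starts and stops the agent."""

    @asynccontextmanager
    async def lifespan(app):
        agent.start()  # runs once, before the application begins serving
        try:
            yield
        finally:
            # stop() may block for up to `timeout` seconds while joining the
            # thread, so run it in a worker thread instead of the event loop.
            await asyncio.to_thread(agent.stop, timeout=5.0)

    return lifespan


# With FastAPI installed, the result would be passed as:
#     app = FastAPI(lifespan=make_lifespan(agent))
```

The `try`/`finally` ensures the agent is stopped even if the application exits with an error.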
| 70 | +### Manual Usage |
| 71 | + |
| 72 | +If you need to use the agent programmatically: |
| 73 | + |
| 74 | +```python |
| 75 | +from ai.agents import IndexSyncAgent |
| 76 | +from db import operations as db_operations |
| 77 | +from utils.logger import get_logger |
| 78 | + |
| 79 | +# Create and start the agent |
| 80 | +agent = IndexSyncAgent( |
| 81 | +    db_client=db_operations, |
| 82 | +    interval_seconds=30, |
| 83 | +    logger=get_logger(__name__), |
| 84 | +    enabled=True |
| 85 | +) |
| 86 | + |
| 87 | +agent.start() |
| 88 | + |
| 89 | +# Later, stop the agent |
| 90 | +agent.stop(timeout=5.0) |
| 91 | +``` |
| 92 | + |
| 93 | +### Checking Agent Status |
| 94 | + |
| 95 | +The agent status is included in the `/api/health` endpoint: |
| 96 | + |
| 97 | +```bash |
| 98 | +curl http://localhost:8080/api/health |
| 99 | +``` |
| 100 | + |
| 101 | +Response: |
| 102 | +```json |
| 103 | +{ |
| 104 | +  "status": "ok", |
| 105 | +  "version": "0.2.0", |
| 106 | +  "features": [...], |
| 107 | +  "index_sync_agent": { |
| 108 | +    "enabled": true, |
| 109 | +    "running": true, |
| 110 | +    "interval_seconds": 30, |
| 111 | +    "thread_alive": true |
| 112 | +  } |
| 113 | +} |
| 114 | +``` |
| 115 | + |
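For monitoring scripts, the `index_sync_agent` object in this response can be checked programmatically. The helper below is a sketch: the name `index_sync_agent_healthy` is hypothetical, and treating a disabled agent as healthy is an assumption about what a monitor would want.

```python
import json


def index_sync_agent_healthy(body: str) -> bool:
    """Interpret the `index_sync_agent` section of a /api/health response body."""
    payload = json.loads(body)
    agent = payload.get("index_sync_agent") or {}

    # Assumption: an agent disabled by configuration is not a failure.
    if not agent.get("enabled", False):
        return True

    # An enabled agent should report both flags as true.
    return bool(agent.get("running")) and bool(agent.get("thread_alive"))
```

The response body can be fetched with the standard library, for example `urllib.request.urlopen("http://localhost:8080/api/health").read().decode()`.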
| 116 | +## How It Works |
| 117 | + |
| 118 | +### Reconciliation Process |
| 119 | + |
| 120 | +1. **List Projects**: Retrieves all registered projects from the database |
| 121 | +2. **Check Each Project**: For each project: |
| 122 | +   - Verifies the project path exists |
| 123 | +   - Inspects the project's database to determine actual status |
| 124 | +   - Computes file count and embedding count |
| 125 | +3. **Determine Status** (a project with files but no embeddings keeps its stored status, since it may still be indexing): |
| 126 | +   - `created`: No database or no files indexed |
| 127 | +   - `ready`: Has files and embeddings |
| 128 | +   - `error`: Project path doesn't exist |
| 129 | +4. **Update if Changed**: Updates the status in the registry if it differs from stored status |
| 130 | + |
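The four steps above can be sketched as a single pass over the project list. Everything here is illustrative: `reconcile_once`, the `id` and `status` dictionary keys, and the injected `compute_status` and `update_status` callables are assumptions and do not reflect the real `db.operations` signatures.

```python
import logging
from typing import Callable, Dict, Iterable, List, Optional, Tuple

logger = logging.getLogger(__name__)


def reconcile_once(
    projects: Iterable[Dict[str, str]],
    compute_status: Callable[[Dict[str, str]], str],
    update_status: Callable[[str, str], None],
) -> List[Tuple[str, Optional[str], str]]:
    """Run one reconciliation pass; return (project_id, old, new) per change written."""
    changes = []
    for project in projects:
        try:
            actual = compute_status(project)
        except Exception:
            # One failing project should not abort the whole pass.
            logger.exception("Reconciliation failed for project %s", project.get("id"))
            continue

        stored = project.get("status")
        if actual != stored:
            # Write only when the status differs, to avoid needless updates.
            update_status(project["id"], actual)
            changes.append((project["id"], stored, actual))
    return changes
```

Injecting the two callables keeps the loop independent of any particular storage layer, which also makes it straightforward to unit test.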
| 131 | +### Status Logic |
| 132 | + |
| 133 | +``` |
| 134 | +Database exists? |
| 135 | +  No → "created" |
| 136 | +  Yes → Check file_count: |
| 137 | +    file_count == 0 → "created" |
| 138 | +    file_count > 0 AND embedding_count > 0 → "ready" |
| 139 | +    file_count > 0 AND embedding_count == 0 → keep current (might be indexing) |
| 140 | +``` |
| 141 | + |
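The decision tree can also be written as a small pure function. This is a sketch: the name `determine_status` and its parameters are hypothetical, and checking the project path before anything else is an assumption based on the order of the reconciliation steps.

```python
def determine_status(
    path_exists: bool,
    db_exists: bool,
    file_count: int,
    embedding_count: int,
    current_status: str,
) -> str:
    """Map observed project facts to one of the documented status values."""
    # Assumption: a missing project path takes precedence over every other check.
    if not path_exists:
        return "error"

    # No database yet, or a database with no indexed files.
    if not db_exists or file_count == 0:
        return "created"

    # Files and embeddings are both present.
    if embedding_count > 0:
        return "ready"

    # Files but no embeddings: indexing may be in progress, so keep the stored value.
    return current_status
```

For example, `determine_status(True, True, 10, 0, "indexing")` leaves the status at `"indexing"` rather than downgrading it to `"created"`.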
| 142 | +## Architecture |
| 143 | + |
| 144 | +``` |
| 145 | +┌─────────────────────────┐ |
| 146 | +│ FastAPI Application │ |
| 147 | +│ (main.py) │ |
| 148 | +└───────────┬─────────────┘ |
| 149 | +            │ creates & manages |
| 150 | +            ▼ |
| 151 | +┌─────────────────────────┐ |
| 152 | +│ IndexSyncAgent │ |
| 153 | +│ (Background Thread) │ |
| 154 | +└───────────┬─────────────┘ |
| 155 | +            │ uses |
| 156 | +            ▼ |
| 157 | +┌─────────────────────────┐ |
| 158 | +│ db.operations │ |
| 159 | +│ - list_projects() │ |
| 160 | +│ - get_project_stats() │ |
| 161 | +│ - update_status() │ |
| 162 | +└─────────────────────────┘ |
| 163 | +``` |
| 164 | + |
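The background-thread box in the diagram could be implemented along the following lines. This is a generic sketch of a stoppable periodic worker, not the actual `IndexSyncAgent` source; the class name `PeriodicWorker` and its internals are assumptions.

```python
import threading


class PeriodicWorker:
    """Run `task` repeatedly in a daemon thread until stop() is called."""

    def __init__(self, task, interval_seconds: float):
        self._task = task
        self._interval = interval_seconds
        self._stop_event = threading.Event()
        self._thread = None

    def start(self) -> None:
        if self.is_alive():
            return  # already running; starting twice is a no-op
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while not self._stop_event.is_set():
            try:
                self._task()
            except Exception:
                pass  # a real agent would log the error and keep going
            # wait() returns early when stop() sets the event, so shutdown
            # does not have to sit out the remainder of the interval.
            self._stop_event.wait(self._interval)

    def stop(self, timeout: float = 5.0) -> None:
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join(timeout)

    def is_alive(self) -> bool:
        return self._thread is not None and self._thread.is_alive()
```

Using `Event.wait()` instead of `time.sleep()` is what makes shutdown prompt: the thread wakes as soon as `stop()` is called, even with a long interval.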
| 165 | +## Testing |
| 166 | + |
| 167 | +Run the unit tests: |
| 168 | + |
| 169 | +```bash |
| 170 | +python -m unittest tests.test_index_sync_agent |
| 171 | +``` |
| 172 | + |
| 173 | +Or with pytest (if installed): |
| 174 | + |
| 175 | +```bash |
| 176 | +pytest tests/test_index_sync_agent.py |
| 177 | +``` |
| 178 | + |
| 179 | +## Benefits |
| 180 | + |
| 181 | +1. **Accurate UI Information**: Web UI shows indexing status that lags the actual state by roughly one reconciliation interval at most |
| 182 | +2. **Automatic Recovery**: Detects and corrects status inconsistencies |
| 183 | +3. **Error Detection**: Identifies projects with missing paths or database issues |
| 184 | +4. **Low Overhead**: Minimal resource usage with configurable interval |
| 185 | +5. **Maintainable**: Clean separation of concerns with clear responsibilities |
| 186 | + |
| 187 | +## Troubleshooting |
| 188 | + |
| 189 | +### Agent Not Starting |
| 190 | + |
| 191 | +Check the logs for errors: |
| 192 | +```bash |
| 193 | +# Look for "IndexSyncAgent started successfully" or error messages |
| 194 | +``` |
| 195 | + |
| 196 | +Verify configuration: |
| 197 | +```bash |
| 198 | +# In your .env file: |
| 199 | +INDEX_SYNC_ENABLED=true |
| 200 | +``` |
| 201 | + |
| 202 | +### Status Not Updating |
| 203 | + |
| 204 | +Increase logging verbosity and check reconciliation logs. The agent logs: |
| 205 | +- When it starts/stops |
| 206 | +- When status changes are detected |
| 207 | +- Any errors during reconciliation |
| 208 | + |
| 209 | +### Performance Issues |
| 210 | + |
| 211 | +If the agent impacts performance: |
| 212 | +1. Increase the reconciliation interval (e.g., 60 seconds) |
| 213 | +2. Check database performance (ensure WAL mode is enabled) |
| 214 | +3. Review logs for repeated errors |
| 215 | + |
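To check the journal mode from step 2, something like the following could be used. It assumes the project databases are SQLite files, which the reference to WAL mode suggests but this document does not state; the function names and the database path you pass in are hypothetical.

```python
import sqlite3


def journal_mode(db_path: str) -> str:
    """Return the current journal mode of a SQLite database, e.g. 'wal' or 'delete'."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA journal_mode;").fetchone()[0].lower()
    finally:
        conn.close()


def enable_wal(db_path: str) -> str:
    """Switch the database to WAL mode and return the mode SQLite reports back."""
    conn = sqlite3.connect(db_path)
    try:
        # SQLite answers with the resulting mode; it stays unchanged if WAL
        # cannot be enabled (for example on an in-memory database).
        return conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0].lower()
    finally:
        conn.close()
```

WAL mode is stored in the database file, so it only needs to be enabled once per database.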
| 216 | +## Future Enhancements |
| 217 | + |
| 218 | +Potential improvements for future versions: |
| 219 | + |
| 220 | +- Track index version hashes for change detection |
| 221 | +- Report progress percentage for in-progress indexing |
| 222 | +- Support for custom status reconciliation logic |
| 223 | +- Metrics collection (status change counts, reconciliation duration) |
| 224 | +- Integration with notification systems |
| 225 | +- Support for distributed deployments (multiple agents coordinating) |