fix: enhanced document upload, page limit, upload limit tests#839

Merged
MODSetter merged 12 commits into MODSetter:dev from AnishSarkar22:feat/document-test
Feb 26, 2026

@AnishSarkar22 AnishSarkar22 commented Feb 26, 2026

Description

  • Added integration tests for the document upload HTTP API covering upload, multi-file, duplicate detection, auth, error handling, and searchability
  • Separated test architecture: document_upload/ tests API behavior through public endpoints, indexing_pipeline/ tests pipeline internals in isolation
  • Centralized test database configuration in root conftest.py with automatic surfsense_test database override
  • Replaced hardcoded embedding dimension (1024) with app_config.embedding_model_instance.dimension across all test files
  • Cleaned up .env.example and pyproject.toml by removing unused testing config and the e2e marker
  • Rewrote the testing documentation
  • Updated the UI for the SurfSense docs
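
As an illustration of the `surfsense_test` database override mentioned above, here is a minimal sketch of how a root `conftest.py` could redirect a connection URL to the test database. The helper name `to_test_database_url` and the example URL are assumptions for illustration; the PR's actual conftest may do this differently.

```python
from urllib.parse import urlsplit, urlunsplit

# Name of the test database, as mentioned in the PR description.
TEST_DB_NAME = "surfsense_test"


def to_test_database_url(url: str, test_db: str = TEST_DB_NAME) -> str:
    """Return `url` with its database name swapped for the test database.

    Hypothetical helper: host, credentials, driver, and query string are
    preserved, and only the path component (the database name) is replaced.
    """
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=f"/{test_db}"))


if __name__ == "__main__":
    # Example URL is an assumption, not taken from the repository.
    print(to_test_database_url("postgresql+asyncpg://user:pw@localhost:5432/surfsense"))
```

A conftest could apply a helper like this to the configured URL before the app is imported, so that tests cannot accidentally run against a development database.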

Motivation and Context

FIX #

Screenshots

API Changes

  • This PR includes API changes

Change Type

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring
  • Documentation
  • Dependency/Build system
  • Breaking change
  • Other (specify):

Testing Performed

  • Tested locally
  • Manual/QA verification

Checklist

  • Follows project coding standards and conventions
  • Documentation updated as needed
  • Dependencies updated as needed
  • No lint/build errors or new warnings
  • All relevant tests are passing

High-level PR Summary

This PR refactors the document upload testing architecture by replacing end-to-end tests with integration tests that run the FastAPI app in-process via ASGITransport. It introduces a task dispatcher abstraction (TaskDispatcher protocol) that allows tests to process documents synchronously without requiring Celery/Redis, centralizes test database configuration in the root conftest.py with automatic test database override, and replaces hardcoded embedding dimensions with dynamic values from app_config. The testing configuration is cleaned up by removing the e2e marker and unused environment variables, while the test suite now covers document upload API behavior including multi-file uploads, duplicate detection, authentication, error handling, page limits, upload limits, and searchability.
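
To make the dispatcher idea concrete, here is a rough sketch of what a protocol-based task dispatcher with a synchronous test implementation might look like. Only the `TaskDispatcher` name comes from the summary; the method name `dispatch_file_processing`, the `InlineTaskDispatcher` class, and the `upload_document` function are hypothetical stand-ins, not the code in `task_dispatcher.py`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Protocol


class TaskDispatcher(Protocol):
    """Anything that can schedule processing of an uploaded file."""

    # Method name and parameters are assumptions for this sketch.
    async def dispatch_file_processing(self, document_id: int, filename: str) -> None: ...


@dataclass
class InlineTaskDispatcher:
    """Test implementation: handles the document immediately, no Celery or Redis."""

    processed: list[tuple[int, str]] = field(default_factory=list)

    async def dispatch_file_processing(self, document_id: int, filename: str) -> None:
        # A production dispatcher would enqueue a background task here instead.
        self.processed.append((document_id, filename))


async def upload_document(dispatcher: TaskDispatcher, document_id: int, filename: str) -> dict:
    """Stand-in for the upload endpoint: it depends only on the protocol."""
    await dispatcher.dispatch_file_processing(document_id, filename)
    return {"document_id": document_id, "status": "accepted"}


if __name__ == "__main__":
    dispatcher = InlineTaskDispatcher()
    print(asyncio.run(upload_document(dispatcher, 1, "report.pdf")))
    print(dispatcher.processed)
```

Because the endpoint only sees the protocol, a test can swap in the inline implementation and assert on the result as soon as the request returns.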

⏱️ Estimated Review Time: 30-90 minutes

💡 Review Order Suggestion
Order File Path
1 surfsense_backend/tests/conftest.py
2 surfsense_backend/pyproject.toml
3 surfsense_backend/.env.example
4 surfsense_backend/app/services/task_dispatcher.py
5 surfsense_backend/app/routes/documents_routes.py
6 surfsense_backend/tests/integration/conftest.py
7 surfsense_backend/tests/integration/document_upload/conftest.py
8 surfsense_backend/tests/utils/helpers.py
9 surfsense_backend/tests/integration/document_upload/test_document_upload.py
10 surfsense_backend/tests/integration/document_upload/test_page_limits.py
11 surfsense_backend/tests/integration/document_upload/test_upload_limits.py
12 surfsense_backend/tests/integration/indexing_pipeline/test_index_document.py

- Introduced a TaskDispatcher abstraction to decouple the upload endpoint from Celery, allowing for easier testing with synchronous implementations.
- Updated the create_documents_file_upload function to utilize the new dispatcher for task management.
- Removed direct Celery task imports from the upload function, enhancing modularity.
- Added integration tests for document upload, including page limit enforcement and file size restrictions.
- Removed commented-out testing configuration from .env.example to streamline the file.
- Updated markers in pyproject.toml to remove the e2e test marker, clarifying the purpose of the remaining markers.
…ation

- Updated the embedding dimension in test configurations to use the value from the application config, enhancing maintainability and consistency across tests.
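
A small sketch of the embedding-dimension change: test helpers read the dimension from configuration instead of hardcoding 1024. The `SimpleNamespace` object below is a stand-in for the real `app_config`, the value 384 is arbitrary, and `make_fake_embedding` is a hypothetical helper name.

```python
from types import SimpleNamespace

# Stand-in for the application's config object (assumption for this sketch).
# The attribute path mirrors the one named in the PR description.
app_config = SimpleNamespace(
    embedding_model_instance=SimpleNamespace(dimension=384)
)


def make_fake_embedding(value: float = 0.1) -> list[float]:
    """Build a dummy vector whose length follows the configured model."""
    return [value] * app_config.embedding_model_instance.dimension


if __name__ == "__main__":
    print(len(make_fake_embedding()))
```

With this pattern, switching the embedding model changes the fixture size automatically, so tests do not fail on a vector-length mismatch.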

vercel bot commented Feb 26, 2026

@AnishSarkar22 is attempting to deploy a commit to the Rohan Verma's projects Team on Vercel.

A member of the Team first needs to authorize it.

@AnishSarkar22 AnishSarkar22 marked this pull request as ready for review February 26, 2026 20:28

@recurseml recurseml bot left a comment

Review by RecurseML

🔍 Review performed on 30617c6..836d529

✨ No bugs found, your code is sparkling clean

✅ Files analyzed, no issues (13)

surfsense_backend/.env.example
surfsense_backend/app/routes/documents_routes.py
surfsense_backend/app/services/task_dispatcher.py
surfsense_backend/pyproject.toml
surfsense_backend/tests/conftest.py
surfsense_backend/tests/e2e/conftest.py
surfsense_backend/tests/integration/conftest.py
surfsense_backend/tests/integration/document_upload/conftest.py
surfsense_backend/tests/integration/document_upload/test_document_upload.py
surfsense_backend/tests/integration/document_upload/test_page_limits.py
surfsense_backend/tests/integration/document_upload/test_upload_limits.py
surfsense_backend/tests/integration/indexing_pipeline/test_index_document.py
surfsense_backend/tests/utils/helpers.py

⏭️ Files skipped (1)
surfsense_backend/tests/integration/document_upload/__init__.py

@recurseml recurseml bot left a comment

Review by RecurseML

🔍 Review performed on 836d529..836d529

✨ No files to analyze

@recurseml recurseml bot left a comment

Review by RecurseML

🔍 Review performed on 836d529..2468cc2

✨ No bugs found, your code is sparkling clean

✅ Files analyzed, no issues (12)

surfsense_web/app/docs/layout.tsx
surfsense_web/app/docs/sidebar-separator.tsx
surfsense_web/app/globals.css
surfsense_web/app/layout.config.tsx
surfsense_web/content/docs/connectors/meta.json
surfsense_web/content/docs/docker-installation.mdx
surfsense_web/content/docs/how-to/meta.json
surfsense_web/content/docs/index.mdx
surfsense_web/content/docs/installation.mdx
surfsense_web/content/docs/manual-installation.mdx
surfsense_web/content/docs/testing.mdx
surfsense_web/lib/source.ts

@MODSetter MODSetter merged commit 2f08dc9 into MODSetter:dev Feb 26, 2026
4 of 7 checks passed

@recurseml recurseml bot left a comment

Review by RecurseML

🔍 Review performed on 2468cc2..2468cc2

✨ No files to analyze
