AI Document Search Backend

The server is deployed at https://ai-document-search-backend.azurewebsites.net/. Deployment is triggered automatically by every push to the master branch. The OpenAPI schema is available at https://ai-document-search-backend.azurewebsites.net/docs.

This repository uses the Poetry package manager (see Useful Poetry commands).

The server uses the FastAPI framework.

The code uses dependency injection and is tested using pytest.
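
As an illustration of why dependency injection makes the code easy to test with pytest, here is a minimal sketch using plain constructor injection. All names in it (AnswerModel, ChatService, FakeModel) are hypothetical and are not taken from this repository, which may wire its dependencies differently, for example through a container library.

```python
from dataclasses import dataclass
from typing import Protocol


class AnswerModel(Protocol):
    """Hypothetical interface for anything that can answer a question."""

    def answer(self, question: str) -> str: ...


@dataclass
class ChatService:
    # The dependency is passed in rather than constructed here,
    # so a test can substitute a fake for the real client.
    model: AnswerModel

    def ask(self, question: str) -> str:
        cleaned = question.strip()
        if not cleaned:
            raise ValueError("question must not be empty")
        return self.model.answer(cleaned)


class FakeModel:
    """Test double: records calls and returns a canned reply."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def answer(self, question: str) -> str:
        self.calls.append(question)
        return "canned reply"


def test_ask_strips_whitespace_and_delegates() -> None:
    # pytest collects functions named test_*; no network or API key is needed.
    fake = FakeModel()
    service = ChatService(model=fake)
    assert service.ask("  What is a bond?  ") == "canned reply"
    assert fake.calls == ["What is a bond?"]
```

Because the service only depends on the interface, the same test runs without an OpenAI key or a vector database.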

How to run the server locally

Once started, the server is available at http://localhost:8000.

Start by creating a .env file in the project root with the following content:

APP_OPENAI_API_KEY=your_openai_api_key
APP_WEAVIATE_API_KEY=api_key_for_weaviate_url_specified_in_config
COSMOS_KEY=key_for_cosmos_url_specified_in_config
AUTH_SECRET_KEY=any_secret_key
AUTH_USERNAME=any_user
AUTH_PASSWORD=any_password
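
A missing or empty key is easy to overlook. The following is a minimal sketch of a pre-flight check for the .env file, using only the standard library. The helper names (parse_env, missing_keys) are hypothetical and not part of this repository, and the parser assumes simple KEY=value lines without quoting.

```python
REQUIRED_KEYS = (
    "APP_OPENAI_API_KEY",
    "APP_WEAVIATE_API_KEY",
    "COSMOS_KEY",
    "AUTH_SECRET_KEY",
    "AUTH_USERNAME",
    "AUTH_PASSWORD",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list[str]:
    """Return required keys that are absent or empty, in listing order."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, missing_keys(pathlib.Path(".env").read_text()) returns an empty list when all six keys are present and non-empty.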

Without Docker

  • poetry install
  • poetry run uvicorn ai_document_search_backend.application:app --reload

With Docker

  • docker compose up

Other useful commands

Unit tests

  • poetry run pytest

Load tests

  • Start the server locally (see above).
  • poetry run locust
  • Open http://localhost:8089/ in your browser.
  • Enter the number of users, the spawn rate and Host (http://localhost:8000 – without trailing slash).
  • Click "Start swarming".

Lint autoformat

  • poetry run black --config black.py.toml .
  • poetry run ruff check . --fix

Lint check

  • poetry run black --config black.py.toml . --check
  • poetry run ruff check .

Build Docker image, tag and push to Azure Container Registry

  • docker build -t ai-document-search-backend -f Dockerfile .
  • docker tag ai-document-search-backend:latest crdocsearchdev.azurecr.io/crdocsearchdev/ai-document-search-backend:0.0.1
  • az login
  • az acr login --name crdocsearchdev
  • docker push crdocsearchdev.azurecr.io/crdocsearchdev/ai-document-search-backend:0.0.1
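
The image reference appears twice in the steps above and must match exactly. As a sketch, the commands can be generated from a single version string so the tag and push targets cannot drift apart. The function names are hypothetical. The sketch assumes the registry name doubles as the repository namespace, as it does in the commands above.

```python
def image_reference(registry: str, namespace: str, name: str, version: str) -> str:
    """Compose '<registry>.azurecr.io/<namespace>/<name>:<version>'."""
    if not version or ":" in version or "/" in version:
        raise ValueError(f"invalid version tag: {version!r}")
    return f"{registry}.azurecr.io/{namespace}/{name}:{version}"


def release_commands(registry: str, name: str, version: str) -> list[list[str]]:
    """Return the build, tag, login and push commands as argument lists."""
    target = image_reference(registry, registry, name, version)
    return [
        ["docker", "build", "-t", name, "-f", "Dockerfile", "."],
        ["docker", "tag", f"{name}:latest", target],
        ["az", "login"],
        ["az", "acr", "login", "--name", registry],
        ["docker", "push", target],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for command in release_commands("crdocsearchdev", "ai-document-search-backend", "0.0.1"):
        print(" ".join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) if the steps should be executed rather than printed.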

Useful Poetry commands

  • Install all dependencies: poetry install.
  • Add a new package at the latest version: poetry add <package>, e.g. poetry add numpy.
  • Add a package for development only: poetry add <package> --group dev, e.g. poetry add jupyter --group dev.
  • Regenerate the poetry.lock file without updating dependency versions: poetry lock --no-update. (Poetry 2.x removed this flag; there, plain poetry lock has the same effect.)
  • Remove a package: poetry remove <package>, e.g. poetry remove numpy.

Populating the vector database

  • Download NTNU2.xlsx from the customer and save it to data/NTNU2.xlsx. This file is private and is therefore not included in the repository. See prepare_data.py for the columns that must be present in the file.
  • Run poetry run python ai_document_search_backend/scripts/prepare_data.py to pre-process the data.
  • Run poetry run python ai_document_search_backend/scripts/download_documents.py [limit] to download the PDFs into a local folder. The limit is optional and specifies the number of documents to download. If not specified, all documents will be downloaded.
  • Run poetry run python ai_document_search_backend/scripts/fill_vectorstore.py to store the documents in the vector database.
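
The three scripts above must run in order, and a later step is pointless if an earlier one fails. A minimal sketch of a wrapper that enforces this follows. The wrapper and its function names are hypothetical and not part of the repository. It assumes it is run from the project root with data/NTNU2.xlsx already in place.

```python
import subprocess
from typing import Optional

SCRIPTS_DIR = "ai_document_search_backend/scripts"


def pipeline_commands(limit: Optional[int] = None) -> list[list[str]]:
    """Build the three commands in order; limit caps the number of PDFs downloaded."""
    if limit is not None and limit <= 0:
        raise ValueError("limit must be a positive integer")
    base = ["poetry", "run", "python"]
    download = base + [f"{SCRIPTS_DIR}/download_documents.py"]
    if limit is not None:
        download.append(str(limit))
    return [
        base + [f"{SCRIPTS_DIR}/prepare_data.py"],
        download,
        base + [f"{SCRIPTS_DIR}/fill_vectorstore.py"],
    ]


def run_pipeline(limit: Optional[int] = None) -> None:
    """Run each step; check=True raises and stops at the first failure."""
    for command in pipeline_commands(limit):
        subprocess.run(command, check=True)
```

Calling run_pipeline(limit=10) is a quick way to try the whole flow on a small sample before downloading every document.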

Project structure, architecture and design

For a more detailed description of the project structure, architecture and design, see the project structure document.
