Distributed Document Search Service

Node.js REST API for multi-tenant document search with Elasticsearch, Redis caching, Redis rate limiting, structured logs, and Prometheus metrics.

Default local API base URL: http://localhost:3020

What This Project Demonstrates

Public health and metrics endpoints
Authenticated document and search APIs
Tenant isolation enforced on every protected request
Reader/writer role separation
Search caching and document caching in Redis
Rate limiting backed by Redis
Safe Elasticsearch query construction
Soft delete behavior
Reviewer-friendly demo and load-test scripts
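
As an illustration of tenant isolation, safe query construction, and soft delete working together, here is a minimal sketch of a search request body. The function name buildSearchBody and the field names tenantId, deleted, title, and content are assumptions made for this example, not names confirmed by the project source.

```javascript
// Hypothetical sketch: the field names (tenantId, deleted, title, content)
// are assumptions for illustration, not taken from the project source.
function buildSearchBody({ tenantId, q, page = 1, size = 10 }) {
  if (typeof tenantId !== "string" || tenantId.length === 0) {
    throw new Error("tenantId is required");
  }
  return {
    from: (page - 1) * size,
    size,
    query: {
      bool: {
        // User text is passed only as a value, never spliced into a query string.
        must: [{ multi_match: { query: String(q), fields: ["title", "content"] } }],
        // The tenant filter is always present and is not derived from user input.
        filter: [{ term: { tenantId } }],
        // Soft-deleted documents stay in the index but are excluded from results.
        must_not: [{ term: { deleted: true } }],
      },
    },
  };
}

console.log(JSON.stringify(buildSearchBody({ tenantId: "tenant-a", q: "contract" }), null, 2));
```

Building the body as a plain object keeps user input out of the query structure, which is one common way to get the "safe construction" property listed above.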

Quick Start

Local Node API + Docker Dependencies

cp .env.example .env
npm install
docker compose up -d elasticsearch redis
npm start

Then check health:

curl --max-time 5 http://localhost:3020/health
curl --max-time 5 http://localhost:3020/metrics

Full Docker Compose

docker compose up --build

With Docker Compose, the API is published on http://localhost:3020; inside the container it listens on port 3000.

Requirements

The application reads the following values from .env:

PORT=3020
ELASTICSEARCH_URL=http://localhost:9200
REDIS_URL=redis://localhost:6379
ELASTICSEARCH_INDEX_PREFIX=documents
LOG_LEVEL=info
SEARCH_CACHE_TTL_SECONDS=60
SEARCH_QUERY_MAX_LENGTH=200
SEARCH_DEFAULT_PAGE=1
SEARCH_DEFAULT_SIZE=10
SEARCH_MAX_SIZE=50
DOCUMENT_CACHE_TTL_SECONDS=300
TENANT_RATE_LIMIT_PER_MINUTE=100
DOCUMENT_RATE_LIMIT_PER_MINUTE=30
RATE_LIMIT_WINDOW_SECONDS=60
HEALTH_CHECK_TIMEOUT_MS=2000
ELASTICSEARCH_REQUEST_TIMEOUT_MS=2000
REDIS_CONNECT_TIMEOUT_MS=2000
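
A minimal sketch of how the numeric settings above could be read and validated at startup. The names loadConfig, readInt, and DEFAULTS are hypothetical, and using the listed values as fallbacks when a variable is unset is an assumption; the project may handle missing values differently.

```javascript
// Hypothetical sketch: fallback values mirror the listing above; treating
// them as defaults when a variable is unset is an assumption.
const DEFAULTS = {
  PORT: 3020,
  SEARCH_CACHE_TTL_SECONDS: 60,
  SEARCH_QUERY_MAX_LENGTH: 200,
  SEARCH_DEFAULT_PAGE: 1,
  SEARCH_DEFAULT_SIZE: 10,
  SEARCH_MAX_SIZE: 50,
  DOCUMENT_CACHE_TTL_SECONDS: 300,
  TENANT_RATE_LIMIT_PER_MINUTE: 100,
  DOCUMENT_RATE_LIMIT_PER_MINUTE: 30,
  RATE_LIMIT_WINDOW_SECONDS: 60,
};

// Parse one setting as a positive integer, falling back when it is unset.
function readInt(env, name) {
  const raw = env[name];
  if (raw === undefined || raw === "") return DEFAULTS[name];
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env = process.env) {
  const config = {};
  for (const name of Object.keys(DEFAULTS)) {
    config[name] = readInt(env, name);
  }
  return config;
}

console.log(loadConfig({ SEARCH_MAX_SIZE: "25" }).SEARCH_MAX_SIZE);
```

Failing fast on a malformed value surfaces a bad .env at startup rather than as odd paging or rate-limit behavior later.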

Authentication

Protected endpoints require both headers:

Authorization: Bearer <token>
X-Tenant-Id: <tenantId>

Available prototype tokens:

tenant-a-reader-token
tenant-a-writer-token
tenant-b-reader-token
tenant-b-writer-token

Role behavior:

reader: GET /search, GET /documents/:id
writer: POST /documents, GET /search, GET /documents/:id, DELETE /documents/:id
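
The header and role rules above can be sketched as a single check. This is an illustrative sketch only: the token table, the tenant ID tenant-b, the function name authorize, and the 401/403 status codes are assumptions, not behavior confirmed by the project source.

```javascript
// Hypothetical token table; tenant IDs are inferred from the token names.
const TOKENS = new Map([
  ["tenant-a-reader-token", { tenantId: "tenant-a", role: "reader" }],
  ["tenant-a-writer-token", { tenantId: "tenant-a", role: "writer" }],
  ["tenant-b-reader-token", { tenantId: "tenant-b", role: "reader" }],
  ["tenant-b-writer-token", { tenantId: "tenant-b", role: "writer" }],
]);

// Role behavior as listed in this section.
const ROLE_PERMISSIONS = {
  reader: new Set(["GET /search", "GET /documents/:id"]),
  writer: new Set([
    "POST /documents",
    "GET /search",
    "GET /documents/:id",
    "DELETE /documents/:id",
  ]),
};

// Status codes are assumptions for this sketch.
function authorize({ authorization, tenantHeader, method, route }) {
  const match = /^Bearer (.+)$/.exec(authorization || "");
  const identity = match ? TOKENS.get(match[1]) : undefined;
  if (!identity) return { ok: false, status: 401 };
  // The token's tenant must match the X-Tenant-Id header.
  if (identity.tenantId !== tenantHeader) return { ok: false, status: 403 };
  if (!ROLE_PERMISSIONS[identity.role].has(`${method} ${route}`)) {
    return { ok: false, status: 403 };
  }
  return { ok: true, tenantId: identity.tenantId, role: identity.role };
}

console.log(authorize({
  authorization: "Bearer tenant-a-reader-token",
  tenantHeader: "tenant-a",
  method: "POST",
  route: "/documents",
}));
```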

API Examples

GET /health

curl -i http://localhost:3020/health

GET /metrics

curl -i http://localhost:3020/metrics

POST /documents

curl -sS -X POST http://localhost:3020/documents \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tenant-a-writer-token" \
  -H "X-Tenant-Id: tenant-a" \
  -d '{
    "title": "Contract Renewal Policy",
    "content": "The contract renewal process begins 60 days before expiry.",
    "metadata": {
      "department": "legal",
      "category": "contracts"
    }
  }'

GET /search

curl -sS "http://localhost:3020/search?q=contract&page=1&size=10" \
  -H "Authorization: Bearer tenant-a-reader-token" \
  -H "X-Tenant-Id: tenant-a"
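
The q, page, and size parameters relate to the SEARCH_QUERY_MAX_LENGTH, SEARCH_DEFAULT_PAGE, SEARCH_DEFAULT_SIZE, and SEARCH_MAX_SIZE settings. A minimal sketch of one way to apply them follows; the function names are hypothetical, and whether the service clamps an oversized size or rejects it is an assumption (this sketch clamps).

```javascript
// Hypothetical sketch: limits mirror the settings listed under Requirements.
const LIMITS = { maxLength: 200, defaultPage: 1, defaultSize: 10, maxSize: 50 };

// Parse a positive integer, returning the fallback for missing or bad input.
function toPositiveInt(raw, fallback) {
  const n = Number.parseInt(raw, 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

function normalizeSearchParams(query, limits = LIMITS) {
  const q = typeof query.q === "string" ? query.q.trim() : "";
  if (q.length === 0) return { ok: false, error: "q is required" };
  if (q.length > limits.maxLength) return { ok: false, error: "q is too long" };
  const page = toPositiveInt(query.page, limits.defaultPage);
  // Assumption: oversized page sizes are clamped rather than rejected.
  const size = Math.min(toPositiveInt(query.size, limits.defaultSize), limits.maxSize);
  return { ok: true, q, page, size };
}

console.log(normalizeSearchParams({ q: "contract", page: "1", size: "500" }));
```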

GET /documents/:id

curl -sS http://localhost:3020/documents/<id> \
  -H "Authorization: Bearer tenant-a-reader-token" \
  -H "X-Tenant-Id: tenant-a"

DELETE /documents/:id

curl -sS -X DELETE http://localhost:3020/documents/<id> \
  -H "Authorization: Bearer tenant-a-writer-token" \
  -H "X-Tenant-Id: tenant-a"

Demo Data

Seed repeatable demo content for both tenants:

npm run seed

The seed script uses the real HTTP API, prints the created document IDs, and exits non-zero if any document fails to index.

Load Testing

Run a local concurrency test against search:

npm run load-test

Optional document GET test:

DOCUMENT_ID=<id> npm run load-test

Useful overrides:

API_BASE_URL=http://localhost:3020
TENANT_ID=tenant-a
TOKEN=tenant-a-reader-token
SEARCH_QUERY=contract
CONNECTIONS=50
DURATION_SECONDS=30

Load Test Rate-Limit Clarification

The default tenant rate limit is 100 requests/minute.
autocannon can exceed that limit quickly, even with a low CONNECTIONS value.
429 responses during load tests are expected unless local limits are raised.
For raw latency measurement, temporarily raise:
- TENANT_RATE_LIMIT_PER_MINUTE
- DOCUMENT_RATE_LIMIT_PER_MINUTE
For abuse-protection verification, keep or lower limits and observe RATE_LIMITED responses.
Local load tests demonstrate methodology and baseline behavior, not 10M-document production scale.
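
To make the 429 behavior concrete, here is a minimal fixed-window limiter sketch. It keeps counters in process memory as a stand-in for Redis, and the fixed-window algorithm itself is an assumption; the service's actual Redis-backed algorithm and key format may differ. The name createFixedWindowLimiter is hypothetical.

```javascript
// Hypothetical sketch: in-memory stand-in for a Redis-backed limiter.
function createFixedWindowLimiter({ limit, windowSeconds }) {
  const counters = new Map(); // key -> { windowStart, count }
  const windowMs = windowSeconds * 1000;

  return function check(key, nowMs = Date.now()) {
    // Align every request to the start of its window.
    const windowStart = Math.floor(nowMs / windowMs) * windowMs;
    let entry = counters.get(key);
    if (!entry || entry.windowStart !== windowStart) {
      entry = { windowStart, count: 0 };
      counters.set(key, entry);
    }
    entry.count += 1;
    return {
      allowed: entry.count <= limit,
      remaining: Math.max(0, limit - entry.count),
      retryAfterSeconds: Math.ceil((windowStart + windowMs - nowMs) / 1000),
    };
  };
}

// With the default limit, request 101 inside one window would map to a 429.
const check = createFixedWindowLimiter({ limit: 100, windowSeconds: 60 });
let last;
for (let i = 0; i < 101; i += 1) last = check("tenant-a", 1000);
console.log(last.allowed);
```

With 50 connections, a load generator exhausts a budget of 100 requests almost immediately, which is why the limits need raising for latency-only runs.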

Testing Strategy

Quick review path:

Local functional checks cover /health, /metrics, auth, tenant isolation, create/search/get/delete, and soft delete.
npm run seed loads repeatable demo data for both tenants without destructive cleanup.
Cache checks verify X-Cache: MISS and X-Cache: HIT for search and document GET.
Rate-limit checks verify 429 RATE_LIMITED and rate_limited_total.
Observability checks verify structured logs, requestId, and the expected metric families.
npm run load-test provides a local concurrency baseline with autocannon.
Production-scale strategy and pass/fail criteria are documented separately in the Testing Strategy document.
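
The X-Cache checks above follow the usual cache-aside pattern: the first request misses and populates the cache, and a repeat within the TTL hits. The sketch below uses an in-memory Map with expiry as a stand-in for Redis; the names createTtlCache and cachedLookup and the cache key format are assumptions for illustration.

```javascript
// Hypothetical sketch: an in-memory TTL cache standing in for Redis.
function createTtlCache(ttlSeconds) {
  const entries = new Map(); // key -> { value, expiresAt }
  return {
    get(key, nowMs = Date.now()) {
      const entry = entries.get(key);
      if (!entry || entry.expiresAt <= nowMs) {
        entries.delete(key);
        return undefined;
      }
      return entry.value;
    },
    set(key, value, nowMs = Date.now()) {
      entries.set(key, { value, expiresAt: nowMs + ttlSeconds * 1000 });
    },
  };
}

// Return the body plus the value the X-Cache response header would carry.
function cachedLookup(cache, key, load, nowMs = Date.now()) {
  const cached = cache.get(key, nowMs);
  if (cached !== undefined) return { xCache: "HIT", body: cached };
  const body = load();
  cache.set(key, body, nowMs);
  return { xCache: "MISS", body };
}

// Assumed key format: the tenant ID is part of the key so tenants never share entries.
const searchCache = createTtlCache(60);
const key = "search:tenant-a:contract:1:10";
console.log(cachedLookup(searchCache, key, () => ({ hits: [] })).xCache);
console.log(cachedLookup(searchCache, key, () => ({ hits: [] })).xCache);
```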

Metrics

The local prototype exposes Prometheus text metrics at:

GET /metrics

Example:

curl http://localhost:3020/metrics

Custom metric families:

http_requests_total
http_request_duration_seconds
cache_hits_total
cache_misses_total
rate_limited_total
documents_indexed_total
documents_deleted_total
search_requests_total
search_duration_seconds

Label safety:

Metrics avoid raw search query text.
Metrics avoid document IDs.
Metrics avoid request IDs.
Metrics avoid Authorization headers or bearer tokens.
Metrics avoid Redis cache keys.
HTTP route labels use low-cardinality route patterns such as /documents/:id.
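
One way to keep route labels low-cardinality is to map each request path to a fixed pattern before recording a metric. The sketch below is illustrative; the function name routeLabel and the "unmatched" fallback label are assumptions, not taken from the project source.

```javascript
// Hypothetical sketch: map concrete paths to the fixed route patterns
// this README documents, so labels never contain IDs or query text.
const ROUTE_PATTERNS = [
  { pattern: /^\/documents\/[^/]+$/, label: "/documents/:id" },
  { pattern: /^\/documents$/, label: "/documents" },
  { pattern: /^\/search$/, label: "/search" },
  { pattern: /^\/health$/, label: "/health" },
  { pattern: /^\/metrics$/, label: "/metrics" },
];

function routeLabel(url) {
  // Drop the query string so search text never reaches a label.
  const path = url.split("?")[0];
  const match = ROUTE_PATTERNS.find(({ pattern }) => pattern.test(path));
  // Unknown paths share one label instead of creating a new series each.
  return match ? match.label : "unmatched";
}

console.log(routeLabel("/documents/abc123"));
console.log(routeLabel("/search?q=contract&page=1"));
```

Collapsing unknown paths into a single label also bounds the number of time series that scanners or mistyped URLs can create.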

Architecture

Design notes and operational assumptions are documented separately.

Repository Layout

.
├── docker-compose.yml
├── Dockerfile
├── docs/
├── scripts/
├── src/
└── package.json

Reviewer Notes

/health is public and should report healthy when dependencies are available.
/metrics is public and returns Prometheus-formatted metrics.
Cache behavior uses X-Cache: HIT and X-Cache: MISS.
Tenant mismatches are rejected by the auth layer.
Logs are structured and should not leak bearer tokens or document content.
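
As an illustration of the last note, a structured logger can redact sensitive fields before serializing. This is a minimal sketch under assumed names: redact, logLine, and the list of redacted keys are hypothetical and do not describe the project's actual logger.

```javascript
// Hypothetical sketch: keys treated as sensitive are an assumption.
const REDACTED_KEYS = new Set(["authorization", "token", "content"]);

// Recursively copy a value, replacing sensitive fields with a marker.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [
        key,
        REDACTED_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : redact(inner),
      ])
    );
  }
  return value;
}

// Produce one JSON log line per event.
function logLine(level, message, fields = {}) {
  return JSON.stringify({ level, message, ...redact(fields) });
}

console.log(logLine("info", "request completed", {
  requestId: "req-1",
  headers: { Authorization: "Bearer tenant-a-writer-token" },
  body: { title: "Contract Renewal Policy", content: "The contract renewal process..." },
}));
```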

Manual Verification Checklist

AI Usage Note

AI tools were used to assist with architecture brainstorming, implementation scaffolding, documentation organization, and test planning. Final design decisions, code review, and validation were performed by me.
