Parallelize the ingester by turning each worker into a full ingestion pipeline #1673

@gustavobtflores

Description

Today, multiple workers parse and validate files, but all inserts are funneled through a single process, which becomes a bottleneck. This limits throughput and leaves workers idle during busy ingestion cycles.

We should evaluate shifting to a model where each worker performs the full pipeline (parse, validate, and insert) using its own database connection. This removes the single-process bottleneck and allows inserts to happen in parallel.
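
As a rough illustration of the proposed model (not the ingester's actual code), here is a minimal sketch in which every worker pulls file paths from a shared queue and runs parse, validate, and insert over a connection it opened itself. Everything in it is an assumption made for the example: the function names (`init_db`, `parse`, `validate`, `insert`, `worker`, `run_ingestion`), the `records` table, the JSON input format, the validation rule, the use of `sqlite3` as a stand-in database, and the use of threads in place of the real worker type.

```python
import json
import queue
import sqlite3
import threading


def init_db(db_path):
    """Create the (hypothetical) target table once, before workers start."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, payload TEXT)"
        )
        conn.commit()
    finally:
        conn.close()


def parse(path):
    """Parse step: assumes each input file is a JSON array of objects."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def validate(records):
    """Validate step: assumed rule, keep only objects that carry an "id"."""
    return [r for r in records if isinstance(r, dict) and "id" in r]


def insert(conn, records):
    """Insert step: one explicit transaction per file on the worker's connection."""
    rows = [(str(r["id"]), json.dumps(r)) for r in records]
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.executemany(
            "INSERT OR REPLACE INTO records (id, payload) VALUES (?, ?)", rows
        )
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise
    return len(rows)


def worker(db_path, tasks, inserted_counts):
    """Full pipeline in one worker, using a connection owned by this worker only."""
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        while True:
            try:
                path = tasks.get_nowait()
            except queue.Empty:
                return  # no files left
            inserted_counts.put(insert(conn, validate(parse(path))))
    finally:
        conn.close()


def run_ingestion(db_path, paths, num_workers=4):
    """Start the workers, wait for them, and return the number of rows inserted."""
    tasks = queue.Queue()
    for path in paths:
        tasks.put(path)
    inserted_counts = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(db_path, tasks, inserted_counts))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    total = 0
    while not inserted_counts.empty():
        total += inserted_counts.get()
    return total
```

Calling `init_db(db_path)` once and then `run_ingestion(db_path, paths, num_workers=4)` exercises the whole flow. Note that SQLite still serializes writers internally, so this sketch only shows the structure (no dedicated insert process, one connection per worker). Whether inserts actually run in parallel depends on the real database and would need to be measured as part of the evaluation.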

Metadata

Labels

Ingester: The issue relates to the ingester tool, including the command itself and related functions.

Projects

Status

In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
