A real-time market data aggregator built with Spring Boot, Apache Kafka, Kafka Streams, and MongoDB.
This project simulates live stock tick data, processes it in-flight using Kafka Streams to compute indicators like RSI, stores results in MongoDB, and uses MongoDB aggregation pipelines to analyze historical trends.
This is the completed app from the tutorial:
Building a Real-Time Market Data Aggregator with Kafka and MongoDB
End-to-end pipeline:
- Simulate live market tick data and publish it to Kafka.
- Process the stream using Kafka Streams (for example, computing RSI).
- Persist both raw and derived data into MongoDB.
- Query historical data using MongoDB aggregation pipelines for analytics.
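The RSI step above can be sketched in plain Java. This is an illustrative version using a simple average of gains and losses over the window; the class and method names are assumptions, not necessarily what the tutorial's Kafka Streams processor uses.

```java
import java.util.List;

public class RsiSketch {

    // Computes RSI over a window of closing prices, using a simple
    // average of gains and losses across the whole window.
    static double rsi(List<Double> prices) {
        double gains = 0, losses = 0;
        for (int i = 1; i < prices.size(); i++) {
            double change = prices.get(i) - prices.get(i - 1);
            if (change > 0) gains += change; else losses -= change;
        }
        if (losses == 0) return 100.0;       // all gains -> max RSI
        double rs = gains / losses;          // relative strength
        return 100.0 - 100.0 / (1.0 + rs);   // RSI formula
    }

    public static void main(String[] args) {
        // Equal gains and losses -> RSI of exactly 50
        System.out.println(rsi(List.of(100.0, 101.0, 100.0, 101.0, 100.0))); // prints 50.0
    }
}
```

In the real pipeline this math runs per symbol over a sliding window of ticks inside the stream processor, rather than over a static list.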
Tech stack:
- Java 21
- Spring Boot
- Spring for Apache Kafka
- Kafka Streams
- MongoDB Atlas (or local MongoDB)
Prerequisites:
- Java 21+
- Maven 3.9+
- Kafka 3.5+ (local install or Docker)
- MongoDB Atlas cluster (M0 works) or local MongoDB
Update src/main/resources/application.properties with your MongoDB connection string and database name:
```properties
spring.data.mongodb.uri=<YOUR_MONGODB_CONNECTION_STRING>
spring.data.mongodb.database=<YOUR_DATABASE_NAME>
```

Configure the Kafka broker (default local):

```properties
spring.kafka.bootstrap-servers=localhost:9092
```

Create and start Kafka using whatever setup you prefer (local install or Docker). Once Kafka is running, create the required topic(s) used by the app.
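If you go the Docker route, one minimal option (an assumption, not the only valid setup) is the official `apache/kafka` image, which starts a single-node KRaft broker with default settings:

```shell
docker run -d --name kafka -p 9092:9092 apache/kafka:3.7.0
```

This exposes the broker on `localhost:9092`, matching the default `spring.kafka.bootstrap-servers` above.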
Example:
```shell
bin/kafka-topics.sh --create \
  --topic market-data \
  --bootstrap-server localhost:9092 \
  --partitions 1 \
  --replication-factor 1
```

If your app uses different topic names, update them accordingly.
From the project root:

```shell
mvn clean spring-boot:run
```

You should see logs indicating:
- simulated market data is being produced
- Kafka Streams is processing events
- computed indicators are being written to MongoDB
You can expect documents representing:
- raw tick data (symbol, price, timestamp)
- derived indicator values (for example RSI windows)
Exact fields depend on the implementation in the tutorial.
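As a rough illustration, the stored documents might look like the following. These shapes are hypothetical, chosen to match the fields listed above; the tutorial's actual schema may differ:

```json
{ "symbol": "AAPL", "price": 189.42, "timestamp": "2024-05-01T14:30:00Z" }
{ "symbol": "AAPL", "indicator": "RSI", "value": 62.5, "windowEnd": "2024-05-01T14:30:00Z" }
```

The first document is a raw tick; the second is a derived indicator value emitted by the Kafka Streams processor.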
This demo is intentionally simple and designed to show the full loop:
Kafka -> Kafka Streams -> MongoDB
For production-like workloads, you would typically:
- increase topic partitions for parallelism
- tune Kafka Streams state stores and commit intervals
- use MongoDB indexes appropriate for query patterns
- consider time series collections if you are not using Search indexes
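The last two points can be sketched in mongosh. The collection name `ticks` and the field names `timestamp` and `symbol` are assumptions for illustration:

```javascript
// Time series collection keyed on the tick timestamp, grouped by symbol
db.createCollection("ticks", {
  timeseries: { timeField: "timestamp", metaField: "symbol", granularity: "seconds" }
})

// Compound index matching a common "recent prices for a symbol" query pattern
db.ticks.createIndex({ symbol: 1, timestamp: -1 })
```

Time series collections give MongoDB storage and query optimizations for append-heavy, time-ordered data like market ticks.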