🚶‍♂️ MOTS_Mask-RCNN

📖 Project Description

This project implements a person tracking system that combines Faster R-CNN for object detection with a Siamese network for person re-identification (ReID). The system tracks pedestrians across the frames of a video sequence and is designed to cope with occlusions, lighting changes, and similar-looking individuals. The pipeline relies on fine-tuned deep learning models, data augmentation, and GPU-accelerated training.

✨ Key Features

Faster R-CNN with ResNet50 Backbone: Fine-tuned for pedestrian detection on the MOT16-02 sequence of MOT16, reaching a training loss of ~1.0065 after 10 epochs.
Siamese Network for Person Re-Identification: Learns feature embeddings that distinguish individuals across frames, trained with triplet loss and fine-tuned on Market1501.
Bounding Box Tracking with Motion Prediction: Predicts the next location of each track from the velocity of its previous bounding boxes, so tracks survive brief occlusions.
Similarity-Based Data Association: Combines IoU and embedding similarity scores to match identities between frames.
GPU-Accelerated Training: Trains on CUDA with mixed precision and gradient accumulation, which lowers memory use and allows larger effective batch sizes.
Augmented Datasets for Robustness: Applies color jitter, Gaussian blur, and brightness variation so the models generalize across lighting and visual conditions.
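
The motion-prediction feature listed above can be illustrated with a constant-velocity model. This is a minimal sketch, not the project's tracker: it assumes boxes are stored as (x1, y1, x2, y2) pixel tuples and that velocity is taken from the centers of the last two boxes. The names `box_center` and `predict_next_box` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels (assumed layout)


def box_center(box: Box) -> Tuple[float, float]:
    """Return the (cx, cy) center of a box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def predict_next_box(history: List[Box]) -> Box:
    """Shift the latest box by the center velocity of the last two frames.

    With fewer than two observations there is no velocity estimate, so the
    latest box is returned unchanged.
    """
    if not history:
        raise ValueError("history must contain at least one box")
    last = history[-1]
    if len(history) < 2:
        return last
    (px, py), (cx, cy) = box_center(history[-2]), box_center(last)
    vx, vy = cx - px, cy - py  # displacement in pixels per frame
    x1, y1, x2, y2 = last
    return (x1 + vx, y1 + vy, x2 + vx, y2 + vy)


# A track that moved 4 px right and 1 px down between two frames.
track = [(100.0, 50.0, 140.0, 150.0), (104.0, 51.0, 144.0, 151.0)]
print(predict_next_box(track))  # (108.0, 52.0, 148.0, 152.0)
```

During an occlusion, a tracker built this way can keep calling `predict_next_box` on its own predictions until a detection matches again. How many frames to coast before dropping the track is a tuning choice this README does not specify.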

🧠 Tech Stack

Category	Technologies
Programming Language	Python 3.10+
Deep Learning Framework	PyTorch, Torchvision
Computer Vision	OpenCV, PIL
Data Handling & Visualization	Pandas, NumPy, scikit-image, Matplotlib
Datasets	MOT16, Market1501
Hardware	CUDA-enabled GPU

🎥 Project Demo

Tracking_video_MOTS.mp4

🏗️ Project Architecture

📂 Project Root

├── Faster_RCNN_GPU.py              # Training script for Faster R-CNN on MOT16
├── Siamese_network.py              # Core Siamese model (Triplet-based ReID)
├── siamese_network_final_v2_prerit.py  # Advanced Siamese training with augmentation
├── inference_test.py               # Inference and tracking pipeline
├── Project Report.pdf              # System design, methodology, and results
└── outputs/
    ├── faster_rcnn_mots16.pth      # Trained detection model
    ├── finetuned_siamese_model_test.pth  # Trained re-ID model
    └── tracked_output_test.mp4     # Output tracking video

🧩 Pipeline Overview

1. Detection (Faster R-CNN): Detects pedestrians in each frame.
2. Feature Embedding (Siamese Network): Extracts a 256-dimensional feature vector per detection.
3. Similarity Calculation: Compares embeddings between frames to match identities.
4. Bounding Box Prediction (Tracker): Predicts next-frame positions using velocity estimation.
5. Data Association: Combines IoU and embedding similarity to maintain consistent IDs.
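
The data association step can be sketched as a weighted sum of IoU and cosine similarity followed by greedy matching. The README does not state how the two scores are combined, so the weighting, the `min_score` threshold, the greedy strategy, and the function names below are all assumptions for illustration. The example uses 2-dimensional embeddings for readability; the pipeline's embeddings are 256-dimensional.

```python
import math
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


def associate(
    tracks: Dict[int, Tuple[Box, Sequence[float]]],
    detections: List[Tuple[Box, Sequence[float]]],
    iou_weight: float = 0.5,   # assumed weighting, not from the project
    min_score: float = 0.3,    # assumed acceptance threshold
) -> Dict[int, int]:
    """Greedily match detections to tracks; returns {detection_index: track_id}."""
    scored = []
    for track_id, (t_box, t_emb) in tracks.items():
        for det_idx, (d_box, d_emb) in enumerate(detections):
            score = (iou_weight * iou(t_box, d_box)
                     + (1.0 - iou_weight) * cosine_similarity(t_emb, d_emb))
            if score >= min_score:
                scored.append((score, track_id, det_idx))
    scored.sort(reverse=True)  # best-scoring pairs first
    matches: Dict[int, int] = {}
    used_tracks = set()
    for _, track_id, det_idx in scored:
        if track_id in used_tracks or det_idx in matches:
            continue  # each track and each detection is used at most once
        matches[det_idx] = track_id
        used_tracks.add(track_id)
    return matches


tracks = {1: ((0, 0, 10, 10), [1.0, 0.0]), 2: ((20, 0, 30, 10), [0.0, 1.0])}
detections = [((21, 0, 31, 10), [0.1, 0.9]), ((1, 0, 11, 10), [0.9, 0.1])]
print(associate(tracks, detections))
```

Detections left out of the returned mapping would start new tracks in a full tracker. Greedy matching is the simplest option; an optimal assignment (for example the Hungarian algorithm) is a common alternative when many people overlap.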

⚙️ Getting Started

🧰 Prerequisites

Ensure you have the following installed:

Python ≥ 3.10
CUDA-compatible GPU
PyTorch ≥ 2.0
Torchvision ≥ 0.15
OpenCV ≥ 4.5

🔧 Installation

# Clone the repository
git clone https://github.com/YOUR_USERNAME/mots-person-tracking.git
cd mots-person-tracking

# (Optional) Create a virtual environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Install dependencies
pip install torch torchvision opencv-python pandas scikit-image tqdm matplotlib

🔐 Configuration

Create a .env file in the project root to specify dataset and output paths:

# Dataset and model paths
MOT16_TRAIN_DIR=D:\Path\To\MOT16\train\MOT16-02
MARKET1501_DIR=D:\Path\To\Market1501
OUTPUT_VIDEO=tracked_output_test.mp4
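
How the scripts read this file is not described here, and the install command above does not include a dotenv package. As a hedged illustration, the sketch below parses `KEY=VALUE` lines with the standard library only; `load_env_file` and `get_setting` are hypothetical helpers, not functions from this repository.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments.

    Values are kept verbatim, so Windows backslashes stay literal.
    """
    values: Dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def get_setting(name: str, env_values: Dict[str, str],
                default: Optional[str] = None) -> Optional[str]:
    """Prefer a real environment variable, then the .env value, then the default."""
    return os.environ.get(name, env_values.get(name, default))


if __name__ == "__main__" and Path(".env").exists():
    settings = load_env_file(".env")
    print(get_setting("MOT16_TRAIN_DIR", settings))
    print(get_setting("OUTPUT_VIDEO", settings, "tracked_output_test.mp4"))
```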

🚀 Usage

1. Train the Detection Model

python Faster_RCNN_GPU.py

This fine-tunes a Faster R-CNN (ResNet50-FPN) on the MOT16-02 sequence and saves the trained weights as faster_rcnn_mots16.pth.

2. Train the Siamese Re-Identification Network

python siamese_network_final_v2_prerit.py

This trains and fine-tunes the Siamese network using triplet loss across multiple MOT16 sequences and Market1501.
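
For readers unfamiliar with triplet loss, the sketch below computes it for a single (anchor, positive, negative) triple in plain Python. It illustrates the general formula, max(0, d(a, p) - d(a, n) + margin), and is not the code from the training script; the Euclidean distance and the margin of 1.0 are assumptions. In PyTorch, `torch.nn.TripletMarginLoss` computes the same quantity over batches, though whether the script uses that class is not stated here.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 1.0) -> float:
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor = [0.0, 0.0]         # crop of person A
positive = [0.0, 1.0]       # another crop of person A, distance 1.0
easy_negative = [0.0, 3.0]  # person B, distance 3.0
hard_negative = [0.0, 1.5]  # person C, distance 1.5

print(triplet_loss(anchor, positive, easy_negative))  # 0.0
print(triplet_loss(anchor, positive, hard_negative))  # 0.5
```

The easy negative is already more than the margin farther away than the positive, so it contributes no loss. Only the hard negative produces a training signal, pushing the embedding of person C away from the anchor.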

3. Run Inference and Tracking

python inference_test.py

The script will:

Detect people per frame
Match them using re-ID embeddings
Track them across frames
Export an annotated output video (tracked_output_test.mp4)

📊 Results Summary

Model	Dataset	Epochs	Train Loss	Valid Loss	Notes
Faster R-CNN	MOT16-02	10	1.0065	—	Fine-tuned from pretrained ResNet50
Siamese Network	Market1501 + MOT16	10	0.1401	0.2154	Fine-tuned with augmentation

🔮 Future Work

Train both models on the complete MOT16 dataset for broader scene understanding.
Optimize similarity thresholds for dynamic environments.
Integrate Kalman filters or DeepSORT for enhanced temporal consistency.
Explore multi-camera tracking and cross-view re-identification.
