From fa255418536bc19a92819c0a66c8c2e1347c55c5 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Wed, 18 Mar 2026 23:31:31 +0530
Subject: [PATCH 1/8] Update README.md

---
 Colorize Black & white images [OPEN CV]/models/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Colorize Black & white images [OPEN CV]/models/README.md b/Colorize Black & white images [OPEN CV]/models/README.md
index c7c3975..75d2fa9 100644
--- a/Colorize Black & white images [OPEN CV]/models/README.md	
+++ b/Colorize Black & white images [OPEN CV]/models/README.md	
@@ -1 +1 @@
-# download the model file from [here](https://drive.google.com/file/d/14YmdCfcMOgfJEBNJEl6Xj1SB-RccgJBO/view?usp=sharing) and add them to this folder in order to run this project.
+# download the model file from [https://huggingface.co/spaces/BilalSardar/Black-N-White-To-Color/blob/main/colorization_release_v2.caffemodel](https://huggingface.co/spaces/BilalSardar/Black-N-White-To-Color/blob/main/colorization_release_v2.caffemodel) and add them to this folder in order to run this project.

From 4ecc8e3e26af902cedfca46bd7b8b3acaf98f019 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Wed, 18 Mar 2026 23:34:26 +0530
Subject: [PATCH 2/8] Update README.md

---
 .../README.md                                 | 508 +++++++-----------
 1 file changed, 202 insertions(+), 306 deletions(-)

diff --git a/Colorize Black & white images [OPEN CV]/README.md b/Colorize Black & white images [OPEN CV]/README.md
index 986ad44..e8f7dac 100644
--- a/Colorize Black & white images [OPEN CV]/README.md	
+++ b/Colorize Black & white images [OPEN CV]/README.md	
@@ -1,386 +1,282 @@
-<div class="cell markdown" id="ONf1qRd1K7J7">
+<div align="center">
 
-# Colorize Black white Image
+# 🎨 Colorize Black & White Images — OpenCV Deep Learning
 
-This Deep Learning Project aims to provide colorizing black & white
-images with Python.
+[![Python](https://img.shields.io/badge/Python-3.7+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
+[![OpenCV](https://img.shields.io/badge/OpenCV-DNN-5C3EE8?style=for-the-badge&logo=opencv&logoColor=white)](https://opencv.org/)
+[![Caffe](https://img.shields.io/badge/Caffe-Pre--trained%20Model-red?style=for-the-badge)](http://caffe.berkeleyvision.org/)
+[![Tkinter](https://img.shields.io/badge/Tkinter-GUI%20App-blue?style=for-the-badge)](https://docs.python.org/3/library/tkinter.html)
+[![License](https://img.shields.io/badge/License-MIT-1abc9c?style=for-the-badge)](../LICENSE.md)
 
-In image colorization, we take a black and white image as input and
-produce a colored image. We will solve this project with OpenCV deep
-neural network.
+> Automatically colorizes **black & white images** using a pre-trained deep learning model loaded via **OpenCV DNN** — wrapped in a clean **Tkinter desktop GUI** where you upload an image and get a colorized result instantly.
 
-</div>
-
-<div class="cell markdown" id="pasjVk5WRXMM">
-
-<img src="Animation.gif" />
-
-</div>
-
-<div class="cell markdown" id="Tb-GeIj8Nl6Y">
-
-### Lab Color Space:
-
-Like RGB, Lab is another color space. It is also three channel color
-space like RGB where the channels are:
-
-    L channel: This channel represents the Lightness
-    a channel: This channel represents green-red
-    b channel: This channel represents blue-yellow
-
-In this color space, the grayscale part of the image is only encoded in
-L channel. Therefore Lab color space is more favorable for our project.
+[🔙 Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
 
 </div>
 
-<div class="cell markdown" id="t25jOjTGNpQf">
-
-### Problem Statement:
+---
 
-deep learning project colorize black white images with python
+## 📌 Table of Contents
 
-We can formulate our problem statement as to predict a and b channels,
-given an input grayscale image.
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [The Science — Lab Color Space](#-the-science--lab-color-space)
+- [The Model — Zhang et al. 2016](#-the-model--zhang-et-al-2016)
+- [Model Files](#-model-files)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [App Preview](#-app-preview)
+- [Tech Stack](#-tech-stack)
+- [References & Citation](#-references--citation)
 
-In this deep learning project, we will use OpenCV DNN architecture which
-is trained on ImageNet dataset. The neural net is trained with the L
-channel of images as input data and a,b channels as target data.
+---
 
-</div>
+## 🔬 About the Project
 
-<div class="cell markdown" id="JF04ygEWN1Dg">
+Manually colorizing historical black & white photographs is an extremely time-consuming artistic process. This project automates it entirely using a **Convolutional Neural Network** trained to "hallucinate" plausible colors for any grayscale input — from old family photos to historical images.
 
-#### Steps to implement Image Colorization Project:
+Rather than training a model from scratch, the project loads **Richard Zhang et al.'s 2016 pre-trained Caffe model** directly through **OpenCV's DNN module**, making inference fast and dependency-light. The entire experience is wrapped in a **Tkinter GUI** where users upload a grayscale image and receive a colorized version with a single click.
 
-For colorizing black and white images we will be using a pre-trained
-caffe model, a prototxt file, and a NumPy file.
+**What this project covers:**
+- Understanding Lab color space and why it is ideal for colorization
+- Loading and running a pre-trained Caffe model via OpenCV DNN
+- Image preprocessing: RGB → Lab, extracting the L channel as input
+- Post-processing: merging predicted `ab` channels back with `L`, converting Lab → BGR
+- Building a desktop GUI with Tkinter for real-time image upload and display
 
-The prototxt file defines the network and the numpy file stores the
-cluster center points in numpy format.
+---
 
-1.  Make a directory with name models.
-
-</div>
+## ⚙️ How It Works
 
-<div class="cell code" data-execution_count="1" id="iyWZYzh65gX2">
-
-``` python
-!mkdir models
 ```
-
-</div>
-
-<div class="cell markdown" id="aTQIFk1MN8-U">
-
-download the caffemodel, prototxt file and the NumPy file.
-
-</div>
-
-<div class="cell code" data-execution_count="3" data-colab="{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}" id="lE0XfKuP5kZd" data-outputId="4d70b345-f785-43b1-9d29-88f4c482f8ed">
-
-``` python
-!wget https://github.com/richzhang/colorization/blob/caffe/colorization/resources/pts_in_hull.npy?raw=true -O ./pts_in_hull.npy
+Input: Grayscale / B&W Image
+              │
+              ▼
+   Convert: BGR → RGB → Lab
+              │
+              ▼
+   Extract L channel (lightness only)
+   Resize to 224 × 224
+              │
+              ▼
+   OpenCV DNN Forward Pass
+   (Zhang et al. Caffe model)
+              │
+              ▼
+   Predict ab channels
+   (313 quantized color bins → soft-decoded to ab)
+              │
+              ▼
+   Resize predicted ab → original image size
+              │
+              ▼
+   Concatenate: L (original) + ab (predicted)
+              │
+              ▼
+   Convert: Lab → BGR
+   Clip values to [0, 1], scale to [0, 255]
+              │
+              ▼
+   Output: Colorized Image → Display in GUI / Save
 ```
 
-</div>
-
-<div class="cell code" data-execution_count="4" data-colab="{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}" id="fLpvCltE5u72" data-outputId="567f403e-e26c-4bfd-9c06-c938b5ba4743">
+---
 
-``` python
-!wget https://raw.githubusercontent.com/richzhang/colorization/caffe/colorization/models/colorization_deploy_v2.prototxt -O ./models/colorization_deploy_v2.prototxt
-```
+## 🎨 The Science — Lab Color Space
 
+This project uses the **Lab color space** rather than the familiar RGB. Here's why it matters:
 
+| Channel | Represents | Role in This Project |
+|---------|-----------|---------------------|
+| **L** | Lightness (0 = black, 100 = white) | Input to the model — this IS the grayscale image |
+| **a** | Green ↔ Red axis | Predicted by the neural network |
+| **b** | Blue ↔ Yellow axis | Predicted by the neural network |
 
-</div>
+**The key insight:** In Lab, grayscale information is *entirely* encoded in the `L` channel. Color information lives only in `a` and `b`. This means the model only needs to learn to predict two channels from one — a much cleaner problem than mapping RGB to RGB.
 
-<div class="cell code" data-execution_count="5" data-colab="{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}" id="OmIJf0BI7acl" data-outputId="ad130ef0-f3bf-4730-c9af-415d9831c25f">
-
-``` python
-!wget http://eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/demo_v2/colorization_release_v2.caffemodel -O ./models/colorization_release_v2.caffemodel
 ```
-
-</div>
-
-<div class="cell markdown" id="hu-I5bAeOCHp">
-
-### Import Essential Library
-
-</div>
-
-<div class="cell code" data-execution_count="12" id="RUWZq8Sq7g4m">
-
-``` python
-import numpy as np
-import cv2 as cv
-from matplotlib import pyplot as plt
-import os.path
+Grayscale Image = L channel
+                  │
+         ┌────────┴────────┐
+         ▼                 ▼
+  Neural Network      (kept as-is)
+  predicts: a, b          L
+         │                 │
+         └────────┬────────┘
+                  ▼
+           Lab image → BGR
+           = Colorized Output
 ```
 
-</div>
+---
 
-<div class="cell markdown" id="lwURjJ_IOFyk">
+## 🧠 The Model — Zhang et al. 2016
 
-### Read B\&W image and load the caffemodel:
+The colorization model is from the landmark 2016 ECCV paper **"Colorful Image Colorization"** by Richard Zhang, Phillip Isola, and Alexei A. Efros (UC Berkeley).
 
-</div>
+**Key design choices in the paper:**
 
-<div class="cell code" data-execution_count="14" data-colab="{&quot;height&quot;:269,&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}" id="qm-fStTe7ybo" data-outputId="a2542c6b-f748-4ad1-9313-149c8b3cf28c">
+| Aspect | Detail |
+|--------|--------|
+| **Training data** | 1.3M images from ImageNet (Lab converted) |
+| **Input** | L channel (grayscale), resized to 224×224 |
+| **Output** | Predicted `ab` channels over 313 quantized color bins |
+| **Loss function** | Multinomial cross-entropy with rebalanced class weights (to prevent desaturated outputs) |
+| **Architecture** | Deep CNN with 8 conv blocks, no pooling — uses dilated convolutions to preserve spatial resolution |
+| **Color decoding** | Annealed-mean of the 313 bin distribution (avoids washed-out grays from using the mean) |
 
-``` python
-frame = cv.imread("new.jpg")
+> **Why 313 bins?** The `ab` color space is quantized into 313 bins with a grid size of 10. The model predicts a probability distribution over all 313 possible colors for each pixel, then decodes to a single `ab` value.
 
-numpy_file = np.load('./pts_in_hull.npy')
-Caffe_net = cv.dnn.readNetFromCaffe("./models/colorization_deploy_v2.prototxt", "./models/colorization_release_v2.caffemodel")
+---
 
+## 📦 Model Files
 
+Three files are required to run inference. They are **not included** in the repository due to size and must be downloaded separately:
 
-rgb_img = cv.cvtColor(frame, cv.COLOR_BGR2RGB)		# this converts it into RGB
-plt.imshow(rgb_img)
-plt.show()
-```
+| File | Description | Download |
+|------|-------------|---------|
+| `colorization_release_v2.caffemodel` | Pre-trained model weights (~125 MB) | [Berkeley EECS](http://eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/demo_v2/colorization_release_v2.caffemodel) |
+| `colorization_deploy_v2.prototxt` | Network architecture definition | [richzhang/colorization](https://raw.githubusercontent.com/richzhang/colorization/master/colorization/models/colorization_deploy_v2.prototxt) |
+| `pts_in_hull.npy` | 313 cluster center points in ab space | [richzhang/colorization](https://github.com/richzhang/colorization/blob/caffe/colorization/resources/pts_in_hull.npy?raw=true) |
 
-<div class="output display_data">
+Download all three with:
 
-![](input.png)
+```bash
+mkdir -p models
 
-</div>
+# Caffe model weights (~125 MB)
+wget http://eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/demo_v2/colorization_release_v2.caffemodel \
+     -O ./models/colorization_release_v2.caffemodel
 
-</div>
-
-<div class="cell markdown" id="qgVEsYfxONnb">
+# Network prototxt definition
+wget https://raw.githubusercontent.com/richzhang/colorization/master/colorization/models/colorization_deploy_v2.prototxt \
+     -O ./models/colorization_deploy_v2.prototxt
 
-### Add layers to the caffe model:
-
-</div>
-
-<div class="cell code" data-execution_count="9" id="f-UAR2AS72yi">
-
-``` python
-numpy_file = numpy_file.transpose().reshape(2, 313, 1, 1)
-Caffe_net.getLayer(Caffe_net.getLayerId('class8_ab')).blobs = [numpy_file.astype(np.float32)]
-Caffe_net.getLayer(Caffe_net.getLayerId('conv8_313_rh')).blobs = [np.full([1, 313], 2.606, np.float32)]
+# Cluster centers (ab quantization bins)
+wget https://github.com/richzhang/colorization/blob/caffe/colorization/resources/pts_in_hull.npy?raw=true \
+     -O ./pts_in_hull.npy
 ```
 
-</div>
-
-<div class="cell markdown" id="jxty2X4BORrv">
+---
 
-### Extract L channel and resize it:
+## 📁 Project Structure
 
-</div>
+```
+Colorize Black & white images [OPEN CV]/
+│
+├── 📂 models/
+│   ├── colorization_release_v2.caffemodel    # Pre-trained weights (download separately)
+│   └── colorization_deploy_v2.prototxt       # Network architecture
+│
+├── pts_in_hull.npy                           # 313 ab color bin cluster centers
+├── colorize.py                               # Core colorization logic (OpenCV DNN pipeline)
+├── gui.py                                    # Tkinter GUI application
+├── new.jpg                                   # Sample test image
+├── result.png                                # Sample colorized output
+├── requirements.txt                          # Python dependencies
+└── README.md                                 # You are here
+```
 
-<div class="cell code" data-execution_count="10" id="r4UdVyYx8l8N">
+---
 
-``` python
-input_width = 224
-input_height = 224
+## 🚀 Getting Started
 
-rgb_img = (frame[:,:,[2, 1, 0]] * 1.0 / 255).astype(np.float32)
-lab_img = cv.cvtColor(rgb_img, cv.COLOR_RGB2Lab)
-l_channel = lab_img[:,:,0] 
+### 1. Clone the repository
 
-l_channel_resize = cv.resize(l_channel, (input_width, input_height)) 
-l_channel_resize -= 50
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Colorize Black & white images [OPEN CV]"
 ```
 
-</div>
-
-<div class="cell markdown" id="KhrO4zrIOWPA">
+### 2. Set up environment
 
-### Predict the ab channel and save the result:
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux / macOS
+venv\Scripts\activate           # Windows
 
-</div>
+pip install -r requirements.txt
+```
 
-<div class="cell code" data-execution_count="11" data-colab="{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}" id="Quh9YGtL8oR5" data-outputId="450a4077-e0cb-4f13-e308-68977d7073e9">
+### 3. Download model files
 
-``` python
-Caffe_net.setInput(cv.dnn.blobFromImage(l_channel_resize))
-ab_channel = Caffe_net.forward()[0,:,:,:].transpose((1,2,0)) 
+Run the wget commands from the [Model Files](#-model-files) section above, or download manually and place them in `./models/`.
 
-(original_height,original_width) = rgb_img.shape[:2] 
-ab_channel_us = cv.resize(ab_channel, (original_width, original_height))
-lab_output = np.concatenate((l_channel[:,:,np.newaxis],ab_channel_us),axis=2) 
-bgr_output = np.clip(cv.cvtColor(lab_output, cv.COLOR_Lab2BGR), 0, 1)
+### 4. Run the GUI app
 
-cv.imwrite("./result.png", (bgr_output*255).astype(np.uint8))
+```bash
+python gui.py
 ```
 
-<div class="output execute_result" data-execution_count="11">
+This opens the Tkinter desktop window:
+- **File → Upload Image** — select any grayscale or black & white `.jpg` / `.png`
+- **File → Color Image** — run the colorization model and display the result
 
-    True
-
-</div>
+### 5. Run colorization directly (no GUI)
 
-</div>
-
-<div class="cell markdown" id="QP2B9ifWOY_o">
+```bash
+python colorize.py --image new.jpg
+# Outputs: result.png in the current directory
+```
 
-### Output
+---
 
-</div>
+## 🖥️ App Preview
 
-<div class="cell code" data-execution_count="15" data-colab="{&quot;height&quot;:269,&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}" id="l59rGCbJ8sk3" data-outputId="ae2952af-376b-40d3-e49d-832853706fb2">
-
-``` python
-frame1 = cv.imread("result.png")
-rgb_img = cv.cvtColor(frame1, cv.COLOR_BGR2RGB)		# this converts it into RGB
-plt.imshow(rgb_img)
-plt.show()
+```
+┌──────────────────────────────────────────────────┐
+│           B&W Image Colorization                 │
+│  File ▾                                          │
+│  ├── Upload Image                                │
+│  └── Color Image                                 │
+│                                                  │
+│  ┌───────────────────┐  ┌───────────────────┐    │
+│  │                   │  │                   │    │
+│  │   [B&W Input]     │  │  [Colorized Out]  │    │
+│  │                   │  │                   │    │
+│  └───────────────────┘  └───────────────────┘    │
+└──────────────────────────────────────────────────┘
 ```
 
-<div class="output display_data">
+---
 
-![](output.png)
+## 🛠️ Tech Stack
 
-</div>
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Computer Vision | OpenCV (`cv2.dnn`) |
+| Pre-trained Model | Caffe (Zhang et al. 2016) |
+| GUI Framework | Tkinter |
+| Numerical Computing | NumPy |
+| Visualization | Matplotlib |
 
-</div>
+---
 
-<div class="cell markdown" id="deB9kvi3OgAI">
+## 📚 References & Citation
 
-## Code for GUI:
+**Paper behind the model:**
 
-</div>
+```bibtex
+@inproceedings{zhang2016colorful,
+  title     = {Colorful Image Colorization},
+  author    = {Zhang, Richard and Isola, Phillip and Efros, Alexei A},
+  booktitle = {ECCV},
+  year      = {2016}
+}
+```
 
-<div class="cell code" data-execution_count="22" data-colab="{&quot;base_uri&quot;:&quot;https://localhost:8080/&quot;}" id="d4IxXTo-Akzj" data-outputId="f4d92e86-eab0-4066-e157-4ec730618d5d">
+- [Colorful Image Colorization — Zhang et al. (2016)](https://arxiv.org/abs/1603.08511)
+- [Official Demo & Model — richzhang/colorization](https://github.com/richzhang/colorization)
+- [OpenCV DNN colorization sample](https://github.com/opencv/opencv/blob/master/samples/dnn/colorization.py)
+- [PyImageSearch Tutorial — Adrian Rosebrock](https://pyimagesearch.com/2019/02/25/black-and-white-image-colorization-with-opencv-and-deep-learning/)
 
-``` python
-%%writefile gui.py
+---
 
-import tkinter as tk
-from tkinter import *
-from tkinter import filedialog
-from PIL import Image, ImageTk
-import os
-import numpy as np
-import cv2 as cv
-import os.path
-import matplotlib
-matplotlib.use('Agg')
-
-import sys
-import os
-
-if os.environ.get('DISPLAY','') == '':
-    print('no display found. Using :0.0')
-    os.environ.__setitem__('DISPLAY', ':0.0')
-    
-numpy_file = np.load('./pts_in_hull.npy')
-Caffe_net = cv.dnn.readNetFromCaffe("./models/colorization_deploy_v2.prototxt", "./models/colorization_release_v2.caffemodel")
-numpy_file = numpy_file.transpose().reshape(2, 313, 1, 1)
-
-class Window(Frame):
-    def __init__(self, master=None):
-        Frame.__init__(self, master)
-
-        self.master = master
-        self.pos = []
-        self.master.title("B&W Image Colorization")
-        self.pack(fill=BOTH, expand=1)
-
-        menu = Menu(self.master)
-        self.master.config(menu=menu)
-
-        file = Menu(menu)
-        file.add_command(label="Upload Image", command=self.uploadImage)
-        file.add_command(label="Color Image", command=self.color)
-        menu.add_cascade(label="File", menu=file)
-
-        self.canvas = tk.Canvas(self)
-        self.canvas.pack(fill=tk.BOTH, expand=True)
-        self.image = None
-        self.image2 = None
-
-        label1=Label(self,image=img)
-        label1.image=img
-        label1.place(x=400,y=370)
-
-
-
-
-    def uploadImage(self):
-        filename = filedialog.askopenfilename(initialdir=os.getcwd())
-        if not filename:
-            return
-        load = Image.open(filename)
-
-        load = load.resize((480, 360), Image.ANTIALIAS)
-
-        if self.image is None:
-            w, h = load.size
-            width, height = root.winfo_width(), root.winfo_height()
-            self.render = ImageTk.PhotoImage(load)
-            self.image = self.canvas.create_image((w / 2, h / 2), image=self.render)
-           
-        else:
-            self.canvas.delete(self.image3)
-            w, h = load.size
-            width, height = root.winfo_screenmmwidth(), root.winfo_screenheight()
-           
-            self.render2 = ImageTk.PhotoImage(load)
-            self.image2 = self.canvas.create_image((w / 2, h / 2), image=self.render2)
-
-
-        frame = cv.imread(filename)
-    
-        Caffe_net.getLayer(Caffe_net.getLayerId('class8_ab')).blobs = [numpy_file.astype(np.float32)]
-        Caffe_net.getLayer(Caffe_net.getLayerId('conv8_313_rh')).blobs = [np.full([1, 313], 2.606, np.float32)]
-
-        input_width = 224
-        input_height = 224
-
-        rgb_img = (frame[:,:,[2, 1, 0]] * 1.0 / 255).astype(np.float32)
-        lab_img = cv.cvtColor(rgb_img, cv.COLOR_RGB2Lab)
-        l_channel = lab_img[:,:,0] 
-
-        l_channel_resize = cv.resize(l_channel, (input_width, input_height)) 
-        l_channel_resize -= 50 
-
-        Caffe_net.setInput(cv.dnn.blobFromImage(l_channel_resize))
-        ab_channel = Caffe_net.forward()[0,:,:,:].transpose((1,2,0)) 
-
-        (original_height,original_width) = rgb_img.shape[:2] 
-        ab_channel_us = cv.resize(ab_channel, (original_width, original_height))
-        lab_output = np.concatenate((l_channel[:,:,np.newaxis],ab_channel_us),axis=2) 
-        bgr_output = np.clip(cv.cvtColor(lab_output, cv.COLOR_Lab2BGR), 0, 1)
-
-  
-        cv.imwrite("./result.png", (bgr_output*255).astype(np.uint8))
-
-    def color(self):
-
-        load = Image.open("./result.png")
-        load = load.resize((480, 360), Image.ANTIALIAS)
-
-        if self.image is None:
-            w, h = load.size
-            self.render = ImageTk.PhotoImage(load)
-            self.image = self.canvas.create_image((w / 2, h/2), image=self.render)
-            root.geometry("%dx%d" % (w, h))
-        else:
-            w, h = load.size
-            width, height = root.winfo_screenmmwidth(), root.winfo_screenheight()
-
-            self.render3 = ImageTk.PhotoImage(load)
-            self.image3 = self.canvas.create_image((w / 2, h / 2), image=self.render3)
-            self.canvas.move(self.image3, 500, 0)
- 
-
-root = tk.Tk()
-root.geometry("%dx%d" % (980, 600))
-root.title("B&W Image Colorization GUI")
-img = ImageTk.PhotoImage(Image.open("logo2.png"))
-
-app = Window(root)
-app.pack(fill=tk.BOTH, expand=1)
-root.mainloop()
-```
+<div align="center">
 
-<div class="output stream stdout">
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
 
-    Overwriting gui.py
+⭐ Star the main repo if this helped you!
 
 </div>
-
-</div>
\ No newline at end of file

From e110c0b37b91fc14be7fcbf07ba842b8a5a07392 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Wed, 18 Mar 2026 23:40:50 +0530
Subject: [PATCH 3/8] Update README.md

---
 Distracted Driver Detection/README.md | 1390 ++++---------------------
 1 file changed, 206 insertions(+), 1184 deletions(-)

diff --git a/Distracted Driver Detection/README.md b/Distracted Driver Detection/README.md
index c60775d..06892fa 100644
--- a/Distracted Driver Detection/README.md	
+++ b/Distracted Driver Detection/README.md	
@@ -1,1273 +1,295 @@
-<div class="cell markdown">
+<div align="center">
 
-## Distracted-Driver-Detection
+# 🚗 Distracted Driver Detection — ResNet50 from Scratch
 
-<img src="./supp/front_page.png" style="width:250;height:250;">
+[![Python](https://img.shields.io/badge/Python-3.7+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
+[![Keras](https://img.shields.io/badge/Keras-D00000?style=for-the-badge&logo=keras&logoColor=white)](https://keras.io/)
+[![TensorFlow](https://img.shields.io/badge/TensorFlow-FF6F00?style=for-the-badge&logo=tensorflow&logoColor=white)](https://www.tensorflow.org/)
+[![Dataset](https://img.shields.io/badge/Dataset-State%20Farm%20%7C%20Kaggle-20BEFF?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/c/state-farm-distracted-driver-detection)
+[![Classes](https://img.shields.io/badge/10%20Behavior%20Classes-orange?style=for-the-badge)]()
+[![License](https://img.shields.io/badge/License-MIT-1abc9c?style=for-the-badge)](../LICENSE.md)
 
-</div>
-
-<div class="cell markdown">
-
-### Problem Description
-
-</div>
-
-<div class="cell markdown">
-
-In this competition you are given driver images, each taken in a car
-with a driver doing something in the car (texting, eating, talking on
-the phone, makeup, reaching behind, etc). Your goal is to predict the
-likelihood of what the driver is doing in each picture.
-
-The 10 classes to predict are as follows, <br> <br> <table> <tr> <td>
-<li>c0: safe driving</li> <br> <li>c1: texting - right</li> <br> <li>c2:
-talking on the phone - right</li> <br> <li>c3: texting - left</li> <br>
-<li>c4: talking on the phone - left</li> <br> <li>c5: operating the
-radio</li> <br> <li>c6: drinking</li> <br> <li>c7: reaching behind</li>
-<br> <li>c8: hair and makeup</li> <br> <li>c9: talking to passenger</li>
-</td> <td> <img src="./supp/driver.gif" style="width:300;height:300px;">
-</td> </tr>
-
-</table>
-
-</div>
-
-<div class="cell markdown">
-
-### Summary of Results
-
-</div>
-
-<div class="cell markdown">
-
-Using a 50-layer Residual Network (with the following parameters) the
-following scores (losses) were obtained. <table> <li>10 Epochs</li>
-<li>32 Batch Size</li> <li>Adam Optimizer</li> <li>Glorot Uniform
-Initializer</li> <tr> <td> **Training Loss** </td> <td> 0.93 </td> </tr>
-<tr> <td> **Validation Loss** </td> <td> 3.79 </td> </tr> <tr> <td>
-**Holdout Loss** </td> <td> 2.64 </td> </tr> </table>
-
-**Why the high losses? Simply put - we don't have enough resources to
-quickly iterate / hyper-parameter tune the model\!** If more resources
-were available (RAM, CPU speed), we could hyper-parameter tune over grid
-searches and combat high bias / high variance, which this model
-currently suffers. [This is how you'd fix high bias/variance.](#improve)
-
-</div>
-
-<div class="cell markdown">
-
-### Import Dependencies and Define Functions
-
-</div>
-
-<div class="cell markdown">
-
-Let's begin by importing some useful dependencies and defining some key
-functions that we'll use throughout the notebook.
-
-</div>
-
-<div class="cell code">
-
-``` python
-import numpy as np
-import pandas as pd
-import tensorflow as tf
-import matplotlib.pyplot as plt
-
-from keras import layers
-from keras.layers import (Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, 
-                          Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D)
-from keras.wrappers.scikit_learn import KerasClassifier
-from keras.models import Model, load_model, save_model
-from keras.preprocessing import image
-from keras.utils import layer_utils
-from keras.utils.data_utils import get_file
-from keras.applications.imagenet_utils import preprocess_input
-import pydot
-from IPython.display import SVG
-from keras.utils.vis_utils import model_to_dot
-from keras.utils import plot_model
-from resnets_utils import *
-from keras.initializers import glorot_uniform
-import scipy.misc
-from matplotlib.pyplot import imshow
-
-%matplotlib inline
-
-import keras.backend as K
-K.set_image_data_format('channels_last')
-K.set_learning_phase(1)
-
-from sklearn.model_selection import StratifiedKFold, cross_validate, LeaveOneGroupOut
-
-from PIL import Image
-```
-
-</div>
-
-<div class="cell code" data-execution_count="2">
-
-``` python
-def PlotClassFrequency(class_counts):
-    plt.figure(figsize=(15,4))
-    plt.bar(class_counts.index,class_counts)
-    plt.xlabel('class')
-    plt.xticks(np.arange(0, 10, 1.0))
-    plt.ylabel('count')
-    plt.title('Number of Images per Class')
-    plt.show()
-
-def DescribeImageData(data):
-    print('Average number of images: ' + str(np.mean(data)))
-    print("Lowest image count: {}. At: {}".format(data.min(), data.idxmin()))
-    print("Highest image count: {}. At: {}".format(data.max(), data.idxmax()))
-    print(data.describe())
-    
-def CreateImgArray(height, width, channel, data, folder, save_labels = True):
-    """
-    Writes image files found in 'imgs/train' to array of shape
-    [examples, height, width, channel]
-    
-    Arguments:
-    height -- integer, height in pixels
-    width --  integer, width in pixels
-    channel -- integer, number of channels (or dimensions) for image (3 for RGB)
-    data -- dataframe, containing associated image properties, such as:
-            subject -> string, alpha-numeric code of participant in image
-            classname -> string, the class name i.e. 'c0', 'c1', etc. 
-            img -> string, image name
-    folder -- string, either 'test' or 'train' folder containing the images
-    save_labels -- bool, True if labels should be saved, or False (just save 'X' images array).  
-                   Note: only applies if using train folder
-            
-    Returns:
-    .npy file -- file, contains the associated conversion of images to numerical values for processing
-    """
-    
-    num_examples = len(data)
-    X = np.zeros((num_examples,height,width,channel))
-    if (folder == 'train') & (save_labels == True):
-        Y = np.zeros(num_examples)
-    
-    for m in range(num_examples):
-        current_img = data.img[m]
-        img_path = 'imgs/' + folder + '/' + current_img
-        img = image.load_img(img_path, target_size=(height, width))
-        x = image.img_to_array(img)
-        x = preprocess_input(x)
-        X[m] = x
-        if (folder == 'train') & (save_labels == True):
-            Y[m] = data.loc[data['img'] == current_img, 'classname'].iloc[0]
-        
-    np.save('X_'+ folder + '_' + str(height) + '_' + str(width), X)
-    if (folder == 'train') & (save_labels == True):
-        np.save('Y_'+ folder + '_' + str(height) + '_' + str(width), Y)
-        
-def Rescale(X):
-    return (1/(2*np.max(X))) * X + 0.5
-
-def PrintImage(X_scaled, index, Y = None):
-    plt.imshow(X_scaled[index])
-    if Y is not None:
-        if Y.shape[1] == 1:
-            print ("y = " + str(np.squeeze(Y[index])))
-        else:
-            print("y = " + str(np.argmax(Y[index])))
-            
-def LOGO(X, Y, group, model_name, input_shape, classes, init, optimizer, metrics, epochs, batch_size):
-    logo = LeaveOneGroupOut()
-    logo.get_n_splits(X, Y, group);
-    cvscores = np.zeros((26,4))
-    subject_id = []
-    i = 0
-    for train, test in logo.split(X, Y, group):
-        # Create model
-        model = model_name(input_shape = input_shape, classes = classes, init = init)
-        # Compile the model
-        model.compile(optimizer = optimizer, loss='sparse_categorical_crossentropy', metrics=[metrics])
-        # Fit the model
-        model.fit(X[train], Y[train], epochs = epochs, batch_size = batch_size, verbose = 0)
-        # Evaluate the model
-        scores_train = model.evaluate(X[train], Y[train], verbose = 0)
-        scores_test = model.evaluate(X[test], Y[test], verbose = 0)
-        # Save to cvscores
-        cvscores[i] = [scores_train[0], scores_train[1] * 100, scores_test[0], scores_test[1] * 100]
-        subject_id.append(group.iloc[test[0]])
-        # Clear session
-        K.clear_session()
-        # Update counter
-        i += 1
-        
-    return pd.DataFrame(cvscores, index = subject_id, columns=['Train_loss', 'Train_acc','Test_loss', 'Test_acc'])
-```
-
-</div>
-
-<div class="cell markdown">
-
-### Quick EDA
-
-</div>
-
-<div class="cell markdown">
-
-Let's begin by loading the provided dataset 'driver\_imgs\_list' doing a
-quick analysis.
-
-</div>
-
-<div class="cell code" data-execution_count="3">
-
-``` python
-driver_imgs_df = pd.read_csv('driver_imgs_list/driver_imgs_list.csv')
-driver_imgs_df.head()
-```
-
-<div class="output execute_result" data-execution_count="3">
-
-``` 
-  subject classname            img
-0    p002        c0  img_44733.jpg
-1    p002        c0  img_72999.jpg
-2    p002        c0  img_25094.jpg
-3    p002        c0  img_69092.jpg
-4    p002        c0  img_92629.jpg
-```
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-We can note the number of examples by printing the shape of the
-dataframe. Looks like the training set has 22,424 images.
-
-</div>
-
-<div class="cell code" data-execution_count="4">
-
-``` python
-driver_imgs_df.shape
-```
-
-<div class="output execute_result" data-execution_count="4">
-
-    (22424, 3)
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-We can plot the number of images per class to see if any classes have a
-low number of images.
-
-</div>
-
-<div class="cell code" data-execution_count="5">
-
-``` python
-class_counts = (driver_imgs_df.classname).value_counts()
-PlotClassFrequency(class_counts)
-DescribeImageData(class_counts)
-```
-
-<div class="output display_data">
-
-![](c304b6e1f57c2d464ca2c216a0e5c596439c5b93.png)
-
-</div>
-
-<div class="output stream stdout">
-
-    Average number of images: 2242.4
-    Lowest image count: 1911. At: c8
-    Highest image count: 2489. At: c0
-    count      10.000000
-    mean     2242.400000
-    std       175.387951
-    min      1911.000000
-    25%      2163.500000
-    50%      2314.500000
-    75%      2325.750000
-    max      2489.000000
-    Name: classname, dtype: float64
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-Additionally, we can plot the number of images per test subject. It
-would be much more helpful to plot the number of images belonging to
-each class *per subject*. We could then ensure that the distribution is
-somewhat uniform. We did not show this here, and instead just plotted
-number of images per subject.
-
-</div>
+> Classifies **10 distracted driving behaviors** from dashboard camera images using a **custom ResNet50 implementation built from scratch in Keras** — including manual `convolutional_block` and `identity_block` definitions, `glorot_uniform` initialization, and LOGO cross-validation strategy.
 
-<div class="cell code" data-execution_count="6">
-
-``` python
-subject_counts = (driver_imgs_df.subject).value_counts()
-plt.figure(figsize=(15,4))
-plt.bar(subject_counts.index,subject_counts)
-plt.xlabel('subject')
-plt.ylabel('count')
-plt.title('Number of Images per Subject')
-plt.show()
-DescribeImageData(subject_counts)
-```
-
-<div class="output display_data">
-
-![](e142a949a0c9691c28ce079e9311e2ddd0a1fad4.png)
-
-</div>
-
-<div class="output stream stdout">
-
-    Average number of images: 862.461538462
-    Lowest image count: 346. At: p072
-    Highest image count: 1237. At: p021
-    count      26.000000
-    mean      862.461538
-    std       214.298713
-    min       346.000000
-    25%       752.500000
-    50%       823.000000
-    75%       988.250000
-    max      1237.000000
-    Name: subject, dtype: float64
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-Furthermore, we can check if there are any null image examples.
-
-</div>
-
-<div class="cell code" data-execution_count="7">
-
-``` python
-pd.isnull(driver_imgs_df).sum()
-```
-
-<div class="output execute_result" data-execution_count="7">
-
-    subject      0
-    classname    0
-    img          0
-    dtype: int64
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-### Preprocess Data
-
-</div>
-
-<div class="cell markdown">
-
-The data was provided with the classes in order (from class 0 to class
-9). Let's shuffle the data by permutating the 'classname' and 'img'
-attributes.
-
-</div>
-
-<div class="cell code" data-execution_count="8">
-
-``` python
-np.random.seed(0)
-myarray = np.random.permutation(driver_imgs_df)
-driver_imgs_df = pd.DataFrame(data = myarray, columns=['subject', 'classname', 'img'])
-```
-
-</div>
-
-<div class="cell markdown">
-
-We'll go ahead and apply a dictionary to the 'classname' attribute and
-assign the strings to their respective integers.
-
-</div>
-
-<div class="cell code" data-execution_count="9">
-
-``` python
-d = {'c0': 0, 'c1': 1, 'c2': 2, 'c3': 3, 'c4': 4, 'c5': 5, 'c6': 6, 'c7': 7, 'c8': 8, 'c9': 9}
-driver_imgs_df.classname = driver_imgs_df.classname.map(d)
-```
+[🔙 Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
 
 </div>
 
-<div class="cell markdown">
+---
 
-### Convert Dataframe to Array for Training
+## ⚠️ Safety Context
 
-</div>
+> Distracted driving causes thousands of road fatalities annually. Automated in-vehicle behavior classification from dashboard cameras is an active area of road safety AI research.
 
-<div class="cell markdown">
+---
 
-Let's convert the images into numerical arrays of dimension '64, 64, 3'.
-Both the height and width of the images will be 64 pixels, and each
-image will have 3 channels (for red, green and blue). The following
-function saves the array as a .npy file.
+## 📌 Table of Contents
 
-</div>
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [Dataset](#-dataset)
+- [Class Definitions](#-class-definitions)
+- [Model Architecture](#-model-architecture)
+- [Training Analysis & Challenges](#-training-analysis--challenges)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [References](#-references)
 
-<div class="cell code" data-execution_count="39">
+---
 
-``` python
-CreateImgArray(64, 64, 3, driver_imgs_df, 'train')
-```
+## 🔬 About the Project
 
-</div>
+This project tackles the **State Farm Distracted Driver Detection** Kaggle challenge — classifying driver images into 10 behavior classes. What makes it distinctive is that **ResNet50 is implemented completely from scratch** using the Keras functional API, manually defining every bottleneck block and skip connection rather than using `tf.keras.applications`.
 
-<div class="cell markdown">
+The notebook also demonstrates handling real-world ML challenges: **high bias**, **high variance**, and the **LOGO (Leave-One-Group-Out) cross-validation** strategy needed because multiple images belong to the same driver — random splits would leak the same driver into both train and validation sets.
 
-Let's now load the new image arrays into the environment. Note that this
-step is used to save memory so that CreateImgArray does not have to be
-executed every time.
+**What this project covers:**
+- Manual `identity_block` and `convolutional_block` implementations in Keras
+- `resnets_utils` helper module for block definitions
+- Diagnosing and addressing underfitting (high bias) and overfitting (high variance)
+- LOGO cross-validation to prevent driver-level data leakage
 
-</div>
+---
 
-<div class="cell code" data-execution_count="10" data-scrolled="true">
+## ⚙️ How It Works
 
-``` python
-X = np.load('X_train_64_64.npy')
-X.shape
 ```
-
-<div class="output execute_result" data-execution_count="10">
-
-    (22424, 64, 64, 3)
-
-</div>
-
-</div>
-
-<div class="cell code" data-execution_count="11">
-
-``` python
-Y = np.load('Y_train_64_64.npy')
-Y.shape
-```
-
-<div class="output execute_result" data-execution_count="11">
-
-    (22424,)
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-Let's check our new arrays and ensure we compiled everything correctly.
-We can see that we do not have any entries in X that contain zero, and Y
-contains all the target labels.
-
-</div>
-
-<div class="cell code" data-execution_count="12">
-
-``` python
-(X == 0).sum()
-```
-
-<div class="output execute_result" data-execution_count="12">
-
-``` 
-0
-```
-
-</div>
-
-</div>
-
-<div class="cell code" data-execution_count="13">
-
-``` python
-PlotClassFrequency(pd.DataFrame(Y)[0].value_counts())
-```
-
-<div class="output display_data">
-
-![](a8d3e7336933ca4c99e8cd00289dfbdbfb2dd0b9.png)
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-Furthermore, we can print the images from X and the associated class as
-a sanity check. Re-scaling the images (between 0 and 1):
-
-</div>
-
-<div class="cell code" data-execution_count="14">
-
-``` python
-X_scaled = Rescale(X)
-```
-
-</div>
-
-<div class="cell code" data-execution_count="15">
-
-``` python
-PrintImage(X_scaled, 2, Y = Y.reshape(-1,1))
+Dashboard Camera Image
+         │
+         ▼
+  Load + Preprocess
+  (Normalize pixel values / 255)
+         │
+         ▼
+  ResNet50 Forward Pass
+  (Custom Keras implementation)
+  ┌─────────────────────────────────┐
+  │ ZeroPadding2D (3,3)             │
+  │ Conv2D(64,7×7,s=2) → BN → ReLU │
+  │ MaxPool(3×3, s=2)               │
+  │ Stage 2: ConvBlock + IdBlock×2  │
+  │ Stage 3: ConvBlock + IdBlock×3  │
+  │ Stage 4: ConvBlock + IdBlock×5  │
+  │ Stage 5: ConvBlock + IdBlock×2  │
+  │ AveragePooling2D(2×2)           │
+  │ Flatten → Dense(10, softmax)    │
+  └─────────────────────────────────┘
+         │
+         ▼
+  10-Class Softmax Output → c0–c9
 ```
 
-<div class="output stream stdout">
-
-    y = 7.0
-
-</div>
-
-<div class="output display_data">
-
-![](99927c379ca668ba9b7976499c8b83f0bbc6bce4.png)
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-Class of "7" corresponds to a driver "reaching behind", which appears to
-be the case shown above.
-
-</div>
-
-<div class="cell markdown">
-
-### Build the Model
-
-</div>
-
-<div class="cell markdown">
-
-We'll use the popular Residual Net with 50 layers. Residual networks are
-essential to preventing vanishing gradients when using a rather 'deep'
-network (many layers). The identity\_block and convolutional\_block are
-defined below.
-
-</div>
-
-<div class="cell code" data-execution_count="16">
+---
 
-``` python
-def identity_block(X, f, filters, stage, block, init):
-    """
-    Implementation of the identity block as defined in Figure 3
-    
-    Arguments:
-    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
-    f -- integer, specifying the shape of the middle CONV's window for the main path
-    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
-    stage -- integer, used to name the layers, depending on their position in the network
-    block -- string/character, used to name the layers, depending on their position in the network
-    
-    Returns:
-    X -- output of the identity block, tensor of shape (n_H, n_W, n_C)
-    """
-    
-    # defining name basis
-    conv_name_base = 'res' + str(stage) + block + '_branch'
-    bn_name_base = 'bn' + str(stage) + block + '_branch'
-    
-    # Retrieve Filters
-    F1, F2, F3 = filters
-    
-    # Save the input value. You'll need this later to add back to the main path. 
-    X_shortcut = X
-    
-    # First component of main path
-    X = Conv2D(filters = F1, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = conv_name_base + '2a', kernel_initializer = init)(X)
-    X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
-    X = Activation('relu')(X)
-    
-    ### START CODE HERE ###
-    
-    # Second component of main path (≈3 lines)
-    X = Conv2D(filters = F2, kernel_size = (f, f), strides = (1,1), padding = 'same', name = conv_name_base + '2b', kernel_initializer = init)(X)
-    X = BatchNormalization(axis = 3, name = bn_name_base + '2b')(X)
-    X = Activation('relu')(X)
-
-    # Third component of main path (≈2 lines)
-    X = Conv2D(filters = F3, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = conv_name_base + '2c', kernel_initializer = init)(X)
-    X = BatchNormalization(axis = 3, name = bn_name_base + '2c')(X)
-
-    # Final step: Add shortcut value to main path, and pass it through a RELU activation (≈2 lines)
-    X = Add()([X,X_shortcut])
-    X = Activation('relu')(X)
-    
-    ### END CODE HERE ###
-    
-    return X
-```
+## 📊 Dataset
 
-</div>
-
-<div class="cell code" data-execution_count="17">
+| Property | Details |
+|----------|---------|
+| **Name** | State Farm Distracted Driver Detection |
+| **Source** | [Kaggle Competition](https://www.kaggle.com/c/state-farm-distracted-driver-detection) |
+| **Training Images** | 22,424 |
+| **Classes** | 10 driving behaviors |
+| **Input Shape** | Resized to `64 × 64 × 3` for training |
+| **Metadata** | `driver_imgs_list.csv` — subject ID, classname, filename |
+| **Key Challenge** | Multiple images per driver → LOGO cross-validation required |
 
-``` python
-def convolutional_block(X, f, filters, stage, block, init, s = 2):
-    """
-    Implementation of the convolutional block as defined in Figure 4
-    
-    Arguments:
-    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
-    f -- integer, specifying the shape of the middle CONV's window for the main path
-    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
-    stage -- integer, used to name the layers, depending on their position in the network
-    block -- string/character, used to name the layers, depending on their position in the network
-    s -- Integer, specifying the stride to be used
-    
-    Returns:
-    X -- output of the convolutional block, tensor of shape (n_H, n_W, n_C)
-    """
-    
-    # defining name basis
-    conv_name_base = 'res' + str(stage) + block + '_branch'
-    bn_name_base = 'bn' + str(stage) + block + '_branch'
-    
-    # Retrieve Filters
-    F1, F2, F3 = filters
-    
-    # Save the input value
-    X_shortcut = X
-
-
-    ##### MAIN PATH #####
-    # First component of main path 
-    X = Conv2D(F1, (1, 1), strides = (s,s), name = conv_name_base + '2a', kernel_initializer = init)(X)
-    X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
-    X = Activation('relu')(X)
-    
-    ### START CODE HERE ###
-
-    # Second component of main path (≈3 lines)
-    X = Conv2D(F2, (f, f), strides = (1,1), padding = 'same', name = conv_name_base + '2b', kernel_initializer = init)(X)
-    X = BatchNormalization(axis = 3, name = bn_name_base + '2b')(X)
-    X = Activation('relu')(X)
-
-    # Third component of main path (≈2 lines)
-    X = Conv2D(F3, (1, 1), strides = (1,1), name = conv_name_base + '2c', kernel_initializer = init)(X)
-    X = BatchNormalization(axis = 3, name = bn_name_base + '2c')(X)
-
-    ##### SHORTCUT PATH #### (≈2 lines)
-    X_shortcut = Conv2D(F3, (1, 1), strides = (s,s), name = conv_name_base + '1', kernel_initializer = init)(X_shortcut)
-    X_shortcut = BatchNormalization(axis = 3, name = bn_name_base + '1')(X_shortcut)
-
-    # Final step: Add shortcut value to main path, and pass it through a RELU activation (≈2 lines)
-    X = Add()([X,X_shortcut])
-    X = Activation('relu')(X)
-    
-    ### END CODE HERE ###
-    
-    return X
-```
+---
 
-</div>
+## 🚦 Class Definitions
 
-<div class="cell markdown">
+| Code | Behavior |
+|:----:|----------|
+| **c0** | ✅ Safe Driving |
+| **c1** | 📱 Texting — Right Hand |
+| **c2** | 📞 Phone Call — Right Hand |
+| **c3** | 📱 Texting — Left Hand |
+| **c4** | 📞 Phone Call — Left Hand |
+| **c5** | 🎵 Operating Radio |
+| **c6** | 🥤 Drinking |
+| **c7** | 🔙 Reaching Behind |
+| **c8** | 💄 Hair / Makeup |
+| **c9** | 💬 Talking to Passenger |
 
-With the two blocks defined, we'll now create the model ResNet50, as
-shown below.
+---
 
-</div>
+## 🏗️ Model Architecture
 
-<div class="cell code" data-execution_count="18">
+The notebook defines **ResNet50 from scratch** — no pretrained weights, no `tf.keras.applications`:
+
+```python
+from keras.layers import (Input, Add, Dense, Activation, ZeroPadding2D,
+    BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D)
+from keras.models import Model
+from keras.initializers import glorot_uniform
+from resnets_utils import *
 
-``` python
-def ResNet50(input_shape = (64, 64, 3), classes = 10, init = glorot_uniform(seed=0)):
+def ResNet50(input_shape=(64, 64, 3), classes=10, init=glorot_uniform(seed=0)):
     """
-    Implementation of the popular ResNet50 the following architecture:
-    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
-    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER
-
-    Arguments:
-    input_shape -- shape of the images of the dataset
-    classes -- integer, number of classes
-
-    Returns:
-    model -- a Model() instance in Keras
+    CONV2D -> BATCHNORM -> RELU -> MAXPOOL
+    -> CONVBLOCK -> IDBLOCK*2
+    -> CONVBLOCK -> IDBLOCK*3
+    -> CONVBLOCK -> IDBLOCK*5
+    -> CONVBLOCK -> IDBLOCK*2
+    -> AVGPOOL -> TOPLAYER
     """
-    
-    # Define the input as a tensor with shape input_shape
-    X_input = Input(input_shape)
-
-    
-    # Zero-Padding
-    X = ZeroPadding2D((3, 3))(X_input)
-    
-    # Stage 1
-    X = Conv2D(64, (7, 7), strides = (2, 2), name = 'conv1', kernel_initializer = init)(X)
-    X = BatchNormalization(axis = 3, name = 'bn_conv1')(X)
-    X = Activation('relu')(X)
-    X = MaxPooling2D((3, 3), strides=(2, 2))(X)
-
-    # Stage 2
-    X = convolutional_block(X, f = 3, filters = [64, 64, 256], stage = 2, block='a', s = 1, init = init)
-    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b', init = init)
-    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c', init = init)
-
-    ### START CODE HERE ###
-
-    # Stage 3 (≈4 lines)
-    X = convolutional_block(X, f = 3, filters = [128,128,512], stage = 3, block='a', s = 2, init = init)
-    X = identity_block(X, 3, [128,128,512], stage=3, block='b', init = init)
-    X = identity_block(X, 3, [128,128,512], stage=3, block='c', init = init)
-    X = identity_block(X, 3, [128,128,512], stage=3, block='d', init = init)
-
-    # Stage 4 (≈6 lines)
-    X = convolutional_block(X, f = 3, filters = [256, 256, 1024], stage = 4, block='a', s = 2, init = init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='b', init = init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='c', init = init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='d', init = init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='e', init = init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='f', init = init)
-
-    # Stage 5 (≈3 lines)
-    X = convolutional_block(X, f = 3, filters = [512, 512, 2048], stage = 5, block='a', s = 2, init = init)
-    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='b', init = init)
-    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='c', init = init)
-
-    # AVGPOOL (≈1 line). Use "X = AveragePooling2D(...)(X)"
-    X = AveragePooling2D(pool_size=(2, 2), name = 'avg_pool')(X)
-    
-    ### END CODE HERE ###
-
-    # output layer
-    X = Flatten()(X)
-    X = Dense(classes, activation='softmax', name='fc' + str(classes), kernel_initializer = init)(X)
-    
-    # Create model
-    model = Model(inputs = X_input, outputs = X, name='ResNet50')
-    
-    return model
-```
-
-</div>
-
-<div class="cell markdown">
-
-### Cross Validation Training (Leave-One-Group-Out)
-
-</div>
-
-<div class="cell markdown">
-
-Let's do some basic transformation on the training / label arrays, and
-print the shapes. After, we'll define some key functions for use in our
-first CNN model.
-
-</div>
-
-<div class="cell code" data-execution_count="19">
-
-``` python
-# Normalize image vectors
-X_train = X/255
-
-# Convert training and test labels to one hot matrices
-#Y = convert_to_one_hot(Y.astype(int), 10).T
-Y_train = np.expand_dims(Y.astype(int), -1)
-
-print ("number of training examples = " + str(X_train.shape[0]))
-print ("X_train shape: " + str(X_train.shape))
-print ("Y_train shape: " + str(Y_train.shape))
-```
-
-<div class="output stream stdout">
-
-    number of training examples = 22424
-    X_train shape: (22424, 64, 64, 3)
-    Y_train shape: (22424, 1)
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-Next, let's call our LOGO function that incorporates the Leave One Group
-Out cross-validator. This function will allow us to split the data using
-the drivers ('subject') as the group, which should help us prevent
-overfitting as the model will probably learn too much information off
-the type of driver/subject and become biased.
-
-Below we pass the arguments to the self-defined LOGO function and
-execute. The return is a dataframe consistering of the accuracy/loss
-scores of the training/dev sets (for each group/driver).
-
-</div>
-
-<div class="cell code" data-execution_count="25">
-
-``` python
-scores = LOGO(X_train, Y_train, group = driver_imgs_df['subject'],
-              model_name = ResNet50, input_shape = (64, 64, 3), classes = 10, 
-              init = glorot_uniform(seed=0), optimizer = 'adam', metrics = 'accuracy',
-              epochs = 2, batch_size = 32)
-```
-
-</div>
-
-<div class="cell markdown">
-
-Plotting the dev set accuracy, we can see that 'p081' had the lowest
-accuracy at 8.07%, and 'p002' had the highest accuracy at 71.52%.
-
-</div>
-
-<div class="cell code" data-execution_count="41">
-
-``` python
-plt.figure(figsize=(15,4))
-plt.bar(scores.index, scores.loc[:,'Test_acc'].sort_values(ascending=False))
-plt.yticks(np.arange(0, 110, 10.0))
-plt.show()
-```
-
-<div class="output display_data">
-
-![](7e4541a8ba586f1c4e438af2304ade4a1206364d.png)
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-Calling 'describe' method, we can note some useful statistics.
-
-</div>
-
-<div class="cell code" data-execution_count="42">
-
-``` python
-scores.describe()
-```
-
-<div class="output execute_result" data-execution_count="42">
-
-``` 
-       Train_loss  Train_acc  Test_loss   Test_acc
-count   26.000000  26.000000  26.000000  26.000000
-mean     4.118791  27.908272   5.293537  21.190364
-std      3.597604  19.144588   4.731039  16.150668
-min      0.722578   8.477557   0.820852   8.070501
-25%      1.849149  11.193114   2.133728  10.137083
-50%      2.545475  25.507787   2.562653  14.259937
-75%      5.299684  39.668163   8.664656  26.789961
-max     14.751674  74.439192  14.553808  71.521739
-```
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-And finally, let's print out the train/dev scores.
-
-</div>
-
-<div class="cell code" data-execution_count="65">
-
-``` python
-print("Train acc: {:.2f}. Dev. acc: {:.2f}".format(scores['Train_acc'].mean(), scores['Test_acc'].mean()))
-print("Train loss: {:.2f}. Dev. loss: {:.2f}".format(scores['Train_loss'].mean(), scores['Test_loss'].mean()))
-```
-
-<div class="output stream stdout">
-
-    Train acc: 27.91. Dev. acc: 21.19
-    Train loss: 4.12. Dev. loss: 5.29
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-We can note that the train accuracy is higher than the dev accuracy,
-which is expected. The accuracy is quite low in comparison to our
-assumed Bayes accuracy of 100% (using human accuracy as a proxy to
-Bayes), and we have some variance (differnce between train and dev) of
-about 6.72%. Let's try increasing the number of epochs to 10 and observe
-if the train/dev accuracies increase (loss decreases).
-
-</div>
-
-<div class="cell code" data-execution_count="66">
-
-``` python
-scores = LOGO(X_train, Y_train, group = driver_imgs_df['subject'],
-              model_name = ResNet50, input_shape = (64, 64, 3), classes = 10, 
-              init = glorot_uniform(seed=0), optimizer = 'adam', metrics = 'accuracy',
-              epochs = 5, batch_size = 32)
 ```
 
-</div>
+**Block types:**
 
-<div class="cell code" data-execution_count="67">
+| Block | Shape Change | Used When |
+|-------|-------------|-----------|
+| **Identity Block** | Input = Output shape | Deepening without dimension change |
+| **Convolutional Block** | Input ≠ Output shape | When stride changes or filter count increases |
 
-``` python
-print("Train acc: {:.2f}. Dev. acc: {:.2f}".format(scores['Train_acc'].mean(), scores['Test_acc'].mean()))
-print("Train loss: {:.2f}. Dev. loss: {:.2f}".format(scores['Train_loss'].mean(), scores['Test_loss'].mean()))
-```
+**Stage filter configurations:**
 
-<div class="output stream stdout">
+| Stage | Filters | Blocks |
+|-------|---------|--------|
+| Stage 2 | [64, 64, 256] | ConvBlock + IdBlock × 2 |
+| Stage 3 | [128, 128, 512] | ConvBlock + IdBlock × 3 |
+| Stage 4 | [256, 256, 1024] | ConvBlock + IdBlock × 5 |
+| Stage 5 | [512, 512, 2048] | ConvBlock + IdBlock × 2 |
 
-    Train acc: 37.83. Dev. acc: 25.79
-    Train loss: 2.61. Dev. loss: 3.30
+**Training config:**
 
-</div>
+| Parameter | Value |
+|-----------|-------|
+| Initializer | `glorot_uniform(seed=0)` |
+| Optimizer | Adam |
+| Loss | Categorical Cross-Entropy |
+| Input Shape | `(64, 64, 3)` |
+| Output | Dense(10, softmax) |
 
-</div>
+---
 
-<div class="cell markdown">
-
-<a class="anchor" id="improve"></a> The train and dev accuracy increased
-to 37.83% and 25.79%, respectively. We can note that we still have an
-underfitting problem (high bias, about 62.17% from 100%), *however, our
-variance has increased dramatically between 2 epochs and 5 by about 80%
-(12.04% variance)\!* Not only do **we have high bias, but our model also
-exhibits high variance**. In order to tackle this, we'll need to address
-the high bias first (get as close to Bayes error as possible) and then
-deal with the resulting high variance. Note that ALL of the steps below
-should be performed with LOGO cross-validation. This way, we can be sure
-our estimates of the dev set are in line with the holdout set.
-
-In order to tackle **high bias**, we can do any of the following:
-<li>run more epochs</li> <li>increase the batch size (up to number of
-examples)</li> <li>make a deeper network</li> <li>increases the image
-size from 64x64 to 128x128, 256x256, etc.</li> <li>GridSearching over
-params (batch size, epoch, optimizer and it's parameters,
-initializer)</li>
+## 📉 Training Analysis & Challenges
 
-</div>
+The notebook provides honest, detailed bias-variance analysis across training runs — a key learning documented in the project:
 
-<div class="cell markdown">
+### Epoch 2 Results
+| Set | Accuracy |
+|-----|:--------:|
+| Train | ~26% |
+| Dev | ~13% |
 
-Let's up the epoch count to 10. The assumption is that the train
-accuracy will be higher than the previous 5 epoch model, but our
-variance will increase.
+> High bias (underfitting) — model hasn't converged. High variance — large gap between train/dev.
 
-</div>
+### Epoch 5 Results
+| Set | Accuracy |
+|-----|:--------:|
+| Train | **37.83%** |
+| Dev | **25.79%** |
 
-<div class="cell code" data-execution_count="75">
+> Train accuracy improved but **underfitting persists** (~62% away from 100%). Variance increased dramatically (+80% gap between epochs 2→5). The notebook diagnoses this explicitly:
 
-``` python
-scores = LOGO(X_train, Y_train, group = driver_imgs_df['subject'],
-              model_name = ResNet50, input_shape = (64, 64, 3), classes = 10, 
-              init = glorot_uniform(seed=0), optimizer = 'adam', metrics = 'accuracy',
-              epochs = 10, batch_size = 32)
 ```
-
-</div>
-
-<div class="cell code" data-execution_count="76">
-
-``` python
-print("Train acc: {:.2f}. Dev. acc: {:.2f}".format(scores['Train_acc'].mean(), scores['Test_acc'].mean()))
-print("Train loss: {:.2f}. Dev. loss: {:.2f}".format(scores['Train_loss'].mean(), scores['Test_loss'].mean()))
+"We still have an underfitting problem (high bias, about 62.17% from 100%),
+however, our variance has increased dramatically between 2 and 5 epochs by about 80%."
 ```
 
-<div class="output stream stdout">
-
-    Train acc: 86.95. Dev. acc: 40.68
-    Train loss: 0.93. Dev. loss: 3.79
-
-</div>
-
-</div>
-
-<div class="cell markdown">
+### Prescribed fixes documented in the notebook:
 
-As expected, the training accuracy increased to 86.95%, but the variance
-increase from 5 epochs to 10 was about 284% (46.27% variance)\! Thus, we
-can conclude that this model suffers from severe high variance. We can
-continue on and use the steps above to fix the remaining bias, then we
-can use the steps below to reduce the variance.
+**To address High Bias (underfitting):**
+- Increase epoch count
+- Use a bigger/deeper network
+- Try different optimizers or learning rate schedules
 
-</div>
-
-<div class="cell markdown">
-
-In order to tackle **high variance**, we can do any of the following:
-<li>Augment images to increase sample size</li> <li>Regularization</li>
-<li>GridSearching over params (batch size, epoch, optimizer and it's
-parameters, initializer)</li> <li>Decrease dev set size (allows more
-examples to be trained, making model less prone to overfitting)</li>
-<li>Investigate classes with low accuracy, and fix them</li>
-
-</div>
-
-<div class="cell markdown">
-
-<table> 
-    <tr> 
-        <td>
-            **Model**
-        </td>
-        <td>
-           **Epoch**
-        </td>
-        <td>
-         **Train Accuracy**
-        </td>
-        <td>
-         **Dev Accuracy**
-        </td>
-        <td>
-         **Bias**
-        </td>
-        <td>
-         **Variance**
-        </td>
-    </tr>
-    <tr>
-        <td>
-            **Model A**
-        </td>
-        <td>
-           2
-        </td>
-        <td>
-         27.91
-        </td>
-        <td>
-         21.19
-        </td>
-        <td>
-         72.09
-        </td>
-        <td>
-         6.72
-        </td>
-    </tr>
-    <tr> 
-        <td>
-            **Model B**
-        </td>
-        <td> 
-        5
-        </td>
-        <td>
-         37.83
-        </td>
-        <td>
-         25.79
-        </td>
-        <td>
-         62.17
-        </td>
-        <td>
-         12.04
-        </td>
-    </tr>
-    <tr> 
-        <td>
-            **Model C**
-        </td>
-        <td>
-         10
-        </td>
-        <td>
-         86.95
-        </td>
-        <td>
-         40.68
-        </td>
-        <td>
-         13.06
-        </td>
-        <td>
-        46.27
-        </td>
-    </tr>
-</table>
+**To address High Variance (overfitting):**
+- Apply L2 regularization
+- Add dropout layers
+- Use data augmentation
+- Increase training data volume
 
-</div>
-
-<div class="cell markdown">
-
-### Predictions on the Holdout Set 
+### LOGO Cross-Validation Note
 
-</div>
-
-<div class="cell markdown">
+> Standard random train/val splits cause **data leakage** — the same driver's images appear in both sets, inflating dev accuracy. The notebook flags this and recommends **Leave-One-Group-Out (LOGO)** cross-validation, splitting by `subject` (driver ID) from `driver_imgs_list.csv`.
 
-We'll go ahead and fit the 10 epoch model.
+---
 
-</div>
+## 📁 Project Structure
 
-<div class="cell code" data-execution_count="87">
-
-``` python
-model = ResNet50(input_shape = (64, 64, 3), classes = 10)
-model.compile(optimizer = 'adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
-model.fit(X_train, Y_train, epochs = 10, batch_size = 32)
-```
-
-<div class="output stream stdout">
-
-    Epoch 1/10
-    22424/22424 [==============================] - 83s 4ms/step - loss: 2.4026 - acc: 0.3128
-    Epoch 2/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 1.8118 - acc: 0.4996
-    Epoch 3/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 1.5023 - acc: 0.6153
-    Epoch 4/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 0.8445 - acc: 0.8483
-    Epoch 5/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 1.2427 - acc: 0.7447
-    Epoch 6/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 0.8930 - acc: 0.8216
-    Epoch 7/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 0.9400 - acc: 0.8144
-    Epoch 8/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 0.7440 - acc: 0.8748
-    Epoch 9/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 1.4076 - acc: 0.6559
-    Epoch 10/10
-    22424/22424 [==============================] - 76s 3ms/step - loss: 0.6796 - acc: 0.8135
-
-</div>
-
-<div class="output execute_result" data-execution_count="87">
-
-    <keras.callbacks.History at 0x2382a055f98>
-
-</div>
-
-</div>
-
-<div class="cell code" data-execution_count="88">
-
-``` python
-save_model(model, 'e10.h5');
 ```
-
-</div>
-
-<div class="cell code" data-execution_count="10">
-
-``` python
-model = load_model('e10.h5')
+Distracted Driver Detection/
+│
+├── 📂 dataset/
+│   ├── train/                         # Training images, organized by class
+│   │   ├── c0/  c1/  c2/  ...  c9/
+│   └── test/                          # Unlabeled test images
+│
+├── driver_imgs_list.csv               # subject, classname, img columns
+├── resnets_utils.py                   # identity_block + convolutional_block helpers
+├── distracted_driver_detection.ipynb  # Main notebook
+├── requirements.txt                   # Python dependencies
+└── README.md                          # You are here
 ```
 
-</div>
-
-<div class="cell markdown">
-
-Let's load the holdout data set from out 'test\_file\_names' csv file
-and then create the necessary array.
+---
 
-</div>
+## 🚀 Getting Started
 
-<div class="cell code">
+### 1. Clone the repository
 
-``` python
-holdout_imgs_df = pd.read_csv('test_file_names.csv')
-holdout_imgs_df.rename(columns={"imagename": "img"}, inplace = True)
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Distracted Driver Detection"
 ```
 
-</div>
-
-<div class="cell code">
+### 2. Download the dataset from Kaggle
 
-``` python
-CreateImgArray(64, 64, 3, holdout_imgs_df, 'test')
+```bash
+pip install kaggle
+kaggle competitions download -c state-farm-distracted-driver-detection
+unzip state-farm-distracted-driver-detection.zip -d dataset/
 ```
 
-</div>
+Or download manually from: [kaggle.com/c/state-farm-distracted-driver-detection/data](https://www.kaggle.com/c/state-farm-distracted-driver-detection/data)
 
-<div class="cell markdown">
+### 3. Set up environment
 
-Again, we'll load the data here instead of having to run CreateImgArray
-repeatedly.
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux / macOS
+venv\Scripts\activate           # Windows
 
-</div>
-
-<div class="cell code" data-execution_count="11">
-
-``` python
-X_holdout = np.load('X_test_64_64.npy')
-X_holdout.shape
+pip install -r requirements.txt
 ```
 
-<div class="output execute_result" data-execution_count="11">
-
-    (79726, 64, 64, 3)
-
-</div>
-
-</div>
-
-<div class="cell markdown">
-
-And now calling predictions on the holdout set, as shown below. MAKE
-SURE to clear the memory before this step\!
-
-</div>
+### 4. Run the notebook
 
-<div class="cell code" data-execution_count="12">
-
-``` python
-probabilities = model.predict(X_holdout, batch_size = 32)
+```bash
+jupyter notebook distracted_driver_detection.ipynb
 ```
 
-</div>
+---
 
-<div class="cell markdown">
+## 🛠️ Tech Stack
 
-If desired (as a sanity check) we can visually check our predictions by
-scaling the X\_holdout array and then printing the image.
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Deep Learning | TensorFlow / Keras |
+| Model | ResNet50 (from scratch via Keras functional API) |
+| Utilities | `resnets_utils.py` (custom block helpers) |
+| Data | Pandas, NumPy |
+| Visualization | Matplotlib |
+| Notebook | Jupyter / Google Colab |
 
-</div>
+---
 
-<div class="cell code" data-execution_count="15">
+## 📚 References
 
-``` python
-X_holdout_scaled = Rescale(X_holdout)
-```
+- [State Farm Distracted Driver Detection — Kaggle](https://www.kaggle.com/c/state-farm-distracted-driver-detection)
+- He, K., Zhang, X., Ren, S., & Sun, J. (2015). *Deep Residual Learning for Image Recognition.* [arXiv:1512.03385](https://arxiv.org/abs/1512.03385)
+- [deeplearning.ai — ResNet50 from scratch (Coursera)](https://www.coursera.org/learn/convolutional-neural-networks)
+- [Keras Functional API Documentation](https://keras.io/guides/functional_api/)
 
-</div>
+---
 
-<div class="cell code" data-execution_count="29">
+<div align="center">
 
-``` python
-index = 50000
-PrintImage(X_holdout_scaled, index = index, Y = probabilities)
-print('y_pred = ' + str(probabilities[index].argmax()))
-```
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
 
-<div class="output stream stdout">
-
-    y = 9
-    y_pred = 9
+⭐ Star the main repo if this helped you!
 
 </div>
-
-
-</div>
-

From 3ebaf3b97621a9b1092e82ca0e5a7e821cdd8ea8 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Wed, 18 Mar 2026 23:51:36 +0530
Subject: [PATCH 4/8] Update README.md

---
 Drowsiness detection [OPEN CV]/README.md | 347 ++++++++++++++++++-----
 1 file changed, 278 insertions(+), 69 deletions(-)

diff --git a/Drowsiness detection [OPEN CV]/README.md b/Drowsiness detection [OPEN CV]/README.md
index 1d46eb7..d52d79a 100644
--- a/Drowsiness detection [OPEN CV]/README.md	
+++ b/Drowsiness detection [OPEN CV]/README.md	
@@ -1,105 +1,314 @@
-# Driver Drowsiness Detection System
+<div align="center">
 
-## Introduction
+# 😴 Driver Drowsiness Detection — OpenCV + Keras CNN
 
-This project focuses on building a Driver Drowsiness Detection System that monitors a driver's eye status using a webcam and alerts them if they appear drowsy. We utilize **OpenCV** for image capture and preprocessing, while a **Convolutional Neural Network (CNN)** model classifies whether the driver's eyes are 'Open' or 'Closed.' If drowsiness is detected, an alarm is triggered to alert the driver.
+[![Python](https://img.shields.io/badge/Python-3.7+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
+[![OpenCV](https://img.shields.io/badge/OpenCV-5C3EE8?style=for-the-badge&logo=opencv&logoColor=white)](https://opencv.org/)
+[![Keras](https://img.shields.io/badge/Keras-D00000?style=for-the-badge&logo=keras&logoColor=white)](https://keras.io/)
+[![Pygame](https://img.shields.io/badge/Pygame-Alarm-green?style=for-the-badge)](https://www.pygame.org/)
+[![Real-Time](https://img.shields.io/badge/Real--Time-Webcam-brightgreen?style=for-the-badge)]()
+[![License](https://img.shields.io/badge/License-MIT-1abc9c?style=for-the-badge)](../LICENSE.md)
 
-## Project Overview
+> A **real-time driver drowsiness detection system** that uses **Haar Cascade classifiers** to locate the driver's eyes in every webcam frame and a **custom-trained CNN** (`cnnCat2.h5`) to classify each eye as **Open** or **Closed** — sounding a `pygame` alarm when drowsiness is detected.
 
-### Steps in the Detection Process:
-1. **Image Capture**: Capture the image using a webcam.
-2. **Face Detection**: Detect the face in the captured image and create a Region of Interest (ROI).
-3. **Eye Detection**: Detect the eyes from the ROI and feed them into the classifier.
-4. **Eye Classification**: The classifier categorizes whether the eyes are open or closed.
-5. **Drowsiness Score Calculation**: Calculate a score to determine if the driver is drowsy based on how long their eyes remain closed.
+[🔙 Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
 
-## CNN Model
+</div>
 
-The **Convolutional Neural Network (CNN)** architecture consists of the following layers:
-- **Convolutional Layers**:
-  - 32 nodes, kernel size 3
-  - 32 nodes, kernel size 3
-  - 64 nodes, kernel size 3
-- **Fully Connected Layers**:
-  - 128 nodes
-  - Output layer: 2 nodes (with Softmax activation for classification)
+---
 
-### Activation Function:
-- **ReLU**: Used in all layers except the output layer.
-- **Softmax**: Used in the output layer to classify the eyes as either 'Open' or 'Closed.'
+## ⚠️ Safety Context
 
-## Project Prerequisites
+> Drowsy driving causes thousands of road fatalities annually. This system provides a real-time, automated alert to combat driver fatigue using a lightweight CNN that runs entirely on a standard webcam feed.
 
-### Required Hardware:
-- A webcam for image capture.
+---
 
-### Required Libraries:
-Ensure Python (version 3.6 recommended) is installed on your system. Then, install the following libraries using `pip`:
+## 📌 Table of Contents
 
-```bash
-pip install opencv-python
-pip install tensorflow
-pip install keras
-pip install pygame
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [CNN Model Architecture](#-cnn-model-architecture)
+- [Dataset](#-dataset)
+- [Haar Cascade Files](#-haar-cascade-files)
+- [Scoring & Alert Logic](#-scoring--alert-logic)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [Known Limitations](#-known-limitations)
+- [References](#-references)
+
+---
+
+## 🔬 About the Project
+
+This project detects driver drowsiness through a two-stage pipeline:
+
+1. **Detection** — OpenCV Haar Cascade classifiers locate the face and each eye (left, right) in every frame
+2. **Classification** — A custom-trained Keras CNN (`cnnCat2.h5`) classifies each eye ROI as **Open** or **Closed**
+
+A running score is incremented each frame when eyes are detected as closed. When the score crosses a threshold, `pygame` plays `alarm.wav` and a "**DROWSY**" warning is overlaid on the video feed.
+
+**What this project covers:**
+- Training a binary CNN classifier on a custom ~7,000-image eye dataset
+- Real-time face and eye detection with OpenCV Haar cascades
+- Score-based drowsiness logic (accumulate → threshold → alarm)
+- Alarm playback with `pygame.mixer`
+
+---
+
+## ⚙️ How It Works
+
+```
+Webcam Frame (live stream)
+         │
+         ▼
+  Convert BGR → Grayscale
+         │
+         ▼
+  Haar Cascade: Detect Face
+  (haarcascade_frontalface_alt.xml)
+         │
+         ▼
+  Haar Cascade: Detect Eyes from frame
+  ├── Left Eye  (haarcascade_lefteye_2splits.xml)
+  └── Right Eye (haarcascade_righteye_2splits.xml)
+         │
+         ▼
+  Crop Eye ROI → Resize → Normalize
+         │
+         ▼
+  CNN Forward Pass (cnnCat2.h5)
+  → Predict: ['Close', 'Open']
+  → rpred / lpred updated per frame
+         │
+         ├── Both eyes Open  → score decremented (min 0)
+         │
+         └── Eye(s) Closed   → score incremented
+                   │
+                   └── score > threshold
+                             │
+                             ▼
+                      🔔 pygame alarm.wav
+                      📺 "DROWSY" on screen
+                      🟥 Red border on frame
+```
+
+---
+
+## 🧠 CNN Model Architecture
+
+`model.py` defines and trains the CNN classifier. The trained weights are saved as `models/cnnCat2.h5`.
+
+```
+Input: Eye ROI image (24 × 24 × 1, grayscale)
+         │
+         ▼
+Conv2D(32, 3×3) → ReLU → MaxPool(1,1)
+Conv2D(32, 3×3) → ReLU → MaxPool(1,1)
+Conv2D(64, 3×3) → ReLU → MaxPool(1,1)
+         │
+         ▼
+Flatten
+Dense(128) → ReLU
+Dropout(0.5)
+Dense(2) → Softmax
+         │
+         ▼
+Output: ['Close', 'Open']
+```
+
+**Training configuration:**
+
+| Parameter | Value |
+|-----------|-------|
+| Classes | 2 — `Close` / `Open` |
+| Input Size | 24 × 24 × 1 (grayscale) |
+| Optimizer | Adam |
+| Loss | Categorical Cross-Entropy |
+| Activation (hidden) | ReLU |
+| Activation (output) | Softmax |
+| Regularization | Dropout (0.5) |
+
+---
+
+## 📊 Dataset
+
+| Property | Details |
+|----------|---------|
+| **Type** | Custom — captured via webcam script |
+| **Total Images** | ~7,000 eye images |
+| **Classes** | `Open` / `Close` |
+| **Conditions** | Various lighting conditions |
+| **Cleaning** | Manually cleaned to remove unusable frames |
+
+The dataset was created by writing a capture script that crops eye regions frame by frame and saves them to disk, labeled by folder (`Open/` or `Closed/`). It was then manually reviewed to remove noisy or ambiguous images.
+
+> **Want to train on your own data?** Run `model.py` against your own captured eye dataset following the same `Open/Close` folder structure.
+
+---
+
+## 📂 Haar Cascade Files
+
+Three XML classifiers are used from the `haar cascade files/` folder:
+
+| File | Purpose |
+|------|---------|
+| `haarcascade_frontalface_alt.xml` | Detects the driver's face bounding box |
+| `haarcascade_lefteye_2splits.xml` | Detects the left eye region within the frame |
+| `haarcascade_righteye_2splits.xml` | Detects the right eye region within the frame |
+
+These are pre-trained OpenCV Haar cascades — no training required. They are loaded in `drowsinessdetection.py` as:
+
+```python
+face  = cv2.CascadeClassifier('haar cascade files/haarcascade_frontalface_alt.xml')
+leye  = cv2.CascadeClassifier('haar cascade files/haarcascade_lefteye_2splits.xml')
+reye  = cv2.CascadeClassifier('haar cascade files/haarcascade_righteye_2splits.xml')
 ```
 
-### Other Project Files:
-- **Haar Cascade Files**: Located in the "haar cascade files" folder, these XML files are necessary for detecting faces and eyes.
-- **Model File**: The "models" folder contains the pre-trained CNN model `cnnCat2.h5`.
-- **Alarm Sound**: The audio clip `alarm.wav` will play when drowsiness is detected.
-- **Python Files**:
-  - `Model.py`: The file used to build and train the CNN model.
-  - `Drowsiness detection.py`: The main file that executes the driver drowsiness detection system.
+---
 
-## How the Algorithm Works
+## 🎯 Scoring & Alert Logic
 
-### Step 1 – Image Capture
-The webcam captures images in real-time using `cv2.VideoCapture(0)` and processes each frame. The frames are stored in a variable `frame`.
+The system uses a **running score counter** rather than a fixed-frame threshold:
 
-### Step 2 – Face Detection
-The image is converted to grayscale for face detection using a **Haar Cascade Classifier**. The faces are detected using `detectMultiScale()`, and boundary boxes are drawn around the detected faces.
+```python
+lbl = ['Close', 'Open']   # CNN output labels
 
-### Step 3 – Eye Detection
-Similar to face detection, eyes are detected within the ROI using another cascade classifier. The eye images are extracted and passed to the CNN model for classification.
+# Per frame:
+if rpred[0] == 0 and lpred[0] == 0:   # Both eyes closed
+    score += 1
+    cv2.putText(frame, "Closed", ...)
+else:                                  # Eyes open
+    score -= 1
+    cv2.putText(frame, "Open", ...)
 
-### Step 4 – Eye Classification
-The extracted eye images are preprocessed by resizing to 24x24 pixels, normalizing the values, and then passed into the CNN model (`cnnCat2.h5`). The model predicts whether the eyes are open or closed.
+score = max(score, 0)                  # Score never goes negative
 
-### Step 5 – Drowsiness Detection
-A score is calculated based on the status of both eyes. If both eyes are closed for an extended period, the score increases, indicating drowsiness. If the score exceeds a threshold, an alarm is triggered using the **Pygame** library.
+if score > 15:                         # Drowsiness threshold
+    # Sound alarm
+    mixer.Sound('alarm.wav').play()
+    # Draw red border on frame
+    thicc = min(thicc + 2, 16)
+    cv2.rectangle(frame, (0,0), (width,height), (0,0,255), thicc)
+```
+
+| Variable | Value | Meaning |
+|----------|:-----:|---------|
+| `score` threshold | **15** | Frames of closed eyes before alarm |
+| `rpred` / `lpred` | `0` = Closed, `1` = Open | CNN prediction per eye |
+| Border thickness `thicc` | Grows up to 16px | Visual urgency indicator |
+
+---
+
+## 📁 Project Structure
+
+```
+Drowsiness detection [OPEN CV]/
+│
+├── 📂 haar cascade files/
+│   ├── haarcascade_frontalface_alt.xml     # Face detector
+│   ├── haarcascade_lefteye_2splits.xml     # Left eye detector
+│   └── haarcascade_righteye_2splits.xml    # Right eye detector
+│
+├── 📂 models/
+│   └── cnnCat2.h5                          # Trained CNN weights (download separately)
+│
+├── drowsinessdetection.py                  # Main script — webcam loop + detection + alarm
+├── model.py                                # CNN model definition + training script
+├── alarm.wav                               # Alert sound file
+└── README.md                               # You are here
+```
+
+> **Note:** `models/cnnCat2.h5` is not included in the repo due to GitHub file size limits. Download it from the Google Drive link in the project or train your own by running `model.py`.
 
-## Execution Instructions
+---
 
-### Running the Detection System
+## 🚀 Getting Started
 
-1. Open the command prompt and navigate to the directory where the main file `drowsiness detection.py` is located.
-2. Run the script using the following command:
+### 1. Clone the repository
 
 ```bash
-python drowsiness detection.py
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Drowsiness detection [OPEN CV]"
 ```
 
-The system will access the webcam and start detecting drowsiness. The real-time status will be displayed on the screen.
+### 2. Set up environment
 
-## Summary
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux / macOS
+venv\Scripts\activate           # Windows
+
+pip install -r requirements.txt
+```
 
-This Python project implements a **Driver Drowsiness Detection System** using **OpenCV** and a **CNN model** to detect whether the driver’s eyes are open or closed. When the eyes are detected as closed for a prolonged time, an alert sound is played to prevent potential accidents. This system can be implemented in vehicles or other applications to enhance driver safety.
+### 3. Download the trained model
 
-## Future Enhancements
+The `cnnCat2.h5` model file must be placed in the `models/` folder. Download it from the link provided in the repository issues/releases, then:
 
-- Improve the detection accuracy by training on a larger dataset.
-- Implement real-time monitoring for multiple people.
-- Add functionalities to detect other signs of drowsiness like head tilting or yawning.
-  
-## Contributing
+```bash
+mkdir models
+# Place cnnCat2.h5 inside models/
+```
 
-Feel free to contribute by submitting issues or pull requests. For major changes, please open an issue to discuss the proposed changes before submitting a PR.
+Or train your own model from scratch:
 
+```bash
+python model.py
+# Saves models/cnnCat2.h5 automatically
+```
 
-## Acknowledgments
+### 4. Run the detector
 
-- [OpenCV Documentation](https://opencv.org/)
+```bash
+python drowsinessdetection.py
+```
+
+- The webcam opens automatically
+- Eyes detected as closed → score increments
+- Score exceeds threshold → **alarm sounds + red border appears**
+- Press **`q`** to quit
+
+---
+
+## 🛠️ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Computer Vision | OpenCV (`cv2`) |
+| Eye Detection | Haar Cascade Classifiers |
+| Deep Learning | Keras + TensorFlow backend |
+| Model | Custom CNN (`cnnCat2.h5`) |
+| Audio Alarm | Pygame (`pygame.mixer`) |
+| Numerical Processing | NumPy |
+
+---
+
+## ⚠️ Known Limitations
+
+| Limitation | Detail |
+|-----------|--------|
+| **Lighting sensitivity** | Haar cascades and CNN accuracy drop under poor or uneven lighting |
+| **Glasses / sunglasses** | Frames and tinted lenses obstruct eye detection |
+| **Head pose** | Extreme angles may cause Haar cascade face/eye detection to fail |
+| **Single eye closure** | If only one eye closes (winking), score increments only partially |
+| **No yawn detection** | Fatigue from yawning is not measured — only eye closure |
+
+---
+
+## 📚 References
+
+- [OpenCV Haar Cascade Documentation](https://docs.opencv.org/4.x/db/d28/tutorial_cascade_classifier.html)
 - [Keras Documentation](https://keras.io/)
-- [TensorFlow Documentation](https://www.tensorflow.org/)
+- [Pygame mixer Documentation](https://www.pygame.org/docs/ref/mixer.html)
+
+---
+
+<div align="center">
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
 
----
\ No newline at end of file
+</div>

From 4c45f0d0f20c0d62bdaa1538cce68c31c9011eff Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:02:10 +0530
Subject: [PATCH 5/8] Update README.md

---
 .../README.md                                 | 255 ++++++++++++++++++
 1 file changed, 255 insertions(+)

diff --git a/Gender and age detection using deep learning/README.md b/Gender and age detection using deep learning/README.md
index e69de29..7cf6f24 100644
--- a/Gender and age detection using deep learning/README.md	
+++ b/Gender and age detection using deep learning/README.md	
@@ -0,0 +1,255 @@
+<div align="center">
+
+# 🧑‍🤝‍🧑 Gender & Age Detection — OpenCV Deep Learning
+
+[![Python](https://img.shields.io/badge/Python-3.7+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
+[![OpenCV](https://img.shields.io/badge/OpenCV-DNN-5C3EE8?style=for-the-badge&logo=opencv&logoColor=white)](https://opencv.org/)
+[![Caffe](https://img.shields.io/badge/Caffe-Pre--trained%20Models-red?style=for-the-badge)](http://caffe.berkeleyvision.org/)
+[![Dataset](https://img.shields.io/badge/Dataset-Adience-blueviolet?style=for-the-badge)](https://talhassner.github.io/home/projects/Adience/Adience-data.html)
+[![License](https://img.shields.io/badge/License-MIT-1abc9c?style=for-the-badge)](../LICENSE.md)
+
+> Detects **faces** in images or a live webcam feed and predicts each person's **gender** (Male/Female) and **age range** across 8 age buckets — using three pre-trained deep learning models loaded via **OpenCV DNN**.
+
+[🔙 Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
+
+</div>
+
+---
+
+## 📌 Table of Contents
+
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [The Three Models](#-the-three-models)
+- [Age & Gender Classes](#-age--gender-classes)
+- [CNN Architecture](#-cnn-architecture)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [References & Citation](#-references--citation)
+
+---
+
+## 🔬 About the Project
+
+This project builds a **real-time gender and age detection system** using three pre-trained models served through OpenCV's DNN module — no model training required. Based on the DataFlair deep learning project, it uses:
+
+- A **TensorFlow SSD** model for face detection
+- A **Caffe CNN** (Levi & Hassner, 2015) for gender classification
+- A **Caffe CNN** (Levi & Hassner, 2015) for age prediction
+
+The script (`gad.py`) accepts a **static image** via `--image` argument or runs on a **live webcam feed**, draws bounding boxes around detected faces, and overlays the predicted gender and age range on each face.
+
+---
+
+## ⚙️ How It Works
+
+```
+Input: Image / Webcam Frame
+              │
+              ▼
+  blobFromImage(frame, 1.0, (300×300), [104,117,123])
+              │
+              ▼
+  ┌─────────────────────────────────────┐
+  │  Face Detection (TensorFlow SSD)    │
+  │  opencv_face_detector_uint8.pb      │
+  │  opencv_face_detector.pbtxt         │
+  └─────────────────────────────────────┘
+              │
+              ▼
+  For each face (confidence > 0.7):
+    Crop face ROI + 20px padding
+    blobFromImage(face, 1.0, (227×227), MODEL_MEAN_VALUES)
+              │
+        ┌─────┴──────┐
+        ▼            ▼
+  ┌──────────┐  ┌──────────┐
+  │  Gender  │  │   Age    │
+  │  Network │  │  Network │
+  │ (Caffe)  │  │ (Caffe)  │
+  └──────────┘  └──────────┘
+        │            │
+        ▼            ▼
+   Male/Female   Age Bucket
+        └─────┬──────┘
+              ▼
+  "Gender: Male  Age: (25-32)"
+  overlaid on bounding box
+```
+
+**Key preprocessing constant:**
+```python
+MODEL_MEAN_VALUES = (78.4263377603, 87.7689143744, 114.895847746)
+```
+> BGR mean values subtracted from every face blob to normalize for illumination variation across the Adience training data.
+
+---
+
+## 🧠 The Three Models
+
+| Model | Framework | Files | Purpose |
+|-------|-----------|-------|---------|
+| **Face Detector** | TensorFlow SSD | `opencv_face_detector_uint8.pb` + `opencv_face_detector.pbtxt` | Detect face bounding boxes |
+| **Gender Net** | Caffe (Levi & Hassner) | `gender_net.caffemodel` + `gender_deploy.prototxt` | Classify Male / Female |
+| **Age Net** | Caffe (Levi & Hassner) | `age_net.caffemodel` + `age_deploy.prototxt` | Predict one of 8 age ranges |
+
+```python
+faceNet   = cv2.dnn.readNet("opencv_face_detector_uint8.pb", "opencv_face_detector.pbtxt")
+ageNet    = cv2.dnn.readNet("age_net.caffemodel",    "age_deploy.prototxt")
+genderNet = cv2.dnn.readNet("gender_net.caffemodel", "gender_deploy.prototxt")
+```
+
+---
+
+## 🏷️ Age & Gender Classes
+
+**Gender** (2 classes):
+```python
+genderList = ['Male', 'Female']
+```
+
+**Age** (8 buckets):
+```python
+ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)',
+           '(25-32)', '(38-43)', '(48-53)', '(60-100)']
+```
+
+> Age is treated as a **classification problem** over 8 discrete ranges rather than regression — Levi & Hassner (2015) found classification over predefined buckets more robust than direct regression on the Adience benchmark.
+
+---
+
+## 🏗️ CNN Architecture
+
+Both age and gender models share the same architecture — a lightweight CNN similar to CaffeNet/AlexNet, trained on the **Adience dataset**:
+
+```
+Input: 227 × 227 × 3 face crop (mean-subtracted)
+         │
+Conv1: 96 filters, 7×7 kernel → ReLU → MaxPool → LRN
+Conv2: 256 filters, 5×5 kernel → ReLU → MaxPool → LRN
+Conv3: 384 filters, 3×3 kernel → ReLU → MaxPool
+         │
+FC1: 512 nodes → ReLU → Dropout
+FC2: 512 nodes → ReLU → Dropout
+         │
+Softmax
+├── Gender Net output: 2  (Male / Female)
+└── Age Net output:    8  (age range buckets)
+```
+
+---
+
+## 📁 Project Structure
+
+```
+Gender and age detection using deep learning/
+│
+├── gad.py                              # Main script — detection pipeline
+│
+├── age_net.caffemodel                  # Age model weights (Caffe, ~44 MB)
+├── age_deploy.prototxt                 # Age model architecture
+├── gender_net.caffemodel               # Gender model weights (Caffe, ~44 MB)
+├── gender_deploy.prototxt              # Gender model architecture
+├── opencv_face_detector_uint8.pb       # Face detector weights (TensorFlow)
+├── opencv_face_detector.pbtxt          # Face detector architecture
+│
+├── girl1.jpg                           # Sample test images
+├── girl2.jpg                           # ↑
+├── kid1.jpg                            # ↑
+├── man1.jpg                            # ↑
+├── minion.jpg                          # ↑
+├── woman1.jpg                          # ↑
+├── woman3.jpg                          # ↑
+│
+└── README.md
+```
+
+> **Note:** The `.caffemodel` files (~44 MB each) may not be included in the repository due to GitHub's file size limits. If missing, download them from [Tal Hassner's Adience page](https://talhassner.github.io/home/projects/Adience/Adience-data.html) and place them in the project root.
+
+---
+
+## 🚀 Getting Started
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Gender and age detection using deep learning"
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux / macOS
+venv\Scripts\activate           # Windows
+
+pip install -r requirements.txt
+```
+
+### 3. Run on a sample image
+
+```bash
+python gad.py --image girl1.jpg
+# Output → Gender: Female  Age: (25-32) years
+```
+
+Try the included sample images:
+
+```bash
+python gad.py --image man1.jpg
+python gad.py --image kid1.jpg
+python gad.py --image woman1.jpg
+python gad.py --image minion.jpg   # 🤔
+```
+
+### 4. Run on live webcam
+
+```bash
+python gad.py
+# No --image flag → defaults to webcam (index 0)
+# Press Q to quit
+```
+
+---
+
+## 🛠️ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Computer Vision | OpenCV (`cv2.dnn`) |
+| Face Detection | TensorFlow SSD (ResNet-10 backbone) |
+| Age / Gender Models | Caffe (Levi & Hassner, 2015) |
+| Argument Parsing | `argparse` |
+| Numerical Processing | NumPy |
+
+---
+
+## 📚 References & Citation
+
+```bibtex
+@inproceedings{Levi2015,
+  author    = {Gil Levi and Tal Hassner},
+  title     = {Age and Gender Classification Using Convolutional Neural Networks},
+  booktitle = {IEEE Workshop on Analysis and Modeling of Faces and Gestures (AMFG),
+               at the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
+  year      = {2015}
+}
+```
+
+- [Levi & Hassner (2015) — Original Paper & Models](https://talhassner.github.io/home/projects/Adience/Adience-data.html)
+- [Adience Benchmark Dataset](https://talhassner.github.io/home/projects/Adience/Adience-data.html)
+- [OpenCV DNN Face Detector](https://github.com/opencv/opencv/tree/master/samples/dnn)
+- [LearnOpenCV — Age & Gender Classification](https://learnopencv.com/age-gender-classification-using-opencv-deep-learning-c-python/)
+
+---
+
+<div align="center">
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
+
+</div>

From 788922517d9338f3a8aa4e63331a06cd7de911b7 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:05:28 +0530
Subject: [PATCH 6/8] Create README.md

---
 .../README.md                                 | 218 ++++++++++++++++++
 1 file changed, 218 insertions(+)
 create mode 100644 Getting Admission in College Prediction/README.md

diff --git a/Getting Admission in College Prediction/README.md b/Getting Admission in College Prediction/README.md
new file mode 100644
index 0000000..d4278cc
--- /dev/null
+++ b/Getting Admission in College Prediction/README.md	
@@ -0,0 +1,218 @@
+<div align="center">
+
+# 🎓 Getting Admission in College Prediction
+
+[![Python](https://img.shields.io/badge/Python-3.7+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
+[![scikit-learn](https://img.shields.io/badge/scikit--learn-F7931E?style=for-the-badge&logo=scikit-learn&logoColor=white)](https://scikit-learn.org/)
+[![Jupyter](https://img.shields.io/badge/Jupyter-Notebook-F37626?style=for-the-badge&logo=jupyter&logoColor=white)](https://jupyter.org/)
+[![Dataset](https://img.shields.io/badge/Dataset-Kaggle-20BEFF?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/mohansacharya/graduate-admissions)
+[![Best R²](https://img.shields.io/badge/Best%20R²-0.821-brightgreen?style=for-the-badge)]()
+[![License](https://img.shields.io/badge/License-MIT-1abc9c?style=for-the-badge)](../LICENSE.md)
+
+> Predicts a student's **probability of graduate college admission** (as a continuous value between 0 and 1) from 7 academic and profile features — using a `GridSearchCV`-powered model comparison across 6 regression algorithms.
+
+[🔙 Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
+
+</div>
+
+---
+
+## 📌 Table of Contents
+
+- [About the Project](#-about-the-project)
+- [Dataset](#-dataset)
+- [Features](#-features)
+- [Methodology](#-methodology)
+- [Model Comparison Results](#-model-comparison-results)
+- [Final Model Performance](#-final-model-performance)
+- [Sample Predictions](#-sample-predictions)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+
+---
+
+## 🔬 About the Project
+
+Getting into a good graduate program is one of the most competitive processes for students worldwide. This project builds a **regression model** that predicts the probability of admission based on a student's GRE score, TOEFL score, CGPA, university rating, SOP, LOR, and research experience.
+
+Six regression algorithms are trained and compared using **GridSearchCV with 5-fold cross-validation** via a custom `find_best_model()` function. The best-performing model is then evaluated on a held-out test set.
+
+**What this project covers:**
+- Exploratory data analysis on 500 graduate applicant profiles
+- Custom `find_best_model()` with GridSearchCV across 6 regressors
+- Feature importance and correlation analysis
+- Linear Regression selected as the final model with **R² = 0.821** on test set
+
+---
+
+## 📊 Dataset
+
+| Property | Details |
+|----------|---------|
+| **File** | `admission_predict.csv` |
+| **Source** | [Kaggle — Graduate Admissions](https://www.kaggle.com/mohansacharya/graduate-admissions) |
+| **Rows** | 500 student records |
+| **Columns** | 9 (including Serial No. and target) |
+| **Task** | Regression — predict `Chance of Admit` ∈ [0, 1] |
+| **Missing Values** | None |
+
+---
+
+## 🔬 Features
+
+| Column | Type | Range | Description |
+|--------|------|:-----:|-------------|
+| `GRE Score` | Integer | 290–340 | Graduate Record Examination score |
+| `TOEFL Score` | Integer | 92–120 | Test of English as a Foreign Language score |
+| `University Rating` | Integer | 1–5 | Prestige rating of undergraduate university |
+| `SOP` | Float | 1.0–5.0 | Strength of Statement of Purpose |
+| `LOR` | Float | 1.0–5.0 | Strength of Letter of Recommendation |
+| `CGPA` | Float | 6.8–9.92 | Undergraduate GPA (out of 10) |
+| `Research` | Binary | 0 / 1 | Research experience (0 = No, 1 = Yes) |
+| `Chance of Admit` ⭐ | Float | 0.34–0.97 | **Target variable** — probability of admission |
+
+> `Serial No.` is dropped before training as it carries no predictive information.
+
+---
+
+## ⚙️ Methodology
+
+```
+Load admission_predict.csv (500 × 9)
+          │
+          ▼
+EDA + Correlation Analysis
+(heatmap, pairplots, distributions)
+          │
+          ▼
+Drop 'Serial No.' column
+Define X (7 features) and y ('Chance of Admit')
+          │
+          ▼
+find_best_model(X, y)
+└── GridSearchCV (cv=5) over 6 models
+          │
+          ▼
+Select best model → Linear Regression (normalize=True)
+          │
+          ▼
+Train/Test Split (80/20, random_state=5)
+→ 400 train samples, 100 test samples
+          │
+          ▼
+Fit LinearRegression(normalize=True)
+Evaluate on test set → R² = 0.821
+          │
+          ▼
+Sample Predictions
+```
+
+---
+
+## 📈 Model Comparison Results
+
+All 6 models evaluated using `GridSearchCV(cv=5)` via the custom `find_best_model()` function:
+
+| Model | Best Parameters | CV R² Score |
+|-------|----------------|:-----------:|
+| **Linear Regression** ✅ | `{'normalize': True}` | **0.8108** |
+| Random Forest | `{'n_estimators': 15}` | 0.7689 |
+| KNN | `{'n_neighbors': 20}` | 0.7230 |
+| SVR | `{'gamma': 'scale'}` | 0.6541 |
+| Decision Tree | `{'criterion': 'mse', 'splitter': 'random'}` | 0.5868 |
+| Lasso | `{'alpha': 1, 'selection': 'random'}` | 0.2151 |
+
+> ✅ **Linear Regression** selected as the final model — highest cross-validation R² score of **0.8108**.
+
+> Lasso performed poorly (R² = 0.2151) because L1 regularization shrinks coefficients aggressively, which is harmful here where all 7 features are genuinely correlated with admission probability.
+
+---
+
+## 🏆 Final Model Performance
+
+| Metric | Value |
+|--------|:-----:|
+| Model | `LinearRegression(normalize=True)` |
+| 5-Fold Cross-Validation Score | **81.0%** |
+| Train samples | 400 |
+| Test samples | 100 |
+| **Test R² Score** | **0.8215** |
+
+---
+
+## 🔮 Sample Predictions
+
+```python
+# Input: [GRE, TOEFL, Univ Rating, SOP, LOR, CGPA, Research]
+
+model.predict([[337, 118, 4, 4.5, 4.5, 9.65, 0]])
+# → Chance of getting into UCLA is 92.855%
+
+model.predict([[320, 113, 2, 2.0, 2.5, 8.64, 1]])
+# → Chance of getting into UCLA is 73.627%
+```
+
+---
+
+## 📁 Project Structure
+
+```
+Getting Admission in College Prediction/
+│
+├── Admission_prediction.ipynb      # Main notebook — EDA, model comparison, training
+├── admission_predict.csv           # Dataset (500 student records)
+├── requirements.txt                # Python dependencies
+└── README.md                       # You are here
+```
+
+---
+
+## 🚀 Getting Started
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Getting Admission in College Prediction"
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux / macOS
+venv\Scripts\activate           # Windows
+
+pip install -r requirements.txt
+```
+
+### 3. Launch the notebook
+
+```bash
+jupyter notebook Admission_prediction.ipynb
+```
+
+---
+
+## 🛠️ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7.4 |
+| ML Library | scikit-learn |
+| Model Selection | `GridSearchCV`, `cross_val_score` |
+| Models | `LinearRegression`, `Lasso`, `SVR`, `DecisionTreeRegressor`, `RandomForestRegressor`, `KNeighborsRegressor` |
+| Data Processing | Pandas, NumPy |
+| Visualization | Matplotlib |
+| Notebook | Jupyter |
+
+---
+
+<div align="center">
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
+
+</div>

From 93b1a30e8951ef2833e860afc7a8c40a0344e1d3 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:10:25 +0530
Subject: [PATCH 7/8] Update README.md

---
 .../README.md                                 | 303 +++++++++++++++++-
 1 file changed, 302 insertions(+), 1 deletion(-)

diff --git a/Heart Disease Prediction [END 2 END]/README.md b/Heart Disease Prediction [END 2 END]/README.md
index bf59832..d82f9b5 100644
--- a/Heart Disease Prediction [END 2 END]/README.md	
+++ b/Heart Disease Prediction [END 2 END]/README.md	
@@ -1 +1,302 @@
-Look for Deployed Project At ![https://github.com/shsarv/Cardio-Monitor](https://github.com/shsarv/Cardio-Monitor)
\ No newline at end of file
+- Look for final Project At **![https://github.com/shsarv/Cardio-Monitor](https://github.com/shsarv/Cardio-Monitor)**
+
+<div align="center">
+
+# 🫀 Cardio Monitor — Heart Disease Prediction Web App
+
+[![Python](https://img.shields.io/badge/Python-3.7+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
+[![Flask](https://img.shields.io/badge/Flask-Web%20App-000000?style=for-the-badge&logo=flask&logoColor=white)](https://flask.palletsprojects.com/)
+[![MongoDB](https://img.shields.io/badge/MongoDB-Database-47A248?style=for-the-badge&logo=mongodb&logoColor=white)](https://www.mongodb.com/)
+[![scikit-learn](https://img.shields.io/badge/scikit--learn-F7931E?style=for-the-badge&logo=scikit-learn&logoColor=white)](https://scikit-learn.org/)
+[![Accuracy](https://img.shields.io/badge/Accuracy-92%25-brightgreen?style=for-the-badge)]()
+[![License](https://img.shields.io/badge/License-MIT-1abc9c?style=for-the-badge)](LICENSE)
+
+> **Cardio Monitor** is a full-stack web application that predicts whether a patient is at risk of developing **heart disease** using a machine learning model with **92% accuracy** — built with Flask, MongoDB, and scikit-learn. Course project for **Big Data Analytics (BCSE0158)**.
+
+[![Stars](https://img.shields.io/github/stars/shsarv/Cardio-Monitor?style=social)](https://github.com/shsarv/Cardio-Monitor/stargazers)
+[![Forks](https://img.shields.io/github/forks/shsarv/Cardio-Monitor?style=social)](https://github.com/shsarv/Cardio-Monitor/forks)
+
+[🔗 Core ML Project](https://github.com/shsarv/Heart-Disease-Prediction) &nbsp;·&nbsp; [🐛 Report Bug](https://github.com/shsarv/Cardio-Monitor/issues) &nbsp;·&nbsp; [✨ Request Feature](https://github.com/shsarv/Cardio-Monitor/issues)
+
+</div>
+
+---
+
+## ⚠️ Medical Disclaimer
+
+> **This application is for educational and research purposes only.** It does not constitute medical advice. Always consult a qualified cardiologist or medical professional for clinical decisions.
+
+---
+
+## 📌 Table of Contents
+
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [Dataset & Features](#-dataset--features)
+- [Model & Performance](#-model--performance)
+- [Architecture](#-architecture)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Future Roadmap](#-future-roadmap)
+- [Tech Stack](#-tech-stack)
+- [References](#-references)
+
+---
+
+## 🔬 About the Project
+
+Heart disease is the leading cause of death globally. Early detection through continuous monitoring can significantly reduce mortality rates. **Cardio Monitor** combines:
+
+- A **machine learning classifier** (92% accuracy) trained on the Cleveland Heart Disease dataset
+- A **Flask web app** for real-time patient input and prediction
+- A **MongoDB** backend for storing patient records and prediction history
+- A **visualization module** for EDA and model insights
+- A roadmap toward **Apache Spark Streaming** for large-scale real-time data processing
+
+The core ML research and model building is documented in the companion repository: [shsarv/Heart-Disease-Prediction](https://github.com/shsarv/Heart-Disease-Prediction).
+
+---
+
+## ⚙️ How It Works
+
+```
+Patient Inputs Clinical Data via Web Form
+              │
+              ▼
+       Flask (app.py)
+       routes request to
+              │
+              ▼
+     prediction.py
+     Loads Heart_model1.pkl
+     Runs model.predict()
+              │
+       ┌──────┴──────┐
+       ▼             ▼
+   At Risk ❤️‍🩹    Not at Risk ✅
+       │
+       ▼
+  Result displayed on web page
+  Record saved to MongoDB (database.py)
+```
+
+---
+
+## 📊 Dataset & Features
+
+| Property | Details |
+|----------|---------|
+| **File** | `heart.csv` |
+| **Source** | Cleveland Heart Disease Dataset (UCI ML Repository) |
+| **Samples** | 303 patient records |
+| **Task** | Binary classification — Heart Disease (1) / No Heart Disease (0) |
+
+### Input Features
+
+| Feature | Description | Range |
+|---------|-------------|-------|
+| `age` | Age of patient | Years |
+| `sex` | Sex | 0 = Female, 1 = Male |
+| `cp` | Chest pain type | 0–3 |
+| `trestbps` | Resting blood pressure | mm Hg |
+| `chol` | Serum cholesterol | mg/dl |
+| `fbs` | Fasting blood sugar > 120 mg/dl | 0 / 1 |
+| `restecg` | Resting ECG results | 0–2 |
+| `thalach` | Maximum heart rate achieved | bpm |
+| `exang` | Exercise induced angina | 0 / 1 |
+| `oldpeak` | ST depression induced by exercise | Float |
+| `slope` | Slope of peak exercise ST segment | 0–2 |
+| `ca` | Number of major vessels coloured by fluoroscopy | 0–3 |
+| `thal` | Thalassemia | 0–3 |
+| `target` ⭐ | **Heart disease present** | 0 / 1 |
+
+---
+
+## 🤖 Model & Performance
+
+| Metric | Value |
+|--------|:-----:|
+| **Accuracy** | **92%** |
+| **Saved Model** | `Heart_model1.pkl` / `heartmodel.pkl` |
+| **Algorithm** | scikit-learn classifier (see core project) |
+| **Library** | scikit-learn + mlxtend |
+
+> Two model files are present in the repo: `Heart_model1.pkl` (primary, used by `prediction.py`) and `heartmodel.pkl` (earlier iteration). Both are serialized with `pickle`.
+
+> For full model building details — EDA, feature selection, algorithm comparison, and evaluation — see the core project: [shsarv/Heart-Disease-Prediction](https://github.com/shsarv/Heart-Disease-Prediction).
+
+---
+
+## 🏗️ Architecture
+
+```
+┌─────────────────────────────────────────────┐
+│              Flask Application              │
+│                  (app.py)                   │
+│                                             │
+│  ┌──────────┐  ┌────────────┐  ┌─────────┐ │
+│  │templates/│  │prediction  │  │database │ │
+│  │  HTML    │  │   .py      │  │  .py    │ │
+│  │  pages   │  │ ML model   │  │ MongoDB │ │
+│  └──────────┘  └────────────┘  └─────────┘ │
+│                                             │
+│  ┌──────────────────────────────────────┐   │
+│  │          static/                     │   │
+│  │   CSS · JS · images                  │   │
+│  └──────────────────────────────────────┘   │
+└─────────────────────────────────────────────┘
+         │                    │
+         ▼                    ▼
+  Heart_model1.pkl       MongoDB Atlas
+  (scikit-learn)         (patient records
+                         + predictions)
+```
+
+---
+
+## 📁 Project Structure
+
+```
+Cardio-Monitor/
+│
+├── 📂 heart disease prediction/     # Jupyter notebooks — EDA & model training
+├── 📂 static/                       # CSS, JS, images
+├── 📂 templates/                    # Jinja2 HTML templates (input form, result pages)
+├── 📂 __pycache__/
+│
+├── app.py                           # Flask entry point — routes and app config
+├── prediction.py                    # Loads Heart_model1.pkl, runs inference
+├── modelbuild.py                    # Model training and serialization script
+├── database.py                      # MongoDB connection and CRUD operations
+├── visualization.py                 # EDA and data visualization utilities
+│
+├── Heart_model1.pkl                 # Primary trained model (pickle)
+├── heartmodel.pkl                   # Alternate model iteration (pickle)
+├── heart.csv                        # Cleveland Heart Disease dataset
+├── Input Data.png                   # Screenshot of the web app input form
+│
+├── Procfile                         # Heroku deployment config
+├── requirements.txt                 # Python dependencies
+├── .gitignore
+└── README.md
+```
+
+---
+
+## 🚀 Getting Started
+
+### Prerequisites
+
+- Python 3.7+
+- MongoDB (local or [MongoDB Atlas](https://www.mongodb.com/cloud/atlas))
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Cardio-Monitor.git
+cd Cardio-Monitor
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux / macOS
+venv\Scripts\activate           # Windows
+
+pip install -r requirements.txt
+```
+
+### 3. Configure MongoDB
+
+In `database.py`, update your MongoDB connection string:
+
+```python
+# Local MongoDB
+client = pymongo.MongoClient("mongodb://localhost:27017/")
+
+# MongoDB Atlas (cloud)
+client = pymongo.MongoClient("mongodb+srv://<user>:<password>@cluster.mongodb.net/")
+```
+
+### 4. Run the app
+
+```bash
+python app.py
+```
+
+Navigate to → **http://127.0.0.1:5000**
+
+### 5. Deploy to Heroku
+
+```bash
+heroku login
+heroku create cardio-monitor-app
+git push heroku main
+heroku open
+```
+
+> The `Procfile` already contains: `web: gunicorn app:app`
+
+---
+
+## 🗺️ Future Roadmap
+
+| Feature | Status |
+|---------|:------:|
+| Flask web app with MongoDB | ✅ Done |
+| 92% accuracy ML model | ✅ Done |
+| Heroku deployment | ✅ Done |
+| **Apache Spark Streaming** — real-time patient data ingestion | 🔜 Planned |
+| **PySpark MLlib** — large-scale distributed model training | 🔜 Planned |
+| **Deep Learning model** (Keras/TensorFlow) | 🔜 Planned |
+| Live demo deployment | 🔜 Planned |
+
+---
+
+## 🛠️ Tech Stack
+
+**Current:**
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Web Framework | Flask |
+| ML Library | scikit-learn, mlxtend |
+| Database | MongoDB (PyMongo) |
+| Model Serialization | Pickle |
+| Frontend | HTML5, CSS3, Bootstrap |
+| Deployment | Heroku (Procfile + gunicorn) |
+| Notebook | Jupyter |
+
+**Planned (Future):**
+
+| Layer | Technology |
+|-------|-----------|
+| Streaming | Apache Spark Streaming |
+| Distributed ML | PySpark MLlib |
+| Deep Learning | Keras / TensorFlow (DeepL) |
+| Database (scale) | MongoDB Atlas |
+
+---
+
+## 📚 References
+
+- [Cleveland Heart Disease Dataset — UCI ML Repository](https://archive.ics.uci.edu/ml/datasets/Heart+Disease)
+- [Core ML Project — shsarv/Heart-Disease-Prediction](https://github.com/shsarv/Heart-Disease-Prediction)
+- [Flask Documentation](https://flask.palletsprojects.com/)
+- [PyMongo Documentation](https://pymongo.readthedocs.io/)
+- [mlxtend Documentation](https://rasbt.github.io/mlxtend/)
+- [Apache Spark Streaming](https://spark.apache.org/streaming/)
+
+---
+
+<div align="center">
+
+**Created by [Sarvesh Kumar Sharma](https://github.com/shsarv)**
+
+Course Project — Big Data Analytics (BCSE0158)
+
+⭐ Star this repo if you found it helpful!
+
+</div>

From 23afd01662bce6d157958c873d9cff84a788ae93 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:17:03 +0530
Subject: [PATCH 8/8] Create README.md

---
 Human Activity Detection/README.md | 300 +++++++++++++++++++++++++++++
 1 file changed, 300 insertions(+)
 create mode 100644 Human Activity Detection/README.md

diff --git a/Human Activity Detection/README.md b/Human Activity Detection/README.md
new file mode 100644
index 0000000..985071e
--- /dev/null
+++ b/Human Activity Detection/README.md	
@@ -0,0 +1,300 @@
+<div align="center">
+
+# 🏃 Human Activity Recognition — 2D Pose + LSTM RNN
+
+[![Python](https://img.shields.io/badge/Python-3.7+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
+[![TensorFlow](https://img.shields.io/badge/TensorFlow-1.x-FF6F00?style=for-the-badge&logo=tensorflow&logoColor=white)](https://www.tensorflow.org/)
+[![LSTM](https://img.shields.io/badge/LSTM-2%20Stacked%20Layers-9B59B6?style=for-the-badge)]()
+[![Accuracy](https://img.shields.io/badge/Accuracy->90%25-brightgreen?style=for-the-badge)]()
+[![ngrok](https://img.shields.io/badge/Deployed-ngrok-1F8ACB?style=for-the-badge)]()
+[![License](https://img.shields.io/badge/License-MIT-1abc9c?style=for-the-badge)](../LICENSE.md)
+
+> Classifies **6 human activities** from **2D pose time series** (OpenPose keypoints) using a **2-layer stacked LSTM RNN** built in TensorFlow 1.x — achieving **>90% accuracy** in ~7 minutes of training. Deployed via ngrok with a Flask web app and `sample_video.mp4` demo.
+
+[🔙 Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
+
+</div>
+
+---
+
+## 📌 Table of Contents
+
+- [About the Project](#-about-the-project)
+- [Key Idea — Why 2D Pose?](#-key-idea--why-2d-pose)
+- [Dataset](#-dataset)
+- [LSTM Architecture](#-lstm-architecture)
+- [Training Configuration](#-training-configuration)
+- [Results & Findings](#-results--findings)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [References](#-references)
+
+---
+
+## 🔬 About the Project
+
+This experiment classifies human activities using **2D pose time series data** and a **stacked LSTM RNN**. Rather than feeding raw RGB images or expensive 3D pose data into the network, it uses **2D (x, y) keypoints** extracted from video frames via OpenPose — a much lighter and more accessible input representation.
+
+The core research questions:
+
+- Can **2D pose** match **3D pose** accuracy for activity recognition? (removes need for RGBD cameras)
+- Can **2D pose** match **raw RGB image** accuracy? (smaller input = smaller model = better with limited data)
+- Does this approach generalize to **animal** behaviour classification for robotics applications?
+
+The network architecture is based on Guillaume Chevalier's *LSTMs for Human Activity Recognition (2016)*, with key modifications for large class-ordered datasets using **random batch sampling without replacement**.
+
+---
+
+## 🧠 Key Idea — Why 2D Pose?
+
+```
+Raw Video Frame (640×480 RGB)
+        │
+        ▼
+   OpenPose Inference
+   18 body keypoints × (x, y) coords
+        │
+        ▼
+   36-dimensional feature vector per frame
+        │
+        ▼  (32 frames = 1 time window)
+   LSTM RNN  →  Activity Class
+```
+
+| Input Type | Pros | Cons |
+|------------|------|------|
+| Raw RGB images | High information | Large models, lots of data needed |
+| 3D pose (RGBD) | Rich spatial info | Needs depth sensors |
+| **2D pose (x,y)** ✅ | Lightweight, RGB-only camera, small model | Some spatial ambiguity |
+
+> Limiting the feature vector to 2D pose keypoints allows for a **smaller LSTM model** that generalises better on limited datasets — particularly relevant for future animal behaviour recognition tasks.
+
+---
+
+## 📊 Dataset
+
+| Property | Details |
+|----------|---------|
+| **Source** | Berkeley Multimodal Human Action Database (MHAD) — 2D poses extracted via OpenPose |
+| **Download** | `RNN-HAR-2D-Pose-database.zip` (~19.2 MB, Google Drive) |
+| **Subjects** | 12 |
+| **Angles** | 4 camera angles |
+| **Repetitions** | 5 per subject per action |
+| **Total videos** | 1,438 (2 missing from original 1,440) |
+| **Total frames** | 211,200 |
+| **Training windows** | 22,625 (32 timesteps each, 50% overlap) |
+| **Test windows** | 5,751 |
+| **Input shape** | `(22625, 32, 36)` → windows × timesteps × features |
+| **Preprocessing** | ❌ None — raw, unnormalized pose coordinates |
+
+### Activity Classes (6)
+
+| Label | Activity |
+|-------|----------|
+| `JUMPING` | Vertical jumps |
+| `JUMPING_JACKS` | Jumping jacks |
+| `BOXING` | Boxing motions |
+| `WAVING_2HANDS` | Waving with both hands |
+| `WAVING_1HAND` | Waving with one hand |
+| `CLAPPING_HANDS` | Clapping hands |
+
+### Data Files
+
+```
+RNN-HAR-2D-Pose-database/
+├── X_train.txt    # 22,625 training windows (36 comma-separated floats per row)
+├── X_test.txt     # 5,751 test windows
+├── Y_train.txt    # Training labels (0–5)
+└── Y_test.txt     # Test labels (0–5)
+```
+
+---
+
+## 🏗️ LSTM Architecture
+
+```
+Input: (batch_size, 32 timesteps, 36 features)
+             │
+             ▼
+  Linear projection: 36 → 34 (ReLU)
+             │
+             ▼
+  ┌──────────────────────────────────┐
+  │  BasicLSTMCell(34, forget_bias=1)│  ← Layer 1
+  ├──────────────────────────────────┤
+  │  BasicLSTMCell(34, forget_bias=1)│  ← Layer 2
+  └──────────────────────────────────┘
+  tf.contrib.rnn.MultiRNNCell (stacked)
+  tf.contrib.rnn.static_rnn (many-to-one)
+             │
+        Last output only
+             │
+             ▼
+  Linear: 34 → 6
+  Softmax → Activity class
+```
+
+> **Why n_hidden = 34?** Testing across a range of hidden unit counts showed best generalisation when hidden units ≈ n_input (36). 34 was found to be optimal.
+
+> **Many-to-one classifier** — only the last LSTM output (timestep 32) is used for classification, not the full sequence output.
+
+---
+
+## ⚙️ Training Configuration
+
+| Parameter | Value |
+|-----------|-------|
+| Framework | TensorFlow 1.x (`%tensorflow_version 1.x`) |
+| Timesteps (`n_steps`) | 32 |
+| Input features (`n_input`) | 36 (18 keypoints × x, y) |
+| Hidden units (`n_hidden`) | 34 |
+| Classes (`n_classes`) | 6 |
+| Epochs | 300 |
+| Batch size | 512 |
+| Optimizer | Adam |
+| Initial learning rate | 0.005 |
+| LR decay | Exponential — `0.96` per 100,000 steps |
+| Loss | Softmax cross-entropy + L2 regularization |
+| L2 lambda | 0.0015 |
+| Batch strategy | Random sampling **without replacement** (prevents class-order bias) |
+| Training time | ~7 minutes (Google Colab) |
+
+**L2 regularization formula:**
+```python
+l2 = lambda_loss_amount * sum(
+    tf.nn.l2_loss(tf_var) for tf_var in tf.trainable_variables()
+)
+cost = tf.reduce_mean(softmax_cross_entropy) + l2
+```
+
+**Decayed learning rate:**
+```python
+learning_rate = init_lr * decay_rate ^ (global_step / decay_steps)
+# = 0.005 * 0.96 ^ (global_step / 100000)
+```
+
+---
+
+## 📈 Results & Findings
+
+| Metric | Value |
+|--------|:-----:|
+| **Final Accuracy** | **> 90%** |
+| Training time | ~7 minutes |
+
+**Confusion pairs observed:**
+- `CLAPPING_HANDS` ↔ `BOXING` — similar upper-body motion pattern
+- `JUMPING_JACKS` ↔ `WAVING_2HANDS` — symmetric arm movements
+
+**Key conclusions:**
+- 2D pose achieves >90% accuracy, validating its use over more expensive 3D pose or raw RGB inputs
+- Hidden units ≈ n_input (34 ≈ 36) gives optimal generalisation
+- Random batch sampling without replacement is **critical** — ordered class batches degrade training significantly
+- Approach is promising for future animal behaviour estimation with autonomous mobile robots
+
+---
+
+## 📁 Project Structure
+
+```
+Human Activity Detection/
+│
+├── 📂 images/                                        # Result plots and visualizations
+├── 📂 models/                                        # Saved LSTM model weights
+├── 📂 src/                                           # Helper source scripts
+├── 📂 templates/                                     # HTML templates (Flask app)
+│
+├── Human_Activity_Recogination.ipynb                 # Main notebook — dataset, LSTM, training
+├── Human_Action_Classification_deployment_with_ngrok.ipynb  # Flask + ngrok deployment notebook
+├── lstm_train.ipynb                                  # Standalone LSTM training notebook
+├── app.py                                            # Flask web application
+├── sample_video.mp4                                  # Sample video for live demo
+└── requirements.txt                                  # Python dependencies
+```
+
+---
+
+## 🚀 Getting Started
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Human Activity Detection"
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux / macOS
+venv\Scripts\activate           # Windows
+
+pip install -r requirements.txt
+```
+
+> ⚠️ **TensorFlow 1.x required.** The LSTM uses `tf.contrib.rnn` and `tf.placeholder` APIs from TF1.
+> ```bash
+> pip install tensorflow==1.15.0
+> ```
+
+### 3. Download the dataset
+
+The dataset is downloaded automatically in the notebook:
+```python
+!wget -O RNN-HAR-2D-Pose-database.zip \
+  https://drive.google.com/u/1/uc?id=1IuZlyNjg6DMQE3iaO1Px6h1yLKgatynt
+!unzip RNN-HAR-2D-Pose-database.zip
+```
+
+### 4. Run on Google Colab (recommended)
+
+```
+1. Open Human_Activity_Recogination.ipynb in Google Colab
+2. Runtime → Change runtime type → GPU (optional, speeds training)
+3. Run all cells — training completes in ~7 minutes
+```
+
+### 5. Deploy with ngrok
+
+```
+Open Human_Action_Classification_deployment_with_ngrok.ipynb
+Follow the ngrok setup cells to expose the Flask app publicly
+```
+
+---
+
+## 🛠️ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Deep Learning | TensorFlow 1.x (`tf.contrib.rnn`) |
+| Model | 2-layer stacked LSTM (`BasicLSTMCell`) |
+| Pose Extraction | OpenPose (CMU Perceptual Computing Lab) |
+| Data Processing | NumPy |
+| Visualization | Matplotlib |
+| Web Framework | Flask |
+| Deployment | ngrok (tunnel) |
+| Notebook | Jupyter / Google Colab |
+
+---
+
+## 📚 References
+
+- Guillaume Chevalier (2016). *LSTMs for Human Activity Recognition.* [github.com/guillaume-chevalier](https://github.com/guillaume-chevalier/LSTM-Human-Activity-Recognition) — MIT License
+- [Berkeley MHAD Dataset](http://tele-immersion.citris-uc.org/berkeley_mhad)
+- [OpenPose — CMU Perceptual Computing Lab](https://github.com/CMU-Perceptual-Computing-Lab/openpose)
+- Goodfellow et al. *"It has been observed in practice that when using a larger batch there is a significant degradation in the quality of the model..."* — basis for small batch strategy
+- [Andrej Karpathy — The Unreasonable Effectiveness of RNNs](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) — referenced for many-to-one classifier design
+
+---
+
+<div align="center">
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
+
+</div>

Training Loss	0.93
Validation Loss	3.79
-Holdout Loss	2.64
- Model -	- Epoch -	- Train Accuracy -	- Dev Accuracy -	- Bias -	- Variance -
- Model A -	- 2 -	- 27.91 -	- 21.19 -	- 72.09 -	- 6.72 -
- Model B -	- 5 -	- 37.83 -	- 25.79 -	- 62.17 -	- 12.04 -
- Model C -	- 10 -	- 86.95 -	- 40.68 -	- 13.06 -	- 46.27 -