+- [Colorful Image Colorization — Zhang et al. (2016)](https://arxiv.org/abs/1603.08511)
+- [Official Demo & Model — richzhang/colorization](https://github.com/richzhang/colorization)
+- [OpenCV DNN colorization sample](https://github.com/opencv/opencv/blob/master/samples/dnn/colorization.py)
+- [PyImageSearch Tutorial — Adrian Rosebrock](https://pyimagesearch.com/2019/02/25/black-and-white-image-colorization-with-opencv-and-deep-learning/)
-``` python
-%%writefile gui.py
-import tkinter as tk
-from tkinter import *
-from tkinter import filedialog
-from PIL import Image, ImageTk
-import os
-import sys
-import numpy as np
-import cv2 as cv
-import matplotlib
-matplotlib.use('Agg')
-
-if os.environ.get('DISPLAY', '') == '':
-    print('no display found. Using :0.0')
-    os.environ['DISPLAY'] = ':0.0'
-
-# Cluster centers for the ab channels, plus the pretrained Caffe model.
-numpy_file = np.load('./pts_in_hull.npy')
-Caffe_net = cv.dnn.readNetFromCaffe("./models/colorization_deploy_v2.prototxt", "./models/colorization_release_v2.caffemodel")
-numpy_file = numpy_file.transpose().reshape(2, 313, 1, 1)
-
-class Window(Frame):
-    def __init__(self, master=None):
-        Frame.__init__(self, master)
-
-        self.master = master
-        self.pos = []
-        self.master.title("B&W Image Colorization")
-        self.pack(fill=BOTH, expand=1)
-
-        menu = Menu(self.master)
-        self.master.config(menu=menu)
-
-        file = Menu(menu)
-        file.add_command(label="Upload Image", command=self.uploadImage)
-        file.add_command(label="Color Image", command=self.color)
-        menu.add_cascade(label="File", menu=file)
-
-        self.canvas = tk.Canvas(self)
-        self.canvas.pack(fill=tk.BOTH, expand=True)
-        self.image = None
-        self.image2 = None
-
-        label1 = Label(self, image=img)
-        label1.image = img
-        label1.place(x=400, y=370)
-
-    def uploadImage(self):
-        filename = filedialog.askopenfilename(initialdir=os.getcwd())
-        if not filename:
-            return
-        load = Image.open(filename)
-        # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter.
-        load = load.resize((480, 360), Image.LANCZOS)
-
-        if self.image is None:
-            w, h = load.size
-            self.render = ImageTk.PhotoImage(load)
-            self.image = self.canvas.create_image((w / 2, h / 2), image=self.render)
-        else:
-            # Replace the previously uploaded image (the original deleted
-            # self.image3 here, which does not exist until color() runs).
-            if self.image2 is not None:
-                self.canvas.delete(self.image2)
-            w, h = load.size
-            self.render2 = ImageTk.PhotoImage(load)
-            self.image2 = self.canvas.create_image((w / 2, h / 2), image=self.render2)
-
-        frame = cv.imread(filename)
-
-        Caffe_net.getLayer(Caffe_net.getLayerId('class8_ab')).blobs = [numpy_file.astype(np.float32)]
-        Caffe_net.getLayer(Caffe_net.getLayerId('conv8_313_rh')).blobs = [np.full([1, 313], 2.606, np.float32)]
-
-        input_width = 224
-        input_height = 224
-
-        # Predict the ab channels from the resized L channel, then upsample
-        # them and recombine with the original-resolution L channel.
-        rgb_img = (frame[:, :, [2, 1, 0]] * 1.0 / 255).astype(np.float32)
-        lab_img = cv.cvtColor(rgb_img, cv.COLOR_RGB2Lab)
-        l_channel = lab_img[:, :, 0]
-
-        l_channel_resize = cv.resize(l_channel, (input_width, input_height))
-        l_channel_resize -= 50  # mean-centering expected by the model
-
-        Caffe_net.setInput(cv.dnn.blobFromImage(l_channel_resize))
-        ab_channel = Caffe_net.forward()[0, :, :, :].transpose((1, 2, 0))
-
-        (original_height, original_width) = rgb_img.shape[:2]
-        ab_channel_us = cv.resize(ab_channel, (original_width, original_height))
-        lab_output = np.concatenate((l_channel[:, :, np.newaxis], ab_channel_us), axis=2)
-        bgr_output = np.clip(cv.cvtColor(lab_output, cv.COLOR_Lab2BGR), 0, 1)
-
-        cv.imwrite("./result.png", (bgr_output * 255).astype(np.uint8))
-
-    def color(self):
-        load = Image.open("./result.png")
-        load = load.resize((480, 360), Image.LANCZOS)
-
-        if self.image is None:
-            w, h = load.size
-            self.render = ImageTk.PhotoImage(load)
-            self.image = self.canvas.create_image((w / 2, h / 2), image=self.render)
-            root.geometry("%dx%d" % (w, h))
-        else:
-            w, h = load.size
-            self.render3 = ImageTk.PhotoImage(load)
-            self.image3 = self.canvas.create_image((w / 2, h / 2), image=self.render3)
-            self.canvas.move(self.image3, 500, 0)
-
-
-root = tk.Tk()
-root.geometry("%dx%d" % (980, 600))
-root.title("B&W Image Colorization GUI")
-img = ImageTk.PhotoImage(Image.open("logo2.png"))
-
-app = Window(root)
-app.pack(fill=tk.BOTH, expand=1)
-root.mainloop()
-```
+
\ No newline at end of file
From e110c0b37b91fc14be7fcbf07ba842b8a5a07392 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Wed, 18 Mar 2026 23:40:50 +0530
Subject: [PATCH 3/8] Update README.md
---
Distracted Driver Detection/README.md | 1390 ++++---------------------
1 file changed, 206 insertions(+), 1184 deletions(-)
diff --git a/Distracted Driver Detection/README.md b/Distracted Driver Detection/README.md
index c60775d..06892fa 100644
--- a/Distracted Driver Detection/README.md
+++ b/Distracted Driver Detection/README.md
@@ -1,1273 +1,295 @@
-
+
-## Distracted-Driver-Detection
+# 🚗 Distracted Driver Detection — ResNet50 from Scratch
-

+[](https://www.python.org/)
+[](https://keras.io/)
+[](https://www.tensorflow.org/)
+[](https://www.kaggle.com/c/state-farm-distracted-driver-detection)
+[]()
+[](../LICENSE.md)
-
-
-
-
-### Problem Description
-
-
-
-
-
-In this competition you are given driver images, each taken in a car
-with a driver doing something in the car (texting, eating, talking on
-the phone, makeup, reaching behind, etc). Your goal is to predict the
-likelihood of what the driver is doing in each picture.
-
-The 10 classes to predict are as follows,
-c0: safe driving, c1: texting - right, c2: talking on the phone - right,
-c3: texting - left, c4: talking on the phone - left, c5: operating the
-radio, c6: drinking, c7: reaching behind, c8: hair and makeup,
-c9: talking to passenger.
-
-
-
-
-
-
-
-### Summary of Results
-
-
-
-
-
-Using a 50-layer Residual Network with the following parameters:
-10 epochs, batch size 32, Adam optimizer, Glorot uniform initializer,
-the following scores (losses) were obtained.
-
-| Metric | Loss |
-|--------|------|
-| **Training Loss** | 0.93 |
-| **Validation Loss** | 3.79 |
-| **Holdout Loss** | 2.64 |
-
-**Why the high losses? Simply put - we don't have enough resources to
-quickly iterate / hyper-parameter tune the model!** If more resources
-were available (RAM, CPU speed), we could hyper-parameter tune over grid
-searches and combat the high bias / high variance from which this model
-currently suffers. [This is how you'd fix high bias/variance.](#improve)
-
-
-
-
-
-### Import Dependencies and Define Functions
-
-
-
-
-
-Let's begin by importing some useful dependencies and defining some key
-functions that we'll use throughout the notebook.
-
-
-
-
-
-``` python
-import numpy as np
-import pandas as pd
-import tensorflow as tf
-import matplotlib.pyplot as plt
-
-from keras import layers
-from keras.layers import (Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization,
- Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D)
-from keras.wrappers.scikit_learn import KerasClassifier
-from keras.models import Model, load_model, save_model
-from keras.preprocessing import image
-from keras.utils import layer_utils
-from keras.utils.data_utils import get_file
-from keras.applications.imagenet_utils import preprocess_input
-import pydot
-from IPython.display import SVG
-from keras.utils.vis_utils import model_to_dot
-from keras.utils import plot_model
-from resnets_utils import *
-from keras.initializers import glorot_uniform
-import scipy.misc
-from matplotlib.pyplot import imshow
-
-%matplotlib inline
-
-import keras.backend as K
-K.set_image_data_format('channels_last')
-K.set_learning_phase(1)
-
-from sklearn.model_selection import StratifiedKFold, cross_validate, LeaveOneGroupOut
-
-from PIL import Image
-```
-
-
-
-
-
-``` python
-def PlotClassFrequency(class_counts):
-    plt.figure(figsize=(15,4))
-    plt.bar(class_counts.index, class_counts)
-    plt.xlabel('class')
-    plt.xticks(np.arange(0, 10, 1.0))
-    plt.ylabel('count')
-    plt.title('Number of Images per Class')
-    plt.show()
-
-def DescribeImageData(data):
-    print('Average number of images: ' + str(np.mean(data)))
-    print("Lowest image count: {}. At: {}".format(data.min(), data.idxmin()))
-    print("Highest image count: {}. At: {}".format(data.max(), data.idxmax()))
-    print(data.describe())
-
-def CreateImgArray(height, width, channel, data, folder, save_labels=True):
-    """
-    Writes image files found in 'imgs/train' to array of shape
-    [examples, height, width, channel]
-
-    Arguments:
-    height -- integer, height in pixels
-    width -- integer, width in pixels
-    channel -- integer, number of channels (or dimensions) for image (3 for RGB)
-    data -- dataframe, containing associated image properties, such as:
-            subject -> string, alpha-numeric code of participant in image
-            classname -> string, the class name i.e. 'c0', 'c1', etc.
-            img -> string, image name
-    folder -- string, either 'test' or 'train' folder containing the images
-    save_labels -- bool, True if labels should be saved, or False (just save 'X' images array).
-                   Note: only applies if using train folder
-
-    Returns:
-    .npy file -- file, contains the associated conversion of images to numerical values for processing
-    """
-
-    num_examples = len(data)
-    X = np.zeros((num_examples, height, width, channel))
-    if (folder == 'train') & (save_labels == True):
-        Y = np.zeros(num_examples)
-
-    for m in range(num_examples):
-        current_img = data.img[m]
-        img_path = 'imgs/' + folder + '/' + current_img
-        img = image.load_img(img_path, target_size=(height, width))
-        x = image.img_to_array(img)
-        x = preprocess_input(x)
-        X[m] = x
-        if (folder == 'train') & (save_labels == True):
-            Y[m] = data.loc[data['img'] == current_img, 'classname'].iloc[0]
-
-    np.save('X_' + folder + '_' + str(height) + '_' + str(width), X)
-    if (folder == 'train') & (save_labels == True):
-        np.save('Y_' + folder + '_' + str(height) + '_' + str(width), Y)
-
-def Rescale(X):
-    return (1 / (2 * np.max(X))) * X + 0.5
-
-def PrintImage(X_scaled, index, Y=None):
-    plt.imshow(X_scaled[index])
-    if Y is not None:
-        if Y.shape[1] == 1:
-            print("y = " + str(np.squeeze(Y[index])))
-        else:
-            print("y = " + str(np.argmax(Y[index])))
-
-def LOGO(X, Y, group, model_name, input_shape, classes, init, optimizer, metrics, epochs, batch_size):
-    logo = LeaveOneGroupOut()
-    logo.get_n_splits(X, Y, group)
-    cvscores = np.zeros((26, 4))
-    subject_id = []
-    i = 0
-    for train, test in logo.split(X, Y, group):
-        # Create model
-        model = model_name(input_shape=input_shape, classes=classes, init=init)
-        # Compile the model
-        model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=[metrics])
-        # Fit the model
-        model.fit(X[train], Y[train], epochs=epochs, batch_size=batch_size, verbose=0)
-        # Evaluate the model
-        scores_train = model.evaluate(X[train], Y[train], verbose=0)
-        scores_test = model.evaluate(X[test], Y[test], verbose=0)
-        # Save to cvscores
-        cvscores[i] = [scores_train[0], scores_train[1] * 100, scores_test[0], scores_test[1] * 100]
-        subject_id.append(group.iloc[test[0]])
-        # Clear session
-        K.clear_session()
-        # Update counter
-        i += 1
-
-    return pd.DataFrame(cvscores, index=subject_id, columns=['Train_loss', 'Train_acc', 'Test_loss', 'Test_acc'])
-```
-
-
-
-
-
-### Quick EDA
-
-
-
-
-
-Let's begin by loading the provided dataset 'driver\_imgs\_list' and doing a
-quick analysis.
-
-
-
-
-
-``` python
-driver_imgs_df = pd.read_csv('driver_imgs_list/driver_imgs_list.csv')
-driver_imgs_df.head()
-```
-
-
-
-```
- subject classname img
-0 p002 c0 img_44733.jpg
-1 p002 c0 img_72999.jpg
-2 p002 c0 img_25094.jpg
-3 p002 c0 img_69092.jpg
-4 p002 c0 img_92629.jpg
-```
-
-
-
-
-
-
-
-We can note the number of examples by printing the shape of the
-dataframe. Looks like the training set has 22,424 images.
-
-
-
-
-
-``` python
-driver_imgs_df.shape
-```
-
-
-
- (22424, 3)
-
-
-
-
-
-
-
-We can plot the number of images per class to see if any classes have a
-low number of images.
-
-
-
-
-
-``` python
-class_counts = (driver_imgs_df.classname).value_counts()
-PlotClassFrequency(class_counts)
-DescribeImageData(class_counts)
-```
-
-
-
-
-
-
-
-
-
- Average number of images: 2242.4
- Lowest image count: 1911. At: c8
- Highest image count: 2489. At: c0
- count 10.000000
- mean 2242.400000
- std 175.387951
- min 1911.000000
- 25% 2163.500000
- 50% 2314.500000
- 75% 2325.750000
- max 2489.000000
- Name: classname, dtype: float64
-
-
-
-
-
-
-
-Additionally, we can plot the number of images per test subject. It
-would be much more helpful to plot the number of images belonging to
-each class *per subject*. We could then ensure that the distribution is
-somewhat uniform. We did not show this here, and instead just plotted
-number of images per subject.
-
-
+> Classifies **10 distracted driving behaviors** from dashboard camera images using a **custom ResNet50 implementation built from scratch in Keras** — including manual `convolutional_block` and `identity_block` definitions, `glorot_uniform` initialization, and a LOGO cross-validation strategy.
-
-
-``` python
-subject_counts = (driver_imgs_df.subject).value_counts()
-plt.figure(figsize=(15,4))
-plt.bar(subject_counts.index,subject_counts)
-plt.xlabel('subject')
-plt.ylabel('count')
-plt.title('Number of Images per Subject')
-plt.show()
-DescribeImageData(subject_counts)
-```
-
-
-
-
-
-
-
-
-
- Average number of images: 862.461538462
- Lowest image count: 346. At: p072
- Highest image count: 1237. At: p021
- count 26.000000
- mean 862.461538
- std 214.298713
- min 346.000000
- 25% 752.500000
- 50% 823.000000
- 75% 988.250000
- max 1237.000000
- Name: subject, dtype: float64
-
-
-
-
-
-
-
-Furthermore, we can check if there are any null image examples.
-
-
-
-
-
-``` python
-pd.isnull(driver_imgs_df).sum()
-```
-
-
-
- subject 0
- classname 0
- img 0
- dtype: int64
-
-
-
-
-
-
-
-### Preprocess Data
-
-
-
-
-
-The data was provided with the classes in order (from class 0 to class
-9). Let's shuffle the data by permutating the 'classname' and 'img'
-attributes.
-
-
-
-
-
-``` python
-np.random.seed(0)
-myarray = np.random.permutation(driver_imgs_df)
-driver_imgs_df = pd.DataFrame(data = myarray, columns=['subject', 'classname', 'img'])
-```
-
-
-
-
-
-We'll go ahead and apply a dictionary to the 'classname' attribute and
-assign the strings to their respective integers.
-
-
-
-
-
-``` python
-d = {'c0': 0, 'c1': 1, 'c2': 2, 'c3': 3, 'c4': 4, 'c5': 5, 'c6': 6, 'c7': 7, 'c8': 8, 'c9': 9}
-driver_imgs_df.classname = driver_imgs_df.classname.map(d)
-```
+[🔙 Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
-
+---
-### Convert Dataframe to Array for Training
+## ⚠️ Safety Context
-
+> Distracted driving causes thousands of road fatalities annually. Automated in-vehicle behavior classification from dashboard cameras is an active area of road safety AI research.
-
+---
-Let's convert the images into numerical arrays of dimension '64, 64, 3'.
-Both the height and width of the images will be 64 pixels, and each
-image will have 3 channels (for red, green and blue). The following
-function saves the array as a .npy file.
+## 📑 Table of Contents
-
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [Dataset](#-dataset)
+- [Class Definitions](#-class-definitions)
+- [Model Architecture](#-model-architecture)
+- [Training Analysis & Challenges](#-training-analysis--challenges)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [References](#-references)
-
+---
-``` python
-CreateImgArray(64, 64, 3, driver_imgs_df, 'train')
-```
+## 🔬 About the Project
-
+This project tackles the **State Farm Distracted Driver Detection** Kaggle challenge — classifying driver images into 10 behavior classes. What makes it distinctive is that **ResNet50 is implemented completely from scratch** using the Keras functional API, manually defining every bottleneck block and skip connection rather than using `tf.keras.applications`.
-
+The notebook also demonstrates handling real-world ML challenges: **high bias**, **high variance**, and the **LOGO (Leave-One-Group-Out) cross-validation** strategy needed because multiple images belong to the same driver — random splits would leak the same driver into both train and validation sets.
-Let's now load the new image arrays into the environment. Note that this
-step saves time, since CreateImgArray does not have to be executed
-every time.
+**What this project covers:**
+- Manual `identity_block` and `convolutional_block` implementations in Keras
+- `resnets_utils` helper module for block definitions
+- Diagnosing and addressing underfitting (high bias) and overfitting (high variance)
+- LOGO cross-validation to prevent driver-level data leakage
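To make the leakage argument concrete, here is a minimal sketch of a driver-level split with scikit-learn's `LeaveOneGroupOut` (the toy arrays and driver IDs below are stand-ins for the real image tensors and the `subject` column):

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Toy stand-ins for the real data: 6 "images" from 3 drivers.
X = np.arange(12).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
groups = np.array(['p002', 'p002', 'p012', 'p012', 'p014', 'p014'])

logo = LeaveOneGroupOut()
for train_idx, test_idx in logo.split(X, y, groups):
    # A driver never appears on both sides of the split, so the model
    # cannot score well just by memorizing driver-specific cues.
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])

print(logo.get_n_splits(X, y, groups))  # one fold per driver -> 3
```

With the real data this yields 26 folds, one per driver, exactly as the `LOGO` helper in the notebook does.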
-
+---
-
+## ⚙️ How It Works
-``` python
-X = np.load('X_train_64_64.npy')
-X.shape
```
-
-
-
- (22424, 64, 64, 3)
-
-
-
-
-
-
-
-``` python
-Y = np.load('Y_train_64_64.npy')
-Y.shape
-```
-
-
-
- (22424,)
-
-
-
-
-
-
-
-Let's check our new arrays and ensure we compiled everything correctly.
-We can see that we do not have any entries in X that contain zero, and Y
-contains all the target labels.
-
-
-
-
-
-``` python
-(X == 0).sum()
-```
-
-
-
-```
-0
-```
-
-
-
-
-
-
-
-``` python
-PlotClassFrequency(pd.DataFrame(Y)[0].value_counts())
-```
-
-
-
-
-
-
-
-
-
-
-
-Furthermore, we can print the images from X and the associated class as
-a sanity check. Re-scaling the images (between 0 and 1):
-
-
-
-
-
-``` python
-X_scaled = Rescale(X)
-```
-
-
-
-
-
-``` python
-PrintImage(X_scaled, 2, Y = Y.reshape(-1,1))
+Dashboard Camera Image
+            │
+            ▼
+   Load + Preprocess
+   (Normalize pixel values / 255)
+            │
+            ▼
+   ResNet50 Forward Pass
+   (Custom Keras implementation)
+  ┌────────────────────────────────┐
+  │ ZeroPadding2D (3,3)            │
+  │ Conv2D(64,7×7,s=2) → BN → ReLU │
+  │ MaxPool(3×3, s=2)              │
+  │ Stage 2: ConvBlock + IdBlock×2 │
+  │ Stage 3: ConvBlock + IdBlock×3 │
+  │ Stage 4: ConvBlock + IdBlock×5 │
+  │ Stage 5: ConvBlock + IdBlock×2 │
+  │ AveragePooling2D(2×2)          │
+  │ Flatten → Dense(10, softmax)   │
+  └────────────────────────────────┘
+            │
+            ▼
+   10-Class Softmax Output → c0–c9
```
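As a sanity check on the stage sizes, the spatial dimensions can be traced with standard 'valid'-padding convolution arithmetic (a sketch; the `conv_out` helper is ours, not notebook code):

```python
def conv_out(n, k, s, pad=0):
    # Output size of a conv/pool layer with 'valid' padding:
    # floor((n + 2*pad - k) / s) + 1
    return (n + 2 * pad - k) // s + 1

n = 64                             # input height/width
n = conv_out(n, k=7, s=2, pad=3)   # stem conv after ZeroPadding2D(3,3) -> 32
n = conv_out(n, k=3, s=2)          # max pool -> 15
n = conv_out(n, k=1, s=2)          # stage 3 downsample -> 8 (stage 2 uses s=1)
n = conv_out(n, k=1, s=2)          # stage 4 downsample -> 4
n = conv_out(n, k=1, s=2)          # stage 5 downsample -> 2
n = conv_out(n, k=2, s=2)          # AveragePooling2D(2,2) -> 1
print(n)  # 1: a single 2048-dim vector feeds Flatten -> Dense(10)
```

This is why the top of the network uses a 2×2 average pool: on 64×64 inputs the final feature map is exactly 2×2×2048.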
-
-
- y = 7.0
-
-
-
-
-
-
-
-
-
-
-
-
-
-Class of "7" corresponds to a driver "reaching behind", which appears to
-be the case shown above.
-
-
-
-
-
-### Build the Model
-
-
-
-
-
-We'll use the popular Residual Net with 50 layers. Residual networks are
-essential to preventing vanishing gradients when using a rather 'deep'
-network (many layers). The identity\_block and convolutional\_block are
-defined below.
-
-
-
-
+---
-``` python
-def identity_block(X, f, filters, stage, block, init):
-    """
-    Implementation of the identity block as defined in Figure 3
-
-    Arguments:
-    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
-    f -- integer, specifying the shape of the middle CONV's window for the main path
-    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
-    stage -- integer, used to name the layers, depending on their position in the network
-    block -- string/character, used to name the layers, depending on their position in the network
-    init -- keras initializer used for the CONV kernels
-
-    Returns:
-    X -- output of the identity block, tensor of shape (n_H, n_W, n_C)
-    """
-
-    # defining name basis
-    conv_name_base = 'res' + str(stage) + block + '_branch'
-    bn_name_base = 'bn' + str(stage) + block + '_branch'
-
-    # Retrieve Filters
-    F1, F2, F3 = filters
-
-    # Save the input value. You'll need this later to add back to the main path.
-    X_shortcut = X
-
-    # First component of main path
-    X = Conv2D(filters=F1, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2a', kernel_initializer=init)(X)
-    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
-    X = Activation('relu')(X)
-
-    # Second component of main path
-    X = Conv2D(filters=F2, kernel_size=(f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b', kernel_initializer=init)(X)
-    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
-    X = Activation('relu')(X)
-
-    # Third component of main path
-    X = Conv2D(filters=F3, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c', kernel_initializer=init)(X)
-    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)
-
-    # Final step: Add shortcut value to main path, and pass it through a RELU activation
-    X = Add()([X, X_shortcut])
-    X = Activation('relu')(X)
-
-    return X
-```
+## 📊 Dataset
-
-
-
+| Property | Details |
+|----------|---------|
+| **Name** | State Farm Distracted Driver Detection |
+| **Source** | [Kaggle Competition](https://www.kaggle.com/c/state-farm-distracted-driver-detection) |
+| **Training Images** | 22,424 |
+| **Classes** | 10 driving behaviors |
+| **Input Shape** | Resized to `64 × 64 × 3` for training |
+| **Metadata** | `driver_imgs_list.csv` โ subject ID, classname, filename |
+| **Key Challenge** | Multiple images per driver → LOGO cross-validation required |
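The per-driver grouping is visible directly in `driver_imgs_list.csv`. A quick sketch with a toy frame mirroring the real columns (`subject`, `classname`, `img`):

```python
import pandas as pd

# Toy rows mirroring driver_imgs_list.csv (the real file has 22,424 rows).
df = pd.DataFrame({
    'subject':   ['p002', 'p002', 'p002', 'p012'],
    'classname': ['c0', 'c0', 'c1', 'c0'],
    'img':       ['img_44733.jpg', 'img_72999.jpg', 'img_25094.jpg', 'img_69092.jpg'],
})

images_per_driver = df['subject'].value_counts()
print(images_per_driver['p002'])  # 3 -- many images share one driver,
                                  # which is why random splits leak
```

In the real dataset each of the 26 drivers contributes hundreds of near-duplicate frames (346 to 1,237 images per subject).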
-``` python
-def convolutional_block(X, f, filters, stage, block, init, s=2):
-    """
-    Implementation of the convolutional block as defined in Figure 4
-
-    Arguments:
-    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
-    f -- integer, specifying the shape of the middle CONV's window for the main path
-    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
-    stage -- integer, used to name the layers, depending on their position in the network
-    block -- string/character, used to name the layers, depending on their position in the network
-    init -- keras initializer used for the CONV kernels
-    s -- integer, specifying the stride to be used
-
-    Returns:
-    X -- output of the convolutional block, tensor of shape (n_H, n_W, n_C)
-    """
-
-    # defining name basis
-    conv_name_base = 'res' + str(stage) + block + '_branch'
-    bn_name_base = 'bn' + str(stage) + block + '_branch'
-
-    # Retrieve Filters
-    F1, F2, F3 = filters
-
-    # Save the input value
-    X_shortcut = X
-
-    ##### MAIN PATH #####
-    # First component of main path
-    X = Conv2D(F1, (1, 1), strides=(s, s), name=conv_name_base + '2a', kernel_initializer=init)(X)
-    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
-    X = Activation('relu')(X)
-
-    # Second component of main path
-    X = Conv2D(F2, (f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b', kernel_initializer=init)(X)
-    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
-    X = Activation('relu')(X)
-
-    # Third component of main path
-    X = Conv2D(F3, (1, 1), strides=(1, 1), name=conv_name_base + '2c', kernel_initializer=init)(X)
-    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)
-
-    ##### SHORTCUT PATH #####
-    X_shortcut = Conv2D(F3, (1, 1), strides=(s, s), name=conv_name_base + '1', kernel_initializer=init)(X_shortcut)
-    X_shortcut = BatchNormalization(axis=3, name=bn_name_base + '1')(X_shortcut)
-
-    # Final step: Add shortcut value to main path, and pass it through a RELU activation
-    X = Add()([X, X_shortcut])
-    X = Activation('relu')(X)
-
-    return X
-```
+---
-
+## 🚦 Class Definitions
-
+| Code | Behavior |
+|:----:|----------|
+| **c0** | ✅ Safe Driving |
+| **c1** | 📱 Texting — Right Hand |
+| **c2** | 📞 Phone Call — Right Hand |
+| **c3** | 📱 Texting — Left Hand |
+| **c4** | 📞 Phone Call — Left Hand |
+| **c5** | 🎵 Operating Radio |
+| **c6** | 🥤 Drinking |
+| **c7** | 🔄 Reaching Behind |
+| **c8** | 💇 Hair / Makeup |
+| **c9** | 💬 Talking to Passenger |
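In the notebook these codes are mapped to integer labels with a plain dict before training; a compact equivalent of that mapping step:

```python
# 'c0'..'c9' -> 0..9, as used to encode the 'classname' column.
class_to_int = {f'c{i}': i for i in range(10)}

labels = ['c0', 'c7', 'c9']
encoded = [class_to_int[c] for c in labels]
print(encoded)  # [0, 7, 9]
```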
-With the two blocks defined, we'll now create the model ResNet50, as
-shown below.
+---
-
+## 🏗️ Model Architecture
-
+The notebook defines **ResNet50 from scratch** — no pretrained weights, no `tf.keras.applications`:
+
+```python
+from keras.layers import (Input, Add, Dense, Activation, ZeroPadding2D,
+ BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D)
+from keras.models import Model
+from keras.initializers import glorot_uniform
+from resnets_utils import *
-``` python
-def ResNet50(input_shape = (64, 64, 3), classes = 10, init = glorot_uniform(seed=0)):
+def ResNet50(input_shape=(64, 64, 3), classes=10, init=glorot_uniform(seed=0)):
    """
-    Implementation of the popular ResNet50 with the following architecture:
-    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
-    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER
-
-    Arguments:
-    input_shape -- shape of the images of the dataset
-    classes -- integer, number of classes
-
-    Returns:
-    model -- a Model() instance in Keras
+    CONV2D -> BATCHNORM -> RELU -> MAXPOOL
+    -> CONVBLOCK -> IDBLOCK*2
+    -> CONVBLOCK -> IDBLOCK*3
+    -> CONVBLOCK -> IDBLOCK*5
+    -> CONVBLOCK -> IDBLOCK*2
+    -> AVGPOOL -> TOPLAYER
    """
-
-    # Define the input as a tensor with shape input_shape
-    X_input = Input(input_shape)
-
-    # Zero-Padding
-    X = ZeroPadding2D((3, 3))(X_input)
-
-    # Stage 1
-    X = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', kernel_initializer=init)(X)
-    X = BatchNormalization(axis=3, name='bn_conv1')(X)
-    X = Activation('relu')(X)
-    X = MaxPooling2D((3, 3), strides=(2, 2))(X)
-
-    # Stage 2
-    X = convolutional_block(X, f=3, filters=[64, 64, 256], stage=2, block='a', s=1, init=init)
-    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b', init=init)
-    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c', init=init)
-
-    # Stage 3
-    X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, block='a', s=2, init=init)
-    X = identity_block(X, 3, [128, 128, 512], stage=3, block='b', init=init)
-    X = identity_block(X, 3, [128, 128, 512], stage=3, block='c', init=init)
-    X = identity_block(X, 3, [128, 128, 512], stage=3, block='d', init=init)
-
-    # Stage 4
-    X = convolutional_block(X, f=3, filters=[256, 256, 1024], stage=4, block='a', s=2, init=init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='b', init=init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='c', init=init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='d', init=init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='e', init=init)
-    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='f', init=init)
-
-    # Stage 5
-    X = convolutional_block(X, f=3, filters=[512, 512, 2048], stage=5, block='a', s=2, init=init)
-    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='b', init=init)
-    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='c', init=init)
-
-    # AVGPOOL
-    X = AveragePooling2D(pool_size=(2, 2), name='avg_pool')(X)
-
-    # Output layer
-    X = Flatten()(X)
-    X = Dense(classes, activation='softmax', name='fc' + str(classes), kernel_initializer=init)(X)
-
-    # Create model
-    model = Model(inputs=X_input, outputs=X, name='ResNet50')
-
-    return model
-```
-
-
-
-
-
-### Cross Validation Training (Leave-One-Group-Out)
-
-
-
-
-
-Let's do some basic transformation on the training / label arrays, and
-print the shapes. After, we'll define some key functions for use in our
-first CNN model.
-
-
-
-
-
-``` python
-# Normalize image vectors
-X_train = X/255
-
-# Convert training and test labels to one hot matrices
-#Y = convert_to_one_hot(Y.astype(int), 10).T
-Y_train = np.expand_dims(Y.astype(int), -1)
-
-print ("number of training examples = " + str(X_train.shape[0]))
-print ("X_train shape: " + str(X_train.shape))
-print ("Y_train shape: " + str(Y_train.shape))
-```
-
-
-
- number of training examples = 22424
- X_train shape: (22424, 64, 64, 3)
- Y_train shape: (22424, 1)
-
-
-
-
-
-
-
-Next, let's call our LOGO function, which incorporates the Leave-One-Group-Out
-cross-validator. It splits the data using the drivers ('subject') as the
-group, which helps prevent leakage: with random splits, the model would
-learn driver-specific cues and its dev scores would be optimistically biased.
-
-Below we pass the arguments to the self-defined LOGO function and
-execute. The return is a dataframe consisting of the accuracy/loss
-scores of the training/dev sets (for each group/driver).
-
-
-
-
-
-``` python
-scores = LOGO(X_train, Y_train, group = driver_imgs_df['subject'],
- model_name = ResNet50, input_shape = (64, 64, 3), classes = 10,
- init = glorot_uniform(seed=0), optimizer = 'adam', metrics = 'accuracy',
- epochs = 2, batch_size = 32)
-```
-
-
-
-
-
-Plotting the dev set accuracy, we can see that 'p081' had the lowest
-accuracy at 8.07%, and 'p002' had the highest accuracy at 71.52%.
-
-
-
-
-
-``` python
-plt.figure(figsize=(15,4))
-plt.bar(scores.index, scores.loc[:,'Test_acc'].sort_values(ascending=False))
-plt.yticks(np.arange(0, 110, 10.0))
-plt.show()
-```
-
-
-
-
-
-
-
-
-
-
-
-Calling 'describe' method, we can note some useful statistics.
-
-
-
-
-
-``` python
-scores.describe()
-```
-
-
-
-```
- Train_loss Train_acc Test_loss Test_acc
-count 26.000000 26.000000 26.000000 26.000000
-mean 4.118791 27.908272 5.293537 21.190364
-std 3.597604 19.144588 4.731039 16.150668
-min 0.722578 8.477557 0.820852 8.070501
-25% 1.849149 11.193114 2.133728 10.137083
-50% 2.545475 25.507787 2.562653 14.259937
-75% 5.299684 39.668163 8.664656 26.789961
-max 14.751674 74.439192 14.553808 71.521739
-```
-
-
-
-
-
-
-
-And finally, let's print out the train/dev scores.
-
-
-
-
-
-``` python
-print("Train acc: {:.2f}. Dev. acc: {:.2f}".format(scores['Train_acc'].mean(), scores['Test_acc'].mean()))
-print("Train loss: {:.2f}. Dev. loss: {:.2f}".format(scores['Train_loss'].mean(), scores['Test_loss'].mean()))
-```
-
-
-
- Train acc: 27.91. Dev. acc: 21.19
- Train loss: 4.12. Dev. loss: 5.29
-
-
-
-
-
-
-
-We can note that the train accuracy is higher than the dev accuracy,
-which is expected. The accuracy is quite low in comparison to our
-assumed Bayes accuracy of 100% (using human accuracy as a proxy to
-Bayes), and we have some variance (difference between train and dev) of
-about 6.72%. Let's try increasing the number of epochs to 5 and observe
-if the train/dev accuracies increase (loss decreases).
-
-
-
-
-
-``` python
-scores = LOGO(X_train, Y_train, group = driver_imgs_df['subject'],
- model_name = ResNet50, input_shape = (64, 64, 3), classes = 10,
- init = glorot_uniform(seed=0), optimizer = 'adam', metrics = 'accuracy',
- epochs = 5, batch_size = 32)
```
-
+**Block types:**
-
+| Block | Shape Change | Used When |
+|-------|-------------|-----------|
+| **Identity Block** | Input = Output shape | Deepening without dimension change |
+| **Convolutional Block** | Input ≠ Output shape | When stride changes or filter count increases |
-``` python
-print("Train acc: {:.2f}. Dev. acc: {:.2f}".format(scores['Train_acc'].mean(), scores['Test_acc'].mean()))
-print("Train loss: {:.2f}. Dev. loss: {:.2f}".format(scores['Train_loss'].mean(), scores['Test_loss'].mean()))
-```
+**Stage filter configurations:**
-
+| Stage | Filters | Blocks |
+|-------|---------|--------|
+| Stage 2 | [64, 64, 256] | ConvBlock + IdBlock × 2 |
+| Stage 3 | [128, 128, 512] | ConvBlock + IdBlock × 3 |
+| Stage 4 | [256, 256, 1024] | ConvBlock + IdBlock × 5 |
+| Stage 5 | [512, 512, 2048] | ConvBlock + IdBlock × 2 |
- Train acc: 37.83. Dev. acc: 25.79
- Train loss: 2.61. Dev. loss: 3.30
+**Training config:**
-
+| Parameter | Value |
+|-----------|-------|
+| Initializer | `glorot_uniform(seed=0)` |
+| Optimizer | Adam |
+| Loss | Sparse Categorical Cross-Entropy |
+| Input Shape | `(64, 64, 3)` |
+| Output | Dense(10, softmax) |
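The sparse loss matches the integer labels in `Y_train` (shape `(m, 1)`) rather than one-hot vectors. A quick NumPy sanity check with our own helper (not notebook code): a model that outputs a uniform softmax over the 10 classes sits at ln(10) ≈ 2.30, a useful "knows nothing" baseline for the losses above.

```python
import numpy as np

def sparse_cce(y_true, probs):
    # Mean negative log-probability of the true class -- what
    # 'sparse_categorical_crossentropy' computes for integer labels.
    m = y_true.shape[0]
    return -np.mean(np.log(probs[np.arange(m), y_true]))

probs = np.full((4, 10), 0.1)   # uniform softmax over the 10 classes
y = np.array([0, 3, 7, 9])
loss = sparse_cce(y, probs)
print(round(loss, 3))  # 2.303 == ln(10)
```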
-
+---
-
-
-
The train and dev accuracy increased
-to 37.83% and 25.79%, respectively. We can note that we still have an
-underfitting problem (high bias, about 62.17% from 100%), *however, our
-variance has increased dramatically between 2 epochs and 5 by about 80%
-(12.04% variance)\!* Not only do **we have high bias, but our model also
-exhibits high variance**. In order to tackle this, we'll need to address
-the high bias first (get as close to Bayes error as possible) and then
-deal with the resulting high variance. Note that ALL of the steps below
-should be performed with LOGO cross-validation. This way, we can be sure
-our estimates of the dev set are in line with the holdout set.
-
-In order to tackle **high bias**, we can do any of the following:
-
-  - run more epochs
-  - increase the batch size (up to the number of examples)
-  - make a deeper network
-  - increase the image size from 64x64 to 128x128, 256x256, etc.
-  - GridSearch over params (batch size, epochs, optimizer and its parameters, initializer)
+## ๐ Training Analysis & Challenges
-
+The notebook provides honest, detailed bias-variance analysis across training runs, a key learning documented in the project:
-
+### Epoch 2 Results
+| Set | Accuracy |
+|-----|:--------:|
+| Train | ~26% |
+| Dev | ~13% |
-Let's up the epoch count to 10. The assumption is that the train
-accuracy will be higher than the previous 5 epoch model, but our
-variance will increase.
+> High bias (underfitting): the model hasn't converged. High variance: large gap between train and dev.
-
+### Epoch 5 Results
+| Set | Accuracy |
+|-----|:--------:|
+| Train | **37.83%** |
+| Dev | **25.79%** |
-
+> Train accuracy improved but **underfitting persists** (~62% away from 100%). Variance increased dramatically (+80% gap between epochs 2 and 5). The notebook diagnoses this explicitly:
-``` python
-scores = LOGO(X_train, Y_train, group = driver_imgs_df['subject'],
- model_name = ResNet50, input_shape = (64, 64, 3), classes = 10,
- init = glorot_uniform(seed=0), optimizer = 'adam', metrics = 'accuracy',
- epochs = 10, batch_size = 32)
```
-
-
-
-
-
-``` python
-print("Train acc: {:.2f}. Dev. acc: {:.2f}".format(scores['Train_acc'].mean(), scores['Test_acc'].mean()))
-print("Train loss: {:.2f}. Dev. loss: {:.2f}".format(scores['Train_loss'].mean(), scores['Test_loss'].mean()))
+"We still have an underfitting problem (high bias, about 62.17% from 100%),
+however, our variance has increased dramatically between 2 and 5 epochs by about 80%."
```
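The bias and variance figures quoted throughout are simple arithmetic against the assumed 100% Bayes accuracy, and can be reproduced directly:

```python
# bias = gap to assumed Bayes accuracy (100%); variance = train/dev gap
runs = {'epoch_5': (37.83, 25.79), 'epoch_10': (86.95, 40.68)}

for name, (train_acc, dev_acc) in runs.items():
    bias = 100.0 - train_acc        # underfitting measure
    variance = train_acc - dev_acc  # overfitting measure
    print(f"{name}: bias = {bias:.2f}%, variance = {variance:.2f}%")
```

This reproduces the 62.17% bias and 12.04% variance quoted for the 5-epoch run, and the 46.27% variance for the 10-epoch run.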
-
-
- Train acc: 86.95. Dev. acc: 40.68
- Train loss: 0.93. Dev. loss: 3.79
-
-
-
-
-
-
+### Prescribed fixes documented in the notebook:
-As expected, the training accuracy increased to 86.95%, but the variance
-increase from 5 epochs to 10 was about 284% (46.27% variance)\! Thus, we
-can conclude that this model suffers from severe high variance. We can
-continue on and use the steps above to fix the remaining bias, then we
-can use the steps below to reduce the variance.
+**To address High Bias (underfitting):**
+- Increase epoch count
+- Use a bigger/deeper network
+- Try different optimizers or learning rate schedules
-
-
-
-
-In order to tackle **high variance**, we can do any of the following:
-
-  - Augment images to increase sample size
-  - Regularization
-  - GridSearch over params (batch size, epochs, optimizer and its parameters, initializer)
-  - Decrease dev set size (allows more examples to be trained, making the model less prone to overfitting)
-  - Investigate classes with low accuracy, and fix them
-
-
-
-
-
-
-
-| Model | Epoch | Train Accuracy | Dev Accuracy | Bias | Variance |
-|-------|-------|----------------|--------------|------|----------|
-| **Model A** | 2 | 27.91 | 21.19 | 72.09 | 6.72 |
-| **Model B** | 5 | 37.83 | 25.79 | 62.17 | 12.04 |
-| **Model C** | 10 | 86.95 | 40.68 | 13.06 | 46.27 |
+**To address High Variance (overfitting):**
+- Apply L2 regularization
+- Add dropout layers
+- Use data augmentation
+- Increase training data volume
-
-
-
-
-### Predictions on the Holdout Set
+### LOGO Cross-Validation Note
-
-
-
+> Standard random train/val splits cause **data leakage**: the same driver's images appear in both sets, inflating dev accuracy. The notebook flags this and recommends **Leave-One-Group-Out (LOGO)** cross-validation, splitting by `subject` (driver ID) from `driver_imgs_list.csv`.
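A leakage-free split of this kind can be sketched with scikit-learn's `LeaveOneGroupOut`; the subject IDs below are made up, and only the column names come from `driver_imgs_list.csv`:

```python
import pandas as pd
from sklearn.model_selection import LeaveOneGroupOut

# Toy stand-in for driver_imgs_list.csv (subject, classname, img)
df = pd.DataFrame({
    'subject':   ['p002', 'p002', 'p012', 'p012', 'p014', 'p014'],
    'classname': ['c0', 'c1', 'c0', 'c2', 'c1', 'c0'],
    'img':       [f'img_{i}.jpg' for i in range(6)],
})

logo = LeaveOneGroupOut()
for train_idx, dev_idx in logo.split(df['img'], df['classname'],
                                     groups=df['subject']):
    train_drivers = set(df['subject'].iloc[train_idx])
    dev_drivers = set(df['subject'].iloc[dev_idx])
    # No driver ever appears in both sets, so eval measures generalization
    assert train_drivers.isdisjoint(dev_drivers)
```

Each fold holds out every image from one driver, which is exactly what prevents the notebook's leakage problem.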
-We'll go ahead and fit the 10 epoch model.
+---
-
+## ๐ Project Structure
-
-
-``` python
-model = ResNet50(input_shape = (64, 64, 3), classes = 10)
-model.compile(optimizer = 'adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
-model.fit(X_train, Y_train, epochs = 10, batch_size = 32)
-```
-
-
-
- Epoch 1/10
- 22424/22424 [==============================] - 83s 4ms/step - loss: 2.4026 - acc: 0.3128
- Epoch 2/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 1.8118 - acc: 0.4996
- Epoch 3/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 1.5023 - acc: 0.6153
- Epoch 4/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 0.8445 - acc: 0.8483
- Epoch 5/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 1.2427 - acc: 0.7447
- Epoch 6/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 0.8930 - acc: 0.8216
- Epoch 7/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 0.9400 - acc: 0.8144
- Epoch 8/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 0.7440 - acc: 0.8748
- Epoch 9/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 1.4076 - acc: 0.6559
- Epoch 10/10
- 22424/22424 [==============================] - 76s 3ms/step - loss: 0.6796 - acc: 0.8135
-
-
-
-
-
-
-
-
-
-
-
-
-
-``` python
-save_model(model, 'e10.h5');
```
-
-
-
-
-
-``` python
-model = load_model('e10.h5')
+Distracted Driver Detection/
+│
+├── dataset/
+│   ├── train/                 # Training images, organized by class
+│   │   └── c0/ c1/ c2/ ... c9/
+│   └── test/                  # Unlabeled test images
+│
+├── driver_imgs_list.csv       # subject, classname, img columns
+├── resnets_utils.py           # identity_block + convolutional_block helpers
+├── distracted_driver_detection.ipynb   # Main notebook
+├── requirements.txt           # Python dependencies
+└── README.md                  # You are here
```
-
-
-
-
-Let's load the holdout data set from our 'test\_file\_names' csv file
-and then create the necessary array.
+---
-
+## ๐ Getting Started
-
+### 1. Clone the repository
-``` python
-holdout_imgs_df = pd.read_csv('test_file_names.csv')
-holdout_imgs_df.rename(columns={"imagename": "img"}, inplace = True)
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Distracted Driver Detection"
```
-
-
-
+### 2. Download the dataset from Kaggle
-``` python
-CreateImgArray(64, 64, 3, holdout_imgs_df, 'test')
+```bash
+pip install kaggle
+kaggle competitions download -c state-farm-distracted-driver-detection
+unzip state-farm-distracted-driver-detection.zip -d dataset/
```
-
+Or download manually from: [kaggle.com/c/state-farm-distracted-driver-detection/data](https://www.kaggle.com/c/state-farm-distracted-driver-detection/data)
-
+### 3. Set up environment
-Again, we'll load the data here instead of having to run CreateImgArray
-repeatedly.
+```bash
+python -m venv venv
+source venv/bin/activate # Linux / macOS
+venv\Scripts\activate # Windows
-
-
-
-
-``` python
-X_holdout = np.load('X_test_64_64.npy')
-X_holdout.shape
+pip install -r requirements.txt
```
-
-
- (79726, 64, 64, 3)
-
-
-
-
-
-
-
-And now calling predictions on the holdout set, as shown below. MAKE
-SURE to clear the memory before this step\!
-
-
+### 4. Run the notebook
-
-
-``` python
-probabilities = model.predict(X_holdout, batch_size = 32)
+```bash
+jupyter notebook distracted_driver_detection.ipynb
```
-
+---
-
+## ๐ ๏ธ Tech Stack
-If desired (as a sanity check) we can visually check our predictions by
-scaling the X\_holdout array and then printing the image.
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Deep Learning | TensorFlow / Keras |
+| Model | ResNet50 (from scratch via Keras functional API) |
+| Utilities | `resnets_utils.py` (custom block helpers) |
+| Data | Pandas, NumPy |
+| Visualization | Matplotlib |
+| Notebook | Jupyter / Google Colab |
-
+---
-
+## ๐ References
-``` python
-X_holdout_scaled = Rescale(X_holdout)
-```
+- [State Farm Distracted Driver Detection โ Kaggle](https://www.kaggle.com/c/state-farm-distracted-driver-detection)
+- He, K., Zhang, X., Ren, S., & Sun, J. (2015). *Deep Residual Learning for Image Recognition.* [arXiv:1512.03385](https://arxiv.org/abs/1512.03385)
+- [deeplearning.ai โ ResNet50 from scratch (Coursera)](https://www.coursera.org/learn/convolutional-neural-networks)
+- [Keras Functional API Documentation](https://keras.io/guides/functional_api/)
-
+---
-
+
-``` python
-index = 50000
-PrintImage(X_holdout_scaled, index = index, Y = probabilities)
-print('y_pred = ' + str(probabilities[index].argmax()))
-```
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
-
-
- y = 9
- y_pred = 9
+⭐ Star the main repo if this helped you!
-
-
-
-
From 3ebaf3b97621a9b1092e82ca0e5a7e821cdd8ea8 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Wed, 18 Mar 2026 23:51:36 +0530
Subject: [PATCH 4/8] Update README.md
---
Drowsiness detection [OPEN CV]/README.md | 347 ++++++++++++++++++-----
1 file changed, 278 insertions(+), 69 deletions(-)
diff --git a/Drowsiness detection [OPEN CV]/README.md b/Drowsiness detection [OPEN CV]/README.md
index 1d46eb7..d52d79a 100644
--- a/Drowsiness detection [OPEN CV]/README.md
+++ b/Drowsiness detection [OPEN CV]/README.md
@@ -1,105 +1,314 @@
-# Driver Drowsiness Detection System
+
-## Introduction
+# Driver Drowsiness Detection – OpenCV + Keras CNN
-This project focuses on building a Driver Drowsiness Detection System that monitors a driver's eye status using a webcam and alerts them if they appear drowsy. We utilize **OpenCV** for image capture and preprocessing, while a **Convolutional Neural Network (CNN)** model classifies whether the driver's eyes are 'Open' or 'Closed.' If drowsiness is detected, an alarm is triggered to alert the driver.
+[](https://www.python.org/)
+[](https://opencv.org/)
+[](https://keras.io/)
+[](https://www.pygame.org/)
+[]()
+[](../LICENSE.md)
-## Project Overview
+> A **real-time driver drowsiness detection system** that uses **Haar Cascade classifiers** to locate the driver's eyes in every webcam frame and a **custom-trained CNN** (`cnnCat2.h5`) to classify each eye as **Open** or **Closed**, sounding a `pygame` alarm when drowsiness is detected.
-### Steps in the Detection Process:
-1. **Image Capture**: Capture the image using a webcam.
-2. **Face Detection**: Detect the face in the captured image and create a Region of Interest (ROI).
-3. **Eye Detection**: Detect the eyes from the ROI and feed them into the classifier.
-4. **Eye Classification**: The classifier categorizes whether the eyes are open or closed.
-5. **Drowsiness Score Calculation**: Calculate a score to determine if the driver is drowsy based on how long their eyes remain closed.
+[๐ Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
-## CNN Model
+
-The **Convolutional Neural Network (CNN)** architecture consists of the following layers:
-- **Convolutional Layers**:
- - 32 nodes, kernel size 3
- - 32 nodes, kernel size 3
- - 64 nodes, kernel size 3
-- **Fully Connected Layers**:
- - 128 nodes
- - Output layer: 2 nodes (with Softmax activation for classification)
+---
-### Activation Function:
-- **ReLU**: Used in all layers except the output layer.
-- **Softmax**: Used in the output layer to classify the eyes as either 'Open' or 'Closed.'
+## โ ๏ธ Safety Context
-## Project Prerequisites
+> Drowsy driving causes thousands of road fatalities annually. This system provides a real-time, automated alert to combat driver fatigue using a lightweight CNN that runs entirely on a standard webcam feed.
-### Required Hardware:
-- A webcam for image capture.
+---
-### Required Libraries:
-Ensure Python (version 3.6 recommended) is installed on your system. Then, install the following libraries using `pip`:
+## ๐ Table of Contents
-```bash
-pip install opencv-python
-pip install tensorflow
-pip install keras
-pip install pygame
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [CNN Model Architecture](#-cnn-model-architecture)
+- [Dataset](#-dataset)
+- [Haar Cascade Files](#-haar-cascade-files)
+- [Scoring & Alert Logic](#-scoring--alert-logic)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [Known Limitations](#-known-limitations)
+- [References](#-references)
+
+---
+
+## ๐ฌ About the Project
+
+This project detects driver drowsiness through a two-stage pipeline:
+
+1. **Detection**: OpenCV Haar Cascade classifiers locate the face and each eye (left, right) in every frame
+2. **Classification**: A custom-trained Keras CNN (`cnnCat2.h5`) classifies each eye ROI as **Open** or **Closed**
+
+A running score is incremented each frame when eyes are detected as closed. When the score crosses a threshold, `pygame` plays `alarm.wav` and a "**DROWSY**" warning is overlaid on the video feed.
+
+**What this project covers:**
+- Training a binary CNN classifier on a custom ~7,000-image eye dataset
+- Real-time face and eye detection with OpenCV Haar cascades
+- Score-based drowsiness logic (accumulate → threshold → alarm)
+- Alarm playback with `pygame.mixer`
+
+---
+
+## โ๏ธ How It Works
+
+```
+Webcam Frame (live stream)
+        │
+        ▼
+Convert BGR → Grayscale
+        │
+        ▼
+Haar Cascade: Detect Face
+(haarcascade_frontalface_alt.xml)
+        │
+        ▼
+Haar Cascade: Detect Eyes from frame
+  ├── Left Eye  (haarcascade_lefteye_2splits.xml)
+  └── Right Eye (haarcascade_righteye_2splits.xml)
+        │
+        ▼
+Crop Eye ROI → Resize → Normalize
+        │
+        ▼
+CNN Forward Pass (cnnCat2.h5)
+  → Predict: ['Close', 'Open']
+  → rpred / lpred updated per frame
+        │
+        ├── Both eyes Open → score decremented (min 0)
+        │
+        ├── Eye(s) Closed → score incremented
+        │
+        └── score > threshold
+               │
+               ▼
+        pygame alarm.wav
+        "DROWSY" on screen
+        Red border on frame
+```
+
+---
+
+## ๐ง CNN Model Architecture
+
+`model.py` defines and trains the CNN classifier. The trained weights are saved as `models/cnnCat2.h5`.
+
+```
+Input: Eye ROI image (24 × 24 × 1, grayscale)
+        │
+        ▼
+Conv2D(32, 3×3) → ReLU → MaxPool(1,1)
+Conv2D(32, 3×3) → ReLU → MaxPool(1,1)
+Conv2D(64, 3×3) → ReLU → MaxPool(1,1)
+        │
+        ▼
+Flatten
+Dense(128) → ReLU
+Dropout(0.5)
+Dense(2) → Softmax
+        │
+        ▼
+Output: ['Close', 'Open']
+```
+
+**Training configuration:**
+
+| Parameter | Value |
+|-----------|-------|
+| Classes | 2: `Close` / `Open` |
+| Input Size | 24 × 24 × 1 (grayscale) |
+| Optimizer | Adam |
+| Loss | Categorical Cross-Entropy |
+| Activation (hidden) | ReLU |
+| Activation (output) | Softmax |
+| Regularization | Dropout (0.5) |
+
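Putting the diagram and table together, the model definition plausibly looks like the following (a sketch under those assumptions, not the verbatim contents of `model.py`):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Flatten,
                                     Dense, Dropout)

# Layer order and pool sizes follow the architecture diagram above
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(24, 24, 1)),
    MaxPooling2D(pool_size=(1, 1)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(1, 1)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(1, 1)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(2, activation='softmax'),   # ['Close', 'Open']
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
print(model.output_shape)  # (None, 2)
```

Note the (1, 1) pools are effectively no-ops; they are kept here only because the diagram lists them.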
+---
+
+## ๐ Dataset
+
+| Property | Details |
+|----------|---------|
+| **Type** | Custom, captured via webcam script |
+| **Total Images** | ~7,000 eye images |
+| **Classes** | `Open` / `Close` |
+| **Conditions** | Various lighting conditions |
+| **Cleaning** | Manually cleaned to remove unusable frames |
+
+The dataset was created by writing a capture script that crops eye regions frame by frame and saves them to disk, labeled by folder (`Open/` or `Closed/`). It was then manually reviewed to remove noisy or ambiguous images.
+
+> **Want to train on your own data?** Run `model.py` against your own captured eye dataset following the same `Open/Close` folder structure.
+
+---
+
+## ๐ Haar Cascade Files
+
+Three XML classifiers are used from the `haar cascade files/` folder:
+
+| File | Purpose |
+|------|---------|
+| `haarcascade_frontalface_alt.xml` | Detects the driver's face bounding box |
+| `haarcascade_lefteye_2splits.xml` | Detects the left eye region within the frame |
+| `haarcascade_righteye_2splits.xml` | Detects the right eye region within the frame |
+
+These are pre-trained OpenCV Haar cascades, so no training is required. They are loaded in `drowsinessdetection.py` as:
+
+```python
+face = cv2.CascadeClassifier('haar cascade files/haarcascade_frontalface_alt.xml')
+leye = cv2.CascadeClassifier('haar cascade files/haarcascade_lefteye_2splits.xml')
+reye = cv2.CascadeClassifier('haar cascade files/haarcascade_righteye_2splits.xml')
```
-### Other Project Files:
-- **Haar Cascade Files**: Located in the "haar cascade files" folder, these XML files are necessary for detecting faces and eyes.
-- **Model File**: The "models" folder contains the pre-trained CNN model `cnnCat2.h5`.
-- **Alarm Sound**: The audio clip `alarm.wav` will play when drowsiness is detected.
-- **Python Files**:
- - `Model.py`: The file used to build and train the CNN model.
- - `Drowsiness detection.py`: The main file that executes the driver drowsiness detection system.
+---
-## How the Algorithm Works
+## ๐ฏ Scoring & Alert Logic
-### Step 1 โ Image Capture
-The webcam captures images in real-time using `cv2.VideoCapture(0)` and processes each frame. The frames are stored in a variable `frame`.
+The system uses a **running score counter** rather than a fixed-frame threshold:
-### Step 2 โ Face Detection
-The image is converted to grayscale for face detection using a **Haar Cascade Classifier**. The faces are detected using `detectMultiScale()`, and boundary boxes are drawn around the detected faces.
+```python
+lbl = ['Close', 'Open'] # CNN output labels
-### Step 3 โ Eye Detection
-Similar to face detection, eyes are detected within the ROI using another cascade classifier. The eye images are extracted and passed to the CNN model for classification.
+# Per frame:
+if rpred[0] == 0 and lpred[0] == 0: # Both eyes closed
+ score += 1
+ cv2.putText(frame, "Closed", ...)
+else: # Eyes open
+ score -= 1
+ cv2.putText(frame, "Open", ...)
-### Step 4 โ Eye Classification
-The extracted eye images are preprocessed by resizing to 24x24 pixels, normalizing the values, and then passed into the CNN model (`cnnCat2.h5`). The model predicts whether the eyes are open or closed.
+score = max(score, 0) # Score never goes negative
-### Step 5 โ Drowsiness Detection
-A score is calculated based on the status of both eyes. If both eyes are closed for an extended period, the score increases, indicating drowsiness. If the score exceeds a threshold, an alarm is triggered using the **Pygame** library.
+if score > 15: # Drowsiness threshold
+ # Sound alarm
+ mixer.Sound('alarm.wav').play()
+ # Draw red border on frame
+ thicc = min(thicc + 2, 16)
+ cv2.rectangle(frame, (0,0), (width,height), (0,0,255), thicc)
+```
+
+| Variable | Value | Meaning |
+|----------|:-----:|---------|
+| `score` threshold | **15** | Frames of closed eyes before alarm |
+| `rpred` / `lpred` | `0` = Closed, `1` = Open | CNN prediction per eye |
+| Border thickness `thicc` | Grows up to 16px | Visual urgency indicator |
+
+---
+
+## ๐ Project Structure
+
+```
+Drowsiness detection [OPEN CV]/
+│
+├── haar cascade files/
+│   ├── haarcascade_frontalface_alt.xml    # Face detector
+│   ├── haarcascade_lefteye_2splits.xml    # Left eye detector
+│   └── haarcascade_righteye_2splits.xml   # Right eye detector
+│
+├── models/
+│   └── cnnCat2.h5               # Trained CNN weights (download separately)
+│
+├── drowsinessdetection.py       # Main script: webcam loop + detection + alarm
+├── model.py                     # CNN model definition + training script
+├── alarm.wav                    # Alert sound file
+└── README.md                    # You are here
+```
+
+> **Note:** `models/cnnCat2.h5` is not included in the repo due to GitHub file size limits. Download it from the Google Drive link in the project or train your own by running `model.py`.
-## Execution Instructions
+---
-### Running the Detection System
+## ๐ Getting Started
-1. Open the command prompt and navigate to the directory where the main file `drowsiness detection.py` is located.
-2. Run the script using the following command:
+### 1. Clone the repository
```bash
-python drowsiness detection.py
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Drowsiness detection [OPEN CV]"
```
-The system will access the webcam and start detecting drowsiness. The real-time status will be displayed on the screen.
+### 2. Set up environment
-## Summary
+```bash
+python -m venv venv
+source venv/bin/activate # Linux / macOS
+venv\Scripts\activate # Windows
+
+pip install -r requirements.txt
+```
-This Python project implements a **Driver Drowsiness Detection System** using **OpenCV** and a **CNN model** to detect whether the driverโs eyes are open or closed. When the eyes are detected as closed for a prolonged time, an alert sound is played to prevent potential accidents. This system can be implemented in vehicles or other applications to enhance driver safety.
+### 3. Download the trained model
-## Future Enhancements
+The `cnnCat2.h5` model file must be placed in the `models/` folder. Download it from the link provided in the repository issues/releases, then:
-- Improve the detection accuracy by training on a larger dataset.
-- Implement real-time monitoring for multiple people.
-- Add functionalities to detect other signs of drowsiness like head tilting or yawning.
-
-## Contributing
+```bash
+mkdir models
+# Place cnnCat2.h5 inside models/
+```
-Feel free to contribute by submitting issues or pull requests. For major changes, please open an issue to discuss the proposed changes before submitting a PR.
+Or train your own model from scratch:
+```bash
+python model.py
+# Saves models/cnnCat2.h5 automatically
+```
-## Acknowledgments
+### 4. Run the detector
-- [OpenCV Documentation](https://opencv.org/)
+```bash
+python drowsinessdetection.py
+```
+
+- The webcam opens automatically
+- Eyes detected as closed → score increments
+- Score exceeds threshold → **alarm sounds + red border appears**
+- Press **`q`** to quit
+
+---
+
+## ๐ ๏ธ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Computer Vision | OpenCV (`cv2`) |
+| Eye Detection | Haar Cascade Classifiers |
+| Deep Learning | Keras + TensorFlow backend |
+| Model | Custom CNN (`cnnCat2.h5`) |
+| Audio Alarm | Pygame (`pygame.mixer`) |
+| Numerical Processing | NumPy |
+
+---
+
+## โ ๏ธ Known Limitations
+
+| Limitation | Detail |
+|-----------|--------|
+| **Lighting sensitivity** | Haar cascades and CNN accuracy drop under poor or uneven lighting |
+| **Glasses / sunglasses** | Frames and tinted lenses obstruct eye detection |
+| **Head pose** | Extreme angles may cause Haar cascade face/eye detection to fail |
+| **Single eye closure** | If only one eye closes (winking), score increments only partially |
+| **No yawn detection** | Fatigue from yawning is not measured; only eye closure is |
+
+---
+
+## ๐ References
+
+- [OpenCV Haar Cascade Documentation](https://docs.opencv.org/4.x/db/d28/tutorial_cascade_classifier.html)
- [Keras Documentation](https://keras.io/)
-- [TensorFlow Documentation](https://www.tensorflow.org/)
+- [Pygame mixer Documentation](https://www.pygame.org/docs/ref/mixer.html)
+
+---
+
+
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
----
\ No newline at end of file
+
From 4c45f0d0f20c0d62bdaa1538cce68c31c9011eff Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:02:10 +0530
Subject: [PATCH 5/8] Update README.md
---
.../README.md | 255 ++++++++++++++++++
1 file changed, 255 insertions(+)
diff --git a/Gender and age detection using deep learning/README.md b/Gender and age detection using deep learning/README.md
index e69de29..7cf6f24 100644
--- a/Gender and age detection using deep learning/README.md
+++ b/Gender and age detection using deep learning/README.md
@@ -0,0 +1,255 @@
+
+
+# Gender & Age Detection – OpenCV Deep Learning
+
+[](https://www.python.org/)
+[](https://opencv.org/)
+[](http://caffe.berkeleyvision.org/)
+[](https://talhassner.github.io/home/projects/Adience/Adience-data.html)
+[](../LICENSE.md)
+
+> Detects **faces** in images or a live webcam feed and predicts each person's **gender** (Male/Female) and **age range** across 8 age buckets, using three pre-trained deep learning models loaded via **OpenCV DNN**.
+
+[๐ Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
+
+
+
+---
+
+## ๐ Table of Contents
+
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [The Three Models](#-the-three-models)
+- [Age & Gender Classes](#-age--gender-classes)
+- [CNN Architecture](#-cnn-architecture)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [References & Citation](#-references--citation)
+
+---
+
+## ๐ฌ About the Project
+
+This project builds a **real-time gender and age detection system** using three pre-trained models served through OpenCV's DNN module; no model training is required. Based on the DataFlair deep learning project, it uses:
+
+- A **TensorFlow SSD** model for face detection
+- A **Caffe CNN** (Levi & Hassner, 2015) for gender classification
+- A **Caffe CNN** (Levi & Hassner, 2015) for age prediction
+
+The script (`gad.py`) accepts a **static image** via `--image` argument or runs on a **live webcam feed**, draws bounding boxes around detected faces, and overlays the predicted gender and age range on each face.
+
+---
+
+## โ๏ธ How It Works
+
+```
+Input: Image / Webcam Frame
+        │
+        ▼
+blobFromImage(frame, 1.0, (300×300), [104,117,123])
+        │
+        ▼
+┌──────────────────────────────────────┐
+│  Face Detection (TensorFlow SSD)     │
+│  opencv_face_detector_uint8.pb       │
+│  opencv_face_detector.pbtxt          │
+└──────────────────────────────────────┘
+        │
+        ▼
+For each face (confidence > 0.7):
+  Crop face ROI + 20px padding
+  blobFromImage(face, 1.0, (227×227), MODEL_MEAN_VALUES)
+        │
+   ┌────┴─────┐
+   ▼          ▼
+┌────────┐ ┌────────┐
+│ Gender │ │  Age   │
+│ Network│ │ Network│
+│ (Caffe)│ │ (Caffe)│
+└────────┘ └────────┘
+    │          │
+    ▼          ▼
+Male/Female  Age Bucket
+    └────┬─────┘
+         ▼
+"Gender: Male  Age: (25-32)"
+overlaid on bounding box
+```
+
+**Key preprocessing constant:**
+```python
+MODEL_MEAN_VALUES = (78.4263377603, 87.7689143744, 114.895847746)
+```
+> BGR mean values subtracted from every face blob to normalize for illumination variation across the Adience training data.
+
+---
+
+## ๐ง The Three Models
+
+| Model | Framework | Files | Purpose |
+|-------|-----------|-------|---------|
+| **Face Detector** | TensorFlow SSD | `opencv_face_detector_uint8.pb` + `opencv_face_detector.pbtxt` | Detect face bounding boxes |
+| **Gender Net** | Caffe (Levi & Hassner) | `gender_net.caffemodel` + `gender_deploy.prototxt` | Classify Male / Female |
+| **Age Net** | Caffe (Levi & Hassner) | `age_net.caffemodel` + `age_deploy.prototxt` | Predict one of 8 age ranges |
+
+```python
+faceNet = cv2.dnn.readNet("opencv_face_detector_uint8.pb", "opencv_face_detector.pbtxt")
+ageNet = cv2.dnn.readNet("age_net.caffemodel", "age_deploy.prototxt")
+genderNet = cv2.dnn.readNet("gender_net.caffemodel", "gender_deploy.prototxt")
+```
+
+---
+
+## ๐ท๏ธ Age & Gender Classes
+
+**Gender** (2 classes):
+```python
+genderList = ['Male', 'Female']
+```
+
+**Age** (8 buckets):
+```python
+ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)',
+ '(25-32)', '(38-43)', '(48-53)', '(60-100)']
+```
+
+> Age is treated as a **classification problem** over 8 discrete ranges rather than regression: Levi & Hassner (2015) found classification over predefined buckets more robust than direct regression on the Adience benchmark.
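After `ageNet.forward()`, mapping the 8-way softmax output back to a bucket is a plain argmax. A mock of that step (the probability values below are invented; only `ageList` comes from the script):

```python
import numpy as np

ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)',
           '(25-32)', '(38-43)', '(48-53)', '(60-100)']

# Stand-in for agePreds = ageNet.forward() -- shape (1, 8) softmax scores
agePreds = np.array([[0.01, 0.02, 0.03, 0.10, 0.55, 0.20, 0.06, 0.03]])
age = ageList[agePreds[0].argmax()]
print(f"Age: {age}")  # Age: (25-32)
```

The gender branch works identically with a length-2 `genderList` instead of the 8 age buckets.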
+
+---
+
+## ๐๏ธ CNN Architecture
+
+Both age and gender models share the same architecture, a lightweight CNN similar to CaffeNet/AlexNet, trained on the **Adience dataset**:
+
+```
+Input: 227 × 227 × 3 face crop (mean-subtracted)
+        │
+Conv1: 96 filters, 7×7 kernel → ReLU → MaxPool → LRN
+Conv2: 256 filters, 5×5 kernel → ReLU → MaxPool → LRN
+Conv3: 384 filters, 3×3 kernel → ReLU → MaxPool
+        │
+FC1: 512 nodes → ReLU → Dropout
+FC2: 512 nodes → ReLU → Dropout
+        │
+Softmax
+├── Gender Net output: 2 (Male / Female)
+└── Age Net output:    8 (age range buckets)
+```
+
+---
+
+## ๐ Project Structure
+
+```
+Gender and age detection using deep learning/
+│
+├── gad.py                           # Main script: detection pipeline
+│
+├── age_net.caffemodel               # Age model weights (Caffe, ~44 MB)
+├── age_deploy.prototxt              # Age model architecture
+├── gender_net.caffemodel            # Gender model weights (Caffe, ~44 MB)
+├── gender_deploy.prototxt           # Gender model architecture
+├── opencv_face_detector_uint8.pb    # Face detector weights (TensorFlow)
+├── opencv_face_detector.pbtxt       # Face detector architecture
+│
+├── girl1.jpg                        # Sample test images
+├── girl2.jpg
+├── kid1.jpg
+├── man1.jpg
+├── minion.jpg
+├── woman1.jpg
+├── woman3.jpg
+│
+└── README.md
+```
+
+> **Note:** The `.caffemodel` files (~44 MB each) may not be included in the repository due to GitHub's file size limits. If missing, download them from [Tal Hassner's Adience page](https://talhassner.github.io/home/projects/Adience/Adience-data.html) and place them in the project root.
+
+---
+
+## ๐ Getting Started
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Gender and age detection using deep learning"
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate # Linux / macOS
+venv\Scripts\activate # Windows
+
+pip install -r requirements.txt
+```
+
+### 3. Run on a sample image
+
+```bash
+python gad.py --image girl1.jpg
+# Output: Gender: Female  Age: (25-32) years
+```
+
+Try the included sample images:
+
+```bash
+python gad.py --image man1.jpg
+python gad.py --image kid1.jpg
+python gad.py --image woman1.jpg
+python gad.py --image minion.jpg
+```
+
+### 4. Run on live webcam
+
+```bash
+python gad.py
+# No --image flag: defaults to webcam (index 0)
+# Press Q to quit
+```
+
+---
+
+## ๐ ๏ธ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Computer Vision | OpenCV (`cv2.dnn`) |
+| Face Detection | TensorFlow SSD (ResNet-10 backbone) |
+| Age / Gender Models | Caffe (Levi & Hassner, 2015) |
+| Argument Parsing | `argparse` |
+| Numerical Processing | NumPy |
+
+---
+
+## ๐ References & Citation
+
+```bibtex
+@inproceedings{Levi2015,
+ author = {Gil Levi and Tal Hassner},
+ title = {Age and Gender Classification Using Convolutional Neural Networks},
+ booktitle = {IEEE Workshop on Analysis and Modeling of Faces and Gestures (AMFG),
+ at the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
+ year = {2015}
+}
+```
+
+- [Levi & Hassner (2015) โ Original Paper & Models](https://talhassner.github.io/home/projects/Adience/Adience-data.html)
+- [Adience Benchmark Dataset](https://talhassner.github.io/home/projects/Adience/Adience-data.html)
+- [OpenCV DNN Face Detector](https://github.com/opencv/opencv/tree/master/samples/dnn)
+- [LearnOpenCV โ Age & Gender Classification](https://learnopencv.com/age-gender-classification-using-opencv-deep-learning-c-python/)
+
+---
+
+
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
+
+
From 788922517d9338f3a8aa4e63331a06cd7de911b7 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:05:28 +0530
Subject: [PATCH 6/8] Create README.md
---
.../README.md | 218 ++++++++++++++++++
1 file changed, 218 insertions(+)
create mode 100644 Getting Admission in College Prediction/README.md
diff --git a/Getting Admission in College Prediction/README.md b/Getting Admission in College Prediction/README.md
new file mode 100644
index 0000000..d4278cc
--- /dev/null
+++ b/Getting Admission in College Prediction/README.md
@@ -0,0 +1,218 @@
+
+
+# 🎓 Getting Admission in College Prediction
+
+[](https://www.python.org/)
+[](https://scikit-learn.org/)
+[](https://jupyter.org/)
+[](https://www.kaggle.com/mohansacharya/graduate-admissions)
+[]()
+[](../LICENSE.md)
+
+> Predicts a student's **probability of graduate college admission** (as a continuous value between 0 and 1) from 7 academic and profile features, using a `GridSearchCV`-powered model comparison across 6 regression algorithms.
+
+[← Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
+
+
+
+---
+
+## 📋 Table of Contents
+
+- [About the Project](#-about-the-project)
+- [Dataset](#-dataset)
+- [Features](#-features)
+- [Methodology](#-methodology)
+- [Model Comparison Results](#-model-comparison-results)
+- [Final Model Performance](#-final-model-performance)
+- [Sample Predictions](#-sample-predictions)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+
+---
+
+## 🔬 About the Project
+
+Getting into a good graduate program is one of the most competitive processes for students worldwide. This project builds a **regression model** that predicts the probability of admission based on a student's GRE score, TOEFL score, CGPA, university rating, SOP, LOR, and research experience.
+
+Six regression algorithms are trained and compared using **GridSearchCV with 5-fold cross-validation** via a custom `find_best_model()` function. The best-performing model is then evaluated on a held-out test set.
+
+**What this project covers:**
+- Exploratory data analysis on 500 graduate applicant profiles
+- Custom `find_best_model()` with GridSearchCV across 6 regressors
+- Feature importance and correlation analysis
+- Linear Regression selected as the final model with **R² = 0.821** on the test set
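A `find_best_model()` of this shape can be sketched as below. The candidate set is trimmed to three of the six regressors and the parameter grids are illustrative assumptions, not the notebook's exact ones:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Sketch of find_best_model(): run GridSearchCV (cv=5) over a dict of
# candidate regressors and collect each one's best params and CV R².
# Only 3 of the 6 models are shown; the grids are illustrative.
def find_best_model(X, y):
    candidates = {
        "linear_regression": (LinearRegression(), {"fit_intercept": [True, False]}),
        "lasso": (Lasso(), {"alpha": [0.1, 1.0]}),
        "decision_tree": (DecisionTreeRegressor(random_state=0),
                          {"splitter": ["best", "random"]}),
    }
    rows = []
    for name, (model, grid) in candidates.items():
        search = GridSearchCV(model, grid, cv=5)  # scoring defaults to R² for regressors
        search.fit(X, y)
        rows.append({"model": name,
                     "best_params": search.best_params_,
                     "cv_r2": search.best_score_})
    return pd.DataFrame(rows).sort_values("cv_r2", ascending=False)

# Synthetic stand-in for the 7-feature admission data
X, y = make_regression(n_samples=200, n_features=7, noise=10.0, random_state=42)
print(find_best_model(X, y))
```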
+
+---
+
+## 📊 Dataset
+
+| Property | Details |
+|----------|---------|
+| **File** | `admission_predict.csv` |
+| **Source** | [Kaggle โ Graduate Admissions](https://www.kaggle.com/mohansacharya/graduate-admissions) |
+| **Rows** | 500 student records |
+| **Columns** | 9 (including Serial No. and target) |
+| **Task** | Regression - predict `Chance of Admit` ∈ [0, 1] |
+| **Missing Values** | None |
+
+---
+
+## 🧬 Features
+
+| Column | Type | Range | Description |
+|--------|------|:-----:|-------------|
+| `GRE Score` | Integer | 290–340 | Graduate Record Examination score |
+| `TOEFL Score` | Integer | 92–120 | Test of English as a Foreign Language score |
+| `University Rating` | Integer | 1–5 | Prestige rating of undergraduate university |
+| `SOP` | Float | 1.0–5.0 | Strength of Statement of Purpose |
+| `LOR` | Float | 1.0–5.0 | Strength of Letter of Recommendation |
+| `CGPA` | Float | 6.8–9.92 | Undergraduate GPA (out of 10) |
+| `Research` | Binary | 0 / 1 | Research experience (0 = No, 1 = Yes) |
+| `Chance of Admit` ⭐ | Float | 0.34–0.97 | **Target variable** - probability of admission |
+
+> `Serial No.` is dropped before training as it carries no predictive information.
+
+---
+
+## ⚙️ Methodology
+
+```
+Load admission_predict.csv (500 × 9)
+        │
+        ▼
+EDA + Correlation Analysis
+(heatmap, pairplots, distributions)
+        │
+        ▼
+Drop 'Serial No.' column
+Define X (7 features) and y ('Chance of Admit')
+        │
+        ▼
+find_best_model(X, y)
+└── GridSearchCV (cv=5) over 6 models
+        │
+        ▼
+Select best model → Linear Regression (normalize=True)
+        │
+        ▼
+Train/Test Split (80/20, random_state=5)
+→ 400 train samples, 100 test samples
+        │
+        ▼
+Fit LinearRegression(normalize=True)
+Evaluate on test set → R² = 0.821
+        │
+        ▼
+Sample Predictions
+```
+
+---
+
+## 🏆 Model Comparison Results
+
+All 6 models evaluated using `GridSearchCV(cv=5)` via the custom `find_best_model()` function:
+
+| Model | Best Parameters | CV R² Score |
+|-------|----------------|:-----------:|
+| **Linear Regression** ✅ | `{'normalize': True}` | **0.8108** |
+| Random Forest | `{'n_estimators': 15}` | 0.7689 |
+| KNN | `{'n_neighbors': 20}` | 0.7230 |
+| SVR | `{'gamma': 'scale'}` | 0.6541 |
+| Decision Tree | `{'criterion': 'mse', 'splitter': 'random'}` | 0.5868 |
+| Lasso | `{'alpha': 1, 'selection': 'random'}` | 0.2151 |
+
+> ✅ **Linear Regression** selected as the final model: highest cross-validation R² score of **0.8108**.
+
+> Lasso performed poorly (R² = 0.2151) because L1 regularization shrinks coefficients aggressively, which is harmful here, where all 7 features are genuinely correlated with admission probability.
+
+---
+
+## 🎯 Final Model Performance
+
+| Metric | Value |
+|--------|:-----:|
+| Model | `LinearRegression(normalize=True)` |
+| 5-Fold Cross-Validation Score | **81.0%** |
+| Train samples | 400 |
+| Test samples | 100 |
+| **Test R² Score** | **0.8215** |
+
+---
+
+## 🔮 Sample Predictions
+
+```python
+# Input: [GRE, TOEFL, Univ Rating, SOP, LOR, CGPA, Research]
+
+model.predict([[337, 118, 4, 4.5, 4.5, 9.65, 0]])
+# → Chance of getting into UCLA is 92.855%
+
+model.predict([[320, 113, 2, 2.0, 2.5, 8.64, 1]])
+# → Chance of getting into UCLA is 73.627%
+```
+
+---
+
+## 📁 Project Structure
+
+```
+Getting Admission in College Prediction/
+│
+├── Admission_prediction.ipynb   # Main notebook - EDA, model comparison, training
+├── admission_predict.csv        # Dataset (500 student records)
+├── requirements.txt             # Python dependencies
+└── README.md                    # You are here
+```
+
+---
+
+## 🚀 Getting Started
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Getting Admission in College Prediction"
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate # Linux / macOS
+venv\Scripts\activate # Windows
+
+pip install -r requirements.txt
+```
+
+### 3. Launch the notebook
+
+```bash
+jupyter notebook Admission_prediction.ipynb
+```
+
+---
+
+## 🛠️ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7.4 |
+| ML Library | scikit-learn |
+| Model Selection | `GridSearchCV`, `cross_val_score` |
+| Models | `LinearRegression`, `Lasso`, `SVR`, `DecisionTreeRegressor`, `RandomForestRegressor`, `KNeighborsRegressor` |
+| Data Processing | Pandas, NumPy |
+| Visualization | Matplotlib |
+| Notebook | Jupyter |
+
+---
+
+
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
+
+
From 93b1a30e8951ef2833e860afc7a8c40a0344e1d3 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:10:25 +0530
Subject: [PATCH 7/8] Update README.md
---
.../README.md | 303 +++++++++++++++++-
1 file changed, 302 insertions(+), 1 deletion(-)
diff --git a/Heart Disease Prediction [END 2 END]/README.md b/Heart Disease Prediction [END 2 END]/README.md
index bf59832..d82f9b5 100644
--- a/Heart Disease Prediction [END 2 END]/README.md
+++ b/Heart Disease Prediction [END 2 END]/README.md
@@ -1 +1,302 @@
-Look for Deployed Project At 
\ No newline at end of file
+- Look for the final deployed project at ****
+
+
+
+# 🫀 Cardio Monitor - Heart Disease Prediction Web App
+
+[](https://www.python.org/)
+[](https://flask.palletsprojects.com/)
+[](https://www.mongodb.com/)
+[](https://scikit-learn.org/)
+[]()
+[](LICENSE)
+
+> **Cardio Monitor** is a full-stack web application that predicts whether a patient is at risk of developing **heart disease**, using a machine learning model with **92% accuracy**, built with Flask, MongoDB, and scikit-learn. Course project for **Big Data Analytics (BCSE0158)**.
+
+[](https://github.com/shsarv/Cardio-Monitor/stargazers)
+[](https://github.com/shsarv/Cardio-Monitor/forks)
+
+[🔗 Core ML Project](https://github.com/shsarv/Heart-Disease-Prediction) · [🐛 Report Bug](https://github.com/shsarv/Cardio-Monitor/issues) · [✨ Request Feature](https://github.com/shsarv/Cardio-Monitor/issues)
+
+
+
+---
+
+## ⚠️ Medical Disclaimer
+
+> **This application is for educational and research purposes only.** It does not constitute medical advice. Always consult a qualified cardiologist or medical professional for clinical decisions.
+
+---
+
+## 📋 Table of Contents
+
+- [About the Project](#-about-the-project)
+- [How It Works](#-how-it-works)
+- [Dataset & Features](#-dataset--features)
+- [Model & Performance](#-model--performance)
+- [Architecture](#-architecture)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Future Roadmap](#-future-roadmap)
+- [Tech Stack](#-tech-stack)
+- [References](#-references)
+
+---
+
+## 🔬 About the Project
+
+Heart disease is the leading cause of death globally. Early detection through continuous monitoring can significantly reduce mortality rates. **Cardio Monitor** combines:
+
+- A **machine learning classifier** (92% accuracy) trained on the Cleveland Heart Disease dataset
+- A **Flask web app** for real-time patient input and prediction
+- A **MongoDB** backend for storing patient records and prediction history
+- A **visualization module** for EDA and model insights
+- A roadmap toward **Apache Spark Streaming** for large-scale real-time data processing
+
+The core ML research and model building is documented in the companion repository: [shsarv/Heart-Disease-Prediction](https://github.com/shsarv/Heart-Disease-Prediction).
+
+---
+
+## ⚙️ How It Works
+
+```
+Patient Inputs Clinical Data via Web Form
+              │
+              ▼
+        Flask (app.py)
+      routes request to
+              │
+              ▼
+       prediction.py
+  Loads Heart_model1.pkl
+  Runs model.predict()
+              │
+      ┌───────┴────────┐
+      ▼                ▼
+  At Risk ❤️‍🩹      Not at Risk ✅
+              │
+              ▼
+  Result displayed on web page
+  Record saved to MongoDB (database.py)
+```
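The `prediction.py` step in this flow can be sketched roughly as below. The function names and the field ordering are assumptions based on the dataset columns listed later in this README, not the repo's exact code:

```python
import pickle

# Column order follows the heart.csv feature table; the exact order
# prediction.py expects is an assumption.
FIELDS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
          "thalach", "exang", "oldpeak", "slope", "ca", "thal"]

def load_model(path="Heart_model1.pkl"):
    # prediction.py unpickles the trained scikit-learn classifier once
    with open(path, "rb") as f:
        return pickle.load(f)

def predict_risk(model, form):
    # `form` is a mapping of the 13 clinical fields posted by the web form
    row = [float(form[name]) for name in FIELDS]
    return int(model.predict([row])[0]) == 1  # target == 1 means at risk
```

In `app.py`, the Flask route would pass `request.form` into `predict_risk` and render the matching result template.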
+
+---
+
+## 📊 Dataset & Features
+
+| Property | Details |
+|----------|---------|
+| **File** | `heart.csv` |
+| **Source** | Cleveland Heart Disease Dataset (UCI ML Repository) |
+| **Samples** | 303 patient records |
+| **Task** | Binary classification - Heart Disease (1) / No Heart Disease (0) |
+
+### Input Features
+
+| Feature | Description | Range |
+|---------|-------------|-------|
+| `age` | Age of patient | Years |
+| `sex` | Sex | 0 = Female, 1 = Male |
+| `cp` | Chest pain type | 0–3 |
+| `trestbps` | Resting blood pressure | mm Hg |
+| `chol` | Serum cholesterol | mg/dl |
+| `fbs` | Fasting blood sugar > 120 mg/dl | 0 / 1 |
+| `restecg` | Resting ECG results | 0–2 |
+| `thalach` | Maximum heart rate achieved | bpm |
+| `exang` | Exercise induced angina | 0 / 1 |
+| `oldpeak` | ST depression induced by exercise | Float |
+| `slope` | Slope of peak exercise ST segment | 0–2 |
+| `ca` | Number of major vessels coloured by fluoroscopy | 0–3 |
+| `thal` | Thalassemia | 0–3 |
+| `target` ⭐ | **Heart disease present** | 0 / 1 |
+
+---
+
+## 🤖 Model & Performance
+
+| Metric | Value |
+|--------|:-----:|
+| **Accuracy** | **92%** |
+| **Saved Model** | `Heart_model1.pkl` / `heartmodel.pkl` |
+| **Algorithm** | scikit-learn classifier (see core project) |
+| **Library** | scikit-learn + mlxtend |
+
+> Two model files are present in the repo: `Heart_model1.pkl` (primary, used by `prediction.py`) and `heartmodel.pkl` (earlier iteration). Both are serialized with `pickle`.
+
+> For full model-building details (EDA, feature selection, algorithm comparison, and evaluation), see the core project: [shsarv/Heart-Disease-Prediction](https://github.com/shsarv/Heart-Disease-Prediction).
+
+---
+
+## 🏗️ Architecture
+
+```
+┌─────────────────────────────────────────────┐
+│             Flask Application               │
+│                 (app.py)                    │
+│                                             │
+│  ┌──────────┐  ┌────────────┐  ┌─────────┐  │
+│  │templates/│  │prediction  │  │database │  │
+│  │  HTML    │  │   .py      │  │  .py    │  │
+│  │  pages   │  │  ML model  │  │ MongoDB │  │
+│  └──────────┘  └────────────┘  └─────────┘  │
+│                                             │
+│  ┌───────────────────────────────────────┐  │
+│  │               static/                 │  │
+│  │          CSS · JS · images            │  │
+│  └───────────────────────────────────────┘  │
+└─────────────────────────────────────────────┘
+        │                      │
+        ▼                      ▼
+ Heart_model1.pkl         MongoDB Atlas
+  (scikit-learn)         (patient records
+                          + predictions)
+```
+
+---
+
+## 📁 Project Structure
+
+```
+Cardio-Monitor/
+│
+├── 📁 heart disease prediction/   # Jupyter notebooks - EDA & model training
+├── 📁 static/                     # CSS, JS, images
+├── 📁 templates/                  # Jinja2 HTML templates (input form, result pages)
+├── 📁 __pycache__/
+│
+├── app.py                         # Flask entry point - routes and app config
+├── prediction.py                  # Loads Heart_model1.pkl, runs inference
+├── modelbuild.py                  # Model training and serialization script
+├── database.py                    # MongoDB connection and CRUD operations
+├── visualization.py               # EDA and data visualization utilities
+│
+├── Heart_model1.pkl               # Primary trained model (pickle)
+├── heartmodel.pkl                 # Alternate model iteration (pickle)
+├── heart.csv                      # Cleveland Heart Disease dataset
+├── Input Data.png                 # Screenshot of the web app input form
+│
+├── Procfile                       # Heroku deployment config
+├── requirements.txt               # Python dependencies
+├── .gitignore
+└── README.md
+```
+
+---
+
+## 🚀 Getting Started
+
+### Prerequisites
+
+- Python 3.7+
+- MongoDB (local or [MongoDB Atlas](https://www.mongodb.com/cloud/atlas))
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Cardio-Monitor.git
+cd Cardio-Monitor
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate # Linux / macOS
+venv\Scripts\activate # Windows
+
+pip install -r requirements.txt
+```
+
+### 3. Configure MongoDB
+
+In `database.py`, update your MongoDB connection string:
+
+```python
+# Local MongoDB
+client = pymongo.MongoClient("mongodb://localhost:27017/")
+
+# MongoDB Atlas (cloud)
+client = pymongo.MongoClient("mongodb+srv://<username>:<password>@cluster.mongodb.net/")
+```
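The record-saving side of `database.py` can be sketched like this. The document shape, collection name, and field names are assumptions, not the repo's exact schema:

```python
import datetime

# Shape of the document stored per prediction; the "cardio_monitor" /
# "predictions" names and the field names are illustrative assumptions.
def build_record(features, at_risk):
    return {
        "features": dict(features),
        "at_risk": bool(at_risk),
        "created_at": datetime.datetime.utcnow().isoformat(),
    }

# With PyMongo and a running MongoDB instance, inserting would look like:
#   client["cardio_monitor"]["predictions"].insert_one(build_record(form, result))
```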
+
+### 4. Run the app
+
+```bash
+python app.py
+```
+
+Navigate to **http://127.0.0.1:5000**
+
+### 5. Deploy to Heroku
+
+```bash
+heroku login
+heroku create cardio-monitor-app
+git push heroku main
+heroku open
+```
+
+> The `Procfile` already contains: `web: gunicorn app:app`
+
+---
+
+## 🗺️ Future Roadmap
+
+| Feature | Status |
+|---------|:------:|
+| Flask web app with MongoDB | ✅ Done |
+| 92% accuracy ML model | ✅ Done |
+| Heroku deployment | ✅ Done |
+| **Apache Spark Streaming** - real-time patient data ingestion | 🔜 Planned |
+| **PySpark MLlib** - large-scale distributed model training | 🔜 Planned |
+| **Deep Learning model** (Keras/TensorFlow) | 🔜 Planned |
+| Live demo deployment | 🔜 Planned |
+
+---
+
+## 🛠️ Tech Stack
+
+**Current:**
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Web Framework | Flask |
+| ML Library | scikit-learn, mlxtend |
+| Database | MongoDB (PyMongo) |
+| Model Serialization | Pickle |
+| Frontend | HTML5, CSS3, Bootstrap |
+| Deployment | Heroku (Procfile + gunicorn) |
+| Notebook | Jupyter |
+
+**Planned (Future):**
+
+| Layer | Technology |
+|-------|-----------|
+| Streaming | Apache Spark Streaming |
+| Distributed ML | PySpark MLlib |
+| Deep Learning | Keras / TensorFlow (DeepL) |
+| Database (scale) | MongoDB Atlas |
+
+---
+
+## 📚 References
+
+- [Cleveland Heart Disease Dataset - UCI ML Repository](https://archive.ics.uci.edu/ml/datasets/Heart+Disease)
+- [Core ML Project - shsarv/Heart-Disease-Prediction](https://github.com/shsarv/Heart-Disease-Prediction)
+- [Flask Documentation](https://flask.palletsprojects.com/)
+- [PyMongo Documentation](https://pymongo.readthedocs.io/)
+- [mlxtend Documentation](https://rasbt.github.io/mlxtend/)
+- [Apache Spark Streaming](https://spark.apache.org/streaming/)
+
+---
+
+
+
+**Created by [Sarvesh Kumar Sharma](https://github.com/shsarv)**
+
+Course Project: Big Data Analytics (BCSE0158)
+
+⭐ Star this repo if you found it helpful!
+
+
From 23afd01662bce6d157958c873d9cff84a788ae93 Mon Sep 17 00:00:00 2001
From: shsarv4 <166940544+shsarv4@users.noreply.github.com>
Date: Thu, 19 Mar 2026 00:17:03 +0530
Subject: [PATCH 8/8] Create README.md
---
Human Activity Detection/README.md | 300 +++++++++++++++++++++++++++++
1 file changed, 300 insertions(+)
create mode 100644 Human Activity Detection/README.md
diff --git a/Human Activity Detection/README.md b/Human Activity Detection/README.md
new file mode 100644
index 0000000..985071e
--- /dev/null
+++ b/Human Activity Detection/README.md
@@ -0,0 +1,300 @@
+
+
+# 🏃 Human Activity Recognition – 2D Pose + LSTM RNN
+
+[](https://www.python.org/)
+[](https://www.tensorflow.org/)
+[]()
+[]()
+[]()
+[](../LICENSE.md)
+
+> Classifies **6 human activities** from **2D pose time series** (OpenPose keypoints) using a **2-layer stacked LSTM RNN** built in TensorFlow 1.x, achieving **>90% accuracy** in ~7 minutes of training. Deployed via ngrok with a Flask web app and a `sample_video.mp4` demo.
+
+[← Back to Main Repository](https://github.com/shsarv/Machine-Learning-Projects)
+
+
+
+---
+
+## 📋 Table of Contents
+
+- [About the Project](#-about-the-project)
+- [Key Idea – Why 2D Pose?](#-key-idea--why-2d-pose)
+- [Dataset](#-dataset)
+- [LSTM Architecture](#-lstm-architecture)
+- [Training Configuration](#-training-configuration)
+- [Results & Findings](#-results--findings)
+- [Project Structure](#-project-structure)
+- [Getting Started](#-getting-started)
+- [Tech Stack](#-tech-stack)
+- [References](#-references)
+
+---
+
+## 🔬 About the Project
+
+This experiment classifies human activities using **2D pose time series data** and a **stacked LSTM RNN**. Rather than feeding raw RGB images or expensive 3D pose data into the network, it uses **2D (x, y) keypoints** extracted from video frames via OpenPose, a much lighter and more accessible input representation.
+
+The core research questions:
+
+- Can **2D pose** match **3D pose** accuracy for activity recognition? (removes need for RGBD cameras)
+- Can **2D pose** match **raw RGB image** accuracy? (smaller input = smaller model = better with limited data)
+- Does this approach generalize to **animal** behaviour classification for robotics applications?
+
+The network architecture is based on Guillaume Chevalier's *LSTMs for Human Activity Recognition (2016)*, with key modifications for large class-ordered datasets using **random batch sampling without replacement**.
+
+---
+
+## 🧠 Key Idea – Why 2D Pose?
+
+```
+Raw Video Frame (640×480 RGB)
+        │
+        ▼
+  OpenPose Inference
+  18 body keypoints × (x, y) coords
+        │
+        ▼
+  36-dimensional feature vector per frame
+        │
+        ▼  (32 frames = 1 time window)
+  LSTM RNN → Activity Class
+```
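The pipeline above can be sketched numerically. The window length (32) and 50% overlap follow the dataset description in this README, while the function name `frames_to_windows` is an illustrative assumption:

```python
import numpy as np

# Sketch of the input pipeline: 18 OpenPose keypoints per frame flatten
# to a 36-dim vector; 32 consecutive frames form one window, with 50%
# overlap between windows (stride 16).
def frames_to_windows(keypoints, win=32, stride=16):
    # keypoints: (n_frames, 18, 2) array of (x, y) coords
    flat = keypoints.reshape(len(keypoints), -1)        # (n_frames, 36)
    starts = range(0, len(flat) - win + 1, stride)
    return np.stack([flat[s:s + win] for s in starts])  # (n_windows, 32, 36)

poses = np.random.rand(96, 18, 2)   # 96 dummy frames of 18 keypoints
windows = frames_to_windows(poses)
print(windows.shape)  # (5, 32, 36)
```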
+
+| Input Type | Pros | Cons |
+|------------|------|------|
+| Raw RGB images | High information | Large models, lots of data needed |
+| 3D pose (RGBD) | Rich spatial info | Needs depth sensors |
+| **2D pose (x, y)** ✅ | Lightweight, RGB-only camera, small model | Some spatial ambiguity |
+
+> Limiting the feature vector to 2D pose keypoints allows for a **smaller LSTM model** that generalises better on limited datasets, which is particularly relevant for future animal behaviour recognition tasks.
+
+---
+
+## 📊 Dataset
+
+| Property | Details |
+|----------|---------|
+| **Source** | Berkeley Multimodal Human Action Database (MHAD) - 2D poses extracted via OpenPose |
+| **Download** | `RNN-HAR-2D-Pose-database.zip` (~19.2 MB, Google Drive) |
+| **Subjects** | 12 |
+| **Angles** | 4 camera angles |
+| **Repetitions** | 5 per subject per action |
+| **Total videos** | 1,438 (2 missing from original 1,440) |
+| **Total frames** | 211,200 |
+| **Training windows** | 22,625 (32 timesteps each, 50% overlap) |
+| **Test windows** | 5,751 |
+| **Input shape** | `(22625, 32, 36)` - windows × timesteps × features |
+| **Preprocessing** | ❌ None - raw, unnormalized pose coordinates |
+
+### Activity Classes (6)
+
+| Label | Activity |
+|-------|----------|
+| `JUMPING` | Vertical jumps |
+| `JUMPING_JACKS` | Jumping jacks |
+| `BOXING` | Boxing motions |
+| `WAVING_2HANDS` | Waving with both hands |
+| `WAVING_1HAND` | Waving with one hand |
+| `CLAPPING_HANDS` | Clapping hands |
+
+### Data Files
+
+```
+RNN-HAR-2D-Pose-database/
+├── X_train.txt   # 22,625 training windows (36 comma-separated floats per row)
+├── X_test.txt    # 5,751 test windows
+├── Y_train.txt   # Training labels (0–5)
+└── Y_test.txt    # Test labels (0–5)
+```
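Loading these files can be sketched as follows, assuming one frame per row as described above (the notebook's exact loader may differ in names and details):

```python
import numpy as np

# Each row of an X file holds one frame's 36 floats; every 32 consecutive
# rows form one window, so the flat array reshapes to (n_windows, 32, 36).
def load_X(path, n_steps=32):
    frames = np.loadtxt(path, delimiter=",")
    return frames.reshape(-1, n_steps, frames.shape[1])

# Y files hold one integer label per window
def load_y(path):
    return np.loadtxt(path, dtype=int).reshape(-1)
```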
+
+---
+
+## 🏗️ LSTM Architecture
+
+```
+Input: (batch_size, 32 timesteps, 36 features)
+        │
+        ▼
+  Linear projection: 36 → 34 (ReLU)
+        │
+        ▼
+  ┌──────────────────────────────────┐
+  │ BasicLSTMCell(34, forget_bias=1) │ ← Layer 1
+  ├──────────────────────────────────┤
+  │ BasicLSTMCell(34, forget_bias=1) │ ← Layer 2
+  └──────────────────────────────────┘
+   tf.contrib.rnn.MultiRNNCell (stacked)
+   tf.contrib.rnn.static_rnn (many-to-one)
+        │
+   Last output only
+        │
+        ▼
+  Linear: 34 → 6
+  Softmax → Activity class
+```
+
+> **Why n_hidden = 34?** Testing across a range of hidden unit counts showed the best generalisation when hidden units ≈ n_input (36); 34 was found to be optimal.
+
+> **Many-to-one classifier:** only the last LSTM output (timestep 32) is used for classification, not the full sequence output.
+
+---
+
+## ⚙️ Training Configuration
+
+| Parameter | Value |
+|-----------|-------|
+| Framework | TensorFlow 1.x (`%tensorflow_version 1.x`) |
+| Timesteps (`n_steps`) | 32 |
+| Input features (`n_input`) | 36 (18 keypoints × x, y) |
+| Hidden units (`n_hidden`) | 34 |
+| Classes (`n_classes`) | 6 |
+| Epochs | 300 |
+| Batch size | 512 |
+| Optimizer | Adam |
+| Initial learning rate | 0.005 |
+| LR decay | Exponential: `0.96` per 100,000 steps |
+| Loss | Softmax cross-entropy + L2 regularization |
+| L2 lambda | 0.0015 |
+| Batch strategy | Random sampling **without replacement** (prevents class-order bias) |
+| Training time | ~7 minutes (Google Colab) |
+
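The batch strategy from the table can be sketched as below. The helper name and structure are assumptions; the notebook's extraction code differs in detail but follows the same idea:

```python
import numpy as np

# Sample each epoch's batch indices WITHOUT replacement: shuffle once,
# then slice. Because the dataset is stored ordered by class, plain
# sequential batches would feed the LSTM long single-class runs.
def batch_indices(n_samples, batch_size, rng):
    pool = rng.permutation(n_samples)
    for start in range(0, n_samples - batch_size + 1, batch_size):
        yield pool[start:start + batch_size]

rng = np.random.default_rng(0)
batches = list(batch_indices(22625, 512, rng))
print(len(batches))  # 44 full batches per epoch
unique = np.unique(np.concatenate(batches))
print(len(unique))   # 22528 distinct samples, none repeated
```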
+**L2 regularization formula:**
+```python
+l2 = lambda_loss_amount * sum(
+ tf.nn.l2_loss(tf_var) for tf_var in tf.trainable_variables()
+)
+cost = tf.reduce_mean(softmax_cross_entropy) + l2
+```
+
+**Decayed learning rate:**
+```python
+learning_rate = init_lr * decay_rate ** (global_step / decay_steps)
+# = 0.005 * 0.96 ** (global_step / 100000)
+```
+
+---
+
+## 📈 Results & Findings
+
+| Metric | Value |
+|--------|:-----:|
+| **Final Accuracy** | **> 90%** |
+| Training time | ~7 minutes |
+
+**Confusion pairs observed:**
+- `CLAPPING_HANDS` ↔ `BOXING` - similar upper-body motion pattern
+- `JUMPING_JACKS` ↔ `WAVING_2HANDS` - symmetric arm movements
+
+**Key conclusions:**
+- 2D pose achieves >90% accuracy, validating its use over more expensive 3D pose or raw RGB inputs
+- Hidden units ≈ n_input (34 ≈ 36) gives optimal generalisation
+- Random batch sampling without replacement is **critical**: ordered class batches degrade training significantly
+- The approach is promising for future animal behaviour estimation with autonomous mobile robots
+
+---
+
+## 📁 Project Structure
+
+```
+Human Activity Detection/
+│
+├── 📁 images/      # Result plots and visualizations
+├── 📁 models/      # Saved LSTM model weights
+├── 📁 src/         # Helper source scripts
+├── 📁 templates/   # HTML templates (Flask app)
+│
+├── Human_Activity_Recogination.ipynb                        # Main notebook - dataset, LSTM, training
+├── Human_Action_Classification_deployment_with_ngrok.ipynb  # Flask + ngrok deployment notebook
+├── lstm_train.ipynb                                         # Standalone LSTM training notebook
+├── app.py                                                   # Flask web application
+├── sample_video.mp4                                         # Sample video for live demo
+└── requirements.txt                                         # Python dependencies
+```
+
+---
+
+## 🚀 Getting Started
+
+### 1. Clone the repository
+
+```bash
+git clone https://github.com/shsarv/Machine-Learning-Projects.git
+cd "Machine-Learning-Projects/Human Activity Detection"
+```
+
+### 2. Set up environment
+
+```bash
+python -m venv venv
+source venv/bin/activate # Linux / macOS
+venv\Scripts\activate # Windows
+
+pip install -r requirements.txt
+```
+
+> ⚠️ **TensorFlow 1.x required.** The LSTM uses `tf.contrib.rnn` and `tf.placeholder` APIs from TF1.
+> ```bash
+> pip install tensorflow==1.15.0
+> ```
+
+### 3. Download the dataset
+
+The dataset is downloaded automatically in the notebook:
+```python
+!wget -O RNN-HAR-2D-Pose-database.zip \
+ https://drive.google.com/u/1/uc?id=1IuZlyNjg6DMQE3iaO1Px6h1yLKgatynt
+!unzip RNN-HAR-2D-Pose-database.zip
+```
+
+### 4. Run on Google Colab (recommended)
+
+```
+1. Open Human_Activity_Recogination.ipynb in Google Colab
+2. Runtime → Change runtime type → GPU (optional, speeds up training)
+3. Run all cells - training completes in ~7 minutes
+```
+
+### 5. Deploy with ngrok
+
+```
+Open Human_Action_Classification_deployment_with_ngrok.ipynb
+Follow the ngrok setup cells to expose the Flask app publicly
+```
+
+---
+
+## 🛠️ Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.7+ |
+| Deep Learning | TensorFlow 1.x (`tf.contrib.rnn`) |
+| Model | 2-layer stacked LSTM (`BasicLSTMCell`) |
+| Pose Extraction | OpenPose (CMU Perceptual Computing Lab) |
+| Data Processing | NumPy |
+| Visualization | Matplotlib |
+| Web Framework | Flask |
+| Deployment | ngrok (tunnel) |
+| Notebook | Jupyter / Google Colab |
+
+---
+
+## 📚 References
+
+- Guillaume Chevalier (2016). *LSTMs for Human Activity Recognition.* [github.com/guillaume-chevalier](https://github.com/guillaume-chevalier/LSTM-Human-Activity-Recognition) (MIT License)
+- [Berkeley MHAD Dataset](http://tele-immersion.citris-uc.org/berkeley_mhad)
+- [OpenPose - CMU Perceptual Computing Lab](https://github.com/CMU-Perceptual-Computing-Lab/openpose)
+- Goodfellow et al.: *"It has been observed in practice that when using a larger batch there is a significant degradation in the quality of the model..."* (basis for the small-batch strategy)
+- [Andrej Karpathy - The Unreasonable Effectiveness of RNNs](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) (referenced for the many-to-one classifier design)
+
+---
+
+
+
+Part of the [Machine Learning Projects](https://github.com/shsarv/Machine-Learning-Projects) collection by [Sarvesh Kumar Sharma](https://github.com/shsarv)
+
+⭐ Star the main repo if this helped you!
+
+