A real-time video calling application with built-in sign language recognition.
The main motivation behind this project is to explore WebRTC and in-browser machine learning for real-time inference. Inference runs an ONNX model with ONNX Runtime Web inside a Web Worker, keeping it off the main UI thread so the call interface stays responsive.
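As a sketch of how the UI thread and the inference worker might talk to each other, the following defines a hypothetical message contract plus a type guard the worker can run on incoming `postMessage` payloads. The type and function names here are illustrative, not the project's actual API:

```typescript
// Hypothetical message shapes between the UI thread and the inference worker.
type WorkerRequest = { kind: "infer"; landmarks: number[] };
type WorkerResponse =
  | { kind: "result"; label: string; confidence: number }
  | { kind: "error"; message: string };

// Validate a message before touching it, since postMessage payloads
// arrive as untyped data inside the worker.
function isWorkerRequest(msg: unknown): msg is WorkerRequest {
  return (
    typeof msg === "object" &&
    msg !== null &&
    (msg as WorkerRequest).kind === "infer" &&
    Array.isArray((msg as WorkerRequest).landmarks)
  );
}
```

In the real worker, a message passing `isWorkerRequest` would be handed to the ONNX session; anything else gets back an error-shaped response.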
Frontend:
- React 19 (built with Vite)
- TailwindCSS v4 (for styling)
- WebRTC API (for peer-to-peer video streaming and data channels)
- ONNX Runtime Web (for running the sign language detection model in a Web Worker)
- MediaPipe (for fast browser-based hand tracking and preprocessing)
- Socket.IO Client (for WebRTC signaling)
- TypeScript
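MediaPipe's hand tracker emits 21 landmarks per hand; before inference they must be flattened into the tensor the ONNX model expects. A minimal sketch, assuming wrist-relative normalization of the `x`/`y`/`z` coordinates (the actual preprocessing depends on how the model was trained):

```typescript
// One MediaPipe hand landmark: normalized x/y/z coordinates.
interface Landmark { x: number; y: number; z: number; }

// Flatten 21 landmarks into the Float32Array fed to the ONNX model,
// translating every point so the wrist sits at the origin.
function toModelInput(landmarks: Landmark[]): Float32Array {
  const wrist = landmarks[0]; // MediaPipe index 0 is the wrist
  const out = new Float32Array(landmarks.length * 3);
  landmarks.forEach((lm, i) => {
    out[i * 3] = lm.x - wrist.x;
    out[i * 3 + 1] = lm.y - wrist.y;
    out[i * 3 + 2] = lm.z - wrist.z;
  });
  return out;
}
```

Making the input wrist-relative is one common way to keep predictions stable regardless of where the hand sits in the frame.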
Backend:
- Node.js & Express
- Socket.IO (signaling server for establishing WebRTC peer connections)
- TypeScript
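The signaling server's core job is relaying offers, answers, and ICE candidates between peers in the same room. A dependency-free sketch of the room bookkeeping such a Socket.IO handler could build on (function names are illustrative, not the project's actual code):

```typescript
// roomId -> set of connected socket ids.
const rooms = new Map<string, Set<string>>();

// Add a socket to a room; return the peers already present,
// i.e. the targets the newcomer should send WebRTC offers to.
function join(roomId: string, socketId: string): string[] {
  let members = rooms.get(roomId);
  if (!members) {
    members = new Set();
    rooms.set(roomId, members);
  }
  const existing = [...members];
  members.add(socketId);
  return existing;
}

// Remove a socket on disconnect; drop the room once empty.
function leave(roomId: string, socketId: string): void {
  const members = rooms.get(roomId);
  if (!members) return;
  members.delete(socketId);
  if (members.size === 0) rooms.delete(roomId);
}

// Everyone in the room except the sender: the relay targets
// for an answer or ICE candidate.
function peersFor(roomId: string, socketId: string): string[] {
  return [...(rooms.get(roomId) ?? [])].filter((id) => id !== socketId);
}
```

A Socket.IO `connection` handler would call `join` on a "join-room" event, forward each signaling payload to the ids from `peersFor`, and call `leave` on disconnect.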
Clone the repository:

```bash
git clone <your-repository-url>
cd video-calling-webrtc
```

Open a terminal and navigate to the backend directory:

```bash
cd backend
```

Install dependencies:

```bash
npm install
```

Create a `.env` file in the backend directory and add your required API keys:

```bash
METERED_API_KEY=your_metered_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
```

Start the backend development server:

```bash
npm run dev
```

Open a new terminal window or tab and navigate to the frontend directory:

```bash
cd frontend
```

Install dependencies:

```bash
npm install
```

Start the frontend development server:

```bash
npm run dev
```

The application should now be accessible in your web browser at the URL provided by the Vite server (typically http://localhost:5173).
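A missing `.env` entry is easiest to catch at startup. A hypothetical fail-fast helper (not part of the project's code) might look like:

```typescript
// Read a required key from the environment (populated from backend/.env)
// and fail loudly at startup instead of at first use.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example usage at server startup:
// const meteredApiKey = requireEnv("METERED_API_KEY");
// const geminiApiKey = requireEnv("GEMINI_API_KEY");
```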
The project includes a custom machine learning model for sign language detection. The training code and notebook are located in the model directory.
If you want to modify, train, or use your own custom model in the frontend:
- Train or update the model using the resources in the `model` directory.
- Export the trained model to the ONNX format (e.g., `landmark_model.onnx`).
- Copy the exported `.onnx` model file, along with any matching class label files (like `landmark_classes.json`), into the `frontend/public/` directory.
The web application will then automatically load your ONNX model from the public directory into the Web Worker for real-time inference.
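The final step inside the worker is turning the model's raw scores into a class label. A sketch, assuming `landmark_classes.json` holds an ordered array of class names and the model emits one logit per class (both assumptions, since the actual file format depends on the training code):

```typescript
// Convert raw logits to probabilities; subtracting the max
// keeps the exponentials numerically stable.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Pick the most likely class and pair it with its label.
function decode(
  scores: number[],
  labels: string[],
): { label: string; confidence: number } {
  const probs = softmax(scores);
  let best = 0;
  for (let i = 1; i < probs.length; i++) {
    if (probs[i] > probs[best]) best = i;
  }
  return { label: labels[best], confidence: probs[best] };
}
```

In the worker, `decode` would run over the output tensor's data after each inference call, and the result would be posted back to the UI thread.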