GitRAG is a smart Git-based project assistant powered by Retrieval-Augmented Generation (RAG). It helps you ask questions like:
“What changes were made for Render deployment?”
“When was image editing added?”
“Why was vercel.json removed?”
And it answers based only on your Git commit history using LLMs, LangChain, and FAISS. 🤖

| Layer | Tool / Library | Purpose |
|---|---|---|
| LLM | Gemma-2B (Google) | Local language model for generation |
| Vector DB | FAISS | Semantic retrieval of commit chunks |
| Embedding | MiniLM (all-MiniLM-L6-v2) | Convert commit messages into vectors |
| Git Interface | GitPython | Access commit history programmatically |
| Prompt Engine | LangChain | RAG pipeline (Retriever + LLM) |
| Interface | Python CLI | Ask questions via command-line |
| DB Storage | SQLite | Store and fetch commit history |

- 💬 Ask questions about your codebase history
- 🗃️ Extract commits and store them as documents
- 🔍 Retrieve relevant commits using FAISS
- 🤖 Generate human-like answers using Gemma-2B
- ✅ Lightweight & runs on CPU
git clone https://github.com/itzsudipta/GitRAG.git
cd GitRAG
# Create virtual environment (optional)
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
# Install dependencies
pip install -r requirements.txt

📁 GitRAG/
├── main.py # Main execution file
├── commits.db # SQLite DB of Git commits
├── faiss_index/ # FAISS index directory (auto-generated)
├── requirements.txt # List of Python packages
└── README.md # You're here!
Clone your target Git repository inside the folder or adjust the path in main.py.
git clone https://github.com/your-username/PixelCraft.git
Run the assistant:
python main.py
Ask anything about your commits:
Ask about your codebase commits (or type 'exit'):
> What was added for Render hosting?
🧠 Answer:
A Procfile was added and the README was updated with Render deployment steps.
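
The session above is a plain read-answer loop. Here is a minimal sketch of such a loop, assuming a hypothetical `run_cli` helper and an `answer` callable that wraps the RAG pipeline. Neither name comes from main.py. The input and output functions are injectable so the loop can be exercised without a terminal.

```python
from typing import Callable


def run_cli(answer: Callable[[str], str],
            read_line: Callable[[str], str] = input,
            write: Callable[[str], None] = print) -> int:
    """Prompt for questions until the user types 'exit'.

    `answer` is a hypothetical callable standing in for the RAG pipeline.
    Returns the number of questions that were answered.
    """
    write("Ask about your codebase commits (or type 'exit'):")
    answered = 0
    while True:
        try:
            question = read_line("> ").strip()
        except EOFError:
            # Treat end of input (Ctrl-D, closed pipe) like 'exit'.
            break
        if question.lower() == "exit":
            break
        if not question:
            # Ignore empty lines instead of querying the model.
            continue
        write("🧠 Answer:")
        write(answer(question))
        answered += 1
    return answered


if __name__ == "__main__":
    # Placeholder answer function; a real one would call the retriever and LLM.
    run_cli(lambda q: f"(no model attached) You asked: {q}")
```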
- Extracts commit history using GitPython and stores it in SQLite.
- Splits text and generates vector embeddings with MiniLM.
- FAISS retrieves the most relevant chunks based on your query.
- Gemma-2B generates a natural answer from the retrieved content.
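
The retrieve-then-generate steps above can be illustrated with a dependency-free sketch. This is not the project's code. A toy bag-of-words vector stands in for the MiniLM embedding, and a brute-force cosine search stands in for the FAISS index. The names `embed`, `retrieve`, and `build_prompt` are hypothetical. The resulting prompt is what would be handed to the LLM.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy L2-normalised bag-of-words vector (stand-in for a MiniLM embedding)."""
    tokens = re.findall(r"[a-z0-9_]+(?:\.[a-z0-9_]+)*", text.lower())
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Dot product of two unit vectors, i.e. their cosine similarity."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Brute-force top-k search (stand-in for a FAISS similarity search)."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved commits."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the commit history below.\n"
        f"Commits:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    commits = [
        "Add Procfile for Render deployment",
        "Remove vercel.json",
        "Add image crop feature",
    ]
    hits = retrieve("What was added for Render deployment?", commits, k=1)
    print(build_prompt("What was added for Render hosting?", [c for _, c in hits]))
```

A real embedding model matches on meaning rather than shared words, so the toy vectors here only show the shape of the pipeline, not its retrieval quality.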
- Clone your target repository inside the GitRAG folder:
  git clone https://github.com/your-username/your-repo-name.git
- Update the repo_path in your script to match the cloned folder name:
  repo_path = "your-repo-name"
  extract_commits_to_db(repo_path)
- 🔁 Example (if your repo is PixelCraft):
  git clone https://github.com/itzsudipta/PixelCraft.git
  repo_path = "PixelCraft"
  extract_commits_to_db(repo_path)
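
For orientation, here is one way a function like `extract_commits_to_db` could work. It is a sketch under assumptions, not the implementation in main.py. The table schema, the `db_path` parameter, and the helpers `read_commits` and `save_commits` are all hypothetical. It reads commits with GitPython's `Repo.iter_commits()` and writes them with the standard `sqlite3` module.

```python
import sqlite3
from typing import Iterable, Tuple

# Assumed row layout: (sha, author, ISO-8601 date, message)
CommitRow = Tuple[str, str, str, str]


def read_commits(repo_path: str) -> Iterable[CommitRow]:
    """Yield one row per commit reachable from HEAD, newest first."""
    from git import Repo  # GitPython; imported lazily so the rest runs without it

    for commit in Repo(repo_path).iter_commits():
        yield (
            commit.hexsha,
            commit.author.name,
            commit.committed_datetime.isoformat(),
            commit.message.strip(),
        )


def save_commits(conn: sqlite3.Connection, rows: Iterable[CommitRow]) -> int:
    """Upsert rows into a `commits` table and return the total row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS commits ("
        "sha TEXT PRIMARY KEY, author TEXT, date TEXT, message TEXT)"
    )
    # INSERT OR REPLACE keeps re-runs idempotent: one row per commit hash.
    conn.executemany("INSERT OR REPLACE INTO commits VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]


def extract_commits_to_db(repo_path: str, db_path: str = "commits.db") -> int:
    """Copy the commit history of `repo_path` into the SQLite file `db_path`."""
    conn = sqlite3.connect(db_path)
    try:
        return save_commits(conn, read_commits(repo_path))
    finally:
        conn.close()
```

Keying the table on the commit hash means running the extraction again after new commits land only adds the new rows.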
- When was vercel.json removed and why?
- What was the last image processing feature added?
- How was the app configured for deployment?
- When did the project enter the development phase?
- 🌐 Web UI with Streamlit/Flask
- 📦 GitHub API integration
- 📊 Visual diff summaries
- 📄 Auto Changelog Generator
MIT License © 2025 Sudipta Sarkar
- Google Gemma
- LangChain
- FAISS by Meta
- MiniLM
🔧 Built with Python, curiosity, and ❤️ to decode your codebase.