GitRAG is an AI-powered assistant that turns your Git commit history into a searchable knowledge base. It uses a Retrieval-Augmented Generation (RAG) pipeline built on Gemma-2B, FAISS, and LangChain to answer natural-language questions about your repository's past.

itzsudipta/GitRAG

🔍 GitRAG – AI-Powered Git Commit Assistant

GitRAG is a smart Git-based project assistant powered by Retrieval-Augmented Generation (RAG). It helps you ask questions like:

“What changes were made for Render deployment?”
“When was image editing added?”
“Why was vercel.json removed?”

And it answers based only on your Git commit history using LLMs, LangChain, and FAISS. 🤖


🧠 Tech Stack

| Layer | Tool / Library | Purpose |
| --- | --- | --- |
| LLM | Gemma-2B (Google) | Local language model for generation |
| Vector DB | FAISS | Semantic retrieval of commit chunks |
| Embedding | MiniLM (all-MiniLM-L6-v2) | Convert commit messages into vectors |
| Git Interface | GitPython | Access commit history programmatically |
| Prompt Engine | LangChain | RAG pipeline (Retriever + LLM) |
| Interface | Python CLI | Ask questions via command-line |
| DB Storage | SQLite | Store and fetch commit history |

✨ Features

  • 💬 Ask questions about your codebase history
  • 🗃️ Extract commits and store them as documents
  • 🔍 Retrieve relevant commits using FAISS
  • 🤖 Generate human-like answers using Gemma-2B
  • ✅ Lightweight & runs on CPU

🛠️ Installation

git clone https://github.com/itzsudipta/GitRAG.git
cd GitRAG

# Create virtual environment (optional)
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Install dependencies
pip install -r requirements.txt

📂 Project Structure

📁 GitRAG/
├── main.py                  # Main execution file
├── commits.db               # SQLite DB of Git commits
├── faiss_index/             # FAISS index directory (auto-generated)
├── requirements.txt         # List of Python packages
└── README.md                # You're here!

🚀 Running the Assistant

Clone your target Git repository inside the GitRAG folder, or adjust repo_path in main.py to point at an existing clone.

git clone https://github.com/your-username/PixelCraft.git

Run the assistant:

python main.py

Ask anything about your commits:

Ask about your codebase commits (or type 'exit'):
> What was added for Render hosting?

🧠 Answer:
A Procfile was added and the README was updated with Render deployment steps.
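
The session above suggests a simple read-and-answer loop. Here is a minimal sketch of such a loop. Since main.py is not shown in this README, the names `run_cli` and `answer_question` are hypothetical and stand in for whatever the project actually uses; `answer_question` is assumed to wrap the retriever and the LLM.

```python
from typing import Callable

PROMPT = "Ask about your codebase commits (or type 'exit'):"


def run_cli(
    answer_question: Callable[[str], str],
    read_line: Callable[[str], str] = input,
    write_line: Callable[[str], None] = print,
) -> int:
    """Read questions until the user types 'exit'; return how many were answered."""
    answered = 0
    write_line(PROMPT)
    while True:
        try:
            question = read_line("> ").strip()
        except EOFError:
            break  # stdin closed, e.g. piped input ran out
        if question.lower() == "exit":
            break
        if not question:
            continue  # ignore blank lines
        write_line("\n🧠 Answer:")
        write_line(answer_question(question))
        answered += 1
    return answered
```

Passing the reader and writer as parameters keeps the loop easy to test without a terminal; calling `run_cli(my_answer_function)` with the defaults gives the interactive behaviour.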

⚙️ How It Works

  • Extracts commit history using GitPython and stores it in SQLite.
  • Splits text and generates vector embeddings with MiniLM.
  • FAISS retrieves the most relevant chunks based on your query.
  • Gemma-2B generates a natural answer from the retrieved content.
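
The retrieve-then-generate steps above can be sketched without any heavy dependencies. This is an illustration of the idea, not the project's code: word-count vectors replace the MiniLM embeddings, a brute-force cosine ranking replaces the FAISS index, and the LLM call is left out, so the sketch stops at the prompt that would be handed to Gemma-2B. All function names and the sample commit messages are made up.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a MiniLM embedding: lowercase word counts."""
    # Keeps dotted names such as "vercel.json" together as one token.
    return Counter(re.findall(r"[a-z0-9_]+(?:\.[a-z0-9_]+)*", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, commits: list[str], k: int = 2) -> list[str]:
    """Toy stand-in for a FAISS top-k search: rank every commit by similarity."""
    q = embed(query)
    ranked = sorted(commits, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text for the LLM: retrieved commits first, then the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only these commit messages:\n"
        f"{joined}\n\nQuestion: {query}\nAnswer:"
    )


# Made-up commit messages, for illustration only.
commits = [
    "Add Procfile for Render deployment",
    "Remove vercel.json",
    "Add image crop and rotate tools",
]
question = "What was added for Render deployment?"
top = retrieve(question, commits, k=1)
print(build_prompt(question, top))
```

Word counts only match exact tokens, which is the limitation a sentence-embedding model such as MiniLM is meant to overcome; the overall shape (embed, rank, build prompt) stays the same.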

📝 How to Use with Your Own Repo

  1. Clone your target repository inside the GitRAG folder:

    git clone https://github.com/your-username/your-repo-name.git
  2. Update the repo_path in your script to match the cloned folder name:

    repo_path = "your-repo-name"
    extract_commits_to_db(repo_path)
  3. 🔁 Example (if your repo is PixelCraft): clone it from the shell, then set repo_path in your script:

    git clone https://github.com/itzsudipta/PixelCraft.git

    repo_path = "PixelCraft"
    extract_commits_to_db(repo_path)
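
For readers curious what the extraction step might involve, here is a rough sketch of reading commits with GitPython and writing them to SQLite. It is not the project's `extract_commits_to_db`: the function names `read_commits` and `store_commits`, the `CommitRecord` type, and the table schema are all assumptions made for illustration.

```python
import sqlite3
from typing import Iterable, Iterator, NamedTuple


class CommitRecord(NamedTuple):
    sha: str
    author: str
    date: str  # ISO 8601 timestamp
    message: str


def read_commits(repo_path: str) -> Iterator[CommitRecord]:
    """Yield one record per commit reachable from HEAD, using GitPython."""
    from git import Repo  # GitPython; imported here so the rest runs without it

    for c in Repo(repo_path).iter_commits():
        yield CommitRecord(
            c.hexsha,
            c.author.name,
            c.committed_datetime.isoformat(),
            c.message.strip(),
        )


def store_commits(conn: sqlite3.Connection, records: Iterable[CommitRecord]) -> int:
    """Insert records into a `commits` table; return the table's row count."""
    # Hypothetical schema: one row per commit hash.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS commits ("
        "sha TEXT PRIMARY KEY, author TEXT, date TEXT, message TEXT)"
    )
    # INSERT OR REPLACE keeps re-runs idempotent.
    conn.executemany("INSERT OR REPLACE INTO commits VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]


# Usage (assumes GitPython is installed and PixelCraft/ is a cloned repo):
#   conn = sqlite3.connect("commits.db")
#   store_commits(conn, read_commits("PixelCraft"))
```

Keying the table on the commit hash means the script can be re-run after new commits land without duplicating rows.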

📌 Example Questions

  • When was vercel.json removed and why?
  • What was the last image processing feature added?
  • How was the app configured for deployment?
  • When did the project enter the development phase?

📈 Future Plans

  • 🌐 Web UI with Streamlit/Flask
  • 📦 GitHub API integration
  • 📊 Visual diff summaries
  • 📄 Auto Changelog Generator

📜 License

MIT License © 2025 Sudipta Sarkar


🙌 Credits

  • Google Gemma
  • LangChain
  • FAISS by Meta
  • MiniLM

🔧 Built with Python, curiosity, and ❤️ to decode your codebase.
