GitHub - StudiYash/InstaVision: InstaVision is a project that generates images from a text description of the image. It uses technologies such as Google Imagen 3, OpenAI DALL-E 3, Stable Diffusion, the Redis online database, and more. It includes a feature-rich Telegram Bot and a user-friendly Windows Application.

Project Introduction 🛡️

Abstract

InstaVision is a powerful AI-driven Image Generation and Image Editing project designed to transform your text descriptions into stunning, high-quality images using various Image Generation APIs. Perfect for creators, students, and anyone with a vivid imagination, InstaVision makes it easy to bring your ideas to life with just a few words.

Project Timeline

Start Date: 22nd August 2024
End Date: 14th February 2025
Total Time Required: 5 months and 24 days

My Introduction

Name	GitHub Profile	LinkedIn Profile
Yash Suhas Shukla	GitHub	LinkedIn

Methodology ✨

The Methodology of InstaVision is designed to efficiently process text-based inputs to generate visually stunning, AI-powered images. Below is an in-depth breakdown of each component and its functionality:

1. Text Input

Source: Inputs are received through multiple platforms, including:
- Telegram Bot: Users can interact with the bot by sending text prompts directly via Telegram.
- Windows Application: Desktop users can input their prompts through the standalone InstaVision application.
Purpose: This serves as the starting point of the entire workflow, where users provide their creative ideas or descriptions for image generation.

2. Prompt Preprocessing

This stage ensures that the text input is validated and optimized for the image generation process. It includes two key components:

a) Banned Words Check

Objective: Ensures that inputs adhere to ethical and usage guidelines by filtering out inappropriate or offensive language.
Process:
- Scans the text prompt for words or phrases that are flagged as inappropriate.
- Prompts the user to revise their input if any banned words are detected.
Impact: Maintains the integrity and professionalism of the generated content.
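
The check described above can be sketched as follows. This is a minimal illustration, not the project's actual filter: the word list, the function names, and the wording of the revision message are all assumptions, and it matches single whole words only (phrases would need an extra pass).

```python
import re

# Hypothetical placeholder list; the project's real banned-word list is not shown here.
BANNED_WORDS = {"gore", "nude"}


def find_banned_words(prompt, banned=BANNED_WORDS):
    """Return the sorted banned words that occur in the prompt as whole words."""
    tokens = set(re.findall(r"[a-z0-9']+", prompt.lower()))
    return sorted(tokens & {word.lower() for word in banned})


def check_prompt(prompt):
    """Return (ok, message); the message asks the user to revise when needed."""
    hits = find_banned_words(prompt)
    if hits:
        return False, "Please revise your prompt. Flagged words: " + ", ".join(hits)
    return True, "Prompt accepted."
```

Matching on whole tokens rather than substrings avoids rejecting harmless words that merely contain a flagged word.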

b) Rate Limiting via Redis (Only for Telegram Bot)

Objective: Prevents system overload and abuse by enforcing user-specific rate limits.
Features:
- Default users can generate up to 5 images per day.
- Privileged users can generate up to 50 images per day.
- Admin users have no restrictions.
Technology Used:
- Redis Cloud Database for maintaining real-time user quota information.
- Reset logic to refresh limits every 24 hours.
Impact: Ensures fair usage and system scalability for multiple users simultaneously.
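
A minimal sketch of how such a daily quota might be counted, assuming a fixed 24-hour window that starts at a user's first request. The key naming, the role strings, `allow_request`, and the `InMemoryCounterStore` stand-in are illustrative assumptions rather than the bot's actual code. The store only needs `incr` and `expire`, two methods that a redis-py client also provides, so the in-memory class is there purely to make the example runnable without a Redis server.

```python
import time

# Daily quotas from the list above; None means unlimited (admin users).
DAILY_LIMITS = {"default": 5, "privileged": 50, "admin": None}
WINDOW_SECONDS = 24 * 60 * 60


class InMemoryCounterStore:
    """Tiny local stand-in that mimics the incr/expire subset of a Redis client."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> [count, expires_at or None]

    def _purge_if_expired(self, key):
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= self._clock():
            del self._data[key]

    def incr(self, key):
        self._purge_if_expired(key)
        entry = self._data.setdefault(key, [0, None])
        entry[0] += 1
        return entry[0]

    def expire(self, key, seconds):
        if key in self._data:
            self._data[key][1] = self._clock() + seconds
            return True
        return False


def allow_request(store, user_id, role="default"):
    """Count one generation request and report whether it is within the quota."""
    limit = DAILY_LIMITS[role]
    if limit is None:
        return True  # admins are never counted
    key = f"quota:{user_id}"  # hypothetical key naming
    count = store.incr(key)
    if count == 1:
        store.expire(key, WINDOW_SECONDS)  # start the 24-hour window on first use
    return count <= limit
```

With a real Redis server, the increment and the expiry would ideally be issued atomically, since a crash between the two calls would leave a counter that never resets.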

3. Image Processing

This stage leverages advanced AI models to generate and enhance images based on the processed input. It comprises two critical submodules:

a) Image Generation

Objective: Transforms text-based prompts into visually stunning images using state-of-the-art AI models.
Supported Models:
- Replicate APIs for SDXL Lightning and Flux Schnell.
- Google Imagen API for generating highly detailed and realistic images.
- OpenAI DALL-E 3 API for a creative and versatile image generation approach.
Process:
- Receives processed text input and sends it to the selected image generation API.
- Produces a high-quality image as output.
Impact: Offers users diverse styles and capabilities for image generation, catering to various creative needs.
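
The "send it to the selected image generation API" step can be pictured as a small dispatch table. The sketch below is an assumption about structure only: the model keys and function names are hypothetical, and the backends are placeholders that return labelled bytes instead of calling Replicate, Google Imagen, or OpenAI.

```python
def _placeholder_backend(name):
    """Build a stand-in backend; a real one would call the provider's API here."""
    def backend(prompt):
        return f"[{name}] {prompt}".encode("utf-8")
    return backend


# Hypothetical model keys mapped to backend callables that take a prompt string.
BACKENDS = {
    name: _placeholder_backend(name)
    for name in ("sdxl-lightning", "flux-schnell", "imagen", "dall-e-3")
}


def generate_image(prompt, model="sdxl-lightning"):
    """Route the processed prompt to the selected backend and return image bytes."""
    try:
        backend = BACKENDS[model]
    except KeyError:
        raise ValueError(
            f"Unknown model {model!r}; choose from {sorted(BACKENDS)}"
        ) from None
    return backend(prompt)
```

Keeping the backends behind one function means a new model can be added by registering another entry, without touching the Telegram Bot or Windows Application code paths.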

b) Watermarking

Objective: Adds a customizable watermark to the generated image for branding and intellectual property protection.
Features:
- Supports both default and custom watermark text.
- Uses fonts specified by the user, with fallbacks to default fonts if unavailable.
- Ensures that the watermark is visually appealing and non-intrusive.
Impact: Protects generated images from unauthorized use and establishes a unique identity for InstaVision.
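
The text and font fallbacks and a non-intrusive placement can be sketched with the standard library alone. Everything named here is an assumption for illustration: the default text, the function names, the margin, and the choice of the bottom-right corner. Drawing the text itself would be done with an imaging library and is left out.

```python
import os

DEFAULT_WATERMARK = "InstaVision"  # assumed default text


def watermark_text(custom=None):
    """Use the custom text when given, otherwise fall back to the default."""
    text = (custom or "").strip()
    return text or DEFAULT_WATERMARK


def choose_font(preferred, fallbacks):
    """Return the first font file that exists, or None to use a built-in font."""
    for path in [preferred, *fallbacks]:
        if path and os.path.isfile(path):
            return path
    return None


def watermark_position(image_size, text_size, margin=16):
    """Top-left coordinates that place the text in the bottom-right corner."""
    image_width, image_height = image_size
    text_width, text_height = text_size
    x = max(0, image_width - text_width - margin)
    y = max(0, image_height - text_height - margin)
    return x, y
```

Returning `None` from `choose_font` lets the caller switch to whatever built-in font its imaging library offers instead of failing when a user-specified font is missing.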

4. Image Output

Objective: Delivers the final, watermarked image to the user.
Delivery Channels:
- Telegram Bot: Sends the image to the user and/or a designated group.
- Windows Application: Displays the image directly within the application and allows users to save it locally.
Impact: Ensures a seamless user experience by making the final output readily accessible.

Key Advantages of the Methodology:

Scalability: Efficiently handles simultaneous requests from multiple users through robust backend integration.
Ethical Compliance: Ensures content appropriateness through banned words detection.
Flexibility: Supports a variety of input methods and image generation models.
User Protection: Implements rate limiting and watermarking to ensure fair use and intellectual property security.

This methodology highlights the seamless integration of user-friendly interfaces, ethical controls, and advanced AI technologies to deliver a reliable and creative experience with InstaVision.

Backend Preparation 🔧

Mark Models Index

The backend development was an intricate journey involving months of rigorous research, experimentation, and iterative coding. Each phase refined the system’s ability to generate and edit images from a wide range of prompts and in various languages.

Our Mark Model Index Document provides a comprehensive overview of this journey, showcasing each model’s evolution from early concepts to the final optimized versions. Dive into the document to see how each model was crafted, tested, and fine-tuned to tackle the challenges of multilingual image generation and editing.

Project Backend 🖥️

The Project Backend contains resources for both generating and editing images, supporting state-of-the-art models to deliver exceptional results.

Image Generation Models:

A collection of 20 advanced models for creating high-quality images:

ai-forever_kandinsky-2.2
black-forest-labs_flux-1.1-pro-ultra
bytedance_sdxl-lightning-4step
lucataco_dreamshaper-xl-turbo ... (see the full list in the Backend README)

Image Editing Models:

A set of 8 powerful tools for enhancing images, such as object removal, de-oldifying, and more:

adirik_t2i-adapter-sdxl-openpose
arielreplicate_deoldify-image
black-forest-labs_flux-canny-pro ... (complete list in the Backend README)

For detailed instructions, visit the Backend README.

Project Frontend 🎨

The Project Frontend focuses on delivering a user-friendly and aesthetically pleasing interface for InstaVision.

Main Page Features:

🌟 Maximized window view.
🎨 Dark-themed interface with dynamic logo resizing.
✨ Hover-responsive buttons.

Image Generation UI:

🖥 Fully responsive design.
🎨 Dark theme with real-time image preview.
📂 Save generated images with a single click.

Image Editing UI:

🖼 Real-time preview of uploaded and edited images.
🛠 AI-powered tools for object removal, enhancements, and more.

For detailed instructions, visit the Project Frontend folder in the repository.

Project Windows Application ✨

Installer Steps:

Language Selection: Choose preferred language.
License Agreement: Accept terms.
Installation Progress: Relax while the app installs.

Main Features:

Main Page: Central hub for navigation.
Image Generation: Advanced UI for AI-powered image creation.
Image Editing: Tools for refining and enhancing images.

📥 Download InstaVision

Click the button below to download the latest version of InstaVision.

For more information about the Project Windows Application, visit the Project Windows Application folder in the repository.

Project Telegram Bot 🤖

Key Features:

Simultaneous Request Handling: Handles up to 50 requests simultaneously.
Rate Limiting: Enforces user limits (default: 5/day; privileged: 50/day).
Translation: Supports input in 80+ languages (Language List).
Watermarking: Customizable watermark with fallback fonts.

For more details, refer to the Project Telegram Bot folder in the repository.
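
One common way to cap simultaneous request handling is a semaphore around the slow part of each request. The sketch below assumes an asyncio-based bot, which the README does not confirm; the function names are hypothetical and `asyncio.sleep` stands in for the image generation call.

```python
import asyncio

MAX_CONCURRENT_REQUESTS = 50  # figure from the feature list above


async def handle_request(semaphore, prompt, stats):
    """Process one prompt while holding a slot; extra requests wait their turn."""
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        try:
            await asyncio.sleep(0.01)  # stand-in for the image generation call
            return f"image for: {prompt}"
        finally:
            stats["active"] -= 1


async def handle_many(prompts, limit=MAX_CONCURRENT_REQUESTS):
    """Run all prompts concurrently, never exceeding `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(handle_request(semaphore, prompt, stats) for prompt in prompts)
    )
    return results, stats["peak"]
```

Calling `asyncio.run(handle_many(prompts))` returns the results in the same order as the prompts, together with the highest number of requests that were in flight at once.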

Project Representation 🎉

Innovation Fest 2024 at Vishwakarma University Pune

The InstaVision project was proudly showcased at Innovation Fest 2024 on 24th October 2024. Held at Vishwakarma University, Pune, this prestigious event was sponsored by Binghamton University and the Thomas J. Watson College of Engineering and Applied Science.

The project secured a Consolation prize of ₹1000. Below is the Consolation certificate awarded to me for presenting InstaVision at Innovation Fest 2024.

Techmanthan 2025 at JSPM College Pune

The InstaVision project was proudly showcased at Techmanthan 2025, a National Level Technical Fest held on 28th - 29th January 2025 at JSPM College, Pune. This competition offered me a valuable platform for knowledge exchange, constructive feedback, and networking with other innovators, researchers, and industry experts.

Below is the participation certificate awarded to me for presenting InstaVision at Techmanthan 2025.

Real-Life Usage 🌍

InstaVision has been successfully utilized in various real-world events, showcasing its versatility and impact. Here are some notable instances:

01) Alampata 2024 - VPKBIET's Ganeshotsav Celebration

Event: Alampata 2024, an annual Ganeshotsav festival at VPKBIET
Date: August 7, 2024 - August 17, 2024
Theme: Technology and AI Integration
InstaVision's Role: Used for a Telegram Bot image generation competition
Images Generated: 702 Images.
Alampata 2024 Report:

02) VoltzFest 2025 - VPKBIET's AI Art Gallery

Event: VoltzFest 2025, a platform for artists to improve their skills using AI
Date: February 10, 2025 - February 11, 2025
Theme: AI Art Generation
InstaVision's Role: Used for a Telegram Bot image generation competition
Images Generated: 264 Images.
VoltzFest 2025 Report:

Project Testing Prompts 📝

The Project Test Inputs folder includes curated prompts for evaluating InstaVision across all models. Prompts are designed for versatility and optimized for showcasing API strengths.

Explore test prompts and examples in the Project Test Inputs Folder.

Securing copyright for this project marked an important milestone in safeguarding my innovation and intellectual property. Copyrighting my project not only protects the unique aspects of my Image Generation system but also reinforces my commitment to creating responsible AI products. By copyrighting this idea, I have ensured that the methods, models, and technological advances developed through this project remain attributed to me.

Certificate of Copyright

Establishing copyright protection is a proactive step towards fostering innovation, ensuring recognition, and laying a foundation for future advancements in image generation.

License 📄

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. For more details, please refer to the LICENSE file in the repository.

By using this project, you agree to give appropriate credit, not use the material for commercial purposes without permission, and share any adaptations under the same license.

Attribution should be given as: "InstaVision Bot by Yash Shukla (https://github.com/StudiYash/InstaVision)"

A quick overview of the usage permissions for this project can be found in the license deed: CC BY-NC-SA 4.0

Contributions 🎉

Contributions are welcome! Feel free to open an issue or submit a pull request.

Contributor License Agreement (CLA): By submitting a pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement (CLA).
Code of Conduct: This project and everyone participating in it are governed by the InstaVision Code of Conduct.
Contributors: See the list of contributors here.

Made with ❤️ by Yash Shukla
