InstaVision is a powerful AI-driven Image Generation and Image Editing project designed to transform your text descriptions into stunning, high-quality images using various Image Generation APIs. Perfect for creators, students, and anyone with a vivid imagination, InstaVision makes it easy to bring your ideas to life with just a few words.
- Start Date: 22nd August 2024
- End Date: 14th February 2025
- Total Time Required: 5 months and 24 days
| Name | GitHub Profile | LinkedIn Profile |
|---|---|---|
| Yash Suhas Shukla | GitHub | LinkedIn |
The Methodology of InstaVision is designed to efficiently process text-based inputs to generate visually stunning, AI-powered images. Below is an in-depth breakdown of each component and its functionality:
- Source: Inputs are received through multiple platforms, including:
  - Telegram Bot: Users can interact with the bot by sending text prompts directly via Telegram.
  - Windows Application: Desktop users can input their prompts through the standalone InstaVision application.
- Purpose: This serves as the starting point of the entire workflow, where users provide their creative ideas or descriptions for image generation.
This stage ensures that the text input is validated and optimized for the image generation process. It includes two key components:
- Objective: Ensures that inputs adhere to ethical and usage guidelines by filtering out inappropriate or offensive language.
- Process:
  - Scans the text prompt for words or phrases that are flagged as inappropriate.
  - Prompts the user to revise their input if any banned words are detected.
- Impact: Maintains the integrity and professionalism of the generated content.
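The scan-and-reject step described above can be sketched as follows. This is a minimal illustration, not the project's actual filter: the `BANNED_WORDS` entries, the function names, and the wording of the rejection message are all placeholders, and the matching covers single whole words only (flagged phrases would need a different approach).

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical placeholder list; the project's real banned-word list is not shown here.
BANNED_WORDS = {"gore", "nsfw"}


def find_banned_words(prompt: str, banned: Iterable[str] = BANNED_WORDS) -> List[str]:
    """Return the banned words that appear in the prompt as whole words, sorted."""
    tokens = set(re.findall(r"[a-z0-9']+", prompt.lower()))
    return sorted(tokens & set(banned))


def validate_prompt(prompt: str) -> Tuple[bool, str]:
    """Return (accepted, message); the message asks for a revision on rejection."""
    hits = find_banned_words(prompt)
    if hits:
        return False, "Please revise your prompt; it contains banned words: " + ", ".join(hits)
    return True, "OK"
```

A caller would forward the prompt to image generation only when the first element of the result is `True`, and otherwise send the message back to the user.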
- Objective: Prevents system overload and abuse by enforcing user-specific rate limits.
- Features:
  - Default users can generate up to 5 images per day.
  - Privileged users can generate up to 50 images per day.
  - Admin users have no restrictions.
- Technology Used:
  - Redis Cloud Database for maintaining real-time user quota information.
  - Reset logic to refresh limits every 24 hours.
- Impact: Ensures fair usage and system scalability for multiple users simultaneously.
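The per-role daily limits can be sketched as below. To stay self-contained, the example uses an in-memory dictionary as a stand-in for the Redis store. The class name, role keys, and the choice to reset at the UTC calendar-day boundary are assumptions made for illustration; the project may instead use a rolling 24-hour window.

```python
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple

# Limits from the list above; None means unrestricted.
DAILY_LIMITS: Dict[str, Optional[int]] = {"default": 5, "privileged": 50, "admin": None}


class DailyQuota:
    """In-memory stand-in for a Redis-backed per-user quota store (illustrative)."""

    def __init__(self) -> None:
        self._counts: Dict[Tuple[str, str], int] = {}

    def try_consume(self, user_id: str, role: str = "default",
                    now: Optional[datetime] = None) -> bool:
        """Record one generation and return True, or return False if over quota."""
        limit = DAILY_LIMITS[role]
        if limit is None:  # admins are never limited
            return True
        now = now or datetime.now(timezone.utc)
        # Assumption: a new key per UTC day acts as the 24-hour reset.
        key = (user_id, now.strftime("%Y-%m-%d"))
        used = self._counts.get(key, 0)
        if used >= limit:
            return False
        self._counts[key] = used + 1
        return True
```

With Redis, a per-user, per-day key incremented with `INCR` and given a time-to-live with `EXPIRE` could play the same role as the dictionary here.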
This stage leverages advanced AI models to generate and enhance images based on the processed input. It comprises two critical submodules:
- Objective: Transforms text-based prompts into visually stunning images using state-of-the-art AI models.
- Supported Models:
  - Replicate APIs for SDXL Lightning and Flux Schnell.
  - Google Imagen API for generating highly detailed and realistic images.
  - OpenAI DALL-E 3 API for a creative and versatile image generation approach.
- Process:
  - Receives processed text input and sends it to the selected image generation API.
  - Produces a high-quality image as output.
- Impact: Offers users diverse styles and capabilities for image generation, catering to various creative needs.
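One way to route a prompt to the selected model is a small registry that maps a model key to a backend function. The sketch below is an assumption about structure only: the keys, `generate_image`, and the stub backends are hypothetical, and no real API is called. In practice each stub would be replaced by a call to the corresponding Replicate, Google Imagen, or OpenAI client.

```python
from typing import Callable, Dict


def _stub_backend(label: str) -> Callable[[str], bytes]:
    """Build a placeholder backend; a real one would call the provider's API."""
    def generate(prompt: str) -> bytes:
        # Returns tagged bytes instead of image data so the routing is visible.
        return f"[{label}] {prompt}".encode("utf-8")
    return generate


# Hypothetical model keys for the models listed above.
BACKENDS: Dict[str, Callable[[str], bytes]] = {
    "sdxl-lightning": _stub_backend("SDXL Lightning"),
    "flux-schnell": _stub_backend("Flux Schnell"),
    "imagen": _stub_backend("Google Imagen"),
    "dall-e-3": _stub_backend("DALL-E 3"),
}


def generate_image(prompt: str, model: str = "sdxl-lightning") -> bytes:
    """Send the prompt to the selected backend and return its output."""
    try:
        backend = BACKENDS[model]
    except KeyError:
        raise ValueError(f"Unknown model {model!r}; choose one of {sorted(BACKENDS)}") from None
    return backend(prompt)
```

Keeping the backends behind one function signature means a new model can be added by registering one more entry, without touching the input or delivery stages.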
- Objective: Adds a customizable watermark to the generated image for branding and intellectual property protection.
- Features:
  - Supports both default and custom watermark text.
  - Uses fonts specified by the user, with fallbacks to default fonts if unavailable.
  - Ensures that the watermark is visually appealing and non-intrusive.
- Impact: Protects generated images from unauthorized use and establishes a unique identity for InstaVision.
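The font fallback and placement logic can be sketched independently of any imaging library. In the example below, `resolve_font`, `bottom_right`, and the 12-pixel margin are hypothetical names and values, and placing the watermark in the bottom-right corner is an assumption about what "non-intrusive" means here. The `loader` argument stands for whatever function opens a font file; Pillow's `ImageFont.truetype`, for example, raises `OSError` when a font cannot be read, which is the failure this sketch catches.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Font = TypeVar("Font")


def resolve_font(candidates: Iterable[str],
                 loader: Callable[[str], Font],
                 default: Font) -> Font:
    """Return the first candidate font that loads, else the default font."""
    for name in candidates:
        try:
            return loader(name)
        except OSError:
            continue  # font missing or unreadable; try the next one
    return default


def bottom_right(image_size: Tuple[int, int],
                 text_size: Tuple[int, int],
                 margin: int = 12) -> Tuple[int, int]:
    """Top-left corner for text placed in the bottom-right corner of the image."""
    image_w, image_h = image_size
    text_w, text_h = text_size
    # Clamp to 0 so oversized text still starts inside the image.
    return max(image_w - text_w - margin, 0), max(image_h - text_h - margin, 0)
```

The user-specified font would be listed first in `candidates`, followed by any bundled fallbacks, so a missing font degrades gracefully instead of failing the request.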
- Objective: Delivers the final, watermarked image to the user.
- Delivery Channels:
  - Telegram Bot: Sends the image to the user and/or a designated group.
  - Windows Application: Displays the image directly within the application and allows users to save it locally.
- Impact: Ensures a seamless user experience by making the final output readily accessible.
- Scalability: Efficiently handles simultaneous requests from multiple users through robust backend integration.
- Ethical Compliance: Ensures content appropriateness through banned words detection.
- Flexibility: Supports a variety of input methods and image generation models.
- User Protection: Implements rate limiting and watermarking to ensure fair use and intellectual property security.
This methodology highlights the seamless integration of user-friendly interfaces, ethical controls, and advanced AI technologies to deliver a reliable and creative experience with InstaVision.
The backend development was an intricate journey, involving months of rigorous research, experimentation, and iterative coding. Each phase refined the system's ability to generate and edit images from a wide range of prompts and in various languages.
Our Mark Model Index Document provides a comprehensive overview of this journey, showcasing each model's evolution, from early concepts to the final optimized versions. Dive into the document to see how each model was crafted, tested, and fine-tuned to handle multilingual image generation and editing.
The Project Backend contains resources for both generating and editing images, supporting state-of-the-art models to deliver exceptional results.
A collection of 20 advanced models for creating high-quality images:
- ai-forever_kandinsky-2.2
- black-forest-labs_flux-1.1-pro-ultra
- bytedance_sdxl-lightning-4step
- lucataco_dreamshaper-xl-turbo
- ... (see the full list in the Backend README)
A set of 8 powerful tools for enhancing images, such as object removal, de-oldifying, and more:
- adirik_t2i-adapter-sdxl-openpose
- arielreplicate_deoldify-image
- black-forest-labs_flux-canny-pro
- ... (complete list in the Backend README)
For detailed instructions, visit the
The Project Frontend focuses on delivering a user-friendly and aesthetically pleasing interface for InstaVision.
- Maximized window view.
- Dark-themed interface with dynamic logo resizing.
- Hover-responsive buttons.
- Fully responsive design.
- Dark theme with real-time image preview.
- Save generated images with a single click.
- Real-time preview of uploaded and edited images.
- AI-powered tools for object removal, enhancements, and more.
For detailed instructions, visit the
- Language Selection: Choose preferred language.
- License Agreement: Accept terms.
- Installation Progress: Relax while the app installs.
- Main Page: Central hub for navigation.
- Image Generation: Advanced UI for AI-powered image creation.
- Image Editing: Tools for refining and enhancing images.
Download InstaVision
Click the button below to download the latest version of InstaVision.
For more information about Project Windows Application, visit the
- Simultaneous Request Handling: Handles up to 50 requests simultaneously.
- Rate Limiting: Enforces user limits (default: 5/day; privileged: 50/day).
- Translation: Supports input in 80+ languages (Language List).
- Watermarking: Customizable watermark with fallback fonts.
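A cap on simultaneous requests is commonly enforced with a semaphore. The sketch below uses `asyncio.Semaphore` with the limit of 50 from the list above. It is an illustration under assumptions, not the project's code: `handle_all` is a hypothetical name, and `asyncio.sleep` stands in for the real image generation call. The function also reports the peak number of requests in flight so the cap can be observed.

```python
import asyncio
from typing import List, Tuple

MAX_CONCURRENT = 50  # limit taken from the feature list above


async def handle_all(prompts: List[str],
                     limit: int = MAX_CONCURRENT) -> Tuple[List[str], int]:
    """Process all prompts, never running more than `limit` at once."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def handle(prompt: str) -> str:
        nonlocal running, peak
        async with semaphore:  # waits here when `limit` requests are in flight
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for the real generation call
            running -= 1
            return f"done: {prompt}"

    results = await asyncio.gather(*(handle(p) for p in prompts))
    return list(results), peak
```

Calling `asyncio.run(handle_all(prompts))` with more than 50 prompts queues the extra requests until a slot frees up, rather than rejecting them.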
For more details, refer to
- Innovation Fest 2024 at Vishwakarma University, Pune

  InstaVision was showcased at Innovation Fest 2024 on 24th October 2024. Held at Vishwakarma University, Pune, the event was sponsored by Binghamton University and the Thomas J. Watson College of Engineering and Applied Science.

  The project secured a consolation prize of ₹1000. Below is the consolation certificate awarded to me for presenting InstaVision at Innovation Fest 2024.

- Techmanthan 2025 at JSPM College, Pune

  InstaVision was showcased at Techmanthan 2025, a national-level technical fest held on 28th and 29th January 2025 at JSPM College, Pune. The competition offered me a valuable platform for knowledge exchange, constructive feedback, and networking with other innovators, researchers, and industry experts.

  Below is the participation certificate awarded to me for presenting InstaVision at Techmanthan 2025.
InstaVision has been successfully utilized in various real-world events, showcasing its versatility and impact. Here are some notable instances:
- Event: Alampata 2024, an annual Ganeshotsav festival at VPKBIET
- Date: August 7, 2024 - August 17, 2024
- Theme: Technology and AI Integration
- InstaVision's Role: Used for a Telegram bot image generation competition
- Images Generated: 702
- Alampata 2024 Report:
- Event: VoltzFest 2025, a platform for artists to improve their skills using AI
- Date: February 10, 2025 - February 11, 2025
- Theme: AI Art Generation
- InstaVision's Role: Used for a Telegram bot image generation competition
- Images Generated: 264
- VoltzFest 2025 Report:
The Project Test Inputs folder includes curated prompts for evaluating InstaVision across all models. Prompts are designed for versatility and optimized for showcasing API strengths.
Explore test prompts and examples in the Project Test Inputs Folder.
Securing copyright for this project marked an important milestone in safeguarding my innovation and intellectual property. Copyrighting my project not only protects the unique aspects of my Image Generation system but also reinforces my commitment to creating responsible AI products. By copyrighting this idea, I have ensured that the methods, models, and technological advances developed through this project remain attributed to me.
Establishing copyright protection is a proactive step towards fostering innovation, ensuring recognition, and laying a foundation for future advancements in image generation.
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. For more details, please refer to the LICENSE file in the repository.
By using this project, you agree to give appropriate credit, not use the material for commercial purposes without permission, and share any adaptations under the same license.
Attribution should be given as: "InstaVision Bot by Yash Shukla (https://github.com/StudiYash/InstaVision)"
A quick overview of the usage permissions for this project can be found in the license deed: CC BY-NC-SA 4.0.
Contributions are welcome! Feel free to open an issue or submit a pull request.
- Contributor License Agreement (CLA): By submitting a pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement (CLA).
- Code of Conduct: This project and everyone participating in it are governed by the InstaVision Code of Conduct.
- Contributors: See the list of contributors here.
Made with ❤️ by Yash Shukla





