
Feature/ai starter kit #201

Open

ChrsBaur wants to merge 14 commits into master from feature/ai-starter-kit

Conversation

ChrsBaur (Contributor) commented Jan 27, 2026

Added

  • AI Starter Kit - Complete AI/LLM development framework (optional)
    • New cookiecutter.json option: include_ai_starter_kit: ["no", "yes"]
    • Philosophy: "Launchpads, not Prisons" - Transparent, flexible, modular code
    • RAG (Retrieval-Augmented Generation) with transparent LangChain implementation
      • Function-based approach (no complex wrapper classes)
      • Direct LCEL code visible and customizable
      • TODO comments at every customization point
      • build_rag_chain() - Core RAG logic, easy to modify
      • load_documents(), chunk_documents(), create_vector_store() - Modular functions
    • LangChain agents with pre-built tools (calculator, search, date)
    • Type-safe configuration management with Pydantic v2 and pydantic-settings
      • Runtime overrides supported (config.temperature = 1.5)
      • frozen=False for maximum flexibility
      • Singleton pattern optional, not enforced (see the configuration sketch after this list)
    • Centralized prompt management system for version-controlled prompt engineering
    • Comprehensive documentation with runnable examples
    • Full support for all package managers (Poetry, Conda, Pip, UV)
    • Dependencies added (when AI kit enabled):
      • langchain>=0.3 - LLM application framework
      • langchain-openai>=0.2 - OpenAI integration
      • langchain-community>=0.3 - Community tools and vectorstores
      • pydantic>=2.10 - Data validation and structured outputs
      • pydantic-settings>=2.7 - Environment-based configuration
      • python-dotenv>=1.0 - Environment variable management
      • loguru>=0.7 - Enhanced logging
      • chromadb>=0.6 - Local vector database
      • tiktoken>=0.8 - OpenAI tokenizer
    • .env.example template with comprehensive configuration documentation
    • AI-specific .gitignore entries (vector databases, model caches, logs)
    • Best practices for 2026: Structured outputs, function calling, observability
  • UI Framework Support - Web interfaces for AI applications (optional)
    • New cookiecutter.json option: ui_framework: ["none", "chainlit", "streamlit"]
    • Chainlit integration for async chatbot interfaces
      • Real-time streaming responses
      • Session management
      • RAG integration out-of-the-box
      • Minimal wrapper code (view layer only)
      • Dependency: chainlit>=1.3
    • Streamlit integration for interactive data apps
      • Chat history management
      • Streamlit session state
      • RAG integration
      • Minimal wrapper code (view layer only)
      • Dependency: streamlit>=1.40
    • Dynamic Dockerfile configuration based on selected UI framework
      • Chainlit: EXPOSE 8000, runs with chainlit run
      • Streamlit: EXPOSE 8501, runs with streamlit run
      • None: Default Python entrypoint
  • Unified Cloud Deployment with Terraform 🎯 SINGLE SOURCE OF TRUTH
    • New cookiecutter.json option: cloud_provider: ["none", "aws", "azure"]
    • AWS Deployment (Terraform)
      • Infrastructure: terraform/aws/ with App Runner + ECR
      • GitHub Actions workflow: .github/workflows/deploy_aws.yml.disabled
      • Resources: ECR repository, App Runner service, IAM roles
      • Automatic scaling and deployment
    • Azure Deployment (Terraform) - Now unified with AWS approach!
      • Infrastructure: terraform/azure/ with Container Apps + ACR
      • GitHub Actions workflow: .github/workflows/deploy_azure.yml.disabled (Terraform-based)
      • Resources: Resource Group, ACR, Log Analytics, Container App Environment, Container App
      • Ingress configuration, auto-scaling, secrets management
    • Unified Terraform Approach
      • Same HCL syntax for both clouds
      • Consistent file structure (terraform/{aws,azure}/)
      • Same deployment workflow (init → plan → apply)
      • State management for both clouds
      • Comprehensive terraform/README.md with examples
    • Dynamic Configuration
      • Port configuration based on UI framework
      • Environment variables
      • Secrets management patterns (AWS Secrets Manager / Azure Key Vault)
      • Scaling configuration
    • CI/CD Integration
      • Automated Docker builds
      • Push to cloud registries (ECR/ACR)
      • Terraform apply in GitHub Actions
      • Deployment summaries
      • Optional rollback support
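
As a quick illustration of the configuration approach described in this list, here is a minimal usage sketch of the mutable pydantic-settings config in a generated project. get_ai_config and config.temperature are taken from this PR's description; the model field is a hypothetical placeholder:

from {{ cookiecutter.module_name }}.ai.config import get_ai_config

# Values are loaded from the environment / .env via pydantic-settings.
config = get_ai_config()  # optional singleton accessor described in this PR

# frozen=False, so runtime overrides are allowed for quick experiments.
config.temperature = 1.5       # field named in this PR's description
config.model = "gpt-4o-mini"   # hypothetical field name; adjust to the generated AIConfig

print(config.model_dump())     # Pydantic v2: inspect the effective settings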

Changed

  • Updated .gitignore with AI/LLM-specific entries (when AI kit enabled)
    • chroma_db/ - Vector databases
    • .cache/ - Model caches
    • *.pkl, *.pickle - Serialized models
    • llm_logs/ - LLM API logs
  • Dockerfiles made dynamic for UI frameworks
    • Dockerfile__poetry, Dockerfile__pip, Dockerfile__conda, Dockerfile__uv
    • Conditional EXPOSE and CMD/ENTRYPOINT based on ui_framework
    • Support for both web apps (Chainlit/Streamlit) and CLI apps
  • AI Code Architecture - "Launchpads, not Prisons"
    • Removed complex wrapper classes in favor of transparent functions
    • All configuration mutable for easy experimentation
    • TODO comments guide developers to customization points
    • Runnable examples in every module (if __name__ == "__main__")
    • Direct LangChain code visible (no abstraction hiding)
  • Azure Deployment - Migrated to Terraform
    • Changed from Azure CLI commands to Terraform
    • Same deployment experience as AWS
    • Better state management and rollback capabilities

Fixed

  • poetry.toml no longer generated for non-Poetry package managers
    • Previously, poetry.toml was incorrectly generated for UV, Pip, and Conda projects
    • Now only Poetry projects receive poetry.toml (virtualenv configuration)
    • UV projects correctly receive only pyproject.toml (without poetry.toml)
    • Improved post-generation hook logic with an explicit files_poetry_only set (a sketch follows this list)
  • Removed erroneous CLI entry from Poetry dependencies section (should only be in scripts)
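
A minimal sketch of what the files_poetry_only handling in hooks/post_gen_project.py might look like; the variable names and set contents here are assumptions based on the bullet above, not the exact hook code:

from pathlib import Path

# Files that only Poetry projects should keep after generation.
files_poetry_only = {"poetry.toml"}
package_manager = '{{ cookiecutter.package_manager }}'  # assumed cookiecutter variable name

if package_manager != "poetry":
    for filename in files_poetry_only:
        # Remove leftovers for UV, Pip, and Conda projects.
        Path(filename).unlink(missing_ok=True)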

Documentation

  • Added terraform/README.md - Comprehensive Terraform usage guide
  • Enhanced inline code documentation with TODO comments
  • Runnable examples in AI modules


Copilot AI left a comment


Pull request overview

This PR adds an optional AI Starter Kit (RAG + agents), optional UI frameworks (Chainlit/Streamlit), and unified Terraform-based cloud deployment for AWS and Azure, plus a Ruff-based formatting option. It also wires these options through Cookiecutter, Dockerfiles, CI (GitHub Actions and GitLab), and dependency management across all supported package managers.

Changes:

  • Introduces an AI module (ai/) with configuration, RAG utilities, LangChain agents, and prompt management, plus UI frontends (ui/app.py) for Chainlit and Streamlit.
  • Adds Terraform stacks for AWS App Runner and Azure Container Apps, with matching GitHub Actions workflows and a shared Terraform README.
  • Extends cookiecutter options and post-generation hooks to support AI kit inclusion, UI framework selection, cloud provider selection, and Ruff as an alternative formatter, updating requirements, environments, CI, editor, and pre-commit configs accordingly.

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated 6 comments.

Summary per file:

  • {{cookiecutter.project_slug}}/terraform/azure/variables.tf: Defines Azure deployment variables (region, ACR, resources, container sizing, env vars) with defaults tuned to UI framework.
  • {{cookiecutter.project_slug}}/terraform/azure/main.tf: Provisions Azure Resource Group, ACR, Log Analytics, Container App Environment, and Container App with ingress, scaling, and outputs.
  • {{cookiecutter.project_slug}}/terraform/aws/variables.tf: Declares AWS Terraform variables (region, app name, image tag, container resources, env vars) with UI-aware port defaults.
  • {{cookiecutter.project_slug}}/terraform/aws/main.tf: Creates ECR repo, lifecycle policy, IAM roles, and an AWS App Runner service wired to the container image and env vars.
  • {{cookiecutter.project_slug}}/terraform/README.md: Documents the unified Terraform layout, configuration, remote state, CI/CD integration, and usage/cleanup commands.
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ui/app.py: Provides a conditional UI entrypoint implementing a Chainlit chat app or Streamlit chat app (or a placeholder) that integrates with the AI starter kit when enabled.
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ui/__init__.py: Adds a brief module-level description for the UI layer.
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ai/rag.py: Implements a function-first RAG toolkit (document loading/chunking, vector store creation, retriever creation, RAG chain building, and examples).
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ai/prompts.py: Centralizes system prompts and reusable prompt templates, with helpers for building RAG and summarization prompts.
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ai/config.py: Defines a Pydantic AIConfig with environment-based settings, validation, singleton access (get_ai_config), and a demo CLI entry.
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ai/agent.py: Adds an AIAgent wrapping a LangChain functions agent with calculator, mock search, and date tools plus sync/async run methods and examples.
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ai/__init__.py: Exposes AIConfig, get_ai_config, and SystemPrompts as the public AI package API.
  • {{cookiecutter.project_slug}}/src/{{cookiecutter.module_name}}/ai/README.md: Documents the AI Starter Kit architecture, usage examples, configuration, best practices, observability, testing, and troubleshooting.
  • {{cookiecutter.project_slug}}/requirements.txt: Adds optional AI and UI dependencies (LangChain stack, logging, ChromaDB, Chainlit/Streamlit) under cookiecutter conditions.
  • {{cookiecutter.project_slug}}/requirements-dev.txt: Updates dev tooling to include Ruff/isort/pyupgrade according to the chosen formatter, and refreshes testing dependencies.
  • {{cookiecutter.project_slug}}/pyproject.toml: Wires AI/UI dependencies into Poetry or PEP 621 metadata, configures test/linter/dev groups, introduces Ruff configuration when selected, and adjusts isort/pytest/Jupyter versions.
  • {{cookiecutter.project_slug}}/environment.yml: Extends the Conda runtime environment with AI starter kit and UI framework dependencies via pip sub-sections.
  • {{cookiecutter.project_slug}}/environment-dev.yml: Updates Conda dev environment to conditionally include isort and Ruff and synchronize JupyterLab and test tool versions.
  • {{cookiecutter.project_slug}}/env.example: Provides an AI kit–specific environment template for OpenAI keys, RAG settings, agent settings, and logging level.
  • {{cookiecutter.project_slug}}/README.md: Minor formatting cleanup in the uv usage section.
  • {{cookiecutter.project_slug}}/Dockerfile__uv: Makes the uv-based Dockerfile expose and run Chainlit or Streamlit when selected, otherwise keeps the CLI entrypoint.
  • {{cookiecutter.project_slug}}/Dockerfile__poetry: Adds conditional EXPOSE/CMD logic for Chainlit/Streamlit in the Poetry-based Dockerfile.
  • {{cookiecutter.project_slug}}/Dockerfile__pip: Adds conditional EXPOSE/CMD logic for Chainlit/Streamlit in the pip-based Dockerfile.
  • {{cookiecutter.project_slug}}/Dockerfile__conda: Adds conditional EXPOSE/CMD logic for Chainlit/Streamlit in the Conda-based Dockerfile, using mamba run.
  • {{cookiecutter.project_slug}}/.vscode__editor/settings.json: Integrates Ruff as a formatter/linter option in VS Code settings, alongside the existing Black configuration.
  • {{cookiecutter.project_slug}}/.pre-commit-config.yaml: Pins Black/Ruff versions, adds isort and pyupgrade where appropriate, and wires Ruff hooks (lint + format) when selected.
  • {{cookiecutter.project_slug}}/.gitlab-ci.yml: Introduces a dedicated lint stage and a Ruff-powered linter-happiness job for each package manager, plus corresponding setup logic.
  • {{cookiecutter.project_slug}}/.gitignore: Adds AI-specific ignores (vector DBs, caches, pickles, LLM logs) gated by the AI starter kit option.
  • {{cookiecutter.project_slug}}/.github/workflows/deploy_azure.yml.disabled: Adds a Terraform-based Azure Container Apps deployment workflow with Docker build/push, ACR integration, and summary output.
  • {{cookiecutter.project_slug}}/.github/workflows/deploy_aws.yml.disabled: Adds a Terraform-based AWS App Runner deployment workflow with ECR build/push and deployment summaries.
  • tests/test_options.py: Adds tests that validate the Ruff formatter setup across package managers and GitLab CI integration.
  • hooks/post_gen_project.py: Extends post-generation hooks to handle Ruff, and adds handle_ai_starter_kit to manage AI, UI, and cloud deployment files based on cookiecutter choices.
  • cookiecutter.json: Introduces new options for Ruff, AI starter kit, UI framework, and cloud provider, with updated interactive prompts.
  • CHANGELOG.md: Documents releases 1.4.0 and 1.5.0, describing Ruff support and the AI Starter Kit + Terraform cloud deployment features.


Comment on lines +52 to +56
from {{ cookiecutter.module_name }}.ai import RAGPipeline
from pathlib import Path

# Initialize RAG
rag = RAGPipeline(collection_name="my_docs")

Copilot AI Jan 27, 2026


The Quick Start example imports RAGPipeline from {{ cookiecutter.module_name }}.ai, but __init__.py does not export this symbol and there is no RAGPipeline implementation in ai/rag.py, so this snippet will fail for users who copy-paste it.
Either expose the intended RAG entry point from the package (and implement RAGPipeline), or update the example to use the actual public API that exists today.

Suggested change
-from {{ cookiecutter.module_name }}.ai import RAGPipeline
-from pathlib import Path
-# Initialize RAG
-rag = RAGPipeline(collection_name="my_docs")
+from pathlib import Path
+# See {{ cookiecutter.module_name }}/ai/rag.py for the concrete RAG API
+# (classes/functions) provided by this template.
+#
+# For example, you might expose a pipeline object with methods like:
+#   - load_documents(path)
+#   - add_documents(documents)
+#   - query(question) -> {"answer": str, "sources": list}
+#
+# Initialize your RAG implementation (adapt this to match rag.py)
+rag = ...  # e.g. an instance or function from ai/rag.py
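
For reference, a quick-start sketch built on the functional API this PR describes elsewhere (load_documents, chunk_documents, create_vector_store, build_rag_chain); the exact signatures and return types are assumptions and should be checked against the generated ai/rag.py:

from pathlib import Path

from {{ cookiecutter.module_name }}.ai.rag import (
    build_rag_chain,
    chunk_documents,
    create_vector_store,
    load_documents,
)

# Each step is a plain function that can be edited or swapped out.
docs = load_documents(Path("data/"))         # assumed to accept a file or directory path
chunks = chunk_documents(docs)               # chunking parameters assumed to come from AIConfig
vector_store = create_vector_store(chunks)   # ChromaDB-backed local store per this PR
rag_chain = build_rag_chain(vector_store)    # assumed to return an LCEL runnable

print(rag_chain.invoke("What does this project do?"))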

Comment on lines +49 to +52
    try:
        # Safe evaluation (only allow math operations)
        result = eval(expression, {"__builtins__": {}}, {})
        logger.info(f"Calculator: {expression} = {result}")

Copilot AI Jan 27, 2026


The calculator tool evaluates arbitrary expressions with eval, which is unsafe even with __builtins__ stripped because crafted expressions can still access powerful objects via Python's object model.
Given that this tool is exposed to the LLM agent, it should use a safe math parser (e.g., ast.parse with a whitelist of nodes or a dedicated expression-evaluation library) rather than raw eval.
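
For illustration, a sketch of an ast-based evaluator that only accepts arithmetic nodes (this is not the PR's implementation, just one possible replacement for the eval call):

import ast
import operator

# Whitelisted operators for a calculator tool; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}

def safe_eval(expression: str) -> float:
    """Evaluate a pure arithmetic expression without using eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))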

    - $CONDA_PKGS_DIRS/urls.txt
  before_script:
    - export PATH="/opt/conda/bin:$PATH"
    - conda env create -n .venv -f environment-dev.yml environment.yml

Copilot AI Jan 27, 2026


The conda env create invocation passes two YAML files (environment-dev.yml environment.yml) after a single -f flag, which is not valid CLI syntax and will cause the lint job to fail when code_formatter='ruff' and package_manager='conda'.
If you intend to merge the two environments, use separate -f flags or pre-merge the files; if you only need one, drop the extra filename.

Suggested change
-    - conda env create -n .venv -f environment-dev.yml environment.yml
+    - conda env create -n .venv -f environment-dev.yml
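
For example, one way to layer the two files is to create the environment from one and update it with the other (a sketch; assumes conda env update is available on the runner):

conda env create -n .venv -f environment.yml
conda env update -n .venv -f environment-dev.yml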

Comment on lines +189 to +235
def handle_ai_starter_kit():
    include_ai = '{{ cookiecutter.include_ai_starter_kit }}'
    ui_framework = '{{ cookiecutter.ui_framework }}'
    cloud_provider = '{{ cookiecutter.cloud_provider }}'
    ai_folder = f'{module_dir}/ai'
    ui_folder = f'{module_dir}/ui'

    if include_ai == 'no':
        # Remove AI module if not selected
        if os.path.exists(ai_folder):
            shutil.rmtree(ai_folder)
        # Remove env.example
        if os.path.exists('env.example'):
            os.remove('env.example')
    else:
        # Rename env.example to .env.example
        if os.path.exists('env.example'):
            os.rename('env.example', '.env.example')

    # Handle UI framework
    if ui_framework == 'none':
        # Remove UI module if no framework selected
        if os.path.exists(ui_folder):
            shutil.rmtree(ui_folder)

    # Handle cloud provider deployments
    if cloud_provider == 'aws':
        # Keep AWS Terraform, remove Azure
        if os.path.exists('.github/workflows/deploy_azure.yml.disabled'):
            os.remove('.github/workflows/deploy_azure.yml.disabled')
        if os.path.exists('terraform/azure'):
            shutil.rmtree('terraform/azure')
    elif cloud_provider == 'azure':
        # Keep Azure Terraform, remove AWS
        if os.path.exists('.github/workflows/deploy_aws.yml.disabled'):
            os.remove('.github/workflows/deploy_aws.yml.disabled')
        if os.path.exists('terraform/aws'):
            shutil.rmtree('terraform/aws')
    elif cloud_provider == 'none':
        # Remove all cloud deployment files
        if os.path.exists('.github/workflows/deploy_aws.yml.disabled'):
            os.remove('.github/workflows/deploy_aws.yml.disabled')
        if os.path.exists('.github/workflows/deploy_azure.yml.disabled'):
            os.remove('.github/workflows/deploy_azure.yml.disabled')
        if os.path.exists('terraform'):
            shutil.rmtree('terraform')


Copilot AI Jan 27, 2026


The new handle_ai_starter_kit function introduces several branches (AI kit on/off, three UI framework options, and three cloud providers) that add or remove files and directories, but there are no tests in tests/test_options.py covering these new cookiecutter options.
Given that other cookiecutter switches are exercised via check_project, adding tests for include_ai_starter_kit, ui_framework, and cloud_provider would help prevent regressions in this post-generation hook logic.
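
As an illustration only, a test along these lines could exercise the new options; this sketch assumes pytest-cookies' cookies.bake fixture and may not match the repo's existing check_project helper:

def test_ai_starter_kit_with_chainlit_and_aws(cookies):
    """Bake a project with the new options and check the generated layout."""
    result = cookies.bake(
        extra_context={
            "include_ai_starter_kit": "yes",
            "ui_framework": "chainlit",
            "cloud_provider": "aws",
        }
    )

    assert result.exit_code == 0
    project = result.project_path  # pytest-cookies; older versions expose result.project instead
    assert (project / "terraform" / "aws").is_dir()
    assert not (project / "terraform" / "azure").exists()
    assert (project / ".env.example").is_file()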


{%- if cookiecutter.include_ai_starter_kit == 'yes' %}
from {{ cookiecutter.module_name }}.ai.config import get_ai_config
from {{ cookiecutter.module_name }}.ai.rag import RAGPipeline

Copilot AI Jan 27, 2026


RAGPipeline is imported from the ai.rag module but no such class or function is defined anywhere in the codebase, so enabling the AI starter kit with a UI framework will raise an ImportError at runtime when this file is imported.
To fix this, either implement and export a RAGPipeline in ai/rag.py (matching the usage here) or update this import and the downstream code to use the existing functional RAG helpers instead.

Suggested change
-from {{ cookiecutter.module_name }}.ai.rag import RAGPipeline
+try:
+    from {{ cookiecutter.module_name }}.ai.rag import RAGPipeline
+except ImportError:
+    from typing import Any, Dict
+
+    class RAGPipeline:
+        """
+        Fallback RAGPipeline used when `{{ cookiecutter.module_name }}.ai.rag`
+        is not available.
+        This prevents import-time failures of the UI while making it clear at
+        runtime that the real RAG implementation is missing.
+        """
+
+        def __init__(self, collection_name: str | None = None, *args: Any, **kwargs: Any) -> None:
+            self.collection_name = collection_name
+
+        async def ainvoke(self, query: str, config: Any | None = None) -> Dict[str, Any]:
+            """
+            Minimal async interface to mimic a real RAG pipeline.
+            Returns a diagnostic message so developers know they must implement
+            `{{ cookiecutter.module_name }}.ai.rag.RAGPipeline`.
+            """
+            return {
+                "answer": (
+                    "RAGPipeline is not implemented. Please add a real implementation "
+                    "in `{{ cookiecutter.module_name }}.ai.rag.RAGPipeline`."
+                ),
+                "query": query,
+            }

  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  sku                 = var.acr_sku  # Basic, Standard, or Premium
  admin_enabled       = true         # TODO: Use Managed Identity in production

Copilot AI Jan 27, 2026


The Azure Container Registry is created with admin_enabled = true and the Container App is configured to pull images using the ACR admin username/password, which grants broad push/pull access directly into the running workload. If an attacker compromises the container app (RCE, dependency exploit, etc.), they can exfiltrate these credentials and push tampered images to the registry, turning this into a supply-chain style compromise. Prefer disabling the ACR admin user and configuring the Container App to authenticate using a managed identity or scoped pull-only credentials instead.
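
For reference, a managed-identity pull might look roughly like the following Terraform sketch; resource names such as azurerm_container_registry.main and azurerm_resource_group.main are assumptions based on this PR's described layout:

resource "azurerm_user_assigned_identity" "app" {
  name                = "app-pull-identity"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
}

# Grant pull-only access to the registry instead of exposing admin credentials.
resource "azurerm_role_assignment" "acr_pull" {
  scope                = azurerm_container_registry.main.id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_user_assigned_identity.app.principal_id
}

# Inside the azurerm_container_app resource (with admin_enabled = false on the ACR):
#   identity {
#     type         = "UserAssigned"
#     identity_ids = [azurerm_user_assigned_identity.app.id]
#   }
#   registry {
#     server   = azurerm_container_registry.main.login_server
#     identity = azurerm_user_assigned_identity.app.id
#   }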
