This project contains my notes for the Nebius AI Performance Engineering course: a comprehensive set of materials, guides, and exercises. I'm updating it as I go through the course.
The course explores how to transition from raw machine learning models to functional AI-driven products.
-
Intro to AI & LLMs: An essential introduction to the landscape of Large Language Models. This module covers:
- What changed with LLMs and their core limitations.
- The GPT assistant training pipeline (Pretraining, SFT, RLHF).
- Tokenization strategies and token economics.
- Prompt and Context engineering techniques, including Zero-shot, Few-shot, and Chain-of-Thought prompting.
- Practical insights into tool use and function calling.
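The prompting techniques listed above can be illustrated without any model API, since a few-shot prompt is just assembled text. A minimal sketch (the sentiment task, labels, and reviews are made-up placeholders):

```python
# Few-shot prompting sketch: interleave worked (input, label) examples
# before the query so the model can infer the task from the pattern.

def build_few_shot_prompt(examples, query):
    """Assemble a few-shot classification prompt from (input, label) pairs."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("I loved every minute of it.", "positive"),
    ("Total waste of money.", "negative"),
]
prompt = build_few_shot_prompt(examples, "The plot dragged, but the acting was great.")
print(prompt)
```

Zero-shot is the same prompt with an empty `examples` list; Chain-of-Thought would additionally include reasoning steps in each example's answer.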
-
Evaluation & Benchmarks: An introduction to measuring and comparing the quality of LLM systems. This module covers:
- Why LLM evaluation is hard
- Evaluation metrics
- Evaluation-Driven Development (EDD) - the mindset
- Common metrics and where they break
- Common benchmarks and their expiration dates
- LLM-as-a-Judge and automated behavioral evals (Anthropic's Bloom)
- Human evaluation
- EDD in practice: turning metrics into decisions
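As a concrete instance of "common metrics and where they break", here is a hedged sketch of normalized exact match, one of the simplest automated metrics. The QA pairs are invented for illustration; note how a correct paraphrase still scores zero:

```python
# Normalized exact match: strict string equality after lowercasing,
# stripping punctuation, and collapsing whitespace.
import string

def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def exact_match(predictions, references):
    """Fraction of predictions that match their reference after normalization."""
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)

preds = ["Paris.", "42", "The capital is Madrid"]
refs  = ["paris", "42", "Madrid"]
score = exact_match(preds, refs)
print(score)  # 2/3 -- the third answer is semantically right but scores 0
```

This brittleness is exactly what motivates LLM-as-a-Judge and human evaluation for open-ended outputs.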
-
AI Systems & Test-Time Compute: This module covers:
- Fine-Tuning vs Retrieval Augmented Generation (RAG)
- The RAG Pipeline Components
- Chunking and Embedding strategies
- Evaluating RAG Systems and the "RAG Triad"
- Evaluation Datasets and Benchmarks
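To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap, one of the simplest strategies in a RAG pipeline. The window and overlap sizes are illustrative; real pipelines often chunk by tokens or sentence boundaries instead of characters:

```python
# Fixed-size chunking: slide a character window over the document,
# keeping `overlap` characters shared between consecutive chunks so
# information at chunk boundaries is not lost.

def chunk_text(text, chunk_size=100, overlap=20):
    """Split text into windows of chunk_size chars with the given overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step) if text[i:i + chunk_size]]

doc = "word " * 60  # 300-character stand-in document
chunks = chunk_text(doc, chunk_size=100, overlap=20)
print(len(chunks), len(chunks[0]))
```

Each chunk would then be embedded and indexed; the overlap trades index size for recall at chunk boundaries.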
-
LLM Architecture - AI and LLM Intro: A deep dive into the underlying architecture of modern LLMs.
- Intro and Generative AI Landscape
- Types of ML
- Supervised Tasks Evaluation
- Language Models
- N-Gram LM
- Language Models Evaluation
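The N-Gram LM topic above boils down to counting. A minimal bigram sketch over a toy corpus (the corpus is made up; real models also need smoothing for unseen bigrams):

```python
# Bigram language model: estimate P(next | prev) by maximum likelihood,
# i.e. count(prev, next) / count(prev, anything).
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def prob(prev, nxt):
    """MLE estimate of P(nxt | prev); 0.0 if prev was never seen."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][nxt] / total if total else 0.0

print(prob("the", "cat"))  # "the" is followed by "cat" in 2 of its 3 occurrences
```

Evaluation of such a model (the next bullet's topic) typically reports perplexity, the exponentiated average negative log-probability the model assigns to held-out text.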
-
- Core Terminology & Hierarchy
- Language Model Architectures
- Optimization & Regularization
- Evaluation & Benchmarks
-
Neural Networks and Learned Representations
- Neural Networks and Multi-Layer Perceptrons (MLPs)
- Activation functions and Backpropagation
- Learned representations and Word Embeddings (Word2Vec)
- Sentence Embeddings (Concatenation, Autoencoders, Pooling)
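Of the sentence-embedding strategies listed, pooling is the simplest: average the word vectors component-wise. A sketch with tiny made-up 3-d "embeddings":

```python
# Mean pooling: collapse a variable-length list of word vectors into one
# fixed-size sentence vector by averaging each dimension.

def mean_pool(word_vectors):
    """Average equal-length word vectors component-wise."""
    dim = len(word_vectors[0])
    return [sum(vec[d] for vec in word_vectors) / len(word_vectors) for d in range(dim)]

# Hypothetical 3-d embeddings for a two-word sentence.
vectors = [
    [1.0, 0.0, 2.0],   # "good"
    [3.0, 2.0, 0.0],   # "movie"
]
sentence_vec = mean_pool(vectors)
print(sentence_vec)  # [2.0, 1.0, 1.0]
```

The result has the same dimensionality regardless of sentence length, which is what makes pooled vectors usable as fixed-size inputs downstream; the cost is that word order is discarded, unlike with autoencoder-based sentence embeddings.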
Best practices for deploying, monitoring, and maintaining machine learning models in production.
Techniques for optimizing model inference, reducing latency, and managing compute resources efficiently.
Advanced topics in model refinement using Reinforcement Learning.