A production-grade machine learning system for weather prediction, built entirely in Rust. The project demonstrates the complete ML lifecycle, from data collection to model monitoring, and uses the Evcxr Jupyter kernel for interactive exploration.
Auto-updated daily at 06:00 UTC | Last run: Pending first deployment
| City | Country | Current | +24h | +48h | +72h | Rain % | Confidence |
|---|---|---|---|---|---|---|---|
| São Paulo | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| Rio de Janeiro | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| São José dos Campos | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| Campinas | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| New York | 🇺🇸 | --°C | --°C | --°C | --°C | --% | -- |
| Los Angeles | 🇺🇸 | --°C | --°C | --°C | --°C | --% | -- |
| London | 🇬🇧 | --°C | --°C | --°C | --°C | --% | -- |
| Berlin | 🇩🇪 | --°C | --°C | --°C | --°C | --% | -- |
| Oslo | 🇳🇴 | --°C | --°C | --°C | --°C | --% | -- |
| Tokyo | 🇯🇵 | --°C | --°C | --°C | --°C | --% | -- |
| Shanghai | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Chongqing | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Nanjing | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Dubai | 🇦🇪 | --°C | --°C | --°C | --°C | --% | -- |
| Metric | Rain Prediction | Condition | Temp 24h | Temp 48h | Temp 72h |
|---|---|---|---|---|---|
| Accuracy/RMSE | --% | --% | --°C | --°C | --°C |
| vs Baseline | -- | -- | -- | -- | -- |
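The README does not say which baseline the "vs Baseline" row refers to. A common choice for temperature forecasting is the persistence forecast (the value at the horizon is assumed to equal the current value), so the sketch below uses that as an assumption. The function names `rmse` and `persistence_pairs` are hypothetical, and the temperatures are made up for illustration.

```rust
/// Root-mean-square error between predictions and observations.
fn rmse(predicted: &[f64], actual: &[f64]) -> f64 {
    assert_eq!(predicted.len(), actual.len());
    assert!(!actual.is_empty());
    let sum_sq: f64 = predicted
        .iter()
        .zip(actual)
        .map(|(p, a)| (p - a).powi(2))
        .sum();
    (sum_sq / actual.len() as f64).sqrt()
}

/// Persistence forecast (an assumed baseline, not confirmed by the project):
/// the value `horizon` steps ahead is predicted to equal the current value.
/// Returns (forecasts, targets), aligned pairwise.
fn persistence_pairs(series: &[f64], horizon: usize) -> (Vec<f64>, Vec<f64>) {
    if series.len() <= horizon {
        return (Vec::new(), Vec::new());
    }
    let forecasts = series[..series.len() - horizon].to_vec();
    let targets = series[horizon..].to_vec();
    (forecasts, targets)
}

fn main() {
    // Made-up hourly temperatures in °C, for illustration only.
    let temps = [20.0, 21.0, 23.0, 22.0, 20.0, 19.0];
    let (forecasts, targets) = persistence_pairs(&temps, 1);
    let baseline = rmse(&forecasts, &targets);
    println!("persistence RMSE at +1 step: {baseline:.3} °C");
}
```

With hourly data, a horizon of 24, 48, or 72 steps would correspond to the three temperature columns; a model is useful only if its RMSE is lower than this kind of baseline.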
- Complete ML Pipeline: Data collection → Preprocessing → Feature Engineering → Model Training → Evaluation → Monitoring
- Multiple ML Libraries: Side-by-side comparison of linfa, smartcore, rustyml, and Burn
- 10 Years of Data: Historical weather data from 2016-2025 for 14 cities worldwide
- 4 Prediction Tasks:
  - Rain prediction (binary classification)
  - Weather condition classification (6 classes)
  - Temperature forecasting (24h, 48h, 72h)
  - Multi-target forecasting (temp + humidity + wind)
- Drift Detection: Automated monitoring for data and concept drift
- Live Dashboard: Daily auto-updated predictions via GitHub Actions
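The feature list mentions data drift detection without naming a statistic. As one possible illustration, the sketch below computes the Population Stability Index (PSI) between a reference window and a current window of a single feature. PSI is an assumption chosen for this example, not necessarily what the project uses; the function name `psi`, the bin count, and the 0.25 threshold (a commonly quoted rule of thumb) are likewise hypothetical.

```rust
/// Population Stability Index between a reference and a current sample,
/// using `bins` equal-width bins spanning the reference range.
/// (Illustrative choice of drift statistic; not taken from the project.)
fn psi(reference: &[f64], current: &[f64], bins: usize) -> f64 {
    assert!(bins > 0 && !reference.is_empty() && !current.is_empty());
    let min = reference.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = reference.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let width = ((max - min) / bins as f64).max(f64::EPSILON);

    // Share of samples per bin; out-of-range values are clamped to the edge bins.
    let shares = |data: &[f64]| -> Vec<f64> {
        let mut counts = vec![0usize; bins];
        for &x in data {
            let idx = (((x - min) / width).floor().max(0.0) as usize).min(bins - 1);
            counts[idx] += 1;
        }
        // A small floor keeps the logarithm finite for empty bins.
        counts
            .iter()
            .map(|&c| (c as f64 / data.len() as f64).max(1e-6))
            .collect()
    };

    let (p_ref, p_cur) = (shares(reference), shares(current));
    p_ref
        .iter()
        .zip(&p_cur)
        .map(|(r, c)| (c - r) * (c / r).ln())
        .sum()
}

fn main() {
    // Synthetic temperatures: the "current" window is the reference shifted by +5 °C.
    let reference: Vec<f64> = (10u32..20).map(f64::from).collect();
    let current: Vec<f64> = reference.iter().map(|t| t + 5.0).collect();
    let score = psi(&reference, &current, 5);
    // 0.25 is a rule-of-thumb threshold, not a project setting.
    println!("PSI = {score:.3}, drift suspected: {}", score > 0.25);
}
```

Identical distributions give a PSI of zero; the further the current window moves away from the reference, the larger the score.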
- API: Open-Meteo (free for non-commercial use, no API key required)
- Historical Data: 2016-2025 (10 years)
- Granularity: Hourly observations
- Total Records: ~1.2 million
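As a rough illustration of the data source described above, the sketch below builds a request URL for Open-Meteo's historical archive endpoint and checks the record estimate: 14 cities of hourly data over 2016-2025 comes to about 1.2 million rows. The helper names (`archive_url`, `hourly_records`), the São Paulo coordinates, and the choice of hourly variables are assumptions for this example; the project's actual client in `src/data/` may differ.

```rust
/// Builds a request URL for the Open-Meteo historical archive endpoint.
/// Which hourly variables the project actually requests is not stated in the
/// README; the caller passes them in.
fn archive_url(lat: f64, lon: f64, start: &str, end: &str, hourly: &[&str]) -> String {
    format!(
        "https://archive-api.open-meteo.com/v1/archive?latitude={lat}&longitude={lon}&start_date={start}&end_date={end}&hourly={}",
        hourly.join(",")
    )
}

fn is_leap(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of hourly observations for `cities` locations over whole calendar years.
fn hourly_records(first_year: u32, last_year: u32, cities: u64) -> u64 {
    let days: u64 = (first_year..=last_year)
        .map(|y| if is_leap(y) { 366 } else { 365 })
        .sum();
    days * 24 * cities
}

fn main() {
    // Approximate coordinates for São Paulo (assumed for illustration).
    let url = archive_url(
        -23.55,
        -46.63,
        "2016-01-01",
        "2025-12-31",
        &["temperature_2m", "precipitation"],
    );
    println!("{url}");
    println!("expected rows: {}", hourly_records(2016, 2025, 14));
}
```

The count works out to 3,653 days × 24 hours × 14 cities = 1,227,408 rows, which matches the "~1.2 million" figure.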
| Region | Cities |
|---|---|
| Brazil 🇧🇷 | São Paulo, Rio de Janeiro, São José dos Campos, Campinas |
| USA 🇺🇸 | New York, Los Angeles |
| Europe 🇬🇧🇩🇪🇳🇴 | London, Berlin, Oslo |
| Asia 🇯🇵🇨🇳🇦🇪 | Tokyo, Shanghai, Chongqing, Nanjing, Dubai |
RustForMachineLearning/
├── notebooks/                  # Jupyter notebooks (Evcxr)
│   ├── 01_data_collection_and_exploration.ipynb
│   ├── 02_preprocessing_and_feature_engineering.ipynb
│   ├── 03_feature_selection_and_model_training.ipynb
│   ├── 04_hyperparameter_tuning.ipynb
│   ├── 05_evaluation_and_validation.ipynb
│   └── 06_drift_detection_and_monitoring.ipynb
├── src/                        # Production Rust code
│   ├── lib.rs
│   ├── data/                   # Data loading & API client
│   ├── preprocessing/          # Data cleaning & transformation
│   ├── features/               # Feature engineering & selection
│   ├── models/                 # ML model implementations
│   ├── training/               # Training utilities
│   ├── evaluation/             # Metrics & visualization
│   ├── monitoring/             # Drift detection
│   └── bin/                    # CLI tools
├── data/                       # Data storage
│   ├── raw/                    # Raw API data
│   ├── processed/              # Cleaned data
│   └── features/               # Engineered features
├── models/                     # Trained model artifacts
├── docs/                       # Documentation
└── .github/workflows/          # GitHub Actions
- Rust (1.70+)
- Jupyter with Evcxr kernel
- Git
# Clone the repository
git clone https://github.com/yourusername/RustForMachineLearning.git
cd RustForMachineLearning
# Build the project
cargo build --release
# Install Evcxr Jupyter kernel (if not already installed)
cargo install evcxr_jupyter
evcxr_jupyter --install
# Start Jupyter
jupyter lab
# Navigate to notebooks/ and open the notebooks in order
# Run the daily predictions binary
cargo run --release --bin daily_predictions
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    DATA     │───▶│   PREPROC   │───▶│  FEATURES   │
│ COLLECTION  │    │ & WRANGLING │    │ ENGINEERING │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       ▼                  ▼                  ▼
Open-Meteo API     Missing values      Lag features
   14 cities          Outliers         Rolling stats
   10 years         Normalization    Cyclical encoding
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   FEATURE   │───▶│    MODEL    │───▶│  TRAINING   │
│  SELECTION  │    │ COMPARISON  │    │             │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       ▼                  ▼                  ▼
  Correlation           linfa         Train/Val/Test
  Importance          smartcore      Cross-validation
Recursive elim          Burn          Early stopping
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   TUNING    │───▶│ EVALUATION  │───▶│ MONITORING  │
│             │    │             │    │   & DRIFT   │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       ▼                  ▼                  ▼
  Grid search        Accuracy/F1        Data drift
 Random search        RMSE/MAE         Concept drift
   Cross-val           ROC/AUC         Auto-retrain
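The feature engineering stage in the pipeline lists lag features, rolling statistics, and cyclical encoding. A minimal standard-library sketch of each is shown below; the function names are hypothetical and the project's real implementation in `src/features/` (likely built on polars or ndarray) may look quite different.

```rust
use std::f64::consts::TAU;

/// Cyclical encoding of the hour of day as (sin, cos), so that 23:00 and
/// 00:00 end up close together instead of 23 units apart.
fn cyclical_hour(hour: u32) -> (f64, f64) {
    let angle = TAU * (hour % 24) as f64 / 24.0;
    (angle.sin(), angle.cos())
}

/// Lag feature: the value `k` steps earlier, or `None` where no history exists.
fn lag(series: &[f64], k: usize) -> Vec<Option<f64>> {
    (0..series.len())
        .map(|i| if i >= k { Some(series[i - k]) } else { None })
        .collect()
}

/// Rolling mean over the last `window` values, including the current one.
/// Positions without a full window yield `None`.
fn rolling_mean(series: &[f64], window: usize) -> Vec<Option<f64>> {
    assert!(window > 0);
    (0..series.len())
        .map(|i| {
            if i + 1 >= window {
                let slice = &series[i + 1 - window..=i];
                Some(slice.iter().sum::<f64>() / window as f64)
            } else {
                None
            }
        })
        .collect()
}

fn main() {
    // Made-up temperatures in °C, for illustration only.
    let temps = [18.0, 19.0, 21.0, 24.0];
    println!("lag 1:        {:?}", lag(&temps, 1));
    println!("rolling mean: {:?}", rolling_mean(&temps, 2));
    println!("hour 23:      {:?}", cyclical_hour(23));
    println!("hour 0:       {:?}", cyclical_hour(0));
}
```

Returning `Option` makes the warm-up rows explicit, so a later preprocessing step can decide whether to drop or impute them.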
| Model | Rain (Acc) | Rain (F1) | Condition (Acc) | Condition (F1) |
|---|---|---|---|---|
| Logistic Regression | -- | -- | -- | -- |
| Decision Tree | -- | -- | -- | -- |
| Random Forest | -- | -- | -- | -- |
| Gradient Boosting | -- | -- | -- | -- |
| Neural Network | -- | -- | -- | -- |
| Ensemble | -- | -- | -- | -- |
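For reference, the accuracy and F1 columns in the classification table can be computed from a confusion matrix. The sketch below does this for the binary rain task using only the standard library; the type and function names are hypothetical, and the labels are made up. How the project averages F1 across the six weather-condition classes is not stated, so only the binary case is shown.

```rust
/// Confusion counts for a binary task (here: rain = true, no rain = false).
#[derive(Debug, Default, PartialEq)]
struct Confusion {
    tp: usize,
    fp: usize,
    tn: usize,
    fn_: usize,
}

fn confusion(predicted: &[bool], actual: &[bool]) -> Confusion {
    assert_eq!(predicted.len(), actual.len());
    let mut c = Confusion::default();
    for (&p, &a) in predicted.iter().zip(actual) {
        match (p, a) {
            (true, true) => c.tp += 1,
            (true, false) => c.fp += 1,
            (false, false) => c.tn += 1,
            (false, true) => c.fn_ += 1,
        }
    }
    c
}

/// Fraction of all predictions that were correct.
fn accuracy(c: &Confusion) -> f64 {
    let total = c.tp + c.fp + c.tn + c.fn_;
    if total == 0 {
        return 0.0;
    }
    (c.tp + c.tn) as f64 / total as f64
}

/// F1 = 2·TP / (2·TP + FP + FN), the harmonic mean of precision and recall.
fn f1(c: &Confusion) -> f64 {
    let denom = 2 * c.tp + c.fp + c.fn_;
    if denom == 0 {
        return 0.0;
    }
    2.0 * c.tp as f64 / denom as f64
}

fn main() {
    // Made-up labels, for illustration only.
    let predicted = [true, true, false, false, true];
    let actual = [true, false, false, true, true];
    let c = confusion(&predicted, &actual);
    println!("{c:?}");
    println!("accuracy = {:.3}, F1 = {:.3}", accuracy(&c), f1(&c));
}
```

Reporting F1 alongside accuracy matters for rain prediction because dry hours usually dominate, so a model that always predicts "no rain" can score a high accuracy with an F1 of zero.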
| Model | Temp 24h (RMSE) | Temp 48h (RMSE) | Temp 72h (RMSE) |
|---|---|---|---|
| Linear Regression | -- | -- | -- |
| Decision Tree | -- | -- | -- |
| Random Forest | -- | -- | -- |
| Gradient Boosting | -- | -- | -- |
| Neural Network | -- | -- | -- |
| Ensemble | -- | -- | -- |
| Library | Purpose | Status |
|---|---|---|
| linfa | Classical ML | β |
| smartcore | Classical ML | β |
| rustyml | Classical ML | π |
| Burn | Deep Learning | π |
- polars: DataFrame operations
- ndarray: N-dimensional arrays
- reqwest: HTTP client
- chrono: Date/time handling
- plotters: Visualization
- statrs: Statistical functions
- Design Document
- API Reference (coming soon)
- Contributing Guide (coming soon)
This project is licensed under the MIT License - see the LICENSE file for details.
- Open-Meteo for providing free weather data
- Rust ML Community for the excellent ML libraries
- Evcxr for the Rust Jupyter kernel