
Complete Weather Prediction System in Rust: a production-grade machine learning system built entirely in Rust, demonstrating the full ML lifecycle from data collection to model monitoring. Uses the Evcxr Jupyter kernel for interactive exploration and experimentation.


SamoraDC/RustWeatherML


RustWeatherML

A production-grade machine learning system for weather prediction built entirely in Rust. This project demonstrates the complete ML lifecycle, from data collection to model monitoring, using the Evcxr Jupyter kernel for interactive exploration.


🌍 Live Weather Predictions

Auto-updated daily at 06:00 UTC | Last run: Pending first deployment

24-Hour, 48-Hour & 72-Hour Forecast

| City | Country | Current | +24h | +48h | +72h | Rain % | Confidence |
|------|---------|---------|------|------|------|--------|------------|
| São Paulo | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| Rio de Janeiro | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| São José dos Campos | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| Campinas | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| New York | 🇺🇸 | --°C | --°C | --°C | --°C | --% | -- |
| Los Angeles | 🇺🇸 | --°C | --°C | --°C | --°C | --% | -- |
| London | 🇬🇧 | --°C | --°C | --°C | --°C | --% | -- |
| Berlin | 🇩🇪 | --°C | --°C | --°C | --°C | --% | -- |
| Oslo | 🇳🇴 | --°C | --°C | --°C | --°C | --% | -- |
| Tokyo | 🇯🇵 | --°C | --°C | --°C | --°C | --% | -- |
| Shanghai | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Chongqing | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Nanjing | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Dubai | 🇦🇪 | --°C | --°C | --°C | --°C | --% | -- |

Model Performance (Last 7 Days)

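| Metric | Rain Prediction | Condition | Temp 24h | Temp 48h | Temp 72h |
|--------|-----------------|-----------|----------|----------|----------|
| Accuracy/RMSE | --% | --% | --°C | --°C | --°C |
| vs Baseline | -- | -- | -- | -- | -- |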
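
The "vs Baseline" row compares each model against a reference forecast. The README does not say which baseline is used, so the sketch below assumes a persistence baseline (the forecast for hour `t + horizon` is the value observed at hour `t`) and reports the relative RMSE improvement. The function names `rmse`, `persistence_pairs`, and `skill_vs_baseline` are hypothetical and are not taken from the project's `src/evaluation/` module.

```rust
/// Root-mean-square error between predictions and observations.
/// Returns `None` when the slices are empty or differ in length.
fn rmse(predicted: &[f64], observed: &[f64]) -> Option<f64> {
    if predicted.is_empty() || predicted.len() != observed.len() {
        return None;
    }
    let sum_sq: f64 = predicted
        .iter()
        .zip(observed)
        .map(|(p, o)| (p - o).powi(2))
        .sum();
    Some((sum_sq / predicted.len() as f64).sqrt())
}

/// Persistence baseline (an assumption, not confirmed by the project):
/// the forecast for step `t + horizon` is the value observed at step `t`.
/// Returns (baseline forecasts, matching observations).
fn persistence_pairs(series: &[f64], horizon: usize) -> (Vec<f64>, Vec<f64>) {
    if horizon == 0 || series.len() <= horizon {
        return (Vec::new(), Vec::new());
    }
    let forecasts = series[..series.len() - horizon].to_vec();
    let targets = series[horizon..].to_vec();
    (forecasts, targets)
}

/// Relative improvement over the baseline: 1 - rmse_model / rmse_baseline.
/// Positive values mean the model beats the baseline.
fn skill_vs_baseline(rmse_model: f64, rmse_baseline: f64) -> Option<f64> {
    if rmse_baseline > 0.0 {
        Some(1.0 - rmse_model / rmse_baseline)
    } else {
        None
    }
}
```

With hourly data, a 24-hour persistence forecast would use `horizon = 24`. A skill of 0.5 would mean the model halves the baseline RMSE.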

📋 Project Overview

Features

  • Complete ML Pipeline: Data collection → Preprocessing → Feature Engineering → Model Training → Evaluation → Monitoring
  • Multiple ML Libraries: Side-by-side comparison of linfa, smartcore, rustyml, and Burn
  • 10 Years of Data: Historical weather data from 2016-2025 for 14 cities worldwide
  • 4 Prediction Tasks:
    • Rain prediction (binary classification)
    • Weather condition classification (6 classes)
    • Temperature forecasting (24h, 48h, 72h)
    • Multi-target forecasting (temp + humidity + wind)
  • Drift Detection: Automated monitoring for data and concept drift
  • Live Dashboard: Daily auto-updated predictions via GitHub Actions
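
The drift detection feature is not specified further in this README. As one illustration of how data drift can be quantified, the sketch below computes the Population Stability Index (PSI) between a reference sample (for example, training-period temperatures) and a recent sample. PSI is an assumed choice of statistic, not necessarily what `src/monitoring/` implements, and the names `bin_fractions` and `psi` are hypothetical.

```rust
/// Fraction of `values` falling in each of `bins` equal-width bins over [min, max].
/// Values outside the range are clamped into the first or last bin.
fn bin_fractions(values: &[f64], min: f64, max: f64, bins: usize) -> Vec<f64> {
    let mut counts = vec![0usize; bins];
    let width = (max - min) / bins as f64;
    for &v in values {
        let idx = if width > 0.0 {
            (((v - min) / width).floor().max(0.0) as usize).min(bins - 1)
        } else {
            0
        };
        counts[idx] += 1;
    }
    counts
        .iter()
        .map(|&c| c as f64 / values.len() as f64)
        .collect()
}

/// Population Stability Index between a reference and a current sample.
/// Bin edges come from the reference sample's range.
fn psi(reference: &[f64], current: &[f64], bins: usize) -> Option<f64> {
    if reference.is_empty() || current.is_empty() || bins == 0 {
        return None;
    }
    let min = reference.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = reference.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let expected = bin_fractions(reference, min, max, bins);
    let actual = bin_fractions(current, min, max, bins);
    // A small floor avoids ln(0) and division by zero for empty bins.
    let eps = 1e-6;
    let score: f64 = expected
        .iter()
        .zip(&actual)
        .map(|(&e, &a)| {
            let (e, a) = (e.max(eps), a.max(eps));
            (a - e) * (a / e).ln()
        })
        .sum();
    Some(score)
}
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. Those thresholds are a convention, not values taken from this project.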

Data Source

  • API: Open-Meteo (free for non-commercial use, subject to rate limits; no API key required)
  • Historical Data: 2016-2025 (10 years)
  • Granularity: Hourly observations
  • Total Records: ~1.2 million
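
As a rough illustration of the data source, the sketch below builds a request URL for Open-Meteo's historical (archive) endpoint and checks the record count quoted above. The helper names `archive_url` and `hourly_records`, the São Paulo coordinates, and the choice of hourly variables are assumptions for the example. The project's actual client lives in `src/data/` and may differ.

```rust
/// Builds a request URL for the Open-Meteo historical weather (archive) API.
/// Dates are ISO 8601 (YYYY-MM-DD); `hourly_vars` are Open-Meteo variable names.
fn archive_url(lat: f64, lon: f64, start: &str, end: &str, hourly_vars: &[&str]) -> String {
    format!(
        "https://archive-api.open-meteo.com/v1/archive?latitude={lat}&longitude={lon}\
         &start_date={start}&end_date={end}&hourly={}",
        hourly_vars.join(",")
    )
}

/// Gregorian leap-year rule.
fn is_leap(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of hourly observations for `cities` locations over whole calendar years.
fn hourly_records(first_year: u32, last_year: u32, cities: u64) -> u64 {
    let days: u64 = (first_year..=last_year)
        .map(|y| if is_leap(y) { 366 } else { 365 })
        .sum();
    days * 24 * cities
}
```

For example, `archive_url(-23.55, -46.63, "2016-01-01", "2025-12-31", &["temperature_2m", "precipitation"])` yields a URL for approximate São Paulo coordinates, and `hourly_records(2016, 2025, 14)` gives 1,227,408, which matches the ~1.2 million figure if every hour of every year is present.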

Cities Covered

| Region | Cities |
|--------|--------|
| Brazil 🇧🇷 | São Paulo, Rio de Janeiro, São José dos Campos, Campinas |
| USA 🇺🇸 | New York, Los Angeles |
| Europe 🇬🇧🇩🇪🇳🇴 | London, Berlin, Oslo |
| Asia 🇯🇵🇨🇳🇦🇪 | Tokyo, Shanghai, Chongqing, Nanjing, Dubai |

🏗️ Project Structure

RustForMachineLearning/
├── notebooks/                    # Jupyter notebooks (Evcxr)
│   ├── 01_data_collection_and_exploration.ipynb
│   ├── 02_preprocessing_and_feature_engineering.ipynb
│   ├── 03_feature_selection_and_model_training.ipynb
│   ├── 04_hyperparameter_tuning.ipynb
│   ├── 05_evaluation_and_validation.ipynb
│   └── 06_drift_detection_and_monitoring.ipynb
├── src/                          # Production Rust code
│   ├── lib.rs
│   ├── data/                     # Data loading & API client
│   ├── preprocessing/            # Data cleaning & transformation
│   ├── features/                 # Feature engineering & selection
│   ├── models/                   # ML model implementations
│   ├── training/                 # Training utilities
│   ├── evaluation/               # Metrics & visualization
│   ├── monitoring/               # Drift detection
│   └── bin/                      # CLI tools
├── data/                         # Data storage
│   ├── raw/                      # Raw API data
│   ├── processed/                # Cleaned data
│   └── features/                 # Engineered features
├── models/                       # Trained model artifacts
├── docs/                         # Documentation
└── .github/workflows/            # GitHub Actions

🚀 Getting Started

Prerequisites

  • Rust (1.70+)
  • Jupyter with Evcxr kernel
  • Git

Installation

# Clone the repository
git clone https://github.com/SamoraDC/RustWeatherML.git
cd RustWeatherML

# Build the project
cargo build --release

# Install Evcxr Jupyter kernel (if not already installed)
cargo install evcxr_jupyter
evcxr_jupyter --install

Running the Notebooks

# Start Jupyter
jupyter lab

# Navigate to notebooks/ and open the notebooks in order

Running Daily Predictions

cargo run --release --bin daily_predictions

📊 ML Pipeline

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    DATA     │───▶│  PREPROC    │───▶│  FEATURES   │
│ COLLECTION  │    │ & WRANGLING │    │ ENGINEERING │
└─────────────┘    └─────────────┘    └─────────────┘
       │                 │                   │
       ▼                 ▼                   ▼
  Open-Meteo API   Missing values      Lag features
  14 cities        Outliers            Rolling stats
  10 years         Normalization       Cyclical encoding

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  FEATURE    │───▶│   MODEL     │───▶│  TRAINING   │
│ SELECTION   │    │ COMPARISON  │    │             │
└─────────────┘    └─────────────┘    └─────────────┘
       │                 │                   │
       ▼                 ▼                   ▼
  Correlation       linfa            Train/Val/Test
  Importance        smartcore        Cross-validation
  Recursive elim    Burn             Early stopping

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   TUNING    │───▶│ EVALUATION  │───▶│ MONITORING  │
│             │    │             │    │ & DRIFT     │
└─────────────┘    └─────────────┘    └─────────────┘
       │                 │                   │
       ▼                 ▼                   ▼
  Grid search       Accuracy/F1       Data drift
  Random search     RMSE/MAE          Concept drift
  Cross-val         ROC/AUC           Auto-retrain

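📈 Model Performance

Classification Tasks

| Model | Rain (Acc) | Rain (F1) | Condition (Acc) | Condition (F1) |
|-------|------------|-----------|-----------------|----------------|
| Logistic Regression | -- | -- | -- | -- |
| Decision Tree | -- | -- | -- | -- |
| Random Forest | -- | -- | -- | -- |
| Gradient Boosting | -- | -- | -- | -- |
| Neural Network | -- | -- | -- | -- |
| Ensemble | -- | -- | -- | -- |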
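
For reference, the accuracy and F1 columns for the binary rain task can be derived from a confusion matrix as sketched below. This is a minimal illustration with hypothetical names (`Confusion`, `confusion`) and covers the binary case only. How F1 is averaged across the six weather-condition classes is not stated in this README.

```rust
/// Confusion counts for a binary task where `true` is the positive class (rain).
#[derive(Debug, Default, PartialEq)]
struct Confusion {
    tp: usize,
    fp: usize,
    fn_: usize,
    tn: usize,
}

/// Tallies predictions against labels. Pairs are formed with `zip`,
/// so surplus elements in the longer slice are ignored.
fn confusion(predicted: &[bool], actual: &[bool]) -> Confusion {
    let mut c = Confusion::default();
    for (&p, &a) in predicted.iter().zip(actual) {
        match (p, a) {
            (true, true) => c.tp += 1,
            (true, false) => c.fp += 1,
            (false, true) => c.fn_ += 1,
            (false, false) => c.tn += 1,
        }
    }
    c
}

impl Confusion {
    /// Share of all samples classified correctly.
    fn accuracy(&self) -> Option<f64> {
        let total = self.tp + self.fp + self.fn_ + self.tn;
        if total == 0 {
            None
        } else {
            Some((self.tp + self.tn) as f64 / total as f64)
        }
    }

    /// F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    fn f1(&self) -> Option<f64> {
        let denom = 2 * self.tp + self.fp + self.fn_;
        if denom == 0 {
            None
        } else {
            Some(2.0 * self.tp as f64 / denom as f64)
        }
    }
}
```

F1 is usually the more informative of the two when rainy hours are rare, because a model that always predicts "no rain" can still score high accuracy.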

Regression Tasks

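| Model | Temp 24h (RMSE) | Temp 48h (RMSE) | Temp 72h (RMSE) |
|-------|-----------------|-----------------|-----------------|
| Linear Regression | -- | -- | -- |
| Decision Tree | -- | -- | -- |
| Random Forest | -- | -- | -- |
| Gradient Boosting | -- | -- | -- |
| Neural Network | -- | -- | -- |
| Ensemble | -- | -- | -- |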

🔧 Technologies

ML Libraries

| Library | Purpose | Status |
|---------|---------|--------|
| linfa | Classical ML | ✅ |
| smartcore | Classical ML | ✅ |
| rustyml | Classical ML | 🔄 |
| Burn | Deep Learning | 🔄 |

Data & Utilities

  • polars: DataFrame operations
  • ndarray: N-dimensional arrays
  • reqwest: HTTP client
  • chrono: Date/time handling
  • plotters: Visualization
  • statrs: Statistical functions

📚 Documentation


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments
