Mixture-of-Experts for Continual Learning on the MTL5 benchmark.
Install the dependencies:

```bash
pip install -r requirements.txt
```

Download the Llama-2-7b-hf model into the `../model/Llama-2-7b-hf/` directory.
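One way to fetch the weights is via `huggingface_hub`; this is a sketch, not the repository's required workflow, and it assumes you have been granted access to the gated `meta-llama/Llama-2-7b-hf` repository and are logged in (e.g. via `huggingface-cli login`):

```python
from huggingface_hub import snapshot_download

# Downloads the full model snapshot into the directory expected by the scripts.
# Requires prior access approval for the gated meta-llama/Llama-2-7b-hf repo
# and an authenticated Hugging Face token.
snapshot_download(
    repo_id="meta-llama/Llama-2-7b-hf",
    local_dir="../model/Llama-2-7b-hf",
)
```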
Run the full continual learning training (including the random-initialization baseline and the continual learning sequence DBPedia → Amazon → Yahoo → AGNews):

```bash
bash scripts/mtl5/run_moe-cl.sh
```

After training, calculate the continual learning metrics (ACC, BWT, FWT):
```bash
# Calculate metrics for order1
python calculate_bwt_fwt.py \
    --log_file results/moe-cl/mtl5/order1/log.txt \
    --order order1

# With the random initialization baseline for FWT calculation
python calculate_bwt_fwt.py \
    --log_file results/moe-cl/mtl5/order1/log.txt \
    --order order1 \
    --random_init_log results/moe-cl/mtl5/rand_init/log.txt
```

| Metric | Description |
|---|---|
| ACC | Average accuracy across all tasks after learning the final task |
| BWT | Backward Transfer — measures forgetting (negative = forgetting occurred) |
| FWT | Forward Transfer — measures knowledge transfer to new tasks (positive = helpful) |
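For reference, the sketch below shows how these metrics are typically computed from the task-by-task accuracy matrix, following the standard GEM-style definitions (Lopez-Paz & Ranzato, 2017). The function name `continual_metrics`, the matrix/baseline variables, and the numbers in the example are illustrative assumptions; the exact log parsing and conventions in `calculate_bwt_fwt.py` may differ.

```python
import numpy as np

def continual_metrics(R, b=None):
    """Compute ACC, BWT, FWT from an accuracy matrix.

    R[i, j] = accuracy on task j after training on task i (0-indexed in
    training order). b[j] = accuracy of a randomly initialized model on
    task j; it is only needed for FWT (pass None to skip FWT).
    """
    R = np.asarray(R, dtype=float)
    T = R.shape[0]

    # ACC: average accuracy over all tasks after learning the final task.
    acc = R[-1].mean()

    # BWT: change in accuracy on earlier tasks after learning later ones
    # (negative values indicate forgetting).
    bwt = np.mean([R[-1, i] - R[i, i] for i in range(T - 1)])

    # FWT: zero-shot accuracy on each task just before training on it,
    # relative to the random-initialization baseline.
    fwt = None
    if b is not None:
        b = np.asarray(b, dtype=float)
        fwt = np.mean([R[i - 1, i] - b[i] for i in range(1, T)])

    return acc, bwt, fwt

# Illustrative values only, for a 4-task order such as
# DBPedia -> Amazon -> Yahoo -> AGNews.
R = [[0.98, 0.10, 0.12, 0.25],
     [0.97, 0.92, 0.15, 0.30],
     [0.96, 0.90, 0.88, 0.35],
     [0.95, 0.89, 0.87, 0.91]]
b = [0.07, 0.05, 0.10, 0.25]  # random-init accuracies (illustrative)
print(continual_metrics(R, b))
```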
The three task orders are:

- order1: DBPedia → Amazon → Yahoo → AGNews
- order2: DBPedia → Amazon → AGNews → Yahoo
- order3: Yahoo → Amazon → AGNews → DBPedia
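If you post-process logs yourself, the orders can be written down as a plain mapping; the dict name `TASK_ORDERS` and the task-name spellings below are illustrative, not the repository's actual configuration keys:

```python
# Hypothetical mapping of order names to task sequences (illustrative only).
TASK_ORDERS = {
    "order1": ["DBPedia", "Amazon", "Yahoo", "AGNews"],
    "order2": ["DBPedia", "Amazon", "AGNews", "Yahoo"],
    "order3": ["Yahoo", "Amazon", "AGNews", "DBPedia"],
}
```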
- Training results and model checkpoints: `results/` directory
- Training logs: `logs/` directory