```bash
uv sync
uv run python main.py --prompt V4 --model "configs/llms/openai/G41-mini.yaml" --input "input/pt-BR/ptBR_Final_Data_classification.xlsx" --language pt-BR
```

Results will be saved under `output/classification/`.

This project provides an automated system for classifying life goals using Large Language Models (LLMs). It supports batch processing of Excel datasets, multiple prompt versions (V1–V6), multilingual configurations, and transparent logging of system prompts, user prompts, and classification outputs.
The system is designed for controlled experimentation on how prompt structure affects model behavior and classification outcomes.
```
├── main.py                   # Main classification script
├── run_all_combinations.sh   # Run all configuration combinations
├── evaluation_main.py        # Compare human and LLM classifications
├── kappa.py                  # Cohen's Kappa for agreement testing
├── Unified_Data.py           # Data merging utility
├── Age_Check.py              # Age validation
├── translate_EN.py           # English translation of goal texts
├── lifeproject/
│   ├── classifier_batched.py # Core batched LLM classification logic; builds the user prompt
│   ├── prompt_builder.py     # Builds system prompts
│   ├── llm.py                # LLM configuration and async client
│   └── config.py             # YAML-based model loader
├── configs/
│   ├── llms/
│   │   └── openai/
│   │       ├── G4O-mini.yaml
│   │       ├── G41-mini.yaml
│   │       └── G5.yaml
│   └── prompt/
│       ├── taskset/
│       ├── role.txt
│       ├── ...
│       ├── other prompt component.txt
│       ├── language_hint/
│       └── codebook/
├── input/                    # Input files for classification and evaluation
│   ├── pt-BR/
│   └── ZH/
├── output/
│   ├── classification/       # Classification results
│   ├── evaluation/           # Evaluation reports
│   ├── kappa/                # Kappa statistics
│   └── prompt/               # Saved prompts used in runs
├── requirements.txt
├── pyproject.toml
└── uv.lock
```

The project uses uv for dependency management.
Windows:

```bash
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Linux/macOS:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Install the dependencies:

```bash
uv sync
```

Or, for quick testing:

```bash
uv pip install -r requirements.in
```

Create a `.env` file:

```
OPENAI_API_KEY=your_api_key_here
```

Run a classification:

```bash
uv run python main.py \
  --prompt V5 \
  --model "configs/llms/openai/G41-mini.yaml" \
  --input "input/pt-BR/ptBR_Final_Data_classification.xlsx" \
  --language "configs/prompt/language_hint/pt-BR.txt"
```

Run all configuration combinations:

```bash
chmod +x run_all_combinations.sh
./run_all_combinations.sh
```

Run the evaluation and agreement analysis:

```bash
uv run python evaluation_main.py
uv run python kappa.py
```

Model YAML files are stored under `configs/llms/`. Each file defines:
- API endpoint and key
- Model name
- Temperature
- Token limits
- Pricing and concurrency
Switch models by passing a different YAML file to the `--model` argument.
More details in `configs/llms/README.md`.
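As an illustration of how such a file might be consumed, here is a minimal sketch of a YAML-based loader in the spirit of `LLMConfigManager.from_yaml()`; the field names (`model_name`, `temperature`, `max_tokens`, `max_concurrency`) are assumptions, not the project's actual schema:

```python
# Minimal sketch of a YAML-based model config loader.
# NOTE: field names below are illustrative assumptions, not the real schema.
from dataclasses import dataclass

import yaml


@dataclass
class LLMConfig:
    model_name: str
    temperature: float
    max_tokens: int
    max_concurrency: int

    @classmethod
    def from_yaml(cls, path: str) -> "LLMConfig":
        with open(path, encoding="utf-8") as f:
            raw = yaml.safe_load(f)
        return cls(
            model_name=raw["model_name"],
            temperature=raw.get("temperature", 0.0),
            max_tokens=raw.get("max_tokens", 1024),
            max_concurrency=raw.get("max_concurrency", 5),
        )
```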
Prompt files are under `configs/prompt/`. Switch prompt versions with the `--prompt` argument (e.g., `V1`–`V6`).
More details in `configs/prompt/README.md`.
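For intuition, a system prompt can be assembled by concatenating component files from this directory. The sketch below is a hypothetical version of what `prompt_builder.load_prompt_components()` might do; the file names, default arguments, and ordering are assumptions based on the `configs/prompt/` layout:

```python
# Hypothetical sketch: assemble a system prompt from component files.
# File names and ordering are assumptions based on the configs/prompt/ layout.
from pathlib import Path


def load_prompt_components(version: str = "V4",
                           language_hint: str = "language_hint/pt-BR.txt",
                           codebook: str = "codebook/codebook_en.txt") -> str:
    base = Path("configs/prompt")
    parts = [
        (base / "role.txt").read_text(encoding="utf-8"),
        (base / "taskset" / f"{version}.txt").read_text(encoding="utf-8"),
        (base / language_hint).read_text(encoding="utf-8"),
        (base / codebook).read_text(encoding="utf-8"),
    ]
    return "\n\n".join(parts)
```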
`main.py` controls data loading, prompt construction, model calls, and output saving.
Main workflow:
- Parse command-line arguments (paths for the input file, taskset, language, and model)
- Load environment variables (e.g., the API key)
- Load the model configuration with `LLMConfigManager.from_yaml()`
- Build the system prompt with `prompt_builder.load_prompt_components()`
- Read the input Excel data and extract goal texts (goal1–goal15)
- Call `get_batched_model_response()` for batch classification (see the sketch after this list)
- Save model outputs (classification + reasoning + token usage) as Excel/CSV files
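The batching step is the performance-critical part. Below is a minimal sketch of the underlying pattern, concurrency-limited async calls through the OpenAI client; it is not the project's actual `get_batched_model_response()` implementation, and the model name and concurrency cap are placeholders:

```python
# Sketch of concurrency-limited batch classification with an async client.
# Not the project's get_batched_model_response(); names are placeholders.
import asyncio

from openai import AsyncOpenAI


async def classify_goals(goals: list[str], system_prompt: str,
                         model: str = "gpt-4.1-mini",
                         max_concurrency: int = 5) -> list[str]:
    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
    sem = asyncio.Semaphore(max_concurrency)  # cap in-flight requests

    async def classify(goal: str) -> str:
        async with sem:
            resp = await client.chat.completions.create(
                model=model,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": goal},
                ],
            )
            return resp.choices[0].message.content

    # Launch all requests; the semaphore limits how many run at once.
    return await asyncio.gather(*(classify(g) for g in goals))
```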
`evaluation_main.py` compares human and LLM classifications goal by goal and generates a mismatch report. Workflow:
- Load the data (human, LLM, and optional English translation)
- Reshape all datasets to long format with `load_and_reshape()`
- Merge by `id` and `loc` (same individual and goal)
- Normalize multi-label categories (e.g., "IR,WEC")
- Compute accuracy (exact matches)
- Generate a mismatch report for all differing cases (see the sketch after this list)
- Save evaluation results under `output/evaluation/`
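The core of this comparison is a long-format merge plus label normalization. The following pandas sketch illustrates the idea; the column names (`id`, `loc`, the label columns), file paths, and the order-insensitive normalization rule are assumptions, not the script's actual interface:

```python
# Illustrative pandas sketch of the evaluation merge and accuracy computation.
# Column names, paths, and normalization are assumptions.
import pandas as pd


def load_and_reshape(path: str, value_name: str) -> pd.DataFrame:
    # Wide -> long: one row per (id, goal position).
    df = pd.read_excel(path)
    return df.melt(id_vars="id", var_name="loc", value_name=value_name)


def normalize(label: str) -> str:
    # Make multi-label strings such as "WEC,IR" order-insensitive.
    return ",".join(sorted(str(label).replace(" ", "").split(",")))


human = load_and_reshape("human.xlsx", "human_label")
llm = load_and_reshape("llm.xlsx", "llm_label")
merged = human.merge(llm, on=["id", "loc"])  # same individual and goal

match = merged["human_label"].map(normalize) == merged["llm_label"].map(normalize)
accuracy = match.mean()        # share of exact matches
mismatches = merged[~match]    # basis for the mismatch report
```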
`kappa.py` calculates Cohen’s Kappa per category, the mean and weighted Kappa, and statistical significance. Workflow:
- Read both Excel files (human vs. LLM)
- Extract all `LPSgoalX_category` columns
- Identify all unique category labels
- Build binary matrices for each label (1 = present, 0 = absent; see the sketch after this list)
- Calculate Kappa, standard error, and p-value
- Compute the mean and weighted mean Kappa
- Save summarized results to `output/kappa/`
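Conceptually, each category becomes a binary rater decision that is scored separately. Here is a small sketch using scikit-learn's `cohen_kappa_score`; this is an assumption about the computation (the script may compute Kappa differently, and it additionally reports standard errors and p-values):

```python
# Sketch of per-category Cohen's Kappa over binary indicator vectors.
# Uses scikit-learn for the Kappa itself; kappa.py may compute it differently.
from sklearn.metrics import cohen_kappa_score


def per_category_kappa(human: list[set[str]], llm: list[set[str]]) -> dict[str, float]:
    # human / llm: one set of category labels per goal, e.g. [{"IR", "WEC"}, ...]
    categories = sorted(set().union(*human, *llm))
    results = {}
    for cat in categories:
        h = [int(cat in labels) for labels in human]  # 1 = present, 0 = absent
        m = [int(cat in labels) for labels in llm]
        results[cat] = cohen_kappa_score(h, m)
    return results
```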
- `main.py` — pipeline controller
- `config.py` + `llm.py` — model configuration and API management
- `prompt_builder.py` — system prompt construction
- `classifier_batched.py` — batch classification execution and user prompt construction
- `evaluation_main.py` + `kappa.py` — evaluation and agreement analysis
| Type | Directory |
|---|---|
| Classification | output/classification/ |
| Evaluation | output/evaluation/ |
| Kappa | output/kappa/ |
| Prompt | output/prompt/ |
- Default codebook: `configs/prompt/codebook/codebook_en.txt`
- Supports multilingual inputs (currently `pt-BR` and `zh-TW`)
FTOLP is a project by the ODISSEI Social Data Science (SoDa) team. Do you have questions, suggestions, or remarks on the technical implementation? Create an issue in the issue tracker or feel free to contact Qixiang Fang or Shiyu Dong.