TestWeaver

Overview

TestWeaver is an advanced regression test generation tool that integrates Large Language Models (LLMs) with lightweight program analysis. Its goal is to generate high-quality test cases that enhance code coverage while addressing common challenges such as redundant test generation and the coverage plateau.

Unlike traditional test generators, TestWeaver incrementally builds a test suite by reasoning about program execution. It begins with seed tests and iteratively refines them through feedback-driven guidance informed by execution analysis, slicing, and "closest" test case retrieval.
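The incremental loop described above can be sketched as follows. This is a minimal illustration with every dependency injected as a callable; the parameter names (`run_with_coverage`, `backward_slice`, `find_closest_test`, `ask_llm_for_test`) are hypothetical stand-ins, not TestWeaver's real API — the actual implementation lives in `scripts/testweaver.py`.

```python
# Minimal sketch of TestWeaver's incremental feedback loop.
# All injected callables are hypothetical stand-ins for the real components.

def weave_tests(seed_tests, program_lines, run_with_coverage,
                backward_slice, find_closest_test, ask_llm_for_test,
                max_iters=10):
    suite = list(seed_tests)
    for _ in range(max_iters):
        covered = run_with_coverage(suite)          # execute suite, collect covered lines
        uncovered = program_lines - covered
        if not uncovered:
            break                                   # nothing left to cover
        target = min(uncovered)                     # pick one uncovered line
        context = backward_slice(target)            # only the code relevant to the target
        closest = find_closest_test(suite, target)  # test that nearly reaches the target
        new_test = ask_llm_for_test(context, closest, target)
        if new_test is not None:
            suite.append(new_test)
    return suite
```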


Key Features

  • Execution-aware feedback: Uses real execution traces to guide the LLM toward covering uncovered lines.
  • Backward slicing: Focuses the LLM on only the relevant code for each target line, reducing hallucinations.
  • Closest test retrieval: Identifies test cases that nearly reach the uncovered line to serve as contextual guidance.
  • Support for multiple LLM providers: Works with OpenAI, Anthropic, or AWS Bedrock.
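One plausible reading of "closest" — an illustrative assumption, not necessarily TestWeaver's exact metric — is coverage overlap between a test's executed lines and the backward slice of the uncovered target line:

```python
def closest_test(tests_coverage, target_slice):
    """Return the test whose covered lines overlap most (by Jaccard
    similarity) with the lines in the target's backward slice.

    tests_coverage: dict mapping test name -> set of covered line numbers.
    target_slice:   set of line numbers in the backward slice of the target.
    This scoring is an illustrative assumption, not TestWeaver's real metric.
    """
    def jaccard(a, b):
        return len(a & b) / len(a | b) if (a or b) else 0.0
    return max(tests_coverage, key=lambda t: jaccard(tests_coverage[t], target_slice))
```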


🔧 Setup

Install dependencies:

pip install -r requirements.txt

🔐 Configure API Key

You need to set up access to an LLM provider before running TestWeaver.

echo "OPENAI_API_KEY=sk-your-actual-api-key-here" > .env
echo "OPENAI_BASE_URL=https://api.openai.com/v1" >> .env
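The scripts read these variables from the environment. If you want to see what loading a `.env` file amounts to, here is a stdlib-only sketch (illustrative only — the project may well use python-dotenv or similar instead):

```python
import os

def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ.

    Minimal illustration only: skips blank lines, '#' comments, and
    lines without '='; does not handle quoting or 'export' prefixes.
    """
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
```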

📂 Prepare Dataset for Evaluation

We conduct our evaluation using the CodaMosa (CM) suite, a dataset derived from 35 open-source Python projects.

To download the dataset, run:

git clone https://github.com/plasma-umass/codamosa.git

🧪 Running TestWeaver on a Specific Subproject

You can run TestWeaver on a specific subproject within a larger repository.
This will launch an experimental run.

The results will be saved under the output/cm/... directory.

cd scripts/
export PYTHONPATH=$(pwd)
export sample_id=21  # The ID of your chosen CodaMosa module; e.g. 21 corresponds to the 'tqdm' module
python testweaver.py --test-index $sample_id

🧪 Running TestWeaver Ablation Study

To evaluate the impact of different components, you can run TestWeaver in an ablation study mode. This command will execute five experimental configurations:

  1. With slicing
  2. Without slicing
  3. Without execution-in-line
  4. Without closest-test retrieval
  5. Full TestWeaver pipeline

The results will be saved under the output/cm/... directory.

cd scripts/
export PYTHONPATH=$(pwd)
export sample_id=21  # The ID of your chosen CodaMosa module; e.g. 21 corresponds to the 'tqdm' module
python ablate.py --test-index $sample_id

📌 Notes

  • TestWeaver builds tests incrementally by reasoning about what code remains uncovered.
  • It uses slicing and closest-test retrieval to make LLM prompts more focused and effective.
  • Generated tests are saved as .py files and can be executed with pytest.

Baselines:

CoverUp Baseline

Run the CoverUp baseline with the DeepSeek model.

Prerequisites: Docker, Python 3.10+

Steps:

  1. Ensure the .env file is configured (same as TestWeaver):
echo "OPENAI_API_KEY=sk-your-actual-api-key-here" > .env
echo "OPENAI_BASE_URL=https://llm-prof-tien.thaiminhpv.id.vn/" >> .env
  2. Load the Docker image:
docker load -i scripts/baselines/coverup/docker/coverup-runner.tar
  3. Run the CoverUp baseline:
cd scripts/baselines/coverup
python3 scripts/eval_coverup.py --config deepseek-v3 --suite cm

Optional: Run on specific package or file:

python3 scripts/eval_coverup.py --config deepseek-v3 --suite cm --package tqdm
python3 scripts/eval_coverup.py --config deepseek-v3 --suite cm --only tqdm/_tqdm.py

Output: scripts/baselines/coverup/output/cm.deepseek-v3/<package>/final.json
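To summarize results across packages, you can walk the output directory above. A hedged sketch — the `"coverage"` key is an assumption about the schema of `final.json`, which is not documented here:

```python
import json
from pathlib import Path

def collect_results(root="scripts/baselines/coverup/output/cm.deepseek-v3"):
    """Map each package directory under `root` to the value of its
    final.json "coverage" field (key name is an assumption)."""
    results = {}
    for path in Path(root).glob("*/final.json"):
        data = json.loads(path.read_text())
        results[path.parent.name] = data.get("coverage")
    return results
```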

CodaMosa Baseline

Run the CodaMosa baseline with the DeepSeek model.

Prerequisites: Docker, Python 3.10+

Steps:

  1. Ensure the .env file is configured (same as TestWeaver):
echo "OPENAI_API_KEY=sk-your-actual-api-key-here" > .env
echo "OPENAI_BASE_URL=https://llm-prof-tien.thaiminhpv.id.vn/" >> .env
  2. Load the Docker images:
cd scripts/baselines/codamosa/replication
docker load < docker-images/benchmarks-docker.tar.gz
docker load < docker-images/codamosa-docker.tar.gz
  3. Start the benchmark container (if not already started):
./scripts/start_benchmark_container.sh
  4. Run the CodaMosa baseline:
python3 run_codamosa_deepseek.py

Output: scripts/baselines/codamosa/replication/deepseek-coda/<module>-<run>/statistics.csv
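Each run's `statistics.csv` can be aggregated with the stdlib `csv` module. A hedged sketch — the `Coverage` column name is an assumption about the CSV schema, which is not documented here:

```python
import csv
from pathlib import Path

def average_column(root, column="Coverage"):
    """Average a numeric column across all <module>-<run>/statistics.csv
    files under `root` (the column name is an assumption)."""
    values = []
    for path in Path(root).glob("*/statistics.csv"):
        with open(path, newline="") as fh:
            for row in csv.DictReader(fh):
                values.append(float(row[column]))
    return sum(values) / len(values) if values else None
```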
