TestWeaver

Overview

TestWeaver is an advanced regression test generation tool that integrates Large Language Models (LLMs) with lightweight program analysis. Its goal is to generate high-quality test cases that enhance code coverage while addressing common challenges such as redundant test generation and the coverage plateau.

Unlike traditional test generators, TestWeaver incrementally builds a test suite by reasoning about program execution. It begins with seed tests and iteratively refines them through feedback-driven guidance informed by execution analysis, slicing, and "closest" test case retrieval.
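The incremental loop described above can be sketched as follows. This is a minimal illustration with every dependency injected as a callable; the parameter names (`run_with_coverage`, `backward_slice`, `find_closest_test`, `ask_llm_for_test`) are hypothetical stand-ins, not TestWeaver's real API — the actual implementation lives in `scripts/testweaver.py`.

```python
# Minimal sketch of TestWeaver's incremental feedback loop.
# All injected callables are hypothetical stand-ins for the real components.

def weave_tests(seed_tests, program_lines, run_with_coverage,
                backward_slice, find_closest_test, ask_llm_for_test,
                max_iters=10):
    suite = list(seed_tests)
    for _ in range(max_iters):
        covered = run_with_coverage(suite)          # execute suite, collect covered lines
        uncovered = program_lines - covered
        if not uncovered:
            break                                   # nothing left to cover
        target = min(uncovered)                     # pick one uncovered line
        context = backward_slice(target)            # only the code relevant to the target
        closest = find_closest_test(suite, target)  # test that nearly reaches the target
        new_test = ask_llm_for_test(context, closest, target)
        if new_test is not None:
            suite.append(new_test)
    return suite
```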


Key Features

  • Execution-aware feedback: Uses real execution traces to guide the LLM toward covering uncovered lines.
  • Backward slicing: Focuses the LLM on only the relevant code for each target line, reducing hallucinations.
  • Closest test retrieval: Identifies test cases that nearly reach the uncovered line to serve as contextual guidance.
  • Support for multiple LLM providers: Works with OpenAI, Anthropic, or AWS Bedrock.
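One plausible reading of "closest" — an illustrative assumption, not necessarily TestWeaver's exact metric — is coverage overlap between a test's executed lines and the backward slice of the uncovered target line:

```python
def closest_test(tests_coverage, target_slice):
    """Return the test whose covered lines overlap most (by Jaccard
    similarity) with the lines in the target's backward slice.

    tests_coverage: dict mapping test name -> set of covered line numbers.
    target_slice:   set of line numbers in the backward slice of the target.
    This scoring is an illustrative assumption, not TestWeaver's real metric.
    """
    def jaccard(a, b):
        return len(a & b) / len(a | b) if (a or b) else 0.0
    return max(tests_coverage, key=lambda t: jaccard(tests_coverage[t], target_slice))
```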


🔧 Setup

Install dependencies:

pip install -r requirements.txt

🔐 Configure API Key

You need to set up access to an LLM provider before running TestWeaver.

echo "OPENAI_API_KEY=sk-your-actual-api-key-here" > .env
echo "OPENAI_BASE_URL=https://api.openai.com/v1" >> .env
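The scripts read these variables from the environment. If you want to see what loading a `.env` file amounts to, here is a stdlib-only sketch (illustrative only — the project may well use python-dotenv or similar instead):

```python
import os

def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ.

    Minimal illustration only: skips blank lines, '#' comments, and
    lines without '='; does not handle quoting or 'export' prefixes.
    """
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
```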

📂 Prepare Dataset for Evaluation

We conduct our evaluation using the CodaMosa (CM) suite, a dataset derived from 35 open-source Python projects.

To download the dataset, run:

git clone https://github.com/plasma-umass/codamosa.git

🧪 Running TestWeaver on a Specific Subproject

You can run TestWeaver on a specific subproject within a larger repository.
This will launch an experimental run.

The results will be saved under the output/cm/... directory.

cd scripts/
export PYTHONPATH=$(pwd)
export sample_id=21  # The ID of your chosen CodaMosa module; e.g. 21 corresponds to the 'tqdm' module
python testweaver.py --test-index $sample_id

🧪 Running TestWeaver Ablation Study

To evaluate the impact of different components, you can run TestWeaver in an ablation study mode. This command will execute five experimental configurations:

  1. With slicing
  2. Without slicing
  3. Without execution-in-line
  4. Without closest-test retrieval
  5. Full TestWeaver pipeline

The results will be saved under the output/cm/... directory.

cd scripts/
export PYTHONPATH=$(pwd)
export sample_id=21  # The ID of your chosen CodaMosa module; e.g. 21 corresponds to the 'tqdm' module
python ablate.py --test-index $sample_id

📌 Notes

  • TestWeaver builds tests incrementally by reasoning about what code remains uncovered.
  • It uses slicing and closest-test retrieval to make LLM prompts more focused and effective.
  • Generated tests are saved as .py files and can be executed with pytest.

Baselines:

CoverUp Baseline

Run the CoverUp baseline with the DeepSeek model.

Prerequisites: Docker, Python 3.10+

Steps:

  1. Ensure the .env file is configured (same as TestWeaver):
echo "OPENAI_API_KEY=sk-your-actual-api-key-here" > .env
echo "OPENAI_BASE_URL=https://llm-prof-tien.thaiminhpv.id.vn/" >> .env
  2. Load the Docker image:
docker load -i scripts/baselines/coverup/docker/coverup-runner.tar
  3. Run the CoverUp baseline:
cd scripts/baselines/coverup
python3 scripts/eval_coverup.py --config deepseek-v3 --suite cm

Optional: Run on specific package or file:

python3 scripts/eval_coverup.py --config deepseek-v3 --suite cm --package tqdm
python3 scripts/eval_coverup.py --config deepseek-v3 --suite cm --only tqdm/_tqdm.py

Output: scripts/baselines/coverup/output/cm.deepseek-v3/<package>/final.json
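To summarize results across packages, you can walk the output directory above. A hedged sketch — the `"coverage"` key is an assumption about the schema of `final.json`, which is not documented here:

```python
import json
from pathlib import Path

def collect_results(root="scripts/baselines/coverup/output/cm.deepseek-v3"):
    """Map each package directory under `root` to the value of its
    final.json "coverage" field (key name is an assumption)."""
    results = {}
    for path in Path(root).glob("*/final.json"):
        data = json.loads(path.read_text())
        results[path.parent.name] = data.get("coverage")
    return results
```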

CodaMosa Baseline

Run the CodaMosa baseline with the DeepSeek model.

Prerequisites: Docker, Python 3.10+

Steps:

  1. Ensure the .env file is configured (same as TestWeaver):
echo "OPENAI_API_KEY=sk-your-actual-api-key-here" > .env
echo "OPENAI_BASE_URL=https://llm-prof-tien.thaiminhpv.id.vn/" >> .env
  2. Load the Docker images:
cd scripts/baselines/codamosa/replication
docker load < docker-images/benchmarks-docker.tar.gz
docker load < docker-images/codamosa-docker.tar.gz
  3. Start the benchmark container (if not already started):
./scripts/start_benchmark_container.sh
  4. Run the CodaMosa baseline:
python3 run_codamosa_deepseek.py

Output: scripts/baselines/codamosa/replication/deepseek-coda/<module>-<run>/statistics.csv
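Each run's `statistics.csv` can be aggregated with the stdlib `csv` module. A hedged sketch — the `Coverage` column name is an assumption about the CSV schema, which is not documented here:

```python
import csv
from pathlib import Path

def average_column(root, column="Coverage"):
    """Average a numeric column across all <module>-<run>/statistics.csv
    files under `root` (the column name is an assumption)."""
    values = []
    for path in Path(root).glob("*/statistics.csv"):
        with open(path, newline="") as fh:
            for row in csv.DictReader(fh):
                values.append(float(row[column]))
    return sum(values) / len(values) if values else None
```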
