A simple and efficient command-line tool that merges all your project's source code into a single file, making it easy to copy and paste large codebases into Large Language Model (LLM) prompts.
When interacting with LLMs like ChatGPT, Gemini, or Claude for code-related tasks, we often need to provide the full context of a project. Manually copying and pasting content from multiple files is not only inefficient but also prone to errors.
Code4LM automates this process. With a single command, it scans your project and "packs" all relevant source files into a neatly organized text file, allowing you to copy the entire project's code at once.
📂 Smart Merging: Recursively traverses your project directory and merges all files with specified extensions.
🎯 Customizable File Types: Freely select which file types to include using the -exts flag (e.g., .py, .js, .md).
🙈 Flexible Exclusions: Easily exclude unnecessary directories (like node_modules or pycache) with --exclude and specific files with --exclude-files.
💧 Dry Run Mode: Use the --dry-run option to preview which files will be merged without actually creating a file.
⌨️ Pure Command-Line: Lightweight and easy to integrate into any workflow.
Ensure you have Python 3.7+ installed. You can then install Code4LM directly from a Git repository (recommended) or from a local clone.
pip install code4lmpip install git+https://github.com/KylinMountain/code4lm.gitClone or download the project to your local machine.
In your terminal, navigate to the project's root directory (where pyproject.toml is located).
Run the following command:
pip install .
Once installed, the code4lm command will be available globally in your system.
Simply run the code4lm command in the root directory of the project you want to process.
Running the command in a project directory will merge files using the default configuration:
code4lm
usage: code4lm [-h] [--path PATH] [--output OUTPUT] [--exts EXTS] [--exclude EXCLUDE] [--exclude-files EXCLUDE_FILES] [--dry-run]
A tool to merge project source code files for Large Language Models.
options:
-h, --help show this help message and exit
--path PATH The root path of the project to process. (default: .)
--output OUTPUT The name of the output file. (default: all_code.txt)
--exts EXTS Comma-separated list of file extensions to include.
Example: .py,.js,.html
--exclude EXCLUDE Comma-separated list of directory names to exclude.
Example: venv,node_modules
--exclude-files EXCLUDE_FILES
Comma-separated list of exact file names to exclude.
Example: secret.key,temp.log
--dry-run Perform a dry run to see which files would be merged without creating the output file.
See what will happen without writing any files:
code4lm --dry-run
Include only .js and .css files, while excluding the dist directory:
code4lm --exts ".js,.css" --exclude "dist" --output "frontend_bundle.txt"
Skip merging config.dev.js and secret.json:
code4lm --exclude-files "config.dev.js,secret.json"
Contributions of any kind are welcome! If you have a great idea or find a bug, feel free to submit a Pull Request or create an Issue.
This project is licensed under the MIT License.