GPT Copy is a command-line tool that recursively scans a directory, collects readable files, and concatenates them into a single structured markdown stream. The output can be printed to stdout or written to a file, making it easy to feed codebases, documentation, or notes into language models like GPT.
- Recursive Directory Scanning: Respects `.gitignore` rules to selectively process files.
- Structured Output: Concatenates file contents into a structured markdown document with file-specific code fences.
- File Filtering: Supports glob-style include (`-i/--include`) and exclude (`-e/--exclude`) patterns for precise file selection.
- Force Mode: The `-f/--force` option bypasses ignore rules and Git-tracked file restrictions.
- Line Numbering: Add zero-padded line numbers to each file's content using the `-n/--number` option (similar to `cat -n`).
- Token Counting: Includes a separate `tokens` CLI command to count the number of tokens in text using OpenAI's `tiktoken` library with GPT-4o model encoding.
Ensure you have Python 3 installed. You can install the dependencies using:
pip install -r requirements.txt
Alternatively, install directly from Git:
pip install git+https://github.com/simone-viozzi/gpt-copy.git
Run the tool by specifying the target directory:
gpt-copy /path/to/directory
Redirect the output to a file:
gpt-copy /path/to/directory -o output.md
Fine-tune which files are processed using include and exclude options.
- Include Files (`-i` or `--include`): Specify one or more glob patterns (with optional brace expansion) to include only matching files. Examples:
  - Include all Python files in the `src` folder: `gpt-copy /path/to/directory -i "src/*.py"`
  - Include specific modules: `gpt-copy /path/to/directory -i "src/{module1,module2}.py"`
- Exclude Files (`-e` or `--exclude`): Specify one or more glob patterns to exclude files. Exclusion takes precedence over inclusion. Examples:
  - Exclude all files in the `tests` folder: `gpt-copy /path/to/directory -e "tests/*"`
  - Exclude a specific file: `gpt-copy /path/to/directory -i "src/*.py" -e "src/__init__.py"`
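The include/exclude rules described above can be illustrated with a minimal sketch. This is not GPT Copy's actual implementation: the helper names `expand_braces` and `is_selected` are hypothetical, and matching is done with Python's standard `fnmatch` module, whose `*` also matches `/`, so the real tool's glob semantics may differ.

```python
import fnmatch
import re
from typing import Iterable, List


def expand_braces(pattern: str) -> List[str]:
    """Expand {a,b,...} groups into plain glob patterns (hypothetical helper)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded: List[str] = []
    for option in match.group(1).split(","):
        # Recurse so that patterns with several brace groups are fully expanded.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def is_selected(
    rel_path: str,
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> bool:
    """Decide whether a path (relative to the root) should be processed."""
    include_patterns = [p for pat in include for p in expand_braces(pat)]
    exclude_patterns = [p for pat in exclude for p in expand_braces(pat)]
    # Exclusion takes precedence over inclusion.
    if any(fnmatch.fnmatchcase(rel_path, p) for p in exclude_patterns):
        return False
    # Assumption: with no include patterns, every non-excluded file is kept.
    if not include_patterns:
        return True
    return any(fnmatch.fnmatchcase(rel_path, p) for p in include_patterns)
```

For example, `is_selected("src/__init__.py", include=["src/*.py"], exclude=["src/__init__.py"])` is rejected because the exclude pattern is checked first.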
Ignore `.gitignore` and Git-tracked file restrictions to process all files:
gpt-copy /path/to/directory -f
Enable line numbering for the content of each file. This option prefixes each line with a zero-padded line number, similar to the Unix `cat -n` command.
Usage Example:
gpt-copy /path/to/directory -n
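To make "zero-padded" concrete, here is a small sketch of one way such numbering could work. The function name `number_lines` and the `": "` separator are assumptions for illustration; the exact format that `gpt-copy -n` emits is not specified here.

```python
def number_lines(text: str) -> str:
    """Prefix each line with a zero-padded line number (illustrative format)."""
    lines = text.splitlines()
    # Pad to the width of the largest line number so the columns align.
    width = len(str(len(lines)))
    return "\n".join(
        f"{index:0{width}d}: {line}" for index, line in enumerate(lines, start=1)
    )
```

With a ten-line input, the first line is numbered `01` and the last `10`, so the content stays aligned regardless of file length.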
Count the number of tokens in a given text using GPT-4o encoding. The command reads from a file or standard input.
Examples:
- Count tokens in a file: `tokens file.txt`
- Pipe output from `gpt-copy` into `tokens`: `gpt-copy /path/to/directory | tokens`
- Collects `.gitignore` Rules: Scans the directory for `.gitignore` files and applies the rules to skip ignored files unless force mode is enabled.
- Generates a Structured File Tree: Creates a visual representation of the directory structure.
- Reads and Formats Files:
  - Detects file type based on extension.
  - Wraps file contents in appropriate markdown code fences.
  - Adds line numbers if the `-n/--number` option is enabled.
  - Skips binary or unrecognized file types.
- Applies File Filtering: Uses include and exclude glob patterns to determine which files to process, based on their paths relative to the root directory.
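The "Reads and Formats Files" step can be sketched roughly as follows. This is an assumption-laden illustration, not the tool's source: `LANG_BY_SUFFIX`, `is_probably_binary`, and `render_file` are hypothetical names, the extension table is deliberately tiny, and the NUL-byte check is just a common heuristic for spotting binary data.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-language table; a real tool would cover far more.
LANG_BY_SUFFIX = {
    ".py": "python",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".js": "javascript",
    ".md": "markdown",
}

FENCE = "`" * 3  # built programmatically to avoid nesting fences in this example


def is_probably_binary(data: bytes) -> bool:
    """Heuristic: treat content containing a NUL byte as binary."""
    return b"\x00" in data


def render_file(rel_path: str, content: str) -> str:
    """Render one file as a heading, its relative path, and a fenced block."""
    path = PurePosixPath(rel_path)
    # Unknown extensions fall back to a fence without a language tag.
    lang = LANG_BY_SUFFIX.get(path.suffix.lower(), "")
    return "\n".join(
        [
            f"### File: `{path.name}`",
            f"*(Relative Path: `{rel_path}`)*",
            f"{FENCE}{lang}",
            content.rstrip("\n"),
            FENCE,
        ]
    )
```

Calling `render_file("subdir/config.yaml", "version: 1.0\nenabled: true\n")` yields a section shaped like the sample output in this README.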
# Folder Structure
```
project_root
├── main.py
├── README.md
└── subdir
    ├── config.yaml
    └── script.js
```
## File Contents
### File: `main.py`
*(Relative Path: `main.py`)*
```python
print("Hello, World!")
```
### File: `config.yaml`
*(Relative Path: `subdir/config.yaml`)*
```yaml
version: 1.0
enabled: true
```
Contributions are welcome! If you'd like to contribute, please open a pull request with your proposed changes.
This project is licensed under the MIT License.