Merged
2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -2,7 +2,7 @@ name: Upload Python Package to PyPI

on:
release:
types: [created]
types: [ created ]

jobs:
pypi-publish:
30 changes: 15 additions & 15 deletions .github/workflows/tests.yml
@@ -11,20 +11,20 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]
python-version: [ "3.8", "3.9", "3.10", "3.11" ]

steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest-cov
pip install -e .
- name: Run tests
run: |
pytest tests/
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest-cov
pip install -e .
- name: Run tests
run: |
pytest tests/
49 changes: 34 additions & 15 deletions README.md
@@ -7,12 +7,12 @@
/ /___ | (_) || (_| || __/ / /\/\ \ / /_//
\____/ \___/ \__,_| \___| \/ \//___,'

Ver. 0.0.2
````
Ver. 0.0.2b
```

# CodeMD

🚀 Transform code repositories into markdown-formatted strings ready for LLM prompting
🚀 Transform code files and repositories into markdown-formatted strings ready for LLM prompting

[![Tests](https://github.com/dotpyu/codemd/actions/workflows/tests.yml/badge.svg)](https://github.com/dotpyu/codemd/actions/workflows/tests.yml)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
@@ -21,28 +21,32 @@ Ver. 0.0.2

## 📝 Overview

CodeMD helps you convert your entire codebase into a format that's optimal for code-related prompts with Large Language Models (LLMs) like GPT-4, Claude, and others. It automatically processes your code files and outputs them in a clean, markdown-formatted structure that's perfect for LLM interactions.
CodeMD helps you convert your code files or entire codebase into a format that's optimal for code-related prompts with Large Language Models (LLMs) like GPT-4, Claude, and others. It automatically processes your code files and outputs them in a clean, markdown-formatted structure that's perfect for LLM interactions.

## ✨ Features

- 🔍 **Smart Directory Scanning**: Recursively scans directories for code files
- 🎯 **Flexible Configuration**:
- 🔍 **Flexible Processing**:
- Single file processing
- Recursive directory scanning
- 🎯 **Configurable Options**:
- Configurable file extensions
- File and pattern exclusion support
- Custom .gitignore support
- 📊 **Intelligent Output**:
- 📊 **Smart Output**:
- Markdown-formatted code blocks
- Preserved directory structure
- Repository structure visualization
- Optional directory structure visualization
- Token count estimation (with tiktoken)
- Configurable output display
- 📋 **Convenience**:
- Simple command-line interface
- Direct copy-to-clipboard support
- Multiple output options

### 🎉 Recent Updates
### 🎉 Recent Updates (0.0.2b)

- ⭐ **NEW**: Repository structure visualization (disable with `--no-structure`)
- ⭐ **NEW**: Single file processing support
- ⭐ **NEW**: Configurable output display (use `--print` to show output)
- ⭐ **NEW**: Repository structure visualization (auto-disabled for single files, or use `--no-structure`)
- ⭐ **NEW**: Automatic .gitignore support
- Uses project's .gitignore by default
- Custom .gitignore files via `--gitignore`
@@ -65,13 +69,27 @@ pip install -e .

### Command Line Interface

**Basic Usage:**
**Single File Processing:**
```bash
codemd /path/to/your/code
# Process a single file (no output by default)
codemd /path/to/script.py

# Process and display output
codemd /path/to/script.py --print

# Save to file
codemd /path/to/script.py -o output.md
```

**Custom Extensions and Output:**
**Directory Processing:**
```bash
# Basic directory scanning (no output by default)
codemd /path/to/your/code

# Show output in terminal
codemd /path/to/your/code --print

# Custom extensions and output file
codemd /path/to/your/code -e py,java,sql -o output.md
```
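
The comma-separated value given to `-e` is split into a set of extensions. Below is a minimal sketch of that parsing, modeled on the `str_to_set` helper in `src/codemd/cli.py`. The name `parse_extensions` and the `Optional` annotations are illustrative assumptions, not part of the package's API.

```python
from typing import Optional, Set


def parse_extensions(raw: Optional[str]) -> Optional[Set[str]]:
    """Split a comma-separated string into a set, dropping blanks and padding."""
    if raw is None:
        # No -e flag was given; the caller decides what "no filter" means.
        return None
    return {item.strip() for item in raw.split(",") if item.strip()}


print(sorted(parse_extensions("py, java,,sql")))  # ['java', 'py', 'sql']
```

Returning `None` rather than an empty set keeps "flag omitted" distinguishable from "flag given but empty".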

@@ -106,6 +124,7 @@ Contributions are welcome! Feel free to open issues or submit pull requests.
Distributed under the Apache 2.0 License. See `LICENSE` for more information.

---

<div align="center">
Made with ❤️ by Peilin
</div>
</div>
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "codemd"
version = "0.0.2"
version = "0.0.2b"
authors = [
{ name = "Peilin Yu", email = "peilin_yu@brown.edu" },
]
6 changes: 3 additions & 3 deletions setup.py
@@ -1,4 +1,5 @@
import os

from setuptools import setup, find_packages

with open(os.path.join(os.path.dirname(__file__), "README.md"), encoding="utf-8") as f:
@@ -7,10 +8,9 @@
with open("requirements.txt", encoding="utf-8") as f:
requirements = [line.strip() for line in f if line.strip() and not line.startswith("#")]


setup(
name="codemd",
version="0.0.2",
version="0.0.2b",
author="Peilin Yu",
author_email="peilin_yu@brown.edu",
description="Transform code repositories into markdown-formatted strings ready for LLM prompting",
@@ -43,4 +43,4 @@
"Bug Reports": "https://github.com/dotpyu/codemd/issues",
"Source": "https://github.com/dotpyu/codemd",
},
)
)
4 changes: 2 additions & 2 deletions src/codemd/__init__.py
@@ -1,5 +1,5 @@
from .cli import main
from .scanner import CodeScanner

__version__ = "0.0.2"
__all__ = ["CodeScanner", "main"]
__version__ = "0.0.2b"
__all__ = ["CodeScanner", "main"]
64 changes: 38 additions & 26 deletions src/codemd/cli.py
@@ -1,19 +1,19 @@
import argparse
import platform
import subprocess
import sys
from pathlib import Path
from typing import Set, Tuple
import subprocess
import platform

from .scanner import CodeScanner

try:
import tiktoken

TIKTOKEN_AVAILABLE = True
except ImportError:
TIKTOKEN_AVAILABLE = False


BANNER = r"""
___ _ ___
/ __\ ___ __| | ___ /\/\ / \
@@ -24,13 +24,18 @@

EPILOG = """
Examples:
# Basic usage (prints to stdout)
# Basic usage (file or directory, no output by default)
codemd /path/to/code
codemd /path/to/file.py

# Print output to stdout
codemd /path/to/code --print
codemd /path/to/file.py --print

# Custom extensions (prints to stdout)
# Custom extensions
codemd /path/to/code -e py,java,sql

# Save to file instead of printing
# Save to file
codemd /path/to/code -o output.md

# Exclude patterns and specific files
@@ -39,28 +44,31 @@
# Non-recursive scan with custom output
codemd /path/to/code --no-recursive -o custom.md

# Disable structure output
# Disable structure output (auto-disabled for single files)
codemd /path/to/code --no-structure

# Use specific gitignore files
codemd /path/to/code --gitignore .gitignore .custom-ignore

# Disable gitignore processing
codemd /path/to/code --ignore-gitignore

# Process single file, print output, and save to file
codemd /path/to/script.py --print -o script.md
"""


def parse_arguments() -> argparse.Namespace:
"""Parse command line arguments."""
parser = argparse.ArgumentParser(
prog='codemd',
description='Transform code repositories into markdown-formatted strings ready for LLM prompting',
description='Transform code repositories or files into markdown-formatted strings ready for LLM prompting',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=EPILOG
)

parser.add_argument('directory', type=str, help='Directory to scan')
parser.add_argument('-e', '--extensions', type=str, default='py,java,js,cpp,c,h,hpp',
parser.add_argument('path', type=str, help='File or directory to scan')
parser.add_argument('-e', '--extensions', type=str, default=None,
help='Comma-separated list of file extensions to include (without dots)')
parser.add_argument('--exclude-patterns', type=str, default='',
help='Comma-separated list of patterns to exclude (e.g., test_,debug_)')
@@ -74,6 +82,8 @@ def parse_arguments() -> argparse.Namespace:
help='Enable verbose output')
parser.add_argument('--no-structure', action='store_true',
help='Disable repository structure output')
parser.add_argument('--print', action='store_true',
help='Print the markdown output (disabled by default)')

parser.add_argument(
'--gitignore',
@@ -90,8 +100,10 @@ def parse_arguments() -> argparse.Namespace:

return parser.parse_args()
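
As a quick check of the new argument surface, the sketch below rebuilds only the options touched by this change (`path`, `-e`, `--print`) plus `-o`. It is a reduced stand-in, not the full parser: the long name `--output` is assumed from `args.output`, and `build_parser` is a hypothetical helper.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Reduced stand-in for the codemd parser (assumption: not the full option set)."""
    parser = argparse.ArgumentParser(prog="codemd")
    parser.add_argument("path", type=str, help="File or directory to scan")
    parser.add_argument("-e", "--extensions", type=str, default=None)
    parser.add_argument("-o", "--output", type=str, default=None)  # long name assumed
    parser.add_argument("--print", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["script.py", "--print", "-o", "out.md"])

# `print` is an ordinary identifier in Python 3, so `args.print` is valid attribute access.
print(args.path, args.print, args.output, args.extensions)
```

With these arguments, `args.print` is `True` and `args.extensions` stays `None` because `-e` was not passed.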


def str_to_set(s: str) -> Set[str]:
"""Convert comma-separated string to set of strings."""
if s is None: return None
return {item.strip() for item in s.split(',') if item.strip()}


@@ -203,19 +215,16 @@ def format_token_info(token_count: int, model_name: str) -> str:

def main() -> int:
print(BANNER)
print("Version 0.0.2")
print("Transform your code into LLM-ready prompts\n")
print("Version 0.0.2b")
print("Transform your code into LLM-ready prompts and automatically copy them to your clipboard!\n")

try:
args = parse_arguments()
directory = Path(args.directory)
path = Path(args.path)
output_file = Path(args.output) if args.output else None

if not directory.exists():
print(f"Error: Directory '{directory}' does not exist", file=sys.stderr)
return 1
if not directory.is_dir():
print(f"Error: '{directory}' is not a directory", file=sys.stderr)
if not path.exists():
print(f"Error: Path '{path}' does not exist", file=sys.stderr)
return 1

extensions = str_to_set(args.extensions)
@@ -230,16 +239,18 @@ def main() -> int:
ignore_gitignore=args.ignore_gitignore
)

scanner.no_structure = args.no_structure

scanner.no_structure = args.no_structure or path.is_file()

try:
content = scanner.scan_directory(
directory,
recursive=not args.no_recursive
)
if path.is_file():
content = scanner.scan_file(path)
else:
content = scanner.scan_directory(
path,
recursive=not args.no_recursive
)
except Exception as e:
print(f"Error scanning directory: {str(e)}", file=sys.stderr)
print(f"Error scanning path: {str(e)}", file=sys.stderr)
return 1

files = content.count('```') // 2
@@ -262,7 +273,8 @@ def main() -> int:
if args.verbose:
print(f"\nProcessed {files} files ({chars:,} characters)")
print(token_info + "\n")
print(content)
if args.print: # Only print content if --print flag is set
print(content)
print(token_info)
prompt_for_copy(content)
