PR2MD is a powerful command-line tool that extracts GitHub Pull Request and Issue data and converts it into comprehensive, well-formatted Markdown documents. Perfect for documentation, archiving, code reviews, or offline analysis of pull requests and issues.
- Complete PR & Issue Data Extraction: Retrieves all PR and Issue details including metadata, description, labels, and timestamps
- Full Conversation Thread: Captures all comments and discussions in chronological order
- Review Information: Includes all code reviews with approval status and reviewer comments (PRs only)
- Code Comments: Extracts inline review comments with their associated code context (PRs only)
- Change Statistics: Displays files changed, additions, deletions, and commit information (PRs only)
- Complete Diffs: Includes the full unified diff of all changes (PRs only)
- Beautiful Formatting: Generates clean, readable Markdown with proper structure and syntax highlighting
- Fast & Efficient: Uses the official GitHub REST API with proper error handling
- Type-Safe: Written in Python with comprehensive type annotations
The easiest way to install PR2MD is directly from PyPI:
pip install pr2md
That's it! The pr2md command will be available in your terminal.
Alternatively, you can install from source for development or to get the latest unreleased features:
# Clone the repository
git clone https://github.com/tboy1337/PR2MD.git
cd PR2MD
# Install dependencies
pip install -r requirements.txt
# Install the package
pip install -e .
- Python 3.12 or higher
- `requests` library (automatically installed with pip)
After installing via pip, you can immediately start using PR2MD:
# Extract a PR by URL (saves to PR-123.md)
pr2md https://github.com/owner/repo/pull/123
# Extract an Issue by URL (saves to Issue-456.md)
pr2md https://github.com/owner/repo/issues/456
# Save to a custom filename
pr2md https://github.com/owner/repo/pull/123 -o output.md
# Output to console/stdout
pr2md https://github.com/owner/repo/pull/123 -o
Extract a PR using its URL (automatically saves to PR-123.md):
pr2md https://github.com/owner/repo/pull/123
python -m pr2md https://github.com/owner/repo/pull/123
Extract an Issue using its URL (automatically saves to Issue-456.md):
pr2md https://github.com/owner/repo/issues/456
Or specify the owner, repository, type, and number separately:
pr2md owner repo pr 123
pr2md owner repo issue 456
Output the Markdown to a custom filename:
pr2md https://github.com/owner/repo/pull/123 -o pr-details.md
pr2md owner repo pr 123 --output pr-analysis.md
pr2md owner repo issue 456 --output issue-report.md
Output to stdout instead of saving to a file:
pr2md https://github.com/owner/repo/pull/123 -o
pr2md owner repo pr 123 --output
pr2md owner repo issue 456 --output
Enable detailed logging for debugging:
pr2md https://github.com/owner/repo/pull/123 -v
pr2md https://github.com/owner/repo/pull/123 --verbose
By default, PR2MD automatically scans for and downloads referenced PRs and issues mentioned in the main PR/Issue. You can configure this behavior:
# Set maximum recursion depth for downloading references (default: 2)
pr2md https://github.com/owner/repo/pull/123 --depth 3
# Download direct references only (no recursion into their references)
pr2md https://github.com/owner/repo/pull/123 --depth 0
# Disable automatic downloading of referenced PRs and issues
pr2md https://github.com/owner/repo/pull/123 --no-references
# Exit with code 2 if any referenced download fails (default: partial success is OK)
pr2md https://github.com/owner/repo/pull/123 --strict
The `--depth` option controls how many levels deep the tool will follow references. For example, with `--depth 2`, if PR #123 references PR #456, and PR #456 references PR #789, the tool will download all three PRs. With `--depth 1`, it would only download PR #123 and PR #456. With `--depth 0`, only direct references from the primary PR or issue are downloaded (no further recursion).
Note: Reference downloading only works when using the default auto-naming (omitting -o). If you specify any output filename with -o, reference downloading is automatically disabled.
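The three output modes and their effect on reference downloading can be summarized in a short sketch. This is not PR2MD's actual argument parser: the `build_parser` and `plan` names, the sentinel object, and the returned dictionary are assumptions made for illustration. Only the observable behavior (no `-o` means an auto-named file with references, a bare `-o` means stdout, `-o NAME` means a custom file, and any `-o` disables references) comes from the text above.

```python
import argparse

# Hypothetical sentinel: marks "-o was given without a filename".
_STDOUT = object()


def build_parser() -> argparse.ArgumentParser:
    """Build a small parser that mimics the documented -o/--output forms."""
    parser = argparse.ArgumentParser(prog="pr2md-sketch")
    # Either one URL, or four values: owner repo pr|issue number.
    parser.add_argument("target", nargs="+")
    # nargs="?" lets -o appear alone (const) or with a filename.
    parser.add_argument("-o", "--output", nargs="?", const=_STDOUT, default=None)
    parser.add_argument("--no-references", action="store_true")
    return parser


def plan(argv: list[str]) -> dict[str, object]:
    """Return where output goes and whether references would be followed."""
    args = build_parser().parse_args(argv)
    if args.output is None:
        mode = "auto-named file"
    elif args.output is _STDOUT:
        mode = "stdout"
    else:
        mode = f"file: {args.output}"
    # References are only followed with auto-naming and without --no-references.
    follow = args.output is None and not args.no_references
    return {"mode": mode, "follow_references": follow}


if __name__ == "__main__":
    print(plan(["https://github.com/owner/repo/pull/123"]))
    print(plan(["https://github.com/owner/repo/pull/123", "-o"]))
    print(plan(["owner", "repo", "pr", "123", "--output", "pr-analysis.md"]))
```

In this sketch the flag is expected after the target, as in the examples above; placing a bare `-o` before the URL would make `argparse` read the URL as the filename.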
Pre-built Windows executables are published on GitHub Releases alongside each tagged version. pip installs remain the recommended cross-platform option.
View all available options:
pr2md --help
pr2md --version
The generated Markdown document includes:
- PR number, title, and status (Open/Closed/Merged)
- Author information with GitHub profile link
- Creation, update, closed, and merged timestamps
- Base and head branch information with commit SHAs
- Labels (if any)
- The full PR description/body
- Number of files changed
- Line additions and deletions
- Complete unified diff of all changes
- Syntax-highlighted code blocks
- All comments from the PR discussion
- Chronologically sorted
- Author attribution and timestamps
- Links back to GitHub
- All submitted reviews
- Review state (Approved, Changes Requested, Commented, etc.)
- Review comments and timestamps
- Inline code review comments
- Grouped by file
- Includes code context (diff hunk)
- Reply chains preserved
- Issue number, title, and status (Open/Closed)
- Author information with GitHub profile link
- Creation, update, and closed timestamps
- Labels (if any)
- The full issue description/body
- All comments from the issue discussion
- Chronologically sorted
- Author attribution and timestamps
- Links back to GitHub
# Extract PR #42 from the PR2MD repository (saves to PR-42.md)
pr2md tboy1337 PR2MD pr 42
# Extract Issue #10 from the PR2MD repository (saves to Issue-10.md)
pr2md tboy1337 PR2MD issue 10
This creates files containing all the PR/Issue information in beautifully formatted Markdown documents.
If you want a custom filename:
pr2md tboy1337 PR2MD pr 42 -o pr-42-analysis.md
pr2md tboy1337 PR2MD issue 10 -o issue-10-report.md
The tool uses the GitHub REST API without authentication. GitHub imposes rate limits:
- Unauthenticated requests: 60 requests per hour per IP address
- Authenticated requests: 5,000 requests per hour (not supported by PR2MD)
When the API returns a rate-limit response, PR2MD waits and retries automatically, up to 5 waits or 3600 seconds of total wait time per run. Progress messages are logged at INFO level (for example, "Rate limited, waiting 45s..."). If that budget is exhausted, the run fails with an error.
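The wait budget described above can be pictured with a small sketch. It is not PR2MD's code: the `WaitBudget` class, the `RateLimitBudgetExceeded` exception, and the choice to fail before sleeping when a wait would exceed the budget are all assumptions for illustration. Only the two limits (5 waits, 3600 seconds in total) come from the text.

```python
import time
from dataclasses import dataclass
from typing import Callable


class RateLimitBudgetExceeded(RuntimeError):
    """Raised when another wait would exceed the per-run budget (hypothetical)."""


@dataclass
class WaitBudget:
    """Tracks rate-limit waits for one run; limits taken from the README text."""

    max_waits: int = 5
    max_total_seconds: float = 3600.0
    waits_used: int = 0
    seconds_used: float = 0.0

    def wait(self, seconds: float, sleep: Callable[[float], None] = time.sleep) -> None:
        """Sleep for `seconds` if the budget allows it, otherwise raise."""
        too_many = self.waits_used + 1 > self.max_waits
        too_long = self.seconds_used + seconds > self.max_total_seconds
        if too_many or too_long:
            raise RateLimitBudgetExceeded(
                f"budget exhausted after {self.waits_used} waits "
                f"and {self.seconds_used:.0f}s"
            )
        self.waits_used += 1
        self.seconds_used += seconds
        sleep(seconds)


if __name__ == "__main__":
    budget = WaitBudget()
    # Inject a no-op sleep so the demo finishes immediately.
    budget.wait(45, sleep=lambda s: print(f"Rate limited, waiting {s}s..."))
```

Passing the sleep function as a parameter keeps the sketch testable without real delays.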
For typical single PR or issue exports, unauthenticated access is usually sufficient. Reference downloading with --depth greater than zero consumes additional API calls. Use --no-references or lower --depth to reduce API usage.
Authentication is not implemented by design. Private repositories are not supported.
PR2MD avoids silent truncation where possible, with explicit bounds:
- Paginated data (comments, reviews, review comments) is fetched page-by-page until GitHub returns no further pages, up to a maximum of 100 pages (~10,000 items) per endpoint; exceeding that limit fails with an error
- Full diffs are always included for pull requests, regardless of size; there is no maximum export size for PRs, issues, or diffs. Tiered log messages appear at 5 MB (warning), 25 MB (info), and 100 MB (warning) so you know memory and disk use may be high. Large diffs use streaming HTTP reads with an extended read timeout (300 seconds)
- Reference downloads are unlimited in count; only `--depth` bounds recursion
- Primary exports fail without writing a file when an unrecoverable API error occurs (exit code 1)
- Reference downloads that fail are listed in stderr and appended as a `## Reference Download Summary` section in the primary markdown file; use `--strict` to exit with code 2 when any reference fails. Summary appends use streaming I/O so they work on very large output files
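The pagination bound in the list above (fetch until there are no further pages, fail past 100 pages) can be sketched as follows. This is an illustrative model, not PR2MD's implementation: `fetch_all`, `PaginationLimitError`, and the `fetch_page` callback that returns a batch plus a "has next page" flag are hypothetical names. How the real tool detects the last page is not stated in this document.

```python
from typing import Callable

MAX_PAGES = 100  # documented cap per paginated endpoint


class PaginationLimitError(RuntimeError):
    """Raised when an endpoint still has pages after the cap (hypothetical)."""


def fetch_all(
    fetch_page: Callable[[int], tuple[list[dict], bool]],
    max_pages: int = MAX_PAGES,
) -> list[dict]:
    """Collect items page by page; fail loudly instead of truncating."""
    items: list[dict] = []
    for page in range(1, max_pages + 1):
        batch, has_next = fetch_page(page)
        items.extend(batch)
        if not has_next:
            return items
    # The last allowed page still reported more data: refuse to truncate.
    raise PaginationLimitError(f"more than {max_pages} pages of results")


if __name__ == "__main__":
    # Fake endpoint with 3 pages of 2 comments each.
    def fake_page(page: int) -> tuple[list[dict], bool]:
        return [{"id": page * 10 + i} for i in range(2)], page < 3

    print(len(fetch_all(fake_page)))
```

Raising an error rather than returning a partial list matches the stated goal of avoiding silent truncation.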
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Primary extraction, write, or summary append failed (including when the PR/issue number does not exist) |
| 2 | --strict was set and one or more reference downloads failed |
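The table can be read as a simple decision rule. The sketch below restates it in Python; the `ExitCode` enum and the `exit_code` function are illustrative names and are not taken from PR2MD's source.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed in the table above."""

    SUCCESS = 0
    PRIMARY_FAILURE = 1
    STRICT_REFERENCE_FAILURE = 2


def exit_code(primary_ok: bool, failed_references: int, strict: bool) -> ExitCode:
    """Map a run's outcome to an exit code (hypothetical helper)."""
    if not primary_ok:
        # Primary extraction, write, or summary append failed.
        return ExitCode.PRIMARY_FAILURE
    if strict and failed_references > 0:
        # Reference failures only change the exit code under --strict.
        return ExitCode.STRICT_REFERENCE_FAILURE
    return ExitCode.SUCCESS


if __name__ == "__main__":
    print(int(exit_code(primary_ok=True, failed_references=2, strict=False)))
    print(int(exit_code(primary_ok=True, failed_references=2, strict=True)))
```

Without `--strict`, a run whose primary export succeeded still counts as a success even if some references failed, which is the "partial success is OK" default mentioned earlier.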
Install the package with development dependencies:
pip install -e ".[dev]"
Alternatively, install runtime and dev dependencies separately:
pip install -e .
pip install -r requirements-dev.txt
Run the local verification script (formatting, type checks, lint, security scan, tests):
py scripts/verify.py
Unit tests run by default; integration tests (live GitHub API) are excluded:
pytest # unit tests only
pytest -m integration # live API smoke tests
Tests enforce at least 90% combined coverage (see `pytest.ini` and `.coveragerc`). `py scripts/verify.py` runs the full local quality gate before release.
- Public repositories only: no GitHub token or private-repo support
- Rate limited: 60 API requests per hour without authentication; the tool waits and retries when limited, up to 5 waits or 3600 seconds total per run
- Pagination cap: at most 100 pages (~10,000 items) per paginated endpoint
- Reference downloads: unlimited in count; only `--depth` bounds recursion. A PR or issue with many `#NNN` references can consume the full hourly API budget and produce many files. Failures are reported in the output file and stderr; use `--strict` for exit code 2
- Reference shorthand parsing: `#123` and `owner/repo#123` are parsed as pull requests until download-time type correction
- Requires an internet connection to fetch data
- Large PRs with extensive diffs may generate very large Markdown files; responses are streamed with no artificial size cap. Tiered size notices are logged at 5 MB, 25 MB, and 100 MB; diff reads use a 300 second timeout
- Custom output paths (`-o path`) must stay within the current working directory; nested subdirectories are created automatically when needed
- Issues accessed via the `/issues/` URL path are treated as issues; use `/pull/` or explicit `pr` for pull requests
- Non-existent resources: if the repository or PR/issue number does not exist, the run exits immediately with code 1 and no output file is written
This project is licensed under the CRL License - see LICENSE.md for details.