CLI Commands Reference

Complete command-line interface reference for GitFlow Analytics.

🚀 Basic Usage

Default Command (Analyze)

# Simplified syntax (analyze is the default command)
gitflow-analytics [OPTIONS]

# Explicit analyze command (backward compatible)  
gitflow-analytics analyze [OPTIONS]

Examples:

# Basic analysis with configuration file
gitflow-analytics -c config.yaml

# Analyze last 8 weeks
gitflow-analytics -c config.yaml --weeks 8

# Clear cache and re-analyze
gitflow-analytics -c config.yaml --clear-cache

📋 Global Options

Required Options

-c, --config PATH - Path to YAML configuration file

Analysis Options

--weeks INTEGER - Number of weeks to analyze (default: from config)
--clear-cache - Clear analysis cache before running
--skip-identity-analysis - Skip automatic identity resolution
--validate-only - Validate configuration without running analysis

Output Options

--format [csv|json|markdown|all] - Output format(s) to generate
--output-dir PATH - Override output directory from config
--quiet - Suppress progress output
--verbose - Enable verbose logging

Utility Options

--version - Show version information
--help - Show help message and exit

🔧 Subcommands

analyze (default)

Run comprehensive repository analysis and generate reports.

gitflow-analytics analyze -c config.yaml [OPTIONS]

Options:

All global options apply
--repositories TEXT - Comma-separated list of repositories to analyze (overrides config)
--backfill-since YYYY-MM-DD - Hydrate pull_request_cache from this date forward. Bypasses the incremental fetch gate so historical PRs older than the last-processed checkpoint are fetched. Auto-triggers weekly_pr_metrics rollup for the same date range. Idempotent — safe to re-run. Does not change default behavior (#52).

Examples:

# Analyze specific repositories only
gitflow-analytics analyze -c config.yaml --repositories "repo1,repo2"

# Quick 2-week analysis with JSON output
gitflow-analytics analyze -c config.yaml --weeks 2 --format json

# Backfill all merged PRs back to a specific date
gfa analyze -c config.yaml --backfill-since 2025-01-01

fetch

Fetch data from external platforms (GitHub PRs, JIRA, ClickUp) and cache it locally.

gfa fetch -c config.yaml [OPTIONS]

Options:

-c, --config PATH - Path to YAML configuration file (required)
-w, --weeks INTEGER - Number of weeks to fetch (default: 4)
--output, -o PATH - Output directory for cache (overrides config)
--clear-cache - Clear cache before fetching data
--backfill-since YYYY-MM-DD - Hydrate pull_request_cache from this date forward. Bypasses the incremental fetch gate so historical PRs older than the last-processed checkpoint are fetched. Idempotent — safe to re-run. Does not change default behavior (#52).
--backfill-prs-since YYYY-MM-DD - Override the PR-fetch lower bound independently of --backfill-since. When set, PRs are fetched back to this date; commits still use --backfill-since. Takes priority over --backfill-since for the PR window. Idempotent. (#55)
--log [none|INFO|DEBUG] - Enable logging at the specified level (default: none)

Examples:

# Standard incremental fetch for the last 4 weeks
gfa fetch -c config.yaml

# Fetch the last 8 weeks
gfa fetch -c config.yaml --weeks 8

# Backfill all merged PRs back to a specific date
gfa fetch -c config.yaml --backfill-since 2025-01-01

# Re-run the same backfill safely (idempotent)
gfa fetch -c config.yaml --backfill-since 2025-01-01

# Backfill PRs further back than commits
gfa fetch -c config.yaml --backfill-since 2025-06-01 --backfill-prs-since 2025-01-01
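
The precedence between --weeks, --backfill-since, and --backfill-prs-since can be modelled with a short sketch. This illustrates the documented rules only and is not the tool's implementation: the function name resolve_fetch_bounds is hypothetical, and the default window (today minus --weeks) is an assumption.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_fetch_bounds(
    today: date,
    weeks: int = 4,
    backfill_since: Optional[date] = None,
    backfill_prs_since: Optional[date] = None,
) -> Tuple[date, date]:
    """Return (commit_lower_bound, pr_lower_bound) for one fetch run.

    Hypothetical helper that models the documented precedence only.
    """
    # Assumed default window: the last `weeks` weeks counted back from today.
    default_start = today - timedelta(weeks=weeks)

    # Commits use --backfill-since when given, otherwise the default window.
    commit_start = backfill_since or default_start

    # For PRs, --backfill-prs-since takes priority over --backfill-since.
    pr_start = backfill_prs_since or backfill_since or default_start
    return commit_start, pr_start


commits, prs = resolve_fetch_bounds(
    date(2025, 7, 1),
    backfill_since=date(2025, 6, 1),
    backfill_prs_since=date(2025, 1, 1),
)
print(commits, prs)
```

With both flags set, commits start at the --backfill-since date while PRs reach back to the earlier --backfill-prs-since date, which mirrors the last example above.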

classify

Run batch LLM classification on commits already in the cache (Stage 2 of the collect → classify → report pipeline).

gfa classify -c config.yaml [OPTIONS]

Options:

-c, --config PATH - Path to YAML configuration file (required)
-w, --weeks INTEGER - Number of weeks to classify (default: 4; should match the collect --weeks value)
--week YYYY-Www - Target a specific ISO week (e.g. 2026-W07). Repeatable for multiple discrete weeks. Mutually exclusive with --weeks N, --from, and --to.
--from YYYY-Www - Start of an inclusive ISO week range (e.g. --from 2026-W01). Must be paired with --to. Mutually exclusive with --weeks N and --week.
--to YYYY-Www - End of an inclusive ISO week range (e.g. --to 2026-W18). Must be paired with --from. Mutually exclusive with --weeks N and --week.
--reclassify - Re-classify commits that were already classified
--show-jira-signals - Log every commit short-circuited by the JIRA project-key mapping (issue #62). Useful for auditing which commits hit the tier-3 classification path defined in jira_project_mappings.
--validate-coverage - Warn when the fraction of classified commits falls below the coverage threshold. Exits with a non-zero code if coverage is insufficient (#65).
--coverage-threshold FLOAT - Minimum acceptable classification coverage ratio (default: 0.8). Values between 0.0 and 1.0. Used in conjunction with --validate-coverage.
--log [none|INFO|DEBUG] - Enable logging at the specified level (default: none)

Examples:

# Standard classification for the last 4 weeks
gfa classify -c config.yaml

# Force re-classification of all commits in range
gfa classify -c config.yaml --reclassify

# Classify a single ISO week
gfa classify -c config.yaml --week 2026-W07 --reclassify

# Classify an inclusive ISO week range
gfa classify -c config.yaml --from 2026-W01 --to 2026-W18 --reclassify

# Classify multiple discrete weeks (--week is repeatable)
gfa classify -c config.yaml --week 2026-W07 --week 2026-W08 --reclassify

# Audit which commits were classified via the JIRA project-key mapping
gfa classify -c config.yaml --show-jira-signals --log INFO

# Warn if fewer than 80% of commits are classified (default threshold)
gfa classify -c config.yaml --validate-coverage

# Warn if fewer than 90% of commits are classified
gfa classify -c config.yaml --validate-coverage --coverage-threshold 0.9

Prerequisite: gfa collect -c config.yaml

Next step: gfa report -c config.yaml
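
The coverage check behind --validate-coverage and --coverage-threshold can be pictured as follows. This is a minimal sketch under stated assumptions, not the tool's code: the function names are hypothetical, an empty commit range is assumed to count as fully covered, and the exit code 1 is a placeholder because the actual non-zero value is not documented here.

```python
import sys


def coverage_ratio(classified: int, total: int) -> float:
    """Fraction of commits that received a classification."""
    # Assumption: an empty range counts as fully covered.
    return 1.0 if total == 0 else classified / total


def check_coverage(classified: int, total: int, threshold: float = 0.8) -> int:
    """Return 0 when coverage meets the threshold, otherwise a non-zero code."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    ratio = coverage_ratio(classified, total)
    if ratio < threshold:
        print(
            f"WARNING: coverage {ratio:.1%} is below {threshold:.1%}",
            file=sys.stderr,
        )
        return 1  # placeholder; the real exit code is not specified above
    return 0


# 85 of 100 commits classified: passes at 0.8, fails at 0.9.
print(check_coverage(85, 100), check_coverage(85, 100, threshold=0.9))
```

Coverage exactly equal to the threshold passes, because the documented condition is "falls below".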

collect

Collect commits and pull request data from GitHub into the local cache database (Stage 1 of the collect → classify → report pipeline).

gfa collect -c config.yaml [OPTIONS]

Options:

-c, --config PATH - Path to YAML configuration file (required)
-w, --weeks INTEGER - Number of weeks to collect (default: 4)
--week YYYY-Www - Target a specific ISO week (e.g. 2026-W07). Repeatable for multiple discrete weeks. Mutually exclusive with --weeks N, --from, and --to.
--from YYYY-Www - Start of an inclusive ISO week range (e.g. --from 2026-W01). Must be paired with --to. Mutually exclusive with --weeks N and --week.
--to YYYY-Www - End of an inclusive ISO week range (e.g. --to 2026-W04). Must be paired with --from. Mutually exclusive with --weeks N and --week.
--log [none|INFO|DEBUG] - Enable logging at the specified level (default: none)

Examples:

# Standard incremental collect for the last 4 weeks
gfa collect -c config.yaml

# Collect a single ISO week
gfa collect -c config.yaml --week 2026-W07

# Collect multiple discrete weeks
gfa collect -c config.yaml --week 2026-W07 --week 2026-W08

Next step: gfa classify -c config.yaml
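
To see which calendar dates a --week, --from, or --to value covers, the YYYY-Www labels can be expanded with Python's standard library. The helper names below are hypothetical and the sketch only illustrates ISO 8601 week arithmetic; it does not claim to reproduce how the CLI parses these options.

```python
from datetime import date, timedelta
from typing import List, Tuple


def parse_iso_week(label: str) -> Tuple[int, int]:
    """Parse 'YYYY-Www' (e.g. '2026-W07') into (iso_year, iso_week)."""
    year_part, week_part = label.split("-W")
    return int(year_part), int(week_part)


def week_bounds(label: str) -> Tuple[date, date]:
    """Return the Monday and Sunday of an ISO week."""
    year, week = parse_iso_week(label)
    monday = date.fromisocalendar(year, week, 1)  # requires Python 3.8+
    return monday, monday + timedelta(days=6)


def expand_range(start: str, end: str) -> List[str]:
    """List every ISO week label from start to end, inclusive."""
    current, _ = week_bounds(start)
    last, _ = week_bounds(end)
    if current > last:
        raise ValueError("--from must not be later than --to")
    labels = []
    while current <= last:
        iso_year, iso_week, _ = current.isocalendar()
        labels.append(f"{iso_year}-W{iso_week:02d}")
        current += timedelta(weeks=1)
    return labels


print(week_bounds("2026-W07"))
print(expand_range("2026-W01", "2026-W04"))
```

ISO week 1 can begin in the previous calendar year (2026-W01 starts on Monday 2025-12-29), which is worth keeping in mind when comparing week-based runs against date-based flags such as --backfill-since.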

report

Generate reports from classified commit data already in the local cache (Stage 3 of the collect → classify → report pipeline).

gfa report -c config.yaml [OPTIONS]

Options:

-c, --config PATH - Path to YAML configuration file (required)
-w, --weeks INTEGER - Number of weeks to include in the report (default: 4)
--week YYYY-Www - Target a specific ISO week (e.g. 2026-W07). Repeatable for multiple discrete weeks. Mutually exclusive with --weeks N, --from, and --to.
--from YYYY-Www - Start of an inclusive ISO week range (e.g. --from 2026-W01). Must be paired with --to. Mutually exclusive with --weeks N and --week.
--to YYYY-Www - End of an inclusive ISO week range (e.g. --to 2026-W04). Must be paired with --from. Mutually exclusive with --weeks N and --week.
--format [csv|json|markdown|all] - Output format(s) to generate
--output-dir PATH - Override output directory from config
--log [none|INFO|DEBUG] - Enable logging at the specified level (default: none)

Examples:

# Generate a report for the last 4 weeks
gfa report -c config.yaml

# Report for a single ISO week
gfa report -c config.yaml --week 2026-W07

# Report for an inclusive ISO week range
gfa report -c config.yaml --from 2026-W01 --to 2026-W04

Prerequisite: gfa classify -c config.yaml

identities

Manage developer identity resolution and consolidation.

gitflow-analytics identities -c config.yaml [OPTIONS]

Options:

--interactive - Interactive identity resolution mode
--auto-approve - Automatically approve suggested identity mappings
--export PATH - Export identity mappings to YAML file
--import PATH - Import identity mappings from YAML file

Examples:

# Run interactive identity analysis
gitflow-analytics identities -c config.yaml --interactive

# Export current identity mappings
gitflow-analytics identities -c config.yaml --export identity-mappings.yaml

validate

Validate configuration files and system setup.

gitflow-analytics validate -c config.yaml [OPTIONS]

Options:

--check-tokens - Validate GitHub API tokens and permissions
--check-repos - Verify repository access and cloning
--check-ml - Validate ML model availability and setup

Examples:

# Comprehensive validation
gitflow-analytics validate -c config.yaml --check-tokens --check-repos --check-ml

# Quick config validation only
gitflow-analytics validate -c config.yaml

cache

Manage analysis cache and performance optimization.

gitflow-analytics cache [SUBCOMMAND] [OPTIONS]

Subcommands:

clear - Clear all cache databases
status - Show cache statistics and disk usage
optimize - Optimize cache databases (VACUUM)

Examples:

# Clear all caches
gitflow-analytics cache clear

# Show cache status
gitflow-analytics cache status

# Optimize cache performance
gitflow-analytics cache optimize

alias-rename

Rename a developer's canonical display name in manual mappings.

gitflow-analytics alias-rename -c config.yaml \
  --old-name "Current Name" \
  --new-name "New Name" \
  [OPTIONS]

Required Options:

--old-name TEXT - Current canonical name to rename (must exist in manual_mappings)
--new-name TEXT - New canonical display name to use in reports

Optional Flags:

--update-cache - Update cached database records with the new name
--dry-run - Show what would be changed without applying changes

Examples:

# Preview changes with dry-run
gitflow-analytics alias-rename -c config.yaml \
  --old-name "bianco-zaelot" \
  --new-name "Emiliozzo Bianco" \
  --dry-run

# Apply rename to config file only
gitflow-analytics alias-rename -c config.yaml \
  --old-name "bianco-zaelot" \
  --new-name "Emiliozzo Bianco"

# Update both config and database cache
gitflow-analytics alias-rename -c config.yaml \
  --old-name "bianco-zaelot" \
  --new-name "Emiliozzo Bianco" \
  --update-cache

What It Does:

Searches analysis.identity.manual_mappings for the old name
Updates the name field to the new name
Preserves all other fields (primary_email, aliases)
Optionally updates developer_identities and developer_aliases tables
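
The config-side effect of the steps above can be sketched on an in-memory copy of manual_mappings. This is an illustration, not the command's implementation: the function name rename_canonical is hypothetical, the email address is made up, and the optional database update is not modelled.

```python
import copy
from typing import Any, Dict, List


def rename_canonical(
    mappings: List[Dict[str, Any]], old_name: str, new_name: str
) -> List[Dict[str, Any]]:
    """Return a copy of manual_mappings with one canonical name changed."""
    updated = copy.deepcopy(mappings)  # leave the caller's data untouched
    matches = [entry for entry in updated if entry.get("name") == old_name]
    if not matches:
        raise KeyError(f"{old_name!r} not found in manual_mappings")
    for entry in matches:
        entry["name"] = new_name  # primary_email and aliases are preserved
    return updated


# Hypothetical mapping entry, for illustration only.
before = [
    {
        "name": "bianco-zaelot",
        "primary_email": "dev@example.com",
        "aliases": ["dev@users.noreply.example.com"],
    }
]
after = rename_canonical(before, "bianco-zaelot", "Emiliozzo Bianco")
print(after[0]["name"], after[0]["aliases"])
```

Returning a modified copy rather than editing in place is one way to support a --dry-run style preview: the two structures can be compared before anything is written.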

Use Cases:

Fix typos in developer names
Use preferred names or nicknames
Update names after marriage or legal name changes
Standardize name formatting across the team

Notes:

Without --update-cache, the old name persists in cached data until the next analysis
Always test with --dry-run first to preview changes
See Managing Aliases Guide for detailed usage

add-alias

Add alias mappings to a configuration file non-interactively. Suitable for scripting and CI pipelines where interactive prompts are not available.

gfa add-alias -c config.yaml \
  --canonical "[email protected]" \
  --alias "[email protected]" \
  --alias "Dev Name" \
  [OPTIONS]

Required Options:

-c, --config PATH - Path to YAML configuration file

Mapping Options (--canonical and --from-file are mutually exclusive; use one or the other):

--canonical EMAIL - Primary/canonical email for this developer identity; combine with one or more --alias flags
--alias EMAIL_OR_NAME - Email address or display name to map to --canonical; repeatable
--from-file PATH - YAML or JSON file containing batch alias mappings (cannot be combined with --canonical)

Behaviour Flags:

--dry-run - Show what would be changed without writing to the config file
--apply - Trigger identity re-resolution after updating the config

Examples:

# Map a personal email and display name to a canonical work email
gfa add-alias -c config.yaml \
  --canonical "[email protected]" \
  --alias "[email protected]" \
  --alias "Alice Smith"

# Preview changes before writing
gfa add-alias -c config.yaml \
  --canonical "[email protected]" \
  --alias "[email protected]" \
  --dry-run

# Load batch mappings from a YAML file and re-resolve identities
gfa add-alias -c config.yaml \
  --from-file aliases.yaml \
  --apply

# Load batch mappings from a JSON file
gfa add-alias -c config.yaml \
  --from-file aliases.json

Supported --from-file Formats:

GFA native YAML — a config file with a developer_aliases: key:

developer_aliases:
  - canonical: "[email protected]"
    aliases: ["[email protected]", "Alice Smith"]

Flat YAML list — a list of {canonical, aliases} objects:

- canonical: "[email protected]"
  aliases:
    - [email protected]
    - Alice Smith
- canonical: "[email protected]"
  aliases:
    - [email protected]

JSON array — equivalent structure in JSON:

[
  {"canonical": "[email protected]", "aliases": ["[email protected]", "Alice Smith"]},
  {"canonical": "[email protected]",   "aliases": ["[email protected]"]}
]
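
Once parsed, the three formats reduce to two shapes: a mapping with a developer_aliases key, or a bare list. A loader could normalise both into one list, as in the sketch below. The function name normalize_mappings is hypothetical, the addresses are made up, and only JSON parsing is shown so that the example stays within the standard library; YAML input would need a YAML parser first.

```python
import json
from typing import Any, Dict, List


def normalize_mappings(data: Any) -> List[Dict[str, Any]]:
    """Accept either documented shape and return a flat list of mappings."""
    # Native shape: a mapping with a top-level developer_aliases key.
    if isinstance(data, dict):
        data = data.get("developer_aliases", [])
    if not isinstance(data, list):
        raise ValueError("expected a list of {canonical, aliases} objects")
    normalized = []
    for item in data:
        normalized.append(
            {
                "canonical": item["canonical"],
                "aliases": list(item.get("aliases", [])),
            }
        )
    return normalized


# Hypothetical addresses, used only for illustration.
flat = json.loads(
    '[{"canonical": "alice@example.com", "aliases": ["Alice Smith"]}]'
)
native = {"developer_aliases": flat}
print(normalize_mappings(flat) == normalize_mappings(native))
```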

What It Does:

Reads the existing analysis.identity.manual_mappings (or developer_aliases) section in the config
Merges new aliases into the matching canonical entry, or creates a new entry if the canonical is not yet present
Skips duplicates — existing aliases are never written twice (idempotent)
Writes the updated config back to disk (unless --dry-run is specified)
Optionally triggers identity re-resolution via --apply
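
The merge-and-skip-duplicates behaviour described above is what makes the command idempotent. A minimal sketch of that logic follows; the function name merge_aliases, the in-memory list standing in for the config section, and the example addresses are all assumptions made for illustration.

```python
from typing import Any, Dict, List


def merge_aliases(
    mappings: List[Dict[str, Any]], canonical: str, new_aliases: List[str]
) -> bool:
    """Merge aliases into the entry for `canonical`.

    Returns True when the mappings changed, False when nothing was added.
    """
    entry = next((m for m in mappings if m["canonical"] == canonical), None)
    changed = False
    if entry is None:
        # Canonical not present yet: create a new entry.
        entry = {"canonical": canonical, "aliases": []}
        mappings.append(entry)
        changed = True
    for alias in new_aliases:
        if alias not in entry["aliases"]:  # skip duplicates
            entry["aliases"].append(alias)
            changed = True
    return changed


# Hypothetical addresses, for illustration only.
config_section: List[Dict[str, Any]] = []
first = merge_aliases(config_section, "alice@example.com", ["Alice Smith"])
second = merge_aliases(config_section, "alice@example.com", ["Alice Smith"])
print(first, second, config_section)
```

The second call reports no change, so a caller can skip writing the file, which is also a natural hook for --dry-run reporting.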

Use Cases:

Onboarding automation: script alias setup as part of repo initialisation
CI pipelines: keep alias mappings in a separate file and apply them on deploy
Bulk imports: migrate alias lists from another tool's export format
Safe updates: use --dry-run to audit changes before committing them

Notes:

--from-file and --canonical are mutually exclusive; combining them is an error
The operation is idempotent: running the same command twice produces the same config
Always verify with --dry-run before running in unattended automation
See Managing Aliases Guide for detailed identity management guidance

backfill-ticket-ids

Populate NULL ticket_ids and commit_count values on existing cached pull requests using commit messages already stored in cached_commits. No GitHub API calls are made — all data is sourced from the local cache database. The operation is idempotent and safe to re-run.

gfa backfill-ticket-ids -c config.yaml

Options:

-c, --config PATH - Path to YAML configuration file (required)

Examples:

# Backfill ticket IDs and commit counts for all cached PRs
gfa backfill-ticket-ids -c config.yaml

What It Does:

Queries pull_request_cache for rows where ticket_ids IS NULL or commit_count IS NULL
For each such PR, joins against cached_commits using commit hashes already in the local database
Extracts JIRA-style ticket IDs matching [A-Z]+-\d+ from each commit message and deduplicates them
Counts commit hashes to produce commit_count
Writes ticket_ids (JSON array, e.g. ["DUE-1234", "CORE-567"]) and commit_count back to the row
Makes no outbound GitHub API calls — entirely offline
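
The extraction and counting steps can be illustrated with the pattern quoted above. This is a sketch only: the function names are hypothetical, the commit messages are invented, and the database reads and writes are left out.

```python
import json
import re
from typing import Iterable, List, Tuple

# Pattern as quoted in the step list above.
TICKET_RE = re.compile(r"[A-Z]+-\d+")


def extract_ticket_ids(messages: Iterable[str]) -> List[str]:
    """Collect ticket IDs from commit messages, deduplicated in first-seen order."""
    seen: List[str] = []
    for message in messages:
        for ticket in TICKET_RE.findall(message):
            if ticket not in seen:
                seen.append(ticket)
    return seen


def summarize_pr(messages: List[str]) -> Tuple[str, int]:
    """Return (ticket_ids as a JSON array string, commit_count) for one PR."""
    return json.dumps(extract_ticket_ids(messages)), len(messages)


# Invented commit messages for one pull request.
messages = [
    "DUE-1234: fix login redirect",
    "CORE-567 refactor session store (follows DUE-1234)",
    "chore: bump dependencies",
]
print(summarize_pr(messages))
```

An unanchored pattern like this also matches strings such as UTF-8, so a stricter variant (for example, restricted to known project keys) may be preferable where false positives matter.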

Use Cases:

Upgrading an existing installation to v3.14.22 and enriching previously cached PRs
Recovering ticket_ids after a schema migration
Re-running after updating the ticket-ID extraction regex without re-fetching PRs

Notes:

PRs whose ticket_ids and commit_count are already populated are skipped
Commit messages must already be present in cached_commits; PRs with no associated cached commits will have ticket_ids set to [] and commit_count set to 0
See Cache System Reference for the full pull_request_cache schema

override

Manage manual classification overrides for individual commits. Overrides take precedence over all automated classifiers (LLM, rule-based, JIRA mapping).

gfa override <subcommand> [OPTIONS]

Subcommands:

override set

Manually assign a change_type classification to a specific commit.

gfa override set <commit_sha> <change_type> -c config.yaml

Arguments:

commit_sha - Full or abbreviated commit SHA to override
change_type - Classification to apply (e.g. feature, maintenance, bugfix, analytics)

Examples:

# Override a commit to be classified as a feature
gfa override set abc1234 feature -c config.yaml

# Override a commit to be classified as maintenance
gfa override set def5678 maintenance -c config.yaml

override list

List all manual classification overrides currently stored in the cache database.

gfa override list -c config.yaml

Options:

-c, --config PATH - Path to YAML configuration file (required)

Examples:

# Show all active overrides
gfa override list -c config.yaml

override remove

Remove a manual classification override, returning the commit to automated classification on the next gfa classify run.

gfa override remove <commit_sha> -c config.yaml

Arguments:

commit_sha - Full or abbreviated commit SHA whose override should be removed

Examples:

# Remove an override and allow automated re-classification
gfa override remove abc1234 -c config.yaml

What It Does (all override subcommands):

Reads/writes the classification_overrides table in the local cache database
On the next gfa classify run, overrides take the highest precedence — the LLM call for overridden commits is skipped entirely
override remove deletes the row; the commit reverts to standard classification on the next classify pass
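
The precedence rule can be sketched as a lookup that runs before any automated classifier. The names below are hypothetical, a dictionary stands in for the classification_overrides table, and prefix matching for abbreviated SHAs is an assumption about how abbreviated input might be resolved.

```python
from typing import Callable, Dict, List


def resolve_change_type(
    sha: str,
    overrides: Dict[str, str],
    automated: Callable[[str], str],
) -> str:
    """Return the manual override for `sha` if present, else classify automatically."""
    for stored_sha, change_type in overrides.items():
        # Assumption: abbreviated SHAs match by prefix in either direction.
        if stored_sha.startswith(sha) or sha.startswith(stored_sha):
            return change_type  # the automated classifier is never called
    return automated(sha)


calls: List[str] = []


def fake_llm(sha: str) -> str:
    """Stand-in for an automated classifier; records each invocation."""
    calls.append(sha)
    return "bugfix"


overrides = {"abc1234": "feature"}
print(resolve_change_type("abc1234def", overrides, fake_llm))
print(resolve_change_type("def5678", overrides, fake_llm))
print(calls)
```

Only the commit without an override reaches the stand-in classifier, which is the cost saving the text describes for overridden commits.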

backfill-ai-detection

Backfill AI-detection results for commits already stored in cached_commits. Scans commit footers and trailers for known AI-tool signatures (Cursor, Claude, GitHub Copilot) and writes results to the cache database. No GitHub API calls are made — all data is sourced from cached commit messages.

gfa backfill-ai-detection -c config.yaml

Options:

-c, --config PATH - Path to YAML configuration file (required)

Examples:

# Detect AI footers for all cached commits
gfa backfill-ai-detection -c config.yaml

What It Does:

Queries cached_commits for all rows
Scans each commit message/body for AI-tool trailer patterns:
- Co-authored-by: Claude (Anthropic Claude)
- Co-authored-by: GitHub Copilot (GitHub Copilot)
- Cursor-specific commit footers
Writes is_ai_assisted flag and detected tool name back to cached_commits
Operation is idempotent — safe to re-run after adding new AI-tool signatures
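
Trailer scanning of this kind can be approximated with a few regular expressions. The sketch below is illustrative only: the function name is hypothetical, the Cursor pattern is a guess because the exact footer format is not given above, and the real signature list may differ.

```python
import re
from typing import Optional

_FLAGS = re.IGNORECASE | re.MULTILINE

# (tool name, trailer pattern). The Cursor entry is an assumption.
AI_SIGNATURES = [
    ("Claude", re.compile(r"^Co-authored-by:\s*Claude\b", _FLAGS)),
    ("GitHub Copilot", re.compile(r"^Co-authored-by:\s*GitHub Copilot\b", _FLAGS)),
    ("Cursor", re.compile(r"^Co-authored-by:\s*Cursor\b", _FLAGS)),
]


def detect_ai_tool(message: str) -> Optional[str]:
    """Return the first AI tool whose trailer appears in the message, else None."""
    for tool, pattern in AI_SIGNATURES:
        if pattern.search(message):
            return tool
    return None


# Invented commit message for illustration.
message = "Fix parser edge case\n\nCo-authored-by: Claude <noreply@example.com>"
tool = detect_ai_tool(message)
print({"is_ai_assisted": tool is not None, "tool": tool})
```

Because the result depends only on the stored message and the signature list, re-running the scan after adding a pattern updates results without side effects, consistent with the idempotency noted above.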

Use Cases:

Upgrading an existing installation and enriching previously cached commits
Adding AI-tool detection to a repository after the fact
Re-running after adding new AI-tool signature patterns

📊 Output Formats

CSV Format (`--format csv`)

Generates structured data files:

weekly_metrics_YYYYMMDD.csv - Weekly development metrics
developers_YYYYMMDD.csv - Developer profiles and statistics
summary_YYYYMMDD.csv - Project-wide summary statistics
untracked_commits_YYYYMMDD.csv - Commits without ticket references

JSON Format (`--format json`)

Generates comprehensive data export:

comprehensive_export_YYYYMMDD.json - Complete analysis data

Markdown Format (`--format markdown`)

Generates human-readable reports:

narrative_report_YYYYMMDD.md - Executive summary with insights

All Formats (`--format all`)

Generates all available output formats.

🚨 Exit Codes

GitFlow Analytics uses standard exit codes:

0: Success - Analysis completed successfully
1: General error - Configuration or processing error
2: Configuration error - Invalid YAML or missing required fields
3: Authentication error - Invalid or missing GitHub token
4: Repository error - Repository access or cloning failed
5: Analysis error - Analysis processing failed
6: Output error - Report generation failed
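
For automation, the codes above can be turned into readable messages by a small wrapper. This is a sketch under stated assumptions: the function names are hypothetical, and the wrapper assumes gitflow-analytics is installed and on PATH.

```python
import subprocess
from typing import List

# Exit codes as listed above.
EXIT_CODES = {
    0: "Success",
    1: "General error",
    2: "Configuration error",
    3: "Authentication error",
    4: "Repository error",
    5: "Analysis error",
    6: "Output error",
}


def describe_exit_code(code: int) -> str:
    """Map an exit code to its documented meaning."""
    return EXIT_CODES.get(code, f"Unknown exit code {code}")


def run_analysis(args: List[str]) -> str:
    """Run gitflow-analytics with the given arguments and describe the result.

    Assumes the gitflow-analytics executable is available on PATH.
    """
    completed = subprocess.run(["gitflow-analytics", *args], check=False)
    return describe_exit_code(completed.returncode)


print(describe_exit_code(3))
```

A CI job could call run_analysis(["-c", "config.yaml", "--weeks", "4", "--format", "json", "--quiet"]) and branch on the message, for example retrying only on repository errors.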

🔍 Environment Variables

GitFlow Analytics recognizes these environment variables:

Authentication

GITHUB_TOKEN - GitHub personal access token
JIRA_ACCESS_USER - JIRA username for API access
JIRA_ACCESS_TOKEN - JIRA API token or password

Configuration

GITFLOW_CONFIG - Default configuration file path
GITFLOW_CACHE_DIR - Override default cache directory
GITFLOW_LOG_LEVEL - Set logging level (DEBUG, INFO, WARNING, ERROR)

Performance

GITFLOW_MAX_WORKERS - Maximum parallel processing workers
GITFLOW_BATCH_SIZE - Commit processing batch size
GITFLOW_TIMEOUT - Network request timeout in seconds
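
Before an unattended run, a short pre-flight check can confirm that the expected variables are present and plausible. The sketch below is an assumption-laden illustration: the function name preflight is hypothetical, treating GITHUB_TOKEN as required is an assumption, and so is requiring the worker and batch-size values to be positive integers.

```python
import os
from typing import List, Mapping

LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}
INT_VARS = ["GITFLOW_MAX_WORKERS", "GITFLOW_BATCH_SIZE"]


def _is_positive_int(value: str) -> bool:
    try:
        return int(value) > 0
    except ValueError:
        return False


def preflight(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems found in the environment (empty when clean)."""
    problems = []
    # Assumption: a GitHub token is needed for any run that fetches PR data.
    if not env.get("GITHUB_TOKEN"):
        problems.append("GITHUB_TOKEN is not set")
    for name in INT_VARS:
        value = env.get(name)
        if value is not None and not _is_positive_int(value):
            problems.append(f"{name} must be a positive integer, got {value!r}")
    level = env.get("GITFLOW_LOG_LEVEL")
    if level is not None and level not in LOG_LEVELS:
        problems.append(f"GITFLOW_LOG_LEVEL must be one of {sorted(LOG_LEVELS)}")
    return problems


for problem in preflight(os.environ):
    print(problem)
```

Passing the environment in as a mapping keeps the check easy to test and avoids printing secret values such as the token itself.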

💡 Usage Patterns

Daily Team Health Check

# Quick 1-week analysis for daily standup insights
gitflow-analytics -c config.yaml --weeks 1 --format markdown --quiet

Weekly Sprint Review

# 2-week analysis with comprehensive data
gitflow-analytics -c config.yaml --weeks 2 --format all

Monthly Planning Analysis

# 4-week analysis with cache clearing for fresh data
gitflow-analytics -c config.yaml --weeks 4 --clear-cache --format all

Quarterly Strategic Review

# 12-week comprehensive analysis
gitflow-analytics -c config.yaml --weeks 12 --format all --verbose

CI/CD Integration

# Automated analysis with JSON export for dashboard integration
gitflow-analytics -c config.yaml --weeks 4 --format json --quiet

🔧 Advanced Usage

Configuration Override

# Override output directory
gitflow-analytics -c config.yaml --output-dir /custom/reports/

# Analyze subset of repositories
gitflow-analytics -c config.yaml --repositories "critical-repo,main-app"

Performance Optimization

# Use cached analysis for faster reporting
gitflow-analytics -c config.yaml --weeks 8

# Clear cache for fresh analysis (slower but current)
gitflow-analytics -c config.yaml --weeks 8 --clear-cache

Debugging and Troubleshooting

# Verbose output for debugging
gitflow-analytics -c config.yaml --verbose

# Validate configuration before running
gitflow-analytics validate -c config.yaml --check-tokens --check-repos

# Test configuration without full analysis
gitflow-analytics -c config.yaml --validate-only

🆘 Common Issues

"Command not found"

# Ensure GitFlow Analytics is installed and in PATH
pip show gitflow-analytics
which gitflow-analytics

# Install if missing
pip install gitflow-analytics

"Configuration file not found"

# Provide absolute path to configuration
gitflow-analytics -c /full/path/to/config.yaml

# Check current directory for config file
ls -la *.yaml

"GitHub API rate limit exceeded"

# Check token is set correctly
echo $GITHUB_TOKEN

# Validate token has necessary permissions
gitflow-analytics validate -c config.yaml --check-tokens

"Repository not found or access denied"

# Verify repository names and permissions
gitflow-analytics validate -c config.yaml --check-repos

# Check that the GitHub token has access to the repositories
gitflow-analytics validate -c config.yaml --check-tokens

📚 Related Documentation

Configuration Guide - Complete YAML configuration reference
Getting Started - Installation and first steps
Troubleshooting - Common issues and solutions
Examples - Real-world usage scenarios

🔄 Command History and Aliases

Useful Shell Aliases

# Add to your .bashrc or .zshrc
alias gfa='gitflow-analytics'
alias gfa-weekly='gitflow-analytics -c config.yaml --weeks 1'
alias gfa-monthly='gitflow-analytics -c config.yaml --weeks 4 --clear-cache'
alias gfa-validate='gitflow-analytics validate -c config.yaml --check-tokens --check-repos --check-ml'

Bash Completion

GitFlow Analytics supports bash completion for commands and options:

# Enable bash completion (if supported)
eval "$(_GITFLOW_ANALYTICS_COMPLETE=bash_source gitflow-analytics)"
