
pdax-leo/pdax-tools


GitHub Commit Analyzer

A Go-based tool that analyzes and exports commit data from GitHub organization repositories for specific team members. This tool helps track team contributions across multiple repositories by generating detailed CSV reports of commits.

Features

  • Organization-wide Analysis: Scans all private repositories in a GitHub organization
  • Team Member Filtering: Extracts commits only from specified team members
  • Date Range Filtering: Supports filtering commits from a specific date onwards
  • Concurrent Processing: Uses configurable concurrency for efficient API usage
  • CSV Export: Generates detailed CSV reports with commit information
  • Rate Limit Handling: Automatically handles GitHub API rate limits
  • Error Resilience: Gracefully handles missing repositories and access issues

Prerequisites

  • Go 1.24.2 or later
  • GitHub Personal Access Token with appropriate permissions
  • Access to the target GitHub organization

Installation

  1. Clone this repository:
git clone <repository-url>
cd tools
  2. Install dependencies:
go mod download

Usage

Basic Usage

go run main.go -token=<your_github_token> -org=<organization_name> -members=<comma_separated_usernames>

Full Command Line Options

go run main.go [OPTIONS]

Required Parameters

  • -token: GitHub fine-grained personal access token
  • -org: GitHub organization name
  • -members: Comma-separated list of GitHub usernames to analyze

Optional Parameters

  • -since: Get commits since this date (format: YYYY-MM-DD, default: all commits)
  • -output: Output CSV file name (default: "commits.csv")
  • -concurrency: Number of repositories to process concurrently (default: 10)
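
For illustration, the options above could be parsed with Go's standard flag package roughly as follows. This is a minimal sketch, not the tool's actual main.go: the config struct and the parseArgs function are hypothetical names, and only the flag names and defaults come from this README.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
	"strings"
	"time"
)

// config is a hypothetical container for the options listed in this README.
type config struct {
	Token       string
	Org         string
	Members     []string
	Since       time.Time // zero value means "all commits"
	Output      string
	Concurrency int
}

// parseArgs parses args (without the program name) into a config and
// rejects invocations that omit a required parameter.
func parseArgs(args []string) (config, error) {
	fs := flag.NewFlagSet("commit-analyzer", flag.ContinueOnError)
	token := fs.String("token", "", "GitHub fine-grained personal access token")
	org := fs.String("org", "", "GitHub organization name")
	members := fs.String("members", "", "comma-separated GitHub usernames")
	since := fs.String("since", "", "get commits since this date (YYYY-MM-DD)")
	output := fs.String("output", "commits.csv", "output CSV file name")
	concurrency := fs.Int("concurrency", 10, "repositories processed concurrently")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if *token == "" || *org == "" || *members == "" {
		return config{}, errors.New("-token, -org and -members are required")
	}
	if *concurrency < 1 {
		return config{}, errors.New("-concurrency must be at least 1")
	}
	cfg := config{Token: *token, Org: *org, Output: *output, Concurrency: *concurrency}
	// Split the comma-separated list and drop empty entries.
	for _, m := range strings.Split(*members, ",") {
		if m = strings.TrimSpace(m); m != "" {
			cfg.Members = append(cfg.Members, m)
		}
	}
	if *since != "" {
		// "2006-01-02" is Go's reference layout for YYYY-MM-DD.
		t, err := time.Parse("2006-01-02", *since)
		if err != nil {
			return config{}, fmt.Errorf("invalid -since date %q: %w", *since, err)
		}
		cfg.Since = t
	}
	return cfg, nil
}

func main() {
	// Fixed example arguments; a real tool would pass os.Args[1:] instead.
	cfg, err := parseArgs([]string{
		"-token=github_pat_xxxxxxxxxxxxx",
		"-org=MyCompany",
		"-members=john.doe,jane.smith",
		"-since=2024-01-01",
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Printf("org=%s members=%v output=%s concurrency=%d\n",
		cfg.Org, cfg.Members, cfg.Output, cfg.Concurrency)
}
```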

Example

go run main.go \
  -token=github_pat_xxxxxxxxxxxxx \
  -org=MyCompany \
  -members=john.doe,jane.smith,bob.wilson \
  -since=2024-01-01 \
  -output=team_commits_2024.csv \
  -concurrency=5

Output Format

The tool generates a CSV file with the following columns:

| Column     | Description                                |
|------------|--------------------------------------------|
| Repository | Name of the repository                     |
| Commit     | Full SHA hash of the commit                |
| Author     | GitHub username of the commit author       |
| Message    | Commit message (cleaned of newlines)       |
| Date       | Commit date in YYYY-MM-DD HH:MM:SS format  |
| URL        | Direct link to the commit on GitHub        |
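
As a rough sketch of how rows in this format might be produced with encoding/csv: the commitRow type and the helper names below are hypothetical, and collapsing all whitespace in the message is an assumption about what "cleaned of newlines" means in the real tool.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"os"
	"strings"
	"time"
)

// commitRow is a hypothetical record with one field per CSV column.
type commitRow struct {
	Repository, SHA, Author, Message, URL string
	Date                                  time.Time
}

// cleanMessage collapses newlines and other whitespace runs into single
// spaces so that each commit occupies exactly one CSV line.
func cleanMessage(msg string) string {
	return strings.Join(strings.Fields(msg), " ")
}

// formatRecord converts a row to the column order documented above.
func formatRecord(r commitRow) []string {
	return []string{
		r.Repository,
		r.SHA,
		r.Author,
		cleanMessage(r.Message),
		r.Date.Format("2006-01-02 15:04:05"), // YYYY-MM-DD HH:MM:SS
		r.URL,
	}
}

// writeCSV writes a header line followed by one line per commit.
func writeCSV(w io.Writer, rows []commitRow) error {
	cw := csv.NewWriter(w)
	header := []string{"Repository", "Commit", "Author", "Message", "Date", "URL"}
	if err := cw.Write(header); err != nil {
		return err
	}
	for _, r := range rows {
		if err := cw.Write(formatRecord(r)); err != nil {
			return err
		}
	}
	cw.Flush()
	return cw.Error()
}

func main() {
	// Sample data, invented for the example.
	rows := []commitRow{{
		Repository: "example-repo",
		SHA:        "0123456789abcdef0123456789abcdef01234567",
		Author:     "john.doe",
		Message:    "Fix login bug\n\nHandles expired sessions.",
		Date:       time.Date(2024, 1, 15, 9, 30, 0, 0, time.UTC),
		URL:        "https://github.com/MyCompany/example-repo/commit/0123456789abcdef0123456789abcdef01234567",
	}}
	if err := writeCSV(os.Stdout, rows); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```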

GitHub Token Setup

  1. Go to GitHub Settings → Developer settings → Personal access tokens → Fine-grained tokens
  2. Create a new token with the following permissions for your organization:
    • Repository access: All repositories (or specific repositories)
    • Repository permissions:
      • Contents: Read
      • Metadata: Read
      • Pull requests: Read (if analyzing PR commits)
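
To show where the token ends up, here is a small sketch that builds (but does not send) a raw REST request for an organization's private repositories. The tool itself uses the go-github client with oauth2 rather than hand-built requests, so treat this purely as an illustration; newListReposRequest is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// newListReposRequest builds a GET request for the private repositories of
// an organization, authenticated with a personal access token.
func newListReposRequest(token, org string) (*http.Request, error) {
	u := "https://api.github.com/orgs/" + url.PathEscape(org) + "/repos?type=private&per_page=100"
	req, err := http.NewRequest(http.MethodGet, u, nil)
	if err != nil {
		return nil, err
	}
	// The token is sent as a bearer credential on every request.
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Accept", "application/vnd.github+json")
	return req, nil
}

func main() {
	req, err := newListReposRequest("github_pat_xxxxxxxxxxxxx", "MyCompany")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(req.Method, req.URL)
}
```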

How It Works

  1. Repository Discovery: Fetches all private repositories from the specified organization
  2. Concurrent Processing: Processes multiple repositories simultaneously using goroutines
  3. Commit Extraction: Retrieves all commits from each repository (optionally filtered by date)
  4. Team Member Filtering: Filters commits to include only those authored by specified team members
  5. Data Export: Compiles all matching commits into a structured CSV report
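
The steps above can be sketched with a buffered channel acting as a semaphore to cap concurrency. This is an assumed structure, not the tool's actual code: the commit type, fetchFunc, collect, and fakeFetch are hypothetical, and fakeFetch stands in for the GitHub API call so the example runs offline. Author matching here is exact and case-sensitive, which may differ from the real tool.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// commit is a hypothetical, trimmed-down commit record.
type commit struct {
	Repo, SHA, Author string
}

// fetchFunc stands in for the API call that lists a repository's commits.
type fetchFunc func(repo string) ([]commit, error)

// collect fetches commits for every repository, running at most
// `concurrency` fetches at once, and keeps only commits by the given members.
func collect(repos, members []string, concurrency int, fetch fetchFunc) []commit {
	if concurrency < 1 {
		concurrency = 1
	}
	wanted := make(map[string]bool, len(members))
	for _, m := range members {
		wanted[m] = true
	}

	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []commit
	)
	sem := make(chan struct{}, concurrency) // one slot per running fetch
	for _, repo := range repos {
		wg.Add(1)
		sem <- struct{}{} // blocks while `concurrency` fetches are in flight
		go func(repo string) {
			defer wg.Done()
			defer func() { <-sem }()
			commits, err := fetch(repo)
			if err != nil {
				return // skip missing or inaccessible repositories
			}
			mu.Lock()
			defer mu.Unlock()
			for _, c := range commits {
				if wanted[c.Author] {
					out = append(out, c)
				}
			}
		}(repo)
	}
	wg.Wait()

	// Goroutines finish in arbitrary order, so sort for a stable report.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Repo != out[j].Repo {
			return out[i].Repo < out[j].Repo
		}
		return out[i].SHA < out[j].SHA
	})
	return out
}

// fakeFetch returns canned data and fails for unknown repositories.
func fakeFetch(repo string) ([]commit, error) {
	data := map[string][]commit{
		"api": {
			{Repo: "api", SHA: "a1", Author: "john.doe"},
			{Repo: "api", SHA: "a2", Author: "someone.else"},
		},
		"web": {
			{Repo: "web", SHA: "w1", Author: "jane.smith"},
		},
	}
	commits, ok := data[repo]
	if !ok {
		return nil, fmt.Errorf("repository %q not found or no access", repo)
	}
	return commits, nil
}

func main() {
	repos := []string{"api", "web", "missing"}
	members := []string{"john.doe", "jane.smith"}
	for _, c := range collect(repos, members, 2, fakeFetch) {
		fmt.Println(c.Repo, c.SHA, c.Author)
	}
}
```

Acquiring the semaphore slot before starting each goroutine keeps the number of live goroutines bounded as well, which matters for organizations with many repositories.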

Error Handling

The tool handles various scenarios gracefully:

  • Rate Limiting: Automatically waits when GitHub API rate limits are hit
  • Missing Repositories: Skips repositories that are not found or inaccessible
  • Empty Repositories: Handles repositories with no commits
  • Network Issues: Provides clear error messages for connection problems
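
One way the rate-limit wait might look is sketched below. The rateLimitError type is a hypothetical stand-in for whatever error the API client returns (the go-github library has its own rate-limit error type that could probably be checked the same way with errors.As), and withRateLimitRetry and simulate are illustrative names rather than functions from the tool.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// rateLimitError is a hypothetical error that carries the time at which
// the rate limit resets.
type rateLimitError struct {
	Reset time.Time
}

func (e *rateLimitError) Error() string {
	return "hit rate limit, resets at " + e.Reset.Format(time.RFC3339)
}

// withRateLimitRetry runs call. When call fails with a rateLimitError, it
// sleeps until the reset time and tries again, up to maxAttempts calls.
// Any other error (or success) is returned immediately.
func withRateLimitRetry(maxAttempts int, sleep func(time.Duration), call func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = call()
		var rl *rateLimitError
		if !errors.As(err, &rl) {
			return err // nil on success, or an error unrelated to rate limits
		}
		if attempt == maxAttempts {
			break
		}
		if wait := time.Until(rl.Reset); wait > 0 {
			sleep(wait)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// simulate runs a fake call that hits the rate limit `failures` times
// before succeeding, and reports how many calls were made.
func simulate(failures, maxAttempts int) (calls int, err error) {
	call := func() error {
		calls++
		if calls <= failures {
			return &rateLimitError{Reset: time.Now().Add(10 * time.Millisecond)}
		}
		return nil
	}
	err = withRateLimitRetry(maxAttempts, time.Sleep, call)
	return calls, err
}

func main() {
	calls, err := simulate(2, 5)
	fmt.Println("calls:", calls, "error:", err)
}
```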

Performance Considerations

  • Concurrency: Adjust the -concurrency parameter based on your API rate limits
  • Date Filtering: Use the -since parameter to limit the scope and improve performance
  • Large Organizations: For organizations with many repositories, consider running during off-peak hours

Dependencies

  • github.com/google/go-github/v71: GitHub API client library
  • golang.org/x/oauth2: OAuth2 authentication for GitHub API

License

This project is provided as-is for internal tooling purposes.

Contributing

When making changes to this tool:

  1. Ensure backward compatibility with existing command-line parameters
  2. Test with various organization sizes and team configurations
  3. Verify CSV output format remains consistent
  4. Update this README if adding new features

Troubleshooting

Common Issues

"Error getting repositories: 401"

  • Check that your GitHub token is valid and has the correct permissions

"Hit rate limit"

  • Reduce the -concurrency parameter or wait for rate limits to reset

"Repository not found or no access"

  • Ensure your token has access to the organization's private repositories

Empty CSV output

  • Verify team member usernames are spelled correctly
  • Check that the specified date range contains commits
  • Ensure team members have commits in the organization's repositories
