multi-repo-analyzer is a deterministic, static security analysis engine designed to assess repository risk without executing code.
- No Code Execution: Analysis is strictly static to ensure the safety of the host machine.
- Explainability: Every finding includes a "Why it Matters" and "Recommendation."
- Noise Reduction: Built-in suppression logic handles common frontend and test suite noise.
- Policy Driven: Final decisions (Allow/Block/Warn) are decoupled from the scoring engine.
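
As an illustration of the explainability principle, a finding could be modeled as a frozen record that carries its own rationale and remediation. This is a minimal sketch only: the field names (`rule_id`, `severity`, `category`, `confidence`, `why_it_matters`, `recommendation`) are assumptions and may not match the project's actual schema.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; all field names are illustrative."""
    rule_id: str
    severity: str          # e.g. "low", "medium", "high"
    category: str          # e.g. "execution", "network"
    confidence: float      # 0.0 to 1.0
    why_it_matters: str    # the "Why it Matters" explanation
    recommendation: str    # the "Recommendation" text


finding = Finding(
    rule_id="example-rule",
    severity="high",
    category="execution",
    confidence=0.8,
    why_it_matters="A post-install hook can run arbitrary commands.",
    recommendation="Review the hook before installing the package.",
)
```

Because the dataclass is frozen, later pipeline stages cannot mutate a finding in place; they would have to produce a new record instead.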
- Ingestion: Shallow clone of the repository into a temporary workspace (Timeout: 300s).
- Walking: Recursive traversal using `ScanGuard` to enforce file limits (10,000 files max).
- Classification: Path and extension-based language detection.
- Analyzer Registry: Execution of specialized analyzers that return immutable `Finding` objects.
- Post-Processing:
  - Correlation: Grouping related signals.
  - Suppression: Identifying and downgrading benign noise (e.g., SVGs, test data).
  - Scoring: Calculation of the risk score based on severity, category, and squared confidence.
- Policy Evaluation: The `PolicyEngine` applies rules (Standard, Zero-Trust, etc.) to decide the final outcome.
- Report Generation: Production of the final JSON and AI-powered practical explanations.
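
The scoring step in the pipeline above can be sketched as follows. Only the inputs (severity, category, and squared confidence) come from the description; the weight tables, the `Signal` type, and the function names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights; the real engine's values are not documented here.
SEVERITY_WEIGHT = {"low": 1.0, "medium": 3.0, "high": 7.0, "critical": 10.0}
CATEGORY_WEIGHT = {"execution": 1.5, "network": 1.2, "obfuscation": 1.0}


@dataclass(frozen=True)
class Signal:
    """Illustrative stand-in for a finding's scoring inputs."""
    severity: str
    category: str
    confidence: float  # 0.0 to 1.0


def signal_score(signal: Signal) -> float:
    """Weight severity and category, then scale by confidence squared."""
    severity = SEVERITY_WEIGHT[signal.severity]
    category = CATEGORY_WEIGHT.get(signal.category, 1.0)
    return severity * category * signal.confidence ** 2


def risk_score(signals) -> float:
    """Sum the per-signal contributions into one repository score."""
    return sum(signal_score(s) for s in signals)


certain = Signal("high", "execution", 1.0)
unsure = Signal("high", "execution", 0.5)
```

Squaring the confidence penalizes uncertain signals more than a linear factor would: in this sketch, `unsure` contributes one quarter of what `certain` does, not one half.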
- Timeout: Git clones are capped at 300s to prevent hanging on slow connections or massive repos.
- Workspaces: All scans happen in OS-specific temporary directories that are cleaned up on completion.
- Resource Limits: `ScanGuard` prevents the tool from walking deep directory structures (e.g., accidental recursive symlinks).
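
The limits above could be enforced along these lines. This is a sketch under assumptions, not the project's `ScanGuard`: the names `shallow_clone`, `guarded_walk`, and `ScanLimitExceeded`, the `max_depth` default, and the choice to raise when the file limit is hit are all hypothetical. Only the 300s timeout, the 10,000-file cap, and the use of a temporary workspace come from the text.

```python
import os
import subprocess
import tempfile


class ScanLimitExceeded(Exception):
    """Raised when a scan would exceed the configured file limit."""


def shallow_clone(url, dest, timeout=300):
    """Shallow-clone a repository; raises subprocess.TimeoutExpired on timeout."""
    subprocess.run(
        ["git", "clone", "--depth", "1", url, dest],
        check=True,
        timeout=timeout,
    )


def guarded_walk(root, max_files=10_000, max_depth=32):
    """Yield file paths under root without following symlinks.

    Directories more than max_depth levels below root are not entered,
    and ScanLimitExceeded is raised once max_files is exceeded.
    """
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    count = 0
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        depth = dirpath.count(os.sep) - base_depth
        if depth >= max_depth:
            dirnames[:] = []  # prune: do not descend any further
        for name in filenames:
            count += 1
            if count > max_files:
                raise ScanLimitExceeded(f"more than {max_files} files")
            yield os.path.join(dirpath, name)


# The workspace is removed automatically when the `with` block exits.
with tempfile.TemporaryDirectory() as workspace:
    for i in range(3):
        with open(os.path.join(workspace, f"file{i}.txt"), "w") as fh:
            fh.write("x")
    scanned = list(guarded_walk(workspace))
```

Passing `followlinks=False` keeps `os.walk` from descending into symlinked directories, which is one way to avoid the recursive-symlink case; the depth cap is a second, independent safeguard.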
The PolicyEngine allows organizations to define different thresholds for different environments:
- Standard: Blocks on dangerous execution; warns on high risk scores.
- Zero-Trust: More aggressive blocking on suspicious signals.
- Beginner: Optimized for individual users, with more informational alerts.
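
One way to picture the separation between scoring and policy is a small rule table consulted after the score is computed. The sketch below is illustrative only: the `Policy` fields, the category labels, and every threshold value are assumptions, and the Beginner profile is left out because its rules are not specified here.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy definition; fields and values are illustrative."""
    name: str
    block_categories: frozenset  # any finding in these categories blocks
    warn_score: float            # score at or above this warns
    block_score: float           # score at or above this blocks


POLICIES = {
    "standard": Policy(
        "standard", frozenset({"dangerous_execution"}),
        warn_score=50.0, block_score=float("inf"),
    ),
    "zero-trust": Policy(
        "zero-trust", frozenset({"dangerous_execution", "suspicious"}),
        warn_score=20.0, block_score=50.0,
    ),
}


def evaluate(policy, score, categories):
    """Map a precomputed score and finding categories to a decision."""
    if set(categories) & policy.block_categories:
        return Decision.BLOCK
    if score >= policy.block_score:
        return Decision.BLOCK
    if score >= policy.warn_score:
        return Decision.WARN
    return Decision.ALLOW


standard = POLICIES["standard"]
zero_trust = POLICIES["zero-trust"]
```

Because `evaluate` only receives the score and the categories, the same scan result can be judged under different profiles without re-running any analyzer.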