CCPA Compliance Reasoning System

A hackathon project that parses the California Consumer Privacy Act (CCPA) from a PDF, indexes its sections for semantic retrieval, and evaluates business practices against the retrieved sections using a local LLM (Meta Llama 3 8B).

What's included

parse_statute.py: Extracts the 45 legal sections from the raw ccpa_statute.pdf.
ccpa_sections.json: The extracted sections (you only need to re-run the parser if this file is deleted).
retrieval.py: Uses sentence-transformers and faiss-cpu to index sections and perform natural language semantic search.
reasoning.py: Uses llama-cpp-python and a local Llama 3 8B model to evaluate business scenarios against the retrieved CCPA sections, outputting strict JSON compliance judgements.
compliance_checker.py: The top-level entry point that uses keyword heuristics to short-circuit obvious violations before falling back to the LLM reasoning engine.
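The short-circuit idea behind compliance_checker.py can be sketched roughly as follows. This is an illustration only, not the project's code: the keyword rules, the function names (heuristic_check, check_compliance, fake_llm), and the verdict fields are all hypothetical, and the LLM is replaced by a stub so the sketch runs without a model.

```python
from typing import Callable, Optional

# Hypothetical keyword rules (phrase -> short label). Illustrative only;
# these are not taken from compliance_checker.py.
KEYWORD_RULES = {
    "sell personal information without notice": "sale without notice",
    "ignore deletion request": "deletion request ignored",
}


def heuristic_check(scenario: str) -> Optional[dict]:
    """Return a verdict if a keyword rule matches, otherwise None."""
    text = scenario.lower()
    for phrase, label in KEYWORD_RULES.items():
        if phrase in text:
            return {"compliant": False, "reason": label, "source": "heuristic"}
    return None


def check_compliance(scenario: str, llm_fallback: Callable[[str], dict]) -> dict:
    """Try the cheap heuristic first; call the LLM only when nothing matches."""
    verdict = heuristic_check(scenario)
    if verdict is not None:
        return verdict
    return llm_fallback(scenario)


def fake_llm(scenario: str) -> dict:
    # Stand-in for the real reasoning engine (assumption: it returns a dict).
    return {"compliant": True, "reason": "no issue found", "source": "llm"}
```

Passing the fallback in as a callable keeps the heuristic path testable without loading the model.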

Setup Instructions

1. Python Environment

Requires Python 3.9+ (3.11 recommended).

python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate
pip install -r requirements.txt

2. Download the LLM Model

The reasoning engine uses a quantized Llama 3 8B model. You need to download the .gguf file to a models/ directory in the project root.

# First, ensure you have the models directory
mkdir -p models

# Download using huggingface-cli (included in requirements.txt)
huggingface-cli download lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF \
  Meta-Llama-3-8B-Instruct-Q4_K_M.gguf \
  --local-dir models

(Note: This is a ~4.9GB download and may take a few minutes.)
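To confirm the download finished, a small check along these lines may help. It is a sketch under assumptions: the helper name find_model and the min_bytes threshold are made up for illustration, and the project itself may locate the model differently.

```python
from pathlib import Path


def find_model(models_dir: str = "models", min_bytes: int = 1_000_000_000) -> Path:
    """Return the first .gguf file in models_dir (hypothetical helper).

    Raises FileNotFoundError if no .gguf file exists, and ValueError if the
    file is smaller than min_bytes, which may indicate a partial download.
    """
    candidates = sorted(Path(models_dir).glob("*.gguf"))
    if not candidates:
        raise FileNotFoundError(f"no .gguf file found in {models_dir!r}")
    model = candidates[0]
    size = model.stat().st_size
    if size < min_bytes:
        raise ValueError(f"{model.name} is only {size} bytes; download may be incomplete")
    return model
```

Calling find_model() from the project root before running reasoning.py surfaces a missing or truncated file early.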

3. Verify the Installation

Run the three scripts below to confirm everything is working:

Test the Semantic Retriever:

python retrieval.py

(You should see it retrieve sections relating to selling user data.)
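For intuition about what the retriever does, here is a dependency-free sketch of ranking sections by cosine similarity. It is a stand-in, not the project's implementation: retrieval.py uses sentence-transformers and faiss-cpu, whereas this uses plain Python and hand-written toy vectors. The section names and the query vector are invented for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query_vec, section_vecs, k=2):
    """Rank sections by cosine similarity to the query, highest first."""
    q = normalize(query_vec)
    scored = []
    for section_id, vec in section_vecs.items():
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((score, section_id))
    scored.sort(reverse=True)
    return [(section_id, round(score, 3)) for score, section_id in scored[:k]]


# Toy 3-dimensional "embeddings"; real ones would come from an embedding model.
toy_sections = {
    "section_on_sale_opt_out": [0.9, 0.1, 0.0],
    "section_on_deletion": [0.1, 0.9, 0.0],
    "section_on_definitions": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # pretend embedding of "selling user data"
results = top_k(query, toy_sections, k=2)
```

A vector index does the same ranking at scale; the brute-force loop here is only practical for a small number of sections.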

Test the Reasoning Engine:

python reasoning.py

(This will run 6 diverse test scenarios through the local Llama model. Loading the model into RAM may take 10-20 seconds on the first run.)
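Since the reasoning engine is meant to output strict JSON, a validation step like the one below is one way to guard against stray text around the model's answer. This is a minimal sketch: the field names compliant and violated_sections are an assumed schema, not necessarily the one reasoning.py uses, and parse_judgement is a hypothetical helper.

```python
import json

# Assumed schema for illustration; the real output fields may differ.
REQUIRED_KEYS = {"compliant": bool, "violated_sections": list}


def parse_judgement(raw: str) -> dict:
    """Extract and validate the JSON object embedded in raw model output."""
    # Models sometimes add prose around the JSON, so slice from the first
    # opening brace to the last closing brace before parsing.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError
    for key, expected_type in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), expected_type):
            raise ValueError(f"field {key!r} missing or not {expected_type.__name__}")
    return data
```

Raising ValueError for every failure mode lets a caller retry or fall back with a single except clause.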

Test the Compliance Checker (Top-Level):

python compliance_checker.py

(This tests the heuristic short-circuits and LLM fallbacks.)
