Vectorless RAG

A Python project for building Retrieval-Augmented Generation (RAG) applications without vector embeddings. It focuses on legal document analysis using the Contract Understanding Atticus Dataset (CUAD).

Project Structure

vectorless/
├── src/                    # Core source code
├── scripts/                # Processing scripts
│   ├── process_contract.py # Main contract processing pipeline
│   └── run_all_41_questions.py # Sample evaluation script
├── docs/                   # Documentation
│   ├── README.md          # Detailed documentation
│   └── GENERALIZED_WORKFLOW.md # Workflow documentation
├── data/                   # Input datasets
├── sample_dataset/         # Sample data for development
├── output/                 # Generated outputs
│   ├── results/           # Processing results
│   └── segmentation_results/ # Cached segmentation data
├── main.py                # Entry point
├── pyproject.toml         # Project configuration
└── CLAUDE.md             # AI assistant instructions

Quick Start

# Install dependencies
uv sync

# Run main application
uv run python main.py

# Process a specific contract
uv run python scripts/process_contract.py --contract-index 0

# Run evaluation on sample data
uv run python scripts/run_all_41_questions.py

Features

Document segmentation without vector embeddings
Parallel question processing
Caching of segmentation results (under output/segmentation_results/) to avoid repeated work
Comprehensive evaluation metrics
Generalizable workflow for different document types
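
To make the caching and parallel-processing features above more concrete, here is a minimal sketch of how the two could fit together. It is not taken from this repository: segment_document, cached_segments, answer_question, and answer_all are hypothetical names. Blank-line splitting and word-overlap scoring are placeholder assumptions standing in for whatever segmentation and answering logic lives in src/.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def segment_document(text: str) -> list[str]:
    """Hypothetical stand-in segmenter: split the text on blank lines."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


def cached_segments(text: str, cache_dir: Path) -> list[str]:
    """Return segments for `text`, reusing a JSON cache file keyed by content hash."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        # Cache hit: skip segmentation entirely.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    segments = segment_document(text)
    cache_file.write_text(json.dumps(segments), encoding="utf-8")
    return segments


def answer_question(question: str, segments: list[str]) -> str:
    """Hypothetical stand-in: return the segment sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(segments, key=lambda s: len(q_words & set(s.lower().split())))


def answer_all(questions: list[str], segments: list[str], max_workers: int = 4) -> dict[str, str]:
    """Answer every question concurrently against the same segment list."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        answers = pool.map(lambda q: answer_question(q, segments), questions)
        return dict(zip(questions, answers))
```

A thread pool is used here on the assumption that each question is dominated by I/O (for example, a call to a remote model). For CPU-bound work, a process pool would be the more natural choice.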

See docs/ for detailed documentation and workflow guides.

satish860/vectorless

Folders and files

Latest commit

History

Repository files navigation

Vectorless RAG

Project Structure

Quick Start

Features

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages