Abdul-Omira/cad-fusion-lab

Text-to-CAD AI Model

A state-of-the-art AI model that converts natural language descriptions into parametric CAD files using a multi-modal transformer architecture with geometric validation and visual feedback.

🏗️ Architecture Overview

This project implements a multi-modal transformer architecture that:

  • Processes text descriptions using BERT-based encoders
  • Generates parametric CAD sequences with geometric validation
  • Incorporates visual feedback through CLIP-based scoring
  • Validates geometric constraints and manufacturing requirements

🔑 Key Components

1. Data Pipeline (src/data/)

  • DeepCAD dataset processing and cleaning
  • Multi-modal annotation generation
  • Geometric hash-based deduplication
  • 8-bit quantized parameter representation
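
As a rough illustration of geometric hash-based deduplication, the sketch below rounds each operation's parameters and hashes the canonical form, so near-identical models map to the same key. The operation layout (`type`, `params`), the rounding precision, and the function names are assumptions made for this example, not the project's actual data format.

```python
import hashlib
import json


def geometric_hash(operations, decimals=3):
    """Hash a CAD sequence after rounding its parameters.

    `operations` is a hypothetical list of {"type": str, "params": [float]}
    dicts; the real dataset schema may differ.
    """
    canonical = [
        # Adding 0.0 turns -0.0 into 0.0 so both serialize identically.
        (op["type"], [round(float(p), decimals) + 0.0 for p in op["params"]])
        for op in operations
    ]
    payload = json.dumps(canonical, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(models):
    """Keep the first model seen for each geometric hash."""
    seen = set()
    unique = []
    for operations in models:
        key = geometric_hash(operations)
        if key not in seen:
            seen.add(key)
            unique.append(operations)
    return unique
```

Rounding before hashing trades a little precision for robustness: two models that differ only by floating-point noise collapse into one entry.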

2. Model Architecture (src/models/)

  • Adaptive text encoder with BERT backbone
  • 24-layer transformer decoder for CAD sequences
  • Visual feedback module with CLIP integration
  • Geometric validation system

3. Training Pipeline (src/training/)

  • Sequential pre-training on CAD sequences
  • Visual feedback fine-tuning with PPO
  • Multi-stage training orchestration
  • Gradient clipping and learning rate scheduling
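
To make the last bullet concrete, here is a minimal, framework-free sketch of gradient clipping by global norm together with a warmup-plus-cosine learning rate schedule. The schedule shape and every name and parameter here are assumptions chosen for illustration; the README does not state which scheduler or clipping threshold the training code uses.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale a flat list of gradient values so their L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]


def warmup_cosine_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup to base_lr, then cosine decay towards zero (assumed schedule)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

In a PyTorch training loop the same roles would typically be played by `torch.nn.utils.clip_grad_norm_` and a scheduler from `torch.optim.lr_scheduler`.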

4. Validation System (src/validation/)

  • KCL code generation and execution
  • Feature-by-feature geometric validation
  • Manufacturing constraint checking
  • Chamfer distance computation
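
For readers unfamiliar with the metric, the following sketch computes a symmetric Chamfer distance between two small point clouds in pure Python. Definitions vary (squared versus unsquared distances, sums versus means); this version averages unsquared nearest-neighbour distances in both directions, which is an assumption and may not match the variant used in `src/validation/`.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point clouds.

    Each cloud is a sequence of equal-length coordinate tuples. This brute-force
    version is O(len(a) * len(b)) and is meant for illustration only.
    """
    def mean_nearest(src, dst):
        # Average distance from each source point to its closest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(points_a, points_b) + mean_nearest(points_b, points_a)
```

A value of zero means every point in each cloud coincides with some point in the other; larger values indicate the generated geometry drifts further from the reference.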

5. Deployment (src/deployment/)

  • FastAPI server with async processing
  • Multiple export formats (STEP, GLTF, KCL)
  • Batch processing capabilities
  • Quantization for production deployment

📦 Installation

# Clone the repository
git clone https://github.com/omira/text-to-cad.git
cd text-to-cad

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install additional CAD tools 
# macOS:
brew install opencascade
brew install freecad

# Ubuntu:
sudo apt-get install libopencascade-dev
sudo apt-get install freecad

🚀 Quick Start

from src.models.text_to_cad import TextToCADModel
from src.inference.pipeline import load_model_from_checkpoint, InferencePipeline

# Load pre-trained model or create a new one
try:
    model = load_model_from_checkpoint("checkpoints/final_model.pt")
except FileNotFoundError:
    model = TextToCADModel(vocab_size=10000)

pipeline = InferencePipeline(model)

# Generate CAD from text
cad_sequence = pipeline.generate("Create a rectangular bracket with mounting holes")
kcl_code = pipeline.export_kcl(cad_sequence)
print(kcl_code)

# Export to STEP format
step_file = pipeline.export_step(cad_sequence, "output.step")
print(f"STEP file saved to: {step_file}")

🏋️ Training

Dataset Preparation

# Create dataset directories
mkdir -p data/{raw,processed,annotations}

# Download and process DeepCAD dataset
python scripts/prepare_dataset.py --download --process --annotate

# Process dataset only
python scripts/prepare_dataset.py --process --input-dir data/raw --output-dir data/processed

# Generate annotations
python scripts/prepare_dataset.py --annotate --output-dir data/processed --annotation-dir data/annotations

Model Training

# Train small model for prototyping
python scripts/train.py --config configs/small_config.yaml --output-dir checkpoints/small

# Train base model
python scripts/train.py --config configs/base_config.yaml --output-dir checkpoints/base

# Train large model 
python scripts/train.py --config configs/large_config.yaml --output-dir checkpoints/large

# Fine-tune with visual feedback
python scripts/train.py --config configs/base_config.yaml --checkpoint checkpoints/base/final_model.pt --finetune

📊 Evaluation

# Run evaluation suite
python scripts/evaluate.py --model checkpoints/final_model.pt --output-dir outputs/evaluation --visualize

# Generate validation report
python scripts/evaluate.py --model checkpoints/final_model.pt --test-set data/processed/test.json --output-dir outputs/validation

🖥️ Deployment

# Start the API server
python src/deployment/server.py --model checkpoints/final_model.pt --config configs/base_config.yaml --port 8000

# Generate example CAD from text with curl
curl -X POST "http://localhost:8000/generate" \
     -H "Content-Type: application/json" \
     -d '{"text": "Create a rectangular bracket with two mounting holes", "format": "step"}'
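
The same request can be issued from Python using only the standard library. This sketch mirrors the curl call above (endpoint `/generate`, JSON fields `text` and `format`); the helper names are hypothetical, and because the response schema is not documented here, the body is returned as raw bytes rather than parsed.

```python
import json
import urllib.request


def build_generate_request(text, fmt="step", base_url="http://localhost:8000"):
    """Build the POST request for the /generate endpoint without sending it."""
    body = json.dumps({"text": text, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the raw response body (format not assumed)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Example usage, with the API server running locally:
# request = build_generate_request("Create a rectangular bracket with two mounting holes")
# payload = send(request)
```

Separating request construction from sending keeps the payload easy to inspect and test without a running server.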

🔧 VS Code Tasks

This project includes pre-configured VS Code tasks for common operations:

  1. Install Dependencies: Install required Python packages
  2. Train Model (Small): Train a small model for quick iteration
  3. Train Model (Base): Train the standard model
  4. Fine-tune with Visual Feedback: Fine-tune model with CLIP-based visual feedback
  5. Run Evaluation: Evaluate model performance
  6. Run API Server: Start the model serving API
  7. Process DeepCAD Dataset: Process the raw dataset
  8. Generate Example CAD: Run a quick CAD generation example
  9. Run Tests: Run test suite

📋 Technical Specifications

  • Model Architecture: Multi-modal transformer (BERT encoder + 24-layer decoder)
  • Model Size: 350M parameters
  • Sequence Length: Up to 512 CAD operations (768 for large model)
  • Precision: 8-bit quantization with 12-bit for critical dimensions
  • Training Data: DeepCAD dataset with geometric hash deduplication
  • Training Stages: Pre-training + PPO fine-tuning with visual feedback
  • Performance Metrics:
    • Chamfer Distance: < 0.87 mm
    • CLIP Score: > 0.82
    • Inference Speed: < 2 s for 50-operation sequences on NVIDIA A100
    • Manufacturing Validity Rate: > 95%
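
To give a feel for the precision bullet, the sketch below uniformly quantizes a dimension to a given bit width and back. The 0 to 100 mm range in the usage note and the function names are assumptions for illustration; the README does not specify the parameter ranges or the exact quantization scheme.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer code with 2**bits levels."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize(code, lo, hi, bits):
    """Map an integer code back to an approximate value in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)


def max_error(lo, hi, bits):
    """Worst-case round-trip error: half of one quantization step."""
    return (hi - lo) / ((1 << bits) - 1) / 2
```

Over an assumed 0 to 100 mm range, the worst-case round-trip error is about 0.196 mm at 8 bits and about 0.012 mm at 12 bits, which suggests why critical dimensions would get the wider representation.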

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Project Structure

├── src/
│   ├── data/           # Data processing and loading
│   ├── models/         # Model architectures
│   ├── training/       # Training loops and optimization
│   ├── validation/     # Geometric validation
│   ├── inference/      # Inference pipeline
│   └── deployment/     # Deployment utilities
├── configs/            # Training and model configurations
├── scripts/            # Training and evaluation scripts
├── data/               # Dataset storage
├── checkpoints/        # Model checkpoints
├── outputs/            # Generated outputs and logs
└── tests/              # Unit and integration tests

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: pytest tests/
  5. Submit a pull request

Citation

@article{omira2025textcad,
  title={Text-to-CAD: A State-of-the-Art Model for Converting Natural Language to Parametric CAD},
  author={OMIRA Technologies},
  journal={arXiv preprint arXiv:2025.xxxxx},
  year={2025}
}
