Text2Pydough is a comprehensive AI evaluation system that enables Large Language Models (LLMs) to generate PyDough code directly from natural language queries. It is designed to assess and demonstrate the models' capabilities in translating text to PyDough code effectively.
PyDough is a domain-specific language (DSL) for database operations that offers a more intuitive alternative to SQL for querying relational databases. Text2Pydough provides a complete ecosystem that includes:
- AI Model Evaluation: Parallel evaluation frameworks to test the performance of various AI providers in generating accurate PyDough code from natural language.
- Interactive Demonstrations: Web-based applications that allow real-time PyDough code generation from user inputs.
- Multi-Provider AI Integration: A flexible abstraction layer supporting multiple LLM providers, including Claude, Gemini, Azure OpenAI, DeepSeek, and others.
The project pursues the following objectives:
- Develop a reliable tool capable of converting natural language into PyDough code.
- Ensure high accuracy in the code generated according to user requirements.
- Guarantee that the generated PyDough statements are coherent and aligned with the user's intent.
- Leverage PyDough to produce simple, efficient, and optimized queries.
- Simplify the processing of metadata within the system.
The prompt evaluation script consists of multiple parallel implementations that evaluate AI models' PyDough code generation capabilities through automated pipelines. It includes ensemble logic and parallel model execution, and it benchmarks the models' ability to generate correct database queries from natural language questions.
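As a rough illustration of parallel model execution, the sketch below sends one question to several models at once using a thread pool. It is a minimal sketch and not the repository's code: `run_models_in_parallel`, `fake_claude`, and `fake_gemini` are hypothetical names, and the generator callables stand in for real provider clients.

```python
from concurrent.futures import ThreadPoolExecutor


def run_models_in_parallel(question, generators):
    """Send the same question to every model concurrently.

    `generators` maps a model name to a callable that takes the question
    and returns generated code as a string. Both the mapping and the
    callables are hypothetical stand-ins for real provider clients.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(generators))) as pool:
        # Submit one task per model so that slow providers overlap.
        futures = {
            name: pool.submit(generate, question)
            for name, generate in generators.items()
        }
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                # A failing provider should not abort the whole run.
                results[name] = None
    return results


# Hypothetical generators used only to exercise the helper.
def fake_claude(question):
    return f"# claude answer for: {question}"


def fake_gemini(question):
    raise RuntimeError("simulated provider outage")


outputs = run_models_in_parallel(
    "How many customers are there?",
    {"claude": fake_claude, "gemini": fake_gemini},
)
```

Threads suit this workload because each call spends most of its time waiting on a network response. A model that raises is recorded as `None`, so later steps can skip it.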
This is the prompt evaluation workflow:
- Argument parsing: The script accepts command-line arguments specifying database paths, model configurations, prompt files, and execution parameters.
- MLflow setup: Initializes MLflow tracking with a remote URI and an authentication token to log experiments and model artifacts.
- Database metadata: Prepares database schema information by generating JSON metadata files for each database if they don't exist, creating a mapping structure for SQL query generation.
- Questions processing: Loads test questions from CSV and processes them either sequentially or in parallel across multiple AI models (Claude, Gemini, etc.) using threading for concurrent execution.
- Ensemble selection: When running multiple models in parallel, implements an ensemble approach that compares DataFrame outputs between models to find consensus or falls back to the most reliable model (preferring Gemini).
- Result evaluation: Executes generated Python code against test databases, compares outputs with expected results, and categorizes results as "Match" or other comparison outcomes.
- Metrics calculation: Computes performance statistics including match percentages by difficulty, complexity, and database combinations, generating detailed breakdowns for analysis.
- MLflow logging: Records all experiment parameters, metrics, and artifacts (CSV files, distribution reports), and logs the final model with its associated prompt and script files for reproducibility.
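The ensemble selection and metrics steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the project's implementation. Result sets are modeled as lists of row tuples instead of DataFrames. The function names, the `difficulty` and `comparison_result` field names, and the "No Match" label are all hypothetical.

```python
from collections import defaultdict


def normalize(rows):
    """Return an order-insensitive form of a result set (assumed row tuples)."""
    return sorted((tuple(row) for row in rows), key=repr)


def select_by_consensus(outputs, preferred="gemini"):
    """Pick one model's output: majority agreement first, else the preferred model.

    `outputs` maps a model name to its rows, or to None if the model failed.
    """
    valid = {name: normalize(rows) for name, rows in outputs.items() if rows is not None}
    if not valid:
        return None, None
    # Group together models whose normalized outputs are identical.
    groups = []
    for name, rows in valid.items():
        for group_rows, members in groups:
            if group_rows == rows:
                members.append(name)
                break
        else:
            groups.append((rows, [name]))
    best_rows, members = max(groups, key=lambda group: len(group[1]))
    if len(members) > 1:
        # At least two models agree, so treat their output as the consensus.
        winner = preferred if preferred in members else members[0]
        return winner, best_rows
    # No agreement: fall back to the preferred model if it produced output.
    fallback = preferred if preferred in valid else next(iter(valid))
    return fallback, valid[fallback]


def match_rate_by(records, key):
    """Percentage of "Match" outcomes for each distinct value of records[key]."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for record in records:
        totals[record[key]] += 1
        if record["comparison_result"] == "Match":
            matches[record[key]] += 1
    return {group: 100.0 * matches[group] / totals[group] for group in totals}


winner, rows = select_by_consensus({
    "claude": [(1, "a"), (2, "b")],
    "gemini": [(2, "b"), (1, "a")],
    "deepseek": [(3, "c")],
})
rates = match_rate_by(
    [
        {"difficulty": "easy", "comparison_result": "Match"},
        {"difficulty": "easy", "comparison_result": "No Match"},
        {"difficulty": "hard", "comparison_result": "Match"},
    ],
    "difficulty",
)
```

With real DataFrames, the equality check inside `select_by_consensus` could be replaced by a comparison such as pandas `DataFrame.equals` after sorting rows and resetting the index. The same grouping helper can be reused with a complexity or database key to produce the other breakdowns.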
- WSL (Windows Subsystem for Linux) must be installed with a Linux distribution (Ubuntu is recommended). Installation instructions: https://learn.microsoft.com/en-us/windows/wsl/install
- (Optional but recommended) Miniconda or Anaconda for environment management: https://www.anaconda.com/docs/getting-started/miniconda/install#linux
- Clone the repository:

      git clone https://github.com/bodo-ai/text2pydough.git
      cd text2pydough/

- Install Miniconda (if not already installed):

      mkdir -p ~/miniconda3
      wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
      bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
      rm ~/miniconda3/miniconda.sh
      source ~/miniconda3/bin/activate
      conda init --all

  After running `rm ~/miniconda3/miniconda.sh`, close and reopen the terminal.
- Create and activate the virtual environment:

      conda env create -f environment.yml
      conda activate aisuite_deepseek

- Install additional dependencies:

      pip install google.genai
      pip install mistralai
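After the installation steps, a quick way to confirm that the extra dependencies are importable is a small check like the one below. It is only a sketch: `missing_modules` is a hypothetical helper, not part of the repository. The module names `google.genai` and `mistralai` are taken from the pip commands above.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a module has no spec.
            found = False
        if not found:
            missing.append(name)
    return missing


# Module names taken from the pip commands in the installation steps.
print(missing_modules(["google.genai", "mistralai"]))
```

An empty list means both packages were found in the active environment. Any name that is printed points to a package that still needs to be installed.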
This section provides an overview of the core directories that make up the text2pydough project, including their purpose, key components, and how they fit into the overall pipeline for natural language to PyDough code generation.
The LCARS directory contains an interactive demo system for generating PyDough code from natural language queries. It is built entirely in Python and Jupyter notebooks, and serves as a hands-on demonstration of the system’s capabilities, using real-time AI model responses and the TPCH database schema. Its purpose is to provide a user-friendly interface for exploring how LLMs generate PyDough code, translate it into SQL, and return results.
The lcar_lab directory is a research and experimentation suite for training, evaluating, and improving AI models for PyDough generation. It contains all infrastructure necessary for ML experimentation, including tracking, data processing, and automatic evaluation.