antonsoo/AncientLanguages

Ancient Languages

The Conversation Across Millennia

Learn 46 ancient languages with modern AI and scholarly rigor. Authentic texts · Zero hallucinations · Research-grade accuracy

Quick Start · Interactive Preview · Investment Opportunity · Join Discord


HISTORIA·VERO·TESTIS·TEMPORVM·LVX·VERITATIS·VITA·MEMORIAE·MAGISTRA·VITAE·NVNTIA·VETVSTATIS·QVA·VOCE·ALIA·NISI·ORATORIS·IMMORTALITATI·COMMENDATVR

"History is the witness of time, the light of truth, the life of memory, the teacher of life, the herald of antiquity; by what other voice than that of the orator is it committed to immortality?" — Marcus Tullius Cicero, De Oratore II.9.36

🏛️ The Problem

Reading translations is like watching a movie described over the phone. You get the plot, but you miss the soul.

Ancient Greek has four words for "love" (ἔρως, φιλία, ἀγάπη, στοργή). English collapses them to one. Egyptian hieroglyphs carry meaning in their very shape. Homer's dactylic hexameter becomes prose. When you read a translation, you're reading the interpreter's choices, not the author's voice.

For centuries, accessing this wisdom required elite university degrees, expensive textbooks, and decades of rote memorization.

⚡ The Solution

Ancient Languages is the first platform to combine research-grade philology with modern AI. We don't just teach you about these languages—we let you interact with them.

From Sumerian cuneiform (3100 BCE) to medieval manuscripts (1200 CE), we're building infrastructure for preserving and transmitting humanity's full linguistic heritage.

🎓 Neuro-Symbolic Accuracy

We use a neuro-symbolic approach: AI handles synthesis and dialogue, but hard linguistic truth comes from peer-reviewed academic sources.

Result: Zero AI hallucinations on grammar.

  • Perseus Digital Library morphology
  • LSJ Lexicon (116,502 Greek entries)
  • TLA Berlin, ORACC UPenn, CDLI UCLA
  • Transparent citations for every definition

🤖 AI-Powered Immersion

Don't translate "The boy has the red apple." Read real passages from the Book of the Dead. Chat with a Spartan general before battle. Debate philosophy with an Athenian citizen.

  • GPT-5, Claude 4.5, Gemini 2.5
  • Personalized lessons from authentic texts
  • Conversational practice with historical personas
  • Instant morphological analysis

🌍 46 Languages & Counting

Top Priority (User-Requested): 🏛️ Classical Latin · 📖 Koine Greek · 🏺 Classical Greek · 🕎 Biblical Hebrew

Core Languages (20 total): 🪷 Classical Sanskrit · 🐉 Classical Chinese · ☸️ Pali · ☦️ Old Church Slavonic · 🗣️ Ancient Aramaic · 🌙 Classical Arabic · 🪓 Old Norse · 👁️ Middle Egyptian · 🪢 Old English · 🍎 Yehudit/Paleo-Hebrew · ⚖️ Coptic · 🔆 Ancient Sumerian · 🪔 Classical Tamil · ✝️ Classical Syriac · 🏹 Akkadian · 🕉️ Vedic Sanskrit

Extended Coverage (16 languages) · Partial Courses (10 languages)

View complete language list → | Development roadmap →

🔐 Privacy-First Architecture

Unlike subscription apps that monetize user data:

  • BYOK (Bring Your Own Key): Use your own API keys for AI services (or use free Google Gemini tier)
  • Offline capable: Works without API keys using Echo provider
  • Zero tracking: No telemetry, no user behavior monitoring
  • Self-hostable: Docker-ready deployment, full control
  • Source available: Audit the code yourself (Elastic License 2.0)
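
The BYOK and offline behaviour listed above could be wired up roughly as follows. This is a minimal sketch, not the project's actual code: `select_provider`, `ProviderChoice`, `echo_reply`, and the `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` variable names are hypothetical. Only `GOOGLE_API_KEY` and the name "Echo provider" appear in this README.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical preference order; only GOOGLE_API_KEY is named in this README.
KEY_TO_PROVIDER = {
    "OPENAI_API_KEY": "openai",
    "ANTHROPIC_API_KEY": "anthropic",
    "GOOGLE_API_KEY": "google",
}


@dataclass(frozen=True)
class ProviderChoice:
    name: str
    needs_network: bool


def select_provider(env: Mapping[str, str]) -> ProviderChoice:
    """Return the first provider whose key is set, else the offline echo fallback."""
    for key, name in KEY_TO_PROVIDER.items():
        if env.get(key, "").strip():
            return ProviderChoice(name=name, needs_network=True)
    return ProviderChoice(name="echo", needs_network=False)


def echo_reply(prompt: str) -> str:
    """Offline stand-in: returns the prompt unchanged, so no key or network is needed."""
    return prompt
```

Passing the environment in as a mapping (for example `select_provider(os.environ)`) keeps the selection logic easy to test without touching real keys.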

✨ Interactive Preview

Currently available in self-hosted mode. Public web demo coming soon.

1. The Intelligent Reader

Instant, scholarly analysis of authentic texts. No more flipping through 2,000-page lexicons.

Text: Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος (Iliad 1.1)
👇 Tap Word: Μῆνιν (Mēnin)
📖 Lemma: μῆνις (mēnis)
🧠 Analysis: Noun, Feminine, Accusative, Singular
🏛️ Definition (LSJ): "Wrath, lasting anger, especially of the gods"
⚙️ Grammar: Accusative of direct object for verb ἄειδε
Source Data: Perseus Digital Library (Tufts University)
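
The tap-to-analyze flow above can be pictured as a lookup in an embedded table. The sketch below is illustrative only: `Analysis`, `normalize`, and `analyze` are hypothetical names, and the single record is copied from the example above rather than loaded from the Perseus data.

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Analysis:
    lemma: str
    morphology: str
    definition: str
    source: str


def normalize(token: str) -> str:
    """Strip surrounding punctuation, lowercase, and apply Unicode NFC."""
    return unicodedata.normalize("NFC", token.strip(".,;·")).lower()


# One illustrative record; a real table would be built from the source data.
_RAW = {
    "Μῆνιν": Analysis(
        lemma="μῆνις",
        morphology="noun, feminine, accusative, singular",
        definition="Wrath, lasting anger, especially of the gods",
        source="LSJ via Perseus Digital Library",
    ),
}
EMBEDDED_ANALYSES = {normalize(form): record for form, record in _RAW.items()}


def analyze(token: str) -> Optional[Analysis]:
    """Return the stored analysis for a tapped word, or None if it is unknown."""
    return EMBEDDED_ANALYSES.get(normalize(token))
```

Normalizing both the stored keys and the tapped token avoids misses caused by precomposed versus combining diacritics, which are common in polytonic Greek text.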

2. AI-Generated Lessons from Authentic Texts

We don't write drills. We extract them from 5,000 years of literature.

User: "Generate a beginner vocabulary quiz from Gilgamesh Tablet XI."

AI Tutor: "Sure. Based on the Flood Narrative, let's practice these high-frequency Akkadian terms found in the text:

  1. 𒉌𒂵 (nēmequ) - Wisdom
  2. 𒌓 (ūmu) - Day
  3. 𒀀𒁍 (abūbu) - Flood

Which word completes this line: 'He brought back a story from before the [___]?'"

3. Chat with Historical Personas

Conversational practice in the ancient language with AI-powered teachers.

  • 🏛️ Athenian philosopher — Socratic dialogue in Classical Greek
  • ⚔️ Spartan warrior — Laconic military speech
  • 🏺 Roman senator — Ciceronian rhetoric in Latin
  • 𓂋 Egyptian scribe — Hieratic script and scribal formulas
  • 𒀭 Sumerian lugal — Cuneiform royal inscriptions

🚀 Core Features

🎓 AI Lesson Generation

Generate exercises from authentic ancient texts—not "The apple is red," but real passages from Homer, the Ṛgveda, and Pyramid Texts.

Exercise types: Alphabet drills (Greek, Hebrew, cuneiform, hieroglyphics) · Vocabulary matching · Cloze exercises · Translation practice (Ancient ↔ English)
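
As an illustration of the cloze exercise type, the sketch below blanks one word out of a line of text. `make_cloze` is a hypothetical helper, not the project's generator, and it assumes whitespace-separated words.

```python
def make_cloze(line: str, target: str, blank: str = "[___]") -> tuple[str, str]:
    """Replace the first occurrence of `target` in `line` with a blank.

    Returns the prompt shown to the learner and the expected answer.
    """
    words = line.split()
    if target not in words:
        raise ValueError(f"{target!r} does not occur in the line")
    words[words.index(target)] = blank
    return " ".join(words), target
```

For example, `make_cloze("Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος", "θεὰ")` would return the Iliad line with the third word blanked, together with the answer.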

Target specific passages: "Generate lesson from Iliad 1.20-1.50"
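
A request such as "Iliad 1.20-1.50" has to be turned into a structured range before any text can be fetched. One possible way to parse it is sketched below; `PassageRef` and `parse_passage` are hypothetical names, and the `work book.line-book.line` format is inferred from the single example above.

```python
import re
from dataclasses import dataclass

# Matches "Work B.L-B.L" and the shorthand "Work B.L-L" (same book).
REF_PATTERN = re.compile(
    r"^\s*(?P<work>.+?)\s+(?P<b1>\d+)\.(?P<l1>\d+)"
    r"\s*-\s*(?:(?P<b2>\d+)\.)?(?P<l2>\d+)\s*$"
)


@dataclass(frozen=True)
class PassageRef:
    work: str
    start: tuple[int, int]  # (book, line)
    end: tuple[int, int]    # (book, line)


def parse_passage(ref: str) -> PassageRef:
    """Parse a passage reference, raising ValueError if it is malformed."""
    match = REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"unrecognized passage reference: {ref!r}")
    start = (int(match["b1"]), int(match["l1"]))
    end_book = int(match["b2"]) if match["b2"] else start[0]
    end = (end_book, int(match["l2"]))
    if end < start:
        raise ValueError("passage end precedes its start")
    return PassageRef(work=match["work"], start=start, end=end)
```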

📖 Interactive Reader

Tap any word for instant scholarly analysis with full Perseus morphological data.

Works offline—all linguistic data embedded. No API calls required for reader functionality.

Example: Tap "Μῆνιν" (first word of the Iliad) → Lemma μῆνις · Morphology: Feminine accusative singular · LSJ: "wrath, anger, especially of the gods" · Grammar: Smyth §175 · Etymology: PIE mē-

💬 Conversational Practice (Coach)

Chat with AI-powered historical personas in their native languages. Practice everyday Attic Greek with an Athenian merchant. Learn laconic military speech from a Spartan warrior. Master Ciceronian rhetoric with a Roman senator.

RAG-based (retrieves relevant grammar/lexicon before responding) — no hallucinations
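
The retrieval step can be illustrated with a deliberately simple ranking by shared words. This is a sketch under stated assumptions: `tokenize` and `retrieve` are hypothetical, and keyword overlap is swapped in here for whatever retrieval the project actually uses (the README mentions vector search).

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, snippets: list[str], k: int = 2) -> list[str]:
    """Return up to k snippets sharing the most words with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(s)), s) for s in snippets]
    # Drop snippets with no overlap; the sort is stable, so ties keep input order.
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in relevant[:k]]
```

The retrieved snippets would then be placed in the model's context before it answers, so the reply can cite reference material instead of relying on recall.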

🏆 Gamification & Progress

Streaks · XP & levels · Achievements & badges · ELO ratings per grammar topic · Text statistics (vocabulary coverage %, reading speed, comprehension scores)
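
Per-topic ELO ratings can be computed with the standard Elo update, treating each exercise as an opponent with its own difficulty rating. The sketch below assumes that framing and a K-factor of 32; both are illustrative choices, and `expected_score` and `update_rating` are hypothetical names.

```python
def expected_score(learner: float, item: float) -> float:
    """Probability that the learner answers the item correctly under Elo."""
    return 1.0 / (1.0 + 10.0 ** ((item - learner) / 400.0))


def update_rating(learner: float, item: float, correct: bool, k: float = 32.0) -> float:
    """Return the learner's new rating for a topic after one exercise."""
    score = 1.0 if correct else 0.0
    return learner + k * (score - expected_score(learner, item))
```

With equal ratings the expected score is 0.5, so a correct answer moves a 1200-rated learner to 1216 and an incorrect one to 1184.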

🔊 Text-to-Speech

Hear reconstructed pronunciation for ancient languages (where scholarly consensus exists). Prosody support for ancient meter and rhythm (e.g., dactylic hexameter for Homer).


🚀 Quick Start

Get running in 5 minutes.

Prerequisites

  • Docker & Docker Compose
  • Python 3.12+

# 1. Clone repository
git clone https://github.com/antonsoo/AncientLanguages.git
cd AncientLanguages

# 2. Start services (PostgreSQL, Redis, Qdrant)
docker compose up -d

# 3. Install dependencies & setup DB
conda activate ancient-languages-py312  # or use Python 3.12 venv
pip install -e ".[dev]"
alembic upgrade head

# 4. Launch
uvicorn app.main:app --reload
# Visit http://localhost:8000

Optional: Add a free Google Gemini API key for AI lessons (2M tokens/day free tier):

echo "GOOGLE_API_KEY=your-key" >> backend/.env
echo "LESSONS_ENABLED=1" >> backend/.env
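
For reference, the two settings appended above are plain `KEY=value` lines. The sketch below shows one way such a file could be read; `parse_env` and `lessons_enabled` are hypothetical helpers, and the set of values treated as "on" is an assumption, since the README only shows `LESSONS_ENABLED=1`.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def lessons_enabled(env: dict[str, str]) -> bool:
    """Treat 1/true/yes (case-insensitive) as enabled; anything else as disabled."""
    return env.get("LESSONS_ENABLED", "0").lower() in {"1", "true", "yes"}
```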

Full setup guide & troubleshooting →


🌟 Why This Matters

For language learners: Learn ancient languages with modern app UX—gamified, engaging, effective. No dry textbooks. Real Homer, real Plato, real Vedas.

For scholars & educators: Research-grade linguistic data meets modern tools. Instant morphological analysis that would take hours with physical dictionaries. Perfect for teaching or research.

For developers: Showcase of AI-powered education done right. October 2025 LLM APIs, vector search with pgvector, multi-provider architecture, 173+ tests passing, professional development practices.

For humanity: Ancient languages are living connections to our ancestors. When they fade, we lose entire conceptual frameworks, rhetorical traditions, and direct access to primary sources. This platform reverses that loss.

Read the full vision →


📈 The Opportunity

We are seeking seed funding to accelerate from functional prototype to market dominance.

Market Gap

  • Language learning market: $60B+ globally, but 99% focused on modern travel/business languages
  • The "long tail" of history: Millions of theology students, history buffs, homeschoolers, and academics have zero modern tools—stuck with $200 textbooks from 1980
  • No modern competitors: Leading language learning platforms focus exclusively on modern languages. Academic tools are user-hostile databases. We are the only player combining modern UX with academic rigor

Traction & Velocity

This project utilizes AI-assisted development to achieve velocity impossible a year ago.

  • 46 languages implemented with AI lesson generation
  • 173+ tests ensuring linguistic accuracy
  • Top 4 priority languages (Latin, Koine Greek, Classical Greek, Biblical Hebrew) ready for advanced beta
  • 30,000+ lines of Python · 90,000+ lines of Flutter — production-ready codebase
  • Professional architecture ready for institutional adoption

Funding Goals

Capital will be deployed immediately to:

  1. Hire specialist linguists: Contract PhDs to validate data pipelines for niche languages (Sumerian, Hittite, etc.)
  2. Accelerate frontend: Move from Flutter web-beta to native iOS/Android launch
  3. University pilots: Finalize partnerships currently in discussion with 3 major divinity schools

Accelerated roadmap with funding:

  • Month 3: Enhanced support for 12 core languages, first university pilot
  • Month 6: Full morphological analysis for 20 languages, 3-5 institutional partnerships
  • Month 12: Complete feature parity for all 46 languages, 10+ university adoptions, sustainable revenue

Read full vision & business model → · Inquiries: [email protected]


🏗️ Technical Architecture

Built for scale, accuracy, and rapid iteration.

┌─────────────────────┐
│  Flutter Frontend   │ ← Material Design 3 · 90,000+ lines Dart
│   (Web/iOS/Android) │
└──────────┬──────────┘
           │ API
┌──────────▼──────────┐
│  FastAPI Backend    │ ← Python 3.12 · 12 routers · 173+ tests
│    (Python 3.12)    │
└──────────┬──────────┘
           │
   ┌───────┴────────┐
   │                │
┌──▼──────────┐  ┌─▼──────────────┐
│ PostgreSQL  │  │ LLM Interface  │
│ + pgvector  │  │ (Neuro-Symbolic│
│             │  │    Router)     │
└──┬──────────┘  └─┬──────────────┘
   │               │
   │            ┌──▼────────────────────────┐
   │            │ GPT-5 / Claude 4.5 /      │
   │            │ Gemini 2.5                │
   ▼            └───────────────────────────┘
┌──────────────────────────────────────────┐
│ Static Linguistic Data Sources:          │
│ Perseus · LSJ · ORACC · TLA · CDLI       │
└──────────────────────────────────────────┘

Accuracy first: We never ask an LLM "What does this Greek word mean?" We ask our database. We use LLMs only to explain that data to the user.
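
The database-first rule can be sketched as a function that only produces an LLM prompt when a verified entry exists. The names here (`explain_prompt`, the `definition` and `source` fields) are hypothetical, and the prompt wording is an illustration, not the project's actual prompt.

```python
from typing import Mapping, Optional


def explain_prompt(lemma: str, lexicon: Mapping[str, Mapping[str, str]]) -> Optional[str]:
    """Build an 'explain this entry' prompt from stored data.

    Returns None when the lexicon has no entry, so the caller never asks
    the model to supply a meaning on its own.
    """
    entry = lexicon.get(lemma)
    if entry is None:
        return None
    return (
        "Explain the following dictionary entry to a beginner. "
        "Do not add meanings that are not in the entry.\n"
        f"Lemma: {lemma}\n"
        f"Definition ({entry['source']}): {entry['definition']}"
    )
```

The design choice is that the model receives the definition as input and is asked only to explain it; when the lookup fails, no model call is made at all.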

Offline capable: All core morphological data is embedded. The app works on an airplane (conversational AI features require internet).

Technical docs → | API specs →


📚 Research Foundation

All linguistic data from authoritative academic institutions—zero AI hallucinations.

| Language | Source | Reference works and scale |
| --- | --- | --- |
| Greek | Perseus (Tufts) | 116,502 lexicon entries (LSJ) · 3,000+ grammar sections (Smyth) |
| Latin | Perseus | Lewis & Short lexicon · Allen & Greenough grammar |
| Sanskrit | Sanskrit Digital Library | 180,000+ entries (Monier-Williams) · Whitney grammar |
| Hebrew | Dead Sea Scrolls Library | Brown-Driver-Briggs · Gesenius grammar |
| Egyptian | TLA (Berlin) | Wörterbuch · Gardiner grammar |
| Mesopotamian | ORACC (UPenn) · CDLI (UCLA) | Chicago Assyrian Dictionary · von Soden, Thomsen grammars |

🤝 Community & Contributing

This is a civilization-scale project. We need your help.

Ways to contribute:

  • Linguists: Help curate texts and validate morphology for your specialty
  • Developers: Python backend, Flutter frontend, data engineering roles
  • Learners: Test the alpha and break things
  • Documentation: Tutorials, translations, guides

Contribution guide → | Good first issues → | Discord community →


💡 Support & Contact

Support the preservation of human knowledge:

GitHub Sponsors · Patreon

Join the conversation:

💬 Discord · GitHub Discussions · Report Issues

For investment/partnership inquiries:

📧 [email protected]


📜 License

Code: Elastic License 2.0 (ELv2). Free to use, modify, and distribute; commercial use is permitted, but the software may not be offered to third parties as a hosted or managed service.

Data: Original licenses preserved (Perseus: CC BY-SA 3.0, others public domain or academic licenses)

Full license →


Every Ancient Text Is a Conversation Across Millennia

We're making these conversations accessible to everyone.

Start Learning · Start Developing · Join Discord


Built with dedication by developers, linguists, and scholars worldwide

Star this repository if you believe ancient languages should be accessible to everyone