pmahmud/avrolib

Avro Phonetic TypeScript Library

A modern TypeScript implementation of Avro Phonetic, a phonetic parser that converts Roman characters to Bengali script.

Architecture Overview

High-Level Flow

Input Text (English/Roman) 
         ↓
+------------------+
|   AvroPhonetic   |  ← Main entry point
+------------------+
         ↓
+------------------+
| SuggestionBuilder| ← Manages suggestions and caching
+------------------+
    ↙           ↘
+---------+  +------------+
|Phonetic |  | Database   | ← Pattern matching & dictionary lookup
|Processor|  | Search     |
+---------+  +------------+
    ↓           ↓
    Bengali suggestions
         ↓
Selected output saved
to candidate cache

Component Details

1. AvroPhonetic (main.ts)

The main entry point class that users interact with.

const avro = new AvroPhonetic();
const suggestions = avro.suggest("ami");  // Get suggestions
avro.commit("ami", "আমি");               // Commit selection

Flow:

User Input → suggest() → SuggestionBuilder
                     ↪ Returns suggestions array
Selected word → commit() → Updates cache

2. SuggestionBuilder (suggestion-builder.ts)

The heart of the suggestion system. It coordinates the other components and assembles the final suggestion list.

Features:

  • Manages suggestion cache
  • Handles dictionary lookups
  • Applies phonetic rules
  • Maintains user's previous selections

Flow:

Input Text
    ↓
1. Split into parts (begin/middle/end)
    ↓
2. Get phonetic conversion
    ↓
3. Search dictionary
    ↓
4. Apply suffix rules
    ↓
5. Sort suggestions
    ↓
6. Return result with prev selection
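
The steps above can be sketched roughly as follows. This is a simplified illustration, not the library's implementation: `splitWord`, `buildSuggestions`, the `convert` and `lookup` callbacks, and the rule that "begin" and "end" are runs of non-alphanumeric padding are all assumptions made for the example. Suffix rules (step 4) and sorting (step 5) are left out to keep it short.

```typescript
interface SplitWord {
  begin: string;
  middle: string;
  end: string;
}

interface SuggestionResult {
  words: string[];
  prevSelection: number;
}

// Step 1. Assumption: "begin" and "end" are runs of non-alphanumeric padding.
function splitWord(input: string): SplitWord {
  const m = /^([^A-Za-z0-9]*)([\s\S]*?)([^A-Za-z0-9]*)$/.exec(input);
  if (m === null) return { begin: "", middle: input, end: "" };
  return { begin: m[1] ?? "", middle: m[2] ?? "", end: m[3] ?? "" };
}

function buildSuggestions(
  input: string,
  convert: (roman: string) => string,  // stand-in for PhoneticProcessor (step 2)
  lookup: (roman: string) => string[], // stand-in for DatabaseSearch (step 3)
  previous: Map<string, string>,       // hypothetical cache: Roman word -> chosen Bengali word
): SuggestionResult {
  const { begin, middle, end } = splitWord(input);
  const phonetic = convert(middle);
  // Put the phonetic conversion first, then dictionary hits, without duplicates.
  const unique = Array.from(new Set([phonetic, ...lookup(middle)]));
  // Step 6: point at the user's previous choice if there is one, else at index 0.
  const prevWord = previous.get(middle);
  const index = prevWord === undefined ? -1 : unique.indexOf(prevWord);
  return {
    words: unique.map((w) => begin + w + end), // re-attach the padding
    prevSelection: index >= 0 ? index : 0,
  };
}
```

Passing the converter and the dictionary lookup as callbacks keeps the sketch self-contained and mirrors the idea that the builder only coordinates the other components.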

3. PhoneticProcessor (phonetic-processor.ts)

Handles core phonetic pattern matching and conversion.

Rules example:

Input: "bhl"      Input: "aa"
   ↓                 ↓
Check patterns    Check patterns
   ↓                 ↓
Output: "ভ্ল"     Output: "আ"

Pattern matching flow:

Input character
    ↓
+----------------+
| Match patterns |←→ Pattern database
+----------------+
    ↓
Apply rules
    ↓
Return Bengali chars
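
A minimal sketch of this kind of pattern matching, assuming a greedy longest-match strategy over a tiny hand-written table. The table contents, the `toBengali` name, and the pass-through of unknown characters are hypothetical; the library's real rule set is much larger and also takes context into account.

```typescript
// Hypothetical mini pattern table (Roman chunk -> Bengali output).
const PATTERNS: Record<string, string> = {
  bhl: "ভ্ল",
  bh: "ভ",
  aa: "আ",
  b: "ব",
  l: "ল",
};

// Longest key length, so we know how far ahead to look at each position.
const MAX_LEN = Math.max(...Object.keys(PATTERNS).map((k) => k.length));

// Convert Roman input by always taking the longest pattern that matches at the cursor.
function toBengali(input: string): string {
  let out = "";
  let i = 0;
  while (i < input.length) {
    let matched = false;
    for (let len = Math.min(MAX_LEN, input.length - i); len > 0; len--) {
      const hit = PATTERNS[input.slice(i, i + len)];
      if (hit !== undefined) {
        out += hit;
        i += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      out += input.charAt(i); // unknown characters pass through unchanged
      i += 1;
    }
  }
  return out;
}

console.log(toBengali("bhl")); // "ভ্ল"
console.log(toBengali("aa"));  // "আ"
```

Trying the longest chunk first is what makes "bhl" produce the conjunct rather than the separate outputs for "bh" and "l".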

4. DatabaseSearch (database-search.ts)

Manages dictionary lookups and word suggestions.

Flow:

Input word
    ↓
1. Get candidates by first char
    ↓
2. Convert to regex pattern
    ↓
3. Search in word tables
    ↓
4. Return matching words
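
The lookup steps above might look something like this. The table names, the table contents, and the `searchDictionary` signature are invented for illustration, and the regex builder is passed in as a callback because its real behaviour lives in RegexProcessor.

```typescript
// Hypothetical data: which word tables to scan for a given first Roman letter.
const TABLES_BY_FIRST_CHAR: Record<string, string[]> = {
  a: ["aa", "a"],
};

// Hypothetical word tables (a real dictionary would be far larger).
const WORD_TABLES: Record<string, string[]> = {
  aa: ["আমার", "আমি", "আম"],
  a: ["অমর"],
};

function searchDictionary(
  roman: string,
  toPattern: (roman: string) => string, // stand-in for RegexProcessor
): string[] {
  // 1. Pick candidate tables from the first character.
  const tableNames = TABLES_BY_FIRST_CHAR[roman.charAt(0).toLowerCase()] ?? [];
  // 2. Build an anchored regex so only whole words match.
  const re = new RegExp("^" + toPattern(roman) + "$");
  // 3. Scan the candidate tables and 4. collect the matches.
  const matches: string[] = [];
  for (const name of tableNames) {
    for (const word of WORD_TABLES[name] ?? []) {
      if (re.test(word)) matches.push(word);
    }
  }
  return matches;
}

// A fixed pattern for "amar": আ or অ, then ম, an optional aa-kar (U+09BE), then র.
const found = searchDictionary("amar", () => "[আঅ]ম\u09BE?র");
console.log(found); // ["আমার", "অমর"]
```

Restricting the scan to tables chosen by the first character keeps the regex from being tested against the whole dictionary.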

5. RegexProcessor (regex-processor.ts)

Handles complex pattern matching using regular expressions.

Features:

  • Complex character combinations
  • Special character handling
  • Contextual matches

Flow:

Input text
    ↓
+------------------+
| Process patterns |←→ Regex patterns
+------------------+
    ↓
Apply transformations
    ↓
Return regex string
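
As a rough illustration of turning Roman input into a regex string, the sketch below maps each Roman letter to a fragment of Bengali alternatives and escapes anything it does not recognise. The `REGEX_PATTERNS` table and the `toRegexString` and `escapeRegex` names are assumptions for this example; the library's real patterns cover multi-letter combinations and context.

```typescript
// Hypothetical table: one Roman letter -> regex fragment of Bengali alternatives.
// \u09BE is the dependent vowel sign "aa-kar".
const REGEX_PATTERNS: Record<string, string> = {
  a: "(?:আ|অ|\u09BE)?",
  m: "ম",
  r: "র",
};

// Escape characters that have a special meaning inside a regular expression.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a regex source string, letter by letter.
function toRegexString(roman: string): string {
  let pattern = "";
  for (const ch of roman.toLowerCase().split("")) {
    pattern += REGEX_PATTERNS[ch] ?? escapeRegex(ch);
  }
  return pattern;
}

const re = new RegExp("^" + toRegexString("amar") + "$");
console.log(re.test("আমার")); // true
console.log(re.test("আমি"));  // false
```

Escaping unknown characters matters because user input such as "." or "(" would otherwise change the meaning of the generated pattern.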

6. Utils & Dictionaries

Utils (utils.ts)

  • String manipulation
  • Levenshtein distance calculation
  • Unicode conversion
  • Special character detection
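
For reference, Levenshtein distance is usually computed with dynamic programming. The following is a standard two-row sketch, not the code in utils.ts; the `levenshtein` name is an assumption, and the comparison works on UTF-16 code units rather than on visible Bengali grapheme clusters.

```typescript
// Classic two-row dynamic-programming Levenshtein distance.
function levenshtein(a: string, b: string): number {
  if (a === b) return 0;
  // prev[j] holds the distance between the first i-1 units of a and the first j units of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(
        Math.min(
          prev[j]! + 1,        // deletion
          curr[j - 1]! + 1,    // insertion
          prev[j - 1]! + cost, // substitution (or match when cost is 0)
        ),
      );
    }
    prev = curr;
  }
  return prev[b.length]!;
}

console.log(levenshtein("kitten", "sitting")); // 3
```

A distance like this is one plausible way to rank dictionary matches by closeness to the phonetic conversion.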

Dictionaries (dictionaries.ts)

+-------------------+
|     Dictionary    |
+-------------------+
| - Word tables     |
| - Suffix rules    |
| - Autocorrect     |
+-------------------+

Complete Data Flow

User types "amar"
     ↓
AvroPhonetic.suggest()
     ↓
SuggestionBuilder
  ↙     ↓      ↘
Phonetic   Database   RegexProcessor
Processor   Search
  ↓          ↓          ↓
Parse     Find      Generate
patterns  matches   patterns
  ↓          ↓          ↓
  →  Combine results  ←
         ↓
Apply suffix rules
         ↓
Sort suggestions
         ↓
Return ["আমার", ...]
         ↓
User selects "আমার"
         ↓
AvroPhonetic.commit()
         ↓
Update cache & save

Usage Example

import { AvroPhonetic } from './avrolib';

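// Initialize. loadFromStorage and saveToStorage are placeholders
// for your own persistence helpers; they are not part of the library.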
const avro = new AvroPhonetic({
  loadCandidateSelectionsFromFile: () => loadFromStorage(),
  saveCandidateSelectionsToFile: (data) => saveToStorage(data)
});

// Get suggestions
const result = avro.suggest("amar");
// result = { words: ["আমার", ...], prevSelection: 0 }

// Commit selection
avro.commit("amar", "আমার");
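
The `loadFromStorage` and `saveToStorage` helpers in the example are not defined in this README. One possible in-memory version is sketched below. It assumes the selections object is JSON-serialisable and shaped as a map from Roman input to the chosen Bengali word; that shape and the `CandidateSelections` type name are guesses, so check the library's types before relying on them.

```typescript
// Assumption: selections map a Roman word to the Bengali word the user picked.
type CandidateSelections = Record<string, string>;

// In-memory stand-in for a file, localStorage, or any other backing store.
let stored: string | null = null;

function loadFromStorage(): CandidateSelections {
  if (stored === null) return {}; // nothing saved yet
  try {
    return JSON.parse(stored) as CandidateSelections;
  } catch {
    return {}; // corrupted data: start with an empty cache
  }
}

function saveToStorage(data: CandidateSelections): void {
  stored = JSON.stringify(data);
}

saveToStorage({ amar: "আমার" });
console.log(loadFromStorage()); // { amar: "আমার" }
```

Serialising to a string keeps the helpers easy to swap for a real file or browser storage API later.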

Components Interaction

User Input ("amar")
       ↓
   AvroPhonetic
       ↓
 SuggestionBuilder
 ↙      ↓       ↘
PP      DS       RP    [Components]
↓       ↓        ↓
phonetic  dict   regex  [Processing]
↓       ↓        ↓
suggestions merged      [Combination]
       ↓
sorted by relevance    [Ranking]
       ↓
returned to user       [Output]

Where:

  • PP: PhoneticProcessor
  • DS: DatabaseSearch
  • RP: RegexProcessor

This modular architecture allows for:

  • Easy maintenance
  • Clear separation of concerns
  • Extensible components
  • Testable units
  • Efficient caching
  • Type safety

Each component can be used independently or as part of the full system, making the library flexible for different use cases.
