English | Tiếng Việt [IJCAI 2025 Accepted Paper Preprint]
This project aims to analyze the Vietnamese language to develop a faster typing method by implementing word prediction based on partial input. For instance, inputting only x0ch2 should yield xin chào as the predicted output.
Completeness: v7 is essentially an improved VNI. Everything VNI can do, v7 can do too, so you can type any possible Vietnamese word with v7.
Use the script below to try the v7 method!
- The Vietnamese language uses many diacritics, and entering these marks makes typing time-consuming.
- v7 aims to simplify Vietnamese typing by using only the initial consonant and tone of each syllable to predict the intended words. For example, instead of typing tưởng tượng as tuong73 tuong75 (VNI) or tuongwr tuongwj (Telex), you can type t3t5 with v7!
- Naturally, this reduction in keystrokes leads to some information loss. For instance, the input t3t5 could also correspond to tiểu tiện, as 3 represents the hook tone hỏi and 5 represents the underdot tone nặng.
- This project analyzes and addresses these problems to ultimately introduce v7, enhancing the Vietnamese typing experience.
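To make the ambiguity concrete, here is a minimal Python sketch that splits a short-form input into (initial consonant, tone) keys and lists every phrase a toy lexicon allows. The names `TOY_LEXICON`, `split_keys`, and `candidates` are hypothetical and not part of the v7 codebase, and the sketch assumes the shortest input form only (lowercase initials followed by one tone digit from 0 to 7).

```python
import re
from itertools import product

# Toy lexicon keyed by (initial consonant, tone digit). It holds only the
# four words from the t3t5 example; a real lexicon would be far larger.
TOY_LEXICON = {
    ("t", "3"): ["tưởng", "tiểu"],
    ("t", "5"): ["tượng", "tiện"],
}


def split_keys(code):
    """Split a short-form input such as 't3t5' into (initial, tone) pairs."""
    # Assumption: every syllable is typed as optional letters plus one digit 0-7.
    if not re.fullmatch(r"(?:[a-z]*[0-7])+", code):
        raise ValueError(f"not a valid short-form input: {code!r}")
    return re.findall(r"([a-z]*)([0-7])", code)


def candidates(code):
    """List every phrase in the toy lexicon that matches the input."""
    options = [TOY_LEXICON.get(key, []) for key in split_keys(code)]
    return [" ".join(words) for words in product(*options)]


if __name__ == "__main__":
    print(split_keys("x0ch2"))
    print(candidates("t3t5"))
```

Even with two words per key, t3t5 already expands to four combinations, which is why the candidates need to be ranked rather than simply looked up.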
v7 borrows from both VNI and Telex.
- Special consonants:
  - g for both g and gh.
  - ng for both ng and ngh.
  - z for gi. (z6 → giúp, giết, giáp, ...)
  - dd for đ. (dd4 → đã, đãi, đỗ, ...) (Telex style)
- Tones (VNI style):
  - 0 for no tone: tuân, câm, tân...
  - 1 for normal acute: cấm, tiếng, tấn, thính... (compare with 6 to see the difference)
  - 2 for grave: tuần, cầm, tần...
  - 3 for hook: tẩn, cẩm, hỉ...
  - 4 for tilde: mãi, rã, phũ...
  - 5 for normal underdot: nhậm, phụng, độn, mạnh... (compare with 7 to see the difference)
  - 6 for entering/checked acute: cấp, tiếc, tất, thích... (every syllable with an acute that ends in p, t, c, or ch must use tone 6)
  - 7 for entering/checked underdot: nhập, phục, đột, mạch... (every syllable with an underdot that ends in p, t, c, or ch must use tone 7)
- Special vowels:
  - Typing lots of ă, â, ê, ô, ơ, ư in Vietnamese? That is no longer a problem: just type a, e, o, u and v7 will predict the most suitable vowel for you! This feature also reduces the number of keys you have to type.
This 8-tone system follows the Vietnamese Eight-Tone Analysis.
Note: If you aren't familiar with the 8-tone system, you can still configure v7 to use the traditional 6-tone VNI scheme. However, the 8-tone system is highly recommended because it gives much better AI results!
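The rules above can be sketched as a small encoder that turns a Vietnamese word into its shortest v7 key. This is an illustration under stated assumptions, not the project's actual code: `encode_word` and `encode_phrase` are hypothetical names, and any initial not listed above (for example qu, or words that start with a vowel) is simply passed through as the letters before the first vowel.

```python
import unicodedata

# Combining tone marks mapped to their 6-tone (VNI-style) digits.
TONE_MARKS = {
    "\u0301": "1",  # acute (sắc)
    "\u0300": "2",  # grave (huyền)
    "\u0309": "3",  # hook above (hỏi)
    "\u0303": "4",  # tilde (ngã)
    "\u0323": "5",  # dot below (nặng)
}
CHECKED_ENDINGS = ("p", "t", "c", "ch")
VOWELS = set("aeiouy")


def encode_word(word):
    """Return the shortest v7 key (initial + tone digit) for one syllable."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    tone = "0"
    for char in decomposed:
        if char in TONE_MARKS:
            tone = TONE_MARKS[char]
    # Drop every combining mark (tones and vowel marks such as the horn).
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    base = base.replace("đ", "dd")  # đ has no decomposition, so map it by hand
    # Checked syllables use 6 instead of 1 and 7 instead of 5.
    if base.endswith(CHECKED_ENDINGS):
        tone = {"1": "6", "5": "7"}.get(tone, tone)
    if base.startswith("gi"):
        initial = "z"
    else:
        end = 0
        while end < len(base) and base[end] not in VOWELS:
            end += 1
        # Assumption: initials not covered by the rules are kept unchanged.
        initial = {"gh": "g", "ngh": "ng"}.get(base[:end], base[:end])
    return initial + tone


def encode_phrase(phrase):
    """Concatenate the keys of every whitespace-separated syllable."""
    return "".join(encode_word(word) for word in phrase.split())
```

For example, `encode_phrase("xin chào")` gives `x0ch2`, while both `tưởng tượng` and `tiểu tiện` encode to `t3t5`.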
Operating Systems:
- ✅ macOS - Please switch to English keyboard
- ✅ Windows - Please switch to English keyboard
- ⛔ Linux – Not supported yet
Current Limitations:
- 🚫 CapsLock: Not currently supported. Please make sure CapsLock is off when typing.
- ⚠️ Accessibility: Some platforms (e.g., macOS) may require enabling accessibility permissions such as input monitoring for the tool to function correctly.
- Stable version in progress...
v7 predicts the words/phrases users want to type by checking and ranking possible words/phrases.
AI Mode uses v7gpt, a GPT-like model with a custom tokenizer built specifically for v7, trained on a Vietnamese corpus and based on Andrej Karpathy's nanoGPT.
- Advantages:
- Works in any circumstances.
- Understands the context in which the user is writing to predict the most suitable next word.
- Can effectively predict entire sentences at a time.
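To illustrate what context-aware ranking means here, the sketch below reorders candidate phrases based on the preceding word. It is a deliberately simple stand-in: v7gpt is a neural model, whereas this example uses a tiny lookup table, and the counts in `TOY_CONTEXT_COUNTS` are invented for illustration rather than taken from any corpus. The function name `rank_candidates` is hypothetical.

```python
# Invented co-occurrence counts (previous word, candidate phrase) -> count.
# These numbers are illustrative only and do not come from real data.
TOY_CONTEXT_COUNTS = {
    ("trí", "tưởng tượng"): 5,
    ("đi", "tiểu tiện"): 3,
}


def rank_candidates(previous_word, candidates):
    """Sort candidates so those seen most often after previous_word come first."""
    def score(candidate):
        return TOY_CONTEXT_COUNTS.get((previous_word, candidate), 0)

    # sorted() is stable, so candidates with equal scores keep their input order.
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    options = ["tưởng tượng", "tiểu tiện"]
    print(rank_candidates("trí", options))
    print(rank_candidates("đi", options))
```

The same input t3t5 ends up with a different top suggestion depending on what was typed before it, which is the behavior the context-aware model is meant to provide at much larger scale.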
This project uses Python 3.12.
To run the app in AI Mode, follow these steps:
- Install the required packages for AI Mode (Torch is required):
pip install -r requirements_ai.txt
- Download the pretrained model checkpoint:
gdown 1dDP0jIJ79syE6vt6QnVl05_4fYpuwrqd -O checkpoints/v7gpt-1.3.pth
# Or download the file at https://drive.google.com/file/d/12ZBG5IBOKmgmv7mh32uFdDUqr-K0SzPS/view?usp=drive_link to checkpoints/v7gpt-1.3.pth
- Start the application:
python main.py
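If the app fails to start, a quick pre-flight check can help narrow down the cause. The following sketch is not part of the project; `preflight` is a hypothetical helper that only verifies the two facts stated in the steps above, namely the Python version and the checkpoint location.

```python
import sys
from pathlib import Path

EXPECTED_PYTHON = (3, 12)
CHECKPOINT = Path("checkpoints/v7gpt-1.3.pth")


def preflight(version=None, checkpoint=CHECKPOINT):
    """Return a list of problems found; an empty list means the setup looks ready."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    problems = []
    if major_minor != EXPECTED_PYTHON:
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} detected, "
            f"but the project uses {EXPECTED_PYTHON[0]}.{EXPECTED_PYTHON[1]}"
        )
    if not Path(checkpoint).is_file():
        problems.append(f"checkpoint not found at {checkpoint}")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("setup looks ready; run: python main.py")
```

Run it from the repository root so that the relative checkpoint path resolves correctly.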