leeyejin1231/KOTOX

KOTOX: A Korean Toxic Text for Obfuscation and Detoxification


❗️ Warning: this document contains content that may be offensive or upsetting.

KOTOX, the first Korean dataset for deobfuscation and detoxification, was built using linguistically grounded transformation rules to handle obfuscated toxic expressions.

🧪 About KOTOX

[Figure: KOTOX motivation]

KOTOX is the first Korean dataset designed for deobfuscation and detoxification of toxic language. Built from linguistically grounded transformation rules, it introduces obfuscated instances to model how users disguise offensive expressions in Korean.

Unlike previous datasets that focus mainly on English or clean toxic text, KOTOX captures diverse obfuscation patterns specific to Hangul and Korean phonology, such as phonological, iconological, transliteration-based, syntactic, and pragmatic variations.

It supports three complementary tasks (obfuscated toxic text classification, neutral text deobfuscation, and toxic text sanitization), providing a unified benchmark to evaluate the robustness of language models against obfuscated toxic content and to advance safer language technologies for low-resource languages.

🗂️ Tasks

The dataset enables three complementary tasks:

  1. 🧠 Obfuscated Toxic Text Classification
     • Classify whether an obfuscated sentence is toxic or neutral.
  2. 🔤 Neutral Text Deobfuscation
     • Restore an obfuscated neutral sentence to its original, clean form.
  3. 🧼 Obfuscated Toxic Text Sanitization
     • Rewrite obfuscated toxic text into a deobfuscated, neutral sentence while preserving meaning.

🧩 Obfuscation Rules

KOTOX defines 17 transformation rules across 5 linguistic approaches, based on Korean linguistic properties.

Approach Transformation rule Example
Phonological Initial consonant repacement ํ•œ๊ตญ์ธ โ†’ ํ•œ๊พน์ธ
Medial vowel replacement ํ•ด์ˆ˜์š•์žฅ โ†’ ํ—ค์ˆ˜์š•์žฅ
Final consonant replacement ํ•œ๊ตญ์ธ โ†’ ํ•๊ตฎ์น
Ortographic reyllabification ํ•œ๊ตญ์ธ โ†’ ํ•œ๊ตฌ๊ธด
Initial consonant insertion ํ•œ๊ตญ์ธ โ†’ ํ•œ๊ตญ๊ธด
Medial vowel insertion ํ•œ๊ตญ์ธ โ†’ ํ™˜๊ถ‰์œˆ
Final consonant insertion ๋ฐ”๊นฅ โ†’ ๋ฐ•๊นฅ
Liaison ํ•  ์ง“์ด๊ฐ€ โ†’ ํ• ์ฐŒ์‹œ๊ฐ€
Iconological Hangeul look-alike ๊ท€์—ฝ๋‹ค โ†’ ์ปค์—ฝ๋‹ค
Cross-script substitution ์ญˆ๊พธ๋ฏธ โ†’ ๅ’๊พธๅฃI
Rotation-based variation ๋…ผ๋ฌธ โ†’ ๊ณฐ๊ตญ
Transliteration Phonetic substitution (Latin) ๋งํ–ˆ์–ด โ†’ mangํ–ˆ์–ด
Phonetic substitution (CJK) ์ˆ˜์ƒํ•ด โ†’ ๆฐด์ƒํ•ด
Semantic substitution ๊ฐ€์ง€๋งˆ์„ธ์š” โ†’ ๋ˆํŠธ๊ณ ์ฟ ๋‹ค์‚ฌ์ด
Syntactic Spacing perturbation ํ™”์žฅ์‹ค ๋”๋Ÿฝ๊ณ  ๋ณ„๋กœ โ†’ ํ™”์žฅ ์‹ค๋”๋Ÿฝ ๊ณ ๋ณ„๋กœ
Syllable anagram ์˜ค๋žœ๋งŒ์— ์™ธ๊ตญ์—ฌํ–‰์„ โ†’ ์˜ค๋งŒ๋žœ์— ์™ธ์—ฌ๊ตญํ–‰์„
Pragmatic Symbol/emoji insertion ๋ˆ์„ ์“ฐ๋Š” ํ˜ธ๊ฐฑ โ†’ ๋ˆ์„ยฐโ™ก ์“ฐ๋Š”ใ€Šํ˜ธ..๊ฐฑใ€‹โ‰ฅใ……โ‰ค
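
The phonological replacement rules act on the jamo (initial consonant, medial vowel, final consonant) inside each Hangul syllable. The following is a minimal sketch, not the repository's implementation: it relies only on the standard Unicode layout of precomposed Hangul syllables and reproduces three of the examples listed for the phonological approach. The names `decompose`, `compose`, and `replace_jamo`, and the replacement mappings passed to them, are illustrative assumptions.

```python
# Precomposed Hangul syllables start at U+AC00 and are laid out as
# code = 0xAC00 + (initial * 21 + medial) * 28 + final
INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")            # 19 initials
MEDIALS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")         # 21 medials
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals


def decompose(ch):
    """Return (initial, medial, final) indices, or None for non-Hangul."""
    offset = ord(ch) - 0xAC00
    if not 0 <= offset < 19 * 21 * 28:
        return None
    return offset // 588, (offset % 588) // 28, offset % 28


def compose(initial, medial, final):
    """Build a syllable from jamo indices."""
    return chr(0xAC00 + (initial * 21 + medial) * 28 + final)


def replace_jamo(text, position, mapping):
    """Replace jamo at position 0 (initial), 1 (medial) or 2 (final).

    `mapping` maps a source jamo to its replacement; jamo that are not
    in the mapping, and non-Hangul characters, are left unchanged.
    """
    table = (INITIALS, MEDIALS, FINALS)[position]
    out = []
    for ch in text:
        parts = decompose(ch)
        if parts is None:
            out.append(ch)
            continue
        parts = list(parts)
        jamo = table[parts[position]]
        parts[position] = table.index(mapping.get(jamo, jamo))
        out.append(compose(*parts))
    return "".join(out)


print(replace_jamo("한국인", 0, {"ㄱ": "ㄲ"}))             # 한꾹인
print(replace_jamo("해수욕장", 1, {"ㅐ": "ㅔ"}))           # 헤수욕장
print(replace_jamo("한국인", 2, {"ㄴ": "ㄵ", "ㄱ": "ㄲ"}))  # 핝굮읹
```

Working on jamo indices rather than on whole characters keeps each rule a small lookup, which is why the same helper covers initial, medial, and final replacement.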

🏗️ Dataset Construction

[Figure: KOTOX overview]

Base Corpus

  • Started from K/DA (7.5k Korean neutral-toxic sentence pairs)
  • After manual filtering by annotators, 2,294 high-quality pairs were selected as source data

Rule Application Process

  • Applied transformation rules to both neutral and toxic sides of each pair
  • Used an algorithm to sample and apply 2-4 rules per text, depending on difficulty:
    Easy: 2 rules
    Normal: 3 rules
    Hard: 4 rules
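
The difficulty-dependent sampling can be sketched as follows. This is an illustration under stated assumptions, not the algorithm in `augmentation.py`: it assumes rules are drawn uniformly without replacement and applied in the sampled order, and the four entries in `TOY_RULES` are simplified stand-ins rather than the 17 KOTOX rules. All function and variable names here are hypothetical.

```python
import random

# Number of rules applied per text at each difficulty level (from the list above).
RULES_PER_DIFFICULTY = {"easy": 2, "normal": 3, "hard": 4}

# Toy stand-ins for real transformation rules (assumptions for illustration).
TOY_RULES = {
    "remove_spaces": lambda t: t.replace(" ", ""),
    "append_symbol": lambda t: t + "♡",
    "double_spaces": lambda t: t.replace(" ", "  "),
    "wrap_brackets": lambda t: "《" + t + "》",
}


def obfuscate(text, rules, difficulty, seed=0):
    """Sample k distinct rules for the difficulty and apply them in order.

    Returns the transformed text and the names of the rules that were used.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    k = RULES_PER_DIFFICULTY[difficulty]
    chosen = rng.sample(sorted(rules), k)  # uniform, without replacement
    for name in chosen:
        text = rules[name](text)
    return text, chosen


for level in ("easy", "normal", "hard"):
    result, used = obfuscate("돈을 쓰는 호갱", TOY_RULES, level, seed=1)
    print(level, used, result)
```

Returning the list of applied rule names alongside the text makes it possible to audit which transformations produced a given obfuscated sample.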

Dataset Composition

  • Final dataset: 6.9k neutral-toxic pairs + corresponding obfuscated counterparts
  • Split into train/validation/test = 8:1:1 ratio for each difficulty level

| Dataset | train | valid | test | sum |
|---|---|---|---|---|
| easy | 1,835 | 229 | 230 | 2,294 |
| normal | 1,835 | 229 | 230 | 2,294 |
| hard | 1,835 | 229 | 230 | 2,294 |
| total | 5,505 | 687 | 690 | 6,882 |
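
The per-level counts are consistent with flooring the train and validation shares and assigning the remainder to test. That rounding scheme is inferred from the numbers in the table, not documented by the authors, so the `split_sizes` helper below is a hypothetical arithmetic check only.

```python
def split_sizes(n, train_ratio=0.8, valid_ratio=0.1):
    """Split n items 8:1:1, flooring train/valid and giving the rest to test."""
    n_train = int(n * train_ratio)
    n_valid = int(n * valid_ratio)
    return n_train, n_valid, n - n_train - n_valid


per_level = split_sizes(2294)
print(per_level)                   # (1835, 229, 230)
print([3 * s for s in per_level])  # [5505, 687, 690] across easy/normal/hard
```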

⚒️ Setup

Datasets

└── data
    ├── KOTOX
    │   ├── easy
    │   ├── normal
    │   ├── hard
    │   └── total
    └── KOTOX_classification
        ├── easy
        ├── normal
        ├── hard
        └── total

KOTOX: for obfuscation and detoxification, 🤗 huggingface-KOTOX

KOTOX_classification: for toxic hate speech detection, 🤗 huggingface-KOTOX-classification
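
Before running the scripts, it can help to confirm that the local `data` directory matches the layout shown above. The check below is a small sketch: the directory names come from the tree, while the `missing_dirs` helper is hypothetical and makes no assumption about the file names inside each folder.

```python
from pathlib import Path

# Sub-directories expected under the data root, taken from the tree above.
EXPECTED_DIRS = [
    f"{dataset}/{level}"
    for dataset in ("KOTOX", "KOTOX_classification")
    for level in ("easy", "normal", "hard", "total")
]


def missing_dirs(data_root="data"):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data")
    print("layout OK" if not missing else f"missing: {missing}")
```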

Environment Setup

Install the necessary dependencies using the provided requirements file:

$ pip install -r requirements.txt

Add a .env file to use the OpenAI API:

OPENAI_API_KEY=<Your OpenAI API Key>
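
How the repository reads this file is not stated here; a package such as python-dotenv is a common choice, but that is an assumption. As a dependency-free sketch, the hypothetical `load_env_file` helper below parses simple `KEY=value` lines with the standard library only and tolerates spaces around the `=` sign.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=value lines from a .env file into a dict.

    Blank lines and lines starting with '#' are skipped; surrounding
    whitespace and matching quotes around the value are stripped.
    Returns an empty dict if the file does not exist.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    values = {}
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    key = load_env_file().get("OPENAI_API_KEY")
    if key:
        # Export without overwriting a value already set in the shell.
        os.environ.setdefault("OPENAI_API_KEY", key)
    print("OPENAI_API_KEY found" if key else "OPENAI_API_KEY not set")
```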

Clone the G2P repository:

$ git clone https://github.com/seongmin-mun/KoG2Padvanced.git

🚀 Usage

Augmentation

$ python augmentation.py

Classification

Train

Modify the classification/train_config.py file.

$ cd classification
$ python train.py

Evaluation

Modify the classification/eval_config.py file.

$ cd classification
$ python eval.py

Fine-tuning

Modify the finetuning/train_examples.sh file.

$ chmod +x finetuning/train_examples.sh
$ ./finetuning/train_examples.sh

Citation

@misc{lee2025kotoxkoreantoxicdataset,
      title={KOTOX: A Korean Toxic Dataset for Deobfuscation and Detoxification}, 
      author={Yejin Lee and Su-Hyeon Kim and Hyundong Jin and Dayoung Kim and Yeonsoo Kim and Yo-Sub Han},
      year={2025},
      eprint={2510.10961},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.10961}, 
}
