
#unicode #charset

textcode

Text encoding/decoding library. Supports: UTF-8, UTF-16, ISO 6937, ISO 8859, GB2312

8 releases

Uses new Rust 2024

0.3.1 Dec 15, 2025
0.3.0 Dec 15, 2025
0.2.2 May 2, 2022
0.2.1 Nov 3, 2020
0.1.0 Nov 18, 2019

#159 in Text processing


2,105 downloads per month
Used in 10 crates (5 directly)

MIT license

460KB
8K SLoC

textcode


Intro

Textcode is a library for text encoding/decoding.

The library uses non-strict (lossy) conversion: invalid or unmappable characters are replaced with ? rather than reported as errors.
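As an illustration of what non-strict conversion means, here is a minimal sketch of lossy ISO-8859-5 encoding written against the standard library only. It is not textcode's implementation: the helper names `iso8859_5_byte` and `encode_lossy` are hypothetical, and the mapping follows the published ISO-8859-5 table.

```rust
/// Hypothetical helper (not part of textcode): map one Unicode scalar
/// to its ISO-8859-5 byte, or None if the charset has no such character.
fn iso8859_5_byte(c: char) -> Option<u8> {
    let cp = c as u32;
    match cp {
        // ASCII, C1 controls and NBSP share their code points with Unicode.
        0x0000..=0x00A0 => Some(cp as u8),
        0x00AD => Some(0xAD), // soft hyphen
        0x00A7 => Some(0xFD), // section sign
        0x2116 => Some(0xF0), // numero sign
        // Cyrillic block, skipping U+040D (its slot holds the soft hyphen).
        0x0401..=0x040C | 0x040E..=0x044F => Some((cp - 0x0401 + 0xA1) as u8),
        // Remaining small letters, skipping U+045D (its slot holds the section sign).
        0x0451..=0x045C | 0x045E..=0x045F => Some((cp - 0x0451 + 0xF1) as u8),
        _ => None,
    }
}

/// Hypothetical helper: non-strict encoding that substitutes `?`
/// for every character the target charset cannot represent.
fn encode_lossy(text: &str) -> Vec<u8> {
    text.chars()
        .map(|c| iso8859_5_byte(c).unwrap_or(b'?'))
        .collect()
}
```

With this sketch, `encode_lossy("Привет!")` yields the bytes `bf e0 d8 d2 d5 e2 21`, while `encode_lossy("日本")` yields `??`, because neither character exists in ISO-8859-5.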

⚠️ Breaking change in v0.3.0

The library API has been completely redesigned:

Old API (v0.2.x): module-based functions

use textcode::iso8859_5;

let mut text = String::new();
iso8859_5::decode(b"\xbf\xe0\xd8\xd2\xd5\xe2!", &mut text);

let mut bytes = Vec::new();
iso8859_5::encode("Привет!", &mut bytes);

New API (v0.3.x): generic functions with codec types

use textcode::{Iso8859_5, decode, encode};

let text = decode::<Iso8859_5>(b"\xbf\xe0\xd8\xd2\xd5\xe2!");

let bytes = encode::<Iso8859_5>("Привет!");

Charsets

  • UTF-8
  • UTF-16 - decodes BE and LE input with a BOM; encodes as BE without a BOM
  • iso-6937 - Latin superset of ISO/IEC 6937 with Euro and letters with diacritics
  • iso-8859-1 - Western European
  • iso-8859-2 - Central European
  • iso-8859-3 - South European
  • iso-8859-4 - North European
  • iso-8859-5 - Cyrillic
  • iso-8859-6 - Arabic
  • iso-8859-7 - Greek
  • iso-8859-8 - Hebrew
  • iso-8859-9 - Turkish
  • iso-8859-10 - Nordic
  • iso-8859-11 - Thai
  • iso-8859-13 - Baltic Rim
  • iso-8859-14 - Celtic
  • iso-8859-15 - Western European
  • iso-8859-16 - South-Eastern European
  • gb2312 - Simplified Chinese
  • Geo - DVB single-byte Georgian character encoding (Magti TV)
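To make the UTF-16 entry in the list above concrete, the following sketch shows BOM-based endianness detection and BOM-less big-endian encoding using only the standard library. It is an assumption-laden illustration, not textcode's code: the function names are hypothetical, and falling back to big-endian when no BOM is present is a choice made here, not documented library behavior.

```rust
/// Hypothetical helper (not part of textcode): decode UTF-16 bytes,
/// choosing the byte order from a leading BOM.
/// Assumption: input without a BOM is treated as big-endian.
/// Unpaired surrogates become '?'; a trailing odd byte is ignored.
fn decode_utf16_with_bom(bytes: &[u8]) -> String {
    let (big_endian, body) = match bytes {
        [0xFE, 0xFF, rest @ ..] => (true, rest),  // BOM for UTF-16BE
        [0xFF, 0xFE, rest @ ..] => (false, rest), // BOM for UTF-16LE
        _ => (true, bytes),
    };
    let units = body.chunks_exact(2).map(|pair| {
        let pair = [pair[0], pair[1]];
        if big_endian {
            u16::from_be_bytes(pair)
        } else {
            u16::from_le_bytes(pair)
        }
    });
    char::decode_utf16(units)
        .map(|unit| unit.unwrap_or('?'))
        .collect()
}

/// Hypothetical helper: encode text as big-endian UTF-16 without a BOM.
fn encode_utf16_be(text: &str) -> Vec<u8> {
    text.encode_utf16()
        .flat_map(|unit| unit.to_be_bytes())
        .collect()
}
```

Both `[0xFE, 0xFF, 0x04, 0x1F]` and `[0xFF, 0xFE, 0x1F, 0x04]` decode to "П" here, and `encode_utf16_be("П!")` produces `04 1f 00 21` with no BOM prepended.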

Example

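use textcode::{Iso8859_5, decode, encode};

// The same text as UTF-8 and as ISO-8859-5 bytes.
const UTF8: &str = "Привет!";
const ISO8859_5: &[u8] = &[0xbf, 0xe0, 0xd8, 0xd2, 0xd5, 0xe2, 0x21];

// Decode ISO-8859-5 bytes into UTF-8 text.
let text = decode::<Iso8859_5>(ISO8859_5);
assert_eq!(text, UTF8);

// Encode UTF-8 text into ISO-8859-5 bytes.
let bytes = encode::<Iso8859_5>(UTF8);
assert_eq!(bytes, ISO8859_5);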

No runtime deps