
benoitc/erlang-idna

erlang-idna

A pure Erlang IDNA implementation following RFC 5891.

Current Unicode version: 17.0.0

Features

  • IDNA 2008 compliance with RFC 5891
  • IDNA 2003 backward compatibility
  • UTS #46 compatibility processing (Unicode Technical Standard #46)
  • Full label validation:
    • NFC normalization check
    • Hyphen placement rules
    • Leading combining marks check
    • Contextual rules (CONTEXTJ/CONTEXTO)
    • Bidirectional text rules (RFC 5893)

Installation

Rebar3

Add to your rebar.config:

{deps, [
    {idna, "7.1.0"}
]}.

Mix (Elixir)

Add to your mix.exs:

defp deps do
  [
    {:idna, "~> 7.1"}
  ]
end

Quick Start

Encoding (Unicode → ASCII/Punycode)

%% Basic encoding
1> idna:encode("münchen.de").
"xn--mnchen-3ya.de"

2> idna:encode("βόλος.com").
"xn--nxasmq5b.com"

%% Japanese domain with UTS #46 processing
3> idna:encode("日本語.JP", [uts46]).
"xn--wgv71a119e.jp"

Decoding (ASCII/Punycode → Unicode)

1> idna:decode("xn--mnchen-3ya.de").
"münchen.de"

2> idna:decode("xn--nxasmq5b.com").
"βόλος.com"

Options

The encode/2 and decode/2 functions accept an options list:

| Option | Default | Description |
|---|---|---|
| uts46 | false | Enable UTS #46 compatibility processing |
| std3_rules | false | Enforce STD3 ASCII rules |
| transitional | false | Use transitional processing (IDNA 2003 compatibility) |
| strict | false | Only use ASCII period (.) as label separator |

Examples with Options

%% UTS #46 processing normalizes and maps characters
1> idna:encode("Ⅷ.com", [uts46]).
"viii.com"

%% Transitional processing (ß → ss)
2> idna:encode("faß.de", [uts46, transitional]).
"fass.de"

%% Non-transitional (default) preserves ß
3> idna:encode("faß.de", [uts46]).
"xn--fa-hia.de"

%% STD3 rules reject certain characters
4> idna:encode("_example.com", [uts46, std3_rules]).
** exception exit: {invalid_codepoint,95}

API Reference

Main Functions

| Function | Description |
|---|---|
| encode/1,2 | Encode a Unicode domain name to ASCII (Punycode) |
| decode/1,2 | Decode an ASCII domain name to Unicode |
| alabel/1 | Convert a single label to ASCII form (A-label) |
| ulabel/1 | Convert a single label to Unicode form (U-label) |

Validation Functions

| Function | Description |
|---|---|
| check_label/1,4 | Validate a domain label |
| check_nfc/1 | Check NFC normalization |
| check_hyphen/1 | Check hyphen placement rules |
| check_context/1 | Check contextual rules |
| check_initial_combiner/1 | Check for leading combining marks |
| check_label_length/1 | Check label length (max 63 octets) |

Compatibility Functions (Deprecated)

| Function | Replacement |
|---|---|
| to_ascii/1 | Use encode/1 |
| to_unicode/1 | Use decode/1 |
| from_ascii/1 | Use decode/1 |
| utf8_to_ascii/1 | Use encode/1 |

Documentation

Full API documentation is available on HexDocs.

Generate documentation locally:

rebar3 ex_doc

Updating Unicode Data

This library currently supports Unicode 17.0.0. To update to a new Unicode version:

1. Download Unicode Data Files

Replace VERSION with the target version (e.g., 17.0.0):

# Core Unicode data files
wget -O uc_spec/UnicodeData.txt https://www.unicode.org/Public/VERSION/ucd/UnicodeData.txt
wget -O uc_spec/ArabicShaping.txt https://www.unicode.org/Public/VERSION/ucd/ArabicShaping.txt
wget -O uc_spec/Scripts.txt https://www.unicode.org/Public/VERSION/ucd/Scripts.txt

# IDNA-specific files (path structure as of Unicode 17.0.0)
wget -O uc_spec/IdnaMappingTable.txt https://www.unicode.org/Public/VERSION/idna/IdnaMappingTable.txt
wget -O test/IdnaTestV2.txt https://www.unicode.org/Public/VERSION/idna/IdnaTestV2.txt
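The five downloads above differ only in the version segment of the URL, so they can be driven by a single variable instead of editing each line by hand. The following is a minimal sketch, assuming a POSIX shell. `UNICODE_VERSION`, `FETCH`, and `unicode_files` are names introduced here for illustration and are not part of the repository. It prints the commands by default and only downloads when `FETCH=1` is set.

```shell
#!/bin/sh
# Sketch: build the download list for one Unicode version.
# UNICODE_VERSION, FETCH and unicode_files are illustrative names,
# not part of the erlang-idna repository.
UNICODE_VERSION="${UNICODE_VERSION:-17.0.0}"
BASE="https://www.unicode.org/Public/${UNICODE_VERSION}"

# Print "destination url" pairs, one per line.
unicode_files() {
    for f in UnicodeData ArabicShaping Scripts; do
        printf '%s %s\n' "uc_spec/$f.txt" "$BASE/ucd/$f.txt"
    done
    printf '%s %s\n' "uc_spec/IdnaMappingTable.txt" "$BASE/idna/IdnaMappingTable.txt"
    printf '%s %s\n' "test/IdnaTestV2.txt" "$BASE/idna/IdnaTestV2.txt"
}

# Dry run by default; set FETCH=1 to actually download.
unicode_files | while read -r dest url; do
    if [ "${FETCH:-0}" = 1 ]; then
        wget -O "$dest" "$url"
    else
        echo "wget -O $dest $url"
    fi
done
```

Run from the repository root, for example `UNICODE_VERSION=17.0.0 FETCH=1 sh fetch_unicode.sh` (the script name is arbitrary). The `idna/` path layout is the one noted above for Unicode 17.0.0 and may differ for other versions.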

2. Generate IDNA Table

Use the kjd/idna Python tool:

git clone --depth 1 https://github.com/kjd/idna.git /tmp/kjd-idna
python3 /tmp/kjd-idna/tools/idna-data make-table --version VERSION > uc_spec/idna-table.txt

If the tool needs additional files, use the --source option instead:

python3 /tmp/kjd-idna/tools/idna-data make-table --version VERSION --source uc_spec > uc_spec/idna-table.txt

Remove the clone once the table has been generated:

rm -rf /tmp/kjd-idna

3. Regenerate Erlang Modules

cd uc_spec
./gen_idnadata_mod.escript
./gen_idna_table_mod.escript
./gen_idna_mapping_mod.escript
cd ..
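Before regenerating the modules, it can help to confirm that every file written into uc_spec by steps 1 and 2 is present and non-empty. A failed `make-table` run, for instance, still leaves an empty `idna-table.txt` behind because of the `>` redirect. The following is a minimal sketch; `check_inputs` is a hypothetical helper, not a script shipped with the repository, and the file list is simply the set of uc_spec files named in the steps above.

```shell
#!/bin/sh
# Sketch: report any input from steps 1 and 2 that is missing or empty.
# check_inputs is an illustrative helper, not part of the repository.
check_inputs() {
    dir="$1"
    missing=0
    for f in UnicodeData.txt ArabicShaping.txt Scripts.txt \
             IdnaMappingTable.txt idna-table.txt; do
        # -s is true only if the file exists and has a size greater than zero.
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing or pasting the function, run `check_inputs uc_spec` from the repository root and continue with the generator scripts only if it returns 0.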

4. Run Tests

rebar3 eunit

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
