janlelis/unicode-script.js

unicode-scripts.js

Retrieve all Unicode script(s) used in a string.

It can also return the Script_Extensions property, which covers characters that are "commonly used with more than one script, but with a limited number of scripts".

Unicode version: 16.0.0 (September 2024)

Install

Use npm or your favorite package manager to install this module:

npm install unicode-scripts

Or use the ESM module directly in the browser.

Usage - Scripts

unicodeScripts(string) / unicodeScriptCodes(string)

import { unicodeScripts, unicodeScriptCodes } from "unicode-scripts";

// Set of all scripts of a string
unicodeScripts("СC") // Set(2) { 'Cyrillic', 'Latin' }
unicodeScripts("𐱐") // Set(1) { 'Unknown' }

// Get all scripts of string in ISO 15924 four-letter codes
unicodeScriptCodes("СC") // Set(2) { 'Cyrl', 'Latn' }
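Because the result is a Set, a mixed-script check (for example, to flag look-alike strings such as "СC") takes only a few lines. The sketch below is not part of this module: isMixedScript and NEUTRAL_SCRIPTS are hypothetical names, and it assumes that 'Common' and 'Inherited' can appear in the returned set and should not count as scripts of their own. It takes the Set as an argument, so it runs without the package installed.

```javascript
// Hypothetical helper, not part of unicode-scripts.
// Assumption: "Common" and "Inherited" may show up in the set (punctuation,
// digits, combining marks) and should not count towards mixing.
const NEUTRAL_SCRIPTS = new Set(["Common", "Inherited"]);

// Returns true if the given set of script names contains more than one
// script after ignoring the neutral ones.
function isMixedScript(scripts) {
  let count = 0;
  for (const script of scripts) {
    if (!NEUTRAL_SCRIPTS.has(script)) count += 1;
  }
  return count > 1;
}

console.log(isMixedScript(new Set(["Cyrillic", "Latin"]))); // true
console.log(isMixedScript(new Set(["Latin", "Common"]))); // false
```

With the module installed, the helper could be called as isMixedScript(unicodeScripts("СC")).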

unicodeScript(char) / unicodeScriptCode(char)

import { unicodeScript } from "unicode-scripts";

// Script of a single character
unicodeScript("ᴦ") // "Greek"

Usage - Script Extensions

unicodeScriptExtensions(string) / unicodeScriptExtensionCodes(string)

import { unicodeScriptExtensions } from "unicode-scripts";
unicodeScriptExtensions("॥")
// Set(23) {
//   'Bengali',
//   'Devanagari',
//   'Dogra',
//   'Grantha',
//   'Gujarati',
//   'Gunjala_Gondi',
//   'Gurmukhi',
//   'Gurung_Khema',
//   'Kannada',
//   'Khudawadi',
//   'Limbu',
//   'Mahajani',
//   'Malayalam',
//   'Masaram_Gondi',
//   'Nandinagari',
//   'Ol_Onal',
//   'Oriya',
//   'Sinhala',
//   'Syloti_Nagri',
//   'Takri',
//   'Tamil',
//   'Telugu',
//   'Tirhuta'
// }
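To see how Script and Script_Extensions differ for this character, it can help to compare the two properties directly. The sketch below does not use this module at all. It relies on JavaScript's built-in Unicode property escapes in regular expressions, and the variable names are illustrative only. In the Unicode data, "॥" (U+0965) has the Script value Common, while its Script_Extensions list the scripts that actually use it.

```javascript
// Built-in regular expression property escapes, independent of unicode-scripts.
const doubleDanda = "\u0965"; // "॥" DEVANAGARI DOUBLE DANDA

// Script: the single script value assigned to the character.
const scriptIsDevanagari = /^\p{Script=Devanagari}$/u.test(doubleDanda);
const scriptIsCommon = /^\p{Script=Common}$/u.test(doubleDanda);

// Script_Extensions: the set of scripts the character is used with.
const extensionsIncludeDevanagari = /^\p{Script_Extensions=Devanagari}$/u.test(doubleDanda);
const extensionsIncludeLatin = /^\p{Script_Extensions=Latin}$/u.test(doubleDanda);

console.log(scriptIsDevanagari); // false
console.log(scriptIsCommon); // true
console.log(extensionsIncludeDevanagari); // true
console.log(extensionsIncludeLatin); // false
```

The regex approach answers yes/no questions about one script at a time, whereas unicodeScriptExtensions returns the whole set in one call.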

More Examples / JSDoc

See SPECS and DOCS.

List of All Scripts

Script names and short names can be retrieved like this:

import { listUnicodeScripts } from "unicode-scripts"
listUnicodeScripts() // Set(172) { 'Adlam', 'Ahom', 'Anatolian_Hieroglyphs', …

import { listUnicodeScriptCodes } from "unicode-scripts"
listUnicodeScriptCodes() // Set(172) { 'Adlm', 'Aghb', 'Ahom', …

You can find a list of all Unicode scripts, with links to Wikipedia, at character.construction/scripts.

Also See

MIT License