Description
In my previous PR, "Support ISO 3166-1 Alpha-3 country codes", I not only added support for alpha-3 country codes but also extended the Languages class to support ISO 639-2 three-letter language codes. My focus was on the country codes, and those were easy to get right because every country has exactly one alpha-2 code and one alpha-3 code.
For languages, things are a bit more complicated. The extension to the Languages class was made simply to mirror the Countries class, and not enough thought went into it. Here are some of the problems I can now see with Languages after thinking about it:
Wrong terminology
For languages we have ISO 639-1, which covers two-letter codes, and ISO 639-2, which covers three-letter codes. Nowhere do these ISO specifications talk about "alpha2" or "alpha3". Those terms are borrowed from the ISO 3166-1 spec, which covers country/region codes. So anyone familiar with ISO 639-1 and ISO 639-2 will wonder why the code has method names and variable names with "alpha2" and "alpha3" in them.
Missing languages
This is a more serious issue. Not all languages have an ISO 639-1 two-letter code. Here are some examples of the exceptions:
- 409 languages (out of 615) have only a three-letter code and no corresponding two-letter code. For example:
ace => Achinese
ach => Acoli
arz => Egyptian Arabic
- 22 languages must be described with more than three letters
en_AU => Australian English
de_AT => Austrian German
zh_Hans => Simplified Chinese
- 4 languages have only a two-letter code and no three-letter code
no => Norwegian
sh => Serbo-Croatian
tl => Tagalog
tw => Twi
- Only 180 languages have a two-letter to three-letter mapping.
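To make the four groups concrete, here is a minimal sketch in Python (an illustration only, not the component's PHP code). The `NAMES` table holds just the codes quoted above, plus the well-known `en`/`eng` pair as a stand-in for the 180 mapped languages; `classify` and the table names are hypothetical.

```python
# Hypothetical sample data: only the codes quoted in the list above, plus the
# en -> eng pair to represent the group that has a two-to-three-letter mapping.
NAMES = {
    "ace": "Achinese",
    "ach": "Acoli",
    "arz": "Egyptian Arabic",
    "en_AU": "Australian English",
    "de_AT": "Austrian German",
    "zh_Hans": "Simplified Chinese",
    "no": "Norwegian",
    "sh": "Serbo-Croatian",
    "tl": "Tagalog",
    "tw": "Twi",
    "en": "English",
}
TWO_TO_THREE = {"en": "eng"}

# Counts reported in this issue; the four groups add up to the full list.
COUNTS = {
    "three-letter only": 409,
    "longer than three letters": 22,
    "two-letter only": 4,
    "two-letter with three-letter mapping": 180,
}


def classify(code: str) -> str:
    """Return which of the four groups a known language code falls into."""
    if code not in NAMES:
        raise KeyError(f"Unknown language code: {code!r}")
    if len(code) > 3:
        return "longer than three letters"
    if len(code) == 3:
        return "three-letter only"
    if code in TWO_TO_THREE:
        return "two-letter with three-letter mapping"
    return "two-letter only"


if __name__ == "__main__":
    for code in ("ace", "en_AU", "no", "en"):
        print(code, "=>", classify(code))
    print("total:", sum(COUNTS.values()))
```

The group labels follow the wording of the list; the totals line simply confirms that 409 + 22 + 4 + 180 accounts for all 615 languages.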
Implications for Languages methods
Languages::getAlpha3Code
: This is the only one that is not new. It throws MissingResourceException for all but the 180 languages in the last category. That does not seem right to me. Maybe it should accept any code longer than two letters as input and return it unchanged if it is a valid language code?
Languages::getAlpha2Code
: Same considerations as above. Currently throws MissingResourceException for all but the 180 languages that have a two-letter code.
Languages::getAlpha3Codes()
: Currently only returns 180 codes.
Languages::alpha3CodeExists
: Currently only returns true for the above-mentioned 180 codes.
Languages::getAlpha3Name
: Throws MissingResourceException if the language is not among the 180.
Languages::getAlpha3Names
: Only returns a list of 180 languages. By contrast, Languages::getNames returns a list of 615 languages.
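One possible reading of the suggestion for Languages::getAlpha3Code can be sketched as follows. This is a Python illustration under stated assumptions, not the component's implementation: `get_alpha3_code`, `VALID_CODES`, `TWO_TO_THREE`, and the Python `MissingResourceException` class are hypothetical stand-ins, and the data is a tiny sample. How to treat codes such as `no`, which have no three-letter counterpart, is left open in this issue; the sketch keeps throwing for them.

```python
class MissingResourceException(LookupError):
    """Stand-in for the exception named in this issue; not the real PHP class."""


# Hypothetical sample data, not the real language list.
VALID_CODES = {"en", "no", "ace", "arz", "en_AU"}
TWO_TO_THREE = {"en": "eng"}
THREE_LETTER_TARGETS = set(TWO_TO_THREE.values())


def get_alpha3_code(code: str) -> str:
    """Map a two-letter code to its three-letter code; pass longer valid codes through."""
    if len(code) == 2:
        if code in TWO_TO_THREE:
            return TWO_TO_THREE[code]
        # Covers unknown codes and two-letter-only codes such as "no".
        raise MissingResourceException(f"No three-letter code for {code!r}")
    # Assumption: any longer code that is a known language code is returned unchanged.
    if code in VALID_CODES or code in THREE_LETTER_TARGETS:
        return code
    raise MissingResourceException(f"Unknown language code: {code!r}")


if __name__ == "__main__":
    for code in ("en", "ace", "en_AU"):
        print(code, "=>", get_alpha3_code(code))
```

With this behaviour the method would no longer fail for three-letter-only codes or for longer codes, but callers could no longer assume the result is always exactly three letters long.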
What to do
I am seeking input from others on what to do about this. Please comment below.