Description
While working with Greek texts, we received a report that accents should be stripped from the input when the text is converted to uppercase.
Note that this is not the same as simply converting the string to ASCII.
Example:
In Greek:
άδικος, κείμενο, ίριδα → ΑΔΙΚΟΣ, ΚΕΙΜΕΝΟ, ΙΡΙΔΑ
In Turkish:
istanbul → İSTANBUL
It doesn't seem possible to do this with the current String component. Would it be an idea to add a new LocaleString class to the component?
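To make the gap concrete, here is a minimal sketch that compares locale-independent upper-casing with ICU's locale-specific rules. It assumes the mbstring and intl extensions are installed and that the bundled ICU version ships the "el-Upper" and "tr-Upper" transliterators. mb_strtoupper() is used here only as a stand-in for locale-independent upper-casing; the exact output depends on the installed ICU data.

```php
<?php
// Sample words per language code (taken from the examples above).
$samples = [
    'el' => 'άδικος',
    'tr' => 'istanbul',
];

foreach ($samples as $locale => $word) {
    // Locale-independent upper-casing.
    $generic = mb_strtoupper($word, 'UTF-8');

    // Locale-aware upper-casing through ICU; create() returns null for unknown IDs.
    $transliterator = Transliterator::create($locale . '-Upper');
    $localized = null !== $transliterator ? $transliterator->transliterate($word) : null;

    printf("%s: generic=%s, localized=%s\n", $locale, $generic, $localized ?: '(unavailable)');
}
```

If the two columns differ for a language, that language needs locale-specific handling that a single locale-independent upper() cannot provide.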
Sources:
https://icu.unicode.org/design/case/greek-upper
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/toLocaleUpperCase
Very quick PoC
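use Symfony\Component\String\UnicodeString;

$string = "άδικος, κείμενο, ίριδα";
$localeString = new class ($string, "el") extends UnicodeString {
    public function __construct(string $string = '', private readonly ?string $languageCode = null)
    {
        parent::__construct($string);
    }

    public function upper(): static
    {
        // Without a language code, use ICU's generic "Any-Upper" rule ("-Upper" alone is not a valid ID).
        $id = ($this->languageCode ?? 'Any') . '-Upper';
        $transliterator = \Transliterator::create($id);
        if (null === $transliterator) {
            throw new \InvalidArgumentException(sprintf('Unknown transliterator "%s".', $id));
        }

        $str = clone $this;
        $str->string = $transliterator->transliterate($str->string);

        return $str;
    }
};
echo $localeString->upper(); // ΑΔΙΚΟΣ, ΚΕΙΜΕΝΟ, ΙΡΙΔΑ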
This still needs to be updated to implement more methods and to reuse Transliterator instances.
I also had to make the languageCode constructor argument nullable because of the parent class. Should it default to PHP's default locale?
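One possible way to address both open points is sketched below. The class name LocaleCaseTransliterators and its upper() method are hypothetical and not part of the String component. The sketch assumes the intl extension with intl.use_exceptions disabled (so that Transliterator::create() returns null on failure), caches one Transliterator per ID, and falls back to the primary language of PHP's default locale when no language code is given.

```php
<?php
// Hypothetical helper, not part of symfony/string. Requires the intl extension.
final class LocaleCaseTransliterators
{
    /** @var array<string, \Transliterator> Instances keyed by transliterator ID. */
    private static array $cache = [];

    public static function upper(?string $languageCode = null): \Transliterator
    {
        // Assumption: with no explicit code, use the language of PHP's default locale.
        $languageCode ??= \Locale::getPrimaryLanguage(\Locale::getDefault());
        $id = $languageCode . '-Upper';

        if (!isset(self::$cache[$id])) {
            // Languages without a dedicated rule fall back to the generic "Any-Upper".
            $transliterator = \Transliterator::create($id) ?? \Transliterator::create('Any-Upper');
            if (null === $transliterator) {
                throw new \RuntimeException('No upper-casing transliterator is available.');
            }
            self::$cache[$id] = $transliterator;
        }

        return self::$cache[$id];
    }
}

// Usage: repeated calls with the same code reuse the cached instance.
echo LocaleCaseTransliterators::upper('el')->transliterate('κείμενο'), "\n";
```

A static cache keeps the sketch short; in the component itself, an injectable or per-instance cache might be preferable.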