[12.x] Add acronym to strings #56336
I would suggest adding an alias called " Furthermore, as with other language-related things (see hotmeteor/titles#1 (comment), hotmeteor/titles#3, hotmeteor/titles#4 for example), some acronyms don't exactly follow these easy rules. A good addition would be to include "stop words", meaning words like "of" ("United States of America" = "US
framework/src/Illuminate/Support/Str.php, lines 1499 to 1503 in cea3d1b
Some abbreviations have no separators between letters, only a single punctuation mark at the end ("e.g.", "etc.", …). What about these?
Some stop words are a meaningful part of an acronym, as in the example ASAP, which includes "as". Maybe we need a separate method that removes stop words on demand.
From the European Commission's Style Guide:
Should we offer this, too? And what about diacritics? Are they kept or omitted?
Some initialisms lowercase stop words instead of removing them. This may get more complex than initially thought 😅 Also, what about hyphenated words?
        return '';
    }

    $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';
I'm not sure whether we should throw here instead, or at least make this configurable.
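To make the "throw" option concrete, here is a minimal sketch in plain PHP. The helper name `validate_separator` and the `$allowed` parameter are hypothetical and not part of the PR; they only illustrate rejecting an unsupported separator instead of silently falling back to an empty string, with the allowed list left configurable by the caller.

```php
<?php

// Hypothetical helper, not part of the PR: rejects unsupported separators
// instead of silently replacing them with ''.
function validate_separator(string $separator, array $allowed = ['', '.', '-', '_', '/', ' ']): string
{
    if (! in_array($separator, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported acronym separator [{$separator}].");
    }

    return $separator;
}
```

Passing a custom `$allowed` array would cover the "configurable" variant; whether an exception is preferable to a silent fallback is the open question in this comment.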
    $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';

    preg_match_all('/\b[a-zA-Z]/', $string, $matches);
I feel like, instead of this,
framework/src/Illuminate/Support/Str.php, line 1474 in cea3d1b:
    $parts = explode(' ', $value);
framework/src/Illuminate/Support/Str.php, line 1722 in cea3d1b:
    $words = explode(' ', static::replace(['-', '_'], ' ', $value));
framework/src/Illuminate/Support/Str.php, line 1507 in cea3d1b:
    $words = preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY);
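The practical difference between the regex in the PR and the word-splitting style of the referenced lines shows up with non-ASCII input. The following is a minimal sketch in plain PHP (using `mb_*` functions rather than the `Str` helpers, which is an assumption made to keep it self-contained); the variable names are illustrative only.

```php
<?php

$value = 'Änderung der Datenschutzerklärung';

// Approach from the PR: first ASCII letter after each word boundary.
// The character class only contains a-z and A-Z, so "Ä" can never be
// selected as an initial.
preg_match_all('/\b[a-zA-Z]/', $value, $matches);
$viaRegex = strtoupper(implode('', $matches[0]));

// Word-splitting approach, modelled on the preg_split() call referenced above:
// split on whitespace, then take the first multibyte character of each word.
$words = preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY);
$viaSplit = implode('', array_map(
    fn (string $word) => mb_strtoupper(mb_substr($word, 0, 1)),
    $words
));

echo $viaSplit, PHP_EOL; // ÄDD
```

Splitting into words first also makes later per-word rules, such as stop words, easier to apply than working on a flat list of matched letters.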
See #56338
The idea of the PR is just to get an acronym of the provided string, regardless of external rules. If more control is required, you can chain other string methods before and after the acronym method.
That would defeat the purpose. The methods should handle whatever can be defined as rules internally, without causing extra work for the caller. I'm not saying this should hard-code these rules, but that the functionality should be configurable through user-passed rules. I'm not suggesting a whole i18n/Intl implementation either; Laravel will never have those capabilities anyway, because that is not the aim of the framework.
     * @param string $separator
     * @return string
     */
    public static function acronym($string, $separator = '')
Don't we usually call this $value?
Suggested change:
- public static function acronym($string, $separator = '')
+ public static function acronym($value, $separator = '')
    public static function acronym($string, $separator = '')
    {
        if (empty($string)) {
            return '';
        }

        $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';

        preg_match_all('/\b[a-zA-Z]/', $string, $matches);

        $acronym = implode($separator, $matches[0]);

        return static::upper($acronym);
    }
By the way, this is how the method could look with my suggested changes:
    public static function acronym($string, $separator = '')
    {
        if (empty($string)) {
            return '';
        }
        $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';
        preg_match_all('/\b[a-zA-Z]/', $string, $matches);
        $acronym = implode($separator, $matches[0]);
        return static::upper($acronym);
    }
    public static function initialism($string, $separator = '', $stopWords = [], bool $excludeStopWords = true, int $uppercaseLimit = -1, bool $normalize = true)
    {
        if (empty($string)) {
            return '';
        }
        $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';
        $words = mb_split('\s+', $string);
        if (count($stopWords) && $excludeStopWords) {
            $words = array_values(array_filter($words, fn (string $word) => !in_array($word, $stopWords)));
        }
        if ($normalize) {
            $words = array_map(static::ascii(...), $words);
        }
        if (count($stopWords) && !$excludeStopWords) {
            $initials = array_map(
                fn (string $word) => in_array($word, $stopWords)
                    ? static::lower(static::substr($word, 0, 1))
                    : static::upper(static::substr($word, 0, 1)),
                $words,
            );
        } elseif ($uppercaseLimit !== -1 && count($words) > $uppercaseLimit) {
            $initials = array_map(
                fn (string $word, int $idx) => $idx === 0
                    ? static::upper(static::substr($word, 0, 1))
                    : static::lower(static::substr($word, 0, 1)),
                $words,
                array_keys($words),
            );
        } else {
            $initials = array_map(
                fn (string $word) => static::upper(static::substr($word, 0, 1)),
                $words,
            );
        }
        return implode($separator, $initials);
    }
With the following cases:
Str::initialism('United States of America'), // USOA
Str::initialism('United States of America', '.'), // U.S.O.A
Str::initialism('United States of America', stopWords: ['of']), // USA
Str::initialism('United States of America', stopWords: ['of'], excludeStopWords: false), // USoA
Str::initialism('United States of America', uppercaseLimit: 4), // USOA
Str::initialism('United States of America', uppercaseLimit: 2), // Usoa
Str::initialism('Änderung der Datenschutzerklärung', normalize: false), // ÄDD (notice the A umlaut here)
Str::initialism('Änderung der Datenschutzerklärung', normalize: true), // ADD
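For readers who want to try the stop-word behaviour outside the framework, the following is a reduced sketch in plain PHP. The function name `initialism_sketch` is hypothetical, and the sketch deliberately omits the `$normalize` and `$uppercaseLimit` options of the suggestion above, since normalization there relies on `Str::ascii`. It is an approximation of the suggested logic, not the reviewer's implementation.

```php
<?php

// Hypothetical standalone helper, not part of Laravel or the PR.
function initialism_sketch(
    string $value,
    string $separator = '',
    array $stopWords = [],
    bool $excludeStopWords = true
): string {
    // Split on whitespace; fall back to an empty list if the split fails.
    $words = preg_split('/\s+/u', $value, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // Either drop stop words entirely...
    if ($stopWords !== [] && $excludeStopWords) {
        $words = array_values(array_filter(
            $words,
            fn (string $word) => ! in_array($word, $stopWords, true)
        ));
    }

    // ...or keep them and lowercase their initial instead.
    $initials = array_map(function (string $word) use ($stopWords) {
        $initial = mb_substr($word, 0, 1);

        return in_array($word, $stopWords, true)
            ? mb_strtolower($initial)
            : mb_strtoupper($initial);
    }, $words);

    return implode($separator, $initials);
}

echo initialism_sketch('United States of America', stopWords: ['of']), PHP_EOL; // USA
echo initialism_sketch('United States of America', stopWords: ['of'], excludeStopWords: false), PHP_EOL; // USoA
```

Stop-word matching here is case-sensitive and exact, as in the suggestion above; a real implementation would probably need to decide how to handle casing.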
Thanks for your pull request to Laravel! Unfortunately, I'm going to delay merging this code for now. To preserve our ability to adequately maintain the framework, we need to be very careful regarding the amount of code we include. If applicable, please consider releasing your code as a package so that the community can still take advantage of your contributions!
This PR introduces a method that converts a given string into its acronym, with support for customizable separators.
Accepted separators:
['.', '-', '_', '/', ' ']
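As a quick illustration of the described behaviour, here is a standalone sketch that mirrors the logic of the proposed method in plain PHP. The function name `acronym_sketch` is hypothetical, and `strtoupper` stands in for `static::upper`; since the PR was not merged, there is no `Str::acronym` to call directly.

```php
<?php

// Hypothetical standalone copy of the PR's logic, for experimentation only.
function acronym_sketch(string $value, string $separator = ''): string
{
    if ($value === '') {
        return '';
    }

    // Unsupported separators silently fall back to no separator.
    $separator = in_array($separator, ['.', '-', '_', '/', ' '], true) ? $separator : '';

    // Collect the first ASCII letter after each word boundary.
    preg_match_all('/\b[a-zA-Z]/', $value, $matches);

    return strtoupper(implode($separator, $matches[0]));
}

echo acronym_sketch('Portable Network Graphics'), PHP_EOL;      // PNG
echo acronym_sketch('Portable Network Graphics', '.'), PHP_EOL; // P.N.G
echo acronym_sketch('Portable Network Graphics', '*'), PHP_EOL; // PNG
```

Note that the separator is placed between letters only, so no trailing separator is produced.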