nabilhassen

This PR introduces a method that converts a given string into its acronym, with support for customizable separators.

Accepted separators: ['.', '-', '_', '/', ' ']

Str::acronym('As soon as possible');         // 'ASAP'
Str::acronym('As soon as possible', '.');    // 'A.S.A.P'
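For illustration, the behaviour described above can be sketched as a standalone function. The name `acronym_sketch` is hypothetical and this is not the PR's code; it assumes that a separator outside the accepted list silently falls back to an empty string.

```php
<?php

// Hypothetical standalone sketch of the described behaviour; not the PR's code.
function acronym_sketch(string $value, string $separator = ''): string
{
    // Assumption: separators outside the accepted list are ignored.
    $allowed = ['.', '-', '_', '/', ' '];
    $separator = in_array($separator, $allowed, true) ? $separator : '';

    // Collect the first ASCII letter of every word.
    preg_match_all('/\b[a-zA-Z]/', $value, $matches);

    return strtoupper(implode($separator, $matches[0]));
}

// acronym_sketch('As soon as possible')      returns 'ASAP'
// acronym_sketch('As soon as possible', '.') returns 'A.S.A.P'
// acronym_sketch('As soon as possible', '+') returns 'ASAP' (separator ignored)
```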

@shaedrich
Contributor

shaedrich commented Jul 19, 2025

I would suggest adding an alias called "initialism" (this may even be the better term for what the method does 🤔)

Furthermore, as with other language-related things (see hotmeteor/titles#1 (comment), hotmeteor/titles#3, hotmeteor/titles#4 for example), some acronyms don't exactly follow these easy rules. A good addition would be to include "stop words", meaning words like "of" ("United States of America" = "USoA") which won't become part of the acronym:

$minorWords = [
    'and', 'as', 'but', 'for', 'if', 'nor', 'or', 'so', 'yet', 'a', 'an',
    'the', 'at', 'by', 'for', 'in', 'of', 'off', 'on', 'per', 'to', 'up', 'via',
    'et', 'ou', 'un', 'une', 'la', 'le', 'les', 'de', 'du', 'des', 'par', 'à',
];

Some abbreviations don't have separators between letters but only one punctuation mark at the end ("e.g.", "etc.", …)—what about these?
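The stop-word idea could be sketched roughly as follows. The helper name `initials_without_stop_words` is hypothetical, and the sketch assumes the caller passes stop words in lowercase and that the mbstring extension is available.

```php
<?php

// Hypothetical helper: build initials while skipping caller-supplied stop words.
// Assumption: $stopWords contains lowercase entries.
function initials_without_stop_words(string $value, array $stopWords): string
{
    $words = preg_split('/\s+/u', $value, -1, PREG_SPLIT_NO_EMPTY);

    $initials = [];
    foreach ($words as $word) {
        // Compare in lowercase so "Of" and "of" are treated alike.
        if (in_array(mb_strtolower($word), $stopWords, true)) {
            continue;
        }
        $initials[] = mb_strtoupper(mb_substr($word, 0, 1));
    }

    return implode('', $initials);
}

// initials_without_stop_words('United States of America', ['of']) returns 'USA'
// initials_without_stop_words('As soon as possible', ['as'])      returns 'SP'
```

The second call shows why a fixed built-in list is risky: dropping "as" turns "ASAP" into "SP", so the list probably has to come from the caller.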

@nabilhassen
Author

I would suggest adding an alias called "initialism"

Furthermore, as with other language-related things (see hotmeteor/titles#1 (comment), hotmeteor/titles#3, hotmeteor/titles#4 for example), some acronyms don't exactly follow these easy rules. A good addition would be to include "stop words", meaning words like "of" ("United States of America" = "USoA") which won't become part of the acronym:

$minorWords = [
    'and', 'as', 'but', 'for', 'if', 'nor', 'or', 'so', 'yet', 'a', 'an',
    'the', 'at', 'by', 'for', 'in', 'of', 'off', 'on', 'per', 'to', 'up', 'via',
    'et', 'ou', 'un', 'une', 'la', 'le', 'les', 'de', 'du', 'des', 'par', 'à',
];

Some abbreviations don't have separators between letters but only one punctuation mark at the end ("e.g.", "etc.", …)—what about these?

Some stop words are useful in acronyms, as in the example: ASAP includes "as". I think maybe we need another method to remove all stop words on demand.

@shaedrich
Contributor

shaedrich commented Jul 19, 2025

From the European Commission's Style Guide:

7.3. Writing acronyms.

Acronyms with up to five letters are written in upper case: AIDS, COST, COVID-19, ECHO, EFTA, NASA, NATO, SHAPE, TRIPS
Exceptions: Tacis and Phare, which are no longer considered acronyms.
Acronyms with six letters or more should normally be written with an initial capital followed by lower case. Thus: Benelux, Esprit, Helios, Interreg, Reside
Exceptions: organisations that themselves use upper case (such as UNESCO, CENELEC and UNCTAD) and other acronyms conventionally written in upper case.

Should we offer this, too?
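As a rough illustration, the length rule quoted above could look like this. The function name `style_guide_case` is hypothetical; the sketch counts characters rather than letters and does not handle the guide's exceptions (such as UNESCO), which would need an explicit list.

```php
<?php

// Hypothetical helper applying only the length rule quoted above.
// Exceptions listed in the guide (e.g. UNESCO) are not handled.
function style_guide_case(string $acronym): string
{
    // Up to five characters: all upper case.
    if (mb_strlen($acronym) <= 5) {
        return mb_strtoupper($acronym);
    }

    // Six or more: initial capital followed by lower case.
    return mb_strtoupper(mb_substr($acronym, 0, 1))
        . mb_strtolower(mb_substr($acronym, 1));
}

// style_guide_case('nato')    returns 'NATO'
// style_guide_case('BENELUX') returns 'Benelux'
```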

And what about diacritics? Are they kept or omitted?

I think maybe we need another method to remove all stop words on demand.

Some initialisms lowercase stop words instead of removing them. This may get more complex than initially thought 😅

Also, what about hyphenated words?
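To make the diacritics and hyphen questions concrete, here is a small comparison sketch. Both function names are hypothetical. The first uses the ASCII-only `/\b[a-zA-Z]/` pattern from the PR; the second is one possible Unicode-aware alternative (an assumption, not something proposed in the PR) that takes any letter not preceded by a letter or digit.

```php
<?php

// ASCII-only matching, using the pattern from the PR.
function ascii_initials(string $value): string
{
    preg_match_all('/\b[a-zA-Z]/', $value, $matches);

    return strtoupper(implode('', $matches[0]));
}

// Hypothetical Unicode-aware alternative: any letter that is not
// preceded by another letter or a digit starts a word.
function unicode_initials(string $value): string
{
    preg_match_all('/(?<![\p{L}\p{N}])\p{L}/u', $value, $matches);

    return mb_strtoupper(implode('', $matches[0]));
}

// unicode_initials('Änderung der Datenschutzerklärung') returns 'ÄDD'
// unicode_initials('State-of-the-art')                  returns 'SOTA'
// ascii_initials('State-of-the-art')                    returns 'SOTA'
```

With either pattern, every part of a hyphenated word contributes an initial. The ASCII-only pattern can never return "Ä", so diacritics are lost or mismatched unless the input is normalized first.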

return '';
}

$separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';
Contributor

I wonder whether we should throw here instead, or at least make this configurable
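One way to read this suggestion, sketched under assumptions: the helper name `validate_separator`, the choice of `InvalidArgumentException`, and the optional `$allowed` parameter are all hypothetical and not part of the PR.

```php
<?php

// Hypothetical validation helper: reject unsupported separators
// instead of silently replacing them with an empty string.
function validate_separator(string $separator, array $allowed = ['.', '-', '_', '/', ' ']): string
{
    // An empty separator always means "no separator".
    if ($separator !== '' && ! in_array($separator, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported separator [{$separator}].");
    }

    return $separator;
}

// validate_separator('.')        returns '.'
// validate_separator('+')        throws InvalidArgumentException
// validate_separator('+', ['+']) returns '+' (caller-configured list)
```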


$separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';

preg_match_all('/\b[a-zA-Z]/', $string, $matches);
Contributor

@shaedrich shaedrich Jul 19, 2025

I feel like, instead of this,

$parts = explode(' ', $value);
,
$words = explode(' ', static::replace(['-', '_'], ' ', $value));
, and
$words = preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY);
, we should use one method to split a string into words and one only.

Contributor

See #56338
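A single word-splitting helper of the kind suggested in this thread might look like the sketch below. The name `split_words` and the delimiter set (whitespace, hyphens, underscores) are assumptions for illustration and do not describe what the referenced PR does.

```php
<?php

// Hypothetical shared helper; the name and delimiter set are assumptions.
function split_words(string $value): array
{
    // Treat runs of whitespace, hyphens, and underscores as word
    // boundaries and drop empty pieces.
    return preg_split('/[\s_-]+/u', $value, -1, PREG_SPLIT_NO_EMPTY);
}

// split_words('  as-soon_as   possible ') returns ['as', 'soon', 'as', 'possible']
```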

@nabilhassen
Author

The idea of the PR is just to get an acronym of the string provided regardless of external rules. If more control is required you can chain other string methods before and after the acronym method.

@shaedrich
Contributor

If more control is required you can chain other string methods before and after the acronym method.

That would defeat the purpose. The method should handle anything that can be expressed as rules internally and not cause extra work for the caller. I'm not saying it should ship with these rules, only that the behaviour should be configurable through user-passed rules. So I'm not suggesting a whole i18n/Intl implementation, whose capabilities Laravel will never have anyway, because that's of course not the aim of the framework.
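One possible shape for "user-passed rules" is a callback per word. This is only a sketch: the function name `initialism_with_rule` and the callback contract (return the initial, or `null` to skip the word) are hypothetical and were not proposed in this thread.

```php
<?php

// Hypothetical: $rule receives each word and returns its initial,
// or null to leave the word out of the result.
function initialism_with_rule(string $value, callable $rule, string $separator = ''): string
{
    $words = preg_split('/\s+/u', $value, -1, PREG_SPLIT_NO_EMPTY);

    $initials = [];
    foreach ($words as $word) {
        $initial = $rule($word);
        if ($initial !== null) {
            $initials[] = $initial;
        }
    }

    return implode($separator, $initials);
}

// Example rule: lowercase the initial of "of", uppercase everything else.
$lowercaseOf = fn (string $word): string => $word === 'of'
    ? 'o'
    : mb_strtoupper(mb_substr($word, 0, 1));

// initialism_with_rule('United States of America', $lowercaseOf) returns 'USoA'
```

A rule that returns `null` for "of" would yield "USA" instead, so both conventions discussed here fit the same signature.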

* @param string $separator
* @return string
*/
public static function acronym($string, $separator = '')
Contributor

Don't we usually call this $value?

Suggested change
- public static function acronym($string, $separator = '')
+ public static function acronym($value, $separator = '')

Comment on lines +775 to +788
public static function acronym($string, $separator = '')
{
    if (empty($string)) {
        return '';
    }

    $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';

    preg_match_all('/\b[a-zA-Z]/', $string, $matches);

    $acronym = implode($separator, $matches[0]);

    return static::upper($acronym);
}
Contributor

@shaedrich shaedrich Jul 20, 2025

This is how the method could look with my suggested changes btw

Suggested change
- public static function acronym($string, $separator = '')
- {
-     if (empty($string)) {
-         return '';
-     }
-     $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';
-     preg_match_all('/\b[a-zA-Z]/', $string, $matches);
-     $acronym = implode($separator, $matches[0]);
-     return static::upper($acronym);
- }
+ public static function initialism($string, $separator = '', $stopWords = [], bool $excludeStopWords = true, int $uppercaseLimit = -1, bool $normalize = true)
+ {
+     if (empty($string)) {
+         return '';
+     }
+     $separator = in_array($separator, ['.', '-', '_', '/', ' ']) ? $separator : '';
+     $words = mb_split('\s+', $string);
+     if (count($stopWords) && $excludeStopWords) {
+         $words = array_values(array_filter($words, fn (string $word) => !in_array($word, $stopWords)));
+     }
+     if ($normalize) {
+         $words = array_map(static::ascii(...), $words);
+     }
+     if (count($stopWords) && !$excludeStopWords) {
+         $initials = array_map(
+             fn (string $word) => in_array($word, $stopWords)
+                 ? static::lower(static::substr($word, 0, 1))
+                 : static::upper(static::substr($word, 0, 1)),
+             $words,
+         );
+     } else if ($uppercaseLimit !== -1 && count($words) > $uppercaseLimit) {
+         $initials = array_map(
+             fn (string $word, int $idx) => $idx === 0
+                 ? static::upper(static::substr($word, 0, 1))
+                 : static::lower(static::substr($word, 0, 1)),
+             $words,
+             array_keys($words),
+         );
+     } else {
+         $initials = array_map(
+             fn (string $word) => static::upper(static::substr($word, 0, 1)),
+             $words,
+         );
+     }
+     return implode($separator, $initials);
+ }

With the following cases:

    Str::initialism('United States of America'), // USOA
    Str::initialism('United States of America', '.'), // U.S.O.A
    Str::initialism('United States of America', stopWords: ['of']), // USA
    Str::initialism('United States of America', stopWords: ['of'], excludeStopWords: false), // USoA
    Str::initialism('United States of America', uppercaseLimit: 4), // USOA
    Str::initialism('United States of America', uppercaseLimit: 2), // Usoa
    Str::initialism('Änderung der Datenschutzerklärung', normalize: false), // ÄDD (notice the A umlaut here)
    Str::initialism('Änderung der Datenschutzerklärung', normalize: true), // ADD

@taylorotwell
Member

Thanks for your pull request to Laravel!

Unfortunately, I'm going to delay merging this code for now. To preserve our ability to adequately maintain the framework, we need to be very careful regarding the amount of code we include.

If applicable, please consider releasing your code as a package so that the community can still take advantage of your contributions!
