Last month a call for input was sent concerning the introduction of Unicode email addresses for WordPress accounts (#31992). Initial support was merged in [62482]. Here is what you need to know in order to test this change on your sites and in your plugins and themes.
- is_email() and sanitize_email() accept non-ASCII email addresses like grå@grå.org if the site database’s charset is utf8mb4.
- Support is added as an enhancement, which can be disabled by removing the wp_is_unicode_email and wp_sanitize_unicode_email callbacks from the is_email and sanitize_email filters, respectively.
- A new class, WP_Email_Address, provides a structural view into email addresses so your code doesn’t have to guess. It exposes the local part and the domain part, and it decodes Punycode in the domain part.
It should be possible, therefore, to create WordPress accounts with email addresses not previously allowed. In addition, email validation is updated to match the WHATWG email specification so that WordPress and an <input type=email> element will agree on what is and what isn’t allowable.
The term “Unicode email address” may be a bit ambiguous because there are two ways emails can be considered Unicode:
- Unicode domains have been supported for many years through Punycode encoding of the domain. This is an ASCII-encoded version of Unicode domains where the domain parts start with xn--, like xn--uist2j67d64zv30b.xn--ses554g as a stand-in for 慕田峪长城.网址. Because the encoding is all ASCII, WordPress has implicitly supported Unicode domains without recognizing them. The change in [62482] decodes the domain parts so that WordPress and its plugins and themes can access either the ASCII representation (for circumstances like HTML attributes, where software will read the value) or the Unicode representation (for circumstances like text nodes, where humans will read the value).
- Unicode local part (mailbox) support was largely absent from specifications and software until recently, when most major email hosts started routing mail with UTF-8 mailboxes. WordPress previously rejected all addresses containing non-ASCII characters. It now accepts valid UTF-8 local parts. There has never been an ASCII encoding of this part of the email address.
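To make the two cases concrete, here is a minimal sketch in plain PHP. It is not the WordPress implementation and does not validate addresses. It assumes the intl extension is available for idn_to_ascii() and idn_to_utf8(); the describe_email_parts() helper and its array keys are hypothetical names used only for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): split an address at its
 * last "@" and report both views of the domain plus whether the local
 * part contains non-ASCII bytes. Requires the intl extension.
 */
function describe_email_parts( string $address ): ?array {
	$at = strrpos( $address, '@' );

	// Bail when there is no "@", no local part, or no domain part.
	if ( false === $at || 0 === $at || strlen( $address ) - 1 === $at ) {
		return null;
	}

	$local  = substr( $address, 0, $at );
	$domain = substr( $address, $at + 1 );

	// Punycode (xn--) form for software, decoded form for people.
	$ascii_domain   = idn_to_ascii( $domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46 );
	$unicode_domain = idn_to_utf8( $domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46 );

	return array(
		'local'          => $local,
		// Without the "u" flag the pattern matches bytes, so any byte
		// above 0x7F marks the local part as non-ASCII.
		'local_is_ascii' => 1 !== preg_match( '/[^\x00-\x7F]/', $local ),
		// Fall back to the input when conversion fails.
		'ascii_domain'   => false === $ascii_domain ? $domain : $ascii_domain,
		'unicode_domain' => false === $unicode_domain ? $domain : $unicode_domain,
	);
}

$parts = describe_email_parts( 'grå@xn--uist2j67d64zv30b.xn--ses554g' );
// $parts['local_is_ascii'] is false: the local part has no ASCII encoding.
// $parts['unicode_domain'] holds the decoded domain, suitable for display.
```

The domain has two interchangeable representations while the local part has only one, which is why code that assumes ASCII-only addresses can still handle Punycode domains but not UTF-8 mailboxes.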
If your extension code expects email addresses to contain only ASCII bytes, it will need updating for WordPress’ new Unicode email support. The easiest way to account for this is to use the new WP_Email_Address::from_string() and then access its getter methods.
```php
// Generate an author link.
$email = WP_Email_Address::from_string( $provided_email );
if ( null === $email ) {
	return '';
}

$processor = new WP_HTML_Tag_Processor( '<a> </a>' );
$processor->next_tag();
// The href attribute is read by software, so use the ASCII form.
$processor->set_attribute( 'href', "mailto:{$email->get_ascii_address()}" );
// The link text is read by people, so use the Unicode form.
$processor->next_token();
$processor->set_modifiable_text( $email->get_unicode_address() );
return $processor->get_updated_html();
```
If your plugin connects with a third-party service using email addresses from WordPress, now is a good time to ensure that the third party also properly supports Unicode email addresses. If not, you can disable Unicode email support with the following snippet.
```php
// Disable Unicode email support until third-party integration supports them.
remove_filter( 'is_email', 'wp_is_unicode_email', 10 );
remove_filter( 'sanitize_email', 'wp_sanitize_unicode_email', 10 );
add_filter( 'is_email', 'wp_is_ascii_email', 10, 3 );
add_filter( 'sanitize_email', 'wp_sanitize_ascii_email', 10, 3 );
```
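When testing, it can help to confirm which behavior a site currently has. The following is a rough sketch: report_email_support() is a hypothetical helper, not a WordPress function, and the sample addresses are only examples. It takes the validator as a callback so the same code can be tried outside WordPress with a stand-in; inside WordPress you would pass 'is_email'.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): run sample addresses
 * through a validator callback and record which ones it accepts.
 */
function report_email_support( callable $validator, array $samples ): array {
	$report = array();
	foreach ( $samples as $sample ) {
		// is_email() returns the address on success and false on failure,
		// so anything other than false counts as accepted.
		$report[ $sample ] = false !== $validator( $sample );
	}
	return $report;
}

// Only call into WordPress when it is actually loaded.
if ( function_exists( 'is_email' ) ) {
	print_r( report_email_support( 'is_email', array( 'user@example.org', 'grå@grå.org' ) ) );
}
```

Running this once with Unicode support enabled and once with it disabled should show whether the non-ASCII sample changes from accepted to rejected on your site.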
Thank you!
This change updates existing email validation and sanitization code and introduces new behaviors for an unbounded set of potential email addresses. It’s likely that unanticipated cases will arise, and with your feedback in these cases, this feature can be a successful part of WordPress 7.1.
Props
Thanks to @amykamala and @jorbin for reviewing this post!
#call-for-testing, #email, #unicode