…de Extensions, and fix ures_openDirect crash with NULL locale ID. Add test case for ures_openDirect with NULL locale ID.
huichen123
reviewed
Aug 10, 2021
daniel-ju
approved these changes
Aug 10, 2021
icu-patches/patches/019-ICU-Patch-ICU-21705_Fix_ures_openDirect_crash_with_NULL_localeID.patch
axelandrejs
approved these changes
Aug 11, 2021
Summary
This PR fixes two issues: the ICU4C test suite crash when the default locale has BCP47 Unicode extension tags, and the ures_openDirect crash with a NULL input locale ID.
It also adds a test case for ures_openDirect with a NULL input locale ID.
This change adds a new ICU-PATCH file for the changes, though hopefully this won't be needed and we can fix this issue upstream before ingesting a new version of ICU.
Detailed Description
ICU-21705 Calling ures_openDirect with NULL locale ID results in crash (undefined behavior)
Calling the public API ures_openDirect with a NULL locale ID crashes, even though the API docs permit this.
The issue is that strcmp is called with the input locale ID (i.e., a null pointer), which is undefined behavior and causes the crash.
This doesn't occur with ures_open, as that calls uloc_getBaseName() first, which gets the default locale via _canonicalize(), so the call to strcmp receives a locale ID rather than a null pointer.
We need to check the input locale ID and adjust it accordingly if it is null (default locale) or a pointer to an empty string (root locale).
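The null check described above can be sketched roughly as follows. This is an illustration only, not the code from the patch: normalizeLocaleId, defaultLocaleId, and isRootLocale are hypothetical names, and the hard-coded "en_US" stands in for whatever uloc_getDefault() would return in ICU.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the process default locale. In ICU this value
// would come from uloc_getDefault(); "en_US" is just a placeholder.
static const char* defaultLocaleId() { return "en_US"; }

// Hypothetical helper: map the caller-supplied ID to a non-null string
// before any string comparison touches it.
static const char* normalizeLocaleId(const char* localeID) {
    if (localeID == nullptr) {
        return defaultLocaleId();  // NULL means "use the default locale"
    }
    if (*localeID == '\0') {
        return "root";             // empty string means "root locale"
    }
    return localeID;
}

// Hypothetical caller: strcmp only ever sees the normalized pointer.
static bool isRootLocale(const char* localeID) {
    const char* id = normalizeLocaleId(localeID);
    assert(id != nullptr);  // invariant: never pass a null pointer to strcmp
    return std::strcmp(id, "root") == 0;
}
```

With this shape, isRootLocale(nullptr) compares the default locale against "root" instead of dereferencing a null pointer, and isRootLocale("") is treated as the root locale.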
ICU-21706 ICU4C test suite crashes if the default locale has any BCP47 Unicode extension tags on it (ex: "en-US-u-hc-12")
If ICU's default locale has any BCP47 Unicode extension tags on it (ex: "en-US-u-hc-12") then the ICU test suite will crash.
The problem is that the code attempts to lock resbMutex twice, which leads to the crash/termination.
The issue occurs on the first call to ures_open(). This causes the ICU data file to be loaded. However, the default locale isn't yet cached in gDefaultLocale, so Locale::getDefault needs to query the host OS for the default locale.
If the default locale has any BCP47 Unicode extension tags, this causes _canonicalize to attempt to convert them to the legacy ICU-style extensions by calling uloc_forLanguageTag. As part of this conversion, uloc_toLegacyKey will also attempt to load the ICU data file, in order to load the keyTypeData to map the extensions.
This is problematic, as it means that we're now trying to load the data file while already loading the data file. In other words, a second attempt is made to lock resbMutex while it is already held, leading to the crash.
We can avoid this by moving the query for the default locale outside of the mutex-protected part of the code, which breaks the circular dependency. We can query and store the ICU default locale, and then pass it to the findFirstExisting function inside the mutex-protected part of the code, so it doesn't need to query for the default locale itself.
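The reordering can be illustrated with a small standalone sketch. It does not use ICU: resbMutexSketch, ScopedResbLock, loadKeyTypeData, queryDefaultLocale, and openBundle are hypothetical names, and findFirstExisting here has a simplified signature that is assumed for illustration, not the real one. The point is only the ordering: the default locale is resolved before the non-recursive mutex is taken, then passed in.

```cpp
#include <cassert>
#include <mutex>
#include <string>

static std::mutex resbMutexSketch;                // stand-in for resbMutex
static thread_local bool heldByThisThread = false;

// RAII lock that asserts if the same thread tries to re-enter, which is
// the situation that would deadlock or crash a non-recursive mutex.
struct ScopedResbLock {
    ScopedResbLock() {
        assert(!heldByThisThread && "re-entrant lock of non-recursive mutex");
        resbMutexSketch.lock();
        heldByThisThread = true;
    }
    ~ScopedResbLock() {
        heldByThisThread = false;
        resbMutexSketch.unlock();
    }
    ScopedResbLock(const ScopedResbLock&) = delete;
    ScopedResbLock& operator=(const ScopedResbLock&) = delete;
};

// Hypothetical: mapping extension keys needs resource data, so it locks.
static void loadKeyTypeData() {
    ScopedResbLock lock;
    // ... load key/type mapping data under the lock ...
}

// Hypothetical: resolving the default locale may load data (and lock).
static std::string queryDefaultLocale() {
    loadKeyTypeData();
    return "en_US";  // placeholder value
}

// Simplified stand-in: runs with the mutex held and uses the default
// locale passed in by the caller instead of querying for it.
static std::string findFirstExisting(const std::string& requested,
                                     const std::string& defaultLocale) {
    return requested.empty() ? defaultLocale : requested;
}

std::string openBundle(const std::string& requested) {
    // Resolve the default locale BEFORE taking the lock, so any nested
    // data loading it triggers has finished and released the mutex.
    const std::string defaultLocale = queryDefaultLocale();
    ScopedResbLock lock;
    return findFirstExisting(requested, defaultLocale);
}
```

If queryDefaultLocale() were instead called from inside findFirstExisting, the nested ScopedResbLock in loadKeyTypeData would trip the assertion, which mirrors the double-lock described above.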