Commit 8c49454

Be more careful about extracting encoding from locale strings on Windows.
GetLocaleInfoEx() can fail on strings that setlocale() was perfectly happy with. A common way for that to happen is if the locale string is actually a Unix-style string, say "et_EE.UTF-8". In that case, what's after the dot is an encoding name, not a Windows codepage number; blindly treating it as a codepage number led to failure, with a fairly silly error message. Hence, check to see if what's after the dot is all digits, and if not, treat it as a literal encoding name rather than a codepage number. This will do the right thing with many Unix-style locale strings, and produce a more sensible error message otherwise.

Somewhat independently of that, treat a zero (CP_ACP) result from GetLocaleInfoEx() as meaning that we must use UTF-8 encoding.

Back-patch to all supported branches.

Juan José Santamaría Flecha

Discussion: https://postgr.es/m/[email protected]
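
As a rough illustration of the all-digits check described in the message, here is a minimal standalone sketch. The function name `encoding_from_locale_suffix` is hypothetical and does not exist in chklocale.c; the sketch only mirrors the fallback logic and uses nothing Windows-specific, so it should build with any C compiler.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code):
 * derive an encoding name from whatever follows the last dot in a locale
 * string.  Returns a malloc'd string the caller must free, or NULL if the
 * string has no dot or malloc fails.
 */
char *
encoding_from_locale_suffix(const char *locale)
{
	const char *suffix;
	size_t		ln;
	char	   *r;

	assert(locale != NULL);

	suffix = strrchr(locale, '.');
	if (suffix == NULL)
		return NULL;

	suffix++;					/* skip the dot itself */
	ln = strlen(suffix);
	r = malloc(ln + 3);			/* room for "CP", the suffix, and the NUL */
	if (r == NULL)
		return NULL;

	/*
	 * strspn() returns the length of the leading run of digits; if that
	 * equals the whole length, the suffix is all digits and is taken to be
	 * a Windows codepage number.  Anything else is passed through as a
	 * literal encoding name.  (An empty suffix also passes this test and
	 * yields just "CP".)
	 */
	if (strspn(suffix, "0123456789") == ln)
		sprintf(r, "CP%s", suffix);
	else
		strcpy(r, suffix);

	return r;
}
```

Under these assumptions, "English_United States.1252" maps to "CP1252", "et_EE.UTF-8" maps to "UTF-8", and a string with no dot such as "C" gives NULL. Whether the resulting name is then recognized is up to the caller's encoding table, which this sketch does not model.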
1 parent 24566b3 commit 8c49454

1 file changed

+24
-5
lines changed

src/port/chklocale.c

Lines changed: 24 additions & 5 deletions
@@ -239,25 +239,44 @@ win32_langinfo(const char *ctype)
 	{
 		r = malloc(16); /* excess */
 		if (r != NULL)
-			sprintf(r, "CP%u", cp);
+		{
+			/*
+			 * If the return value is CP_ACP that means no ANSI code page is
+			 * available, so only Unicode can be used for the locale.
+			 */
+			if (cp == CP_ACP)
+				strcpy(r, "utf8");
+			else
+				sprintf(r, "CP%u", cp);
+		}
 	}
 	else
 #endif
 	{
 		/*
-		 * Locale format on Win32 is <Language>_<Country>.<CodePage> . For
-		 * example, English_United States.1252.
+		 * Locale format on Win32 is <Language>_<Country>.<CodePage>. For
+		 * example, English_United States.1252. If we see digits after the
+		 * last dot, assume it's a codepage number. Otherwise, we might be
+		 * dealing with a Unix-style locale string; Windows' setlocale() will
+		 * take those even though GetLocaleInfoEx() won't, so we end up here.
+		 * In that case, just return what's after the last dot and hope we can
+		 * find it in our table.
 		 */
 		codepage = strrchr(ctype, '.');
 		if (codepage != NULL)
 		{
-			int ln;
+			size_t ln;
 
 			codepage++;
 			ln = strlen(codepage);
 			r = malloc(ln + 3);
 			if (r != NULL)
-				sprintf(r, "CP%s", codepage);
+			{
+				if (strspn(codepage, "0123456789") == ln)
+					sprintf(r, "CP%s", codepage);
+				else
+					strcpy(r, codepage);
+			}
 		}
 
 	}
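
The first branch of the hunk, which maps a codepage number returned by GetLocaleInfoEx() to an encoding name, can be sketched in isolation as follows. This is only an approximation under stated assumptions: the function name `encoding_from_codepage` is hypothetical, and `CP_ACP` is defined locally as 0 (its value in the Windows headers) so that the sketch also builds on non-Windows systems.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* On Windows this comes from the system headers; 0 is assumed elsewhere. */
#ifndef CP_ACP
#define CP_ACP 0
#endif

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code):
 * turn a codepage number into an encoding name.  Returns a malloc'd string
 * the caller must free, or NULL if malloc fails.
 */
char *
encoding_from_codepage(unsigned int cp)
{
	char	   *r = malloc(16);	/* excess: "CP" + up to 10 digits + NUL */

	if (r == NULL)
		return NULL;

	/*
	 * A zero result means no ANSI code page is available for the locale,
	 * so report UTF-8 instead of the meaningless name "CP0".
	 */
	if (cp == CP_ACP)
		strcpy(r, "utf8");
	else
		snprintf(r, 16, "CP%u", cp);

	return r;
}
```

With this sketch, a codepage of 1252 gives "CP1252" and a codepage of 0 gives "utf8". The real function obtains `cp` from GetLocaleInfoEx(), which is not reproduced here.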
