Commit bded4d3

Make gettext Unicode interface consistent and clean up the docs.
1 parent 6a9475f commit bded4d3

3 files changed

Lines changed: 89 additions & 112 deletions

Doc/library/gettext.rst

Lines changed: 79 additions & 92 deletions
@@ -66,8 +66,8 @@ class-based API instead.

 .. function:: lgettext(message)

-   Equivalent to :func:`gettext`, but the translation is returned in the preferred
-   system encoding, if no other encoding was explicitly set with
+   Equivalent to :func:`gettext`, but the translation is returned in the
+   preferred system encoding, if no other encoding was explicitly set with
    :func:`bind_textdomain_codeset`.


@@ -78,8 +78,8 @@ class-based API instead.

 .. function:: ldgettext(domain, message)

-   Equivalent to :func:`dgettext`, but the translation is returned in the preferred
-   system encoding, if no other encoding was explicitly set with
+   Equivalent to :func:`dgettext`, but the translation is returned in the
+   preferred system encoding, if no other encoding was explicitly set with
    :func:`bind_textdomain_codeset`.


@@ -99,8 +99,8 @@ class-based API instead.

 .. function:: lngettext(singular, plural, n)

-   Equivalent to :func:`ngettext`, but the translation is returned in the preferred
-   system encoding, if no other encoding was explicitly set with
+   Equivalent to :func:`ngettext`, but the translation is returned in the
+   preferred system encoding, if no other encoding was explicitly set with
    :func:`bind_textdomain_codeset`.


@@ -169,13 +169,14 @@ class can also install themselves in the built-in namespace as the function

 .. function:: translation(domain[, localedir[, languages[, class_[, fallback[, codeset]]]]])

-   Return a :class:`Translations` instance based on the *domain*, *localedir*, and
-   *languages*, which are first passed to :func:`find` to get a list of the
+   Return a :class:`Translations` instance based on the *domain*, *localedir*,
+   and *languages*, which are first passed to :func:`find` to get a list of the
    associated :file:`.mo` file paths. Instances with identical :file:`.mo` file
-   names are cached. The actual class instantiated is either *class_* if provided,
-   otherwise :class:`GNUTranslations`. The class's constructor must take a single
-   file object argument. If provided, *codeset* will change the charset used to
-   encode translated strings.
+   names are cached. The actual class instantiated is either *class_* if
+   provided, otherwise :class:`GNUTranslations`. The class's constructor must
+   take a single file object argument. If provided, *codeset* will change the
+   charset used to encode translated strings in the :meth:`lgettext` and
+   :meth:`lngettext` methods.

    If multiple files are found, later files are used as fallbacks for earlier ones.
    To allow setting the fallback, :func:`copy.copy` is used to clone each
@@ -187,7 +188,7 @@ class can also install themselves in the built-in namespace as the function
    :class:`NullTranslations` instance if *fallback* is true.


-.. function:: install(domain[, localedir [, codeset[, names]]]])
+.. function:: install(domain[, localedir[, codeset[, names]]]])

    This installs the function :func:`_` in Python's builtin namespace, based on
    *domain*, *localedir*, and *codeset* which are passed to the function
@@ -225,92 +226,92 @@ are the methods of :class:`NullTranslations`:
    :meth:`add_fallback`. It then calls ``self._parse(fp)`` if *fp* is not
    ``None``.

+   .. method:: _parse(fp)

-.. method:: NullTranslations._parse(fp)
-
-   No-op'd in the base class, this method takes file object *fp*, and reads the
-   data from the file, initializing its message catalog. If you have an
-   unsupported message catalog file format, you should override this method to
-   parse your format.
+      No-op'd in the base class, this method takes file object *fp*, and reads
+      the data from the file, initializing its message catalog. If you have an
+      unsupported message catalog file format, you should override this method
+      to parse your format.


-.. method:: NullTranslations.add_fallback(fallback)
+   .. method:: add_fallback(fallback)

-   Add *fallback* as the fallback object for the current translation object. A
-   translation object should consult the fallback if it cannot provide a
-   translation for a given message.
+      Add *fallback* as the fallback object for the current translation object.
+      A translation object should consult the fallback if it cannot provide a
+      translation for a given message.


-.. method:: NullTranslations.gettext(message)
+   .. method:: gettext(message)

-   If a fallback has been set, forward :meth:`gettext` to the fallback. Otherwise,
-   return the translated message. Overridden in derived classes.
+      If a fallback has been set, forward :meth:`gettext` to the fallback.
+      Otherwise, return the translated message. Overridden in derived classes.


-.. method:: NullTranslations.lgettext(message)
+   .. method:: lgettext(message)

-   If a fallback has been set, forward :meth:`lgettext` to the fallback. Otherwise,
-   return the translated message. Overridden in derived classes.
+      If a fallback has been set, forward :meth:`lgettext` to the fallback.
+      Otherwise, return the translated message. Overridden in derived classes.


-.. method:: NullTranslations.ngettext(singular, plural, n)
+   .. method:: ngettext(singular, plural, n)

-   If a fallback has been set, forward :meth:`ngettext` to the fallback. Otherwise,
-   return the translated message. Overridden in derived classes.
+      If a fallback has been set, forward :meth:`ngettext` to the fallback.
+      Otherwise, return the translated message. Overridden in derived classes.


-.. method:: NullTranslations.lngettext(singular, plural, n)
+   .. method:: lngettext(singular, plural, n)

-   If a fallback has been set, forward :meth:`ngettext` to the fallback. Otherwise,
-   return the translated message. Overridden in derived classes.
+      If a fallback has been set, forward :meth:`ngettext` to the fallback.
+      Otherwise, return the translated message. Overridden in derived classes.


-.. method:: NullTranslations.info()
+   .. method:: info()

-   Return the "protected" :attr:`_info` variable.
+      Return the "protected" :attr:`_info` variable.


-.. method:: NullTranslations.charset()
+   .. method:: charset()

-   Return the "protected" :attr:`_charset` variable.
+      Return the "protected" :attr:`_charset` variable, which is the encoding of
+      the message catalog file.


-.. method:: NullTranslations.output_charset()
+   .. method:: output_charset()

-   Return the "protected" :attr:`_output_charset` variable, which defines the
-   encoding used to return translated messages.
+      Return the "protected" :attr:`_output_charset` variable, which defines the
+      encoding used to return translated messages in :meth:`lgettext` and
+      :meth:`lngettext`.


-.. method:: NullTranslations.set_output_charset(charset)
+   .. method:: set_output_charset(charset)

-   Change the "protected" :attr:`_output_charset` variable, which defines the
-   encoding used to return translated messages.
+      Change the "protected" :attr:`_output_charset` variable, which defines the
+      encoding used to return translated messages.


-.. method:: NullTranslations.install([names])
+   .. method:: install([names])

-   this method installs :meth:`self.gettext` into the built-in namespace,
-   binding it to ``_``.
+      This method installs :meth:`self.gettext` into the built-in namespace,
+      binding it to ``_``.

-   If the *names* parameter is given, it must be a sequence containing
-   the names of functions you want to install in the builtin namespace
-   in addition to :func:`_`. Supported names are ``'gettext'`` (bound
-   to :meth:`self.gettext`), ``'ngettext'`` (bound to
-   :meth:`self.ngettext`), ``'lgettext'`` and ``'lngettext'``.
+      If the *names* parameter is given, it must be a sequence containing the
+      names of functions you want to install in the builtin namespace in
+      addition to :func:`_`. Supported names are ``'gettext'`` (bound to
+      :meth:`self.gettext`), ``'ngettext'`` (bound to :meth:`self.ngettext`),
+      ``'lgettext'`` and ``'lngettext'``.

-   Note that this is only one way, albeit the most convenient way, to
-   make the :func:`_` function available to your application. Because
-   it affects the entire application globally, and specifically the
-   built-in namespace, localized modules should never install
-   :func:`_`. Instead, they should use this code to make :func:`_`
-   available to their module::
+      Note that this is only one way, albeit the most convenient way, to make
+      the :func:`_` function available to your application. Because it affects
+      the entire application globally, and specifically the built-in namespace,
+      localized modules should never install :func:`_`. Instead, they should use
+      this code to make :func:`_` available to their module::

-      import gettext
-      t = gettext.translation('mymodule', ...)
-      _ = t.gettext
+         import gettext
+         t = gettext.translation('mymodule', ...)
+         _ = t.gettext

-   This puts :func:`_` only in the module's global namespace and so only
-   affects calls within this module.
+      This puts :func:`_` only in the module's global namespace and so only
+      affects calls within this module.


 The :class:`GNUTranslations` class
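The `_parse` and `add_fallback` entries in the hunk above describe an override pattern that a small example may make clearer. This is a minimal sketch, assuming a hypothetical one-pair-per-line `msgid=msgstr` text format encoded as UTF-8; `KeyValueTranslations` and the sample data are illustrative and not part of the gettext module.

```python
import io
from gettext import NullTranslations


class KeyValueTranslations(NullTranslations):
    """Hypothetical catalog format: one ``msgid=msgstr`` pair per line."""

    def __init__(self, fp=None):
        # Create the catalog first: the base __init__ calls _parse(fp).
        self._catalog = {}
        super().__init__(fp)

    def _parse(self, fp):
        # fp is a binary file object; this sketch assumes UTF-8 content.
        for raw in fp:
            line = raw.decode("utf-8").strip()
            if not line or line.startswith("#"):
                continue
            msgid, sep, msgstr = line.partition("=")
            if sep:
                self._catalog[msgid.strip()] = msgstr.strip()

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # The base class forwards to the fallback, or returns the message id.
        return super().gettext(message)


primary = KeyValueTranslations(io.BytesIO(b"hello=bonjour\n"))
secondary = KeyValueTranslations(io.BytesIO(b"goodbye=au revoir\n"))
primary.add_fallback(secondary)

print(primary.gettext("hello"))    # found in the primary catalog
print(primary.gettext("goodbye"))  # resolved through the fallback
print(primary.gettext("unknown"))  # no translation: message id comes back
```

Lookups that miss the primary catalog go to the fallback, and a message id that no catalog knows is returned unchanged, which matches the behaviour the documentation describes for the base class.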
@@ -329,7 +330,10 @@ key ``Content-Type`` is found, then the ``charset`` property is used to
 initialize the "protected" :attr:`_charset` instance variable, defaulting to
 ``None`` if not found. If the charset encoding is specified, then all message
 ids and message strings read from the catalog are converted to Unicode using
-this encoding.
+this encoding, else ASCII encoding is assumed.
+
+Since message ids are read as Unicode strings too, all :meth:`*gettext` methods
+will assume message ids as Unicode strings, not byte strings.

 The entire set of key/value pairs are placed into a dictionary and set as the
 "protected" :attr:`_info` instance variable.
@@ -344,25 +348,23 @@ The following methods are overridden from the base class implementation:
 .. method:: GNUTranslations.gettext(message)

    Look up the *message* id in the catalog and return the corresponding message
-   string, as a bytestring encoded with the catalog's charset encoding, if
-   known. If there is no entry in the catalog for the *message* id, and a fallback
-   has been set, the look up is forwarded to the fallback's :meth:`gettext` method.
-   Otherwise, the *message* id is returned.
+   string, as a Unicode string. If there is no entry in the catalog for the
+   *message* id, and a fallback has been set, the look up is forwarded to the
+   fallback's :meth:`gettext` method. Otherwise, the *message* id is returned.


 .. method:: GNUTranslations.lgettext(message)

-   Equivalent to :meth:`gettext`, but the translation is returned in the preferred
-   system encoding, if no other encoding was explicitly set with
-   :meth:`set_output_charset`.
+   Equivalent to :meth:`gettext`, but the translation is returned as a
+   bytestring encoded in the selected output charset, or in the preferred system
+   encoding if no encoding was explicitly set with :meth:`set_output_charset`.


 .. method:: GNUTranslations.ngettext(singular, plural, n)

    Do a plural-forms lookup of a message id. *singular* is used as the message id
    for purposes of lookup in the catalog, while *n* is used to determine which
-   plural form to use. The returned message string is a bytestring encoded with
-   the catalog's charset encoding, if known.
+   plural form to use. The returned message string is a Unicode string.

    If the message id is not found in the catalog, and a fallback is specified, the
    request is forwarded to the fallback's :meth:`ngettext` method. Otherwise, when
@@ -380,9 +382,9 @@ The following methods are overridden from the base class implementation:

 .. method:: GNUTranslations.lngettext(singular, plural, n)

-   Equivalent to :meth:`gettext`, but the translation is returned in the preferred
-   system encoding, if no other encoding was explicitly set with
-   :meth:`set_output_charset`.
+   Equivalent to :meth:`gettext`, but the translation is returned as a
+   bytestring encoded in the selected output charset, or in the preferred system
+   encoding if no encoding was explicitly set with :meth:`set_output_charset`.


 Solaris message catalog support
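The `GNUTranslations` hunks above draw a line between Unicode strings (from `gettext` and `ngettext`) and bytestrings (from the `l*` variants). That bytestring output can be approximated by encoding the Unicode result by hand. The sketch below is only an illustration: `to_output_bytes` is a hypothetical helper, and `NullTranslations` stands in for a real catalog so the snippet runs without any `.mo` file.

```python
import locale
from gettext import NullTranslations


def to_output_bytes(text, output_charset=None):
    """Encode a translated string, defaulting to the preferred encoding."""
    encoding = output_charset or locale.getpreferredencoding()
    return text.encode(encoding)


t = NullTranslations()
message = t.gettext("hello")             # Unicode string (the message id here)
plural = t.ngettext("file", "files", 3)  # Unicode string, chosen because n != 1

print(type(message).__name__, message)
print(plural)
print(to_output_bytes(message, "utf-8"))
```

Keeping the encoding step explicit and at the edge of the program means the rest of the code handles only Unicode strings, which is the direction this commit takes.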
@@ -609,21 +611,6 @@ marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support
 this through the use of command line switches.


-:func:`gettext` vs. :func:`lgettext`
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-In Python 2.4 the :func:`lgettext` family of functions were introduced. The
-intention of these functions is to provide an alternative which is more
-compliant with the current implementation of GNU gettext. Unlike
-:func:`gettext`, which returns strings encoded with the same codeset used in the
-translation file, :func:`lgettext` will return strings encoded with the
-preferred system encoding, as returned by :func:`locale.getpreferredencoding`.
-Also notice that Python 2.4 introduces new functions to explicitly choose the
-codeset used in translated strings. If a codeset is explicitly set, even
-:func:`lgettext` will return translated strings in the requested codeset, as
-would be expected in the GNU gettext implementation.
-
-
 Acknowledgements
 ----------------

Lib/gettext.py

Lines changed: 8 additions & 18 deletions
@@ -304,26 +304,16 @@ def _parse(self, fp):
             # cause no problems since us-ascii should always be a subset of
             # the charset encoding. We may want to fall back to 8-bit msgids
             # if the Unicode conversion fails.
+            charset = self._charset or 'ascii'
             if b'\x00' in msg:
                 # Plural forms
                 msgid1, msgid2 = msg.split(b'\x00')
                 tmsg = tmsg.split(b'\x00')
-                if self._charset:
-                    msgid1 = str(msgid1, self._charset)
-                    tmsg = [str(x, self._charset) for x in tmsg]
-                else:
-                    msgid1 = str(msgid1)
-                    tmsg = [str(x) for x in tmsg]
-                for i in range(len(tmsg)):
-                    catalog[(msgid1, i)] = tmsg[i]
+                msgid1 = str(msgid1, charset)
+                for i, x in enumerate(tmsg):
+                    catalog[(msgid1, i)] = str(x, charset)
             else:
-                if self._charset:
-                    msg = str(msg, self._charset)
-                    tmsg = str(tmsg, self._charset)
-                else:
-                    msg = str(msg)
-                    tmsg = str(tmsg)
-                catalog[msg] = tmsg
+                catalog[str(msg, charset)] = str(tmsg, charset)
             # advance to next entry in the seek tables
             masteridx += 8
             transidx += 8
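To see what the simplified branch in this hunk does in isolation, the following sketch lifts the decoding logic into a standalone function. `add_entry` and the sample byte strings are hypothetical; the real code runs inside `GNUTranslations._parse` while walking the `.mo` seek tables, which this example leaves out.

```python
def add_entry(catalog, msg, tmsg, charset=None):
    """Store one raw catalog entry in ``catalog`` as Unicode strings."""
    # Same default as the patched code: assume ASCII if no charset is known.
    charset = charset or "ascii"
    if b"\x00" in msg:
        # Plural forms: the key is (singular msgid, plural index).
        msgid1, msgid2 = msg.split(b"\x00")
        msgid1 = str(msgid1, charset)
        for i, x in enumerate(tmsg.split(b"\x00")):
            catalog[(msgid1, i)] = str(x, charset)
    else:
        catalog[str(msg, charset)] = str(tmsg, charset)


catalog = {}
add_entry(catalog, b"hello", "h\u00e9llo".encode("utf-8"), "utf-8")
add_entry(catalog, b"file\x00files", b"fichier\x00fichiers")
print(catalog)
```

One consequence of the ASCII default is that, in this sketch, non-ASCII bytes with no declared charset raise `UnicodeDecodeError` rather than being stored undecoded.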
@@ -359,7 +349,7 @@ def gettext(self, message):
         if tmsg is missing:
             if self._fallback:
                 return self._fallback.gettext(message)
-            return str(message)
+            return message
         return tmsg

     def ngettext(self, msgid1, msgid2, n):
@@ -369,9 +359,9 @@ def ngettext(self, msgid1, msgid2, n):
             if self._fallback:
                 return self._fallback.ngettext(msgid1, msgid2, n)
             if n == 1:
-                tmsg = str(msgid1)
+                tmsg = msgid1
             else:
-                tmsg = str(msgid2)
+                tmsg = msgid2
         return tmsg


Misc/NEWS

Lines changed: 2 additions & 2 deletions
@@ -47,8 +47,8 @@ Library
   code of every single module of the standard library, including invalid files
   used in the test suite.

-- All the u* variant functions and methods in gettext have been renamed to their
-  none u* siblings.
+- The gettext library now consistently uses Unicode strings for message ids
+  and message strings, and ``ugettext()`` and the like don't exist anymore.

 - The traceback module has been expanded to handle chained exceptions.
