[po] auto sync

github-actions[bot] · github-actions[bot] · commit 87584464ef7b · 2021-06-19T08:33:23.000Z
diff --git a/howto/unicode.po b/howto/unicode.po
@@ -681,26 +681,31 @@ msgid ""
 "notes.curiousefficiency.org/en/latest/python3/text_file_processing.html>`_, "
 "by Nick Coghlan."
 msgstr ""
+"`用 Python 3 处理文本文件 <http://python-"
+"notes.curiousefficiency.org/en/latest/python3/text_file_processing.html>`_ "
+"，作者 Nick Coghlan。"
 
 #: ../../howto/unicode.rst:520
 msgid ""
 "`Pragmatic Unicode <https://nedbatchelder.com/text/unipain.html>`_, a PyCon "
 "2012 presentation by Ned Batchelder."
 msgstr ""
+"`实用的 Unicode <https://nedbatchelder.com/text/unipain.html>`_，Ned Batchelder "
+"在 PyCon 2012 上的演示。"
 
 #: ../../howto/unicode.rst:522
 msgid ""
 "The :class:`str` type is described in the Python library reference at "
 ":ref:`textseq`."
-msgstr ""
+msgstr ":class:`str` 类型在 Python 库参考文档的 :ref:`textseq` 中有介绍。"
 
 #: ../../howto/unicode.rst:525
 msgid "The documentation for the :mod:`unicodedata` module."
-msgstr ""
+msgstr ":mod:`unicodedata` 模块的文档。"
 
 #: ../../howto/unicode.rst:527
 msgid "The documentation for the :mod:`codecs` module."
-msgstr ""
+msgstr ":mod:`codecs` 模块的文档。"
 
 #: ../../howto/unicode.rst:529
 msgid ""
@@ -710,17 +715,23 @@ msgid ""
 "Python 2's Unicode features (where the Unicode string type is called "
 "``unicode`` and literals start with ``u``)."
 msgstr ""
+"Marc-André Lemburg 在 EuroPython 2002 上做了一个题为 `“Python 和 Unicode”（PDF "
+"幻灯片） <https://downloads.egenix.com/python/Unicode-EPC2002-Talk.pdf>`_ "
+"的演示文稿。该幻灯片很好地概括了 Python 2 的 Unicode 功能设计（其中 Unicode 字符串类型称为 ``unicode``，字面量以 "
+"``u`` 开头）。"
 
 #: ../../howto/unicode.rst:537
 msgid "Reading and Writing Unicode Data"
-msgstr ""
+msgstr "Unicode 数据的读写"
 
 #: ../../howto/unicode.rst:539
 msgid ""
 "Once you've written some code that works with Unicode data, the next problem"
 " is input/output.  How do you get Unicode strings into your program, and how"
 " do you convert Unicode into a form suitable for storage or transmission?"
 msgstr ""
+"既然处理 Unicode 数据的代码写好了，下一个问题就是输入/输出了。如何将 Unicode 字符串读入程序，如何将 Unicode "
+"转换为适于存储或传输的形式呢？"
 
 #: ../../howto/unicode.rst:543
 msgid ""
@@ -730,6 +741,8 @@ msgid ""
 "Unicode data, for example.  Many relational databases also support Unicode-"
 "valued columns and can return Unicode values from an SQL query."
 msgstr ""
+"根据输入源和输出目标的不同，或许什么都不用干；请检查一下应用程序用到的库是否原生支持 Unicode。例如，XML 解析器往往会返回 Unicode "
+"数据。许多关系数据库的字段也支持 Unicode 值，并且 SQL 查询也能返回 Unicode 值。"
 
 #: ../../howto/unicode.rst:549
 msgid ""
@@ -739,6 +752,8 @@ msgid ""
 "bytes with ``bytes.decode(encoding)``.  However, the manual approach is not "
 "recommended."
 msgstr ""
+"在写入磁盘或通过套接字发送之前，Unicode 数据通常要转换为特定的编码。可以自己完成所有工作：打开一个文件，从中读取一个 8 位字节对象，然后用 "
+"``bytes.decode(encoding)`` 对字节串进行转换。但是，不推荐采用这种全人工的方案。"
 
 #: ../../howto/unicode.rst:554
 msgid ""
@@ -753,6 +768,10 @@ msgid ""
 "at least a moment you'd need to have both the encoded string and its Unicode"
 " version in memory.)"
 msgstr ""
+"编码的多字节特性就是一个难题；一个 Unicode 字符可以用几个字节表示。如果要以任意大小的块（例如 1024 或 4096 "
+"字节）读取文件，那么在块的末尾可能只读到某个 Unicode 字符的部分字节，这就需要编写错误处理代码。"
+"有一种解决方案是将整个文件读入内存，然后进行解码，但这样就没法处理很大的文件了；若要读取 2 GB 的文件，就需要 2 GB 的 "
+"RAM。（其实需要的内存会更多些，因为至少有一段时间需要在内存中同时存放已编码字符串及其 Unicode 版本。）"
 
 #: ../../howto/unicode.rst:564
 msgid ""
@@ -765,16 +784,20 @@ msgid ""
 "*encoding* and *errors* parameters which are interpreted just like those in "
 ":meth:`str.encode` and :meth:`bytes.decode`."
 msgstr ""
+"解决方案是利用底层解码接口去捕获编码序列不完整的情况。这部分代码已经是现成的：内置函数 :func:`open` "
+"可以返回一个类文件对象，该对象认为文件的内容采用指定的编码，:meth:`~io.TextIOBase.read` 和 "
+":meth:`~io.TextIOBase.write` 等方法接受 Unicode 参数。只要用 :func:`open` 的 *encoding* "
+"和 *errors* 参数即可，参数释义同 :meth:`str.encode` 和 :meth:`bytes.decode`。"
 
 #: ../../howto/unicode.rst:573
 msgid "Reading Unicode from a file is therefore simple::"
-msgstr ""
+msgstr "因此从文件读取 Unicode 就比较简单了："
 
 #: ../../howto/unicode.rst:579
 msgid ""
 "It's also possible to open files in update mode, allowing both reading and "
 "writing::"
-msgstr ""
+msgstr "也可以在更新模式下打开文件，以便同时读取和写入："
 
 #: ../../howto/unicode.rst:587
 msgid ""
@@ -788,6 +811,10 @@ msgid ""
 "endian encodings, that specify one particular byte ordering and don't skip "
 "the BOM."
 msgstr ""
+"Unicode 字符 ``U+FEFF`` 用作字节顺序标记（BOM），通常作为文件的第一个字符写入，以帮助自动检测文件的字节顺序。某些编码（例如 "
+"UTF-16）期望在文件开头出现 BOM；当采用这种编码时，BOM 将自动作为第一个字符写入，并在读取文件时被静默丢弃。这些编码有多种变体，例如用于 "
+"little-endian 和 big-endian 编码的 “utf-16-le” 和 “utf-16-be”，它们指定了一种特定的字节顺序，并且不会跳过 "
+"BOM。"
 
 #: ../../howto/unicode.rst:596
 msgid ""