@@ -19,11 +19,11 @@ due to the many different aspects of the Unicode-Python integration.

 The latest version of this document is always available at:

-    http://starship.skyport.net/~lemburg/unicode-proposal.txt
+    http://starship.python.net/~lemburg/unicode-proposal.txt

 Older versions are available as:

-    http://starship.skyport.net/~lemburg/unicode-proposal-X.X.txt
+    http://starship.python.net/~lemburg/unicode-proposal-X.X.txt


 Conventions:
@@ -101,7 +101,7 @@ of the source file (e.g. '# source file encoding: latin-1'). If you
 only use 7-bit ASCII then everything is fine and no such notice is
 needed, but if you include Latin-1 characters not defined in ASCII, it
 may well be worthwhile including a hint since people in other
-countries will want to be able to read you source strings too.
+countries will want to be able to read your source strings too.


 Unicode Type Object:
@@ -169,7 +169,7 @@ during coercion of strings to Unicode should not be masked and passed
 through to the user.

 In containment tests ('a' in u'abc' and u'a' in 'abc') both sides
-should be coerced to Unicode before applying the test. Errors occuring
+should be coerced to Unicode before applying the test. Errors occurring
 during coercion (e.g. None in u'abc') should not be masked.
174174
175175
@@ -184,7 +184,7 @@ always coerce to the more precise format, i.e. Unicode objects.
     s + u := unicode(s) + u

 All string methods should delegate the call to an equivalent Unicode
-object method call by converting all envolved strings to Unicode and
+object method call by converting all involved strings to Unicode and
 then applying the arguments to the Unicode method of the same name,
 e.g.
190190
@@ -199,7 +199,7 @@ Formatting Markers.
 Exceptions:
 -----------

-UnicodeError is defined in the exceptions module as subclass of
+UnicodeError is defined in the exceptions module as a subclass of
 ValueError. It is available at the C level via PyExc_UnicodeError.
 All exceptions related to Unicode encoding/decoding should be
 subclasses of UnicodeError.
@@ -268,7 +268,7 @@ Python should provide a few standard codecs for the most relevant
 encodings, e.g.

   'utf-8': 8-bit variable length encoding
-  'utf-16': 16-bit variable length encoding (litte/big endian)
+  'utf-16': 16-bit variable length encoding (little/big endian)
   'utf-16-le': utf-16 but explicitly little endian
   'utf-16-be': utf-16 but explicitly big endian
   'ascii': 7-bit ASCII codepage
@@ -284,7 +284,7 @@ Note: 'utf-16' should be implemented by using and requiring byte order
 marks (BOM) for file input/output.

 All other encodings such as the CJK ones to support Asian scripts
-should be implemented in seperate packages which do not get included
+should be implemented in separate packages which do not get included
 in the core Python distribution and are not a part of this proposal.
289289
290290
@@ -324,14 +324,14 @@ class Codec:
     """
     def encode(self,input,errors='strict'):

-        """ Encodes the object intput and returns a tuple (output
+        """ Encodes the object input and returns a tuple (output
             object, length consumed).

             errors defines the error handling to apply. It defaults to
             'strict' handling.

             The method may not store state in the Codec instance. Use
-            SteamCodec for codecs which have to keep state in order to
+            StreamCodec for codecs which have to keep state in order to
             make encoding/decoding efficient.

         """
@@ -350,7 +350,7 @@ class Codec:
             'strict' handling.

             The method may not store state in the Codec instance. Use
-            SteamCodec for codecs which have to keep state in order to
+            StreamCodec for codecs which have to keep state in order to
             make encoding/decoding efficient.

         """
@@ -490,7 +490,7 @@ class StreamReader(Codec):
             the line breaking knowledge from the underlying stream's
             .readline() method -- there is currently no support for
             line breaking using the codec decoder due to lack of line
-            buffering. Sublcasses should however, if possible, try to
+            buffering. Subclasses should however, if possible, try to
             implement this method using their own knowledge of line
             breaking.
496496
@@ -527,7 +527,7 @@ class StreamReader(Codec):
         """ Resets the codec buffers used for keeping state.

             Note that no stream repositioning should take place.
-            This method is primarely intended to be able to recover
+            This method is primarily intended to be able to recover
             from decoding errors.

         """
@@ -553,7 +553,7 @@ interfaces, though.

 It is not required by the Unicode implementation to use these base
 classes, only the interfaces must match; this allows writing Codecs as
-extensions types.
+extension types.

 As guideline, large mapping tables should be implemented using static
 C data in separate (shared) extension modules. That way multiple
@@ -628,8 +628,8 @@ Private Code Point Areas:
 -------------------------

 Support for these is left to user land Codecs and not explicitly
-intergrated into the core. Note that due to the Internal Format being
-implemented, only the area between \uE000 and \uF8FF is useable for
+integrated into the core. Note that due to the Internal Format being
+implemented, only the area between \uE000 and \uF8FF is usable for
 private encodings.
634634
635635
@@ -649,14 +649,14 @@ provides access to about 64k characters and covers all characters in
 the Basic Multilingual Plane (BMP) of Unicode.

 It is the Codec's responsibility to ensure that the data they pass to
-the Unicode object constructor repects this assumption. The
+the Unicode object constructor respects this assumption. The
 constructor does not check the data for Unicode compliance or use of
 surrogates.

 Future implementations can extend the 32 bit restriction to the full
 set of all UTF-16 addressable characters (around 1M characters).

-The Unicode API should provide inteface routines from <PythonUnicode>
+The Unicode API should provide interface routines from <PythonUnicode>
 to the compiler's wchar_t which can be 16 or 32 bit depending on the
 compiler/libc/platform being used.
662662