Commit 758bca6

Improve pickle's documentation.
There is still much to be done, but I am committing my changes incrementally to avoid losing them again (for a third time now).
1 parent 87eee63 commit 758bca6

1 file changed

Lines changed: 144 additions & 96 deletions

Doc/library/pickle.rst

Original file line number | Diff line number | Diff line change
@@ -92,11 +92,9 @@ advantage that there are no restrictions imposed by external standards such as
9292
XDR (which can't represent pointer sharing); however it means that non-Python
9393
programs may not be able to reconstruct pickled Python objects.
9494

95-
By default, the :mod:`pickle` data format uses a printable ASCII representation.
96-
This is slightly more voluminous than a binary representation. The big
97-
advantage of using printable ASCII (and of some other characteristics of
98-
:mod:`pickle`'s representation) is that for debugging or recovery purposes it is
99-
possible for a human to read the pickled file with a standard text editor.
95+
By default, the :mod:`pickle` data format uses a compact binary representation.
96+
The module :mod:`pickletools` contains tools for analyzing data streams
97+
generated by :mod:`pickle`.
10098

10199
There are currently 4 different protocols which can be used for pickling.
102100

@@ -110,17 +108,15 @@ There are currently 4 different protocols which can be used for pickling.
110108
efficient pickling of :term:`new-style class`\es.
111109

112110
* Protocol version 3 was added in Python 3.0. It has explicit support for
113-
bytes and cannot be unpickled by Python 2.x pickle modules.
111+
bytes and cannot be unpickled by Python 2.x pickle modules. This is
112+
the currently recommended protocol; use it whenever possible.
114113

115114
Refer to :pep:`307` for more information.
116115

117-
If a *protocol* is not specified, protocol 3 is used. If *protocol* is
116+
If a *protocol* is not specified, protocol 3 is used. If *protocol* is
118117
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
119118
protocol version available will be used.
120119

121-
A binary format, which is slightly more efficient, can be chosen by specifying a
122-
*protocol* version >= 1.
123-
124120

125121
Usage
126122
-----
@@ -146,152 +142,210 @@ an unpickler, then you call the unpickler's :meth:`load` method. The
146142
as line terminators and therefore will look "funny" when viewed in Notepad or
147143
other editors which do not support this format.
148144

145+
.. data:: DEFAULT_PROTOCOL
146+
147+
The default protocol used for pickling. May be less than HIGHEST_PROTOCOL.
148+
Currently the default protocol is 3, a backward-incompatible protocol
149+
designed for Python 3.0.
150+
151+
149152
The :mod:`pickle` module provides the following functions to make the pickling
150153
process more convenient:
151154

152-
153155
.. function:: dump(obj, file[, protocol])
154156

155-
Write a pickled representation of *obj* to the open file object *file*. This is
156-
equivalent to ``Pickler(file, protocol).dump(obj)``.
157+
Write a pickled representation of *obj* to the open file object *file*. This
158+
is equivalent to ``Pickler(file, protocol).dump(obj)``.
157159

158-
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol* is
159-
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
160-
protocol version will be used.
160+
The optional *protocol* argument tells the pickler to use the given protocol;
161+
supported protocols are 0, 1, 2, and 3. The default protocol is 3, a
162+
backward-incompatible protocol designed for Python 3.0.
161163

162-
*file* must have a :meth:`write` method that accepts a single string argument.
163-
It can thus be a file object opened for writing, a :mod:`StringIO` object, or
164-
any other custom object that meets this interface.
164+
Specifying a negative protocol version selects the highest protocol version
165+
supported. The higher the protocol used, the more recent the version of
166+
Python needed to read the pickle produced.
165167

168+
The *file* argument must have a write() method that accepts a single bytes
169+
argument. It can thus be a file object opened for binary writing, an
170+
io.BytesIO instance, or any other custom object that meets this interface.
166171

167-
.. function:: load(file)
172+
.. function:: dumps(obj[, protocol])
168173

169-
Read a string from the open file object *file* and interpret it as a pickle data
170-
stream, reconstructing and returning the original object hierarchy. This is
171-
equivalent to ``Unpickler(file).load()``.
174+
Return the pickled representation of the object as a :class:`bytes`
175+
object, instead of writing it to a file.
172176

173-
*file* must have two methods, a :meth:`read` method that takes an integer
174-
argument, and a :meth:`readline` method that requires no arguments. Both
175-
methods should return a string. Thus *file* can be a file object opened for
176-
reading, a :mod:`StringIO` object, or any other custom object that meets this
177-
interface.
177+
The optional *protocol* argument tells the pickler to use the given protocol;
178+
supported protocols are 0, 1, 2, and 3. The default protocol is 3, a
179+
backward-incompatible protocol designed for Python 3.0.
178180

179-
This function automatically determines whether the data stream was written in
180-
binary mode or not.
181+
Specifying a negative protocol version selects the highest protocol version
182+
supported. The higher the protocol used, the more recent the version of
183+
Python needed to read the pickle produced.
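As an illustration of the *protocol* argument described above, the following sketch compares the default, a negative value, and :const:`HIGHEST_PROTOCOL`. The helper ``protocol_of`` and the sample ``record`` are hypothetical names introduced for this example only; the helper relies on the fact that pickles written with protocol 2 or later begin with the ``PROTO`` opcode (``0x80``) followed by the protocol number. The actual numbers depend on the running Python version, so none are hard-coded here.

```python
import pickle

record = {"name": "spam", "values": [1, 2, 3]}  # hypothetical sample data

default_bytes = pickle.dumps(record)                    # DEFAULT_PROTOCOL
highest_bytes = pickle.dumps(record, protocol=-1)       # negative -> highest
explicit_bytes = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)


def protocol_of(data):
    """Return the protocol number stored in a pickle, or None.

    Protocols 0 and 1 do not write a PROTO opcode, so None is returned
    for them.
    """
    return data[1] if data[:1] == b"\x80" else None


print("default:", protocol_of(default_bytes))
print("highest:", protocol_of(highest_bytes))
```

Whichever protocol is chosen when writing, no protocol argument is needed when reading the data back.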
181184

185+
.. function:: load(file, [\*, encoding="ASCII", errors="strict"])
182186

183-
.. function:: dumps(obj[, protocol])
187+
Read a pickled object representation from the open file object *file* and
188+
return the reconstituted object hierarchy specified therein. This is
189+
equivalent to ``Unpickler(file).load()``.
184190

185-
Return the pickled representation of the object as a :class:`bytes`
186-
object, instead of writing it to a file.
191+
The protocol version of the pickle is detected automatically, so no protocol
192+
argument is needed. Bytes past the pickled object's representation are
193+
ignored.
187194

188-
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol*
189-
is specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
190-
protocol version will be used.
195+
The argument *file* must have two methods, a read() method that takes an
196+
integer argument, and a readline() method that requires no arguments. Both
197+
methods should return bytes. Thus *file* can be a binary file object opened
198+
for reading, a BytesIO object, or any other custom object that meets this
199+
interface.
191200

201+
Optional keyword arguments are encoding and errors, which are used to decode
202+
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
203+
'strict', respectively.
192204

193-
.. function:: loads(bytes_object)
205+
.. function:: loads(bytes_object, [\*, encoding="ASCII", errors="strict"])
194206

195-
Read a pickled object hierarchy from a :class:`bytes` object.
196-
Bytes past the pickled object's representation are ignored.
207+
Read a pickled object hierarchy from a :class:`bytes` object and return the
208+
reconstituted object hierarchy specified therein.
197209

198-
The :mod:`pickle` module also defines three exceptions:
210+
The protocol version of the pickle is detected automatically, so no protocol
211+
argument is needed. Bytes past the pickled object's representation are
212+
ignored.
199213

214+
Optional keyword arguments are encoding and errors, which are used to decode
215+
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
216+
'strict', respectively.
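A minimal round-trip sketch of the four functions documented above. The sample object and the variable names are made up for illustration, and the ``legacy_pickle`` byte string is hand-written to resemble what Python 2 would emit for an 8-bit string at protocol 0; it is an assumption, not captured output.

```python
import io
import pickle

original = {"id": 7, "tags": ("a", "b"), "payload": b"\x00\x01"}

# dumps()/loads(): pickle to and from a bytes object.
data = pickle.dumps(original)
restored = pickle.loads(data)

# dump()/load(): any object with a bytes-oriented write()/read()/readline()
# works as *file*; io.BytesIO is the simplest in-memory choice.
buffer = io.BytesIO()
pickle.dump(original, buffer)
buffer.seek(0)
from_file = pickle.load(buffer)

# Bytes past the pickled object's representation are ignored.
with_trailing = pickle.loads(data + b"trailing bytes")

# Hand-written stand-in for a Python 2 pickle of the 8-bit string 'caf\xe9'.
# The default encoding="ASCII" cannot decode byte 0xe9, so an explicit
# encoding is passed.
legacy_pickle = b"S'caf\\xe9'\np0\n."
legacy = pickle.loads(legacy_pickle, encoding="latin-1")
```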
217+
218+
219+
The :mod:`pickle` module defines three exceptions:
200220

201221
.. exception:: PickleError
202222

203-
A common base class for the other exceptions defined below. This inherits from
223+
Common base class for the other pickling exceptions. It inherits
204224
:exc:`Exception`.
205225

206-
207226
.. exception:: PicklingError
208227

209-
This exception is raised when an unpicklable object is passed to the
210-
:meth:`dump` method.
211-
228+
Error raised when an unpicklable object is encountered by :class:`Pickler`.
229+
It inherits :exc:`PickleError`.
212230

213231
.. exception:: UnpicklingError
214232

215-
This exception is raised when there is a problem unpickling an object. Note that
216-
other exceptions may also be raised during unpickling, including (but not
217-
necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
218-
:exc:`ImportError`, and :exc:`IndexError`.
233+
Error raised when there is a problem unpickling an object, such as data
234+
corruption or a security violation. It inherits :exc:`PickleError`.
219235

220-
The :mod:`pickle` module also exports two callables, :class:`Pickler` and
221-
:class:`Unpickler`:
236+
Note that other exceptions may also be raised during unpickling, including
237+
(but not necessarily limited to) AttributeError, EOFError, ImportError, and
238+
IndexError.
222239

223240

224-
.. class:: Pickler(file[, protocol])
241+
The :mod:`pickle` module exports two classes, :class:`Pickler` and
242+
:class:`Unpickler`:
225243

226-
This takes a file-like object to which it will write a pickle data stream.
244+
.. class:: Pickler(file[, protocol])
227245

228-
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol* is
229-
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
230-
protocol version will be used.
246+
This takes a binary file for writing a pickle data stream.
231247

232-
*file* must have a :meth:`write` method that accepts a single string argument.
233-
It can thus be an open file object, a :mod:`StringIO` object, or any other
234-
custom object that meets this interface.
248+
The optional *protocol* argument tells the pickler to use the given protocol;
249+
supported protocols are 0, 1, 2, and 3. The default protocol is 3, a
250+
backward-incompatible protocol designed for Python 3.0.
235251

236-
:class:`Pickler` objects define one (or two) public methods:
252+
Specifying a negative protocol version selects the highest protocol version
253+
supported. The higher the protocol used, the more recent the version of
254+
Python needed to read the pickle produced.
237255

256+
The *file* argument must have a write() method that accepts a single bytes
257+
argument. It can thus be a file object opened for binary writing, an
258+
io.BytesIO instance, or any other custom object that meets this interface.
238259

239260
.. method:: dump(obj)
240261

241-
Write a pickled representation of *obj* to the open file object given in the
242-
constructor. Either the binary or ASCII format will be used, depending on the
243-
value of the *protocol* argument passed to the constructor.
262+
Write a pickled representation of *obj* to the open file object given in
263+
the constructor.
264+
265+
.. method:: persistent_id(obj)
244266

267+
Do nothing by default. This exists so a subclass can override it.
268+
269+
If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
270+
other value causes :class:`Pickler` to emit the returned value as a
271+
persistent ID for *obj*. The meaning of this persistent ID should be
272+
defined by :meth:`Unpickler.persistent_load`. Note that the value
273+
returned by :meth:`persistent_id` cannot itself have a persistent ID.
274+
275+
See :ref:`pickle-persistent` for details and examples of uses.
245276

246277
.. method:: clear_memo()
247278

248-
Clears the pickler's "memo". The memo is the data structure that remembers
249-
which objects the pickler has already seen, so that shared or recursive objects
250-
pickled by reference and not by value. This method is useful when re-using
251-
picklers.
279+
Deprecated. Use the :meth:`clear` method on the :attr:`memo`. Clear the
280+
pickler's memo, useful when reusing picklers.
281+
282+
.. attribute:: fast
283+
284+
Enable fast mode if set to a true value. The fast mode disables the usage
285+
of memo, therefore speeding the pickling process by not generating
286+
superfluous PUT opcodes. It should not be used with self-referential
287+
objects; doing otherwise will cause :class:`Pickler` to recurse
288+
infinitely.
289+
290+
Use :func:`pickletools.optimize` if you need more compact pickles.
291+
292+
.. attribute:: memo
293+
294+
Dictionary holding previously pickled objects to allow shared or
295+
recursive objects to be pickled by reference as opposed to by value.
252296

253297

254298
It is possible to make multiple calls to the :meth:`dump` method of the same
255299
:class:`Pickler` instance. These must then be matched to the same number of
256300
calls to the :meth:`load` method of the corresponding :class:`Unpickler`
257301
instance. If the same object is pickled by multiple :meth:`dump` calls, the
258-
:meth:`load` will all yield references to the same object. [#]_
302+
:meth:`load` calls will all yield references to the same object.
259303

260-
:class:`Unpickler` objects are defined as:
304+
Please note, this is intended for pickling multiple objects without intervening
305+
modifications to the objects or their parts. If you modify an object and then
306+
pickle it again using the same :class:`Pickler` instance, the object is not
307+
pickled again --- a reference to it is pickled and the :class:`Unpickler` will
308+
return the old value, not the modified one.
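The caveat above can be reproduced with a short sketch. The list ``shared`` and the other variable names are hypothetical; the point is only that the second :meth:`dump` call writes a memo reference rather than the modified contents.

```python
import io
import pickle

buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)

shared = [1, 2]
pickler.dump(shared)     # full representation is written, object is memoized
shared.append(3)         # modification between the two dump() calls
pickler.dump(shared)     # only a reference to the memoized object is written

buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
first = unpickler.load()
second = unpickler.load()

# Both loads yield the same object, holding the value from the first dump().
print(first, second is first)
```

Clearing the pickler's memo between the two calls, or using a fresh :class:`Pickler`, makes the second call write the object in full again.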
261309

262310

263-
.. class:: Unpickler(file)
311+
.. class:: Unpickler(file, [\*, encoding="ASCII", errors="strict"])
264312

265-
This takes a file-like object from which it will read a pickle data stream.
266-
This class automatically determines whether the data stream was written in
267-
binary mode or not, so it does not need a flag as in the :class:`Pickler`
268-
factory.
313+
This takes a binary file for reading a pickle data stream.
269314

270-
*file* must have two methods, a :meth:`read` method that takes an integer
271-
argument, and a :meth:`readline` method that requires no arguments. Both
272-
methods should return a string. Thus *file* can be a file object opened for
273-
reading, a :mod:`StringIO` object, or any other custom object that meets this
274-
interface.
315+
The protocol version of the pickle is detected automatically, so no
316+
protocol argument is needed.
275317

276-
:class:`Unpickler` objects have one (or two) public methods:
318+
The argument *file* must have two methods, a read() method that takes an
319+
integer argument, and a readline() method that requires no arguments. Both
320+
methods should return bytes. Thus *file* can be a binary file object opened
321+
for reading, a BytesIO object, or any other custom object that meets this
322+
interface.
277323

324+
Optional keyword arguments are encoding and errors, which are used to decode
325+
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
326+
'strict', respectively.
278327

279328
.. method:: load()
280329

281330
Read a pickled object representation from the open file object given in
282331
the constructor, and return the reconstituted object hierarchy specified
283-
therein.
332+
therein. Bytes past the pickled object's representation are ignored.
284333

285-
This method automatically determines whether the data stream was written
286-
in binary mode or not.
334+
.. method:: persistent_load(pid)
287335

336+
Raise an :exc:`UnpicklingError` by default.
288337

289-
.. method:: noload()
338+
If defined, :meth:`persistent_load` should return the object specified by
339+
the persistent ID *pid*. On errors, such as if an invalid persistent ID is
340+
encountered, an :exc:`UnpicklingError` should be raised.
290341

291-
This is just like :meth:`load` except that it doesn't actually create any
292-
objects. This is useful primarily for finding what's called "persistent
293-
ids" that may be referenced in a pickle data stream. See section
294-
:ref:`pickle-protocol` below for more details.
342+
See :ref:`pickle-persistent` for details and examples of uses.
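One way the :meth:`persistent_id` and :meth:`persistent_load` hooks might fit together is sketched below. Everything specific here is an assumption made for illustration: the ``Record`` class, the in-memory ``STORE`` standing in for an external database, the ``("Record", key)`` shape of the persistent ID, and the subclass names.

```python
import io
import pickle
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    value: int


# Hypothetical external storage; records live here, not in the pickle.
STORE = {"a": Record("a", 1), "b": Record("b", 2)}


class StorePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Returning a non-None value pickles that value as the persistent ID
        # instead of pickling obj itself.
        if isinstance(obj, Record):
            return ("Record", obj.key)
        return None  # everything else is pickled as usual


class StoreUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, key = pid
        if kind == "Record" and key in STORE:
            return STORE[key]
        raise pickle.UnpicklingError(f"unsupported persistent id: {pid!r}")


buffer = io.BytesIO()
StorePickler(buffer).dump([STORE["a"], STORE["b"], "plain"])
buffer.seek(0)
loaded = StoreUnpickler(buffer).load()
```

Because only the IDs travel in the data stream, the unpickled list refers to the very objects held in ``STORE`` rather than to copies.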
343+
344+
.. method:: find_class(module, name)
345+
346+
Import *module* if necessary and return the object called *name* from it.
347+
Subclasses may override this to gain control over what type of objects can
348+
be loaded, potentially reducing security risks.
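A hedged sketch of overriding :meth:`find_class` to restrict which globals may be loaded. The allow-list ``SAFE_BUILTINS``, the class name ``RestrictedUnpickler``, and the helper ``restricted_loads`` are choices made for this example, not part of the module, and such a filter reduces risk without making untrusted pickles safe.

```python
import builtins
import collections
import io
import pickle

# Hypothetical allow-list; adjust to what the application really needs.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only hand back a small set of harmless builtins.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse every other global the data stream asks for.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Analogous to pickle.loads(), but using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(3)]))

try:
    restricted_loads(pickle.dumps(collections.OrderedDict))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```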
295349

296350

297351
What can be pickled and unpickled?
@@ -506,6 +560,8 @@ The registered constructor is deemed a "safe constructor" for purposes of
506560
unpickling as described above.
507561

508562

563+
.. _pickle-persistent:
564+
509565
Pickling and unpickling external objects
510566
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
511567

@@ -747,14 +803,6 @@ the same process or a new process. ::
747803

748804
.. [#] Don't confuse this with the :mod:`marshal` module
749805
750-
.. [#] *Warning*: this is intended for pickling multiple objects without intervening
751-
modifications to the objects or their parts. If you modify an object and then
752-
pickle it again using the same :class:`Pickler` instance, the object is not
753-
pickled again --- a reference to it is pickled and the :class:`Unpickler` will
754-
return the old value, not the modified one. There are two problems here: (1)
755-
detecting changes, and (2) marshalling a minimal set of changes. Garbage
756-
Collection may also become a problem here.
757-
758806
.. [#] The exception raised will likely be an :exc:`ImportError` or an
759807
:exc:`AttributeError` but it could be something else.
760808
