Commit 5f3b63a

Improve pickle's documentation.
Use double-space for ending a sentence. Add dbpickle.py example. Improve description about persistent IDs.
1 parent 758bca6 commit 5f3b63a

2 files changed

Lines changed: 135 additions & 84 deletions

Doc/includes/dbpickle.py

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+# Simple example presenting how persistent ID can be used to pickle
+# external objects by reference.
+
+import pickle
+import sqlite3
+from collections import namedtuple
+
+# Simple class representing a record in our database.
+MemoRecord = namedtuple("MemoRecord", "key, task")
+
+class DBPickler(pickle.Pickler):
+
+    def persistent_id(self, obj):
+        # Instead of pickling MemoRecord as a regular class instance, we emit a
+        # persistent ID instead.
+        if isinstance(obj, MemoRecord):
+            # Here, our persistent ID is simply a tuple containing a tag and a
+            # key which refers to a specific record in the database.
+            return ("MemoRecord", obj.key)
+        else:
+            # If obj does not have a persistent ID, return None. This means obj
+            # needs to be pickled as usual.
+            return None
+
+
+class DBUnpickler(pickle.Unpickler):
+
+    def __init__(self, file, connection):
+        super().__init__(file)
+        self.connection = connection
+
+    def persistent_load(self, pid):
+        # This method is invoked whenever a persistent ID is encountered.
+        # Here, pid is the tuple returned by DBPickler.
+        cursor = self.connection.cursor()
+        type_tag, key_id = pid
+        if type_tag == "MemoRecord":
+            # Fetch the referenced record from the database and return it.
+            cursor.execute("SELECT * FROM memos WHERE key=?", (str(key_id),))
+            key, task = cursor.fetchone()
+            return MemoRecord(key, task)
+        else:
+            # Always raises an error if you cannot return the correct object.
+            # Otherwise, the unpickler will think None is the object referenced
+            # by the persistent ID.
+            raise pickle.UnpicklingError("unsupported persistent object")
+
+
+def main(verbose=True):
+    import io, pprint
+
+    # Initialize and populate our database.
+    conn = sqlite3.connect(":memory:")
+    cursor = conn.cursor()
+    cursor.execute("CREATE TABLE memos(key INTEGER PRIMARY KEY, task TEXT)")
+    tasks = (
+        'give food to fish',
+        'prepare group meeting',
+        'fight with a zebra',
+        )
+    for task in tasks:
+        cursor.execute("INSERT INTO memos VALUES(NULL, ?)", (task,))
+
+    # Fetch the records to be pickled.
+    cursor.execute("SELECT * FROM memos")
+    memos = [MemoRecord(key, task) for key, task in cursor]
+    # Save the records using our custom DBPickler.
+    file = io.BytesIO()
+    DBPickler(file).dump(memos)
+
+    if verbose:
+        print("Records to be pickled:")
+        pprint.pprint(memos)
+
+    # Update a record, just for good measure.
+    cursor.execute("UPDATE memos SET task='learn italian' WHERE key=1")
+
+    # Load the reports from the pickle data stream.
+    file.seek(0)
+    memos = DBUnpickler(file, conn).load()
+
+    if verbose:
+        print("Unpickled records:")
+        pprint.pprint(memos)
+
+
+if __name__ == '__main__':
+    main()
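
A condensed variant of the example above may help show what pickling "by reference" means in practice. The sketch below is not part of the commit: it swaps the SQLite table for a plain dictionary, and the names `RefPickler`, `DictUnpickler`, and `store` are hypothetical. It assumes a current Python 3 with the default binary protocol.

```python
import io
import pickle
from collections import namedtuple

MemoRecord = namedtuple("MemoRecord", "key, task")

# Hypothetical stand-in for the database: key -> task text.
store = {1: "give food to fish"}


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Emit a (tag, key) reference instead of the record itself.
        if isinstance(obj, MemoRecord):
            return ("MemoRecord", obj.key)
        return None


class DictUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        tag, key = pid
        if tag != "MemoRecord":
            raise pickle.UnpicklingError("unsupported persistent object")
        # Resolve the reference against the current contents of the store.
        return MemoRecord(key, store[key])


buf = io.BytesIO()
RefPickler(buf).dump([MemoRecord(1, store[1])])
data = buf.getvalue()

# The task text never reaches the stream; only the reference does.
task_in_stream = b"give food to fish" in data

# Change the store after pickling, then resolve the reference again.
store[1] = "learn italian"
restored = DictUnpickler(io.BytesIO(data)).load()

# A plain unpickler has no persistent_load() and refuses the stream.
try:
    pickle.loads(data)
    plain_load_failed = False
except pickle.UnpicklingError:
    plain_load_failed = True
```

Because the stream holds only the reference, the restored record reflects whatever the store contains at load time, which is the same effect the `UPDATE` statement demonstrates in `dbpickle.py`.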

Doc/library/pickle.rst

Lines changed: 47 additions & 84 deletions
@@ -27,7 +27,7 @@ Relationship to other Python modules
 ------------------------------------
 
 The :mod:`pickle` module has an transparent optimizer (:mod:`_pickle`) written
-in C. It is used whenever available. Otherwise the pure Python implementation is
+in C.  It is used whenever available.  Otherwise the pure Python implementation is
 used.
 
 Python has a more primitive serialization module called :mod:`marshal`, but in
@@ -108,7 +108,7 @@ There are currently 4 different protocols which can be used for pickling.
   efficient pickling of :term:`new-style class`\es.
 
 * Protocol version 3 was added in Python 3.0. It has explicit support for
-  bytes and cannot be unpickled by Python 2.x pickle modules. This is
+  bytes and cannot be unpickled by Python 2.x pickle modules.  This is
   the current recommended protocol, use it whenever it is possible.
 
 Refer to :pep:`307` for more information.
@@ -166,7 +166,7 @@ process more convenient:
    Python needed to read the pickle produced.
 
    The *file* argument must have a write() method that accepts a single bytes
-   argument. It can thus be a file object opened for binary writing, a
+   argument.  It can thus be a file object opened for binary writing, a
    io.BytesIO instance, or any other custom object that meets this interface.
 
 .. function:: dumps(obj[, protocol])
@@ -220,18 +220,21 @@ The :mod:`pickle` module defines three exceptions:
 
 .. exception:: PickleError
 
-   Common base class for the other pickling exceptions. It inherits
+   Common base class for the other pickling exceptions.  It inherits
    :exc:`Exception`.
 
 .. exception:: PicklingError
 
    Error raised when an unpicklable object is encountered by :class:`Pickler`.
    It inherits :exc:`PickleError`.
 
+   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
+   pickled.
+
 .. exception:: UnpicklingError
 
    Error raised when there a problem unpickling an object, such as a data
-   corruption or a security violation. It inherits :exc:`PickleError`.
+   corruption or a security violation.  It inherits :exc:`PickleError`.
 
    Note that other exceptions may also be raised during unpickling, including
    (but not necessarily limited to) AttributeError, EOFError, ImportError, and
@@ -254,7 +257,7 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
    Python needed to read the pickle produced.
 
    The *file* argument must have a write() method that accepts a single bytes
-   argument. It can thus be a file object opened for binary writing, a
+   argument.  It can thus be a file object opened for binary writing, a
    io.BytesIO instance, or any other custom object that meets this interface.
 
    .. method:: dump(obj)
@@ -276,8 +279,8 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
 
    .. method:: clear_memo()
 
-      Deprecated. Use the :meth:`clear` method on the :attr:`memo`. Clear the
-      pickler's memo, useful when reusing picklers.
+      Deprecated. Use the :meth:`clear` method on :attr:`memo`, instead.
+      Clear the pickler's memo, useful when reusing picklers.
 
    .. attribute:: fast
 
@@ -329,24 +332,28 @@ return the old value, not the modified one.
 
       Read a pickled object representation from the open file object given in
       the constructor, and return the reconstituted object hierarchy specified
-      therein. Bytes past the pickled object's representation are ignored.
+      therein.  Bytes past the pickled object's representation are ignored.
 
    .. method:: persistent_load(pid)
 
       Raise an :exc:`UnpickingError` by default.
 
       If defined, :meth:`persistent_load` should return the object specified by
-      the persistent ID *pid*. On errors, such as if an invalid persistent ID is
-      encountered, an :exc:`UnpickingError` should be raised.
+      the persistent ID *pid*. If an invalid persistent ID is encountered, an
+      :exc:`UnpickingError` should be raised.
 
       See :ref:`pickle-persistent` for details and examples of uses.
 
    .. method:: find_class(module, name)
 
-      Import *module* if necessary and return the object called *name* from it.
-      Subclasses may override this to gain control over what type of objects can
-      be loaded, potentially reducing security risks.
+      Import *module* if necessary and return the object called *name* from it,
+      where the *module* and *name* arguments are :class:`str` objects.
+
+      Subclasses may override this to gain control over what type of objects and
+      how they can be loaded, potentially reducing security risks.
+
 
+.. _pickle-picklable:
 
 What can be pickled and unpickled?
 ----------------------------------
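
The revised `find_class` text in the hunk above says subclasses may override the method to control what gets loaded. A minimal sketch of that idea follows. It is not part of the commit: `RestrictedUnpickler`, `SAFE_BUILTINS`, and `restricted_loads` are hypothetical names, the allow-list is only an illustration, and the code assumes a current Python 3.

```python
import builtins
import io
import pickle

# Illustrative allow-list; a real application would choose its own.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # module and name arrive as str objects, as the new text states.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse every other global instead of importing it.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Unpickle bytes while resolving only allow-listed globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

try:
    restricted_loads(pickle.dumps(eval))
    eval_rejected = False
except pickle.UnpicklingError:
    eval_rejected = True
```

Plain containers and numbers never reach `find_class`, so they load as usual. Only objects pickled by reference to a global name are checked against the allow-list.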
@@ -372,9 +379,9 @@ The following types can be pickled:
 
 Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
 exception; when this happens, an unspecified number of bytes may have already
-been written to the underlying file. Trying to pickle a highly recursive data
+been written to the underlying file.  Trying to pickle a highly recursive data
 structure may exceed the maximum recursion depth, a :exc:`RuntimeError` will be
-raised in this case. You can carefully raise this limit with
+raised in this case.  You can carefully raise this limit with
 :func:`sys.setrecursionlimit`.
 
 Note that functions (built-in and user-defined) are pickled by "fully qualified"
@@ -390,7 +397,7 @@ pickled, so in the following example the class attribute ``attr`` is not
 restored in the unpickling environment::
 
    class Foo:
-       attr = 'a class attr'
+       attr = 'A class attribute'
 
    picklestring = pickle.dumps(Foo)
 
@@ -571,79 +578,30 @@ Pickling and unpickling external objects
 
 For the benefit of object persistence, the :mod:`pickle` module supports the
 notion of a reference to an object outside the pickled data stream. Such
-objects are referenced by a "persistent id", which is just an arbitrary string
-of printable ASCII characters. The resolution of such names is not defined by
-the :mod:`pickle` module; it will delegate this resolution to user defined
-functions on the pickler and unpickler.
+objects are referenced by a persistent ID, which should be either a string of
+alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
+any newer protocol).
 
-To define external persistent id resolution, you need to set the
-:attr:`persistent_id` attribute of the pickler object and the
-:attr:`persistent_load` attribute of the unpickler object.
+The resolution of such persistent IDs is not defined by the :mod:`pickle`
+module; it will delegate this resolution to the user defined methods on the
+pickler and unpickler, :meth:`persistent_id` and :meth:`persistent_load`
+respectively.
 
 To pickle objects that have an external persistent id, the pickler must have a
-custom :func:`persistent_id` method that takes an object as an argument and
+custom :meth:`persistent_id` method that takes an object as an argument and
 returns either ``None`` or the persistent id for that object. When ``None`` is
-returned, the pickler simply pickles the object as normal. When a persistent id
-string is returned, the pickler will pickle that string, along with a marker so
-that the unpickler will recognize the string as a persistent id.
+returned, the pickler simply pickles the object as normal. When a persistent ID
+string is returned, the pickler will pickle that object, along with a marker so
+that the unpickler will recognize it as a persistent ID.
 
 To unpickle external objects, the unpickler must have a custom
-:func:`persistent_load` function that takes a persistent id string and returns
-the referenced object.
-
-Here's a silly example that *might* shed more light::
-
-   import pickle
-   from io import StringIO
-
-   src = StringIO()
-   p = pickle.Pickler(src)
-
-   def persistent_id(obj):
-       if hasattr(obj, 'x'):
-           return 'the value %d' % obj.x
-       else:
-           return None
-
-   p.persistent_id = persistent_id
+:meth:`persistent_load` method that takes a persistent ID object and returns the
+referenced object.
 
-   class Integer:
-       def __init__(self, x):
-           self.x = x
-       def __str__(self):
-           return 'My name is integer %d' % self.x
+Example:
 
-   i = Integer(7)
-   print(i)
-   p.dump(i)
-
-   datastream = src.getvalue()
-   print(repr(datastream))
-   dst = StringIO(datastream)
-
-   up = pickle.Unpickler(dst)
-
-   class FancyInteger(Integer):
-       def __str__(self):
-           return 'I am the integer %d' % self.x
-
-   def persistent_load(persid):
-       if persid.startswith('the value '):
-           value = int(persid.split()[2])
-           return FancyInteger(value)
-       else:
-           raise pickle.UnpicklingError('Invalid persistent id')
-
-   up.persistent_load = persistent_load
-
-   j = up.load()
-   print(j)
-
-
-.. BAW: pickle supports something called inst_persistent_id()
-   which appears to give unknown types a second shot at producing a persistent
-   id. Since Jim Fulton can't remember why it was added or what it's for, I'm
-   leaving it undocumented.
+.. highlightlang:: python
+.. literalinclude:: ../includes/dbpickle.py
 
 
 .. _pickle-sub:
@@ -808,5 +766,10 @@ the same process or a new process. ::
 
 .. [#] These methods can also be used to implement copying class instances.
 
-.. [#] This protocol is also used by the shallow and deep copying operations defined in
-   the :mod:`copy` module.
+.. [#] This protocol is also used by the shallow and deep copying operations
+   defined in the :mod:`copy` module.
+
+.. [#] The limitation on alphanumeric characters is due to the fact the
+   persistent IDs, in protocol 0, are delimited by the newline character.
+   Therefore if any kind of newline characters, such as \r and \n, occurs in
+   persistent IDs, the resulting pickle will become unreadable.
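
The new footnote says that protocol 0 delimits persistent IDs with a newline. The sketch below, which is not part of the commit, pickles one object with an alphanumeric ID under protocol 0 and under protocol 2 so the two streams can be compared. `Token`, `TokenPickler`, `EchoUnpickler`, and `roundtrip` are hypothetical names, and the code assumes a current Python 3.

```python
import io
import pickle


class Token:
    def __init__(self, name):
        self.name = name


class TokenPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Alphanumeric only, so the ID is safe for protocol 0.
        if isinstance(obj, Token):
            return "token" + obj.name
        return None


class EchoUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Return the ID itself so the caller can inspect what arrived.
        return pid


def roundtrip(obj, protocol):
    """Pickle obj with the given protocol and return (stream, loaded pid)."""
    buf = io.BytesIO()
    TokenPickler(buf, protocol).dump(obj)
    data = buf.getvalue()
    return data, EchoUnpickler(io.BytesIO(data)).load()


data0, pid0 = roundtrip(Token("A1"), 0)
data2, pid2 = roundtrip(Token("A1"), 2)
```

In the protocol 0 stream the ID appears as plain text between the `P` opcode and a newline, which is why an ID that itself contains a newline cannot be read back. Under protocol 2 the ID is pickled as an ordinary object, so that restriction does not apply.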
