Commit d039286

More improvements to pickle's documentation.
Add "Restricting Globals" section. Remove useless 'verbose' flag in the example dbpickle.py.
1 parent 62073e0 commit d039286

2 files changed

Lines changed: 92 additions & 45 deletions

Doc/includes/dbpickle.py

Lines changed: 6 additions & 8 deletions
@@ -46,7 +46,7 @@ def persistent_load(self, pid):
4646
raise pickle.UnpicklingError("unsupported persistent object")
4747

4848

49-
def main(verbose=True):
49+
def main():
5050
import io, pprint
5151

5252
# Initialize and populate our database.
@@ -68,20 +68,18 @@ def main(verbose=True):
6868
file = io.BytesIO()
6969
DBPickler(file).dump(memos)
7070

71-
if verbose:
72-
print("Records to be pickled:")
73-
pprint.pprint(memos)
71+
print("Pickled records:")
72+
pprint.pprint(memos)
7473

7574
# Update a record, just for good measure.
7675
cursor.execute("UPDATE memos SET task='learn italian' WHERE key=1")
7776

78-
# Load the reports from the pickle data stream.
77+
# Load the records from the pickle data stream.
7978
file.seek(0)
8079
memos = DBUnpickler(file, conn).load()
8180

82-
if verbose:
83-
print("Unpickled records:")
84-
pprint.pprint(memos)
81+
print("Unpickled records:")
82+
pprint.pprint(memos)
8583

8684

8785
if __name__ == '__main__':
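
The hunks above only show fragments of `dbpickle.py`. As a rough, self-contained sketch of the same persistent-ID technique, the following swaps the sqlite3 database for a plain dictionary. The names `RECORDS`, `Record`, `RefPickler`, `RefUnpickler`, and `roundtrip` are hypothetical and do not appear in the commit; only `persistent_id`, `persistent_load`, and the error message come from the `pickle` API and the diff.

```python
import io
import pickle

# Hypothetical in-memory "database"; the real dbpickle.py uses sqlite3.
RECORDS = {1: "learn italian", 2: "write docs"}


class Record:
    """Hypothetical handle that refers to an entry in RECORDS by key."""

    def __init__(self, key):
        self.key = key


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # For Record objects, emit a reference instead of pickling the object.
        if isinstance(obj, Record):
            return ("Record", obj.key)
        # Returning None tells pickle to serialize obj as usual.
        return None


class RefUnpickler(pickle.Unpickler):
    def __init__(self, file, records):
        super().__init__(file)
        self.records = records

    def persistent_load(self, pid):
        # Resolve the reference against the external store.
        tag, key = pid
        if tag == "Record":
            return self.records[key]
        raise pickle.UnpicklingError("unsupported persistent object")


def roundtrip(obj, records):
    """Pickle obj with RefPickler, then load it back with RefUnpickler."""
    buf = io.BytesIO()
    RefPickler(buf).dump(obj)
    buf.seek(0)
    return RefUnpickler(buf, records).load()
```

In this sketch the unpickled value reflects whatever the store holds at load time, which is the same effect the `UPDATE memos` statement demonstrates in the original example.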

Doc/library/pickle.rst

Lines changed: 86 additions & 37 deletions
@@ -111,15 +111,17 @@ There are currently 4 different protocols which can be used for pickling.
111111
bytes and cannot be unpickled by Python 2.x pickle modules. This is
112112
the currently recommended protocol; use it whenever possible.
113113

114-
Refer to :pep:`307` for more information.
114+
Refer to :pep:`307` for information about improvements brought by
115+
protocol 2. See :mod:`pickletools`'s source code for extensive
116+
comments about opcodes used by pickle protocols.
115117

116118
If a *protocol* is not specified, protocol 3 is used. If *protocol* is
117119
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
118120
protocol version available will be used.
119121

120122

121-
Usage
122-
-----
123+
Module Interface
124+
----------------
123125

124126
To serialize an object hierarchy, you first create a pickler, then you call the
125127
pickler's :meth:`dump` method. To de-serialize a data stream, you first create
@@ -347,10 +349,13 @@ return the old value, not the modified one.
347349
.. method:: find_class(module, name)
348350

349351
Import *module* if necessary and return the object called *name* from it,
350-
where the *module* and *name* arguments are :class:`str` objects.
352+
where the *module* and *name* arguments are :class:`str` objects. Note that,
353+
despite its name, :meth:`find_class` is also used for finding
354+
functions.
351355

352356
Subclasses may override this to gain control over what type of objects and
353-
how they can be loaded, potentially reducing security risks.
357+
how they can be loaded, potentially reducing security risks. Refer to
358+
:ref:`pickle-restrict` for details.
354359

355360

356361
.. _pickle-picklable:
@@ -424,7 +429,7 @@ protocol provides a standard way for you to define, customize, and control how
424429
your objects are serialized and de-serialized. The description in this section
425430
doesn't cover specific customizations that you can employ to make the unpickling
426431
environment slightly safer from untrusted pickle data streams; see section
427-
:ref:`pickle-sub` for more details.
432+
:ref:`pickle-restrict` for more details.
428433

429434

430435
.. _pickle-inst:
@@ -600,41 +605,85 @@ referenced object.
600605

601606
Example:
602607

608+
.. XXX Work around for some bug in sphinx/pygments.
603609
.. highlightlang:: python
604610
.. literalinclude:: ../includes/dbpickle.py
611+
.. highlightlang:: python3
605612

613+
.. _pickle-restrict:
606614

607-
.. _pickle-sub:
608-
609-
Subclassing Unpicklers
610-
----------------------
615+
Restricting Globals
616+
^^^^^^^^^^^^^^^^^^^
611617

612618
.. index::
613-
single: load_global() (pickle protocol)
614-
single: find_global() (pickle protocol)
615-
616-
By default, unpickling will import any class that it finds in the pickle data.
617-
You can control exactly what gets unpickled and what gets called by customizing
618-
your unpickler.
619-
620-
You need to derive a subclass from :class:`Unpickler`, overriding the
621-
:meth:`load_global` method. :meth:`load_global` should read two lines from the
622-
pickle data stream where the first line will the name of the module containing
623-
the class and the second line will be the name of the instance's class. It then
624-
looks up the class, possibly importing the module and digging out the attribute,
625-
then it appends what it finds to the unpickler's stack. Later on, this class
626-
will be assigned to the :attr:`__class__` attribute of an empty class, as a way
627-
of magically creating an instance without calling its class's
628-
:meth:`__init__`. Your job (should you choose to accept it), would be to have
629-
:meth:`load_global` push onto the unpickler's stack, a known safe version of any
630-
class you deem safe to unpickle. It is up to you to produce such a class. Or
631-
you could raise an error if you want to disallow all unpickling of instances.
632-
If this sounds like a hack, you're right. Refer to the source code to make this
633-
work.
634-
635-
The moral of the story is that you should be really careful about the source of
636-
the strings your application unpickles.
619+
single: find_class() (pickle protocol)
620+
621+
By default, unpickling will import any class or function that it finds in the
622+
pickle data. For many applications, this behaviour is unacceptable as it
623+
permits the unpickler to import and invoke arbitrary code. Just consider what
624+
this hand-crafted pickle data stream does when loaded::
625+
626+
>>> import pickle
627+
>>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
628+
hello world
629+
0
630+
631+
In this example, the unpickler imports the :func:`os.system` function and then
632+
applies it to the string argument "echo hello world". Although this example is
633+
inoffensive, it is not difficult to imagine one that could damage your system.
634+
635+
For this reason, you may want to control what gets unpickled by customizing
636+
:meth:`Unpickler.find_class`. Despite its name, :meth:`find_class` is
637+
called whenever a global (i.e., a class or a function) is requested. Thus it is
638+
possible to either forbid globals completely or restrict them to a safe subset.
639+
640+
Here is an example of an unpickler allowing only a few safe classes from the
641+
:mod:`builtins` module to be loaded::
642+
643+
import builtins
644+
import io
645+
import pickle
637646

647+
safe_builtins = {
648+
    'range',
649+
    'complex',
650+
    'set',
651+
    'frozenset',
652+
    'slice',
653+
}
654+
655+
class RestrictedUnpickler(pickle.Unpickler):
656+
    def find_class(self, module, name):
657+
        # Only allow safe classes from builtins.
658+
        if module == "builtins" and name in safe_builtins:
659+
            return getattr(builtins, name)
660+
        # Forbid everything else.
661+
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
662+
                                     (module, name))
663+
664+
def restricted_loads(s):
665+
    """Helper function analogous to pickle.loads()."""
666+
    return RestrictedUnpickler(io.BytesIO(s)).load()
667+
668+
A sample usage of our unpickler working as intended::
669+
670+
>>> restricted_loads(pickle.dumps([1, 2, range(15)]))
671+
[1, 2, range(0, 15)]
672+
>>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
673+
Traceback (most recent call last):
674+
...
675+
pickle.UnpicklingError: global 'os.system' is forbidden
676+
>>> restricted_loads(b'cbuiltins\neval\n'
677+
... b'(S\'getattr(__import__("os"), "system")'
678+
... b'("echo hello world")\'\ntR.')
679+
Traceback (most recent call last):
680+
...
681+
pickle.UnpicklingError: global 'builtins.eval' is forbidden
682+
683+
As our examples show, you have to be careful with what you allow to
684+
be unpickled. Therefore if security is a concern, you may want to consider
685+
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
686+
third-party solutions.
638687

639688
.. _pickle-example:
640689

@@ -769,7 +818,7 @@ the same process or a new process. ::
769818
.. [#] This protocol is also used by the shallow and deep copying operations
770819
defined in the :mod:`copy` module.
771820
772-
.. [#] The limitation on alphanumeric characters is due to the fact the
773-
persistent IDs, in protocol 0, are delimited by the newline character.
774-
Therefore if any kind of newline characters, such as \r and \n, occurs in
821+
.. [#] The limitation on alphanumeric characters is due to the fact that
822+
the persistent IDs, in protocol 0, are delimited by the newline
823+
character. Therefore if any kind of newline character occurs in
775824
persistent IDs, the resulting pickle will become unreadable.
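
The new "Restricting Globals" text notes that `find_class` is consulted for functions as well as classes. A minimal sketch of that point, assuming an allow-list keyed by `(module, name)` pairs rather than the builtins-only set used in the commit, might look like this. The names `ALLOWED`, `AllowListUnpickler`, and `allowlist_loads` are hypothetical and are not part of the commit or of the `pickle` module.

```python
import io
import math
import pickle

# Hypothetical allow-list of (module, name) pairs; note that it mixes
# classes (range, set) with a plain function (math.sqrt).
ALLOWED = {
    ("builtins", "range"),
    ("builtins", "set"),
    ("math", "sqrt"),
}


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the stream requests, class or function.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        # Forbid everything else, using the same message as the doc example.
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                     (module, name))


def allowlist_loads(data):
    """Helper analogous to pickle.loads(), restricted to ALLOWED globals."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

Delegating to `super().find_class` keeps the standard import and lookup behaviour for permitted names. Any entry added to such a list should be reviewed with the same care the documentation urges, since a single permissive global (such as `eval`) defeats the restriction.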