|
| 1 | +\chapter{Memory Management \label{memory}} |
| 2 | +\sectionauthor{Vladimir Marangozov}{ [email protected]} |
| 3 | + |
| 4 | + |
| 5 | +\section{Overview \label{memoryOverview}} |
| 6 | + |
| 7 | +Memory management in Python involves a private heap containing all |
| 8 | +Python objects and data structures. The management of this private |
| 9 | +heap is ensured internally by the \emph{Python memory manager}. The |
| 10 | +Python memory manager has different components which deal with various |
| 11 | +dynamic storage management aspects, like sharing, segmentation, |
| 12 | +preallocation or caching. |
| 13 | + |
| 14 | +At the lowest level, a raw memory allocator ensures that there is |
| 15 | +enough room in the private heap for storing all Python-related data |
| 16 | +by interacting with the memory manager of the operating system. On top |
| 17 | +of the raw memory allocator, several object-specific allocators |
| 18 | +operate on the same heap and implement distinct memory management |
| 19 | +policies adapted to the peculiarities of every object type. For |
| 20 | +example, integer objects are managed differently within the heap than |
| 21 | +strings, tuples or dictionaries because integers imply different |
| 22 | +storage requirements and speed/space tradeoffs. The Python memory |
| 23 | +manager thus delegates some of the work to the object-specific |
| 24 | +allocators, but ensures that the latter operate within the bounds of |
| 25 | +the private heap. |
| 26 | + |
| 27 | +It is important to understand that the management of the Python heap |
| 28 | +is performed by the interpreter itself and that the user has no |
| 29 | +control over it, even if she regularly manipulates object pointers to |
| 30 | +memory blocks inside that heap. The allocation of heap space for |
| 31 | +Python objects and other internal buffers is performed on demand by |
| 32 | +the Python memory manager through the Python/C API functions listed in |
| 33 | +this document. |
| 34 | + |
| 35 | +To avoid memory corruption, extension writers should never try to |
| 36 | +operate on Python objects with the functions exported by the C |
| 37 | +library: \cfunction{malloc()}\ttindex{malloc()}, |
| 38 | +\cfunction{calloc()}\ttindex{calloc()}, |
| 39 | +\cfunction{realloc()}\ttindex{realloc()} and |
| 40 | +\cfunction{free()}\ttindex{free()}. This will result in |
| 41 | +mixed calls between the C allocator and the Python memory manager |
| 42 | +with fatal consequences, because they implement different algorithms |
| 43 | +and operate on different heaps. However, one may safely allocate and |
| 44 | +release memory blocks with the C library allocator for individual |
| 45 | +purposes, as shown in the following example: |
| 46 | + |
| 47 | +\begin{verbatim} |
| 48 | + PyObject *res; |
| 49 | + char *buf = (char *) malloc(BUFSIZ); /* for I/O */ |
| 50 | + |
| 51 | + if (buf == NULL) |
| 52 | + return PyErr_NoMemory(); |
| 53 | + /* ...Do some I/O operation involving buf... */ |
| 54 | + res = PyString_FromString(buf); |
| 55 | + free(buf); /* malloc'ed */ |
| 56 | + return res; |
| 57 | +\end{verbatim} |
| 58 | + |
| 59 | +In this example, the memory request for the I/O buffer is handled by |
| 60 | +the C library allocator. The Python memory manager is involved only |
| 61 | +in the allocation of the string object returned as a result. |
| 62 | + |
| 63 | +In most situations, however, it is recommended to allocate memory from |
| 64 | +the Python heap specifically because the latter is under control of |
| 65 | +the Python memory manager. For example, this is required when the |
| 66 | +interpreter is extended with new object types written in C. Another |
| 67 | +reason for using the Python heap is the desire to \emph{inform} the |
| 68 | +Python memory manager about the memory needs of the extension module. |
| 69 | +Even when the requested memory is used exclusively for internal, |
| 70 | +highly-specific purposes, delegating all memory requests to the Python |
| 71 | +memory manager causes the interpreter to have a more accurate image of |
| 72 | +its memory footprint as a whole. Consequently, under certain |
| 73 | +circumstances, the Python memory manager may or may not trigger |
| 74 | +appropriate actions, like garbage collection, memory compaction or |
| 75 | +other preventive procedures. Note that by using the C library |
| 76 | +allocator as shown in the previous example, the allocated memory for |
| 77 | +the I/O buffer completely escapes the Python memory manager. |
| 78 | + |
| 79 | + |
| 80 | +\section{Memory Interface \label{memoryInterface}} |
| 81 | + |
| 82 | +The following function sets, modeled after the ANSI C standard, are |
| 83 | +available for allocating and releasing memory from the Python heap: |
| 84 | + |
| 85 | + |
| 86 | +\begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n} |
| 87 | + Allocates \var{n} bytes and returns a pointer of type \ctype{void*} |
| 88 | + to the allocated memory, or \NULL{} if the request fails. |
| 89 | + Requesting zero bytes returns a non-\NULL{} pointer. |
| 90 | + The memory will not have been initialized in any way. |
| 91 | +\end{cfuncdesc} |
| 92 | + |
| 93 | +\begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n} |
| 94 | + Resizes the memory block pointed to by \var{p} to \var{n} bytes. |
| 95 | + The contents will be unchanged up to the minimum of the old and the new |
| 96 | + sizes. If \var{p} is \NULL, the call is equivalent to |
| 97 | + \cfunction{PyMem_Malloc(\var{n})}; if \var{n} is equal to zero, the |
| 98 | + memory block is resized but is not freed, and the returned pointer |
| 99 | + is non-\NULL. Unless \var{p} is \NULL, it must have been |
| 100 | + returned by a previous call to \cfunction{PyMem_Malloc()} or |
| 101 | + \cfunction{PyMem_Realloc()}. |
| 102 | +\end{cfuncdesc} |
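The resize contract described above suggests a common calling pattern: keep the old pointer until the resize is known to have succeeded. The following is a minimal sketch of that pattern, not code from the Python sources. The \ctype{sketch_buf} type and \cfunction{sketch_buf_append()} are hypothetical names, and the C library's \cfunction{realloc()} stands in for \cfunction{PyMem_Realloc()} so that the fragment compiles without the Python headers. It assumes that a failed resize leaves the original block valid, as \cfunction{realloc()} guarantees.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; not part of the Python/C API. */
typedef struct {
    char  *data;   /* start of the block, or NULL while empty */
    size_t used;   /* bytes currently holding data */
    size_t cap;    /* bytes allocated */
} sketch_buf;

/* Append len bytes from src, growing the block when needed.
 * Returns 0 on success and -1 on failure; on failure buf is unchanged
 * and buf->data still points to the old, valid block. */
static int
sketch_buf_append(sketch_buf *buf, const char *src, size_t len)
{
    size_t need, newcap;
    char *tmp;

    assert(buf != NULL && src != NULL);
    if (len == 0)
        return 0;
    if (len > (size_t)-1 - buf->used)
        return -1;                      /* used + len would overflow */
    need = buf->used + len;

    if (need > buf->cap) {
        /* Double the capacity until the new data fits. */
        newcap = buf->cap ? buf->cap : 16;
        while (newcap < need) {
            if (newcap > (size_t)-1 / 2) {
                newcap = need;          /* doubling would overflow */
                break;
            }
            newcap *= 2;
        }
        /* Assign to a temporary first: overwriting buf->data with NULL
         * on failure would leak the old block.  In an extension module
         * this call would be PyMem_Realloc(). */
        tmp = (char *) realloc(buf->data, newcap);
        if (tmp == NULL)
            return -1;
        buf->data = tmp;
        buf->cap = newcap;
    }
    memcpy(buf->data + buf->used, src, len);
    buf->used = need;
    return 0;
}
```

Because the buffer starts with a \NULL{} data pointer, the first append relies on the resize call behaving like a plain allocation when \var{p} is \NULL, which matches the description above.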
| 103 | + |
| 104 | +\begin{cfuncdesc}{void}{PyMem_Free}{void *p} |
| 105 | + Frees the memory block pointed to by \var{p}, which must have been |
| 106 | + returned by a previous call to \cfunction{PyMem_Malloc()} or |
| 107 | + \cfunction{PyMem_Realloc()}. Otherwise, or if |
| 108 | + \cfunction{PyMem_Free(p)} has been called before, undefined |
| 109 | + behaviour occurs. If \var{p} is \NULL, no operation is performed. |
| 110 | +\end{cfuncdesc} |
| 111 | + |
| 112 | +The following type-oriented macros are provided for convenience. Note |
| 113 | +that \var{TYPE} refers to any C type. |
| 114 | + |
| 115 | +\begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n} |
| 116 | + Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} * |
| 117 | + sizeof(\var{TYPE}))} bytes of memory. Returns a pointer cast to |
| 118 | + \ctype{\var{TYPE}*}. The memory will not have been initialized in |
| 119 | + any way. |
| 120 | +\end{cfuncdesc} |
| 121 | + |
| 122 | +\begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n} |
| 123 | + Same as \cfunction{PyMem_Realloc()}, but the memory block is resized |
| 124 | + to \code{(\var{n} * sizeof(\var{TYPE}))} bytes. Returns a pointer |
| 125 | + cast to \ctype{\var{TYPE}*}. |
| 126 | +\end{cfuncdesc} |
| 127 | + |
| 128 | +\begin{cfuncdesc}{void}{PyMem_Del}{void *p} |
| 129 | + Same as \cfunction{PyMem_Free()}. |
| 130 | +\end{cfuncdesc} |
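To make the size arithmetic of the type-oriented macros concrete, here is a small sketch of how such a macro pair could be layered on a byte-oriented allocator. \cfunction{SKETCH_NEW()}, \cfunction{SKETCH_DEL()} and \cfunction{sketch_iota()} are hypothetical names, not part of the Python/C API, and \cfunction{malloc()} and \cfunction{free()} stand in for \cfunction{PyMem_Malloc()} and \cfunction{PyMem_Free()} so that the fragment compiles without the Python headers. The overflow guard is an assumption of this sketch; the description above does not say whether \cfunction{PyMem_New()} performs one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the type-oriented macros.  The request is
 * n * sizeof(TYPE) bytes; the guard rejects counts whose byte size
 * would not fit in a size_t. */
#define SKETCH_NEW(TYPE, n) \
    (((size_t)(n) > (size_t)-1 / sizeof(TYPE)) \
        ? NULL \
        : (TYPE *) malloc((size_t)(n) * sizeof(TYPE)))
#define SKETCH_DEL(p) free(p)

/* Return a newly allocated array holding 0, 1, ..., n-1, or NULL if the
 * allocation fails.  The caller releases it with SKETCH_DEL(). */
static long *
sketch_iota(size_t n)
{
    long *v;
    size_t i;

    assert(n > 0);
    v = SKETCH_NEW(long, n);
    if (v == NULL)
        return NULL;
    /* The block is uninitialized, so every element is written here. */
    for (i = 0; i < n; i++)
        v[i] = (long) i;
    return v;
}
```

The caller passes an element count rather than a byte count, and the result already has the element type, so no cast is needed at the call site.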
| 131 | + |
| 132 | +In addition, the following macro sets are provided for calling the |
| 133 | +Python memory allocator directly, without involving the C API functions |
| 134 | +listed above. However, note that their use does not preserve binary |
| 135 | +compatibility across Python versions and is therefore deprecated in |
| 136 | +extension modules. |
| 137 | + |
| 138 | +\cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}. |
| 139 | + |
| 140 | +\cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}. |
| 141 | + |
| 142 | + |
| 143 | +\section{Examples \label{memoryExamples}} |
| 144 | + |
| 145 | +Here is the example from section \ref{memoryOverview}, rewritten so |
| 146 | +that the I/O buffer is allocated from the Python heap by using the |
| 147 | +first function set: |
| 148 | + |
| 149 | +\begin{verbatim} |
| 150 | + PyObject *res; |
| 151 | + char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */ |
| 152 | + |
| 153 | + if (buf == NULL) |
| 154 | + return PyErr_NoMemory(); |
| 155 | + /* ...Do some I/O operation involving buf... */ |
| 156 | + res = PyString_FromString(buf); |
| 157 | + PyMem_Free(buf); /* allocated with PyMem_Malloc */ |
| 158 | + return res; |
| 159 | +\end{verbatim} |
| 160 | + |
| 161 | +The same code using the type-oriented function set: |
| 162 | + |
| 163 | +\begin{verbatim} |
| 164 | + PyObject *res; |
| 165 | + char *buf = PyMem_New(char, BUFSIZ); /* for I/O */ |
| 166 | + |
| 167 | + if (buf == NULL) |
| 168 | + return PyErr_NoMemory(); |
| 169 | + /* ...Do some I/O operation involving buf... */ |
| 170 | + res = PyString_FromString(buf); |
| 171 | + PyMem_Del(buf); /* allocated with PyMem_New */ |
| 172 | + return res; |
| 173 | +\end{verbatim} |
| 174 | + |
| 175 | +Note that in the two examples above, the buffer is always |
| 176 | +manipulated via functions belonging to the same set. Indeed, it |
| 177 | +is required to use the same memory API family for a given |
| 178 | +memory block, so that the risk of mixing different allocators is |
| 179 | +reduced to a minimum. The following code sequence contains two errors, |
| 180 | +one of which is labeled as \emph{fatal} because it mixes two different |
| 181 | +allocators operating on different heaps. |
| 182 | + |
| 183 | +\begin{verbatim} |
| 184 | +char *buf1 = PyMem_New(char, BUFSIZ); |
| 185 | +char *buf2 = (char *) malloc(BUFSIZ); |
| 186 | +char *buf3 = (char *) PyMem_Malloc(BUFSIZ); |
| 187 | +... |
| 188 | +PyMem_Del(buf3); /* Wrong -- should be PyMem_Free() */ |
| 189 | +free(buf2); /* Right -- allocated via malloc() */ |
| 190 | +free(buf1); /* Fatal -- should be PyMem_Del() */ |
| 191 | +\end{verbatim} |
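One way to catch mismatches like the ones above while debugging is to record, next to each block, which family allocated it, and to check that record on release. The sketch below is purely illustrative and is not how the Python memory manager works. All names in it (\ctype{sketch_family}, \ctype{sketch_header}, \cfunction{sketch_alloc()}, \cfunction{sketch_release()}) are hypothetical, and it is built on the C library allocator so that it compiles without the Python headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tags for two allocator families. */
enum sketch_family { SKETCH_RAW = 1, SKETCH_TYPED = 2 };

/* Header stored in front of every block.  The extra union members only
 * pad the header so that the user data after it stays suitably aligned. */
typedef union {
    enum sketch_family family;
    long double align_ld;
    long long   align_ll;
    void       *align_p;
} sketch_header;

/* Allocate n bytes on behalf of the given family.  Returns a pointer to
 * the user data, or NULL if the request fails. */
static void *
sketch_alloc(enum sketch_family family, size_t n)
{
    sketch_header *h;

    if (n > (size_t)-1 - sizeof(sketch_header))
        return NULL;                    /* total size would overflow */
    h = (sketch_header *) malloc(sizeof(sketch_header) + n);
    if (h == NULL)
        return NULL;
    h->family = family;
    return h + 1;                       /* user data follows the header */
}

/* Release p on behalf of the given family.  Returns 0 when the block was
 * freed (or p was NULL) and -1 when p belongs to another family, in
 * which case the block is left untouched. */
static int
sketch_release(enum sketch_family family, void *p)
{
    sketch_header *h;

    if (p == NULL)
        return 0;
    h = (sketch_header *) p - 1;
    if (h->family != family)
        return -1;                      /* mixed families: refuse */
    free(h);
    return 0;
}
```

A wrapper of this kind only detects mismatches between families that go through it; a block passed directly to \cfunction{free()} would still bypass the check.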
| 192 | + |
| 193 | +In addition to the functions aimed at handling raw memory blocks from |
| 194 | +the Python heap, objects in Python are allocated and released with |
| 195 | +\cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and |
| 196 | +\cfunction{PyObject_Del()}, or with their corresponding macros |
| 197 | +\cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and |
| 198 | +\cfunction{PyObject_DEL()}. |
| 199 | + |
| 200 | +These will be explained in the next chapter on defining and |
| 201 | +implementing new object types in C. |