Commit 976157f

Merged revisions 86981,86984 via svnmerge from
svn+ssh://[email protected]/python/branches/py3k

........
  r86981 | antoine.pitrou | 2010-12-03 19:41:39 +0100 (Fri, 03 Dec 2010) | 5 lines

  Issue #10478: Reentrant calls inside buffered IO objects (for example by
  way of a signal handler) now raise a RuntimeError instead of freezing the
  current process.
........
  r86984 | antoine.pitrou | 2010-12-03 20:14:17 +0100 (Fri, 03 Dec 2010) | 3 lines

  Add an "advanced topics" section to the io doc.
........
1 parent a818394 commit 976157f

4 files changed

Lines changed: 155 additions & 30 deletions

Doc/library/io.rst

Lines changed: 65 additions & 13 deletions
@@ -54,12 +54,6 @@ In-memory text streams are also available as :class:`StringIO` objects::
 The text stream API is described in detail in the documentation for the
 :class:`TextIOBase`.

-.. note::
-
-   Text I/O over a binary storage (such as a file) is significantly slower than
-   binary I/O over the same storage. This can become noticeable if you handle
-   huge amounts of text data (for example very large log files).
-

 Binary I/O
 ^^^^^^^^^^
@@ -506,8 +500,8 @@ Raw File I/O
 Buffered Streams
 ^^^^^^^^^^^^^^^^

-In many situations, buffered I/O streams will provide higher performance
-(bandwidth and latency) than raw I/O streams. Their API is also more usable.
+Buffered I/O streams provide a higher-level interface to an I/O device
+than raw I/O does.

 .. class:: BytesIO([initial_bytes])

@@ -766,14 +760,72 @@ Text I/O
       # .getvalue() will now raise an exception.
       output.close()

-   .. note::
-
-      :class:`StringIO` uses a native text storage and doesn't suffer from the
-      performance issues of other text streams, such as those based on
-      :class:`TextIOWrapper`.

 .. class:: IncrementalNewlineDecoder

    A helper codec that decodes newlines for universal newlines mode. It
    inherits :class:`codecs.IncrementalDecoder`.

+
+Advanced topics
+---------------
+
+Here we will discuss several advanced topics pertaining to the concrete
+I/O implementations described above.
+
+Performance
+^^^^^^^^^^^
+
+Binary I/O
+""""""""""
+
+By reading and writing only large chunks of data even when the user asks
+for a single byte, buffered I/O is designed to hide any inefficiency in
+calling and executing the operating system's unbuffered I/O routines. The
+gain varies considerably depending on the OS and the kind of I/O which is
+performed (for example, on some contemporary OSes such as Linux, unbuffered
+disk I/O can be as fast as buffered I/O). The bottom line, however, is
+that buffered I/O will offer you predictable performance regardless of the
+platform and the backing device. Therefore, it is almost always preferable to
+use buffered I/O rather than unbuffered I/O.
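
As a rough illustration of the chunking described in the paragraph above, the following sketch counts how often a raw stream is asked for data when it is read one byte at a time, first directly and then through a `BufferedReader`. `CountingRaw` is a hypothetical helper written for this example (it is not part of the `io` module), and the in-memory payload merely stands in for a real device.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that counts how often it is asked for data."""

    def __init__(self, payload):
        super().__init__()
        self._source = io.BytesIO(payload)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Every raw read request ends up here, so this is where we count.
        self.calls += 1
        return self._source.readinto(b)


payload = bytes(10_000)

# Unbuffered: each one-byte read() is forwarded to the raw stream.
raw = CountingRaw(payload)
while raw.read(1):
    pass
unbuffered_calls = raw.calls

# Buffered: one-byte reads are served from a 4096-byte internal buffer.
counted = CountingRaw(payload)
buffered = io.BufferedReader(counted, buffer_size=4096)
while buffered.read(1):
    pass
buffered_calls = counted.calls

print("raw reads without buffering:", unbuffered_calls)
print("raw reads with buffering:   ", buffered_calls)
```

The unbuffered loop should issue at least one raw read per byte, while the buffered loop should need only a handful; the exact counts may differ between implementations.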
+
+Text I/O
+""""""""
+
+Text I/O over a binary storage (such as a file) is significantly slower than
+binary I/O over the same storage, because it requires conversions between
+unicode and binary data using a character codec. This can become noticeable
+if you handle huge amounts of text data (for example very large log files).
+
+:class:`StringIO`, however, is a native in-memory unicode container and will
+exhibit similar speed to :class:`BytesIO`.
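
The codec step mentioned above can be made visible with a small sketch. It writes the same text through a `TextIOWrapper` layered over a `BytesIO` (so every write is encoded) and into a `StringIO` (which stores the text as is). The sample string and variable names are chosen for illustration only, and no timings are claimed here.

```python
import io

text = "naïve café\n" * 3

# Text layer over binary storage: write() has to encode str to bytes.
binary_store = io.BytesIO()
wrapper = io.TextIOWrapper(binary_store, encoding="utf-8", newline="\n")
wrapper.write(text)
wrapper.flush()  # push the encoded bytes down into the BytesIO
encoded = binary_store.getvalue()

# Native in-memory text container: no codec is involved.
memory_text = io.StringIO()
memory_text.write(text)

print(len(text), "characters were encoded to", len(encoded), "bytes")
print(memory_text.getvalue() == text)
```

To compare the cost on a given machine, both write paths can be measured with the standard `timeit` module; results will depend on the codec and the data.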
+
+Multi-threading
+^^^^^^^^^^^^^^^
+
+:class:`FileIO` objects are thread-safe to the extent that the operating
+system calls (such as ``read(2)`` under Unix) they are wrapping are thread-safe
+too.
+
+Binary buffered objects (instances of :class:`BufferedReader`,
+:class:`BufferedWriter`, :class:`BufferedRandom` and :class:`BufferedRWPair`)
+protect their internal structures using a lock; it is therefore safe to call
+them from multiple threads at once.
+
+:class:`TextIOWrapper` objects are not thread-safe.
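
A minimal sketch of the second point, assuming CPython's C implementation of `io`: several threads write to one shared `BufferedWriter`, and no bytes are lost because each `write()` call runs under the object's internal lock. The worker function, the record size and the thread count are arbitrary choices for this example.

```python
import io
import threading

sink = io.BytesIO()
# A small buffer forces frequent flushes to the underlying stream.
writer = io.BufferedWriter(sink, buffer_size=64)


def worker(tag, count):
    """Write `count` records of 16 identical bytes to the shared writer."""
    record = tag * 16
    for _ in range(count):
        writer.write(record)


threads = [
    threading.Thread(target=worker, args=(bytes([65 + i]), 500))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

writer.flush()
data = sink.getvalue()
print(len(data))
```

The same pattern should not be assumed to be safe for a shared `TextIOWrapper`; there the caller would have to provide its own locking.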
+
+Reentrancy
+^^^^^^^^^^
+
+Binary buffered objects (instances of :class:`BufferedReader`,
+:class:`BufferedWriter`, :class:`BufferedRandom` and :class:`BufferedRWPair`)
+are not reentrant. While reentrant calls will not happen in normal situations,
+they can arise if you are doing I/O in a :mod:`signal` handler. If a thread
+tries to re-enter a buffered object which it is already accessing, then a
+:exc:`RuntimeError` is raised.
+
+The above implicitly extends to text files, since the :func:`open()`
+function will wrap a buffered object inside a :class:`TextIOWrapper`. This
+includes standard streams and therefore affects the built-in function
+:func:`print()` as well.
+
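
Signal delivery is timing dependent, so the sketch below triggers the same condition deterministically instead: a raw stream calls back into the buffered object that is currently flushing to it. `ReentrantRaw` is a hypothetical class written for this example, and the behaviour shown assumes CPython's C implementation of `io` (the commit disables the corresponding tests for `_pyio`).

```python
import io


class ReentrantRaw(io.RawIOBase):
    """Hypothetical raw stream that re-enters the buffered writer above it."""

    def __init__(self):
        super().__init__()
        self.buffered = None  # set once the BufferedWriter exists
        self.error = None

    def writable(self):
        return True

    def write(self, b):
        if self.buffered is not None and self.error is None:
            try:
                # We are being called from inside buffered.flush(), in the
                # same thread, so this is a reentrant call.
                self.buffered.write(b"x")
            except RuntimeError as exc:
                self.error = exc
        return len(b)


raw = ReentrantRaw()
buffered = io.BufferedWriter(raw)
raw.buffered = buffered

buffered.write(b"abc")  # small write, stays in the buffer
buffered.flush()        # calls raw.write(), which re-enters `buffered`

print(type(raw.error).__name__)
```

In a real signal handler the usual advice follows from this: avoid doing buffered I/O on a stream that the interrupted code may also be using.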

Lib/test/test_io.py

Lines changed: 35 additions & 0 deletions
@@ -2560,12 +2560,47 @@ def test_interrupted_write_buffered(self):
     def test_interrupted_write_text(self):
         self.check_interrupted_write("xy", b"xy", mode="w", encoding="ascii")

+    def check_reentrant_write(self, data, **fdopen_kwargs):
+        def on_alarm(*args):
+            # Will be called reentrantly from the same thread
+            wio.write(data)
+            1/0
+        signal.signal(signal.SIGALRM, on_alarm)
+        r, w = os.pipe()
+        wio = self.io.open(w, **fdopen_kwargs)
+        try:
+            signal.alarm(1)
+            # Either the reentrant call to wio.write() fails with RuntimeError,
+            # or the signal handler raises ZeroDivisionError.
+            with self.assertRaises((ZeroDivisionError, RuntimeError)) as cm:
+                while 1:
+                    for i in range(100):
+                        wio.write(data)
+                        wio.flush()
+                    # Make sure the buffer doesn't fill up and block further writes
+                    os.read(r, len(data) * 100)
+        finally:
+            wio.close()
+            os.close(r)
+
+    def test_reentrant_write_buffered(self):
+        self.check_reentrant_write(b"xy", mode="wb")
+
+    def test_reentrant_write_text(self):
+        self.check_reentrant_write("xy", mode="w", encoding="ascii")
+
+
 class CSignalsTest(SignalsTest):
     io = io

 class PySignalsTest(SignalsTest):
     io = pyio

+    # Handling reentrancy issues would slow down _pyio even more, so the
+    # tests are disabled.
+    test_reentrant_write_buffered = None
+    test_reentrant_write_text = None
+

 def test_main():
     tests = (CIOTest, PyIOTest,

Misc/NEWS

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,10 @@ Core and Builtins
 Library
 -------

+- Issue #10478: Reentrant calls inside buffered IO objects (for example by
+  way of a signal handler) now raise a RuntimeError instead of freezing the
+  current process.
+
 - Issue #10464: netrc now correctly handles lines with embedded '#' characters.


Modules/_io/bufferedio.c

Lines changed: 51 additions & 17 deletions
@@ -224,6 +224,7 @@ typedef struct {

 #ifdef WITH_THREAD
     PyThread_type_lock lock;
+    volatile long owner;
 #endif

     Py_ssize_t buffer_size;
@@ -259,15 +260,33 @@ typedef struct {
 /* These macros protect the buffered object against concurrent operations. */

 #ifdef WITH_THREAD
-#define ENTER_BUFFERED(self) \
-    Py_BEGIN_ALLOW_THREADS \
-    PyThread_acquire_lock(self->lock, 1); \
+static int
+_enter_buffered_busy(buffered *self)
+{
+    if (self->owner == PyThread_get_thread_ident()) {
+        PyErr_Format(PyExc_RuntimeError,
+                     "reentrant call inside %R", self);
+        return 0;
+    }
+    Py_BEGIN_ALLOW_THREADS
+    PyThread_acquire_lock(self->lock, 1);
     Py_END_ALLOW_THREADS
+    return 1;
+}
+
+#define ENTER_BUFFERED(self) \
+    ( (PyThread_acquire_lock(self->lock, 0) ? \
+       1 : _enter_buffered_busy(self)) \
+     && (self->owner = PyThread_get_thread_ident(), 1) )

 #define LEAVE_BUFFERED(self) \
-    PyThread_release_lock(self->lock);
+    do { \
+        self->owner = 0; \
+        PyThread_release_lock(self->lock); \
+    } while(0);
+
 #else
-#define ENTER_BUFFERED(self)
+#define ENTER_BUFFERED(self) 1
 #define LEAVE_BUFFERED(self)
 #endif
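
For readers less familiar with the C macros, the control flow of the new `ENTER_BUFFERED` / `LEAVE_BUFFERED` pair can be approximated in Python. This is only a sketch of the idea (try a non-blocking acquire first, and on failure check whether the current thread is already the owner); `OwnerTrackingLock` is a hypothetical name, and the class is not how `bufferedio.c` is implemented.

```python
import threading


class OwnerTrackingLock:
    """Hypothetical Python analogue of ENTER_BUFFERED / LEAVE_BUFFERED."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None  # ident of the thread currently inside, if any

    def enter(self):
        # Fast path: an uncontended, non-blocking acquire.
        if not self._lock.acquire(blocking=False):
            # Slow path: the lock is held. If it is held by this very
            # thread, blocking would deadlock, so raise instead.
            if self._owner == threading.get_ident():
                raise RuntimeError("reentrant call")
            # Held by another thread: wait for it as before.
            self._lock.acquire()
        self._owner = threading.get_ident()

    def leave(self):
        self._owner = None
        self._lock.release()


guard = OwnerTrackingLock()
guard.enter()
try:
    guard.enter()  # same thread, lock already held
except RuntimeError as exc:
    message = str(exc)
finally:
    guard.leave()  # releases the single successful enter()

print(message)
```

The design choice mirrored here is that the owner check only happens on the contended path, so the common uncontended case costs one non-blocking acquire plus a store.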

@@ -423,7 +442,8 @@ buffered_close(buffered *self, PyObject *args)
     int r;

     CHECK_INITIALIZED(self)
-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        return NULL;

     r = buffered_closed(self);
     if (r < 0)
@@ -436,7 +456,8 @@ buffered_close(buffered *self, PyObject *args)
     /* flush() will most probably re-take the lock, so drop it first */
     LEAVE_BUFFERED(self)
     res = PyObject_CallMethodObjArgs((PyObject *)self, _PyIO_str_flush, NULL);
-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        return NULL;
     if (res == NULL) {
         goto end;
     }
@@ -639,6 +660,7 @@ _buffered_init(buffered *self)
         PyErr_SetString(PyExc_RuntimeError, "can't allocate read lock");
         return -1;
     }
+    self->owner = 0;
 #endif
     /* Find out whether buffer_size is a power of 2 */
     /* XXX is this optimization useful? */
@@ -665,7 +687,8 @@ buffered_flush(buffered *self, PyObject *args)
     CHECK_INITIALIZED(self)
     CHECK_CLOSED(self, "flush of closed file")

-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        return NULL;
     res = _bufferedwriter_flush_unlocked(self, 0);
     if (res != NULL && self->readable) {
         /* Rewind the raw stream so that its position corresponds to
@@ -692,7 +715,8 @@ buffered_peek(buffered *self, PyObject *args)
         return NULL;
     }

-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        return NULL;

     if (self->writable) {
         res = _bufferedwriter_flush_unlocked(self, 1);
@@ -727,15 +751,17 @@ buffered_read(buffered *self, PyObject *args)

     if (n == -1) {
         /* The number of bytes is unspecified, read until the end of stream */
-        ENTER_BUFFERED(self)
+        if (!ENTER_BUFFERED(self))
+            return NULL;
         res = _bufferedreader_read_all(self);
         LEAVE_BUFFERED(self)
     }
     else {
         res = _bufferedreader_read_fast(self, n);
         if (res == Py_None) {
             Py_DECREF(res);
-            ENTER_BUFFERED(self)
+            if (!ENTER_BUFFERED(self))
+                return NULL;
             res = _bufferedreader_read_generic(self, n);
             LEAVE_BUFFERED(self)
         }
@@ -763,7 +789,8 @@ buffered_read1(buffered *self, PyObject *args)
     if (n == 0)
         return PyBytes_FromStringAndSize(NULL, 0);

-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        return NULL;

     if (self->writable) {
         res = _bufferedwriter_flush_unlocked(self, 1);
@@ -819,7 +846,8 @@ buffered_readinto(buffered *self, PyObject *args)

     /* TODO: use raw.readinto() instead! */
     if (self->writable) {
-        ENTER_BUFFERED(self)
+        if (!ENTER_BUFFERED(self))
+            return NULL;
         res = _bufferedwriter_flush_unlocked(self, 0);
         LEAVE_BUFFERED(self)
         if (res == NULL)
@@ -863,7 +891,8 @@ _buffered_readline(buffered *self, Py_ssize_t limit)
         goto end_unlocked;
     }

-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        goto end_unlocked;

     /* Now we try to get some more from the raw stream */
     if (self->writable) {
@@ -1013,7 +1042,8 @@ buffered_seek(buffered *self, PyObject *args)
         }
     }

-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        return NULL;

     /* Fallback: invoke raw seek() method and clear buffer */
     if (self->writable) {
@@ -1051,7 +1081,8 @@ buffered_truncate(buffered *self, PyObject *args)
         return NULL;
     }

-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self))
+        return NULL;

     if (self->writable) {
         res = _bufferedwriter_flush_unlocked(self, 0);
@@ -1705,7 +1736,10 @@ bufferedwriter_write(buffered *self, PyObject *args)
         return NULL;
     }

-    ENTER_BUFFERED(self)
+    if (!ENTER_BUFFERED(self)) {
+        PyBuffer_Release(&buf);
+        return NULL;
+    }

     /* Fast path: the data to write can be fully buffered. */
     if (!VALID_READ_BUFFER(self) && !VALID_WRITE_BUFFER(self)) {
