Commit 1fc240e

Generalize dictionary() to accept a sequence of 2-sequences. At the
outer level, the iterator protocol is used for memory-efficiency (the
outer sequence may be very large if fully materialized); at the inner
level, PySequence_Fast() is used for time-efficiency (these should
always be sequences of length 2).

dictobject.c, new functions PyDict_{Merge,Update}FromSeq2. These are
wholly analogous to PyDict_{Merge,Update}, but process a
sequence-of-2-sequences argument instead of a mapping object. For now,
I left these functions file static, so no corresponding doc changes.
It's tempting to change dict.update() to allow a sequence-of-2-seqs
argument too.

Also changed the name of dictionary's keyword argument from "mapping"
to "x". Got a better name? "mapping_or_sequence_of_pairs" isn't
attractive, although more so than "mosop" <wink>.

abstract.h, abstract.tex: Added new PySequence_Fast_GET_SIZE function,
much faster than going thru the all-purpose PySequence_Size.

libfuncs.tex:
- Document dictionary().
- Fiddle tuple() and list() to admit that their argument is optional.
- The long-winded repetitions of "a sequence, a container that supports
  iteration, or an iterator object" is getting to be a PITA. Many months
  ago I suggested factoring this out into "iterable object", where the
  definition of that could include being explicit about generators too
  (as is, I'm not sure a reader outside of PythonLabs could guess that
  "an iterator object" includes a generator call).
- Please check my curly braces -- I'm going blind <0.9 wink>.

abstract.c, PySequence_Tuple(): When PyObject_GetIter() fails, leave its
error msg alone now (the msg it produces has improved since
PySequence_Tuple was generalized to accept iterable objects, and
PySequence_Tuple was also stomping on the msg in cases it shouldn't have
even before PyObject_GetIter grew a better msg).
1 parent b016da3 commit 1fc240e

7 files changed

Lines changed: 199 additions & 36 deletions

Doc/api/abstract.tex

Lines changed: 12 additions & 5 deletions
@@ -125,7 +125,7 @@ \section{Object Protocol \label{object}}
   the Unicode string representation on success, \NULL{} on failure.
   This is the equivalent of the Python expression
   \samp{unistr(\var{o})}. Called by the
-  \function{unistr()}\bifuncindex{unistr} built-in function.
+  \function{unistr()}\bifuncindex{unistr} built-in function.
 \end{cfuncdesc}

 \begin{cfuncdesc}{int}{PyObject_IsInstance}{PyObject *inst, PyObject *cls}
@@ -715,10 +715,17 @@ \section{Sequence Protocol \label{sequence}}

 \begin{cfuncdesc}{PyObject*}{PySequence_Fast_GET_ITEM}{PyObject *o, int i}
   Return the \var{i}th element of \var{o}, assuming that \var{o} was
-  returned by \cfunction{PySequence_Fast()}, and that \var{i} is
-  within bounds. The caller is expected to get the length of the
-  sequence by calling \cfunction{PySequence_Size()} on \var{o}, since
-  lists and tuples are guaranteed to always return their true length.
+  returned by \cfunction{PySequence_Fast()}, \var{o} is not \NULL{},
+  and that \var{i} is within bounds.
+\end{cfuncdesc}
+
+\begin{cfuncdesc}{int}{PySequence_Fast_GET_SIZE}{PyObject *o}
+  Returns the length of \var{o}, assuming that \var{o} was
+  returned by \cfunction{PySequence_Fast()} and that \var{o} is
+  not \NULL{}. The size can also be gotten by calling
+  \cfunction{PySequence_Size()} on \var{o}, but
+  \cfunction{PySequence_Fast_GET_SIZE()} is faster because it can
+  assume \var{o} is a list or tuple.
 \end{cfuncdesc}

Doc/lib/libfuncs.tex

Lines changed: 24 additions & 2 deletions
@@ -175,6 +175,28 @@ \section{Built-in Functions \label{built-in-funcs}}
   \code{del \var{x}.\var{foobar}}.
 \end{funcdesc}

+\begin{funcdesc}{dictionary}{\optional{mapping-or-sequence}}
+  Return a new dictionary initialized from the optional argument.
+  If an argument is not specified, return a new empty dictionary.
+  If the argument is a mapping object, return a dictionary mapping the
+  same keys to the same values as does the mapping object.
+  Else the argument must be a sequence, a container that supports
+  iteration, or an iterator object. The elements of the argument must
+  each also be of one of those kinds, and each must in turn contain
+  exactly two objects. The first is used as a key in the new dictionary,
+  and the second as the key's value. If a given key is seen more than
+  once, the last value associated with it is retained in the new
+  dictionary.
+  For example, these all return a dictionary equal to
+  \code{\{1: 2, 2: 3\}}:
+  \code{dictionary(\{1: 2, 2: 3\})},
+  \code{dictionary(\{1: 2, 2: 3\}.items()},
+  \code{dictionary(\{1: 2, 2: 3\}.iteritems()},
+  \code{dictionary(zip((1, 2), (2, 3)))},
+  \code{dictionary([[2, 3], [1, 2]])}, and
+  \code{dictionary([(i-1, i) for i in (2, 3)])}.
+\end{funcdesc}
+
 \begin{funcdesc}{dir}{\optional{object}}
   Without arguments, return the list of names in the current local
   symbol table. With an argument, attempts to return a list of valid
@@ -472,7 +494,7 @@ \section{Built-in Functions \label{built-in-funcs}}
   may be a sequence (string, tuple or list) or a mapping (dictionary).
 \end{funcdesc}

-\begin{funcdesc}{list}{sequence}
+\begin{funcdesc}{list}{\optional{sequence}}
   Return a list whose items are the same and in the same order as
   \var{sequence}'s items. \var{sequence} may be either a sequence, a
   container that supports iteration, or an iterator object. If
@@ -726,7 +748,7 @@ \section{Built-in Functions \label{built-in-funcs}}
   printable string.
 \end{funcdesc}

-\begin{funcdesc}{tuple}{sequence}
+\begin{funcdesc}{tuple}{\optional{sequence}}
   Return a tuple whose items are the same and in the same order as
   \var{sequence}'s items. \var{sequence} may be a sequence, a
   container that supports iteration, or an iterator object.
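
The equivalences listed in the new dictionary() entry can be checked with a short script. This is only a sketch, and it assumes a modern Python 3 interpreter, where the constructor is spelled dict() rather than dictionary() and iteritems() no longer exists (iter(d.items()) stands in for it). The helper name build_examples is hypothetical and not part of this commit.

```python
def build_examples():
    """Return dicts built in each of the ways the documentation entry lists."""
    base = {1: 2, 2: 3}
    return [
        dict(base),                          # from a mapping object
        dict(base.items()),                  # from an iterable of (key, value) pairs
        dict(iter(base.items())),            # from an iterator (stand-in for iteritems())
        dict(zip((1, 2), (2, 3))),           # from zipped keys and values
        dict([[2, 3], [1, 2]]),              # inner pairs may be lists, in any order
        dict([(i - 1, i) for i in (2, 3)]),  # from a list comprehension of tuples
    ]


examples = build_examples()
```

Every entry in examples should compare equal to {1: 2, 2: 3}, whichever form of argument was used.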

Include/abstract.h

Lines changed: 8 additions & 4 deletions
@@ -951,26 +951,30 @@ xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx*/


      DL_IMPORT(PyObject *) PySequence_List(PyObject *o);
-
      /*
        Returns the sequence, o, as a list on success, and NULL on failure.
        This is equivalent to the Python expression: list(o)
      */

      DL_IMPORT(PyObject *) PySequence_Fast(PyObject *o, const char* m);
-
      /*
        Returns the sequence, o, as a tuple, unless it's already a
        tuple or list. Use PySequence_Fast_GET_ITEM to access the
-       members of this list.
+       members of this list, and PySequence_Fast_GET_SIZE to get its length.

        Returns NULL on failure. If the object does not support iteration,
        raises a TypeError exception with m as the message text.
      */

+#define PySequence_Fast_GET_SIZE(o) \
+     (PyList_Check(o) ? PyList_GET_SIZE(o) : PyTuple_GET_SIZE(o))
+     /*
+       Return the size of o, assuming that o was returned by
+       PySequence_Fast and is not NULL.
+     */
+
 #define PySequence_Fast_GET_ITEM(o, i)\
      (PyList_Check(o) ? PyList_GET_ITEM(o, i) : PyTuple_GET_ITEM(o, i))
-
      /*
        Return the ith element of o, assuming that o was returned by
        PySequence_Fast, and that i is within bounds.
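
The contract described in these comments can be modelled in a few lines of Python. This is a rough sketch of the documented behaviour, not the C implementation. The names sequence_fast and fast_get_size are hypothetical stand-ins for PySequence_Fast and PySequence_Fast_GET_SIZE.

```python
def sequence_fast(o, m):
    """Model of the documented PySequence_Fast contract.

    Return o unchanged if it is already a list or tuple, otherwise
    materialize it as a tuple. Raise TypeError with message m if o
    does not support iteration.
    """
    if isinstance(o, (list, tuple)):
        return o
    try:
        it = iter(o)
    except TypeError:
        raise TypeError(m) from None
    return tuple(it)


def fast_get_size(fast):
    """Model of PySequence_Fast_GET_SIZE.

    The shortcut is valid only because sequence_fast() guarantees that
    its result is a list or a tuple.
    """
    assert isinstance(fast, (list, tuple))
    return len(fast)
```

Because the result is always a list or tuple, the size lookup needs no general-purpose dispatch, which is the reason the macro can be faster than PySequence_Size.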

Lib/test/test_descr.py

Lines changed: 43 additions & 8 deletions
@@ -178,23 +178,33 @@ def dict_constructor():
     vereq(d, {})
     d = dictionary({})
     vereq(d, {})
-    d = dictionary(mapping={})
+    d = dictionary(x={})
     vereq(d, {})
     d = dictionary({1: 2, 'a': 'b'})
     vereq(d, {1: 2, 'a': 'b'})
+    vereq(d, dictionary(d.items()))
+    vereq(d, dictionary(x=d.iteritems()))
     for badarg in 0, 0L, 0j, "0", [0], (0,):
         try:
             dictionary(badarg)
         except TypeError:
             pass
+        except ValueError:
+            if badarg == "0":
+                # It's a sequence, and its elements are also sequences (gotta
+                # love strings <wink>), but they aren't of length 2, so this
+                # one seemed better as a ValueError than a TypeError.
+                pass
+            else:
+                raise TestFailed("no TypeError from dictionary(%r)" % badarg)
         else:
             raise TestFailed("no TypeError from dictionary(%r)" % badarg)
     try:
         dictionary(senseless={})
     except TypeError:
         pass
     else:
-        raise TestFailed("no TypeError from dictionary(senseless={}")
+        raise TestFailed("no TypeError from dictionary(senseless={})")

     try:
         dictionary({}, {})
@@ -204,11 +214,9 @@ def dict_constructor():
         raise TestFailed("no TypeError from dictionary({}, {})")

     class Mapping:
+        # Lacks a .keys() method; will be added later.
         dict = {1:2, 3:4, 'a':1j}

-        def __getitem__(self, i):
-            return self.dict[i]
-
     try:
         dictionary(Mapping())
     except TypeError:
@@ -217,9 +225,36 @@ def __getitem__(self, i):
         raise TestFailed("no TypeError from dictionary(incomplete mapping)")

     Mapping.keys = lambda self: self.dict.keys()
-    d = dictionary(mapping=Mapping())
+    Mapping.__getitem__ = lambda self, i: self.dict[i]
+    d = dictionary(x=Mapping())
     vereq(d, Mapping.dict)

+    # Init from sequence of iterable objects, each producing a 2-sequence.
+    class AddressBookEntry:
+        def __init__(self, first, last):
+            self.first = first
+            self.last = last
+        def __iter__(self):
+            return iter([self.first, self.last])
+
+    d = dictionary([AddressBookEntry('Tim', 'Warsaw'),
+                    AddressBookEntry('Barry', 'Peters'),
+                    AddressBookEntry('Tim', 'Peters'),
+                    AddressBookEntry('Barry', 'Warsaw')])
+    vereq(d, {'Barry': 'Warsaw', 'Tim': 'Peters'})
+
+    d = dictionary(zip(range(4), range(1, 5)))
+    vereq(d, dictionary([(i, i+1) for i in range(4)]))
+
+    # Bad sequence lengths.
+    for bad in ['tooshort'], ['too', 'long', 'by 1']:
+        try:
+            dictionary(bad)
+        except ValueError:
+            pass
+        else:
+            raise TestFailed("no ValueError from dictionary(%r)" % bad)
+
 def test_dir():
     if verbose:
         print "Testing dir() ..."
@@ -1830,7 +1865,7 @@ def keywords():
     vereq(unicode(string='abc', errors='strict'), u'abc')
     vereq(tuple(sequence=range(3)), (0, 1, 2))
     vereq(list(sequence=(0, 1, 2)), range(3))
-    vereq(dictionary(mapping={1: 2}), {1: 2})
+    vereq(dictionary(x={1: 2}), {1: 2})

     for constructor in (int, float, long, complex, str, unicode,
                         tuple, list, dictionary, file):
@@ -2371,7 +2406,7 @@ def f(a): return a
     vereq(f.__call__(a=42), 42)
     a = []
     list.__init__(a, sequence=[0, 1, 2])
-    vereq(a, [0, 1, 2])
+    vereq(a, [0, 1, 2])

 def test_main():
     class_docstrings()
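
The error cases exercised by these tests can be reproduced outside the test harness. The sketch below assumes Python 3, where the constructor is dict() and the 0L literal is gone. AddressBookEntry is copied from the test above, while classify is a hypothetical helper added only for illustration.

```python
class AddressBookEntry:
    """Iterable that produces exactly two objects: first name, last name."""

    def __init__(self, first, last):
        self.first = first
        self.last = last

    def __iter__(self):
        return iter([self.first, self.last])


def classify(arg):
    """Return the name of the exception dict(arg) raises, or 'ok'."""
    try:
        dict(arg)
    except TypeError:
        return "TypeError"
    except ValueError:
        return "ValueError"
    return "ok"


# Later entries win when a key (here, a first name) is repeated.
book = dict([AddressBookEntry('Tim', 'Warsaw'),
             AddressBookEntry('Barry', 'Peters'),
             AddressBookEntry('Tim', 'Peters'),
             AddressBookEntry('Barry', 'Warsaw')])

# Non-iterables and non-iterable elements give TypeError; iterable
# elements of the wrong length give ValueError.
outcomes = {repr(bad): classify(bad)
            for bad in (0, 0j, "0", [0], (0,), ['tooshort'],
                        ['too', 'long', 'by 1'])}
```

The string "0" is the odd case the test comment points out: it is iterable and so are its elements, so the failure is about length rather than type.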

Misc/NEWS

Lines changed: 9 additions & 0 deletions
@@ -4,6 +4,11 @@ XXX Planned XXX Release date: 14-Nov-2001

 Type/class unification and new-style classes

+- dictionary() now accepts an iterable object producing 2-sequences.
+  For example, dictionary(d.items()) == d for any dictionary d. The
+  argument, and the elements of the argument, can be any iterable
+  objects.
+
 - Methods of built-in types now properly check for keyword arguments
   (formerly these were silently ignored). The only built-in methods
   that take keyword arguments are __call__, __init__ and __new__.
@@ -31,6 +36,10 @@ Build

 C API

+- New function PySequence_Fast_GET_SIZE() returns the size of a non-
+  NULL result from PySequence_Fast(), more quickly than calling
+  PySequence_Size().
+
 New platforms

 - Updated RISCOS port by Dietmar Schwertberger.

Objects/abstract.c

Lines changed: 1 addition & 1 deletion
@@ -1278,7 +1278,7 @@ PySequence_Tuple(PyObject *v)
     /* Get iterator. */
     it = PyObject_GetIter(v);
     if (it == NULL)
-        return type_error("tuple() argument must support iteration");
+        return NULL;

     /* Guess result size and allocate space. */
     n = PySequence_Size(v);

Objects/dictobject.c

Lines changed: 102 additions & 16 deletions
@@ -993,7 +993,89 @@ dict_update(PyObject *mp, PyObject *other)

 /* Update unconditionally replaces existing items.
    Merge has a 3rd argument 'override'; if set, it acts like Update,
-   otherwise it leaves existing items unchanged. */
+   otherwise it leaves existing items unchanged.
+
+   PyDict_{Update,Merge} update/merge from a mapping object.
+
+   PyDict_{Update,Merge}FromSeq2 update/merge from any iterable object
+   producing iterable objects of length 2.
+*/
+
+static int
+PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
+{
+    PyObject *it;       /* iter(seq2) */
+    int i;              /* index into seq2 of current element */
+    PyObject *item;     /* seq2[i] */
+    PyObject *fast;     /* item as a 2-tuple or 2-list */
+
+    assert(d != NULL);
+    assert(PyDict_Check(d));
+    assert(seq2 != NULL);
+
+    it = PyObject_GetIter(seq2);
+    if (it == NULL)
+        return -1;
+
+    for (i = 0; ; ++i) {
+        PyObject *key, *value;
+        int n;
+
+        fast = NULL;
+        item = PyIter_Next(it);
+        if (item == NULL) {
+            if (PyErr_Occurred())
+                goto Fail;
+            break;
+        }
+
+        /* Convert item to sequence, and verify length 2. */
+        fast = PySequence_Fast(item, "");
+        if (fast == NULL) {
+            if (PyErr_ExceptionMatches(PyExc_TypeError))
+                PyErr_Format(PyExc_TypeError,
+                             "cannot convert dictionary update "
+                             "sequence element #%d to a sequence",
+                             i);
+            goto Fail;
+        }
+        n = PySequence_Fast_GET_SIZE(fast);
+        if (n != 2) {
+            PyErr_Format(PyExc_ValueError,
+                         "dictionary update sequence element #%d "
+                         "has length %d; 2 is required",
+                         i, n);
+            goto Fail;
+        }
+
+        /* Update/merge with this (key, value) pair. */
+        key = PySequence_Fast_GET_ITEM(fast, 0);
+        value = PySequence_Fast_GET_ITEM(fast, 1);
+        if (override || PyDict_GetItem(d, key) == NULL) {
+            int status = PyDict_SetItem(d, key, value);
+            if (status < 0)
+                goto Fail;
+        }
+        Py_DECREF(fast);
+        Py_DECREF(item);
+    }
+
+    i = 0;
+    goto Return;
+Fail:
+    Py_XDECREF(item);
+    Py_XDECREF(fast);
+    i = -1;
+Return:
+    Py_DECREF(it);
+    return i;
+}
+
+static int
+PyDict_UpdateFromSeq2(PyObject *d, PyObject *seq2)
+{
+    return PyDict_MergeFromSeq2(d, seq2, 1);
+}

 int
 PyDict_Update(PyObject *a, PyObject *b)
@@ -1699,23 +1781,20 @@ static int
 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
 {
     PyObject *arg = NULL;
-    static char *kwlist[] = {"mapping", 0};
+    static char *kwlist[] = {"x", 0};
+    int result = 0;

     if (!PyArg_ParseTupleAndKeywords(args, kwds, "|O:dictionary",
                                      kwlist, &arg))
-        return -1;
-    if (arg != NULL) {
-        if (PyDict_Merge(self, arg, 1) < 0) {
-            /* An error like "AttributeError: keys" is too
-               cryptic in this context. */
-            if (PyErr_ExceptionMatches(PyExc_AttributeError)) {
-                PyErr_SetString(PyExc_TypeError,
-                                "argument must be of a mapping type");
-            }
-            return -1;
-        }
+        result = -1;
+
+    else if (arg != NULL) {
+        if (PyObject_HasAttrString(arg, "keys"))
+            result = PyDict_Merge(self, arg, 1);
+        else
+            result = PyDict_MergeFromSeq2(self, arg, 1);
     }
-    return 0;
+    return result;
 }

 static PyObject *
@@ -1725,8 +1804,15 @@ dict_iter(dictobject *dict)
 }

 static char dictionary_doc[] =
-"dictionary() -> new empty dictionary\n"
-"dictionary(mapping) -> new dict initialized from mapping's key+value pairs";
+"dictionary() -> new empty dictionary.\n"
+"dictionary(mapping) -> new dict initialized from a mapping object's\n"
+" (key, value) pairs.\n"
+"dictionary(seq) -> new dict initialized from the 2-element elements of\n"
+" a sequence; for example, from mapping.items(). seq must be an\n"
+" iterable object, producing iterable objects each producing exactly\n"
+" two objects, the first of which is used as a key and the second as\n"
+" its value. If a given key is seen more than once, the dict retains\n"
+" the last value associated with it.";

 PyTypeObject PyDict_Type = {
     PyObject_HEAD_INIT(&PyType_Type)
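
For readers who find the control flow easier to follow without the reference counting, here is a pure-Python model of the merge loop and of the dispatch in dict_init. It is a sketch of the logic in this diff, not a drop-in replacement. The names merge_from_seq2 and dict_init_model are hypothetical, and `key not in d` is used as an approximation of the PyDict_GetItem(d, key) == NULL check.

```python
def merge_from_seq2(d, seq2, override=True):
    """Model of PyDict_MergeFromSeq2: merge an iterable of pairs into d."""
    # Outer level: plain iteration, so seq2 is never fully materialized.
    for i, item in enumerate(seq2):
        # Inner level: the PySequence_Fast step, list/tuple as-is, else a tuple.
        try:
            fast = item if isinstance(item, (list, tuple)) else tuple(item)
        except TypeError:
            raise TypeError("cannot convert dictionary update "
                            "sequence element #%d to a sequence" % i) from None
        n = len(fast)
        if n != 2:
            raise ValueError("dictionary update sequence element #%d "
                             "has length %d; 2 is required" % (i, n))
        key, value = fast
        # With override unset, existing keys keep their current value.
        if override or key not in d:
            d[key] = value
    return d


def dict_init_model(arg=None):
    """Model of the new dict_init dispatch: mapping if it has .keys()."""
    result = {}
    if arg is not None:
        if hasattr(arg, "keys"):
            for key in arg.keys():
                result[key] = arg[key]
        else:
            merge_from_seq2(result, arg, override=True)
    return result
```

Note that the choice between the two paths rests only on the presence of a keys attribute, which is why the Mapping class in the tests is rejected until keys is added.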
