You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There is a general need for looping over not only functions on scalars
but also over functions on vectors (or arrays), as explained on http://scipy.org/scipy/numpy/wiki/GeneralLoopingFunctions. We propose
to realize this concept by generalizing the universal functions
(ufuncs), and provide a C implementation that adds ~500 lines
to the numpy code base. In current (specialized) ufuncs, the elementary
function is limited to element-by-element operations, whereas the
generalized version supports "sub-array" by "sub-array" operations.
The Perl vector library PDL provides a similar functionality and its
terms are re-used in the following.
Each generalized ufunc has information associated with it that states
what the "core" dimensionality of the inputs is, as well as the
corresponding dimensionality of the outputs (the element-wise ufuncs
have zero core dimensions). The list of the core dimensions for all
arguments is called the "signature" of a ufunc. For example, the
ufunc numpy.add has signature "(),()->()" defining two scalar inputs
and one scalar output.
Another example is (see the GeneralLoopingFunctions page) the function inner1d(a,b) with a signature of "(i),(i)->()". This applies the
inner product along the last axis of each input, but keeps the
remaining indices intact. For example, where a is of shape (3,5,N)
and b is of shape (5,N), this will return an output of shape (3,5).
The underlying elementary function is called 3*5 times. In the
signature, we specify one core dimension "(i)" for each input and zero core
dimensions "()" for the output, since it takes two 1-d arrays and
returns a scalar. By using the same name "i", we specify that the two
corresponding dimensions should be of the same size (or one of them is
of size 1 and will be broadcasted).
The dimensions beyond the core dimensions are called "loop" dimensions. In
the above example, this corresponds to (3,5).
The usual numpy "broadcasting" rules apply, where the signature
determines how the dimensions of each input/output object are split
into core and loop dimensions:
While an input array has a smaller dimensionality than the corresponding
number of core dimensions, 1's are pre-pended to its shape.
The core dimensions are removed from all inputs and the remaining
dimensions are broadcasted; defining the loop dimensions.
The output is given by the loop dimensions plus the output core dimensions.
= Definitions =
Elementary Function:
Each ufunc consists of an elementary function that performs the
most basic operation on the smallest portion of array arguments
(e.g. adding two numbers is the most basic operation in adding two
arrays). The ufunc applies the elementary function multiple times
on different parts of the arrays. The input/output of elementary
functions can be vectors; e.g., the elementary function of inner1d
takes two vectors as input.
Signature:
A signature is a string describing the input/output dimensions of
the elementary function of a ufunc. See section below for more
details.
Core Dimension:
The dimensionality of each input/output of an elementary function
is defined by its core dimensions (zero core dimensions correspond
to a scalar input/output). The core dimensions are mapped to the
last dimensions of the input/output arrays.
Dimension Name:
A dimension name represents a core dimension in the signature.
Different dimensions may share a name, indicating that they are of
the same size (or are broadcastable).
Dimension Index:
A dimension index is an integer representing a dimension name. It
enumerates the dimension names according to the order of the first
occurrence of each name in the signature.
= Details of Signature =
The signature defines "core" dimensionality of input and output
variables, and thereby also defines the contraction of the
dimensions. The signature is represented by a string of the
following format:
Core dimensions of each input or output array are represented by a
list of dimension names in parentheses, "(i_1,...,i_N)"; a scalar
input/output is denoted by "()". Instead of "i_1", "i_2", etc, one can
use any valid Python variable name.
Dimension lists for different arguments are separated by ",".
Input/output arguments are separated by "->".
If one uses the same dimension name in multiple locations, this
enforces the same size (or broadcastable size) of the corresponding
dimensions.
Core dimensions that share the same name must be broadcastable, as
the two "i"s in our example above. Each dimension name typically
corresponding to one level of looping in the elementary function's
implementation.
White spaces are ignored.
Here are some examples of signatures.
|| add || "(),()->()" || ||
|| inner1d || "(i),(i)->()" || ||
|| sum1d || "(i)->()" || ||
|| dot2d || "(m,n),(n,p)->(m,p)" || (matrix multiplication) ||
|| outer_inner || "(i,t),(j,t)->(i,j)" || (inner over the last dimension, outer over the second to last, and loop/broadcast over the rest.) ||
= C-API for implementing Elementary Functions =
The current interface remains unchanged, and PyUFunc_FromFuncAndData
can still be used to implement (specialized) ufuncs, consisting of
scalar elementary functions.
One can use PyUFunc_FromFuncAndDataAndSignature to declare a more
general ufunc. The argument list is the same as PyUFunc_FromFuncAndData, with an additional argument specifying the
signature as C string.
Furthermore, the callback function is of the same type as before, void (*foo)(char **args, intp *dimensions, intp *steps, void *func).
When invoked, args is a list of length nargs containing
the data of all input/output arguments. For a scalar elementary
function, steps is also of length nargs, denoting the strides used
for the arguments. dimensions is a pointer to a single integer
defining the size of the axis to be looped over.
For a non-trivial signature, dimensions will also contain the sizes
of the core dimensions as well, starting at the second entry. Only
one size is provided for each unique dimension name and the sizes are
given according to the first occurrence of a dimension name in the
signature.
The first nargs elements of steps remain the same as for scalar
ufuncs. The following elements contain the strides of all core
dimensions for all arguments in order.
For example, consider a ufunc with signature "(i,j),(i)->()". In
this case, args will contain three pointers to the data of the
input/output arrays a, b, c. Furthermore, dimensions will be [N, I, J] to define the size of N of the loop and the sizes I and J
for the core dimensions i and j. Finally, steps will be [a_N, b_N, c_N, a_i, a_j, b_i], containing all necessary strides.
= License =
The attached code and the above documentation was written by Wenjie
Fu [email protected] and Hans-Andreas Engel [email protected], and
is provided under the numpy license:
Copyright (c) 2005, NumPy Developers
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
* Neither the name of the NumPy Developers nor the names of any
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The text was updated successfully, but these errors were encountered:
Original ticket http://projects.scipy.org/numpy/ticket/887 on 2008-08-13 by trac user fuwenjie, assigned to unknown.
= Generalized Universal Functions =
There is a general need for looping over not only functions on scalars
but also over functions on vectors (or arrays), as explained on
http://scipy.org/scipy/numpy/wiki/GeneralLoopingFunctions. We propose
to realize this concept by generalizing the universal functions
(ufuncs), and provide a C implementation that adds ~500 lines
to the numpy code base. In current (specialized) ufuncs, the elementary
function is limited to element-by-element operations, whereas the
generalized version supports "sub-array" by "sub-array" operations.
The Perl vector library PDL provides a similar functionality and its
terms are re-used in the following.
Each generalized ufunc has information associated with it that states
what the "core" dimensionality of the inputs is, as well as the
corresponding dimensionality of the outputs (the element-wise ufuncs
have zero core dimensions). The list of the core dimensions for all
arguments is called the "signature" of a ufunc. For example, the
ufunc numpy.add has signature
"(),()->()"
defining two scalar inputsand one scalar output.
Another example is (see the GeneralLoopingFunctions page) the function
inner1d(a,b)
with a signature of"(i),(i)->()"
. This applies theinner product along the last axis of each input, but keeps the
remaining indices intact. For example, where
a
is of shape(3,5,N)
and
b
is of shape(5,N)
, this will return an output of shape(3,5)
.The underlying elementary function is called 3*5 times. In the
signature, we specify one core dimension
"(i)"
for each input and zero coredimensions
"()"
for the output, since it takes two 1-d arrays andreturns a scalar. By using the same name
"i"
, we specify that the twocorresponding dimensions should be of the same size (or one of them is
of size 1 and will be broadcasted).
The dimensions beyond the core dimensions are called "loop" dimensions. In
the above example, this corresponds to
(3,5)
.The usual numpy "broadcasting" rules apply, where the signature
determines how the dimensions of each input/output object are split
into core and loop dimensions:
number of core dimensions, 1's are pre-pended to its shape.
dimensions are broadcasted; defining the loop dimensions.
= Definitions =
Elementary Function:
Signature:
Core Dimension:
Dimension Name:
Dimension Index:
= Details of Signature =
The signature defines "core" dimensionality of input and output
variables, and thereby also defines the contraction of the
dimensions. The signature is represented by a string of the
following format:
list of dimension names in parentheses,
"(i_1,...,i_N)"
; a scalarinput/output is denoted by
"()"
. Instead of"i_1"
,"i_2"
, etc, one canuse any valid Python variable name.
","
.Input/output arguments are separated by
"->"
.enforces the same size (or broadcastable size) of the corresponding
dimensions.
The formal syntax of signatures is as follows.
Notes:
the two
"i"
s in our example above. Each dimension name typicallycorresponding to one level of looping in the elementary function's
implementation.
Here are some examples of signatures.
|| add ||
"(),()->()"
|| |||| inner1d ||
"(i),(i)->()"
|| |||| sum1d ||
"(i)->()"
|| |||| dot2d ||
"(m,n),(n,p)->(m,p)"
|| (matrix multiplication) |||| outer_inner ||
"(i,t),(j,t)->(i,j)"
|| (inner over the last dimension, outer over the second to last, and loop/broadcast over the rest.) ||= C-API for implementing Elementary Functions =
The current interface remains unchanged, and
PyUFunc_FromFuncAndData
can still be used to implement (specialized) ufuncs, consisting of
scalar elementary functions.
One can use
PyUFunc_FromFuncAndDataAndSignature
to declare a moregeneral ufunc. The argument list is the same as
PyUFunc_FromFuncAndData
, with an additional argument specifying thesignature as C string.
Furthermore, the callback function is of the same type as before,
void (*foo)(char **args, intp *dimensions, intp *steps, void *func)
.When invoked,
args
is a list of lengthnargs
containingthe data of all input/output arguments. For a scalar elementary
function,
steps
is also of lengthnargs
, denoting the strides usedfor the arguments.
dimensions
is a pointer to a single integerdefining the size of the axis to be looped over.
For a non-trivial signature,
dimensions
will also contain the sizesof the core dimensions as well, starting at the second entry. Only
one size is provided for each unique dimension name and the sizes are
given according to the first occurrence of a dimension name in the
signature.
The first
nargs
elements ofsteps
remain the same as for scalarufuncs. The following elements contain the strides of all core
dimensions for all arguments in order.
For example, consider a ufunc with signature
"(i,j),(i)->()"
. Inthis case,
args
will contain three pointers to the data of theinput/output arrays
a
,b
,c
. Furthermore,dimensions
will be[N, I, J]
to define the size ofN
of the loop and the sizesI
andJ
for the core dimensions
i
andj
. Finally,steps
will be[a_N, b_N, c_N, a_i, a_j, b_i]
, containing all necessary strides.= License =
The attached code and the above documentation was written by Wenjie
Fu [email protected] and Hans-Andreas Engel [email protected], and
is provided under the numpy license:
The text was updated successfully, but these errors were encountered: