MAINT: Add new recfunctions to numpy function API
ahaldane committed Nov 23, 2018
commit 61371de744b363eacdb2ae277c33d365164380f3
23 changes: 23 additions & 0 deletions numpy/lib/recfunctions.py
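For context, here is a minimal sketch of what registering these functions with the `__array_function__` protocol makes possible. It is not part of the commit. `LoggedArray` is a hypothetical duck-array class invented for illustration, and the sketch assumes a NumPy version where the protocol is enabled by default (1.17 or later).

```python
import numpy as np
from numpy.lib import recfunctions as rfn


class LoggedArray:
    """Hypothetical duck array that records which NumPy functions reach it."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []

    def __array_function__(self, func, types, args, kwargs):
        # Record the public function name, then defer to NumPy's own
        # implementation on the unwrapped ndarray arguments.
        self.calls.append(func.__name__)
        unwrapped = tuple(a.data if isinstance(a, LoggedArray) else a
                          for a in args)
        return func(*unwrapped, **kwargs)


points = np.array([(1, 2.0), (3, 4.0)], dtype=[('x', 'i4'), ('y', 'f8')])
wrapped = LoggedArray(points)

# The dispatcher returns (arr,), so NumPy finds LoggedArray.__array_function__
# on the first argument and hands the call over to it.
flat = rfn.structured_to_unstructured(wrapped)
print(wrapped.calls)
print(flat)
```

Each dispatcher only has to return the arguments that may carry an override, which is why the ones in this diff return `(arr,)` or `dst, src` and ignore the remaining keywords.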
@@ -888,6 +888,12 @@ def _get_fields_and_offsets(dt, offset=0):
            fields.extend(_get_fields_and_offsets(field[0], field[1] + offset))
    return fields


def _structured_to_unstructured_dispatcher(arr, dtype=None, copy=None,
                                           casting=None):
    return (arr,)

@array_function_dispatch(_structured_to_unstructured_dispatcher)
def structured_to_unstructured(arr, dtype=None, copy=False, casting='unsafe'):
Member

I think there are places we learnt that unsafe was a bad default, but ended up stuck with it, leaving users surprised by the conversion.

Should we apply that learning here, and pick a more conservative default?

Member Author

I may have missed that. Maybe in #8733? But that was about assignment using unsafe casting, with no option to specify otherwise, unlike here where there is a keyword the user can specify.

Member

Yeah, that issue was one of the ones I was thinking of - thanks for linking it, I was looking for that for unrelated reasons too!

Unsafe just feels like an... unsafe default to me. In my opinion, unsafe behavior should be something you ask for, not something you get by default. You're picking between:

  • as is: f(...) vs f(..., casting='safe')
  • proposed: f(..., casting='unsafe') vs f(...)

I'd much rather see the word 'unsafe' to tell me I need to think more carefully about that line of code, rather than having to look for the absence of it.

I don't have a good memory of how the other casting modes behave. I'd be inclined to pick same_kind to match the default value of the casting argument for ufuncs.
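As a quick reminder of how the casting modes differ, the following sketch (not from the thread) asks `np.can_cast` which modes permit a few representative casts. The three casts chosen are illustrative assumptions, not cases singled out by the reviewers.

```python
import numpy as np

modes = ['no', 'equiv', 'safe', 'same_kind', 'unsafe']
casts = [('i4', 'i8'), ('f8', 'f4'), ('f8', 'i8')]

# For each cast, collect the casting modes under which NumPy allows it.
allowed_by_cast = {}
for src, dst in casts:
    allowed_by_cast[(src, dst)] = [
        m for m in modes if np.can_cast(src, dst, casting=m)
    ]
    print(f"{src} -> {dst}: {allowed_by_cast[(src, dst)]}")
```

Widening an integer is allowed from `safe` upward, narrowing a float needs at least `same_kind`, and float to integer is only allowed under `unsafe`.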

Member Author

@ahaldane ahaldane Nov 26, 2018

Here's an argument for unsafe:

First, it matches the default for the same keyword in astype, so it's easier for users to remember if they are used to astype.

Second, it seems like most of the time the user wants unsafe, because many common casts are ruled out otherwise. For instance, casts from f8 to i8 are disallowed with same_kind, but I expect this is a very common cast.

Actually, for reasons I don't understand, ufuncs seem to allow casts from f8 to i8 even though they supposedly use same_kind:

>>> np.arange(3, dtype='f8').astype('i8', casting='same_kind')                 
TypeError: Cannot cast array from dtype('float64') to dtype('int64') according to the rule 'same_kind'
>>> np.add(np.arange(3, dtype='f8'), np.arange(3, dtype='i8'))                 
array([0., 2., 4.])
>>> np.can_cast('f8', 'i8', casting='same_kind')
False

So ufuncs appear to use unsafe casting despite the keyword default??
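One possible reading of the session above, offered as a sketch rather than as a conclusion from the thread: `np.add` on f8 and i8 inputs never casts f8 to i8. It promotes the i8 input to f8, which is a safe cast, and returns f8. The `same_kind` rule only comes into play when the f8 result has to be written into an i8 output, which the hypothetical `out` array below forces.

```python
import numpy as np

a = np.arange(3, dtype='f8')
b = np.arange(3, dtype='i8')

# Mixed f8/i8 inputs promote to float64; the i8 operand is cast up.
promoted = np.result_type(a, b)
result_dtype = np.add(a, b).dtype

# Writing the float64 result into an int64 output is an f8 -> i8 cast,
# which the ufunc default casting='same_kind' rejects.
out = np.empty(3, dtype='i8')
try:
    np.add(a, b, out=out)
    same_kind_raised = False
except TypeError:
    same_kind_raised = True

# Asking for it explicitly is accepted.
np.add(a, b, out=out, casting='unsafe')
print(promoted, result_dtype, same_kind_raised, out)
```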

    """
    Converts an n-D structured array into an (n+1)-D unstructured array.
@@ -968,6 +974,11 @@ def structured_to_unstructured(arr, dtype=None, copy=False, casting='unsafe'):
    # finally is it safe to view the packed fields as the unstructured type
    return arr.view((out_dtype, sum(counts)))
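To make the default under discussion concrete, here is a hedged usage sketch of `structured_to_unstructured` with and without an explicit `casting` argument. The array `samples` is made up for illustration, and the behaviour shown assumes the released implementation, which forwards `casting` to `astype`.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

samples = np.array([(1.5, 2.5), (3.5, 4.5)],
                   dtype=[('x', 'f8'), ('y', 'f8')])

# Default casting='unsafe': the f8 fields are truncated to i8 without complaint.
as_int = rfn.structured_to_unstructured(samples, dtype='i8')
print(as_int)

# A stricter mode refuses the same float -> int conversion.
try:
    rfn.structured_to_unstructured(samples, dtype='i8', casting='same_kind')
    strict_raised = False
except TypeError:
    strict_raised = True
print(strict_raised)
```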

def _unstructured_to_structured_dispatcher(arr, dtype=None, names=None,
                                           align=None, copy=None, casting=None):
    return (arr,)

@array_function_dispatch(_unstructured_to_structured_dispatcher)
def unstructured_to_structured(arr, dtype=None, names=None, align=False,
                               copy=False, casting='unsafe'):
    """
@@ -1061,6 +1072,10 @@ def unstructured_to_structured(arr, dtype=None, names=None, align=False,
    # finally view as the final nested dtype and remove the last axis
    return arr.view(out_dtype)[..., 0]
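A short round-trip sketch may help show how the two conversion functions relate. The array `grid` and the field names `x` and `y` are illustrative assumptions.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

grid = np.arange(6).reshape(3, 2)

# The last axis (length 2) becomes the two named fields.
records = rfn.unstructured_to_structured(grid, names=['x', 'y'])
print(records.dtype.names, records.shape)

# Converting back restores the original (3, 2) layout.
restored = rfn.structured_to_unstructured(records)
print(restored)
```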

def _apply_along_fields_dispatcher(func, arr):
    return (arr,)

@array_function_dispatch(_apply_along_fields_dispatcher)
def apply_along_fields(func, arr):
    """
    Apply function 'func' as a reduction across fields of a structured array.
@@ -1100,6 +1115,10 @@ def apply_along_fields(func, arr):
    # works and avoids axis requirement, but very, very slow:
    #return np.apply_along_axis(func, -1, uarr)
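As a usage sketch for `apply_along_fields`, the example below reduces each record to the mean of its fields. The data values are made up for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

rows = np.array([(1, 2, 5), (4, 5, 7), (7, 8, 11), (10, 11, 12)],
                dtype=[('x', 'i4'), ('y', 'f4'), ('z', 'f8')])

# func is called with axis=-1 on the unstructured view, so it must accept
# an axis argument; np.mean does.
row_means = rfn.apply_along_fields(np.mean, rows)
print(row_means)
```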

def _assign_fields_by_name_dispatcher(dst, src, zero_unassigned=None):
    return dst, src

@array_function_dispatch(_assign_fields_by_name_dispatcher)
def assign_fields_by_name(dst, src, zero_unassigned=True):
    """
    Assigns values from one structured array to another by field name.
@@ -1137,6 +1156,10 @@ def assign_fields_by_name(dst, src, zero_unassigned=True):
            assign_fields_by_name(dst[name], src[name],
                                  zero_unassigned)
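The effect of `zero_unassigned` is easiest to see on a small case. In this sketch the arrays and field names are hypothetical: `src` shares only field `b` with the destination.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

src = np.array([(10.0, 1), (20.0, 2), (30.0, 3)],
               dtype=[('b', 'f8'), ('c', 'i4')])

# Default: destination fields missing from src are zeroed.
dst_zeroed = np.ones(3, dtype=[('a', 'i4'), ('b', 'f8')])
rfn.assign_fields_by_name(dst_zeroed, src)

# zero_unassigned=False: fields missing from src keep their old values.
dst_kept = np.ones(3, dtype=[('a', 'i4'), ('b', 'f8')])
rfn.assign_fields_by_name(dst_kept, src, zero_unassigned=False)

print(dst_zeroed)
print(dst_kept)
```

Field `c` exists only in `src`, so it is ignored in both calls.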

def _require_fields_dispatcher(array, required_dtype):
    return (array,)

@array_function_dispatch(_require_fields_dispatcher)
def require_fields(array, required_dtype):
Member

This name strikes me as a little odd, but I also can't think of a better one.

It might be handy to use the word "require" in the description somewhere, to make the name easier to remember.
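To illustrate what the name is getting at, here is a small sketch of `require_fields` in use: the caller states the fields and types it requires and gets back an array with exactly that dtype. The input array and the field name `newf` are illustrative.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

table = np.ones(4, dtype=[('a', 'i4'), ('b', 'f8'), ('c', 'u1')])

# Keep 'b' (cast to f4) and 'c'; drop 'a'.
subset = rfn.require_fields(table, [('b', 'f4'), ('c', 'u1')])

# A required field that the input lacks is filled with zeros.
padded = rfn.require_fields(table, [('b', 'f4'), ('newf', 'u1')])

print(subset.dtype, subset)
print(padded.dtype, padded)
```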

    """
    Casts a structured array to a new dtype using assignment by field-name.