Interoperability with Numpy #8

@marcodeangelis

Compatibility with Numpy

Currently, the intervals library does not support NumPy functions. For example, the following code does not work.

from intervals.number import Interval
import numpy as np
x=Interval(3)
y=np.sin(x)

The code emits the warning
VisibleDeprecationWarning: Creating an ndarray from nested sequences exceeding the maximum number of dimensions of 32 is deprecated. If you mean to do this, you must specify 'dtype=object' when creating the ndarray.

and then fails with the error

y=np.sin(x)
AttributeError: 'Interval' object has no attribute 'sin'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/marcus/Code/Git/public/intervals/test_issue.py", line 4, in <module>
    y=np.sin(x)
TypeError: loop of ufunc does not support argument 0 of type Interval which has no callable sin method

We currently avoid this error by providing our own sin function. This means that NumPy functions such as np.sin must be replaced by their intervals counterparts:

from intervals.methods import sin

However, this is not optimal for two reasons: (1) users are not free to call NumPy functions directly, which reduces interoperability, and (2) binary operations that are dispatched through the NumPy API behave unexpectedly.

An example of (2) is the following.

import intervals.number as number
import numpy as np

a = number.Interval([1,2])
b = np.array([1,2])

c = a+b  # works as intended
print(f'a+b=\n{c}')
#a+b=
#[2.0,2.0]
#[4.0,4.0]

c = b+a  # gives an unexpected result
print(f'b+a=\n{c}')
#b+a=
#[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[2.0,2.0]
#                                [3.0,3.0]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
#
# [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[3.0,3.0]
#                                [4.0,4.0]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]

Solution

The NumPy __array_ufunc__ protocol lets an object control what happens when a NumPy ufunc is executed on it.

Extract from https://numpy.org/devdocs/user/basics.interoperability.html

A universal function (or ufunc for short) is a “vectorized” wrapper for a function that takes a fixed number of specific inputs and produces a fixed number of specific outputs. The output of the ufunc (and its methods) is not necessarily a ndarray, if not all input arguments are ndarrays. Indeed, if any input defines an __array_ufunc__ method, control will be passed completely to that function, i.e., the ufunc is overridden. The __array_ufunc__ method defined on that (non-ndarray) object has access to the NumPy ufunc. Because ufuncs have a well-defined structure, the foreign __array_ufunc__ method may rely on ufunc attributes like .at(), .reduce(), and others.

A subclass can override what happens when executing NumPy ufuncs on it by overriding the default ndarray.__array_ufunc__ method. This method is executed instead of the ufunc and should return either the result of the operation, or NotImplemented if the operation requested is not implemented.

For general NumPy functions that are not ufuncs, the NumPy __array_function__ protocol is also available.
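
To make the __array_function__ route concrete, here is a minimal sketch of how a non-ufunc such as np.sum could be redirected. The IntervalVector class, the HANDLED registry and the implements decorator are all hypothetical names invented for this example; they are not part of the intervals library.

```python
import numpy as np

# Hypothetical registry mapping NumPy functions to interval-aware versions.
HANDLED = {}


def implements(np_function):
    """Register an interval-aware replacement for a NumPy function."""
    def decorator(func):
        HANDLED[np_function] = func
        return func
    return decorator


class IntervalVector:
    """Hypothetical vector of intervals stored as lower/upper bound arrays."""

    def __init__(self, lo, hi):
        self.lo = np.asarray(lo, dtype=float)
        self.hi = np.asarray(hi, dtype=float)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this instead of its own implementation of `func`.
        if func not in HANDLED:
            return NotImplemented  # NumPy then raises TypeError
        return HANDLED[func](*args, **kwargs)


@implements(np.sum)
def interval_sum(x):
    # Sum of intervals: add the lower bounds and the upper bounds separately.
    return IntervalVector(x.lo.sum(), x.hi.sum())


x = IntervalVector([1.0, 3.0], [2.0, 5.0])
total = np.sum(x)  # dispatched to interval_sum
```

Functions that are not registered, such as np.mean in this sketch, fail with a TypeError rather than silently producing a nested object array.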

Extract from https://numpy.org/devdocs/reference/arrays.classes.html#special-attributes-and-methods

The presence of __array_ufunc__ also influences how ndarray handles binary operations like arr + obj and arr < obj when arr is an ndarray and obj is an instance of a custom class. There are two possibilities. If obj.__array_ufunc__ is present and not None, then ndarray.__add__ and friends will delegate to the ufunc machinery, meaning that arr + obj becomes np.add(arr, obj), and then add invokes obj.__array_ufunc__. This is useful if you want to define an object that acts like an array.

Alternatively, if obj.__array_ufunc__ is set to None, then as a special case, special methods like ndarray.__add__ will notice this and unconditionally raise TypeError. This is useful if you want to create objects that interact with arrays via binary operations, but are not themselves arrays. For example, a units handling system might have an object m representing the “meters” unit, and want to support the syntax arr * m to represent that the array has units of “meters”, but not want to otherwise interact with arrays via ufuncs or otherwise. This can be done by setting __array_ufunc__ = None and defining __mul__ and __rmul__ methods. (Note that this means that writing an __array_ufunc__ that always returns NotImplemented is not quite the same as setting __array_ufunc__ = None: in the former case, arr + obj will raise TypeError, while in the latter case it is possible to define a __radd__ method to prevent this.)

The above does not hold for in-place operators, for which ndarray never returns NotImplemented. Hence, arr += obj would always lead to a TypeError. This is because for arrays in-place operations cannot generically be replaced by a simple reverse operation. (For instance, by default, arr += obj would be translated to arr = arr + obj, i.e., arr would be replaced, contrary to what is expected for in-place array operations.)
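
The units example from the extract can be sketched as follows. Unit and Quantity are hypothetical classes written only to show the opt-out mechanism; they do not come from NumPy or from the intervals library.

```python
import numpy as np


class Quantity:
    """Hypothetical container pairing array values with a unit."""

    def __init__(self, values, unit):
        self.values = values
        self.unit = unit


class Unit:
    """Hypothetical unit object that is not itself array-like."""

    # Opt out of the ufunc machinery: ndarray binary operators defer to the
    # reflected methods below instead of calling np.multiply on this object.
    __array_ufunc__ = None

    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        return Quantity(np.asarray(other), self)

    __rmul__ = __mul__


m = Unit("m")
arr = np.array([1.0, 2.0, 3.0])

# ndarray.__mul__ does not handle Unit, so Python falls back to m.__rmul__(arr).
q = arr * m
```

Calling np.multiply(arr, m) directly still raises TypeError, which is the intended trade-off of this option: binary operators work, ufuncs do not. For Interval, which should work with np.sin and friends, defining a real __array_ufunc__ method looks like the better fit.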

The solution is therefore to add an __array_ufunc__ method to the Interval class and override the behaviour of the NumPy ufuncs. A list of all ufuncs is available at https://numpy.org/devdocs/reference/ufuncs.html#ufuncs.
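
As a rough illustration of this proposal, the sketch below adds __array_ufunc__ to a toy scalar interval class. ToyInterval, its helper methods, and the choice to handle only np.sin and np.add are assumptions made for the example, not the library's actual Interval implementation. Bounds are computed in plain floating point without outward rounding.

```python
import math
import numpy as np


class ToyInterval:
    """Hypothetical stand-in for Interval: a single interval [lo, hi]."""

    def __init__(self, lo, hi=None):
        self.lo = float(lo)
        self.hi = float(lo if hi is None else hi)

    def __repr__(self):
        return f"[{self.lo},{self.hi}]"

    def _add_one(self, other):
        # `other` is either another ToyInterval or a real number.
        if isinstance(other, ToyInterval):
            return ToyInterval(self.lo + other.lo, self.hi + other.hi)
        return ToyInterval(self.lo + float(other), self.hi + float(other))

    def _contains_point(self, offset):
        # True if offset + 2*k*pi lies in [lo, hi] for some integer k.
        k = math.ceil((self.lo - offset) / (2 * math.pi))
        return offset + 2 * math.pi * k <= self.hi

    def _sin(self):
        if self.hi - self.lo >= 2 * math.pi:
            return ToyInterval(-1.0, 1.0)
        ends = (math.sin(self.lo), math.sin(self.hi))
        new_lo, new_hi = min(ends), max(ends)
        if self._contains_point(math.pi / 2):   # a maximum is inside
            new_hi = 1.0
        if self._contains_point(-math.pi / 2):  # a minimum is inside
            new_lo = -1.0
        return ToyInterval(new_lo, new_hi)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls like np.sin(x) are handled; reduce, at, out=, etc.
        # are left unimplemented in this sketch.
        if method != "__call__" or kwargs:
            return NotImplemented
        if ufunc is np.sin:
            return self._sin()
        if ufunc is np.add:
            other = inputs[1] if inputs[0] is self else inputs[0]
            if isinstance(other, np.ndarray):
                # Add the interval to each array element, keeping the shape.
                out = np.empty(other.shape, dtype=object)
                for idx in np.ndindex(other.shape):
                    out[idx] = self._add_one(other[idx])
                return out
            return self._add_one(other)
        return NotImplemented

    # Route the operators through the same code path as the ufunc.
    def __add__(self, other):
        return np.add(self, other)

    def __radd__(self, other):
        return np.add(other, self)


x = ToyInterval(0, 2)
print(np.sin(x))  # [0.0,1.0]

a = ToyInterval(1, 2)
b = np.array([1, 2])
left = a + b
right = b + a     # now handled by a.__array_ufunc__, not by NumPy's object loop
```

With this in place both a + b and b + a go through the same __array_ufunc__ branch, so the operand order no longer changes the result. A real implementation would need to cover the remaining ufuncs and the other ufunc methods such as reduce.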
