BUG: numpy.random.Generator.dirichlet should accept zeros. #22547

Closed
WayneHajas opened this issue Nov 7, 2022 · 6 comments

@WayneHajas

Describe the issue:

numpy.random.mtrand.RandomState.dirichlet no longer accepts alpha(count)-values that are zero.

With older (e.g. 1.11.3) versions of numpy, dirichlet accepted zero as an alpha(count) value.

>>> numpy.__version__
'1.11.3'
>>> dirichlet([5,9,0,8])
array([ 0.17970351, 0.35902845, 0. , 0.46126803])
>>> dirichlet([5,9,0,8])
array([ 0.15228294, 0.45822224, 0. , 0.38949482])

With newer (e.g. 1.21.5) versions of numpy, alpha(count) values must be greater than zero. Very small real-values are accepted.

>>> numpy.__version__
'1.21.5'
>>> dirichlet([5,9,0.000001,8])
array([0.38285451, 0.26206592, 0. , 0.35507958])
>>> dirichlet([5,9,0,8])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 4390, in numpy.random.mtrand.RandomState.dirichlet
ValueError: alpha <= 0

I have some applications where alpha(count) values are raw data and zero is a perfectly valid value. These applications worked with old versions of numpy but not with newer versions.

Reproduce the code example:

from numpy.random import dirichlet
dirichlet([5,9,0,8])

Error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 4390, in numpy.random.mtrand.RandomState.dirichlet
ValueError: alpha <= 0

NumPy/Python version information:

1.21.5

Context for the issue:

I have some applications where alpha(count) values are raw data and zero is a perfectly valid value. These applications worked with old versions of numpy but not with newer versions.

@WayneHajas changed the title from "BUG: <Please write a comprehensive title after the 'BUG: ' prefix>" to "BUG: numpy.random.Generator.dirichlet should accept zeros." on Nov 7, 2022
@rkern
Member

rkern commented Nov 7, 2022

This was implemented in #9577. That PR description claims that it hangs when the values are 0, but I don't see how that can be the case, as standard_gamma() has had special cases for shape == 0. It did appear to hang when the values were very, very low but nonzero, because of the unprotected loop in that case. I suspect the main reason alpha[i] == 0 was excluded is that the Wikipedia entry claims that alpha[i] > 0 is required, but that is often dodgy.

I think we could make Generator.dirichlet() accept alpha[i] == 0 (though not RandomState, per NEP 19).
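To illustrate the standard_gamma() point above, here is a minimal sketch of the textbook gamma-normalisation construction of a Dirichlet draw. The helper name dirichlet_via_gamma is hypothetical, and this is not claimed to be NumPy's internal code path; it only shows that a zero shape parameter produces a zero component rather than a hang.

```python
import numpy as np

def dirichlet_via_gamma(alpha, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws.

    Hypothetical helper for illustration only.
    """
    alpha = np.asarray(alpha, dtype=float)
    if np.any(alpha < 0) or not np.any(alpha > 0):
        raise ValueError("alpha must be non-negative with at least one positive entry")
    # standard_gamma() special-cases shape == 0 and returns 0.0 there,
    # so components with alpha[i] == 0 come out as exactly zero.
    g = rng.standard_gamma(alpha)
    return g / g.sum()

rng = np.random.default_rng(12345)
sample = dirichlet_via_gamma([5, 9, 0, 8], rng)
print(sample)  # third component is 0.0, the rest sum to 1
```

Because this only relies on standard_gamma(), it should also run on NumPy versions whose dirichlet() still rejects zeros.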

@MatteoRaso
Contributor

I think it might actually be important for alpha[i] to be greater than 0. The PDF of the distribution is inversely proportional to B(alpha), which is the product of all gamma(alpha[i]) divided by the gamma of the sum of alpha. If alpha[i] is 0, then gamma(alpha[i]) is inf, which breaks everything.
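For reference, the density being described can be written out as follows (LaTeX notation; the symbol K for the number of components is introduced here only for this note):

```latex
f(x_1, \dots, x_K; \alpha) = \frac{1}{B(\alpha)} \prod_{i=1}^{K} x_i^{\alpha_i - 1},
\qquad
B(\alpha) = \frac{\prod_{i=1}^{K} \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^{K} \alpha_i\right)}
```

Since \Gamma(\alpha_i) diverges as \alpha_i approaches 0 from above, B(\alpha) diverges too, which is the breakdown of the PDF referred to here.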

@rkern
Member

rkern commented Nov 17, 2022

Sometimes those kinds of divergences in the PDF don't really affect our ability to draw random numbers. I think this is just such a case. When alpha[i] = 0, then we're really just drawing from a Dirichlet distribution of one dimension lower, with i removed. Then we shove a 0 back in its place when we're done.
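The procedure just described can be sketched in a few lines. This is only an illustration of the idea, assuming a 1-D alpha; the function name dirichlet_allowing_zeros is hypothetical and is not part of NumPy.

```python
import numpy as np

def dirichlet_allowing_zeros(alpha, rng, size=None):
    """Dirichlet draw that tolerates alpha[i] == 0 (hypothetical helper)."""
    alpha = np.asarray(alpha, dtype=float)
    if alpha.ndim != 1 or np.any(alpha < 0):
        raise ValueError("alpha must be 1-D and non-negative")
    positive = alpha > 0
    if not positive.any():
        raise ValueError("at least one alpha[i] must be positive")
    # Draw from the lower-dimensional Dirichlet over the positive entries only.
    reduced = rng.dirichlet(alpha[positive], size=size)
    # Shove zeros back into the positions that were removed.
    out = np.zeros(reduced.shape[:-1] + (alpha.size,))
    out[..., positive] = reduced
    return out

rng = np.random.default_rng(0)
samples = dirichlet_allowing_zeros([5, 9, 0, 8], rng, size=8)
print(samples)  # column 2 is all zeros; each row sums to 1
```

Since only strictly positive values are ever passed to rng.dirichlet(), this workaround does not depend on the NumPy version accepting zeros.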

This is analogous to the case of a multivariate normal with a singular covariance matrix. The PDF is notionally infinite on the ridge. You can transform to the lower-dimensional nonsingular space, draw the multivariate normal there, then transform back to the full space.
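As a rough sketch of the multivariate normal analogy, assume a made-up rank-2 covariance in three dimensions, built as cov = A @ A.T. The matrix A and the mean below are hypothetical example values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2022)

# Hypothetical example: the third coordinate is the sum of the first two,
# so the 3x3 covariance A @ A.T has rank 2 and is singular.
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 2.0]])
mean = np.array([0.5, -1.0, 3.0])
cov = A @ A.T

# Draw in the 2-dimensional nonsingular space...
z = rng.standard_normal(size=(1000, 2))
# ...then transform back to the full 3-dimensional space.
x = mean + z @ A.T

print(np.linalg.matrix_rank(cov))  # rank of the singular covariance
print(x[:3])
```

Every draw lies on the plane where the centred third coordinate equals the sum of the first two, even though no density exists in the full space.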

Both of these are coherent procedures that have practical uses.

@WayneHajas
Author

WayneHajas commented Nov 21, 2022 via email

@pcralmeida
Contributor

Hello! I would like to work on this issue. From my understanding, accepting alpha[i] == 0 would suffice, though I'm open to suggestions.

@WarrenWeckesser
Member

The dirichlet method of the Generator class now allows elements of alpha to be zero (see #23440 and the follow-up #24220):

In [4]: np.__version__
Out[4]: '2.0.0.dev0+git20230813.104addf'

In [5]: rng = np.random.default_rng()

In [6]: rng.dirichlet([5, 9, 0, 8])
Out[6]: array([0.22294196, 0.51402094, 0.        , 0.26303709])

In [7]: rng.dirichlet([5, 9, 0, 8], size=8)
Out[7]: 
array([[0.31371971, 0.31066775, 0.        , 0.37561255],
       [0.25549883, 0.52689855, 0.        , 0.21760262],
       [0.15567579, 0.40443253, 0.        , 0.43989168],
       [0.18513736, 0.55825023, 0.        , 0.25661241],
       [0.25517287, 0.40680073, 0.        , 0.3380264 ],
       [0.29160739, 0.43306643, 0.        , 0.27532618],
       [0.23052236, 0.3841242 , 0.        , 0.38535344],
       [0.18530714, 0.49334535, 0.        , 0.32134751]])

Per NEP 19, the RandomState.dirichlet (aka np.random.dirichlet) won't be updated, so I'm closing this issue.
