Histogram of list of datetimes #11899

Closed
Zinjaai opened this issue Aug 20, 2018 · 8 comments · Fixed by #11921

Comments

@Zinjaai

Zinjaai commented Aug 20, 2018

Bug report

Bug summary
When creating a histogram of a list of datetimes, the input seems to be interpreted as a sequence of arrays.

Code for reproduction

from datetime import datetime
from matplotlib import pyplot as plt
plt.hist([datetime(2018,1,1),  datetime(2018, 2, 1),  datetime(2018,3, 1)])

# We get the expected result,  when we cast the list to a numpy array (but only if we specify the dtype?!)
# import numpy
# plt.hist(numpy.array([datetime(2018,1,1),  datetime(2018, 2, 1),  datetime(2018,3, 1)], dtype='datetime64[h]'))

Actual outcome

Output of n is a list of arrays, indicating that the input dates are interpreted as a sequence of arrays.

# If applicable, paste the console output here
([array([1., 0., 0., 0., 0., 0., 0., 0., 0., 0.]),
  array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]),   #  <---  expected array instead of list of arrays
  array([0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])],
 array([736695. , 736700.9, 736706.8, 736712.7, 736718.6, 736724.5,
        736730.4, 736736.3, 736742.2, 736748.1, 736754. ]),
 <a list of 3 Lists of Patches objects>)

Output of matplotlib version 2.2.0, 2.2.2 and 2.2.3

Expected outcome

# If applicable, paste the console output here
(array([1., 0., 0., 0., 0., 1., 0., 0., 0., 1.]),
 array([736695. , 736700.9, 736706.8, 736712.7, 736718.6, 736724.5,
        736730.4, 736736.3, 736742.2, 736748.1, 736754. ]),
 <a list of 10 Patch objects>)

This worked in matplotlib version 2.1.2

Matplotlib version

  • Operating system: Fedora 27
  • Matplotlib version: 2.2.3
  • Matplotlib backend (print(matplotlib.get_backend())): agg
  • Python version: 3.6.5
  • Jupyter version (if applicable):
  • Other libraries:
@jklymak
Member

jklymak commented Aug 20, 2018

Your example is not quite complete, so maybe I'm doing something wrong, but for

from matplotlib import pyplot as plt
import datetime

n, x = plt.hist([datetime.datetime(2018,1,1),  datetime.datetime(2018, 2, 1),
                 datetime.datetime(2018,3, 1)])

on 2.1.2 I get

Traceback (most recent call last):
  File "testHist.py", line 5, in <module>
    datetime.datetime(2018,3, 1)])
ValueError: too many values to unpack (expected 2)

So I don't think it worked on 2.1.2 either.

@jklymak added the "status: needs clarification" label Aug 20, 2018
@Zinjaai
Author

Zinjaai commented Aug 21, 2018

Thanks for the fast feedback!
Argh, I forgot to import datetime in the example (the issue is updated now).
The ValueError is the result of plt.hist() returning three values instead of two, namely n, bins, and patches.

Code variant with assigning output to variables:

from matplotlib import pyplot as plt
from datetime import datetime
n, x, p = plt.hist([datetime(2018,1,1),  datetime(2018, 2, 1),  datetime(2018,3, 1)])
print(n)  

Output in 2.1.2 is:
[1. 0. 0. 0. 0. 1. 0. 0. 0. 1.]

Output in 2.2.3 is:
[array([1., 0., 0., 0., 0., 0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])]

@Zinjaai
Author

Zinjaai commented Aug 23, 2018

@jklymak Have I been able to answer your questions? Greetings!

@jklymak added the "API: consistency" and "API: argument checking" labels and removed the "status: needs clarification" label Aug 23, 2018
@jklymak
Member

jklymak commented Aug 23, 2018

This bisects to c02a84d "Correctly convert units for a stacked histogram" @dstansby #9654

Obviously this works for floats, so the issue is that your data is not floats. I guess this used to work before, but whether or not it really should have is debatable. What should

plt.hist(['a', 'b', 'c'])

return? It looks like the reason datetime worked at all is that it has a toordinal method.
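
The toordinal point can be checked with the standard library alone. The sketch below is only an illustration, not matplotlib's implementation: it shows that the ordinals of the three dates from the report match the first and last bin edges in the console output (736695 and 736754), and that binning them as plain numbers gives a single flat list of counts. The helper name `bin_counts` is hypothetical, and it assumes at least two distinct values.

```python
from datetime import datetime

dates = [datetime(2018, 1, 1), datetime(2018, 2, 1), datetime(2018, 3, 1)]

# toordinal() returns the proleptic Gregorian day number; the time of day is dropped.
ordinals = [d.toordinal() for d in dates]
print(ordinals)  # [736695, 736726, 736754]


def bin_counts(values, bins=10):
    """Count values in `bins` equal-width bins (hypothetical helper).

    The last bin includes its right edge, so the maximum is counted.
    Assumes the values are not all identical (non-zero bin width).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


counts = bin_counts(ordinals)
print(counts)  # [1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
```

The flat list of counts is the shape the original report expected from plt.hist for a single dataset.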

@Zinjaai
Author

Zinjaai commented Aug 24, 2018

> I guess this used to work before, but whether or not it really should have is debatable.

Being able to create a histogram from a list of datetimes seems pretty useful to me.

from datetime import datetime
from matplotlib import pyplot as plt
plt.figure(figsize=(12, 6))
dates = [datetime(2010, 6, 15)] * 2 + \
        [datetime(2011, 6, 15)] * 3 + \
        [datetime(2012, 6, 15)] * 7 + \
        [datetime(2013, 6, 15)] * 4 + \
        [datetime(2015, 6, 15)] * 1
plt.hist(dates, bins=5, rwidth=0.8)
plt.xlabel('Year')
plt.ylabel('Number of events')

Old matplotlib version:
[image: old]

Newer version:
[image: new]

In contrast to dates, I haven't often tried to create histograms of strings... 😕

@tacaswell added this to the v3.1.0 milestone Feb 11, 2019
@haydenflinner

haydenflinner commented Feb 26, 2019

This just caused plotting a 70MB file to explode to over 50GB of memory for me. It is quite surprising that I can't plot a histogram of datetimes.

Edit: The fix:
import matplotlib.dates as mdates
mpl_data = mdates.date2num(data)

Also try histtype='step' if it's slow to render
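
For readers on an affected version, the workaround above can be sketched end to end as follows. This is a minimal example, assuming the three dates from the original report as input; the Agg backend is selected only so the script runs without a display.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

dates = [datetime(2018, 1, 1), datetime(2018, 2, 1), datetime(2018, 3, 1)]

# Convert the datetimes to Matplotlib's float date numbers before calling hist(),
# so hist() receives a plain 1-D numeric array.
mpl_data = mdates.date2num(dates)

fig, ax = plt.subplots()
n, bins, patches = ax.hist(mpl_data, bins=10, histtype="step")
ax.xaxis_date()  # label the numeric x positions as dates

print(n)  # a single flat array of counts rather than a list of arrays
```

Because the data is converted by hand, the axis no longer knows it holds dates, which is why xaxis_date() is called explicitly here.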

@jklymak
Member

jklymak commented Feb 26, 2019

This should be fixed on master...

@haydenflinner

Awesome! I'm on a pretty old version, so I'll leave the explicit conversion in place for now.
