Question

I've just joined two arrays of unequal length together with the command:

allorders = map(None,todayorders, lastyearorders)

where "none" is given where today orders fails to have a value (as the todayorders array is not as long).

However, when I try to pass the allorders array into a matplotlib bar chart:

 p10= plt.bar(ind, allorders[9],   width, color='#0000DD', bottom=allorders[8])

..I get the following error:

TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'

So, is there a way for matplotlib to accept none datatypes? if not, how do I replace the 'Nones' with zeroes in my allorders array?

If you can, as I am a Python newbie (coming over from the R community), please provide detailed code from start to finish that I can use/test.

Was it helpful?

Solution

Use a list comprehension:

allorders = [i if i[0] is not None else (0, i[1]) for i in allorders]

OTHER TIPS

With numpy:

import numpy as np
allorders = np.array(allorders)

This creates an arrray of objects due to the Nones. We can replace them with zeros:

allorders[allorders == None] = 0

Then convert the array to the proper type:

allorders.astype(int)

Since it sounds like you want this all to be in numpy, the direct answer to your question is really just an aside, and the right answer doesn't being until the "Of course…" paragraph.

If you think about it, you're using map with a None first parameter as a zip_longest, because Python doesn't have a zip_longest. But it does have one, in itertools—and it allows you to specify a custom fillvalue. So, you can do this all in one step with izip_longest:

>>> import itertools
>>> todayorders = [1, 2]
>>> lastyearorders = [1, 2, 3]
>>> allorders = itertools.izip_longest(todayorders, lastyearorders, fillvalue=0)
>>> list(allorders)
[(1, 1), (2, 2), (0, 3)]

This only fills in 0 for the Nones that show up as extra values for the shorter list; if you want to replace every None with a 0, you have to do it Martijn Pieters's way. But I think this is what you want.

Also, note that list(allorders) at the end: izip_longest, like most things in itertools, returns an iterator, not a list. Or, in terms you might be more familiar with, it returns a "lazy" sequence rather than a "strict" one. If you're just going to iterate over the result, that's actually better, but if you need to use it with some function that requires a list (like printing it out in human-readable form—or accessing allorders[9], as in your example), you need to explicitly convert it first.

If you actually want a numpy.array rather than a list, you can get there directly, without going through a list first. (If all you're ever going to do with it is matplotlib it, you probably do want an array.) The clearest way is to just use np.fromiter(allorders) instead of list(allorders). You might want to pass an explicit dtype=int (or whatever's appropriate). And, if you know the size (which you do—it's max(len(todayorders), len(lastyearorders))), in some cases it's faster or simpler to pass an explicit count as well.

Of course if any of the numpy stuff sounds appealing, you probably should stay within numpy in the first place, instead of using map or izip_longest:

>>> todayorders.resize(lastyearorders.shape)
>>> allorders = np.vstack(todayorders, lastyearorders).transpose()

Unfortunately, that mutates todayorders, and as far as I know, the equivalent immutable function numpy.resize doesn't give you any way to "zero-extend", but instead repeats the values. Hopefully I'm wrong and someone will suggest the easy way, but otherwise, you have to do it explicitly:

>>> extrazeros = np.zeros(len(lastyearorders) - len(todayorders), dtype=int)
>>> allorders = np.vstack(np.concatenate((todayorders, extrazeros)), lastyearorders)
>>> allorders = allorders.transpose()
array([[ 1,  1],
       [ 2,  2],
       [ 0,  3]])

Of course if you do a lot of that, I'd write a zeroextend function that takes a pair of arrays and extends one to match the other (or, if you're not just dealing with 1D, extends the shorter one on each axis to make the other).

At any rate, aside from being faster and using less temporary memory than using map, izip_longest, etc., this also means that you end up with a final array with the right dtype (int rather than object)—which means your result also uses less long-term memory, and everything you do from then on will also be faster and use less temporary memory.

For completeness: It is possible to have pyplot handle None values, but I don't think it's what you want. For example, you can pass it a Transform object whose transform method converts None to 0. But this will be effectively the same as Martijn Pieters's answer but much more verbose, and there's no advantage at all unless you need to plot tons of such arrays.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top