Question

I'm trying to plot topic visualization from a model. I want to do something like bokeh covariance implementation.

My data is:

data 1: index,                            topics.   
data 2: index, topics, weights(use it for color). 

where topic is just set of words.

How do i give the data to bokeh to plot the above data? From the example, data handling is not intuitive.

With matplot, it looks like this.
Obviously, it is not visually helpful to see what topic correspond to each circle. Here is my matplotlib code:

x = []
y = []
area = []

for row in joined:
      x.append(row['index']) 
      y.append(row['index'])
      #weight.append(row['score'])
      area.append(np.pi * (15 * row['score'])**2)
scale_values = 1000
plt.scatter(x, y, s=scale_values*np.array(area), alpha=0.5)
plt.show()

Any idea/suggestions?

Was it helpful?

Solution

UPDATE: The answer below is still correct in all major points, but the API has changed slightly to be more explicit as of Bokeh 0.7. In general, things like:

rect(...)

should be replaced with

p = figure(...)
p.rect(...)

Here are the relevant lines from the Les Mis examples, simplified to your case. Let's take a look:

# A "ColumnDataSource" is like a dict, it maps names to columns of data.
# These names are not special we can call the columns whatever we like.
source = ColumnDataSource(
    data=dict(
        x = [row['name'] for row in joined],
        y = [row['name'] for row in joined],
        color = list_of_colors_one_for_each_row, 
    )
)

# We need a list of the categorical coordinates
names = list(set(row['name'] for row in joined))

# rect takes center coords (x,y) and width and height. We will draw 
# one rectangle for each row.
rect('x', 'y',        # use the 'x' and 'y' fields from the data source
     0.9, 0.9,        # use 0.9 for both width and height of each rectangle 
     color = 'color', # use the 'color' field to set the color
     source = source, # use the data source we created above
     x_range = names, # sequence of categorical coords for x-axis
     y_range = names, # sequence of categorical coords for y-axis
)

A few notes:

  • For numeric data x_range and y_range usually get supplied automatically. We have to give them explicitly here because we are using categorial coordinates.

  • You can order the list of names for x_range and y_range however you like, this is the order they are displayed on the plot axis.

  • I'm assuming you want to use categorical coordinates. :) This is what the Les Mes example does. See the bottom of this answer if you want numerical coordinates.

Also, the Les Mis example was a little more complicated (it had a hover tool) which is why we created a ColumnDataSource by hand. If you just need a simple plot you can probably skip creating a data source yourself, and just pass the data in to rect directly:

names = list(set(row['name'] for row in joined))

rect(names,    # x (categorical) coordinate for each rectangle
     names,    # y (categorical) coordinate for each rectangle
     0.9, 0.9, # use 0.9 for both width and height of each rectangle
     color = some_colors, # color for each rect
     x_range = names, # sequence of categorical coords for x-axis
     y_range = names, # sequence of categorical coords for y-axis
)

Another note: this only plots rectangles on the diagonal, where the x- and y-coordinates are the same. That seems to be what you want from your description. But just for completeness, it's possible to plot rectangles that have different x- and y-coordinates. The Les Mis example does this.

Finally, maybe you don't actually want categorical axes? If you just want to use the numeric index of the coordinates, its even simpler:

inds = [row['index'] for row in joined]

rect(inds,    # x-coordinate for each rectangle
     inds,    # y-coordinate for each rectangle
     0.9, 0.9, # use 0.9 for both width and height of each rectangle
     color = some_colors, # color for each rect
)

Edit: Here is a complete runnable example that uses numeric coords:

from bokeh.plotting import * 

output_file("foo.html")

inds = [2, 5, 6, 8, 9]
colors = ["red", "orange", "blue", "green", "#4488aa"]

rect(inds, inds, 1.0, 1.0, color=colors)

show()

and here is one that uses the same values as categorical coords:

from bokeh.plotting import * 

output_file("foo.html")

inds = [str(x) for x in [2, 5, 6, 8, 9]]
colors = ["red", "orange", "blue", "green", "#4488aa"]

rect(inds, inds, 1.0, 1.0, color=colors, x_range=inds, y_range=inds)

show()
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top