Question

I'm using Python with matplotlib to create plots from data, and I'd like to save these plots to a PDF file (though I could also use a more specific format). I'm basically using these instructions:

plt.plot(data)
figname = ''.join([filename, '_', label, '.pdf'])
plt.savefig(figname)

But this creates an image of the plot at the zoom level at which it is displayed; I would like to create a copy that shows all the points (>10000) that I'm plotting, so that I can zoom to any level. What is the correct way to do that?

EDIT: Is there a format (such as '.fig' for Matlab) that directly opens the Matplotlib viewer with the data I saved? Maybe it's possible to create a .py script that saves the points and that I can call to quickly re-display them? I think this is what the Matlab .fig file does.

This is the detail of the plot exported as pdf

And this is what I would like to obtain, still being able to see the whole plot, that is bigger


Solution

I don't know of any native Matplotlib file format which includes your data; in fact, I'm not sure the Matplotlib objects even have a write function defined.

What I do instead to simulate the Matlab .fig concept is to save the processed data (as a numpy array, or pickled) and run a separate .py script to recreate the Matplotlib plots.

So in steps:

  1. Process your data and make some pretty plots until you are fully content
  2. Save/pickle your processed data as close to the plot commands as possible (you might even want to store the data going into a histogram if making the histogram takes a long time)
  3. Write a new script in which you import the data and copy/paste the plotting commands from the original script
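The steps above can be sketched roughly as follows. This is only a minimal illustration, assuming the processed data fits in a single NumPy array; the file name `processed_data.npy` and the random-walk "processing" are placeholders, not part of the original workflow.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical location for the saved data; any writable path works.
data_path = os.path.join(tempfile.gettempdir(), "processed_data.npy")

# Step 2 (original script): save the processed data right before plotting.
processed = np.cumsum(np.random.randn(10000))  # stand-in for real processing
np.save(data_path, processed)

# Step 3 (separate replot script): load the data and repeat the plot commands.
reloaded = np.load(data_path)
fig, ax = plt.subplots()
ax.plot(reloaded)
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `plt.show()` at the end, which opens the usual Matplotlib viewer with full zoom and pan on all points.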

It is a bit clumsy, but it works. If you really want, you could embed the pickled data as a string in your plotting script (see "Embed pickle (or arbitrary) data in python script"). This gives you the benefit of working with a single Python script containing both the data and the plotting code.
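One possible way to embed the data is to pickle it and base64-encode the bytes so they can be pasted into the script as an ordinary string literal. This is a sketch of that idea, not necessarily the approach from the linked question; the names `encoded` and `EMBEDDED` are made up for illustration.

```python
import base64
import pickle

import numpy as np

data = np.arange(5)  # stand-in for the processed data

# Run once: turn the pickled bytes into a text-safe string that can be
# pasted into the plotting script.
encoded = base64.b64encode(pickle.dumps(data)).decode("ascii")

# In the plotting script this would be a pasted literal: EMBEDDED = "gASV..."
EMBEDDED = encoded
restored = pickle.loads(base64.b64decode(EMBEDDED))
```

Only unpickle strings you produced yourself, since unpickling untrusted data can execute arbitrary code. For more than a few thousand points the embedded string gets long, so a separate data file is usually easier to handle.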

Edit

You can check for the existence of your stored processed data file and skip the processing steps if this file exists. So:

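import os

processed_data_file = 'processed_data.npy'  # hypothetical file name

# process_raw_data, read_data_from_file and plot stand in for your own functions
if not os.path.exists(processed_data_file):
    my_data = process_raw_data()
else:
    my_data = read_data_from_file(processed_data_file)

plot(my_data)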

In this way, you can have one script for both creating the graph in the first place, and re-plotting the graph using pre-processed data.

You might want to add a runtime argument that forces the data to be re-processed, in case you change something in the processing script and don't want to remove your processed data file manually.
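Such a runtime argument could look something like the sketch below, which uses `argparse` from the standard library. The `--force` flag, the `load_or_process` helper and the cache file name are all hypothetical names chosen for this example.

```python
import argparse
import os
import tempfile

import numpy as np


def process_raw_data():
    # Stand-in for the real (slow) processing step.
    return np.arange(10) ** 2


def load_or_process(path, force=False):
    # Re-process when forced or when no cached file exists yet.
    if force or not os.path.exists(path):
        data = process_raw_data()
        np.save(path, data)
        return data
    return np.load(path)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--force", action="store_true",
                        help="re-process the raw data even if a cache exists")
    args = parser.parse_args(argv)
    # Hypothetical cache location.
    cache = os.path.join(tempfile.gettempdir(), "processed_cache.npy")
    return load_or_process(cache, force=args.force)


my_data = main([])  # in a real script: main(), which reads sys.argv
```

Running the script with `--force` would then rebuild the cache, while a plain run reuses it.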

OTHER TIPS

Use plt.xlim and plt.ylim to set the domain and range. Set figsize to indirectly control the pixel resolution of the final image. (figsize sets the size of the figure in inches; the default dpi is 100.) You can also control the dpi in the call to plt.savefig.

With figsize = (10, 10) and dpi = 100, the image will have resolution 1000x1000.
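The arithmetic can be checked with a short sketch like this one, which asks the figure canvas for its size in pixels. It assumes the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # raster backend, no display needed
import matplotlib.pyplot as plt

# 10 in x 10 in at 100 dots per inch
fig, ax = plt.subplots(figsize=(10, 10), dpi=100)

# Canvas size in pixels = figsize (inches) * dpi
width_px, height_px = fig.canvas.get_width_height()
print(width_px, height_px)
```

Passing a different `dpi` to `plt.savefig` changes the pixel size of raster output such as PNG without altering `figsize`.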


For example,

import matplotlib.pyplot as plt
import numpy as np

x, y = np.random.random((2,10000))
plt.plot(x, y, ',')
figname = '/tmp/test.pdf'
xmin, xmax = 0, 1
ymin, ymax = 0, 1
plt.xlim(xmin, xmax)
plt.ylim(ymin, ymax)
plt.savefig(figname)

Your PDF viewer should be able to zoom into any region so that individual points can be distinguished.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow