Nx3 column data to 2d matrix for image processing

https://stackoverflow.com/questions/10403341

04-06-2021

Question

I am trying to find local maxima and contours in Nx3 data in the format ('x', 'y', 'value') that I read from a text file. 'x' and 'y' form an evenly spaced grid, and there is a single value for every combination of 'x' and 'y'. It looks like this:

  3.0, -0.4, 56.94369888305664        
  3.0, -0.3, 56.97200012207031        
  3.0, -0.2, 56.77149963378906        
  3.0, -0.1, 56.41230010986328        
  3.0,  0,   55.8302001953125       
  3.0,  0.1, 55.81560134887695        
  3.0,  0.2, 55.600399017333984        
  3.0,  0.3, 55.51969909667969        
  3.0,  0.4, 55.18550109863281         
  3.2, -0.4, 56.26380157470703 
  3.2, -0.3, 56.228599548339844
  ...

The problem is that the image-processing code I am trying to use (built around measure.find_contours) requires the data to be in a different format, a 2D matrix. This is the relevant part of the code:

import numpy as np
from skimage import measure

# Construct some test data
x, y = np.ogrid[-np.pi:np.pi:100j, -np.pi:np.pi:100j]
r = np.sin(np.exp((np.sin(x)**3 + np.cos(y)**2)))

# Find contours at a constant value of 0.8
contours = measure.find_contours(r, 0.8)

Can somebody help me transform my data into the required 'gridded' format?

EDIT: I finally went with pandas, but I find the chosen answer better in the general case. This is what I did:

from pandas import read_csv
data = read_csv(filename, names=['x', 'y', 'values']).pivot(
    index='x', columns='y', values='values')

After this, data.values holds the table in the 2D 'image' form I wanted:

y   -0.4        -0.3        -0.2        -0.1
x               
3.0  86.9423     87.6398     87.5256     89.5779
3.2  76.9414     77.7743     78.8633     76.8955
3.4  71.4146     72.8257     71.7210     71.5232

Solution

The best solution really depends on details you're not giving. By the way, you should really show your code, or at least the np.loadtxt call. In the following, "data" is the array loaded from the file using:

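import numpy as np

# the sample file is comma-separated, so a delimiter is needed
data = np.loadtxt('file.txt', dtype=[('x', float), ('y', float), ('value', float)], delimiter=',')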

1) Direct reshape:

Following on from what @tom10 said: if you know that your (x, y, value) data is stored in this specific order:

[(x0,y0,v00), (x0,y1,v01), .... , (x1,y0,v10),(x1,y1,v11), ... ,(xN,yM,vNM)]

and that a value is given for every (x, y) pair, then the best approach is to take the 1D numpy array of values and reshape it:

x = np.unique(data['x'])
y = np.unique(data['y'])
r = data['value'].reshape((x.size,y.size))
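
If the rows are complete but you are not sure they are in the order above, one option (an addition, not part of the original answer) is to sort them first and then reshape. The following is a minimal sketch on a small made-up sample; the values and the in-memory "file" are hypothetical:

```python
import io
import numpy as np

# Hypothetical sample in the question's "x, y, value" layout, rows scrambled.
sample = io.StringIO(
    "3.2,  0.0, 5.0\n"
    "3.0, -0.1, 1.0\n"
    "3.2, -0.1, 4.0\n"
    "3.0,  0.1, 3.0\n"
    "3.2,  0.1, 6.0\n"
    "3.0,  0.0, 2.0\n"
)
data = np.loadtxt(sample,
                  dtype=[('x', float), ('y', float), ('value', float)],
                  delimiter=',')

x = np.unique(data['x'])
y = np.unique(data['y'])

# The reshape is only valid if there is exactly one row per (x, y) pair.
assert data.size == x.size * y.size

# Sort by x first, then by y (lexsort uses the LAST key as the primary key).
order = np.lexsort((data['y'], data['x']))
data = data[order]

# Rows of r follow x, columns follow y.
r = data['value'].reshape((x.size, y.size))
```

The size check only catches a wrong row count; it does not detect duplicated pairs that happen to compensate for missing ones.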

2) General cases:

See "Populate arrays in python (numpy)?" for a similar question and another solution using dictionaries.

If you cannot guarantee anything beyond having (x, y, value) tuples:

# indexing: lists of x and y coordinates, and functions that map them to indices
x  = np.unique(data['x']).tolist()
y  = np.unique(data['y']).tolist()
ix = np.vectorize(lambda i: x.index(i), otypes='i')
iy = np.vectorize(lambda j: y.index(j), otypes='i')

# create output array (x and y are lists here, so use len() rather than .size)
r  = np.zeros((len(x), len(y)), float)   # default value is 0
r[ix(data['x']), iy(data['y'])] = data['value']
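
As an alternative to the list.index lookup (a different technique from the one above, not from the original answer), np.unique can return the index of each row's coordinate directly through its return_inverse argument. A minimal sketch, with made-up arrays standing in for data['x'], data['y'] and data['value']:

```python
import numpy as np

# Hypothetical (x, y, value) columns in arbitrary order; the pair (3.2, 0.0) is missing.
xs = np.array([3.2, 3.0, 3.0, 3.2, 3.0])
ys = np.array([0.1, -0.1, 0.1, -0.1, 0.0])
vals = np.array([6.0, 1.0, 3.0, 4.0, 2.0])

# x holds the sorted unique coordinates; ix[k] is the position of xs[k] in x.
x, ix = np.unique(xs, return_inverse=True)
y, iy = np.unique(ys, return_inverse=True)

# Fill with NaN so that missing (x, y) pairs stay visible in the output.
r = np.full((x.size, y.size), np.nan)
r[ix, iy] = vals
```

This avoids the Python-level loop hidden inside np.vectorize, which should matter for large inputs, although I did not time it here.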

Note: the reference given above describes another approach using dictionaries. I think it is more readable, but I did not test the relative speed of the two.

3) Intermediate cases?

You might have an intermediate case, somewhere between regular grid coordinates given in a specific order and no constraint at all. Since the general case can be very slow, you should design your algorithm to take advantage of any rule your data follows.

One example is when you know that the x-y indexing follows a specific rule, but the rows are not necessarily given in order. For instance, if you know that x and y are equally spaced "grid" coordinates of the form:

coordinate = min_coordinate + i*step

Then find min_coordinate and step (for both x and y), and find i by solving this equation. This way, you avoid the costly index mapping np.vectorize(... list.index(...)):

x  = np.unique(data['x'])
y  = np.unique(data['y'])
# round before casting: the division can land just below an integer
ix = np.rint((data['x'] - x.min()) / (x[1] - x[0])).astype(int)
iy = np.rint((data['y'] - y.min()) / (y[1] - y[0])).astype(int)

# create output array
r  = np.full((x.size, y.size), np.nan)   # default value is NaN
r[ix, iy] = data['value']
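
One caveat when solving for i by division: the result is a float and may fall just below the intended integer, in which case a plain cast to int truncates to the wrong index. A small sketch of the effect, using a made-up coordinate column with step 0.1:

```python
import numpy as np

# Hypothetical evenly spaced coordinates, as they would come out of np.unique.
coords = np.array([0.0, 0.1, 0.2, 0.3])
step = coords[1] - coords[0]

# Solve coordinate = min_coordinate + i*step for i.
raw = (coords - coords.min()) / step

# In double precision 0.3 / 0.1 evaluates slightly below 3,
# so truncation maps the last coordinate to index 2 instead of 3.
truncated = raw.astype(int)

# Rounding to the nearest integer first gives the intended indices.
rounded = np.rint(raw).astype(int)
```

Rounding is only safe when the coordinates really lie on the grid; a coordinate halfway between two grid points would be silently snapped to one of them.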

OTHER TIPS

For the program you're using, you just need the data to be a rectangular array of z values (in the example they give, x and y are only used to construct z and are never used again). It looks like you have an array that's 9 by N (where N is something you don't show). One easy way to get this is to read the data in as a flat collection of z values, skipping the x and y values, and then reshape it to the shape you'd like. (I can't really write the code for this because you haven't given enough info, but it shouldn't be difficult.)
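
A rough sketch of that idea, under the assumption that the file is comma-separated, complete, and ordered with y varying fastest (as in the sample shown in the question). The in-memory sample and the variable n_y are hypothetical; for the question's data n_y would be 9:

```python
import io
import numpy as np

# Hypothetical stand-in for the text file: 2 x-values, 3 y-values each.
sample = io.StringIO(
    "3.0, -0.1, 1.0\n"
    "3.0,  0.0, 2.0\n"
    "3.0,  0.1, 3.0\n"
    "3.2, -0.1, 4.0\n"
    "3.2,  0.0, 5.0\n"
    "3.2,  0.1, 6.0\n"
)

# Read only the third column (the z values), skipping x and y.
z = np.loadtxt(sample, delimiter=',', usecols=(2,))

# Number of y-values per x; -1 lets NumPy infer the number of x-values.
n_y = 3
r = z.reshape(-1, n_y)
```

The reshape raises a ValueError if the number of rows is not a multiple of n_y, which is a cheap sanity check on the assumption that the grid is complete.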

Licensed under: CC-BY-SA with attribution
