Translation/rotation through phase correlation in Python

https://stackoverflow.com/questions/19811064

04-07-2022

Question

I have two pictures: the original, and a modified copy that has been translated up and to the left a bit and then rotated 90 degrees (so the shape of the picture is transposed as well).

Now I'd like to determine how many pixels (or any distance unit) the modified picture is translated from the original, as well as the degrees of rotation relative to the original. Phase correlation is supposed to solve this problem by first converting the images to log-polar coordinates, then doing a number of things so that in the end you get a correlation matrix. From that matrix I'm supposed to find the peak, and the (x, y) position of that peak will somehow reveal the translation and rotation. This link explains it much better: Phase correlation
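For illustration only, the peak-finding idea described above can be sketched for the pure-translation case (no log-polar step, so no rotation is recovered). The function name `phase_correlate` and the synthetic random image are assumptions made for this sketch; it uses the common normalized cross-power spectrum formulation with 2D FFTs and is not taken from the linked explanation.

```python
import numpy as np

def phase_correlate(reference, moved):
    """Estimate the integer (row, col) shift that maps `reference` onto `moved`.

    Hypothetical helper: translation only, assumes both arrays are 2D
    and have the same shape.
    """
    F_ref = np.fft.fft2(reference)
    F_mov = np.fft.fft2(moved)
    # Normalized cross-power spectrum: discard magnitude, keep phase difference.
    cross_power = F_mov * np.conj(F_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    correlation = np.fft.ifft2(cross_power).real
    # The peak position equals the shift, modulo the image size.
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint of an axis correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, correlation.shape))

# Synthetic test: shift a random image 5 px up and 3 px left (circularly).
rng = np.random.default_rng(0)
reference = rng.random((64, 48))
moved = np.roll(reference, shift=(-5, -3), axis=(0, 1))

estimated_shift = phase_correlate(reference, moved)
print(estimated_shift)  # (-5, -3)
```

The shift here is circular, which is the ideal case for the FFT. Real photographs with borders usually need windowing or padding before the peak is this clean.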

This is the code I have:

import scipy as sp
from scipy import ndimage
from PIL import Image
from math import *
import numpy as np

def logpolar(input,silent=False):
    # This takes a numpy array and returns it in Log-Polar coordinates.

    if not silent: print("Creating log-polar coordinates...")
    # Create a cartesian array which will be used to compute log-polar coordinates.
    coordinates = sp.mgrid[0:max(input.shape)*2,0:360]
    # Compute a normalized logarithmic gradient
    log_r = 10**(coordinates[0,:]/(input.shape[0]*2.)*log10(input.shape[1]))
    # Create a linear gradient going from 0 to 2*Pi
    angle = 2.*pi*(coordinates[1,:]/360.)

    # Using scipy's map_coordinates(), we map the input array on the log-polar 
    # coordinate. Do not forget to center the coordinates!
    if not silent: print("Interpolation...")
    lpinput = ndimage.interpolation.map_coordinates(input,
                                            (log_r*sp.cos(angle)+input.shape[0]/2.,
                                             log_r*sp.sin(angle)+input.shape[1]/2.),
                                            order=3,mode='constant')

    # Returning the log-polar image...
    return lpinput

def load_image( infilename ) :
    img = Image.open( infilename )
    img.load()
    data = np.asarray( img, dtype="int32" )
    return data

def save_image( npdata, outfilename ) :
    img = Image.fromarray( np.asarray( np.clip(npdata,0,255), dtype="uint8"), "L" )
    img.save( outfilename )

image = load_image("C:/images/testing_image1.jpg")
target = load_image("C:/images/testing_otherimage.jpg")

# Conversion to log-polar coordinates
lpimage = logpolar(image)
lptarget = logpolar(target)

# Correlation through FFTs
Fcorr = np.fft.fft(lpimage)*np.fft.fft(lptarget)
correlation = np.fft.ifft(Fcorr)

The problem I have now is that this code produces the following output:

Traceback (most recent call last):
  File "./phase.py", line 44, in <module>
    lpimage = logpolar(image)
  File "./phase.py", line 24, in logpolar
    order=3,mode='constant')
  File "C:\Python27\lib\site-packages\scipy\ndimage\interpolation.py", line 295, in map_coordinates
    raise RuntimeError('invalid shape for coordinate array')
RuntimeError: invalid shape for coordinate array

As I have only a very superficial understanding of what exactly happens in the whole phase correlation process, I'm unclear on what the problem is. To check whether something is wrong with the input, I added save_image(image,"C:/testing.jpg") right after loading the image, to test the numpy array built from my images. And sure enough, the images I convert to numpy arrays cannot be converted back to images. This is the error I get:

  Traceback (most recent call last):
  File "./phase.py", line 41, in <module>
    save_image(image,"C:/testing.jpg")
  File "./phase.py", line 36, in save_image
    img = Image.fromarray( np.asarray( np.clip(npdata,0,255), dtype="uint8"), "L" )
  File "C:\Python27\lib\site-packages\PIL\Image.py", line 1917, in fromarray
    raise ValueError("Too many dimensions.")
ValueError: Too many dimensions.

Taking a peek at the original documentation didn't give me much inspiration on what the problem could be. I don't think the code that converts images to numpy arrays is wrong, as I've checked the type with print type(image) and the result looked legit. Yet I can't convert the array back to an image. Any help I can get would be greatly appreciated.

Solution

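I think the problem is that you are passing a 3D image array (rows × columns × channels, e.g. R, G, B and possibly A) into your function, whereas it only handles 2D arrays. map_coordinates() expects one coordinate array per input dimension, so supplying two coordinate arrays for a 3D input raises "invalid shape for coordinate array". For the same reason, Image.fromarray() in "L" mode rejects the 3D array with "Too many dimensions". Try using a single channel to determine the transformation, e.g.: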

image = load_image("/path/to/image")[:,:,0]
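As an alternative to picking one channel, the image could be converted to grayscale before it becomes an array. The sketch below is a minimal illustration under stated assumptions: `to_grayscale_array` is a hypothetical helper, and a synthetic RGB image stands in for a file opened with `Image.open()`. It also shows that `map_coordinates()` accepts two coordinate arrays for the 2D array and raises the same `RuntimeError` for the 3D one.

```python
import numpy as np
from PIL import Image
from scipy import ndimage

def to_grayscale_array(img):
    """Hypothetical helper: convert a PIL image to a 2D float array."""
    return np.asarray(img.convert("L"), dtype=float)

# Synthetic 40x30 RGB image standing in for Image.open("/path/to/image").
pixels = np.random.default_rng(0).integers(0, 256, (40, 30, 3), dtype=np.uint8)
rgb = Image.fromarray(pixels)

rgb_array = np.asarray(rgb)          # 3D: rows x columns x channels
gray_array = to_grayscale_array(rgb)  # 2D: rows x columns
print(rgb_array.shape, gray_array.shape)

# Two coordinate arrays (rows, cols), one per axis of a 2D input,
# which is what logpolar() supplies.
rows, cols = np.mgrid[0:10, 0:10].astype(float)

# Works: number of coordinate arrays matches the input's dimensions.
sampled = ndimage.map_coordinates(gray_array, (rows, cols),
                                  order=1, mode="constant")

# Fails: a 3D input would need three coordinate arrays.
try:
    ndimage.map_coordinates(rgb_array, (rows, cols), order=1)
    rgb_failed = False
except RuntimeError as err:
    rgb_failed = True
    print("3D input rejected:", err)
```

Compared with slicing out `[:,:,0]`, `convert("L")` blends all colour channels into one luminance value, which may matter if the red channel alone carries little contrast.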

Licensed under: CC-BY-SA with attribution
