Coefficient Correlation Over a Large Binary Image Data-Set - Slow Performance

https://stackoverflow.com/questions/23647687

22-07-2023
|

문제

I am trying to build an OCR by calculating the Coefficient Correlation between characters extracted from an image with every character I have pre-stored in a database. My implementation is based on Java and pre-stored characters are loaded into an ArrayList upon the beginning of the application, i.e.

ArrayList<byte []> storedCharacters, extractedCharacters;
storedCharacters = load_all_characters_from_database();
extractedCharacters = extract_characters_from_image();

// Calculate the coefficent between every extracted character
// and every character in database.
double maxCorr = -1;
for(byte [] extractedCharacter : extractedCharacters)
  for(byte [] storedCharacter : storedCharactes)
  {
     corr = findCorrelation(extractedCharacter, storedCharacter)
     if (corr > maxCorr)
       maxCorr = corr;
  }
...
...
public double findCorrelation(byte [] extractedCharacter, byte [] storedCharacter)
{
  double mag1, mag2, corr = 0;
  for(int i=0; i < extractedCharacter.length; i++)
  {
     mag1 += extractedCharacter[i] * extractedCharacter[i];
     mag2 += storedCharacter[i] * storedCharacter[i];
     corr += extractedCharacter[i] * storedCharacter[i];
  } // for
  corr /= Math.sqrt(mag1*mag2);
  return corr;
}

The number of extractedCharacters are around 100-150 per image but the database has 15600 stored binary characters. Checking the coefficient correlation between every extracted character and every stored character has an impact on the performance as it needs around 15-20 seconds to complete for every image, with an Intel i5 CPU. Is there a way to improve the speed of this program, or suggesting another path of building this bringing similar results. (The results produced by comparing every character with such a large dataset is quite good).

Thank you in advance

UPDATE 1

public static void run() {
    ArrayList<byte []> storedCharacters, extractedCharacters;
    storedCharacters = load_all_characters_from_database();
    extractedCharacters = extract_characters_from_image();
    
    // Calculate the coefficent between every extracted character
    // and every character in database.
    computeNorms(charComps, extractedCharacters);       
    double maxCorr = -1;
    for(byte [] extractedCharacter : extractedCharacters)
      for(byte [] storedCharacter : storedCharactes)
      {
         corr = findCorrelation(extractedCharacter, storedCharacter)
         if (corr > maxCorr)
           maxCorr = corr;
      }
    }
}
private static double[] storedNorms;
private static double[] extractedNorms;
       
// Correlation  between to binary images
public static double findCorrelation(byte[] arr1, byte[] arr2, int strCharIndex, int extCharNo){
         final int dotProduct = dotProduct(arr1, arr2);
         final double corr = dotProduct * storedNorms[strCharIndex] * extractedNorms[extCharNo];
         return corr;
}
    
public static void computeNorms(ArrayList<byte[]> storedCharacters, ArrayList<byte[]> extractedCharacters) {
          storedNorms = computeInvNorms(storedCharacters);
          extractedNorms = computeInvNorms(extractedCharacters);
}
    
private static double[] computeInvNorms(List<byte []> a) {
         final double[] result = new double[a.size()];
         
         for (int i=0; i < result.length; ++i) 
            result[i] = 1 / Math.sqrt(dotProduct(a.get(i), a.get(i)));
         return result;
}
      
private static int dotProduct(byte[] arr1, byte[] arr2) {
         int dotProduct = 0; 
         for(int i = 0; i< arr1.length; i++)
            dotProduct += arr1[i] * arr2[i];
          
         return dotProduct;
}

해결책

Nowadays, it's hard to find a CPU with a single core (even in mobiles). As the tasks are nicely separated, you can do it with a few lines only. So I'd go for it, though the gain is limited.

In case you really mean cross-correlation, then a transform like DFT or DCT could help. They surely do for big images, but with yours 12x16, I'm not sure.

Maybe you mean just a dot product? And maybe you should tell us?

Note that you actually don't need to compute the correlation, most of the time you only need is find out if it's bigger than a threshold:

corr = findCorrelation(extractedCharacter, storedCharacter)
..... more code to check if this is the best match ......

This may lead to some optimizations or not, depending on how the images look like.

Note also that a simple low level optimization can give you nearly a factor of 4 as in this question of mine. Maybe you really should tell us what you're doing?

UPDATE 1

I guess that due to the computation of three products in the loop, there's enough instruction level parallelism, so a manual loop unrolling like in my above question is not necessary.

However, I see that those three products get computed some 100 * 15600 times, while only one of them depends on both extractedCharacter and storedCharacter. So you can compute

100 + 15600 + 100 * 15600

dot products instead of

 3 * 100 * 15600

This way you may get a factor of three pretty easily.

Or not. After this step there's a single sum computed in the relevant step and the problem linked above applies. And so does its solution (unrolling manually).

Factor 5.2

results

While byte[] is nicely compact, the computation involves extending them to ints, which costs some time as my benchmark shows. Converting the byte[]s to int[]s before all the correlations gets computed saves time. Even better is to make use of the fact that this conversion for storedCharacters can be done beforehand.

Manual loop unrolling twice helps but unrolling more doesn't.

라이센스 : CC-BY-SA ~와 함께 속성

제휴하지 않습니다 StackOverflow