Question

which is the best way to write a bidimensional hashmap efficiently in Java? Just to give an example of what I'm talking about: I'm developing some algorithms related to collective intelligence, these algorithms works by calculating correlation between pairs of elements..

Without caching these values, since they are calculated on same pairs multiple times, performance are horrible.. (algorithms can be O(n^2) but maybe O(n^3) so I was thinking about using an HashMap to store values to be used multiple times.

Which is the most efficient way to implement such a data structure in Java? It should be possble to cache and remove a value generated by a pair of elements with O(1), but using an explicit class seems too heavy anyway.

If Java will turn out to be not enough I'll have to switch to C/C++, so any idea related to these languages are welcome too.

Thanks

Was it helpful?

Solution 2

I partially solved the problem by concatenating hashcodes of both items using something like this:

private long computeKey(Object o1, Object o2)
{
    int h1 = o1.hashCode();
    int h2 = o2.hashCode();

    if (h1 < h2)
    {
        int swap = h1;
        h1 = h2;
        h2 = swap;
    }

    return ((long)h1) << 32 | h2;
}

I still have to figure out which is the most efficient way to store all the elements already cached with a specified one to remove them when the algorithm don't need the item anymore, just to avoid filling of the HashMap with a waste of item. That's because the kind of algorithm merges two items at every iteration removing them from used ones but adding the new generated item.

OTHER TIPS

The easiest way to do this is to define a Pair class. It should be immutable (hash keys should not change), and hashCode() should be consistent with equals.

Something like (method implementations omitted):

public class Pair() {
  int a, b;

  public Pair(int a, int b);

  public int getA();

  public int getB();

  public boolean equals(Object obj);

  public int hashCode();
}

Notes:

  • If you don't want ints, sub in whatever type you want, or make your Pair class generic if you want it to be flexible.

  • It would be up to you whether (x, y) == (y,x).

With this in hand, you can have a HashMap<Pair, SomethingElse> as your cache.

Google Collections supports bi-directional hashmaps, see BiMap.

(BTW, Google Collections seems to be gaining more mindshare over Apache Collections.)

Update: Note @danben and @sateesh's clarification. A BiMap would be fine if you need to get a y given an x, or an x given a y. But it sounds like you really want to look up an (x, y) point and get back a value that contains your cached information. In that case, go with @danben's suggestion.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top