Вопрос

Hi I'm using a Set backed by a HashMap to keep track of what edges I have already traversed in a graph. I was planning on keying the set by the result of adding the hashcodes of the data stored in each edge.

v.getData().hashCode() + wordV.getData().hashCode()

But how reliable is this when using contains to check to see if the edge is in the set? Couldn't I hypothetically get a false positive? Is there anyway to overcome this?

the exact statement that causes me concern is:

edgeSet.contains(v.getData().hashCode() + wordV.getData().hashCode())

thanks!

Oh by the way I'm using Java.

EDIT:

I should have made this clear in the question. In my graph there is no edge object, there are vertex objects which each hold a list of more vertex objects, which is the edge. So I guess the question that follows from that combined with your responses is:

Can I use a Set to store references to information as opposed to objects....? i.e. can I store the result of adding the two hashcodes for the vertices' data objects?

EDIT2:

I am indeed using the Java library for my hashmap, I declare it as below:

Set<Integer> edgeSet = Collections.newSetFromMap(new ConcurrentHashMap<Integer, Boolean>());
Это было полезно?

Решение

Note: from your question I couldn't tell whether you were using HashSet, or your own home rolled implementation. Be aware that Java's HashSet is simply a wrapper for a HashMap where the values are ignored. HashSet.contains simply calls containsKey on the inner map.

HashMap.containsKey uses the same lookup as get. That will compute the hash and use it to find the right bucket. From there it will walk the bucket and use equals until it finds the exact match. Assuming the element type implements both hashCode and equals correctly, there's no chance of getting a false positive using containsKey since equals is ultimately used to confirm.

The relevant source code is in the package-private method getEntry, which is used by both containsKey and get:

final Entry<K,V> getEntry(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

EDIT:

Can I use a Set to store references to information as opposed to objects....? i.e. can I store the result of adding the two hashcodes for the vertices' data objects?

No, you would need to implement a new class representing this information and store instances of it in the Set. This could be a simple POJO with a field for each piece of information, and with hashCode and equals overridden correctly.

Другие советы

Hash codes, by definition, will have collisions. Adding them together helps nothing.

You should make the edges of your graph support hashCode and equals, and simply put the edges in a hash set as you go.

class Edge { ... equals and hashCode ... }

HashSet<Edge> traversed = new HashSet<Edge>();
traversed.add(edge);
...
if(traversed.contains(edge)) ...

If you are numbering your edges instead, then Integer already has a good hash code and equals, so use it:

HashSet<Integer> traversed = new HashSet<Integer>();
if(traversed.contains(edgeNumber)) ...
traversed.add(3);

As long as you override both hashCode() and equals(), you'll be fine. Hash codes are never guaranteed to be unique. That said, you're kind of misusing the Set. With storing classes with properly implemented hashCode() and equals() methods, methods such as contains()' will have perfect accuracy. However, that's not how you're using it here. As it sounds like you're almost building your own data structure / collection, you'll need to consider doing this the same way that the 'Hash' collections do - use a HashMap to store buckets of your hashes - the hash value as the key, and then a collection of whatever as the value to compare against. This will allow you to quickly see if the parent map even has the hash you're looking for. If not, you're done (false). If it does, then you need to confirm that its "bucket" has the specific values you're looking for (true).

Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top