Question

I just finished the main part of the current data structures project, and am working on collecting the statistics. One requirement is that a count of all the references within the TreeMap be recorded.

This map contains 31,000+ nodes, each mapping a String to a TreeSet of indeterminate size. I need to traverse the map and keep a running count of the number of items in the sets.

Originally my idea was this:

Set<String> keySet= lyricWords.keySet();  
Iterator<String> iter= keySet.iterator();
String current= iter.next();

while (iter.hasNext){
  runCount+= lyricWords.get(current).size();
}

The runtime for this is far too long to be acceptable. Is there a more efficient way to do this on the final structure? I could keep a count as the map is built, but the professor wants the numbers to be based on the final structure itself.


Solution

This isn't of much use to you since this is an assignment you're working on, but this is an example where a data structure specifically designed for mapping keys to multiple values shows how much better it is than a Map<T, Collection<V>>.

Guava's Multimap collection type keeps track of the total number of entries it contains, so if you were using a TreeMultimap<String, Foo> rather than a TreeMap<String, TreeSet<Foo>> you could just call multimap.size() to get the number you're looking for.

The Multimap implementations store a running total of the number of entries, which is updated whenever entries are added or removed. You might be able to do the same by subclassing TreeMap and wrapping the TreeSets that are added to it, but I think it would be quite challenging to make it all work properly.
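To illustrate the running-total idea without Guava, here is a minimal sketch. It swaps subclassing for composition: the hypothetical class CountingIndex (not a Guava or JDK class) owns the TreeMap and never hands out a mutable TreeSet, so every change passes through put or remove. The String element type and all method names are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical wrapper: keeps a running total of all values across all keys.
class CountingIndex {
    private final TreeMap<String, TreeSet<String>> map = new TreeMap<>();
    private int totalEntries = 0;

    // Adds value under key; the total grows only if the value was not already present.
    boolean put(String key, String value) {
        boolean added = map.computeIfAbsent(key, k -> new TreeSet<>()).add(value);
        if (added) {
            totalEntries++;
        }
        return added;
    }

    // Removes value from key's set; drops the key once its set is empty.
    boolean remove(String key, String value) {
        TreeSet<String> values = map.get(key);
        if (values == null || !values.remove(value)) {
            return false;
        }
        totalEntries--;
        if (values.isEmpty()) {
            map.remove(key);
        }
        return true;
    }

    // Read-only view, so callers cannot change a set behind the counter's back.
    SortedSet<String> get(String key) {
        TreeSet<String> values = map.get(key);
        if (values == null) {
            return Collections.emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(values);
    }

    // Total number of key-value entries, answered without any traversal.
    int size() {
        return totalEntries;
    }
}
```

The hard part mentioned above is exactly what the read-only view sidesteps: if callers could reach the underlying TreeSets directly, the counter would drift out of sync.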

OTHER TIPS

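I'm not sure, but you probably have an infinite loop: iter.next() is called only once, before the loop, so hasNext() never becomes false. Advance the iterator inside the loop instead:

while (iter.hasNext()) {
  runCount += lyricWords.get(iter.next()).size();
}

Or iterate over the entries directly, which also avoids the extra lookup per key:

for (Map.Entry<String, TreeSet> e : lyricWords.entrySet()) {
  runCount += e.getValue().size();
}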
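For reference, a self-contained version of the single-pass count might look like the sketch below. The class name ReferenceCounter, the String element type of the sets, and the sample data are assumptions made for illustration; the question does not say what the TreeSets hold.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class ReferenceCounter {
    // Sums the sizes of all value sets in one pass over the map's values.
    static long countReferences(Map<String, TreeSet<String>> lyricWords) {
        long runCount = 0;
        for (TreeSet<String> refs : lyricWords.values()) {
            runCount += refs.size(); // size() is a stored field lookup, not a traversal
        }
        return runCount;
    }

    public static void main(String[] args) {
        // Tiny stand-in for the real 31,000+ entry map.
        Map<String, TreeSet<String>> lyricWords = new TreeMap<>();
        lyricWords.put("love", new TreeSet<>(Arrays.asList("songA", "songB")));
        lyricWords.put("rain", new TreeSet<>(Arrays.asList("songC")));
        System.out.println(countReferences(lyricWords)); // prints 3
    }
}
```

Because each size() call is constant time, the whole count is linear in the number of keys, which should be fast for a map of this size once the loop actually terminates.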

I don't see a problem with keeping a count as the map is built.

The count will be correct at the end, and you won't have to incur the cost of iterating through the entire thing again.

I think the tree can and should keep track of its size.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow