Question

Assume an application producing a number of HashMap<String, MyClass> data structures, each containing dozens to hundreds of Comparable objects of type MyClass, which need to end up in a single and sorted Collection.

Two possible implementations of this functionality return a SortedSet or a sorted List, as follows:

public static Set<MyClass> getSortedSet(HashMap<String, MyClass>... allMaps)
{
    SortedSet<MyClass> set = new TreeSet<MyClass>();

    Collection<MyClass> c;

    for (HashMap<String, MyClass> map:allMaps)
    {
        c = map.values();
        set.addAll(c);
    }

    return set;
}

public static List<MyClass> getSortedList(HashMap<String, MyClass>... allMaps)
{
    List<MyClass> list = new ArrayList<MyClass>();

    Collection<MyClass> c;

    for (HashMap<String, MyClass> map:allMaps)
    {
        c = map.values();
        list.addAll(c);
    }

    Collections.sort(list);

    return list;
}

Would there be any clear performance advantage to any of the above 2 methods?

Is there a faster way of implementing the same functionality?

Was it helpful?

Solution

Some problems with your sorted list method:

ArrayLists are backed by arrays. Whenever you add a new element, it might have to grow the array behind the scenes. If you want to use this approach, you should create the ArrayList of the proper size before hand.

The sorting after you add all elements seems non-optimal. Why not add elements into the list at their correct position? (Use a sorted collection then turn into a list) A good Sorted List for Java

To actually answer you question, I would go with the approach that uses the TreeSet behind the scenes. Because, if the user wants, they can always do Set.toArray() and then have a list.

OTHER TIPS

Sets and Lists differ by their nature in the sense, Sets do not keep duplicates. You can have only one instance of an object. Lists allow you to keep duplicates.

As such, Sets do more work, hence they are slower.

Is there any reason to assume that one implementation will be always faster than the other?

No, there is no reason to assume that.

Which is faster could depend on the quantity of data, on its properties, on the performance characteristics of the comparator, on your JDK, on the JIT compiler etc.

The only way to know for sure is by benchmarking the code on realistic data.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top