Question

What is a simple and fast way to get an iterator that returns at most N elements from the start of a List?

The simplest versions I could come up with are:

#1:

import com.google.common.collect.Iterators;

// ...

public static <E> Iterator<E> lengthLimitedIterator(Iterable<E> source, int maxLen) {
    return Iterators.partition(source.iterator(), maxLen).next().iterator();
}

#2:

public static <E> Iterator<E> lengthLimitedIterator(List<E> source, int maxLen) {
    return source.subList(0, Math.min(source.size(), maxLen)).iterator();
}

Unfortunately both versions create a temporary List which significantly affects performance as I am calling this method millions of times in a tight loop.

Are there any other library functions I could use for this?


Note: I cannot avoid iterating over the list as I am passing it to a method which takes an iterator as its argument and I cannot modify that class.

Was it helpful?

Solution

Seems as if the feature will be added to guava, currently (as of r06) in beta:

public static <T> Iterator<T> limit(Iterator<T> iterator, int limitSize)

OTHER TIPS

You already know it's a list, so you can just call the List.subList(int fromIndex, int toIndex) method. As per the specification, the subList is backed by the original list, so it's not really creating a full blown List, just some kind of proxy object.

This is a place where a Decorator works very well: your decorator keeps a count, which is incremented by next(), and used by control hasNext().

Example (intentionally incomplete):

public class LengthLimitedIterator<T>
implements Iterator<T>
{
    private Iterator<T> _wrapped;
    private int _length;
    private int _count;

    public LengthLimitedIterator(Iterator<T> wrapped, int length)
    {
        _wrapped = wrapped;
        _length = length;
    }


    public boolean hasNext()
    {
        if (_count < _length)
            return _wrapped.hasNext();
        return false;
    }

    public T next()
    {
        // FIXME - add exception if count >= length
        _count++;
        return _wrapped.next();
    }

Why not simply

list.subList(0, 42).iterator();

I am not sure why you mind about the creation of that temporary list. It doesn't do anything I'd consider expensive. In fact, creating this list is by far cheaper than iterating over it, which I assume you do.

The ArrayList.sublist(int,int) method does not create a copy of the original list. Instead it returns a SubList instance that wraps the original ArrayList. The iterator returned by the sublist derived from Array doesn't make a copy either.

So my advice is to try using ArrayList as your base list type and the sublist method. If that is not fast enough, implement your own variant of ArrayList that implements a limitedLengthIterator method. For example, you should be able to get rid of the code that checks for concurrent modifications.

If you are concerned about performance, don't use an Iterator, use an index on an array. This will give much better performance. Getting the first N elements of an array is trivial.

This version turns out to be faster than any of the other examples:

public static <E> Iterator<E> lengthLimitedIterator(List<E> source, int maxLen) {
    maxLen = Math.min(maxLen, source.size());
    ArrayList<E> tempList = new ArrayList<E>(maxLen);
    for (int i = 0; i < maxLen; ++ i) {
       tempList.add(source.get(i));
    }
    return tempList.iterator();
}

If the temporary list has to be created anyway, an ArrayList is faster than the decorated lists returned by the other library methods.

My guess is that ArrayList is getting some special treatment within the VM.

Maybe this would be inefficient for very long lists, but my lists are short (nearly always fewer than 50 elements.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top