Find largest and second largest element in a range

https://stackoverflow.com/questions/1412751

06-07-2019
|

Question

How do I find the above without removing the largest element and searching again? Is there a more efficient way to do this? It does not matter if the these elements are duplicates.

Solution

for (e: all elements) {
 if (e > largest) {
   second = largest;
   largest = e;
 } else if (e > second) {
   second = e;
 }
}

You could either initialize largest and second to an appropriate lower bound, or to the first two items in the list (check which one is bigger, and don't forget to check if the list has at least two items)

OTHER TIPS

using partial_sort ?

std::partial_sort(aTest.begin(), aTest.begin() + 2, aTest.end(), Functor);

An Example:

std::vector<int> aTest;

    aTest.push_back(3);
    aTest.push_back(2);
    aTest.push_back(4);
    aTest.push_back(1);


    std::partial_sort(aTest.begin(), aTest.begin()+2,aTest.end(), std::greater<int>());

    int Max = aTest[0];
int SecMax = aTest[1];

nth_element(begin, begin+n,end,Compare) places the element that would be nth (where "first" is "0th") if the range [begin, end) were sorted at position begin+n and makes sure that everything from [begin,begin+n) would appear before the nth element in the sorted list. So the code you want is:

nth_element(container.begin(),
            container.begin()+1,
            container.end(),
            appropriateCompare);

This will work well in your case, since you're only looking for the two largest. Assuming your appropriateCompare sorts things from largest to smallest, the second largest element with be at position 1 and the largest will be at position 0.

Lets assume you mean to find the two largest unique values in the list.

If the list is already sorted, then just look at the second last element (or rather, iterate from the end looking for the second last value).

If the list is unsorted, then don't bother to sort it. Sorting is at best O(n lg n). Simple linear iteration is O(n), so just loop over the elements keeping track:

v::value_type second_best = 0, best = 0;
for(v::const_iterator i=v.begin(); i!=v.end(); ++i)
   if(*i > best) {
     second_best = best;
     best = *i;
   } else if(*i > second_best) {
     second_best = *i;
   }

There are of course other criteria, and these could all be put into the test inside the loop. However, should you mean that two elements that both have the same largest value should be found, you have to consider what happens should three or more elements all have this largest value, or if two or more elements have the second largest.

The optimal algorithm shouldn't need more than 1.5 * N - 2 comparisons. (Once we've decided that it's O(n), what's the coefficient in front of N? 2 * N comparisons is less than optimal).

So, first determine the "winner" and the "loser" in each pair - that's 0.5 * N comparisons.

Then determine the largest element by comparing winners - that's another 0.5 * N - 1 comparisons.

Then determine the second-largest element by comparing the loser of the pair where the largest element came from against the winners of all other pairs - another 0.5 * N - 1 comparisons.

Total comparisons = 1.5 N - 2.

The answer depends if you just want the values, or also iterators pointing at the values.

Minor modification of @will answer.

v::value_type second_best = 0, best = 0;
for(v::const_iterator i=v.begin(); i!=v.end(); ++i)
{
   if(*i > best)
   {
     second_best = best;
     best = *i;
   }
   else if (*i > second_best)
   {
     second_best = *i;
   }
}

Create a sublist from n..m, sort it descending. Then grab the first two elements. Delete these elements from the orginal list.

You can scan the list in one pass and save the 1st and 2nd values, that has a O(n) efficiency while sorting is O(n log n).

EDIT:
I think that partial sort is O(n log k)

Untested but fun:

template <typename T, int n>
class top_n_functor : public unary_function<T, void>
{

  void operator() (const T& x) {
     auto f = lower_bound(values_.begin(), values_.end(), x);

     if(values_.size() < n) {
         values_.insert(f, x);
         return;
     }

     if(values_.begin() == f)
          return;

     auto removed = values_.begin();
     values_.splice(removed, values_, removed+1, f);

     *removed = x;
  }

  std::list<T> values() {
     return values_;
  }
private:
   std::list<T> values_;
};

int main()
{
  int A[] = {1, 4, 2, 8, 5, 7};
  const int N = sizeof(A) / sizeof(int);

  auto vals = for_each(A, A + N, top_n_functor<int,2>()).values();

  cout << "The top is " << vals.front()
       << " with second place being " << *(vals.begin()+1) << endl;
}

If the largest is the first element, search for the second largest in [largest+1,end). Otherwise search in [begin,largest) and [largest+1,end) and take the maximum of the two. Of course, this has O(2n), so it's not optimal.

If you have random-access iterators, you could do as quick sort does and use the ever-elegant recursion:

template< typename T >
std::pair<T,T> find_two_largest(const std::pair<T,T>& lhs, const std::pair<T,T>& rhs)
{
  // implementation finding the two largest of the four values left as an exercise :) 
}

template< typename RAIter >
std::pair< typename std::iterator_traits<RAIter>::value_type
         , typename std::iterator_traits<RAIter>::value_type > 
find_two_largest(RAIter begin, RAIter end)
{
  const ptr_diff_t diff = end-begin;
  if( diff < 2 )
    return std::make_pair(*begin, *begin);
  if( diff < 3 )
    return std::make_pair(*begin, *begin+1);
  const RAIter middle = begin + (diff)/2;
  typedef std::pair< typename std::iterator_traits<RAIter>::value_type
                   , typename std::iterator_traits<RAIter>::value_type > 
    result_t;
  const result_t left = find_two_largest(begin,middle);
  const result_t right = find_two_largest(middle,end);

  return find_two_largest(left,right);
}

This has O(n) and shouldn't make more comparisons than NomeN's implementation.

top k is usually a bit better than n(log k)

 template <class t,class ordering>
 class TopK {
 public:
    typedef std::multiset<t,ordering,special_allocator> BEST_t;
    BEST_t best;
    const size_t K;
    TopK(const size_t k)
        : K(k){
    } 
    const BEST_t& insert(const t& item){
        if(best.size()<k){
            best.insert(item);
            return best;
        }
        //k items in multiset now
        //and here is why its better - because if the distribution is random then
        //this and comparison above are usually the comparisons that is done; 
        if(compare(*best.begin(),item){//item better than worst
           erase(begin());//the worst
           best.insert(item); //log k-1 average as only k-1 items in best
        } 
        return best;
    } 
    template <class it>
    const BEST_t& insert(it i,const it last){
        for(;i!=last;++i){
            insert(*i);    
        }
        return best;
    }
  };

Of course the special_allocator can in essence be just an array of k multiset value_types and a list of those nodes (which typically has nothing on it as the other k are in use in the multiset until its time to put a new one in and we erase and then immediate ly reuse it. Good to have this or else the memory alloc/free in std::multiset and the cache line crap kills ya. Its a (very) tiny bit of work to give it static state without violating STL allocator rules.

Not as good as a specialized algo for exactly 2 but for fixed k<<n, I would GUESS (2n+delta*n) where delta is small - my DEK ACP vol3 S&S is packed away and an estimate on delta is a bit more work that I want to do.

average worst is I would guess n(log(k-1) + 2) when in opposite order and all distinct.

best is 2n + k(log k) for the k best being the first

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow