Question

You can pass a function pointer, function object (or boost lambda) to std::sort to define a strict weak ordering of the elements of the container you want sorted.

However, sometimes (enough that I've hit this several times), you want to be able to chain "primitive" comparisons.

A trivial example would be if you were sorting a collection of objects that represent contact data. Sometimes you will want to sort by

last name, first name, area code
. Other times
first name, last name
- yet other times
age, first name, area code
... etc

Now, you can certainly write an additional function object for each case, but that violates the DRY principle - especially if each comparison is less trivial.

It seems like you should be able to write a hierarchy of comparison functions - the low level ones do the single, primitive, comparisons (e.g. first name < first name), then higher level ones call the lower level ones in succession (probably chaining with && to make use of short circuit evaluation) to generate the composite functions.

The trouble with this approach is that std::sort takes a binary predicate - the predicate can only return a bool. So if you're composing them you can't tell if a "false" indicates equality or greater than. You can make your lower level predicates return an int, with three states - but then you would have to wrap those in higher level predicates before they could be used with std::sort on their own.

In all, these are not insurmountable problems. It just seems harder than it should be - and certainly invites a helper library implementation.

Therefore, does anyone know of any pre-existing library (esp. if it's a std or boost library) that can help here - of have any other thoughts on the matter?

[Update]

As mentioned in some of the comments - I've gone ahead and written my own implementation of a class to manage this. It's fairly minimal, and probably has some issues with it in general. but on that basis, for anyone interested, the class is here:

http://pastebin.com/f52a85e4f

And some helper functions (to avoid the need to specify template args) is here:

http://pastebin.com/fa03d66e

Was it helpful?

Solution

You could build a little chaining system like so:

struct Type {
  string first, last;
  int age;
};

struct CmpFirst {
  bool operator () (const Type& lhs, const Type& rhs) { return lhs.first < rhs.first; }
};

struct CmpLast {
  bool operator () (const Type& lhs, const Type& rhs) { return lhs.last < rhs.last; }
};

struct CmpAge {
  bool operator () (const Type& lhs, const Type& rhs) { return lhs.age < rhs.age; }
};

template <typename First, typename Second>
struct Chain {
  Chain(const First& f_, const Second& s_): f(f_), s(s_) {}

  bool operator () (const Type& lhs, const Type& rhs) {
    if(f(lhs, rhs))
      return true;
    if(f(rhs, lhs))
      return false;

    return s(lhs, rhs);
  }

  template <typename Next>
  Chain <Chain, Next> chain(const Next& next) const {
     return Chain <Chain, Next> (*this, next);
  }

  First f;
  Second s;
};

struct False { bool operator() (const Type& lhs, const Type& rhs) { return false; } };

template <typename Op>
Chain <False, Op> make_chain(const Op& op) { return Chain <False, Op> (False(), op); }

Then to use it:

vector <Type> v;  // fill this baby up

sort(v.begin(), v.end(), make_chain(CmpLast()).chain(CmpFirst()).chain(CmpAge()));

The last line is a little verbose, but I think it's clear what's intended.

OTHER TIPS

One conventional way to handle this is to sort in multiple passes and use a stable sort. Notice that std::sort is generally not stable. However, there’s std::stable_sort.

That said, I would write a wrapper around functors that return a tristate (representing less, equals, greater).

You can try this:

Usage:

struct Citizen {
    std::wstring iFirstName;
    std::wstring iLastName;
};

ChainComparer<Citizen> cmp;
cmp.Chain<std::less>( boost::bind( &Citizen::iLastName, _1 ) );
cmp.Chain<std::less>( boost::bind( &Citizen::iFirstName, _1 ) );

std::vector<Citizen> vec;
std::sort( vec.begin(), vec.end(), cmp );

Implementation:

template <typename T>
class ChainComparer {
public:

    typedef boost::function<bool(const T&, const T&)> TComparator;
    typedef TComparator EqualComparator;
    typedef TComparator CustomComparator;

    template <template <typename> class TComparer, typename TValueGetter>
    void Chain( const TValueGetter& getter ) {

        iComparers.push_back( std::make_pair( 
            boost::bind( getter, _1 ) == boost::bind( getter, _2 ), 
            boost::bind( TComparer<TValueGetter::result_type>(), boost::bind( getter, _1 ), boost::bind( getter, _2 ) ) 
        ) );
    }

    bool operator()( const T& lhs, const T& rhs ) {
        BOOST_FOREACH( const auto& comparer, iComparers ) {
            if( !comparer.first( lhs, rhs ) ) {
                return comparer.second( lhs, rhs );
            }
        }

        return false;
    }

private:
    std::vector<std::pair<EqualComparator, CustomComparator>> iComparers;
};

std::sort is not guaranteed to be stable because stable sorts are usually slower than non-stable ones ... so using a stable sort multiple times looks like a recipe for performance trouble...

And yes it's really a shame that sort ask for a predicate: I see no other way than create a functor accepting a vector of tristate functions ...

The chaining solution is verbose. You could also use boost::bind in conjunction with std::logical_and to build your sorting predicate. See the linked article for more information: How the boost bind library can improve your C++ programs

Variadic templates in C++ 11 give a shorter option:

    #include <iostream>
    using namespace std;

    struct vec { int x,y,z; };

    struct CmpX {
      bool operator() (const vec& lhs, const vec& rhs) const 
      { return lhs.x < rhs.x; }
    };

    struct CmpY {
      bool operator() (const vec& lhs, const vec& rhs) const 
      { return lhs.y < rhs.y; }
    };

    struct CmpZ {
      bool operator() (const vec& lhs, const vec& rhs) const 
      { return lhs.z < rhs.z; }
    };

    template <typename T>
    bool chained(const T &, const T &) {
      return false;
    }

    template <typename CMP, typename T, typename ...P>
    bool chained(const T &t1, const T &t2, const CMP &c, P...p) {
      if (c(t1,t2)) { return true;          }
      if (c(t2,t1)) { return false;         }
      else          { return chained(t1, t2, p...); }
    }

    int main(int argc, char **argv) {
      vec x = { 1,2,3 }, y = { 2,2,3 }, z = { 1,3,3 };
      cout << chained(x,x,CmpX(),CmpY(),CmpZ()) << endl;
      return 0;
    }
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top