Question

I have a std::vector of std::strings, each of which is a filename. Suppose the filenames have the format some_name_n.xyz, where n is a number.

The problem is that some_name_10.xyz compares less than some_name_2.xyz under the default lexicographic ordering. The files are produced by some other process.

What is the least painful way to sort them so that the number after '_' is considered for comparison, and not just its length?

Solution

std::sort allows you to specify a binary function for comparing two elements: http://www.cplusplus.com/reference/algorithm/sort/

Now it's just a matter of constructing that binary function. A partial example is here: Sorting std::strings with numbers in them?
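As a rough sketch of what such a binary function could look like for the some_name_n.xyz format: the helper names trailing_number and sort_by_trailing_number below are made up for illustration, and the code assumes the number sits between the last '_' and the last '.' and is small enough to fit in a long long.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the number between the last '_' and the
// last '.', e.g. 10 for "some_name_10.xyz". Returns -1 if the name does
// not follow that pattern.
long long trailing_number(const std::string& name)
{
    const std::size_t us = name.rfind('_');
    const std::size_t dot = name.rfind('.');
    if (us == std::string::npos || dot == std::string::npos || dot <= us + 1)
        return -1;
    long long value = 0;
    for (std::size_t i = us + 1; i < dot; ++i)
    {
        if (name[i] < '0' || name[i] > '9')
            return -1;                      // non-digit between '_' and '.'
        value = value * 10 + (name[i] - '0');
    }
    return value;
}

// Sorts by the extracted number, falling back to plain string comparison
// when two names carry the same number.
void sort_by_trailing_number(std::vector<std::string>& names)
{
    std::sort(names.begin(), names.end(),
              [](const std::string& a, const std::string& b)
              {
                  const long long na = trailing_number(a);
                  const long long nb = trailing_number(b);
                  if (na != nb)
                      return na < nb;
                  return a < b;
              });
}
```

Names that do not match the pattern get -1 and therefore end up at the front; whether that is what you want depends on your data.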

OTHER TIPS

The least painful way is to put appropriate leading zeroes into your file names (even writing a second script that takes the generated names and renames them may be easier than writing your own sort routine).

The second least painful way is to write your own sort predicate that compares the number after the '_' as a number rather than lexicographically.
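If you go the renaming route, the padded name could be computed along these lines. This is only a sketch: pad_number is a hypothetical helper, the width is something you would pick yourself, and the actual renaming (for example with std::rename or std::filesystem::rename) is left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: pads the digits between the last '_' and the last
// '.' with leading zeros until they are `width` characters long.
// Names that do not match the pattern are returned unchanged.
std::string pad_number(const std::string& name, std::size_t width)
{
    const std::size_t us = name.rfind('_');
    const std::size_t dot = name.rfind('.');
    if (us == std::string::npos || dot == std::string::npos || dot <= us + 1)
        return name;
    for (std::size_t i = us + 1; i < dot; ++i)
        if (name[i] < '0' || name[i] > '9')
            return name;                    // not a pure number, leave as is
    const std::size_t digits = dot - us - 1;
    if (digits >= width)
        return name;                        // already wide enough
    std::string padded = name;
    padded.insert(us + 1, width - digits, '0');
    return padded;
}
```

With a width of 3, some_name_2.xyz becomes some_name_002.xyz and some_name_10.xyz becomes some_name_010.xyz, so a plain std::sort orders them as expected.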

Here's a comparison that handles any number of numeric values embedded in the strings:

#include <cstdlib>
#include <cctype>
#include <iostream>

#ifdef  _MSC_VER
#define strtoll _strtoi64
#endif

int cmp(const char* lhs, const char* rhs)
{
    while (*lhs || *rhs)
    {
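        // Cast to unsigned char: passing a negative char to isdigit is undefined.
        if (isdigit(static_cast<unsigned char>(*lhs)) &&
            isdigit(static_cast<unsigned char>(*rhs)))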
        {
            char* l_end;
            char* r_end;
            long long l = strtoll(lhs, &l_end, 10);
            long long r = strtoll(rhs, &r_end, 10);
            if (l < r) return -1;
            if (l > r) return 1;
            lhs = l_end;
            rhs = r_end;
        }
        else
            if (*lhs != *rhs)
                return *lhs - *rhs;
            else
                ++lhs, ++rhs;
    }
    return *lhs - *rhs;
}

It's deliberately "C style" so it can be applied directly and efficiently to character arrays. It returns a negative number if lhs < rhs, 0 if they're equal, and a positive number if lhs > rhs.

You can call this from a comparison functor or lambda passed to std::sort.
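For example, a lambda wrapper might look like the sketch below. The cmp function is repeated so the block compiles on its own (without the MSVC macro, assuming a compiler that provides std::strtoll), and natural_sort is just an illustrative name.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// cmp() from above, repeated so this block compiles on its own.
int cmp(const char* lhs, const char* rhs)
{
    while (*lhs || *rhs)
    {
        if (std::isdigit(static_cast<unsigned char>(*lhs)) &&
            std::isdigit(static_cast<unsigned char>(*rhs)))
        {
            char* l_end;
            char* r_end;
            long long l = std::strtoll(lhs, &l_end, 10);
            long long r = std::strtoll(rhs, &r_end, 10);
            if (l < r) return -1;
            if (l > r) return 1;
            lhs = l_end;
            rhs = r_end;
        }
        else if (*lhs != *rhs)
            return *lhs - *rhs;
        else
            ++lhs, ++rhs;
    }
    return *lhs - *rhs;
}

// Hypothetical wrapper: std::sort wants a "less than" predicate, so the
// three-way result of cmp() is turned into a bool.
void natural_sort(std::vector<std::string>& names)
{
    std::sort(names.begin(), names.end(),
              [](const std::string& a, const std::string& b)
              { return cmp(a.c_str(), b.c_str()) < 0; });
}
```

Note that the predicate has to return true only for "strictly less", which is why the result is compared with < 0 rather than <= 0.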

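You can use a custom comparator along these lines:

#include <algorithm>
#include <cctype>
#include <string>

struct Comp{

    // Parses the first run of digits in the name, e.g. 10 for "some_name_10.xyz".
    // Note: a digit earlier in the name (as in "run2_10.xyz") would be picked up instead.
    int get_num (const std::string& a) const
    {
        auto it1 = std::find_if( a.begin(), a.end(),
                               [](unsigned char x){ return std::isdigit(x) != 0 ;}) ;
        auto it2 = std::find_if( it1, a.end(),
                               [](char x){ return x == '.' ;}) ;
        /* No digit found: sort such names first instead of letting std::stoi throw */
        if (it1 == a.end()) return -1 ;
        return std::stoi (std::string( it1, it2 )) ;
    }

    // const so the comparator can be called on a const object
    bool operator () (const std::string& a, const std::string& b) const
    {
        return get_num (a) < get_num (b) ;
    }

};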

Licensed under: CC-BY-SA with attribution