Question

I was trying to pick a standard way to convert integrals to strings, so I went on and did a small performance evaluation by measuring the execution time of 3 methods

#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <chrono>
#include <random>
#include <exception>
#include <type_traits>
#include <boost/lexical_cast.hpp>

using namespace std;

// 1. A way to easily measure elapsed time -------------------
template<typename TimeT = std::chrono::milliseconds>
struct measure
{
    template<typename F>
    static typename TimeT::rep execution(F const &func)
    {
        auto start = std::chrono::system_clock::now();
        func();
        auto duration = std::chrono::duration_cast< TimeT>(
            std::chrono::system_clock::now() - start);
        return duration.count();
    }
};
// -----------------------------------------------------------

// 2. Define the conversion functions ========================
template<typename T> // A. Using stringstream ================
string StringFromNumber_SS(T const &value) {
    stringstream ss;
    ss << value;
    return ss.str();
}

template<typename T> // B. Using boost::lexical_cast =========
string StringFromNumber_LC(T const &value) {
    return boost::lexical_cast<string>(value);
}

template<typename T> // C. Using c++11 to_string() ===========
string StringFromNumber_C11(T const &value) {
    return std::to_string(value);
}
// ===========================================================

// 3. A wrapper to measure the different executions ----------
template<typename T, typename F>
long long MeasureExec(std::vector<T> const &v1, F const &func)
{
    return measure<>::execution([&]() {
        for (auto const &i : v1) {
            if (func(i) != StringFromNumber_LC(i)) {
                throw std::runtime_error("FAIL");
            }
        }
    });
}
// -----------------------------------------------------------

// 4. Machinery to generate random numbers into a vector -----
template<typename T>
typename std::enable_if<std::is_integral<T>::value>::type 
FillVec(vector<T> &v)
{
    std::mt19937 e2(1);
    std::uniform_int_distribution<> dist(3, 1440);
    std::generate(v.begin(), v.end(), [&]() { return dist(e2); });
}

template<typename T>
typename std::enable_if<!std::is_integral<T>::value>::type 
FillVec(vector<T> &v)
{
    std::mt19937 e2(1);
    std::uniform_real_distribution<> dist(-1440., 1440.);
    std::generate(v.begin(), v.end(), [&]() { return dist(e2); });
}
// -----------------------------------------------------------

int main()
{
    std::vector<int> v1(991908);
    FillVec(v1);

    cout << "C++ 11 method ......... " <<
        MeasureExec(v1, StringFromNumber_C11<int>) << endl;
    cout << "String stream method .. " <<
        MeasureExec(v1, StringFromNumber_SS<int>) << endl;
    cout << "Lexical cast method ... " <<
        MeasureExec(v1, StringFromNumber_LC<int>) << endl;

    return 0;
}

A typical output (running Release in VS2013 which implies /O2 optimization flag) would be

C++ 11 method ......... 273

String stream method .. 1923

Lexical cast method ... 222

UPDATE

Alternatively an online run on gcc with

g++ -std=c++11 -Ofast -march=native -Wall -pedantic main.cpp && ./a.out

C++ 11 method ......... 414

String stream method .. 1538

Lexical cast method ... 275

Disclaimer : Results are to be compared among each other and not across machines

Questions

1. Why is the string stream method consistently the worst (by an order of magnitude)? Should it be viewed as deprecated now that faster alternatives emerged?

2. Why is lexical cast consistently the best? Can we assume that this is the fastest implementation?

Please feel free to tweak and play with your versions of this code. I'd appreciate your insights on the topic.

PS

The code that was actually run, had only one measurement per main(). Here all were 3 were presented together to save space.

Optimization flags are compiler specific or application mandated. I'm just providing the code blocks to perform the tests and expect from SO users to chip in with their results or suggestions to what the optimum configuration per compiler would be (for what it's worth I provided the flags used here).

The code works for any numeric to string conversion (it takes changing the type of v1 in main). sehe did for double (mentioned in his answer's comment). It's a good idea to play with that too.

Was it helpful?

Solution

Question 1. Why is the string stream method consistently the worst?

The classical mistake: creating a new stringstream every single time

template<typename T> // 1. Using stringstream
string StringFromIntegral_SS(T const &value) {
    thread_local stringstream ss;
    ss.str("");
    ss.clear();
    ss << value;
    return ss.str();
}

Question 2. Why is lexical cast consistently the best? Can we assume that this is the fastest implementation ?

Because it's most specialized; and, no, faster implementations exist. FastFormat and Boost Spirit have competitive offerings, as far as I know.

Update Boost Spirit Karma still easily beats the bunch:

template<typename T> // 4. Karma to string
std::string StringFromIntegral_K(T const &value) {
    thread_local auto const gen = boost::spirit::traits::create_generator<T>::call();
    thread_local char buf[20];
    char* it = buf;
    boost::spirit::karma::generate(it, gen, value);
    return std::string(buf, it);
}

Timings:

C++ 11 method 111
String stream method 103
Lexical cast method 57
Spirit Karma method 36
Spirit Karma method with string_ref 13

See it Live On Coliru Clang or GCC


BONUS

Just to goof off, a version using boost::string_ref is much faster still due the reduced allocations:

template<typename T> // 5. Karma to string_ref
boost::string_ref StringFromIntegral_KSR(T const &value) {
    thread_local auto const gen = boost::spirit::traits::create_generator<T>::call();
    thread_local char buf[20];
    char* it = buf;
    boost::spirit::karma::generate(it, gen, value);
    return boost::string_ref(buf, it-buf);
}

I've tested all modified methods for correctness using an asserting test loop:

return measure<>::execution(
    //[&]() { for (auto const &i : v1) { func(i); }});
    [&]() { for (auto const &i : v1) { assert(func(i) == StringFromIntegral_LC(i)); }});

OTHER TIPS

std::format

A new method has been added to our arsenal with c++20, namely that of std::format. Obtaining a string from a number num would be as easy as:

std::format("{}", num);

Since gcc does not support it yet, I extended the original benchmark using fmt:

fmt::format("{}", num);

i.e. the library std::format is based on (and will probably be the bulk of upcoming implementations), in this online demo. Dissapointed to see it's 4x slower than std::to_string despite the library's advertised speed:

C++ 11 method ......... 56
String stream method .. 1171
Lexical cast method ... 78
Format method ... 210

Maybe I'm not doing this library justice (after all the reported benchmarks for fmt consider the print functionality) so I'm leaving the benchmark and report here, in case tweaks or corrections can be suggested.

This is a version of the benchmark with sehe's optimized use of stringstream (i.e. as a thread local variable)


charconv

The advertised as fastest modern method, that of using the charconv header keeps its promise and beats competition, with relative timings like:

Program returned: 0
C++ 11 method ......... 13
String stream method .. 1044
Lexical cast method ... 23
Format method ... 152
Charconv method ... 11

As shown in the following Demo, a major difference of the charconv API is that it forces the user to preallocate the result buffer (where the output is placed) even placing it in automatic memory (stack). If one goes against the grain and (unlike what's shown in the demo above) does things like creating strings on the fly, e.g. in case they need to store each "new object":

template<typename T> // D. Using c++17 to_chars() ===========
std::string StringFromNumber_CharConv(T const &value) {
    std::array<char, 10> ret;
    auto [ptr, ec] = std::to_chars(ret.data(), ret.data() + ret.size(), value);
    return  std::string(ret.data(), ptr);
}

then performance of to_chars matches that of std::to_string. So two thoughts on that:

  1. Either memory allocation is the determining factor here, and this benchmark is not fine grained enough to distinguish between to_string and to_chars.
  2. Or std::to_string has already incorporated advances made by charconv.

That said, it seems that if one needs to go as fast as possible, charconv provides the API to do so since it doesn't enforce memory allocation and the "core" convertion algorithm is apparently as fast as it gets.

Determining factor will of course be the length of numbers we are converting since different algorithms may shine for differnt input. The 4 number integers we use in the benchmark might not leave much room for charconv to show its power, something that large floating point numbers would showcase better.

I added absl::StrCat as an option, made sure the code would handle floating-point as well, and then ran the benchmark loop 3 times.

I also ran them on both clang and gcc.

See https://godbolt.org/z/7feoaEEKr

For int on gcc, the results I got were:

C++ 11 method ......... 81
String stream method .. 112
Lexical cast method ... 96
Format method ......... 212
StrCat method ......... 608

C++ 11 method ......... 81
String stream method .. 131
Lexical cast method ... 143
Format method ......... 704
StrCat method ......... 118

C++ 11 method ......... 103
String stream method .. 153
Lexical cast method ... 633
Format method ......... 191
StrCat method ......... 91

I don't know what's going in with the benchmarks sometimes getting much slower results, but I suspect there's some allocator garbage collection due to all the memory allocation / deallocation.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top