Question

I have mmapped a huge file into char string and made a c++ string out of it. I need to parse this string on a delimiter character, which is a space, and store the values in a matrix. I could do it from one thread but I need to optimize it, so I'm using multiple threads to parse strings from this stringstream and store them in the matrix. Based on the thread ID, each thread stores its parsed data into its own part of the matrix, so the writes do not conflict. But how do I synchronize the parsing, since any thread can be scheduled at any time and read the next string? Here is my code:

void* parseMappedString(void* args)
{
    char temp[BUFFSIZE];
    long int threadID = *((long int*)args);
    if (threadID  < 0)
        threadID = 0;

    for (int i = ((threadID) * 160); i < ((threadID+1) * 160); i++)
    {
        for (int j = 0; j < 4000; j++)
        {
            pthread_mutex_lock(&ParseMatrixMutex);
            if ((matrix_str.getline(temp,BUFFSIZE, ' ')) )
            {
                pthread_mutex_unlock(&ParseMatrixMutex);
                matrix[i][j] = parseFloat((temp));
            }
            else
            {
                pthread_mutex_unlock(&ParseMatrixMutex);
            }
        }
    }
    return NULL;  // a void* thread function must return a value
}

void create_threads_for_parsing(void)
{
    long int i;

    for (i = 0; i < 5; i++)
        pthread_create(&Threads[i], NULL, parseMappedString, (void*)&i);
}

As you can see in the code, there are five threads in total, and each thread processes 160 * 4000 elements. Each thread stores its results at locations in the matrix determined by its thread ID, so the writes do not collide. But getline can be called by any thread at any time, so thread 5 can parse data that belongs to the first thread. How do I avoid this?

I had to add the following because I receive thread IDs 1-4 in args but never 0. It always arrives as some junk negative value, so I had to hardcode it like this:

if (threadID < 0) threadID = 0;

Solution

I have mmapped a huge file into char string and made a c++ string

Don't: std::string has to copy the memory, so you lose the performance improvement mmap would otherwise give you. Just work on the raw memory as a char array.

I could do it from one thread but I need to optimize it

Are you sure multiple threads will optimize it? Did you profile and confirm it's definitely CPU-bound and not I/O bound?


If you're sure multiple threads are the way to go, I'd suggest doing this:

  1. create N threads (this should be based on the number of cores and then tweaked according to test results)
  2. carve your mmap'd region up into N blocks of approximately equal size
    • you can just search back & forth for the nearest newline to your block boundary
  3. have each thread n create its own independent output
  4. combine all the outputs afterwards
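
The four steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the asker's code: parseBlock and parseParallel are hypothetical names, tokens are assumed to be shorter than 64 characters and separated by spaces or newlines, and std::thread is used instead of raw pthreads for brevity.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <thread>
#include <vector>

// Parse the delimiter-separated floats in [begin, end) into out.
// Works on raw memory, so it can be pointed straight at an mmap'd region.
static void parseBlock(const char* begin, const char* end, std::vector<float>& out)
{
    const char* p = begin;
    while (p < end) {
        while (p < end && (*p == ' ' || *p == '\n')) ++p;  // skip delimiters
        const char* tok = p;
        while (p < end && *p != ' ' && *p != '\n') ++p;    // find end of token
        if (p == tok) break;                                // nothing left but delimiters
        char buf[64];                                       // assumption: tokens fit in 63 chars
        std::size_t len = static_cast<std::size_t>(p - tok);
        if (len >= sizeof buf) len = sizeof buf - 1;
        std::memcpy(buf, tok, len);                         // copy so strtof sees a terminator
        buf[len] = '\0';
        out.push_back(std::strtof(buf, nullptr));
    }
}

// Split [data, data + size) into nThreads blocks whose boundaries fall on
// delimiters, parse each block in its own thread, then concatenate in order.
std::vector<float> parseParallel(const char* data, std::size_t size, unsigned nThreads)
{
    if (nThreads == 0) nThreads = 1;
    const char* const last = data + size;

    std::vector<const char*> bounds(nThreads + 1);
    bounds[0] = data;
    bounds[nThreads] = last;
    for (unsigned t = 1; t < nThreads; ++t) {
        const char* p = data + (size / nThreads) * t;       // nominal boundary
        if (p < bounds[t - 1]) p = bounds[t - 1];           // keep boundaries ordered
        while (p < last && *p != ' ' && *p != '\n') ++p;    // move to the next delimiter
        bounds[t] = p;
    }

    std::vector<std::vector<float>> partial(nThreads);      // one private output per thread
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nThreads; ++t)
        workers.emplace_back(parseBlock, bounds[t], bounds[t + 1], std::ref(partial[t]));
    for (auto& w : workers) w.join();

    std::vector<float> result;                              // combine the outputs afterwards
    for (const auto& part : partial)
        result.insert(result.end(), part.begin(), part.end());
    return result;
}
```

No mutex is needed here because the threads only read the shared input and each writes to its own vector. For a fixed-size matrix, the combine step could instead copy each partial result to its final position once the element counts are known.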

As for the bug in the code I'm trying to persuade you not to use: you pass (void*)&i as your argument to the thread function. This is a pointer to an automatic local that goes out of scope at the end of create_threads_for_parsing, so it's likely to be random garbage by the time any thread reads it. Even if it weren't random garbage (i.e., if create_threads_for_parsing joined all the threads before returning, to keep i in scope), it would be the same pointer for each thread, so every thread would read whatever value i happens to hold at that moment.

To safely pass a distinct integer id to each thread, you should allocate a distinct integer for each thread, and pass its address. It's either that or mess around with intptr_t.
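
A minimal sketch of the first option, one integer slot per thread, might look like this. The names worker, runWorkers and results are made up for illustration and are not from the question; the results array exists only so the effect can be observed.

```cpp
#include <pthread.h>

enum { NUM_THREADS = 5 };

// Hypothetical output array: slot n is written only by the thread with id n.
static long int results[NUM_THREADS];

static void* worker(void* arg)
{
    long int id = *static_cast<long int*>(arg);  // each thread reads its own slot
    results[id] = id + 100;                      // stand-in for the real work
    return NULL;
}

void runWorkers()
{
    pthread_t threads[NUM_THREADS];
    long int ids[NUM_THREADS];                   // one distinct integer per thread

    for (long int i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;                              // written before the thread starts
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }

    // Join before returning so that ids[] is still alive while threads read it.
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
}
```

Because every thread gets the address of a different array element, and that element is never modified after pthread_create is called, there is no race on the id.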

OTHER TIPS

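getline() on a shared stream is not thread-safe, so you cannot call it from different threads without locking. (std::string itself has no getline member; the member function used in the question belongs to std::istream.)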

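You either need to access a known position in the raw string data in memory using strncpy (C-style; note that the destination comes first and that strncpy does not add a terminator when it copies the full count)

strncpy(temp, matrix_str.c_str() + i, 4000);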

or using the substring-function (C++-style)

std::string piece = matrix_str.substr(i, 4000);

EDIT: If your matrix_str is not a std::string but a std::stringstream object, this will not work, as a stream has to be read in order. Your question is a bit vague on that part...

The code is almost fully mutexed, so there's no point at all in using threads.

The idea of parallelization is to allow work to actually be done at the same time. For that you should reduce data sharing, ideally to zero.

For example, split the big string into 4 parts up front and hand one to each thread, so that each thread can read and process its own part and place the results in its own exclusive location too. The output can go to the matrix if no cells are shared, but be aware of false sharing, which could still ruin performance.
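
To illustrate the false-sharing point: if several threads write to neighbouring variables that sit on the same cache line, the line bounces between cores even though no data is logically shared. One common mitigation is to pad or align each thread's slot to a cache line. The sketch below assumes a 64-byte cache line and a C++17 compiler (so that the vector honours the over-alignment); PaddedCounter and countInParallel are hypothetical names.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kCacheLine = 64;  // assumption: typical cache-line size

// Each counter occupies a full cache line, so two threads never write
// to the same line.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};

std::vector<long> countInParallel(unsigned nThreads, long itersPerThread)
{
    std::vector<PaddedCounter> counters(nThreads);  // one slot per thread
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < nThreads; ++t)
        workers.emplace_back([&counters, t, itersPerThread] {
            for (long i = 0; i < itersPerThread; ++i)
                ++counters[t].value;                // touches only this thread's line
        });
    for (auto& w : workers) w.join();

    std::vector<long> totals;
    for (const auto& c : counters) totals.push_back(c.value);
    return totals;
}
```

For the matrix in the question, the same idea means giving each thread a contiguous range of rows rather than interleaved cells, so that writes from different threads rarely land on the same cache line.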

On the weird 0 ID part: I thought the posted code was just a demonstration, but you may have it like that literally.

You must join all the threads before leaving the function create_threads_for_parsing, because you currently pass the threads a pointer to a local variable in it.

Worse, the variable is shared, so you have a race condition on it. You could do something like:

static const long int ids[] = {0, 1, 2, 3, 4};

and pass a pointer to the proper cell in the loop, for example (void*)&ids[i].

Licensed under: CC-BY-SA with attribution