Question

I'm writing a straightforward C program on Linux and want to use an existing library's API, which expects its data to come from a file. I must give it a file name as a const char*. But I already have the data, identical to the content of a file, sitting in a buffer allocated on the heap. There is plenty of RAM and we want high performance. I'd like to avoid writing a temporary file to disk, so what is a good way to feed the data to this API in a way that looks like a file?

Here's a cheap pretend version of my code:

marvelouslibrary.h:

int marvelousfunction(const char *filename);

normal-persons-usage.cpp, the usage for which the library was originally designed:

#include "marvelouslibrary.h"
int somefunction(char *somefilename)
{
    return marvelousfunction(somefilename);
}

myprogram.cpp:

#include "marvelouslibrary.h"
int one_of_my_routines() 
{
    unsigned char *stuff = new unsigned char[1000000];
    // fill stuff[] with...stuff!
    // stuff[] holds same bytes as might be found in a file

    /* magic goes here: make filename referring to stuff[] */

    return marvelousfunction( ??? );
}

To be clear, the marvelouslibrary does not offer any API functions that accept data by pointer; it can only read a file.

I thought of pipes and mkfifo(), but those seem meant for communicating between processes. I am no expert at these things. Does a named pipe work okay when it is read and written in the same process? Is this a wise approach?

Or should I skip being clever and go with plan "B", which is to shuddup and just write a temp file? However, I'd like to learn something new and find out what's possible in this situation, besides getting high performance.


Solution

Given that you likely have a function like:

char *read_data(const char *fileName)

I think you will need to "skip being clever, go with plan "B" which is to shuddup and just write a temp file."

If you can dig around and find out whether the call you are making calls another function that takes a FILE * or an int file descriptor, then you can do something better.
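As a rough sketch of that "something better": if the library did turn out to have an entry point taking a FILE *, POSIX fmemopen() can wrap the heap buffer in a stream with no file involved. The name marvelous_from_stream() below is purely hypothetical (nothing in the question says such a function exists) and is implemented here as a stand-in that just counts bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* HYPOTHETICAL stand-in for a library entry point that takes a FILE *.
   The real library may have nothing like this; here it only counts bytes. */
static long marvelous_from_stream(FILE *fp)
{
    long count = 0;
    while (fgetc(fp) != EOF)
        count++;
    return count;
}

/* Wrap an in-memory buffer in a read-only stream and hand it to the
   stream-based function. Returns -1 if the stream cannot be created. */
long feed_buffer_as_stream(const void *data, size_t size)
{
    assert(data != NULL && size > 0);

    /* fmemopen() takes a non-const pointer; mode "r" never writes to it. */
    FILE *fp = fmemopen((void *)data, size, "r");
    if (fp == NULL)
        return -1;

    long result = marvelous_from_stream(fp);
    fclose(fp);
    return result;
}
```

This only helps if a FILE *-based function is actually reachable; with nothing but a const char * path to work with, it does not apply.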

One thought that does come to mind: can you change your code to write to a memory-mapped file instead of to the heap? That way you would already have a file on disk, you would avoid the copying (though the data will still be on disk), and you can still give the function call the file name.
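A minimal sketch of the memory-mapped idea, under some assumptions: the temp file lives in /tmp, fill_with_stuff() is a made-up placeholder for however the data really gets produced, and marvelousfunction() is replaced by a stand-in that just returns the number of bytes it could read from the named file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Stand-in for the real library call (assumption): reads the named file
   and returns how many bytes it found, or -1 if it cannot be opened. */
static int marvelousfunction(const char *filename)
{
    FILE *fp = fopen(filename, "rb");
    if (fp == NULL)
        return -1;
    char chunk[4096];
    size_t n;
    int total = 0;
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (int)n;
    fclose(fp);
    return total;
}

/* Hypothetical placeholder for "fill stuff[] with...stuff!". */
static void fill_with_stuff(unsigned char *stuff, size_t size)
{
    memset(stuff, 'x', size);
}

/* Build the data directly inside a mapped file, then pass the file's name. */
int produce_in_mapped_file_and_feed(size_t size)
{
    assert(size > 0);                       /* mmap() rejects a zero length */

    char path[] = "/tmp/marvelous-XXXXXX";  /* assumed location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    int result = -1;
    if (ftruncate(fd, (off_t)size) == 0) {  /* give the file its final size */
        void *stuff = mmap(NULL, size, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        if (stuff != MAP_FAILED) {
            fill_with_stuff(stuff, size);   /* writes land in the file's pages */
            munmap(stuff, size);
            result = marvelousfunction(path);
        }
    }
    close(fd);
    unlink(path);
    return result;
}
```

The data is written once, straight into the file's pages, so there is no separate copy from a heap buffer into the file.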

OTHER TIPS

I'm not sure what kind of input the library function wants ... does it need a path/file name, or open file pointer, or open file descriptor?

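If you don't want to hack the library and the function wants a string (path to a file), try making the temporary file in /dev/shm. On Linux that directory is normally a RAM-backed tmpfs, so the file does not have to touch the disk.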
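Roughly, the /dev/shm route could look like the sketch below. The file name template and the fallback to /tmp are my own choices, and marvelousfunction() is replaced by a stand-in that returns the number of bytes it read, so the example can run without the real library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Stand-in for the real library call (assumption): reads the named file
   and returns how many bytes it found, or -1 if it cannot be opened. */
static int marvelousfunction(const char *filename)
{
    FILE *fp = fopen(filename, "rb");
    if (fp == NULL)
        return -1;
    char chunk[4096];
    size_t n;
    int total = 0;
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (int)n;
    fclose(fp);
    return total;
}

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *p, size_t left)
{
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return 0;
}

/* Copy the heap buffer into a temp file in /dev/shm and pass its name. */
int feed_buffer_via_shm(const void *data, size_t size)
{
    assert(data != NULL);

    char path[64];
    strcpy(path, "/dev/shm/marvelous-XXXXXX");
    int fd = mkstemp(path);
    if (fd < 0) {                       /* no usable /dev/shm: fall back */
        strcpy(path, "/tmp/marvelous-XXXXXX");
        fd = mkstemp(path);
        if (fd < 0)
            return -1;
    }

    int result = -1;
    int ok = (write_all(fd, data, size) == 0);
    if (close(fd) == 0 && ok)
        result = marvelousfunction(path);

    unlink(path);                       /* remove the name once we are done */
    return result;
}
```

The cost is one extra memory-to-memory copy of the buffer, which for a megabyte or so should be negligible.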

Otherwise, mmap() might be the best option. Be sure to research posix_madvise() when using mmap() (or its counterpart posix_fadvise() if using a temporary file).

It looks like you're talking about very little data to begin with, so I don't think you'll see a performance impact whichever route you take.

Edit

Sorry, I just re-read your question .. perhaps I just read too fast. There is no way you are going to feed a function like:

char * foo(const char *filepath)

... with mmap().

If you cannot modify the library to accept a file descriptor instead (or as an alternative to the path) .. just use /dev/shm and a temporary file; it will be quite cheap.

You're on Linux; can't you just grab the source of the library and hack in the function you need? If it's useful to others, you could even send a patch to the original author, so it will be in future versions for everyone.

Edit: Sorry. Just read the question. With my advice below, you fork a spare process, so the question of "does it work in a single process" does not come up. I also see no reason you couldn't spawn a separate thread to do the push...


Not in the least elegant, but you could:

  1. open a named pipe.
  2. fork a streamer that does nothing but try to write to the pipe
  3. pass the name of the pipe

which should be pretty robust...
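The three steps above might be sketched like this. The temp directory under /tmp and the helper name are my own inventions, and marvelousfunction() is replaced by a stand-in that returns the number of bytes it read, so treat it as an illustration rather than a drop-in.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the real library call (assumption): reads the named file
   and returns how many bytes it found, or -1 if it cannot be opened. */
static int marvelousfunction(const char *filename)
{
    FILE *fp = fopen(filename, "rb");
    if (fp == NULL)
        return -1;
    char chunk[4096];
    size_t n;
    int total = 0;
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (int)n;
    fclose(fp);
    return total;
}

int feed_buffer_via_fifo(const void *data, size_t size)
{
    assert(data != NULL);

    /* 1. open a named pipe, inside a private temp directory */
    char dir[] = "/tmp/marvelous-XXXXXX";
    if (mkdtemp(dir) == NULL)
        return -1;
    char path[sizeof dir + 8];
    snprintf(path, sizeof path, "%s/fifo", dir);
    if (mkfifo(path, 0600) != 0) {
        rmdir(dir);
        return -1;
    }

    /* 2. fork a streamer that does nothing but write to the pipe */
    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        rmdir(dir);
        return -1;
    }
    if (pid == 0) {
        int fd = open(path, O_WRONLY);  /* blocks until the reader opens */
        if (fd < 0)
            _exit(1);
        const char *p = data;
        size_t left = size;
        while (left > 0) {
            ssize_t n = write(fd, p, left);
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                _exit(1);
            }
            p += n;
            left -= (size_t)n;
        }
        close(fd);
        _exit(0);
    }

    /* 3. pass the name of the pipe */
    int result = marvelousfunction(path);

    /* If the library never opened the pipe, the child is still blocked in
       open(); signal it so waitpid() cannot hang. Harmless if it finished. */
    kill(pid, SIGTERM);
    waitpid(pid, NULL, 0);
    unlink(path);
    rmdir(dir);
    return result;
}
```

One caveat: a FIFO cannot be seeked, so this breaks down if the library calls fseek(), relies on stat() for the file size, or opens the file more than once.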

mmap(), perhaps?

Licensed under: CC-BY-SA with attribution