Question

I am trying to create a data buffer, more specifically an image buffer, which will be shared among multiple modules. Those modules only read from the buffer and don't communicate with each other at all. My difficulties are:

1. Large data size:

Each image is larger than 10 MB, so copying the data around for different threads is not desirable.

2. I don't want the memory to grow wild:

As new data continuously comes in (in real time), the oldest data must be deleted once all the modules have finished using it.

However, to make things even more complex, the modules that consume the data run at different paces: some are faster, some are slower, some need more data (multiple images) to produce a result, and some need less (only one image).

I have been thinking about using shared_ptr to solve the first problem: create a queue of Boost shared_ptrs, each pointing to an image (a char array), then pass a subset of those pointers to each module.

I am a total newbie when it comes to smart pointers. What would be a good solution for this problem?

Thanks.


Solution

Assuming you hand the shared_ptrs to the modules as soon as the buffer is created, they are a good fit. You don't even need to store them centrally in that case.

It gets more complicated, however, if you create the buffers at one point and the modules only request them later.
In that case you have to figure out what behaviour you want.
Do you want to hold the buffers for some fixed time? Until at least one module has used them? Or until new data comes in?

Integration of a comment:
Since you want all your readers/modules to handle all incoming data, you can simply give each of them an input queue. When data comes in, hand every module a shared_ptr/shared_array to the new buffer and let the module add it to its queue.
Remember to handle the multi-threading issues for the queue access, though.

OTHER TIPS

A Boost shared pointer is exactly what I was going to suggest. Yes, let the pointer class do the work for you.

Note that you will want to use boost::shared_array instead of shared_ptr if you are storing array pointers.

The shared_array class template stores a pointer to a dynamically allocated array. (Dynamically allocated arrays are allocated with the C++ new[] expression.) The object pointed to is guaranteed to be deleted when the last shared_array pointing to it is destroyed or reset.
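
To make that guarantee concrete, here is a minimal sketch. It uses std::shared_ptr with a delete[] deleter as a stand-in for boost::shared_array so that it compiles without Boost; the names make_image, g_freed_images and lifetime_demo are made up for illustration, and the 16-byte buffer stands in for a real image.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical counter: how many image buffers have been freed so far.
static int g_freed_images = 0;

using ImagePtr = std::shared_ptr<char>;

// Allocates a zeroed buffer with new[] and attaches a deleter that calls
// delete[], which is the job boost::shared_array does for you.
ImagePtr make_image(std::size_t bytes) {
    return ImagePtr(new char[bytes](), [](char* p) {
        delete[] p;
        ++g_freed_images;
    });
}

// Returns the number of buffers freed once every owner has let go.
int lifetime_demo() {
    g_freed_images = 0;
    ImagePtr producer = make_image(16);
    ImagePtr module_a = producer;   // copies the pointer, not the pixels
    ImagePtr module_b = producer;
    assert(producer.use_count() == 3);

    producer.reset();
    module_a.reset();
    assert(g_freed_images == 0);    // module_b still owns the buffer

    module_b.reset();               // last owner gone: delete[] runs here
    return g_freed_images;
}
```

Calling lifetime_demo() should report exactly one freed buffer, and only after the last of the three owners has reset its pointer.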

According to your requirements, I think you could use two principles:

  • shared_array<char>, which handles the memory management; its reference counting is safe to use from several threads, although it does not synchronize access to the image data itself
  • one queue per module: this is necessary since each module deals with the images at its own pace

Then, as soon as you get an image, you allocate it on the heap in a shared_array<char>. This pointer is then replicated in all the queues.

Each queue individually requires synchronization. It's a classic producer/consumer problem though, so you'll probably program it (quite) easily, especially since each queue has only ONE producer (the thread which receives the image) and ONE consumer.

Let's have an example: take 3 modules, one fast, one medium, and one that uses the images three at a time.

=> receiving image 1
module a: ['1'] -> processing (none)
module b: ['1'] -> processing (none)
module c: ['1'] -> processing (none)

=> modules a and b start treatment of '1'
module a: [] -> processing '1'
module b: [] -> processing '1'
module c: ['1'] -> processing (none)

=> receiving image 2
module a: ['2'] -> processing '1'
module b: ['2'] -> processing '1'
module c: ['2', '1'] -> processing (none)

=> module a finishes treatment of '1', starts treatment of '2'
module a: [] -> processing '2'
module b: ['2'] -> processing '1'
module c: ['2', '1'] -> processing (none)

=> receiving image 3
module a: ['3'] -> processing '2'
module b: ['3', '2'] -> processing '1'
module c: ['3', '2', '1'] -> processing (none)

=> module c starts treatment of '1', '2' and '3'
module a: ['3'] -> processing '2'
module b: ['3', '2'] -> processing '1'
module c: [] -> processing '1', '2' and '3'

=> module a finishes treatment of '2', starts treatment of '3'
=> module b finishes treatment of '1', starts treatment of '2'
=> module c finishes treatment of '1' and '2', keeps '3' for future batch
module a: [] -> processing '3'
module b: ['3'] -> processing '2'
module c: [] -> processing '3' (waiting)

--> at this point '1' is deleted from memory

You can even make this 'easy' if each module (thread) registers its queue in a 'pool'.
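
The per-module queues and the point at which an image is freed can be sketched as follows. This is a single-threaded illustration only (no locking), with std::shared_ptr standing in for the Boost pointer; Module, publish and fan_out_demo are hypothetical names, and the vector of modules plays the role of the 'pool'.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <vector>

using Image = std::vector<char>;
using ImagePtr = std::shared_ptr<const Image>;

struct Module {
    std::deque<ImagePtr> queue;   // pending images, oldest at the front
};

// Puts the same pointer into every registered queue; pixels are not copied.
void publish(std::vector<Module>& pool, ImagePtr image) {
    for (Module& m : pool) m.queue.push_back(image);
}

// Returns true if image '1' is freed only after all three modules drop it.
bool fan_out_demo() {
    std::vector<Module> pool(3);          // modules a, b and c
    std::weak_ptr<const Image> watch_1;   // observes image 1 without owning it
    {
        ImagePtr img1 = std::make_shared<Image>(16, '1');
        watch_1 = img1;
        publish(pool, img1);
    }                                     // the producer's own reference ends here
    if (watch_1.use_count() != 3) return false;

    pool[0].queue.pop_front();            // module a is done with '1'
    pool[1].queue.pop_front();            // module b is done with '1'
    if (watch_1.expired()) return false;  // module c still holds it
    pool[2].queue.pop_front();            // module c is done: buffer is freed
    return watch_1.expired();
}
```

The weak_ptr is only there so the sketch can observe the deletion; a real producer would not need it.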

I would also advise signalling: I have always thought it better for the producer to signal that a new item has been inserted (if the queue was empty) than to have the consumer thread constantly polling the queue.
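
One way to get signalling instead of polling is a condition variable. The sketch below uses the C++11 standard library rather than Boost; ImageQueue and signalling_demo are hypothetical names, and using a null pointer as an end-of-stream marker is an assumption of this example, not something the answers above prescribe.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using ImagePtr = std::shared_ptr<const std::vector<char>>;

class ImageQueue {
public:
    // Called by the single producer: enqueue, then wake the consumer.
    void push(ImagePtr image) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(image));
        }
        ready_.notify_one();
    }
    // Called by the single consumer: sleeps until an item is available.
    ImagePtr pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        ImagePtr image = std::move(items_.front());
        items_.pop_front();
        return image;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<ImagePtr> items_;
};

// Runs one producer and one consumer; returns how many images were consumed.
int signalling_demo(int image_count) {
    ImageQueue queue;
    int consumed = 0;
    std::thread consumer([&] {
        // A null pointer marks the end of the stream (assumption of this sketch).
        while (ImagePtr image = queue.pop()) ++consumed;
    });
    for (int i = 0; i < image_count; ++i)
        queue.push(std::make_shared<std::vector<char>>(16, 'x'));
    queue.push(nullptr);
    consumer.join();
    return consumed;
}
```

The consumer thread spends its idle time blocked in wait() rather than spinning on the queue. On some toolchains the program needs to be linked with thread support (for example -pthread).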

1. Large data size:

You are correct in choosing to store the image data in heap allocated buffers, and then passing pointers to them between your processing modules.

2. I don't want the memory to grow wild:

You don't have to use a queue for memory management if you are using shared_ptr. Design each module to create or accept a shared_ptr when it needs access to the data, and to release it (let it go out of scope, or call reset()) when it is done. The point of shared_ptr is that the heap memory it owns is deleted when there are no more references to it.

Save each image to a file; you could then use POSIX file mapping to map the file into memory, one mapping per image. Once mapped, the image can be used efficiently as shared memory, even across multiple processes.

Btw: does your system support POSIX file mapping, e.g. mmap on Linux?
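
A rough sketch of the file-mapping idea, assuming a POSIX system. The function name mmap_demo and the /tmp path template are made up for illustration, and both the writer and the reader live in one process here just to keep the example short; in practice other processes would open and map the same file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Stores `bytes` copies of `fill` in a temporary file through a writable
// mapping, maps the file again read-only, and returns the last mapped byte
// (or -1 on any failure). `bytes` must be greater than zero.
int mmap_demo(std::size_t bytes, char fill) {
    char path[] = "/tmp/image_XXXXXX";   // hypothetical location
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);   // the file goes away once fd and mappings are released

    int result = -1;
    if (ftruncate(fd, static_cast<off_t>(bytes)) == 0) {
        // Producer side: write the "image" through a shared mapping.
        void* w = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (w != MAP_FAILED) {
            std::memset(w, fill, bytes);
            munmap(w, bytes);
            // Reader side: map the same file read-only, without copying it.
            void* r = mmap(nullptr, bytes, PROT_READ, MAP_SHARED, fd, 0);
            if (r != MAP_FAILED) {
                result = static_cast<const char*>(r)[bytes - 1];
                munmap(r, bytes);
            }
        }
    }
    close(fd);
    return result;
}
```

With this approach you still have to decide when a file can be deleted, so the lifetime question from the original post does not go away; it moves from heap buffers to files.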

Use boost::shared_array as the data container (John's suggestion), and boost::circular_buffer as the input queue in your modules.

// const applies to the pixels, not to the element: container elements must stay assignable
boost::circular_buffer< boost::shared_array<const char> > input_queue_;

As images are shared you should not modify them but make a new copy when needed.
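
A small sketch of that rule, using std::shared_ptr to a const vector as a stand-in for the Boost types. SharedImage, inverted_copy and copy_demo are hypothetical names, and the inversion is just a placeholder for whatever modification a module needs.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using Image = std::vector<char>;
// Pointer-to-const: modules can read the shared pixels but not write them.
using SharedImage = std::shared_ptr<const Image>;

// Hypothetical module step: returns a private, bit-inverted copy of the image.
Image inverted_copy(const SharedImage& shared) {
    Image copy(*shared);   // deep copy, owned by the calling module only
    for (char& px : copy) px = static_cast<char>(~px);
    return copy;
}

// Returns true if the shared image is unchanged and the copy is inverted.
bool copy_demo() {
    SharedImage shared = std::make_shared<Image>(4, static_cast<char>(0x0F));
    Image mine = inverted_copy(shared);
    return (*shared)[0] == static_cast<char>(0x0F) &&
           mine[0] == static_cast<char>(0xF0);
}
```

Making the pointee const lets the compiler reject accidental writes to the shared buffer, so the copy becomes an explicit step.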

Licensed under: CC-BY-SA with attribution