Question

I need to create a library for something like a client server interaction. The data requests are not frequent and mostly only a few bytes (ints, longs). Very few requests for user defined structs. I have looked at some similar questions (these and these) but didnt find the answer for what I have asked here since I am primarily concerned about data being generic.

The problem in design I have is wrt the sharing of data between threads: My library has 2 threads (pthreads) with one being a msg handler (event loop). The msg handler loop gets responses from a server and needs to pass data to the other thread. The data is generic - as in could be user defined struct or int or long. Also, the same type of response could be received more than once (each response is uniquely identified) which means a single global variable per data type will not do. The other thread is waiting on a conditional variable and will be woken up by the msg handler thread. The conditional variable is not on a boolean var but instead to check if the msg handler has added the requested info to some "known location". So, how to achieve this data sharing between threads?

I considered some of the below approaches:

  1. Global generic list - with void *data...The handler thread allocates memory upon receiving the server's response and signals the other thread. The first thread wakes up and walks the list looking for a particular id. When found, uses the data and then frees the memory.

    There's no need to define different msg types (for int, struct, long etc) but the walk will be comparatively long.

  2. Global array of lists - An array of list head pointers for each data type. Allocation and free are same as above. Except, the array is indexed by type of data (an enum).

    The walk is comparatively short but needs defining different msg_types for each list.

  3. Im also concerned with dynamically allocating small data like int which causes fragmentation. So tried allocating an array with no dynamic allocation but was not successful on the semantics. (Couldn't find a complex enough example)

    I will invest more time on this If this is the right way. (An example will help!)

  4. Macros for generating per type struct/msg... Just decided this is an overkill(looked very complex in examples)

  5. Shared memory - Using a char * buffer for allocation seemed too awkward..

What is the best design to follow in such a scenario. I hope this is understandable.

Was it helpful?

Solution

This is a producer-consumer scenario, and my preference is to implement a blocking queue.

The event handler pushes messages onto the queue. The worker thread blocks on the queue when it is empty, and wakes up when there is something to process.

The messages can be allocated and freed, but it's usually better if they are drawn from a free list. Messages are simply a union of types, with a type field and an id in common. Nice and simple.

Extending to multiple worker threads and/or new message types is straightforward.

You can easily find pre-written code for a blocking queue of this kind in your language of choice.

[There may be something about your constraints that don't fit this approach, but if so please edit your question to make it clear.]


There are three basic strategies for a variable length blocking queue in C++. Assume that a message is a union of different types of different sizes.

  1. The messages on the queue are fixed in size and contain a pointer to a message. Each message is allocated and freed as needed.
  2. The messages on the queue are fixed in size and contain a pointer to a message. Messages are kept in pools (free lists/arrays) to avoid allocation overhead.
  3. The messages on the queue are variable in size. They consist of a stream of bytes pushed onto the queue, with a length to know how much to pop off.

You can easily find sample code and full working examples by searching for "blocking queue variable length C++" or similar.

OTHER TIPS

My library has 2 threads

The data requests are not frequent

These are deciding factors: it means you won't have throughput related issues, like memory allocation pressure, and you don't need a complex signaling system.

  1. You can use a simple list to store your items, and malloc/free structures as needed
  2. Having only two threads, it's a 1 producer / 1 consumer setup, which means that you can simply use a guard cell in your list to effectively split usage between consumer and producer without having to mutex it.

However, if you think that these factors may change in the future, you will need to make sure that the code accessing your list data structure is well delineated from the rest (ie, make an API), to leave you room to modify that list usage later on (say by including mutexes, or including message key indexing).

Regarding the data structure: you need your code to know what type of data it has to handle, which means that the data being passed on must contain a type tag. The easiest method is to make a union type of all the datatype you will have to process, and prepend it with a enum value tagging the data:

typedef enum {
   TYPE1,
   TYPE2,
   TYPE3
   // etc.
} data_tag;

typedef union {
   struct type1 {/* .. */};
   struct type2 {/* .. */};
   struct type3 {/* .. */};
   // etc.
} data_content;


typedef struct {
   data_tag tag;
   data_content value;
} message;

Then, within your code, simply use a switch on the tag field to discriminate use cases.

A possible solution for memory handling is to use so called memory pools instead of a simple malloc scheme. Basically, you allocate in advance many instances of each of the data_content sub types, and construct your messages around them.

typedef struct {
   data_tag tag;
   data_content *value;
};

If you are careful (ie. you construct your values properly, and switch on them appropriately) , the fact that you are actually pointing to a single datatype rather than an union shouldn't matter, and should let you avoid cluttering your code with many type casts.

For small memory pressure, you probably don't need to make it too complicated, but you could add in a mechanism to expand these pools in case of strong demand, by implementing them as a list of array of structs.

If you are really concerned about code complexity, consider using a language which support memory recollection built in (say, Erlang, OCaml, Go, Python, etc.). Additionally, many functional languages (like OCaml) support so called algebraic datatypes, easily allowing the implementation of the above message type in the language.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top