Original notes 2013-05-05
In a comment, I noted:
You have wrap control so that when the writer reaches the end of the array, the next entry is placed at the start, but you do not have any over-filling control, so if the producers produce faster than the consumers consume, the producers overwrite unread data. You need to ensure that the array Fifo
is not over-filled.
Nick Rosencrantz observed:
You're right, increasing the FIFO size to 100 fixes the bug.
Actually, increasing the FIFO size does not fix the bug; it simply avoids the bug for longer. You need to keep track of whether the write pointer (index) is going to catch up with the read pointer, and either not add or delay adding a new number until one of the consumers has read the number at the read pointer so that there is space once more.
Actual testing of the code
The code in the question is the code from the exercise verbatim. I'm not sure what the equipment described is, but the screen shots suggest Windows is involved. I have a Mac. Superficially, that makes testing the code hard — one of the files is assembler. However, the code can easily be supplemented by Unix (POSIX pthread) implementations of the primitives. Actually, the lab has been well designed; it was very easy to do the simulation. However, you must take any results I report with an appropriate pinch of salt — a Mac is a very different machine.
I prefer functions declared before they're defined or used.
#include <unistd.h>
extern void fatal_error(char * msg);
extern void Sleep (int n);
extern void Signal(int *sem);
extern void Wait(int *sem);
extern void PutFifo(int tal);
extern int GetFifo(void);
extern void Producer(struct Prod * prodstruct);
extern void Consumer(int * tal);
The while (1);
loop in fatal_error()
is a busy wait; I'd rather use pause()
. However, it was never actually exercised, so maybe it doesn't matter.
The necessary oslab_*()
primitives can be simulated with POSIX pthreads trivially:
/* pthread implementation */
#include <pthread.h>
#include <time.h>
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
void oslab_begin_critical_region(void)
{
pthread_mutex_lock(&mtx);
}
void oslab_end_critical_region(void)
{
pthread_mutex_unlock(&mtx);
}
int oslab_create_thread(int (*thread_function)(void *), void *data, unsigned int *stack)
{
typedef void *(*ThreadMain)(void *);
static int threadnum = 0;
pthread_t pth;
pthread_attr_t pat;
pthread_attr_init(&pat);
pthread_attr_setdetachstate(&pat, PTHREAD_CREATE_DETACHED);
if (pthread_create(&pth, &pat, (ThreadMain)thread_function, data) != 0)
{
char buffer[128];
sprintf(buffer, "Failed to create thread with stack %p\n", stack);
fatal_error(buffer);
}
return ++threadnum;
}
void oslab_idle(void)
{
pause();
}
void oslab_yield(void)
{
struct timespec rqtp = { .tv_sec = 0, .tv_nsec = 1000000 }; // 1 millisecond
if (nanosleep(&rqtp, 0) != 0)
fatal_error("nanosleep failed\n");
}
/* end pthread implementation */
All the messages are in the irritating format "\nMessage"
; that is a screwball format. Rewrite them all so the newline is at the end, where the 'end of line' character always should be.
When the debug printing is enabled, the lines should be forced out. I used fflush(0)
(equivalent, in this context, to fflush(stdout)
) after the debugging print statements. An alternative (better) way to do that would be to call setvbuf()
in main()
to set line buffering.
char buffer[BUFSIZ];
setvbuf(stdout, buffer, _IOLBF, BUFSIZ);
Results
With these changes in place (leaving the synchronization primitives — calls to Wait()
, Signal()
and oslab_yield()
— commented out), all hell breaks loose:
System starting...
Producer 1 is created with thread-ID 1
Producer 2 is created with thread-ID 2
Producer 3 is created with thread-ID 3
Consumer 1 is created with thread-ID 4
Consumer 2 is created with thread-ID 5
Consumer 1 gets Prime 0
Next Prime from producer 1 is 2003
Consumer 2 gets Prime 0
Next Prime from producer 2 is 5003
Next Prime from producer 3 is 8009
Consumer 1 gets Prime 0
Next Prime from producer 1 is 2011
Consumer 2 gets Prime 0
Next Prime from producer 2 is 5009
Consumer 1 gets Prime 0
Next Prime from producer 1 is 2017
Consumer 2 gets Prime 0
Next Prime from producer 3 is 8011
Next Prime from producer 2 is 5011
Consumer 1 gets Prime 0
Next Prime from producer 1 is 2027
Consumer 2 gets Prime 0
Consumer 1 gets Prime 0
Next Prime from producer 3 is 8017
Next Prime from producer 2 is 5021
Next Prime from producer 1 is 2029
Consumer 2 gets Prime 0
Consumer 1 gets Prime 2003
Consumer 2 gets Prime 2029
Next Prime from producer 1 is 2039
Next Prime from producer 3 is 8039
Next Prime from producer 2 is 5023
Consumer 1 gets Prime 8009
Consumer 2 gets Prime 2011
However, if you enable the synchronization primitives (but leave the debug code disabled), then you get sane behaviour: the consumer do not try to read the queue before there is data in the queue, and the writers do not try to write to the queue when there isn't space in the queue.
System starting...
Producer 1 is created with thread-ID 1
Producer 2 is created with thread-ID 2
Producer 3 is created with thread-ID 3
Consumer 1 is created with thread-ID 4
Consumer 2 is created with thread-ID 5
Next Prime from producer 1 is 2003
Next Prime from producer 2 is 5003
Next Prime from producer 3 is 8009
Consumer 1 gets Prime 2003
Consumer 2 gets Prime 5003
Next Prime from producer 1 is 2011
Next Prime from producer 3 is 8011
Next Prime from producer 2 is 5009
Next Prime from producer 2 is 5011
Consumer 1 gets Prime 8009
Consumer 2 gets Prime 2011
Next Prime from producer 1 is 2017
Next Prime from producer 3 is 8017
Consumer 1 gets Prime 8011
Consumer 2 gets Prime 5009
Next Prime from producer 2 is 5021
Next Prime from producer 1 is 2027
Next Prime from producer 3 is 8039
Next Prime from producer 2 is 5023
Consumer 1 gets Prime 5011
Consumer 2 gets Prime 2017
Next Prime from producer 3 is 8053
Next Prime from producer 1 is 2029
Next Prime from producer 2 is 5039
This is to be expected. If you have true multi-core threading (Intel Core i7 does), then without the synchronization, you get all sorts of odd-ball behaviour. With the synchronization, everything is calm. I've let the code run free with the output going to file. When the results are analyzed, you see one appearance of each of the primes 2003..4999, two appearances of each of the primes 5003..7993, and three appearances of each of the primes from 8009 upwards, which is what you'd expect.
If you enable the debugging code, you get to see more output:
System starting...
Producer 1 is created with thread-ID 1
Next Prime from producer 1 is 2003
Producer 2 is created with thread-ID 2
PutFifo: 2003
wraddr = 0
Producer 3 is created with thread-ID 3
Consumer 1 is created with thread-ID 4
GetFifo: 2003
rdaddr = 0
Consumer 2 is created with thread-ID 5
Next Prime from producer 2 is 5003
Next Prime from producer 3 is 8009
Consumer 1 gets Prime 2003
PutFifo: 5003
wraddr = 1
GetFifo: 5003
rdaddr = 1
Consumer 1 gets Prime 5003
Next Prime from producer 1 is 2011
PutFifo: 2011
wraddr = 2
GetFifo: 2011
rdaddr = 2
Consumer 2 gets Prime 2011
PutFifo: 8009
wraddr = 3
Next Prime from producer 2 is 5009
PutFifo: 5009
wraddr = 4
GetFifo: 8009
rdaddr = 3
Consumer 1 gets Prime 8009
Next Prime from producer 3 is 8011
GetFifo: 5009
PutFifo: 8011
rdaddr = 4
Next Prime from producer 1 is 2017
wraddr = 5
Consumer 2 gets Prime 5009
Next Prime from producer 2 is 5011
PutFifo: 5011
wraddr = 6
PutFifo: 2017
wraddr = 7
GetFifo: 8011
rdaddr = 5
Consumer 2 gets Prime 8011
GetFifo: 5011
rdaddr = 6
Next Prime from producer 3 is 8017
Consumer 1 gets Prime 5011
PutFifo: 8017
wraddr = 8
Next Prime from producer 2 is 5021
PutFifo: 5021
wraddr = 9
Next Prime from producer 1 is 2027
PutFifo: 2027
wraddr = 0
GetFifo: 2017
rdaddr = 7
Consumer 1 gets Prime 2017
This shows wraddr
wrapping from 9 to 0. It is otherwise verbose and unexciting.
Running GDB
It is not clear that you would be able to run GDB on the program in the original environment. You're using home-brew threading (if you're using the original oslab_lowlevel_c.c
and oslab_lowlevel_asm.s
source), and GDB won't be aware of the multi-threading.
With the POSIX threading I've used, it would be possible to debug the code using GDB.