You have several options, but I would do it like this:
- have two buffers for storing the texture as a byte array, say 0 and 1,
- do the texture computation in one buffer,
- when it's done, signal it by setting a volatile int updated_buffer to the index of the buffer that was just updated, then compute the next texture in the other buffer,
- have the other thread read updated_buffer periodically and keep a copy of its latest value. When that copy and updated_buffer differ, update the copy and upload the texture to memory.
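A minimal C++ sketch of the scheme above, under a few assumptions that are mine and not from the question: the names TextureExchange, back_buffer, publish and poll are hypothetical, and std::atomic<int> stands in for the volatile int, because in C++ volatile alone does not order the buffer writes relative to the index write across threads. The actual texture upload call is left out.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical double-buffered exchange for exactly one producer thread
// and one consumer thread.
struct TextureExchange {
    std::array<std::vector<std::uint8_t>, 2> buffers;
    std::atomic<int> updated_buffer{-1};  // -1 means nothing published yet
    int write_index = 0;                  // touched by the producer only
    int last_seen = -1;                   // touched by the consumer only

    explicit TextureExchange(std::size_t bytes) {
        assert(bytes > 0);
        for (auto& b : buffers) b.assign(bytes, 0);
    }

    // Producer: the buffer to compute the next texture into.
    std::vector<std::uint8_t>& back_buffer() { return buffers[write_index]; }

    // Producer: signal that back_buffer() is complete, then switch buffers.
    // The release store makes the pixel writes visible before the index.
    void publish() {
        updated_buffer.store(write_index, std::memory_order_release);
        write_index = 1 - write_index;
    }

    // Consumer: returns the newly finished buffer, or nullptr if the index
    // has not changed since the last call.
    const std::vector<std::uint8_t>* poll() {
        int current = updated_buffer.load(std::memory_order_acquire);
        if (current == last_seen) return nullptr;
        last_seen = current;
        return &buffers[current];
    }
};
```

The producer fills back_buffer() and calls publish(); the consumer calls poll() once per frame and uploads whenever it gets a non-null pointer. Nothing here stops the producer from overwriting a buffer that is still being uploaded, so the sketch is only safe while the upload finishes well before the next computation does.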
Note that this solution relies on several things:
1. there are only two threads dealing with the byte array buffers,
2. updated_buffer is only read by the (texture) consumer thread and only written by the producer thread,
3. and most importantly, the texture upload is noticeably faster than the computation.
If #2 or #3 does not hold, you will have to use a stricter synchronization approach on the texture buffers, such as mutexes, to make sure the texture buffers don't get overwritten while they are still being uploaded.
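One way the mutex-based fallback could look, as a rough sketch only: each buffer gets its own lock and a version counter, so the producer blocks instead of overwriting a buffer that is mid-upload. The names LockedBuffer, LockedExchange, produce and consume are hypothetical, and the fill and upload callbacks stand in for your own computation and upload code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical buffer guarded by its own mutex.
struct LockedBuffer {
    std::mutex lock;
    std::vector<std::uint8_t> pixels;
    unsigned version = 0;  // 0 means never written
};

struct LockedExchange {
    std::array<LockedBuffer, 2> buffers;
    int write_index = 0;        // producer only
    unsigned next_version = 0;  // producer only
    unsigned uploaded = 0;      // consumer only: newest version uploaded

    explicit LockedExchange(std::size_t bytes) {
        assert(bytes > 0);
        for (auto& b : buffers) b.pixels.assign(bytes, 0);
    }

    // Producer: fill the back buffer while holding its lock, then switch.
    template <typename Fill>
    void produce(Fill fill) {
        LockedBuffer& b = buffers[write_index];
        {
            std::lock_guard<std::mutex> guard(b.lock);
            fill(b.pixels);
            b.version = ++next_version;
        }
        write_index = 1 - write_index;
    }

    // Consumer: upload any buffer newer than the last upload, holding its
    // lock for the duration. Returns true if something was uploaded.
    // If both buffers are newer, the older one may be uploaded first.
    template <typename Upload>
    bool consume(Upload upload) {
        bool did_upload = false;
        for (LockedBuffer& b : buffers) {
            std::lock_guard<std::mutex> guard(b.lock);
            if (b.version > uploaded) {
                upload(b.pixels);
                uploaded = b.version;
                did_upload = true;
            }
        }
        return did_upload;
    }
};
```

Each thread holds at most one lock at a time, so the two cannot deadlock each other. The cost is that the producer stalls whenever it wants the buffer that is currently being uploaded.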
Finally, your recursive computation could get a slight boost by switching to iteration below a certain block size (say an 8*8 pixel block), instead of recursing all the way down to 1px. In fact, doing it all iteratively should be faster (if done in a single thread on a single core), although that depends a lot on the algorithm for computing the pixels.
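To illustrate the cutoff idea, here is a small sketch that assumes the recursion subdivides a rectangular region of the texture. I don't know your actual pixel algorithm, so shade is a made-up placeholder, and fill_block and kCutoff are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kCutoff = 8;  // assumed block size below which we stop recursing

// Placeholder per-pixel function; substitute the real computation.
inline std::uint8_t shade(int x, int y) {
    return static_cast<std::uint8_t>((x ^ y) & 0xFF);
}

// Fills the w*h block at (x0, y0) of a texture with `stride` bytes per row.
void fill_block(std::vector<std::uint8_t>& tex, int stride,
                int x0, int y0, int w, int h) {
    assert(x0 >= 0 && y0 >= 0 && x0 + w <= stride);
    if (w <= kCutoff && h <= kCutoff) {
        // Small block: plain loops, no further recursion.
        for (int y = y0; y < y0 + h; ++y)
            for (int x = x0; x < x0 + w; ++x)
                tex[static_cast<std::size_t>(y) * stride + x] = shade(x, y);
        return;
    }
    // Large block: split the longer side in half and recurse.
    if (w >= h) {
        int half = w / 2;
        fill_block(tex, stride, x0, y0, half, h);
        fill_block(tex, stride, x0 + half, y0, w - half, h);
    } else {
        int half = h / 2;
        fill_block(tex, stride, x0, y0, w, half);
        fill_block(tex, stride, x0, y0 + half, w, h - half);
    }
}
```

Whether 8 is a good cutoff is something to measure; it depends on how expensive each pixel is compared with the call overhead.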