c# concurrency of struct array

https://stackoverflow.com/questions/10993521

14-06-2021
|

题

Given an array of struct:

public struct Instrument
{
    public double NoS;
    public double Last;
}

var a1 = new Instrument[100];

And a threading task pool that is writing to those elements on the basis that a single element may at most be written to by two threads concurrently, one for each of the double fields (there is upstream queuing by topic effectively).

And the knowledge that double's can be written atomically on 64 bit. (edit this mistakenly said 32 bit originally)

I need to periodically perform a calculation using all the values in the array and I'd like them to be consistent during the calc.

So I can snapshot the array with:

var snapshot = a1.Clone();

Now the question I have is with regards to the specifics of the syncronisation. If I make the members volatile, I don't think that is going to help the clone at all, as the read/write aquire/releases are not at the array level.

Now I could have an array lock, but this will add a lot of contention on the most frequent process of writing data into the array. So not ideal.

Alternatively I could have a per row lock, but that would be a real pain as they'd all need to be aquired prior to clone, meanwhile I've got the writes all backing up.

Now it doesn't really matter if the snapshot doesn't have the very latest value if its a matter of microseconds etc, so I think I could probably get away with just having no lock. My only concern is if there could be a scenario where there isn't a cache writeback for a sustained period. Is this something I should worry about? The writers are in TPL dataflow and the sole logic is to set the two fields in the struct. I don't really know how or if function scope tends to correlate to cache write backs though.

Thoughts/advice?

edit: What about if I used an interlocked write to the variables in the struct?

edit2: The volume of writes is MUCH higher than the reads. There are also two seperate and concurrent services writing to the Nos & Last fields. So they could be being written simultaneously at once. This causes problems with a reference object approach for atomicity.

edit3: Further detail. Assume array is from 30-1000 elements and each element could be being updated multiple times a second.

解决方案

Since Instrument contains two doubles (two 64-bit values), you can't write it atomically (even on 64-bit machines). This means that the Clone method can never make a thread-safe copy without doing some kind of synchronization.

TLDR; Don't use a struct, use an immutable class.

You would probably have more luck with a small redesign. Try using immutable data structures and concurrent collections from the .NET framework. For instance, make your Instrument an immutable class:

// Important: Note that Instrument is now a CLASS!! 
public class Instrument
{
    public Instrument(double nos, double last)
    {
        this.NoS = nos;
        this.Last = last;
    }

    // NOTE: Private setters. Class can't be changed
    // after initialization.
    public double NoS { get; private set; }
    public double Last { get; private set; }
}

This way updating an Instrument means you have to create a new one, which makes it much easier to reason about this. When you are sure that only one thread is working with a single Instrument you are done, since a worker can now safely do this:

Instrument old = a[5];

var newValue = new Instrument(old.NoS + 1, old.Last - 10);

a[5] = newValue;

Since, reference types are 32-bit (or 64-bit on a 64-bit machine) updating the reference is garanteed to be atomic. The clone will now always result in a correct copy (it might lack behind, but that doesn't seem to be a problem for you).

UPDATE

After re-reading your question, I see that I misread it, since one thread is not writing to an Instrument, but is writing to an instrument value, but the solution is practically the same: use immutable reference types. One simple trick for instance, is to change the backing fields of the NoS and Last properties to objects. This makes updating them atomic:

// Instrument can be a struct again.
public struct Instrument
{
    private object nos;
    private object last;

    public double NoS
    {
        get { return (double)(this.nos ?? 0d); }
        set { this.nos = value; }
    }

    public double Last
    {
        get { return (double)(this.last ?? 0d); }
        set { this.last = value; }
    }
}

When changing one of the properties, the value will be boxed, and boxed values are immutable reference types. This way you can safely update those properties.

其他提示

And the knowledge that double's can be written atomically on 32 bit.

No, that is not guaranteed:

12.5 Atomicity of variable references

Reads and writes of the following data types shall be atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list shall also be atomic. Reads and writes of other types, including long, ulong, double, and decimal, as well as user-defined types, need not be atomic.

(emphasis mine)

No guarantee is made regarding doubles on 32-bit, or even on 64-bit. A strcut composed of 2 doubles is even more problematic. You should rethink your strategy.

You could (ab)use a ReaderWriterLockSlim.

Take a read lock when writing (since you say there is no contention between writers). And take a write lock when cloning.

Not sure I'd do this though unless there's really no alternative. Could be confusing for whoever maintains this down the line.

Reads and writes of individual array elements, or individual struct fields, are generally independent. If while one thread is writing a particular field of a particular struct instance, no other thread will attempt to access that same field, an array of structs will be implicitly threadsafe without any locking required beyond the logic that enforces the above conditions.

If it is possible that one thread might try to read a double while another thread is writing it, but it's not possible that two threads might try to write simultaneously, there are a number of approaches you can take to ensure that a read won't see a partially-written value. One which hasn't been mentioned yet would be to define an int64 field, and use custom methods to read and write double values there (bitwise-converting them, and using Interlocked as needed).

Another approach would be to have a changeCount variable for each array slot, which gets incremented so the two LSB's are "10" before anything else before the struct is written, and Interlocked.Increment it by 2 afterward (see note below). Before code reads the struct, it should check whether a write is in progress. If not, it should perform the read and ensure a write hasn't started or happened (if a write occurred after the read was started, loop back to the beginning). If a write is in progress when code wants to read, it should acquire a shared lock, check whether the write is still in progress, and if so use an interlocked operation to set the LSB of changeCount and Monitor.Wait on the lock. The code which wrote the struct should notice in its Interlocked.Increment that the LSB got set, and should Pulse the lock. If the memory model ensures that reads by a single thread will be processed in order, and that writes by a single thread will be processed in order, and if only one thread will ever try to write an array slot at a time, this approach should limit the multi-processor overhead to a single Interlocked operation in the non-contention case. Note that one must carefully study the rules about what is or is not implied by the memory model before using this sort of code, since it can be tricky.

BTW, there are two more approaches one could take if one wanted to have each array element be a class type rather than a struct:

Use an immutable class type, and use `Interlocked.CompareExchange` any time you want to update an element. The pattern to use is this:
```
  MyClass oldVal,newVal;
  do
  {
    oldVal = theArray[subscript];
    newVal = new MyClass(oldVal.this, oldVal.that+5); // Or whatever change
  } while (Threading.Interlocked.CompareExchange(theArray[subscript], newVal, oldVal) != oldVal);
```
This approach will always yield a logically-correct atomic update of the array element. If, between the time the array element is read and the time it is updated, something else changes the value, the `CompareExchange` will leave the array element unaffected, and the code will loop back and try again. This approach works reasonably well in the absence of contention, though every update will require generating a new object instance. If many threads are trying to update the same array slot, however, and the constructor for `MyClass` takes any significant amount of time to execute, it's possible for code to thrash, repeatedly creating new objects and then finding out they're obsolete by the time they could be stored. Code will always make forward progress, but not necessarily quickly.
Use a mutable class, and lock on the class objects any time one wishes to read or write them. This approach would avoid having to create new class object instances any time something is changed, but locking would add some overhead of its own. Note that both reads and writes would have to be locked, whereas the immutable-class approach only required `Interlocked` methods to be used on writes.

I tend to think arrays of structs are nicer data holders than arrays of class objects, but both approaches have advantages.

Ok, so had a think about this over lunch.

I see two, possibly 3 solutions here.

First important note: The immutable idea does not work in my use case because I have two services running in parallel writing to NoS and Last independently. This means that I would need an extra layer of sync logic between those two services to ensure that whilst the new ref is being created by one services, the other one is not doing the same. Classic race condition problem so definitely not right for this problem (although yes I could have a ref for each double and do it that way but its getting ridiculous at that point)

Solution 1 Whole cache level lock. Maybe use a spinlock and just lock for all updates and the snapshot (with memcpy). This is simplest and probably totally fine for volumes I'm talking about.

Solution 2 Make all writes to doubles use interlocked writes. when I want to snapshot, iterate the array and each value using interlocked read to populate the copy. This may cause per struct tearing but the doubles are intact which is fine as this is continuously updating data so the concept of latest is a little abstract.

Solution 3 Don't think this will work, but what about interlocked writes to all doubles, and then just use memcopy. I am not sure if I will get tearing of the doubles though? (remember I don't care about tearing at struct level).

If solution 3 works then I guess its best performance, but otherwise I am more inclined for solution 1.

许可以下： CC-BY-SA 和归因

不隶属于 StackOverflow