Are writes to a MutableByteArray atomic?

Question

I'm guessing this is undefined behavior, and in any case multi-word writes are not atomic, as demonstrated by this quick-and-dirty program:

{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Main where
import Data.Primitive.Types
import Data.Primitive.ByteArray
import Control.Concurrent
import Control.Concurrent.STM
import Control.Applicative
import Control.Monad
import GHC.Prim
main = do
    arr <- newByteArray 16  -- on 64-bit, this is enough for two Ints (8 each)
    inp <- newTVarIO (0::Int)
    o1 <- newEmptyTMVarIO
    o2 <- newEmptyTMVarIO
    let writer last = do
            val <- atomically $ do
                    x <- readTVar inp
                    x <$ check (x > last)
            let v' = (val,val+1)
            writeByteArray arr 0 v'
            atomically $ putTMVar o1 ()
            writer val
        reader last = do
            val <- atomically $ do
                    x <- readTVar inp
                    x <$ check (x > last)
            rVal <- readByteArray arr 0 :: IO (Int,Int)
            let v1 = (val,val+1)
                v0 = (val-1,val)
            when (not $ rVal `elem` [v0,v1,(0,0)]) $ error $ show (val, "got:", rVal)
            atomically $ putTMVar o2 ()
            reader val
    let control :: Int -> IO ()
        control !n = do
            atomically $ writeTVar inp n
            mapM_ (atomically . takeTMVar) [o1,o2]
            when (n<100000) $ control (n+1)
    forkIO $ writer 0
    forkIO $ reader 0
    control 1
    print "done!"

instance Prim (Int,Int) where
    sizeOf# _ = 2# *# (sizeOf# (undefined :: Int))
    alignment# _ = alignment# ( undefined :: Int)
    readByteArray# arr n s = case readByteArray# arr (2# *# n) s of
        (#s',i1 #) -> case readByteArray# arr ((2# *# n) +# 1#) s of
          (#s'2,i2 #) -> (#s'2,(i1,i2)#)
    writeByteArray# arr n (i1,i2) s = case writeByteArray# arr (2# *# n) i1 s of
          s' -> writeByteArray# arr ((2# *# n) +# 1#) i2 s'

building this program with ghc-7.6.3, -O2 -threaded -rtsopts, in 7 executions with -N3 I got the following outcomes:

foo: (4,"got:",(3,5))
foo: (59037,"got:",(59036,59038))
foo: "done!"
foo: (92936,"got:",(92936,92936))
foo: (399,"got:",(398,400))
foo: (7196,"got:",(7195,7197))
foo: (11950,"got:",(11950,11950))

reads/writes of a single machine word are probably atomic so long as the memory model of the CPU architecture makes that guarantee.

One objection to this demo is that the Prim instance for (Int,Int) is bogus. Which it sort-of is. However, given the available primitives, I don't see how to implement anything better for multi-word types.

You'll need to use some other synchronization method to ensure that multi-word writes are atomic. One simple approach is to keep your array in an MVar. Or maybe my kickchan package would help (at least inspirationally, if it doesn't address your use case).