You basically have the right ideas. Here are a few points:
1) The wave file is linear signal vs time, so your understanding of it is correct. (Many audio things are logarithmic, so it's not unreasonable to think it might be non-linear -- e.g., LPs are encoded in a nonlinear way.)
2) If you're going to do math, first convert to float or int32 so you don't overrun the limits of int16.
3) To offset in time, use numpy slicing. That is, something like new = old[1000:]+old[:-1000]. Note that you need to add sections of the same length together, so if you add a time shift, you can't add the shifted copy to the full array, because the shifted copy will be shorter.
4) As for adding with "random time", you can do that with the above for a single random time. To make the time vary continuously throughout the addition, you need to warp your original signal, and that will be more complicated.
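
To make points 2 to 4 concrete, here is a rough sketch of adding a signal to a copy of itself shifted by a single random number of samples. The function name add_delayed_copy, the tiny test array, and the choice to clip back to int16 at the end are my own assumptions for illustration; you might prefer to rescale (e.g. halve the sum) or just keep the result as int32 or float.

```python
import numpy as np

def add_delayed_copy(samples, delay, out_dtype=np.int16):
    """Add a signal to a copy of itself shifted by `delay` samples.

    Hypothetical helper for illustration; the result is `delay` samples
    shorter than the input because only the overlapping parts are added.
    """
    if not 0 < delay < len(samples):
        raise ValueError("delay must be between 1 and len(samples) - 1")
    # Widen first so the sum cannot wrap around the int16 range.
    wide = samples.astype(np.int32)
    # Both slices have length len(samples) - delay, so they add elementwise.
    mixed = wide[delay:] + wide[:-delay]
    # Assumption: clip back into the output range before narrowing again.
    info = np.iinfo(out_dtype)
    return np.clip(mixed, info.min, info.max).astype(out_dtype)

# Made-up int16 samples standing in for data read from a wave file.
old = np.array([30000, -30000, 20000, 10000, -5000, 25000], dtype=np.int16)

# A single random shift, in samples, somewhere inside the array.
rng = np.random.default_rng(0)
delay = int(rng.integers(1, len(old)))
new = add_delayed_copy(old, delay)
```

Note that this only covers the single random shift; making the shift vary continuously over time would still need the warping mentioned in point 4.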