The Mersenne Twister sequence is 2 ** ( 624 * 32 - 1 ) - 1
long, and the seed value is used to set an internal state for the PRNG that directly relates to the position within that sequence.
The easiest-to-find repeat appears to be every 2 ** ( 624 * 32 )
, and can be shown to work like this:
repeat_every = 2 ** ( 624 * 32 )
start_value = 5024214421 # Try any value
r1 = Random.new( start_value )
r2 = Random.new( start_value + repeat_every )
r17 = Random.new( start_value + 17 * repeat_every )
r23 = Random.new( start_value + 23 * repeat_every )
r1.rand == r2.rand
# true
r17.rand == r23.rand
# true
Or try this:
repeat_every = 2 ** ( 624 * 32 )
start_value = 5024214421 # Try any value
r1 = Random.new( start_value )
r2 = Random.new( start_value + repeat_every )
Array.new(10) { r1.rand(100) }
# => [84, 86, 8, 58, 5, 21, 79, 10, 17, 50]
Array.new(10) { r2.rand(100) }
# => [84, 86, 8, 58, 5, 21, 79, 10, 17, 50]
The repeat value relates to how Mersenne Twister works. The internal state of MT is an array of 624 32-bit unsigned integers. The Ruby source code you linked packs a Ruby Fixnum into an array - the magic command is
rb_integer_pack( seed, buf, len, sizeof(uint32_t), 0,
INTEGER_PACK_LSWORD_FIRST|INTEGER_PACK_NATIVE_BYTE_ORDER );
however, this isn't something easy to play with, it is defined in internal.h
, so only really accessible if you work on Ruby interpreter itself. You cannot access this function from within a normal C extension.
The packed integer is then loaded to the MT's internal state by the function init_by_array
. This is quite a complex-looking function - the packed seed value is not written literally into the state, but instead the state is generated item by item, adding in the supplied array values, using a variety of xors, additions and cross-referencing the previous value (the Ruby source here also adds in the packed array's index position, commented "non-linear", I think that is one of the referenced modifications to standard MT)
Note that the size of the MT sequence is smaller than 2 ** ( 624 * 32 )
- the repeat_every
value I show here is skipping over 2 sequences at a time, but it is the easiest to find repeating seed value, because it is easy to see how it sets the internal state exactly the same (because the first 624 items in the array representation of the seed are identical, and that is all that gets used later). Other seed values will also produce the same internal state, but the relationship is a complex mapping that pairs up each integer in the 19938-bit space with another integer which creates the same state for MT.