In addition to @Deduplicator good advice, need to scale the j
offset by size
.
This code also changes the j
calculation. I do not think what you have will work well.
#include <stdlib.h>
static void shuffle(void *array, size_t n, size_t size) {
// This if() is not needed functionally, but left per OP's style
if (n > 1) {
char *carray = array;
void * aux;
aux = malloc(size);
size_t i;
for (i = 1; i < n; ++i) {
size_t j = rand() % (i + 1);
j *= size;
memcpy(aux, &carray[j], size);
memcpy(&carray[j], &carray[i*size], size);
memcpy(&carray[i*size], aux, size);
}
free(aux);
}
}
Clarify the weakness of size_t j = i + rand() / (RAND_MAX / (n - i) + 1);
Substituting rand()
with the values 0
to RAND_MAX
in the formula will generate a sample distribution. Example (i=0, n=30265, RAND_MAX=2147483647) generates the values j = 0 to 30263
, 70957 times each and j = 30264
= 41000. This is a bad case and likely not the worst case.
Using size_t j = i + r%(n-i);
generates a nearly uniform result.
Example (i=0, n=30265, RAND_MAX=2147483647) generates the values j = 0 to 307
, 70957 times each and j = 308 to 30264
= 70956 times. Methods exist to compensate for this small bias. Many SO postings.
Note: Inherit weaknesses in a poor rand()
are not fixed by either of these methods, but method selection can make things noticeably worse when trying to use a sub-range of 0
to RAND_MAX
[Edit]
Holes appear with size_t j = i + rand() / (RAND_MAX / (n - i) + 1);
Somewhere about n > sqrt(RAND_MAX)
and i=0
, values near n
will not be generated. Example: (i=0, n=104643, RAND_MAX=2147483647)
does not generate values 104638 to 104642.