Optimal LINQ query to get a random sub collection - Shuffle
-
22-07-2019 - |
Question
Please suggest an easiest way to get a random shuffled collection of count 'n' from a collection having 'N' items. where n <= N
Solution
Another option is to use OrderBy and to sort on a GUID value, which you can do so using:
var result = sequence.OrderBy(elem => Guid.NewGuid());
I did some empirical tests to convince myself that the above actually generates a random distribution (which it appears to do). You can see my results at Techniques for Randomly Reordering an Array.
OTHER TIPS
Further to mquander's answer and Dan Blanchard's comment, here's a LINQ-friendly extension method that performs a Fisher-Yates-Durstenfeld shuffle:
// take n random items from yourCollection
var randomItems = yourCollection.Shuffle().Take(n);
// ...
public static class EnumerableExtensions
{
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
{
return source.Shuffle(new Random());
}
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source, Random rng)
{
if (source == null) throw new ArgumentNullException("source");
if (rng == null) throw new ArgumentNullException("rng");
return source.ShuffleIterator(rng);
}
private static IEnumerable<T> ShuffleIterator<T>(
this IEnumerable<T> source, Random rng)
{
var buffer = source.ToList();
for (int i = 0; i < buffer.Count; i++)
{
int j = rng.Next(i, buffer.Count);
yield return buffer[j];
buffer[j] = buffer[i];
}
}
}
This has some issues with "random bias" and I am sure it's not optimal, this is another possibility:
var r = new Random();
l.OrderBy(x => r.NextDouble()).Take(n);
Shuffle the collection into a random order and take the first n
items from the result.
A bit less random, but efficient:
var rnd = new Random();
var toSkip = list.Count()-n;
if (toSkip > 0)
toSkip = rnd.Next(toSkip);
else
toSkip=0;
var randomlySelectedSequence = list.Skip(toSkip).Take(n);
I write this overrides method:
public static IEnumerable<T> Randomize<T>(this IEnumerable<T> items) where T : class
{
int max = items.Count();
var secuencia = Enumerable.Range(1, max).OrderBy(n => n * n * (new Random()).Next());
return ListOrder<T>(items, secuencia.ToArray());
}
private static IEnumerable<T> ListOrder<T>(IEnumerable<T> items, int[] secuencia) where T : class
{
List<T> newList = new List<T>();
int count = 0;
foreach (var seed in count > 0 ? secuencia.Skip(1) : secuencia.Skip(0))
{
newList.Add(items.ElementAt(seed - 1));
count++;
}
return newList.AsEnumerable<T>();
}
Then, I have my source list (all items)
var listSource = p.Session.QueryOver<Listado>(() => pl)
.Where(...);
Finally, I call "Randomize" and I get a random sub-collection of items, in my case, 5 items:
var SubCollection = Randomize(listSource.List()).Take(5).ToList();
Sorry for ugly code :-), but
var result =yourCollection.OrderBy(p => (p.GetHashCode().ToString() + Guid.NewGuid().ToString()).GetHashCode()).Take(n);