Seleccione N elementos aleatorios de una Lista <T> en C#

https://stackoverflow.com/questions/48087

09-06-2019
|

Pregunta

Necesito un algoritmo rápido para seleccionar 5 elementos aleatorios de una lista genérica.Por ejemplo, me gustaría obtener 5 elementos aleatorios de un List<string>.

Solución

Repita y para cada elemento haga que la probabilidad de selección = (número necesario)/(número restante)

Entonces, si tuvieras 40 artículos, el primero tendría una probabilidad de 5/40 de ser seleccionado.Si es así, el siguiente tiene una probabilidad de 4/39; de lo contrario, tiene una probabilidad de 5/39.Cuando llegues al final, tendrás tus 5 elementos y, a menudo, los tendrás todos antes de eso.

Otros consejos

Usando linq:

YourList.OrderBy(x => rnd.Next()).Take(5)

public static List<T> GetRandomElements<T>(this IEnumerable<T> list, int elementsCount)
{
    return list.OrderBy(arg => Guid.NewGuid()).Take(elementsCount).ToList();
}

En realidad, este es un problema más difícil de lo que parece, principalmente porque muchas soluciones matemáticamente correctas no le permitirán aprovechar todas las posibilidades (más sobre esto a continuación).

Primero, aquí hay algunos generadores de números correctos y fáciles de implementar:

(0) La respuesta de Kyle, que es O(n).

(1) Genere una lista de n pares [(0, rand), (1, rand), (2, rand), ...], ordénelos por la segunda coordenada y use la primera k (para usted, k =5) índices para obtener su subconjunto aleatorio.Creo que esto es fácil de implementar, aunque es tiempo O (n log n).

(2) Inicie una lista vacía s = [] que crecerá hasta convertirse en los índices de k elementos aleatorios.Elija un número r en {0, 1, 2, ..., n-1} al azar, r = rand % n, y agréguelo a s.Luego tome r = rand % (n-1) y coloque s;agregue a r los # elementos menores que en s para evitar colisiones.Luego tome r = rand % (n-2) y haga lo mismo, etc.hasta que tengas k elementos distintos en s.Esto tiene un tiempo de ejecución en el peor de los casos O (k ^ 2).Entonces, para k << n, esto puede ser más rápido.Si mantiene s ordenado y realiza un seguimiento de los intervalos contiguos que tiene, puede implementarlo en O(k log k), pero es más trabajo.

@Kyle: tienes razón, pensándolo bien, estoy de acuerdo con tu respuesta.Al principio lo leí apresuradamente y pensé erróneamente que estaba indicando elegir secuencialmente cada elemento con una probabilidad fija k/n, lo cual habría sido incorrecto, pero su enfoque adaptativo me parece correcto.Lo lamento.

Bien, y ahora viene el truco:asintóticamente (para k fijo, n creciente), ¡hay n^k/k!opciones de k subconjunto de elementos entre n elementos [esta es una aproximación de (n elija k)].Si n es grande y k no es muy pequeño, entonces estos números son enormes.La mejor duración de ciclo que puede esperar en cualquier generador de números aleatorios estándar de 32 bits es 2^32 = 256^4.Entonces, si tenemos una lista de 1000 elementos y queremos elegir 5 al azar, no hay forma de que un generador de números aleatorios estándar abarque todas las posibilidades.Sin embargo, siempre que esté de acuerdo con una opción que funcione bien para conjuntos más pequeños y que siempre "parezca" aleatoria, entonces estos algoritmos deberían estar bien.

Apéndice:Después de escribir esto, me di cuenta de que es complicado implementar la idea (2) correctamente, así que quería aclarar esta respuesta.Para obtener un tiempo O(k log k), necesita una estructura similar a una matriz que admita búsquedas e inserciones O(log m); un árbol binario equilibrado puede hacer esto.Usando dicha estructura para construir una matriz llamada s, aquí hay un pseudopython:

# Returns a container s with k distinct random numbers from {0, 1, ..., n-1}
def ChooseRandomSubset(n, k):
  for i in range(k):
    r = UniformRandom(0, n-i)                 # May be 0, must be < n-i
    q = s.FirstIndexSuchThat( s[q] - q > r )  # This is the search.
    s.InsertInOrder(q ? r + q : r + len(s))   # Inserts right before q.
  return s

Sugiero analizar algunos casos de muestra para ver cómo esto implementa eficientemente la explicación en inglés anterior.

Creo que la respuesta seleccionada es correcta y bastante dulce.Sin embargo, lo implementé de manera diferente, ya que también quería el resultado en orden aleatorio.

    static IEnumerable<SomeType> PickSomeInRandomOrder<SomeType>(
        IEnumerable<SomeType> someTypes,
        int maxCount)
    {
        Random random = new Random(DateTime.Now.Millisecond);

        Dictionary<double, SomeType> randomSortTable = new Dictionary<double,SomeType>();

        foreach(SomeType someType in someTypes)
            randomSortTable[random.NextDouble()] = someType;

        return randomSortTable.OrderBy(KVP => KVP.Key).Take(maxCount).Select(KVP => KVP.Value);
    }

Me acabo de encontrar con este problema y algunas búsquedas más en Google me llevaron al problema de mezclar aleatoriamente una lista: http://en.wikipedia.org/wiki/Fisher-Yates_shuffle

Para mezclar su lista de forma completamente aleatoria (en su lugar), haga esto:

Para mezclar una matriz a de n elementos (índices 0..n-1):

  for i from n − 1 downto 1 do
       j ← random integer with 0 ≤ j ≤ i
       exchange a[j] and a[i]

Si solo necesita los primeros 5 elementos, en lugar de ejecutar i desde n-1 hasta 1, solo necesita ejecutarlo hasta n-5 (es decir:n-5)

Digamos que necesitas k artículos,

Esto se convierte en:

  for (i = n − 1; i >= n-k; i--)
  {
       j = random integer with 0 ≤ j ≤ i
       exchange a[j] and a[i]
  }

Cada elemento seleccionado se intercambia hacia el final de la matriz, por lo que los k elementos seleccionados son los últimos k elementos de la matriz.

Esto lleva tiempo O(k), donde k es el número de elementos seleccionados aleatoriamente que necesita.

Además, si no desea modificar su lista inicial, puede escribir todos sus intercambios en una lista temporal, revertir esa lista y aplicarlos nuevamente, realizando así el conjunto inverso de intercambios y devolviéndole su lista inicial sin cambios. el tiempo de ejecución O(k).

Finalmente, para el verdadero riguroso, si (n == k), debe detenerse en 1, no en n-k, ya que el número entero elegido al azar siempre será 0.

Puedes usar esto pero el pedido se realizará en el lado del cliente.

 .AsEnumerable().OrderBy(n => Guid.NewGuid()).Take(5);

De Dragones en el algoritmo, una interpretación en C#:

int k = 10; // items to select
var items = new List<int>(new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 });
var selected = new List<int>();
double needed = k;
double available = items.Count;
var rand = new Random();
while (selected.Count < k) {
   if( rand.NextDouble() < needed / available ) {
      selected.Add(items[(int)available-1])
      needed--;
   }
   available--;
}

Este algoritmo seleccionará índices únicos de la lista de elementos.

Estaba pensando en el comentario de @JohnShedletsky sobre el respuesta aceptada respecto a (parafrasear):

deberías poder hacer esto en O(subset.Length), en lugar de O(originalList.Length)

Básicamente, deberías poder generar subset índices aleatorios y luego extraerlos de la lista original.

El método

public static class EnumerableExtensions {

    public static Random randomizer = new Random(); // you'd ideally be able to replace this with whatever makes you comfortable

    public static IEnumerable<T> GetRandom<T>(this IEnumerable<T> list, int numItems) {
        return (list as T[] ?? list.ToArray()).GetRandom(numItems);

        // because ReSharper whined about duplicate enumeration...
        /*
        items.Add(list.ElementAt(randomizer.Next(list.Count()))) ) numItems--;
        */
    }

    // just because the parentheses were getting confusing
    public static IEnumerable<T> GetRandom<T>(this T[] list, int numItems) {
        var items = new HashSet<T>(); // don't want to add the same item twice; otherwise use a list
        while (numItems > 0 )
            // if we successfully added it, move on
            if( items.Add(list[randomizer.Next(list.Length)]) ) numItems--;

        return items;
    }

    // and because it's really fun; note -- you may get repetition
    public static IEnumerable<T> PluckRandomly<T>(this IEnumerable<T> list) {
        while( true )
            yield return list.ElementAt(randomizer.Next(list.Count()));
    }

}

Si quisieras ser aún más eficiente, probablemente usarías un HashSet del índices, no los elementos de la lista reales (en caso de que tenga tipos complejos o comparaciones costosas);

La prueba unitaria

Y para asegurarnos de que no tengamos colisiones, etc.

[TestClass]
public class RandomizingTests : UnitTestBase {
    [TestMethod]
    public void GetRandomFromList() {
        this.testGetRandomFromList((list, num) => list.GetRandom(num));
    }

    [TestMethod]
    public void PluckRandomly() {
        this.testGetRandomFromList((list, num) => list.PluckRandomly().Take(num), requireDistinct:false);
    }

    private void testGetRandomFromList(Func<IEnumerable<int>, int, IEnumerable<int>> methodToGetRandomItems, int numToTake = 10, int repetitions = 100000, bool requireDistinct = true) {
        var items = Enumerable.Range(0, 100);
        IEnumerable<int> randomItems = null;

        while( repetitions-- > 0 ) {
            randomItems = methodToGetRandomItems(items, numToTake);
            Assert.AreEqual(numToTake, randomItems.Count(),
                            "Did not get expected number of items {0}; failed at {1} repetition--", numToTake, repetitions);
            if(requireDistinct) Assert.AreEqual(numToTake, randomItems.Distinct().Count(),
                            "Collisions (non-unique values) found, failed at {0} repetition--", repetitions);
            Assert.IsTrue(randomItems.All(o => items.Contains(o)),
                        "Some unknown values found; failed at {0} repetition--", repetitions);
        }
    }
}

Seleccionando N aleatorio Los elementos de un grupo no deberían tener nada que ver con orden!La aleatoriedad se trata de imprevisibilidad y no de barajar posiciones en un grupo.Todas las respuestas que tratan de algún tipo de ordenamiento seguramente serán menos eficientes que las que no lo hacen.Dado que la eficiencia es la clave aquí, publicaré algo que no cambie demasiado el orden de los elementos.

1) Si necesitas verdadero valores aleatorios, lo que significa que no hay restricciones sobre qué elementos elegir (es decir, una vez elegido, el elemento se puede volver a seleccionar):

public static List<T> GetTrueRandom<T>(this IList<T> source, int count, 
                                       bool throwArgumentOutOfRangeException = true)
{
    if (throwArgumentOutOfRangeException && count > source.Count)
        throw new ArgumentOutOfRangeException();

    var randoms = new List<T>(count);
    randoms.AddRandomly(source, count);
    return randoms;
}

Si desactiva el indicador de excepción, podrá elegir elementos aleatorios cualquier cantidad de veces.

Si tiene {1, 2, 3, 4}, entonces puede dar {1, 4, 4}, {1, 4, 3} etc. para 3 elementos o incluso {1, 4, 3, 2, 4} para ¡5 artículos!

Esto debería ser bastante rápido, ya que no tiene nada que comprobar.

2) Si necesitas individual miembros del grupo sin repetición, entonces me basaría en un diccionario (como muchos ya han señalado).

public static List<T> GetDistinctRandom<T>(this IList<T> source, int count)
{
    if (count > source.Count)
        throw new ArgumentOutOfRangeException();

    if (count == source.Count)
        return new List<T>(source);

    var sourceDict = source.ToIndexedDictionary();

    if (count > source.Count / 2)
    {
        while (sourceDict.Count > count)
            sourceDict.Remove(source.GetRandomIndex());

        return sourceDict.Select(kvp => kvp.Value).ToList();
    }

    var randomDict = new Dictionary<int, T>(count);
    while (randomDict.Count < count)
    {
        int key = source.GetRandomIndex();
        if (!randomDict.ContainsKey(key))
            randomDict.Add(key, sourceDict[key]);
    }

    return randomDict.Select(kvp => kvp.Value).ToList();
}

El código es un poco más largo que otros enfoques de diccionario aquí porque no solo estoy agregando, sino también eliminando de la lista, por lo que son como dos bucles.Puedes ver aquí que no tengo reordenado cualquier cosa cuando count se vuelve igual a source.Count.Eso es porque creo la aleatoriedad debe estar en el conjunto devuelto como un todo.quiero decir si quieres 5 elementos aleatorios de 1, 2, 3, 4, 5, no debería importar si es 1, 3, 4, 2, 5 o 1, 2, 3, 4, 5, pero si necesitas 4 elementos del mismo conjunto, entonces debería ceder de forma impredecible en 1, 2, 3, 4, 1, 3, 5, 2, 2, 3, 5, 4 etc.En segundo lugar, cuando el recuento de elementos aleatorios que se devolverán es más de la mitad del grupo original, entonces es más fácil eliminarlos source.Count - count elementos del grupo que agregar count elementos.Por razones de rendimiento he usado source en lugar de sourceDict para obtener luego un índice aleatorio en el método de eliminación.

Entonces, si tienes {1, 2, 3, 4}, esto puede terminar en {1, 2, 3}, {3, 4, 1} etc. para 3 elementos.

3) Si necesita valores aleatorios verdaderamente distintos de su grupo teniendo en cuenta los duplicados en el grupo original, entonces puede utilizar el mismo enfoque que el anterior, pero un HashSet Será más ligero que un diccionario.

public static List<T> GetTrueDistinctRandom<T>(this IList<T> source, int count, 
                                               bool throwArgumentOutOfRangeException = true)
{
    if (count > source.Count)
        throw new ArgumentOutOfRangeException();

    var set = new HashSet<T>(source);

    if (throwArgumentOutOfRangeException && count > set.Count)
        throw new ArgumentOutOfRangeException();

    List<T> list = hash.ToList();

    if (count >= set.Count)
        return list;

    if (count > set.Count / 2)
    {
        while (set.Count > count)
            set.Remove(list.GetRandom());

        return set.ToList();
    }

    var randoms = new HashSet<T>();
    randoms.AddRandomly(list, count);
    return randoms.ToList();
}

El randoms variable se convierte en HashSet para evitar que se agreguen duplicados en los casos más raros en los que Random.Next puede producir el mismo valor, especialmente cuando la lista de entrada es pequeña.

Entonces { 1, 2, 2, 4 } => 3 elementos aleatorios => { 1, 2, 4 } y nunca { 1, 2, 2}

{ 1, 2, 2, 4 } => 4 elementos aleatorios => ¡excepción!o {1, 2, 4} dependiendo del indicador establecido.

Algunos de los métodos de extensión que he utilizado:

static Random rnd = new Random();
public static int GetRandomIndex<T>(this ICollection<T> source)
{
    return rnd.Next(source.Count);
}

public static T GetRandom<T>(this IList<T> source)
{
    return source[source.GetRandomIndex()];
}

static void AddRandomly<T>(this ICollection<T> toCol, IList<T> fromList, int count)
{
    while (toCol.Count < count)
        toCol.Add(fromList.GetRandom());
}

public static Dictionary<int, T> ToIndexedDictionary<T>(this IEnumerable<T> lst)
{
    return lst.ToIndexedDictionary(t => t);
}

public static Dictionary<int, T> ToIndexedDictionary<S, T>(this IEnumerable<S> lst, 
                                                           Func<S, T> valueSelector)
{
    int index = -1;
    return lst.ToDictionary(t => ++index, valueSelector);
}

Si se trata de rendimiento con decenas de miles de elementos en la lista que deben repetirse 10000 veces, entonces es posible que desee tener clase aleatoria más rápida que System.Random, pero no creo que sea gran cosa considerando que lo último probablemente nunca sea un cuello de botella, es bastante rápido.

Editar: Si también necesita reorganizar el orden de los artículos devueltos, no hay nada que pueda superarlo. El enfoque Fisher-Yates de Dhakim - breve, dulce y simple..

La solución simple que uso (probablemente no sea buena para listas grandes):Copie la lista en la lista temporal, luego, en bucle, seleccione aleatoriamente el elemento de la lista temporal y colóquelo en la lista de elementos seleccionados mientras lo elimina de la lista temporal (para que no se pueda volver a seleccionar).

Ejemplo:

List<Object> temp = OriginalList.ToList();
List<Object> selectedItems = new List<Object>();
Random rnd = new Random();
Object o;
int i = 0;
while (i < NumberOfSelectedItems)
{
            o = temp[rnd.Next(temp.Count)];
            selectedItems.Add(o);
            temp.Remove(o);
            i++;
 }

Combiné varias de las respuestas anteriores para crear un método de extensión evaluado por Lazily.Mis pruebas mostraron que el enfoque de Kyle (Orden (N)) es muchas veces más lento que el uso de un conjunto por parte de drzaus para proponer los índices aleatorios para elegir (Orden (K)).El primero realiza muchas más llamadas al generador de números aleatorios y además itera más veces sobre los elementos.

Los objetivos de mi implementación fueron:

1) No obtenga la lista completa si se le proporciona un IEnumerable que no es un IList.Si me dan una secuencia de millones de elementos, no quiero quedarme sin memoria.Utilice el enfoque de Kyle para obtener una solución en línea.

2) Si puedo decir que es una IList, use el enfoque de drzaus, con un giro.Si K es más de la mitad de N, me arriesgo a ser derrotado ya que elijo muchos índices aleatorios una y otra vez y tengo que omitirlos.Por eso compongo una lista de los índices que NO se deben conservar.

3) Garantizo que los artículos serán devueltos en el mismo orden en que fueron encontrados.El algoritmo de Kyle no requirió alteración.El algoritmo de drzaus requería que no emitiera elementos en el orden en que se eligen los índices aleatorios.Reúno todos los índices en un SortedSet y luego emito elementos en orden de índice ordenado.

4) Si K es grande en comparación con N e invierto el sentido del conjunto, entonces enumero todos los elementos y pruebo si el índice no está en el conjunto.Esto significa que pierdo el tiempo de ejecución del pedido (k), pero como K está cerca de N en estos casos, no pierdo mucho.

Aquí está el código:

    /// <summary>
    /// Takes k elements from the next n elements at random, preserving their order.
    /// 
    /// If there are fewer than n elements in items, this may return fewer than k elements.
    /// </summary>
    /// <typeparam name="TElem">Type of element in the items collection.</typeparam>
    /// <param name="items">Items to be randomly selected.</param>
    /// <param name="k">Number of items to pick.</param>
    /// <param name="n">Total number of items to choose from.
    /// If the items collection contains more than this number, the extra members will be skipped.
    /// If the items collection contains fewer than this number, it is possible that fewer than k items will be returned.</param>
    /// <returns>Enumerable over the retained items.
    /// 
    /// See http://stackoverflow.com/questions/48087/select-a-random-n-elements-from-listt-in-c-sharp for the commentary.
    /// </returns>
    public static IEnumerable<TElem> TakeRandom<TElem>(this IEnumerable<TElem> items, int k, int n)
    {
        var r = new FastRandom();
        var itemsList = items as IList<TElem>;

        if (k >= n || (itemsList != null && k >= itemsList.Count))
            foreach (var item in items) yield return item;
        else
        {  
            // If we have a list, we can infer more information and choose a better algorithm.
            // When using an IList, this is about 7 times faster (on one benchmark)!
            if (itemsList != null && k < n/2)
            {
                // Since we have a List, we can use an algorithm suitable for Lists.
                // If there are fewer than n elements, reduce n.
                n = Math.Min(n, itemsList.Count);

                // This algorithm picks K index-values randomly and directly chooses those items to be selected.
                // If k is more than half of n, then we will spend a fair amount of time thrashing, picking
                // indices that we have already picked and having to try again.   
                var invertSet = k >= n/2;  
                var positions = invertSet ? (ISet<int>) new HashSet<int>() : (ISet<int>) new SortedSet<int>();

                var numbersNeeded = invertSet ? n - k : k;
                while (numbersNeeded > 0)
                    if (positions.Add(r.Next(0, n))) numbersNeeded--;

                if (invertSet)
                {
                    // positions contains all the indices of elements to Skip.
                    for (var itemIndex = 0; itemIndex < n; itemIndex++)
                    {
                        if (!positions.Contains(itemIndex))
                            yield return itemsList[itemIndex];
                    }
                }
                else
                {
                    // positions contains all the indices of elements to Take.
                    foreach (var itemIndex in positions)
                        yield return itemsList[itemIndex];              
                }
            }
            else
            {
                // Since we do not have a list, we will use an online algorithm.
                // This permits is to skip the rest as soon as we have enough items.
                var found = 0;
                var scanned = 0;
                foreach (var item in items)
                {
                    var rand = r.Next(0,n-scanned);
                    if (rand < k - found)
                    {
                        yield return item;
                        found++;
                    }
                    scanned++;
                    if (found >= k || scanned >= n)
                        break;
                }
            }
        }  
    }

Utilizo un generador de números aleatorios especializado, pero puedes usar C#. Aleatorio si quieres.(RápidoAleatorio fue escrito por Colin Green y es parte de SharpNEAT.Tiene un período de 2^128-1, que es mejor que muchos RNG).

Aquí están las pruebas unitarias:

[TestClass]
public class TakeRandomTests
{
    /// <summary>
    /// Ensure that when randomly choosing items from an array, all items are chosen with roughly equal probability.
    /// </summary>
    [TestMethod]
    public void TakeRandom_Array_Uniformity()
    {
        const int numTrials = 2000000;
        const int expectedCount = numTrials/20;
        var timesChosen = new int[100];
        var century = new int[100];
        for (var i = 0; i < century.Length; i++)
            century[i] = i;

        for (var trial = 0; trial < numTrials; trial++)
        {
            foreach (var i in century.TakeRandom(5, 100))
                timesChosen[i]++;
        }
        var avg = timesChosen.Average();
        var max = timesChosen.Max();
        var min = timesChosen.Min();
        var allowedDifference = expectedCount/100;
        AssertBetween(avg, expectedCount - 2, expectedCount + 2, "Average");
        //AssertBetween(min, expectedCount - allowedDifference, expectedCount, "Min");
        //AssertBetween(max, expectedCount, expectedCount + allowedDifference, "Max");

        var countInRange = timesChosen.Count(i => i >= expectedCount - allowedDifference && i <= expectedCount + allowedDifference);
        Assert.IsTrue(countInRange >= 90, String.Format("Not enough were in range: {0}", countInRange));
    }

    /// <summary>
    /// Ensure that when randomly choosing items from an IEnumerable that is not an IList, 
    /// all items are chosen with roughly equal probability.
    /// </summary>
    [TestMethod]
    public void TakeRandom_IEnumerable_Uniformity()
    {
        const int numTrials = 2000000;
        const int expectedCount = numTrials / 20;
        var timesChosen = new int[100];

        for (var trial = 0; trial < numTrials; trial++)
        {
            foreach (var i in Range(0,100).TakeRandom(5, 100))
                timesChosen[i]++;
        }
        var avg = timesChosen.Average();
        var max = timesChosen.Max();
        var min = timesChosen.Min();
        var allowedDifference = expectedCount / 100;
        var countInRange =
            timesChosen.Count(i => i >= expectedCount - allowedDifference && i <= expectedCount + allowedDifference);
        Assert.IsTrue(countInRange >= 90, String.Format("Not enough were in range: {0}", countInRange));
    }

    private IEnumerable<int> Range(int low, int count)
    {
        for (var i = low; i < low + count; i++)
            yield return i;
    }

    private static void AssertBetween(int x, int low, int high, String message)
    {
        Assert.IsTrue(x > low, String.Format("Value {0} is less than lower limit of {1}. {2}", x, low, message));
        Assert.IsTrue(x < high, String.Format("Value {0} is more than upper limit of {1}. {2}", x, high, message));
    }

    private static void AssertBetween(double x, double low, double high, String message)
    {
        Assert.IsTrue(x > low, String.Format("Value {0} is less than lower limit of {1}. {2}", x, low, message));
        Assert.IsTrue(x < high, String.Format("Value {0} is more than upper limit of {1}. {2}", x, high, message));
    }
}

Aquí tienes una implementación basada en Mezcla de Fisher-Yates cuya complejidad del algoritmo es O(n) donde n es el subconjunto o tamaño de muestra, en lugar del tamaño de la lista, como señaló John Shedletsky.

public static IEnumerable<T> GetRandomSample<T>(this IList<T> list, int sampleSize)
{
    if (list == null) throw new ArgumentNullException("list");
    if (sampleSize > list.Count) throw new ArgumentException("sampleSize may not be greater than list count", "sampleSize");
    var indices = new Dictionary<int, int>(); int index;
    var rnd = new Random();

    for (int i = 0; i < sampleSize; i++)
    {
        int j = rnd.Next(i, list.Count);
        if (!indices.TryGetValue(j, out index)) index = j;

        yield return list[index];

        if (!indices.TryGetValue(i, out index)) index = i;
        indices[j] = index;
    }
}

Según la respuesta de Kyle, aquí está mi implementación de C#.

/// <summary>
/// Picks random selection of available game ID's
/// </summary>
private static List<int> GetRandomGameIDs(int count)
{       
    var gameIDs = (int[])HttpContext.Current.Application["NonDeletedArcadeGameIDs"];
    var totalGameIDs = gameIDs.Count();
    if (count > totalGameIDs) count = totalGameIDs;

    var rnd = new Random();
    var leftToPick = count;
    var itemsLeft = totalGameIDs;
    var arrPickIndex = 0;
    var returnIDs = new List<int>();
    while (leftToPick > 0)
    {
        if (rnd.Next(0, itemsLeft) < leftToPick)
        {
            returnIDs .Add(gameIDs[arrPickIndex]);
            leftToPick--;
        }
        arrPickIndex++;
        itemsLeft--;
    }

    return returnIDs ;
}

Este método puede ser equivalente al de Kyle.

Digamos que su lista es de tamaño n y quiere k elementos.

Random rand = new Random();
for(int i = 0; k>0; ++i) 
{
    int r = rand.Next(0, n-i);
    if(r<k) 
    {
        //include element i
        k--;
    }
}

Funciona de maravilla :)

-Alex Gilbert

Ampliando la respuesta de @ers, si uno está preocupado por posibles diferentes implementaciones de OrderBy, esto debería ser seguro:

// Instead of this
YourList.OrderBy(x => rnd.Next()).Take(5)

// Temporarily transform 
YourList
    .Select(v => new {v, i = rnd.Next()}) // Associate a random index to each entry
    .OrderBy(x => x.i).Take(5) // Sort by (at this point fixed) random index 
    .Select(x => x.v); // Go back to enumerable of entry

Esto es lo mejor que se me ocurrió en un primer corte:

public List<String> getRandomItemsFromList(int returnCount, List<String> list)
{
    List<String> returnList = new List<String>();
    Dictionary<int, int> randoms = new Dictionary<int, int>();

    while (randoms.Count != returnCount)
    {
        //generate new random between one and total list count
        int randomInt = new Random().Next(list.Count);

        // store this in dictionary to ensure uniqueness
        try
        {
            randoms.Add(randomInt, randomInt);
        }
        catch (ArgumentException aex)
        {
            Console.Write(aex.Message);
        } //we can assume this element exists in the dictonary already 

        //check for randoms length and then iterate through the original list 
        //adding items we select via random to the return list
        if (randoms.Count == returnCount)
        {
            foreach (int key in randoms.Keys)
                returnList.Add(list[randoms[key]]);

            break; //break out of _while_ loop
        }
    }

    return returnList;
}

Usar una lista de elementos aleatorios dentro de un rango de 1: recuento total de la lista y luego simplemente extraer esos elementos de la lista parecía ser la mejor manera, pero usar el Diccionario para garantizar la unicidad es algo sobre lo que todavía estoy reflexionando.

También tenga en cuenta que utilicé una lista de cadenas, reemplácela según sea necesario.

¿Por qué no algo como esto?

 Dim ar As New ArrayList
    Dim numToGet As Integer = 5
    'hard code just to test
    ar.Add("12")
    ar.Add("11")
    ar.Add("10")
    ar.Add("15")
    ar.Add("16")
    ar.Add("17")

    Dim randomListOfProductIds As New ArrayList

    Dim toAdd As String = ""
    For i = 0 To numToGet - 1
        toAdd = ar(CInt((ar.Count - 1) * Rnd()))

        randomListOfProductIds.Add(toAdd)
        'remove from id list
        ar.Remove(toAdd)

    Next
'sorry i'm lazy and have to write vb at work :( and didn't feel like converting to c#

Es mucho más difícil de lo que uno podría pensar.Ver el gran artículo "Barajado" de Jeff.

Escribí un artículo muy breve sobre ese tema que incluye código C#:
Devuelve un subconjunto aleatorio de N elementos de una matriz determinada

Meta:Seleccione N número de elementos de la fuente de colección sin duplicación.Creé una extensión para cualquier colección genérica.Así es como lo hice:

public static class CollectionExtension
{
    public static IList<TSource> RandomizeCollection<TSource>(this IList<TSource> source, int maxItems)
    {
        int randomCount = source.Count > maxItems ? maxItems : source.Count;
        int?[] randomizedIndices = new int?[randomCount];
        Random random = new Random();

        for (int i = 0; i < randomizedIndices.Length; i++)
        {
            int randomResult = -1;
            while (randomizedIndices.Contains((randomResult = random.Next(0, source.Count))))
            {
                //0 -> since all list starts from index 0; source.Count -> maximum number of items that can be randomize
                //continue looping while the generated random number is already in the list of randomizedIndices
            }

            randomizedIndices[i] = randomResult;
        }

        IList<TSource> result = new List<TSource>();
        foreach (int index in randomizedIndices)
            result.Add(source.ElementAt(index));

        return result;
    }
}

Recientemente hice esto en mi proyecto usando una idea similar a El punto 1 de Tyler.
Estaba cargando un montón de preguntas y seleccionando cinco al azar.La clasificación se logró mediante un Comparador.
aTodas las preguntas se cargaron en una lista del Clasificador de preguntas, que luego se clasificó utilizando el Función de clasificación de listas y los primeros k elementos fueron seleccionados.

    private class QuestionSorter : IComparable<QuestionSorter>
    {
        public double SortingKey
        {
            get;
            set;
        }

        public Question QuestionObject
        {
            get;
            set;
        }

        public QuestionSorter(Question q)
        {
            this.SortingKey = RandomNumberGenerator.RandomDouble;
            this.QuestionObject = q;
        }

        public int CompareTo(QuestionSorter other)
        {
            if (this.SortingKey < other.SortingKey)
            {
                return -1;
            }
            else if (this.SortingKey > other.SortingKey)
            {
                return 1;
            }
            else
            {
                return 0;
            }
        }
    }

Uso:

    List<QuestionSorter> unsortedQuestions = new List<QuestionSorter>();

    // add the questions here

    unsortedQuestions.Sort(unsortedQuestions as IComparer<QuestionSorter>);

    // select the first k elements

Este es mi enfoque (texto completo aquí http://krkadev.blogspot.com/2010/08/random-numbers- without-repetition.html ).

Debería ejecutarse en O(K) en lugar de O(N), donde K es el número de elementos deseados y N es el tamaño de la lista para elegir:

public <T> List<T> take(List<T> source, int k) {
 int n = source.size();
 if (k > n) {
   throw new IllegalStateException(
     "Can not take " + k +
     " elements from a list with " + n +
     " elements");
 }
 List<T> result = new ArrayList<T>(k);
 Map<Integer,Integer> used = new HashMap<Integer,Integer>();
 int metric = 0;
 for (int i = 0; i < k; i++) {
   int off = random.nextInt(n - i);
   while (true) {
     metric++;
     Integer redirect = used.put(off, n - i - 1);
     if (redirect == null) {
       break;
     }
     off = redirect;
   }
   result.add(source.get(off));
 }
 assert metric <= 2*k;
 return result;
}

Esto no es tan elegante ni tan eficiente como la solución aceptada, pero es rápido de redactar.Primero, permute la matriz aleatoriamente, luego seleccione los primeros K elementos.En pitón,

import numpy

N = 20
K = 5

idx = np.arange(N)
numpy.random.shuffle(idx)

print idx[:K]

Usaría un método de extensión.

    public static IEnumerable<T> TakeRandom<T>(this IEnumerable<T> elements, int countToTake)
    {
        var random = new Random();

        var internalList = elements.ToList();

        var selected = new List<T>();
        for (var i = 0; i < countToTake; ++i)
        {
            var next = random.Next(0, internalList.Count - selected.Count);
            selected.Add(internalList[next]);
            internalList[next] = internalList[internalList.Count - selected.Count];
        }
        return selected;
    }

public static IEnumerable<T> GetRandom<T>(this IList<T> list, int count, Random random)
    {
        // Probably you should throw exception if count > list.Count
        count = Math.Min(list.Count, count);

        var selectedIndices = new SortedSet<int>();

        // Random upper bound
        int randomMax = list.Count - 1;

        while (selectedIndices.Count < count)
        {
            int randomIndex = random.Next(0, randomMax);

            // skip over already selected indeces
            foreach (var selectedIndex in selectedIndices)
                if (selectedIndex <= randomIndex)
                    ++randomIndex;
                else
                    break;

            yield return list[randomIndex];

            selectedIndices.Add(randomIndex);
            --randomMax;
        }
    }

Memoria:~ contar
Complejidad:O(cuenta²)

Cuando N es muy grande, el método normal que mezcla aleatoriamente los N números y selecciona, digamos, los primeros k números, puede resultar prohibitivo debido a la complejidad del espacio.El siguiente algoritmo requiere sólo O(k) para complejidades tanto de tiempo como de espacio.

http://arxiv.org/abs/1512.00501

def random_selection_indices(num_samples, N):
    modified_entries = {}
    seq = []
    for n in xrange(num_samples):
        i = N - n - 1
        j = random.randrange(i)

        # swap a[j] and a[i] 
        a_j = modified_entries[j] if j in modified_entries else j 
        a_i = modified_entries[i] if i in modified_entries else i

        if a_i != j:
            modified_entries[j] = a_i   
        elif j in modified_entries:   # no need to store the modified value if it is the same as index
            modified_entries.pop(j)

        if a_j != i:
            modified_entries[i] = a_j 
        elif i in modified_entries:   # no need to store the modified value if it is the same as index
            modified_entries.pop(i)
        seq.append(a_j)
    return seq

Usar LINQ con listas grandes (cuando es costoso tocar cada elemento) Y si puedes vivir con la posibilidad de duplicados:

new int[5].Select(o => (int)(rnd.NextDouble() * maxIndex)).Select(i => YourIEnum.ElementAt(i))

Para mi uso, tenía una lista de 100.000 elementos y, debido a que los extraje de una base de datos, reduje aproximadamente a la mitad (o mejor) el tiempo en comparación con una ronda de toda la lista.

Tener una lista grande reducirá en gran medida las probabilidades de duplicados.

Licenciado bajo: CC-BY-SA con atribución

No afiliado a StackOverflow