Selezionare N elementi casuali da un Elenco<T> in C#

https://stackoverflow.com/questions/48087

09-06-2019
|

Domanda

Ho bisogno di un veloce algoritmo per selezionare 5 elementi casuali da un elenco generico.Per esempio, mi piacerebbe avere 5 elementi casuali da un List<string>.

Soluzione

Scorrere e per ogni elemento di rendere la probabilità di selezione = (numero)/(numero a sinistra)

Quindi, se si aveva 40 elementi, il primo sarebbe un 5/40 possibilità di essere selezionato.Se si, la prossima è una 4/39 possibilità, altrimenti è un 5/39 possibilità.Con il tempo si arriva alla fine avrete il vostro 5 elementi, e spesso avrete tutti prima che.

Altri suggerimenti

Utilizzando linq:

YourList.OrderBy(x => rnd.Next()).Take(5)

public static List<T> GetRandomElements<T>(this IEnumerable<T> list, int elementsCount)
{
    return list.OrderBy(arg => Guid.NewGuid()).Take(elementsCount).ToList();
}

Questo è in realtà un problema più difficile di quanto sembra, soprattutto perché molti matematicamente corretta soluzioni non effettivamente consentire di colpire tutte le possibilità (più su questo più avanti).

In primo luogo, qui sono alcuni facili da implementare, correggere-se-si-sono-un-vero-generatore di numeri casuali:

(0) di Kyle risposta, che è O(n).

(1) Generare un elenco di n coppie [(0, rand), (1, rand), (2, rand), ...], ordina la seconda coordinata e utilizzare il primo k (per te, per k=5) indici di ottenere il vostro sottoinsieme casuale.Penso che questo è facile da implementare, anche se è O(n log n) tempo.

(2) Init una lista vuota s = [] che crescere per essere indici di k elementi casuali.Scegliere un numero r in {0, 1, 2, ..., n-1} a caso, r = rand % n, e di aggiungere a questo s.Prendere quindi r = rand % (n-1) e bastone in s;aggiungi a r i # elementi di meno che in s per evitare collisioni.Prendere quindi r = rand % (n-2), e fare la stessa cosa, etc.fino a quando si dispone di k elementi distinti s.Questo è il peggior caso di tempo di esecuzione O(k^2).Quindi per k << n, questo può essere più veloce.Se si mantiene ordinati e la pista che intervalli contigui si ha, si può implementare in O(k log k), ma è necessario più lavoro.

@Kyle, hai ragione, a pensarci bene sono d'accordo con la tua risposta.Ho fretta di leggere in un primo momento, ed erroneamente pensato che si stava indicando in sequenza scegliere, ogni elemento fisso con probabilità k/n, che sarebbe stato sbagliato, ma il tuo approccio adattivo appare corretta per me.Mi dispiace.

Ok, e ora per il calciatore:asintoticamente (fisso k, n crescita), ci sono n^k/k!scelte di k elemento sottoinsieme di n elementi [questa è una approssimazione di (n scegliere k)].Se n è grande, e k non è molto piccola, quindi questi numeri sono enormi.Il miglior ciclo di lunghezza si può sperare in qualche standard a 32 bit generatore di numero casuale è 2^32 = 256^4.Quindi, se abbiamo una lista di 1000 elementi, e vogliamo scegliere il 5 a caso, non c'è un modo standard generatore di numero casuale ha colpito tutte le possibilità.Tuttavia, se sei ok con una scelta che funziona bene per i più piccoli set, e sempre "sembra" random, quindi questi algoritmi dovrebbe essere ok.

Addendum:Dopo aver scritto questo, mi sono accorto che è difficile da implementare idea (2) correttamente, così ho voluto precisare questa risposta.Per ottenere a O(k log k), avete bisogno di un array-come la struttura che supporta O(log m) ricerca e inserti - un albero binario bilanciato può fare questo.L'utilizzo di tale struttura per costruire un array chiamato s, ecco alcuni pseudopython:

# Returns a container s with k distinct random numbers from {0, 1, ..., n-1}
def ChooseRandomSubset(n, k):
  for i in range(k):
    r = UniformRandom(0, n-i)                 # May be 0, must be < n-i
    q = s.FirstIndexSuchThat( s[q] - q > r )  # This is the search.
    s.InsertInOrder(q ? r + q : r + len(s))   # Inserts right before q.
  return s

Io suggerisco di correre attraverso un paio di casi campione per vedere come questo in modo efficiente implementa la precedente spiegazione in inglese.

Penso che la risposta selezionata è corretta e abbastanza dolce.L'ho implementato in modo diverso anche se, come ho voluto anche il risultato in ordine casuale.

    static IEnumerable<SomeType> PickSomeInRandomOrder<SomeType>(
        IEnumerable<SomeType> someTypes,
        int maxCount)
    {
        Random random = new Random(DateTime.Now.Millisecond);

        Dictionary<double, SomeType> randomSortTable = new Dictionary<double,SomeType>();

        foreach(SomeType someType in someTypes)
            randomSortTable[random.NextDouble()] = someType;

        return randomSortTable.OrderBy(KVP => KVP.Key).Take(maxCount).Select(KVP => KVP.Value);
    }

Ho appena imbattuto in questo problema, e alcuni più ricerche su google mi ha portato il problema in modo casuale riproduzione casuale di una lista: http://en.wikipedia.org/wiki/Fisher-Yates_shuffle

Completamente casuale (shuffle elenco (in luogo) questo:

Per la riproduzione casuale di un array a di n elementi (indici 0..n-1):

  for i from n − 1 downto 1 do
       j ← random integer with 0 ≤ j ≤ i
       exchange a[j] and a[i]

Se hai bisogno solo i primi 5 elementi, quindi invece di corsa, ho tutta la strada da n-1 a 1, hai solo bisogno di eseguire a n-5 (ie:n-5)

Diciamo che avete bisogno di k elementi,

Questo diventa:

  for (i = n − 1; i >= n-k; i--)
  {
       j = random integer with 0 ≤ j ≤ i
       exchange a[j] and a[i]
  }

Ogni elemento selezionato viene scambiato verso la fine dell'array, in modo che il k elementi scelti sono l'ultimo k elementi di un array.

Questo richiede tempo O(k), dove k è il numero di selezionati in modo casuale gli elementi di cui hai bisogno.

Inoltre, se non si desidera modificare la vostra lista iniziale, è possibile scrivere tutti i vostri swap in una lista temporanea, di invertire tale elenco, e applicare di nuovo, quindi eseguire l'inverso set di swap e tornare elenco iniziale senza cambiare l'O(k) tempo di esecuzione.

Infine, per il vero stickler, if (n == k), si devono fermare a 1, non n-k, come la scelta casualmente intero sarà sempre 0.

È possibile utilizzare questo, ma l'ordinazione avverrà sul lato client

 .AsEnumerable().OrderBy(n => Guid.NewGuid()).Take(5);

Da Draghi nell'Algoritmo, una interpretazione in C#:

int k = 10; // items to select
var items = new List<int>(new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 });
var selected = new List<int>();
double needed = k;
double available = items.Count;
var rand = new Random();
while (selected.Count < k) {
   if( rand.NextDouble() < needed / available ) {
      selected.Add(items[(int)available-1])
      needed--;
   }
   available--;
}

Questo algoritmo selezionare unica seguenti indici degli elementi di lista.

Stavo pensando commento di @JohnShedletsky sul accettato risposta per quanto riguarda (parafrasi):

si dovrebbe essere in grado per questo di O(sottoinsieme.Lunghezza), piuttosto che O(originalList.Lunghezza)

In sostanza, si dovrebbe essere in grado di generare subset casuale indici e poi li raccolgo dalla lista originale.

Il Metodo

public static class EnumerableExtensions {

    public static Random randomizer = new Random(); // you'd ideally be able to replace this with whatever makes you comfortable

    public static IEnumerable<T> GetRandom<T>(this IEnumerable<T> list, int numItems) {
        return (list as T[] ?? list.ToArray()).GetRandom(numItems);

        // because ReSharper whined about duplicate enumeration...
        /*
        items.Add(list.ElementAt(randomizer.Next(list.Count()))) ) numItems--;
        */
    }

    // just because the parentheses were getting confusing
    public static IEnumerable<T> GetRandom<T>(this T[] list, int numItems) {
        var items = new HashSet<T>(); // don't want to add the same item twice; otherwise use a list
        while (numItems > 0 )
            // if we successfully added it, move on
            if( items.Add(list[randomizer.Next(list.Length)]) ) numItems--;

        return items;
    }

    // and because it's really fun; note -- you may get repetition
    public static IEnumerable<T> PluckRandomly<T>(this IEnumerable<T> list) {
        while( true )
            yield return list.ElementAt(randomizer.Next(list.Count()));
    }

}

Se si voleva essere ancora più efficiente, si sarebbe probabilmente utilizzare un HashSet del indici, non le elenco di elementi (nel caso in cui hai dei tipi complessi o costosi confronti);

L'Unità Di Test

E per assicurarsi che non ci sono collisioni, etc.

[TestClass]
public class RandomizingTests : UnitTestBase {
    [TestMethod]
    public void GetRandomFromList() {
        this.testGetRandomFromList((list, num) => list.GetRandom(num));
    }

    [TestMethod]
    public void PluckRandomly() {
        this.testGetRandomFromList((list, num) => list.PluckRandomly().Take(num), requireDistinct:false);
    }

    private void testGetRandomFromList(Func<IEnumerable<int>, int, IEnumerable<int>> methodToGetRandomItems, int numToTake = 10, int repetitions = 100000, bool requireDistinct = true) {
        var items = Enumerable.Range(0, 100);
        IEnumerable<int> randomItems = null;

        while( repetitions-- > 0 ) {
            randomItems = methodToGetRandomItems(items, numToTake);
            Assert.AreEqual(numToTake, randomItems.Count(),
                            "Did not get expected number of items {0}; failed at {1} repetition--", numToTake, repetitions);
            if(requireDistinct) Assert.AreEqual(numToTake, randomItems.Distinct().Count(),
                            "Collisions (non-unique values) found, failed at {0} repetition--", repetitions);
            Assert.IsTrue(randomItems.All(o => items.Contains(o)),
                        "Some unknown values found; failed at {0} repetition--", repetitions);
        }
    }
}

Selezione Di N casuale gli elementi da un gruppo non dovrebbe avere nulla a che fare con ordine!La casualità è di circa imprevedibilità e non mischiare le posizioni in un gruppo.Tutte le risposte che fare con una specie di ordinazione è destinata ad essere meno efficiente rispetto a quelli che non lo fanno.Dal momento che l'efficienza è la chiave qui, io posto qualcosa che non cambia l'ordine degli elementi troppo.

1) Se avete bisogno di vero valori casuali che significa che non vi è alcuna restrizione su quali elementi scegliere (vale a dire, una volta scelto l'oggetto può essere modificata dall'utente):

public static List<T> GetTrueRandom<T>(this IList<T> source, int count, 
                                       bool throwArgumentOutOfRangeException = true)
{
    if (throwArgumentOutOfRangeException && count > source.Count)
        throw new ArgumentOutOfRangeException();

    var randoms = new List<T>(count);
    randoms.AddRandomly(source, count);
    return randoms;
}

Se si imposta l'eccezione bandiera off, allora si può scegliere elementi casuali qualsiasi numero di volte.

Se si dispone di { 1, 2, 3, 4 }, poi si può dare { 1, 4, 4 }, { 1, 4, 3 } ecc per 3 elementi o anche { 1, 4, 3, 2, 4 } per i 5 pezzi!

Questo dovrebbe essere abbastanza veloce, in quanto non ha nulla da controllare.

2) Se avete bisogno di individuale i membri del gruppo, senza ripetizione, quindi vorrei fare affidamento su un dizionario (come molti hanno fatto notare già).

public static List<T> GetDistinctRandom<T>(this IList<T> source, int count)
{
    if (count > source.Count)
        throw new ArgumentOutOfRangeException();

    if (count == source.Count)
        return new List<T>(source);

    var sourceDict = source.ToIndexedDictionary();

    if (count > source.Count / 2)
    {
        while (sourceDict.Count > count)
            sourceDict.Remove(source.GetRandomIndex());

        return sourceDict.Select(kvp => kvp.Value).ToList();
    }

    var randomDict = new Dictionary<int, T>(count);
    while (randomDict.Count < count)
    {
        int key = source.GetRandomIndex();
        if (!randomDict.ContainsKey(key))
            randomDict.Add(key, sourceDict[key]);
    }

    return randomDict.Select(kvp => kvp.Value).ToList();
}

Il codice è un po ' più lunghe rispetto ad altri dizionario approcci qui, perché io non sono solo l'aggiunta, ma anche la rimozione dall'elenco, quindi è un pò di due passanti.Si può vedere qui che non ho riordinate niente a count diventa uguale source.Count.Questo perché credo che casualità dovrebbero essere in risultato tutta.Voglio dire, se si desidera 5 elementi casuali da 1, 2, 3, 4, 5, non importa se si tratta di 1, 3, 4, 2, 5 o 1, 2, 3, 4, 5, ma , se è necessario 4 voci dalla stessa serie, quindi dovrebbe resa in modo imprevedibile 1, 2, 3, 4, 1, 3, 5, 2, 2, 3, 5, 4 ecc.In secondo luogo, quando il conteggio di oggetti casuali per essere restituito è più della metà del gruppo originale, quindi è più facile da rimuovere source.Count - count voci dal gruppo che aggiungere count elementi.Per motivi di prestazioni ho usato source invece di sourceDict per ottenere quindi casuale indice il metodo di rimozione.

Quindi, se avete { 1, 2, 3, 4 }, questo può finire in { 1, 2, 3 }, { 3, 4, 1 } ecc per 3 elementi.

3) Se avete bisogno di realmente distinti casuale valori del gruppo, tenendo conto del duplicati nel gruppo originale, allora si può utilizzare lo stesso approccio come sopra, ma una HashSet sarà più leggero di un dizionario.

public static List<T> GetTrueDistinctRandom<T>(this IList<T> source, int count, 
                                               bool throwArgumentOutOfRangeException = true)
{
    if (count > source.Count)
        throw new ArgumentOutOfRangeException();

    var set = new HashSet<T>(source);

    if (throwArgumentOutOfRangeException && count > set.Count)
        throw new ArgumentOutOfRangeException();

    List<T> list = hash.ToList();

    if (count >= set.Count)
        return list;

    if (count > set.Count / 2)
    {
        while (set.Count > count)
            set.Remove(list.GetRandom());

        return set.ToList();
    }

    var randoms = new HashSet<T>();
    randoms.AddRandomly(list, count);
    return randoms.ToList();
}

Il randoms variabile è fatto un HashSet per evitare duplicati (con aggiunta più raro dei rari casi in cui Random.Next il risultato è il medesimo valore, soprattutto quando l'ingresso in lista è di piccole dimensioni.

Così { 1, 2, 2, 4 } => 3 elementi casuali => { 1, 2, 4 } e non { 1, 2, 2}

{ 1, 2, 2, 4 } => 4 elementi casuali => eccezione!!o { 1, 2, 4 } a seconda del flag impostato.

Alcuni dei metodi di estensione che ho usato:

static Random rnd = new Random();
public static int GetRandomIndex<T>(this ICollection<T> source)
{
    return rnd.Next(source.Count);
}

public static T GetRandom<T>(this IList<T> source)
{
    return source[source.GetRandomIndex()];
}

static void AddRandomly<T>(this ICollection<T> toCol, IList<T> fromList, int count)
{
    while (toCol.Count < count)
        toCol.Add(fromList.GetRandom());
}

public static Dictionary<int, T> ToIndexedDictionary<T>(this IEnumerable<T> lst)
{
    return lst.ToIndexedDictionary(t => t);
}

public static Dictionary<int, T> ToIndexedDictionary<S, T>(this IEnumerable<S> lst, 
                                                           Func<S, T> valueSelector)
{
    int index = -1;
    return lst.ToDictionary(t => ++index, valueSelector);
}

Se si tratta di prestazioni con decine di 1000s di elementi nell'elenco di dover essere ripetuto 10000 volte, allora si può desiderare di avere più veloce classe random di System.Random, ma io non credo che sia un grosso problema, considerando quest'ultimo probabilmente non è mai un collo di bottiglia, è abbastanza abbastanza veloce..

Edit: Se avete bisogno di ri-organizzare l'ordine dei prodotti restituiti, allora non c'è niente che può battere dhakim la Fisher-Yates approccio a breve, dolce e semplice..

La semplice soluzione che uso io (probabilmente non va bene per gli elenchi di grandi dimensioni):Copia la lista in lista temporanea, poi in loop in modo casuale selezionare la Voce da temp elenco e metterlo in elementi selezionati elenco durante la rimozione forma temp elenco (e quindi non può essere modificata dall'utente).

Esempio:

List<Object> temp = OriginalList.ToList();
List<Object> selectedItems = new List<Object>();
Random rnd = new Random();
Object o;
int i = 0;
while (i < NumberOfSelectedItems)
{
            o = temp[rnd.Next(temp.Count)];
            selectedItems.Add(o);
            temp.Remove(o);
            i++;
 }

Ho combinato le diverse risposte di cui sopra per creare un Pigramente valutata metodo di estensione.Il mio test ha mostrato che Kyle approccio di Ordine(N)) è molte volte più lento di drzaus' uso di un set di proporre casuale indici di scegliere (di Ordine(K)).L'ex esegue molte più chiamate per il generatore di numeri casuali, più si ripeta più volte negli articoli.

Gli obiettivi del mio attuazione sono state:

1) non si rendono conto l'elenco completo se dato un oggetto IEnumerable che non è un IList.Se mi viene data una sequenza di un'infinità di elementi, non voglio esaurire la memoria.L'uso di Kyle approccio di soluzione.

2) Se posso dire che è un IList, utilizzare drzaus' approccio, con una torsione.Se K è più della metà di N, io rischio di botte come scegliere molti casuale indici di nuovo e di nuovo e di saltare loro.Così ho comporre una lista di indici di NON tenere.

3) posso garantire che la merce verrà restituita nello stesso ordine in cui sono stati rilevati.Kyle algoritmo richiesto nessuna alterazione.drzaus' algoritmo richiesto che io non emettono elementi in ordine casuale indici sono scelti.Ho capito tutti gli indici in un SortedSet, quindi emettono elementi ordinati in ordine di indice.

4) Se K è grande rispetto a N e I invertire il senso del set, poi ho enumerare tutti gli elementi di prova e se l'indice non è nel set.Questo significa che Perdo l'Ordine(K) il tempo di esecuzione, ma dato K è vicino a N in questi casi, non si perde molto.

Ecco il codice:

    /// <summary>
    /// Takes k elements from the next n elements at random, preserving their order.
    /// 
    /// If there are fewer than n elements in items, this may return fewer than k elements.
    /// </summary>
    /// <typeparam name="TElem">Type of element in the items collection.</typeparam>
    /// <param name="items">Items to be randomly selected.</param>
    /// <param name="k">Number of items to pick.</param>
    /// <param name="n">Total number of items to choose from.
    /// If the items collection contains more than this number, the extra members will be skipped.
    /// If the items collection contains fewer than this number, it is possible that fewer than k items will be returned.</param>
    /// <returns>Enumerable over the retained items.
    /// 
    /// See http://stackoverflow.com/questions/48087/select-a-random-n-elements-from-listt-in-c-sharp for the commentary.
    /// </returns>
    public static IEnumerable<TElem> TakeRandom<TElem>(this IEnumerable<TElem> items, int k, int n)
    {
        var r = new FastRandom();
        var itemsList = items as IList<TElem>;

        if (k >= n || (itemsList != null && k >= itemsList.Count))
            foreach (var item in items) yield return item;
        else
        {  
            // If we have a list, we can infer more information and choose a better algorithm.
            // When using an IList, this is about 7 times faster (on one benchmark)!
            if (itemsList != null && k < n/2)
            {
                // Since we have a List, we can use an algorithm suitable for Lists.
                // If there are fewer than n elements, reduce n.
                n = Math.Min(n, itemsList.Count);

                // This algorithm picks K index-values randomly and directly chooses those items to be selected.
                // If k is more than half of n, then we will spend a fair amount of time thrashing, picking
                // indices that we have already picked and having to try again.   
                var invertSet = k >= n/2;  
                var positions = invertSet ? (ISet<int>) new HashSet<int>() : (ISet<int>) new SortedSet<int>();

                var numbersNeeded = invertSet ? n - k : k;
                while (numbersNeeded > 0)
                    if (positions.Add(r.Next(0, n))) numbersNeeded--;

                if (invertSet)
                {
                    // positions contains all the indices of elements to Skip.
                    for (var itemIndex = 0; itemIndex < n; itemIndex++)
                    {
                        if (!positions.Contains(itemIndex))
                            yield return itemsList[itemIndex];
                    }
                }
                else
                {
                    // positions contains all the indices of elements to Take.
                    foreach (var itemIndex in positions)
                        yield return itemsList[itemIndex];              
                }
            }
            else
            {
                // Since we do not have a list, we will use an online algorithm.
                // This permits is to skip the rest as soon as we have enough items.
                var found = 0;
                var scanned = 0;
                foreach (var item in items)
                {
                    var rand = r.Next(0,n-scanned);
                    if (rand < k - found)
                    {
                        yield return item;
                        found++;
                    }
                    scanned++;
                    if (found >= k || scanned >= n)
                        break;
                }
            }
        }  
    }

Io uso specializzato generatore di numeri casuali, ma si può usare solo con C#'s Casuale se lo si desidera.(FastRandom è stato scritto da Colin Verde e fa parte di SharpNEAT.Ha un periodo di 2^128-1 che è meglio di molti Rng.)

Qui ci sono le unità di test:

[TestClass]
public class TakeRandomTests
{
    /// <summary>
    /// Ensure that when randomly choosing items from an array, all items are chosen with roughly equal probability.
    /// </summary>
    [TestMethod]
    public void TakeRandom_Array_Uniformity()
    {
        const int numTrials = 2000000;
        const int expectedCount = numTrials/20;
        var timesChosen = new int[100];
        var century = new int[100];
        for (var i = 0; i < century.Length; i++)
            century[i] = i;

        for (var trial = 0; trial < numTrials; trial++)
        {
            foreach (var i in century.TakeRandom(5, 100))
                timesChosen[i]++;
        }
        var avg = timesChosen.Average();
        var max = timesChosen.Max();
        var min = timesChosen.Min();
        var allowedDifference = expectedCount/100;
        AssertBetween(avg, expectedCount - 2, expectedCount + 2, "Average");
        //AssertBetween(min, expectedCount - allowedDifference, expectedCount, "Min");
        //AssertBetween(max, expectedCount, expectedCount + allowedDifference, "Max");

        var countInRange = timesChosen.Count(i => i >= expectedCount - allowedDifference && i <= expectedCount + allowedDifference);
        Assert.IsTrue(countInRange >= 90, String.Format("Not enough were in range: {0}", countInRange));
    }

    /// <summary>
    /// Ensure that when randomly choosing items from an IEnumerable that is not an IList, 
    /// all items are chosen with roughly equal probability.
    /// </summary>
    [TestMethod]
    public void TakeRandom_IEnumerable_Uniformity()
    {
        const int numTrials = 2000000;
        const int expectedCount = numTrials / 20;
        var timesChosen = new int[100];

        for (var trial = 0; trial < numTrials; trial++)
        {
            foreach (var i in Range(0,100).TakeRandom(5, 100))
                timesChosen[i]++;
        }
        var avg = timesChosen.Average();
        var max = timesChosen.Max();
        var min = timesChosen.Min();
        var allowedDifference = expectedCount / 100;
        var countInRange =
            timesChosen.Count(i => i >= expectedCount - allowedDifference && i <= expectedCount + allowedDifference);
        Assert.IsTrue(countInRange >= 90, String.Format("Not enough were in range: {0}", countInRange));
    }

    private IEnumerable<int> Range(int low, int count)
    {
        for (var i = low; i < low + count; i++)
            yield return i;
    }

    private static void AssertBetween(int x, int low, int high, String message)
    {
        Assert.IsTrue(x > low, String.Format("Value {0} is less than lower limit of {1}. {2}", x, low, message));
        Assert.IsTrue(x < high, String.Format("Value {0} is more than upper limit of {1}. {2}", x, high, message));
    }

    private static void AssertBetween(double x, double low, double high, String message)
    {
        Assert.IsTrue(x > low, String.Format("Value {0} is less than lower limit of {1}. {2}", x, low, message));
        Assert.IsTrue(x < high, String.Format("Value {0} is more than upper limit of {1}. {2}", x, high, message));
    }
}

Qui si ha una implementazione basata su Fisher-Yates Shuffle il cui algoritmo di complessità O(n) dove n è il sottoinsieme o la dimensione del campione, invece di lista di dimensioni, come Giovanni Shedletsky sottolineato.

public static IEnumerable<T> GetRandomSample<T>(this IList<T> list, int sampleSize)
{
    if (list == null) throw new ArgumentNullException("list");
    if (sampleSize > list.Count) throw new ArgumentException("sampleSize may not be greater than list count", "sampleSize");
    var indices = new Dictionary<int, int>(); int index;
    var rnd = new Random();

    for (int i = 0; i < sampleSize; i++)
    {
        int j = rnd.Next(i, list.Count);
        if (!indices.TryGetValue(j, out index)) index = j;

        yield return list[index];

        if (!indices.TryGetValue(i, out index)) index = i;
        indices[j] = index;
    }
}

Basato su Kyle risposta, ecco la mia implementazione c#.

/// <summary>
/// Picks random selection of available game ID's
/// </summary>
private static List<int> GetRandomGameIDs(int count)
{       
    var gameIDs = (int[])HttpContext.Current.Application["NonDeletedArcadeGameIDs"];
    var totalGameIDs = gameIDs.Count();
    if (count > totalGameIDs) count = totalGameIDs;

    var rnd = new Random();
    var leftToPick = count;
    var itemsLeft = totalGameIDs;
    var arrPickIndex = 0;
    var returnIDs = new List<int>();
    while (leftToPick > 0)
    {
        if (rnd.Next(0, itemsLeft) < leftToPick)
        {
            returnIDs .Add(gameIDs[arrPickIndex]);
            leftToPick--;
        }
        arrPickIndex++;
        itemsLeft--;
    }

    return returnIDs ;
}

Questo metodo può essere equivalente a Kyle.

Dire che la tua lista è di dimensione n e volete k elementi.

Random rand = new Random();
for(int i = 0; k>0; ++i) 
{
    int r = rand.Next(0, n-i);
    if(r<k) 
    {
        //include element i
        k--;
    }
}

Funziona come un fascino :)

-Alex Gilbert

Si estende da @re la risposta, se uno è preoccupato per le possibili diverse implementazioni di OrderBy, questo dovrebbe essere sicuro:

// Instead of this
YourList.OrderBy(x => rnd.Next()).Take(5)

// Temporarily transform 
YourList
    .Select(v => new {v, i = rnd.Next()}) // Associate a random index to each entry
    .OrderBy(x => x.i).Take(5) // Sort by (at this point fixed) random index 
    .Select(x => x.v); // Go back to enumerable of entry

Questo è il meglio che ho potuto venire con un primo taglio:

public List<String> getRandomItemsFromList(int returnCount, List<String> list)
{
    List<String> returnList = new List<String>();
    Dictionary<int, int> randoms = new Dictionary<int, int>();

    while (randoms.Count != returnCount)
    {
        //generate new random between one and total list count
        int randomInt = new Random().Next(list.Count);

        // store this in dictionary to ensure uniqueness
        try
        {
            randoms.Add(randomInt, randomInt);
        }
        catch (ArgumentException aex)
        {
            Console.Write(aex.Message);
        } //we can assume this element exists in the dictonary already 

        //check for randoms length and then iterate through the original list 
        //adding items we select via random to the return list
        if (randoms.Count == returnCount)
        {
            foreach (int key in randoms.Keys)
                returnList.Add(list[randoms[key]]);

            break; //break out of _while_ loop
        }
    }

    return returnList;
}

Utilizzo di un elenco di casuali all'interno di un intervallo di 1 - elenco totale conteggio e poi basta tirare quelle voci in elenco, sembrava essere il modo migliore, ma usare il Dizionario per garantire l'unicità è qualcosa che sto ancora rimuginando sopra.

Inoltre nota che ho usato un elenco di stringhe, sostituire, se necessario.

perché non qualcosa di simile a questo:

 Dim ar As New ArrayList
    Dim numToGet As Integer = 5
    'hard code just to test
    ar.Add("12")
    ar.Add("11")
    ar.Add("10")
    ar.Add("15")
    ar.Add("16")
    ar.Add("17")

    Dim randomListOfProductIds As New ArrayList

    Dim toAdd As String = ""
    For i = 0 To numToGet - 1
        toAdd = ar(CInt((ar.Count - 1) * Rnd()))

        randomListOfProductIds.Add(toAdd)
        'remove from id list
        ar.Remove(toAdd)

    Next
'sorry i'm lazy and have to write vb at work :( and didn't feel like converting to c#

È molto più difficile di quanto si possa pensare.Vedere la grande Articolo "Rimescolamento" da parte di Jeff.

Ho scritto un breve articolo su questo tema, codice C#:
Ritorno sottoinsieme casuale di N elementi di una matrice data

Obiettivo:Selezionare il numero N di oggetti da collezione sorgente senza duplicazione.Ho creato un'estensione per qualsiasi raccolta generica.Ecco come ho fatto:

public static class CollectionExtension
{
    public static IList<TSource> RandomizeCollection<TSource>(this IList<TSource> source, int maxItems)
    {
        int randomCount = source.Count > maxItems ? maxItems : source.Count;
        int?[] randomizedIndices = new int?[randomCount];
        Random random = new Random();

        for (int i = 0; i < randomizedIndices.Length; i++)
        {
            int randomResult = -1;
            while (randomizedIndices.Contains((randomResult = random.Next(0, source.Count))))
            {
                //0 -> since all list starts from index 0; source.Count -> maximum number of items that can be randomize
                //continue looping while the generated random number is already in the list of randomizedIndices
            }

            randomizedIndices[i] = randomResult;
        }

        IList<TSource> result = new List<TSource>();
        foreach (int index in randomizedIndices)
            result.Add(source.ElementAt(index));

        return result;
    }
}

Recentemente ho fatto questo sul mio progetto utilizzando un idea simile a Tyler punto 1.
Stavo caricando un sacco di domande e la selezione di cinque a caso.L'ordinamento è stato raggiunto utilizzando un IComparer.
atutte le domande sono state caricate in un QuestionSorter lista, che è stata quindi ordinata utilizzando il Elenco della funzione di Ordinamento e i primi k elementi selezionati.

    private class QuestionSorter : IComparable<QuestionSorter>
    {
        public double SortingKey
        {
            get;
            set;
        }

        public Question QuestionObject
        {
            get;
            set;
        }

        public QuestionSorter(Question q)
        {
            this.SortingKey = RandomNumberGenerator.RandomDouble;
            this.QuestionObject = q;
        }

        public int CompareTo(QuestionSorter other)
        {
            if (this.SortingKey < other.SortingKey)
            {
                return -1;
            }
            else if (this.SortingKey > other.SortingKey)
            {
                return 1;
            }
            else
            {
                return 0;
            }
        }
    }

Utilizzo:

    List<QuestionSorter> unsortedQuestions = new List<QuestionSorter>();

    // add the questions here

    unsortedQuestions.Sort(unsortedQuestions as IComparer<QuestionSorter>);

    // select the first k elements

Ecco il mio approccio al testo integrale qui http://krkadev.blogspot.com/2010/08/random-numbers-without-repetition.html ).

Esso deve essere eseguito in O(K) al posto di O(N), dove K è il numero di wanted elementi e N è la dimensione della lista tra cui scegliere:

public <T> List<T> take(List<T> source, int k) {
 int n = source.size();
 if (k > n) {
   throw new IllegalStateException(
     "Can not take " + k +
     " elements from a list with " + n +
     " elements");
 }
 List<T> result = new ArrayList<T>(k);
 Map<Integer,Integer> used = new HashMap<Integer,Integer>();
 int metric = 0;
 for (int i = 0; i < k; i++) {
   int off = random.nextInt(n - i);
   while (true) {
     metric++;
     Integer redirect = used.put(off, n - i - 1);
     if (redirect == null) {
       break;
     }
     off = redirect;
   }
   result.add(source.get(off));
 }
 assert metric <= 2*k;
 return result;
}

Questo non è elegante o efficiente come soluzione accettata, ma è veloce a scrivere.Primo, permute array in modo casuale, quindi selezionare i primi K elementi.In python,

import numpy

N = 20
K = 5

idx = np.arange(N)
numpy.random.shuffle(idx)

print idx[:K]

Vorrei utilizzare un metodo di estensione.

    public static IEnumerable<T> TakeRandom<T>(this IEnumerable<T> elements, int countToTake)
    {
        var random = new Random();

        var internalList = elements.ToList();

        var selected = new List<T>();
        for (var i = 0; i < countToTake; ++i)
        {
            var next = random.Next(0, internalList.Count - selected.Count);
            selected.Add(internalList[next]);
            internalList[next] = internalList[internalList.Count - selected.Count];
        }
        return selected;
    }

public static IEnumerable<T> GetRandom<T>(this IList<T> list, int count, Random random)
    {
        // Probably you should throw exception if count > list.Count
        count = Math.Min(list.Count, count);

        var selectedIndices = new SortedSet<int>();

        // Random upper bound
        int randomMax = list.Count - 1;

        while (selectedIndices.Count < count)
        {
            int randomIndex = random.Next(0, randomMax);

            // skip over already selected indeces
            foreach (var selectedIndex in selectedIndices)
                if (selectedIndex <= randomIndex)
                    ++randomIndex;
                else
                    break;

            yield return list[randomIndex];

            selectedIndices.Add(randomIndex);
            --randomMax;
        }
    }

Memoria:~conte
Complessità:O(conte²)

Quando N è molto grande, il metodo normale che casualmente si mescola il N numeri e seleziona, per dire, primi k numeri, può essere proibitivo a causa di spazio complessità.Il seguente algoritmo richiede solo O(k) sia per il tempo e lo spazio complessità.

http://arxiv.org/abs/1512.00501

def random_selection_indices(num_samples, N):
    modified_entries = {}
    seq = []
    for n in xrange(num_samples):
        i = N - n - 1
        j = random.randrange(i)

        # swap a[j] and a[i] 
        a_j = modified_entries[j] if j in modified_entries else j 
        a_i = modified_entries[i] if i in modified_entries else i

        if a_i != j:
            modified_entries[j] = a_i   
        elif j in modified_entries:   # no need to store the modified value if it is the same as index
            modified_entries.pop(j)

        if a_j != i:
            modified_entries[i] = a_j 
        elif i in modified_entries:   # no need to store the modified value if it is the same as index
            modified_entries.pop(i)
        seq.append(a_j)
    return seq

Utilizzando LINQ con elenchi di grandi dimensioni (quando costose toccare ogni elemento) E se si può vivere con la possibilità di duplicati:

new int[5].Select(o => (int)(rnd.NextDouble() * maxIndex)).Select(i => YourIEnum.ElementAt(i))

Per il mio utilizzo ho avuto un elenco di 100.000 elementi, e a causa del loro essere tirato da un DB ho circa halfed (o meglio) rispetto a un rnd su tutta la lista.

Avendo un ampio elenco di ridurre la probabilità molto per i duplicati.

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow