.NET data structures: ArrayList, List, HashTable, Dictionary, SortedList, SortedDictionary - speed, memory, and when to use each?

StackOverflow https://stackoverflow.com/questions/128636

Question

.NET has a lot of complex data structures. Unfortunately, some of them are quite similar, and I'm not always sure when to use one rather than another. Most of my C# and Visual Basic books talk about them to a certain extent, but they never really go into detail.

What's the difference between Array, ArrayList, List, Hashtable, Dictionary, SortedList, and SortedDictionary?

Which ones are enumerable (IList - can do 'foreach' loops)? Which ones use key/value pairs (IDict)?

What about memory footprint? Insertion speed? Retrieval speed?

Are there any other data structures worth mentioning?

I'm still looking for more details on memory usage and speed (Big-O notation).


Solution

Off the top of my head:

  • Array * - represents an old-school memory array, kind of like an alias for a normal type[] array. Can enumerate. Can't grow automatically. I would assume very fast insertion and retrieval speed.

  • ArrayList - automatically growing array. Adds more overhead. Can enumerate. Probably slower than a normal array but still pretty fast. These are used a lot in .NET.

  • List - one of my favorites - can be used with generics, so you can have a strongly typed array, e.g. List<string>. Other than that, it acts very much like ArrayList.

  • Hashtable - plain old hash table. O(1) to O(n) worst case. Can enumerate the Values and Keys properties, and do key/value pairs.

  • Dictionary - same as above, only strongly typed via generics, such as Dictionary<string, string>.

  • SortedList - a sorted generic list. Slower on insertion since it has to figure out where to put things. Can enumerate. Probably the same on retrieval since it doesn't have to re-sort, but deletion will be slower than a plain old list.

I tend to use List and Dictionary all the time - once you start using them strongly typed with generics, it's really hard to go back to the standard non-generic ones.
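To make the typed versus untyped distinction above concrete, here is a minimal sketch. The class name, method names, and sample data are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class TypedVsUntyped
{
    // Non-generic ArrayList: elements come back as object and must be cast.
    public static int SumUntyped()
    {
        var numbers = new ArrayList { 1, 2, 3 };   // each int is boxed
        int sum = 0;
        foreach (object item in numbers)
            sum += (int)item;                      // explicit unboxing cast
        return sum;
    }

    // Generic List<int>: the element type is checked at compile time.
    public static int SumTyped()
    {
        var numbers = new List<int> { 1, 2, 3 };
        int sum = 0;
        foreach (int item in numbers)
            sum += item;
        return sum;
    }

    // Dictionary<TKey, TValue> is the typed counterpart of Hashtable.
    public static string LookupCapital(string country)
    {
        var capitals = new Dictionary<string, string>
        {
            ["France"] = "Paris",
            ["Japan"] = "Tokyo",
        };
        return capitals.TryGetValue(country, out string capital) ? capital : "(unknown)";
    }

    public static void Main()
    {
        Console.WriteLine(SumUntyped());
        Console.WriteLine(SumTyped());
        Console.WriteLine(LookupCapital("Japan"));
    }
}
```

Both sums give the same result; the difference is that a mistake such as adding a string to the List<int> is rejected by the compiler, while the ArrayList version would only fail at run time.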

There are lots of other data structures too - there's KeyValuePair, which you can use to do some interesting things, and there's SortedDictionary, which can be useful as well.

Other tips

If at all possible, use generics. This includes:

  • List instead of ArrayList
  • Dictionary instead of HashTable

First, all collections in .NET implement IEnumerable.

Second, a lot of the collections are duplicates because generics were added in version 2.0 of the framework.

So, although the generic collections likely add features, for the most part:

  • List is a generic implementation of ArrayList.
  • Dictionary is a generic implementation of Hashtable.

Arrays are a fixed-size collection in which you can change the value stored at a given index.

SortedDictionary is an IDictionary that is sorted based on the keys. SortedList is an IDictionary that is also kept sorted by key, using an IComparer (the default comparer when none is supplied).

So, the IDictionary implementations (those supporting KeyValuePairs) are:

  • Hashtable
  • Dictionary
  • SortedList
  • SortedDictionary
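As a rough illustration of the two sorted dictionaries mentioned above, the sketch below fills each with the same unsorted keys and reads them back. The class name, method names, and fruit data are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class SortedCollections
{
    // Inserts keys out of order and returns them in enumeration order.
    public static string KeysInOrder(IDictionary<string, int> map)
    {
        map["pear"] = 3;
        map["apple"] = 1;
        map["mango"] = 2;
        return string.Join(",", map.Keys);
    }

    // SortedList additionally exposes its keys by position.
    public static string SmallestKey()
    {
        var list = new SortedList<string, int>();
        KeysInOrder(list);
        return list.Keys[0];
    }

    public static void Main()
    {
        Console.WriteLine(KeysInOrder(new SortedList<string, int>()));
        Console.WriteLine(KeysInOrder(new SortedDictionary<string, int>()));
        Console.WriteLine(SmallestKey());
    }
}
```

Both collections enumerate in key order whatever the insertion order. According to the .NET documentation, SortedList is backed by arrays and SortedDictionary by a binary search tree, which is what drives their different insertion and memory characteristics.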

Another collection that was added in .NET 3.5 is the HashSet. It is a collection that supports set operations.
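A short sketch of what "set operations" means in practice with HashSet<T>; the helper names and the sorting of the result (done only to make the output order predictable) are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SetOperations
{
    // Elements present in both inputs, returned sorted.
    public static int[] Common(IEnumerable<int> a, IEnumerable<int> b)
    {
        var set = new HashSet<int>(a);
        set.IntersectWith(b);           // in-place intersection
        return Sorted(set);
    }

    // Elements present in either input, duplicates removed, returned sorted.
    public static int[] Combined(IEnumerable<int> a, IEnumerable<int> b)
    {
        var set = new HashSet<int>(a);
        set.UnionWith(b);               // in-place union
        return Sorted(set);
    }

    private static int[] Sorted(HashSet<int> set)
    {
        var result = new List<int>(set);
        result.Sort();                  // HashSet itself has no defined order
        return result.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Common(new[] { 1, 2, 3, 4 }, new[] { 3, 4, 5 })));
        Console.WriteLine(string.Join(",", Combined(new[] { 1, 2, 2 }, new[] { 2, 3 })));
    }
}
```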

Also, LinkedList is a standard linked-list implementation (List is an array-backed list, for faster retrieval).
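The trade-off can be sketched as follows: a LinkedList<T> inserts next to a known node without shifting anything, while a List<T> offers direct access by index. The names and values below are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedVsArrayList
{
    // Builds a, b, c using node-relative insertions.
    public static string BuildLinked()
    {
        var linked = new LinkedList<string>();
        linked.AddLast("b");
        linked.AddFirst("a");                         // no elements are shifted
        LinkedListNode<string> node = linked.Find("b");
        linked.AddAfter(node, "c");                   // insertion next to a known node
        return string.Join(",", linked);
    }

    // List<T> supports direct access by position; LinkedList<T> has no indexer.
    public static string SecondOf(List<string> items)
    {
        return items[1];
    }

    public static void Main()
    {
        Console.WriteLine(BuildLinked());
        Console.WriteLine(SecondOf(new List<string> { "a", "b", "c" }));
    }
}
```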

A good cheat sheet mentioning the complexities of data structures, algorithms, etc.

Here are a few general tips for you:

  • You can use foreach on types that implement IEnumerable. IList is essentially an IEnumerable with Count and Item (accessing items using a zero-based index) properties. IDictionary, on the other hand, means you can access items by any hashable index.

  • Array, ArrayList and List all implement IList. Dictionary, SortedDictionary, and Hashtable implement IDictionary.

  • If you are using .NET 2.0 or higher, it is recommended that you use the generic counterparts of the mentioned types.

  • For the time and space complexity of various operations on these types, you should consult their documentation.

  • .NET data structures are in the System.Collections namespace. There are type libraries such as PowerCollections which offer additional data structures.

  • To get a thorough understanding of data structures, consult resources such as CLRS.
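The interface tips above can be sketched with three small helpers that each depend on one interface only. The helper names are hypothetical; the interfaces are the non-generic ones from System.Collections.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class InterfaceDemo
{
    // Anything that implements IEnumerable works with foreach.
    public static int CountItems(IEnumerable items)
    {
        int count = 0;
        foreach (object item in items)
            count++;
        return count;
    }

    // IList adds access by zero-based index.
    public static object First(IList items)
    {
        return items[0];
    }

    // IDictionary adds access by key.
    public static object ValueFor(IDictionary map, object key)
    {
        return map[key];
    }

    public static void Main()
    {
        Console.WriteLine(CountItems(new[] { 7, 8, 9 }));            // an array
        Console.WriteLine(First(new ArrayList { "x", "y" }));       // an ArrayList
        Console.WriteLine(First(new List<int> { 7, 8 }));           // a List<int>
        Console.WriteLine(ValueFor(new Hashtable { { "a", 1 } }, "a"));
        Console.WriteLine(ValueFor(new Dictionary<string, int> { { "b", 2 } }, "b"));
    }
}
```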

.NET data structures:

More on the conversation about why ArrayList and List are actually different

Arrays

As one user states, Arrays are the "old school" collection (yes, arrays are considered a collection though not part of System.Collections). But what is "old school" about arrays in comparison to other collections, i.e., the ones you have listed in your title (here, ArrayList and List(Of T))? Let's start with the basics by looking at Arrays.

To start, Arrays in Microsoft .NET are, "mechanisms that allow you to treat several [logically-related] items as a single collection," (see linked article). What does that mean? Arrays store individual members (elements) sequentially, one after the other in memory with a starting address. By using the array, we can easily access the sequentially stored elements beginning at that address.

Beyond that and contrary to programming 101 common conceptions, Arrays really can be quite complex:

Arrays can be single dimension, multidimensional, or jagged (jagged arrays are worth reading about). Arrays themselves are not dynamic: once initialized, an array of n size reserves enough space to hold n number of objects. The number of elements in the array cannot grow or shrink. Dim _array As Int32() = New Int32(99) {} reserves enough space on the memory block for the array to contain 100 Int32 primitive type objects (VB.NET takes the upper bound, so 99 yields 100 elements; the array is initialized to contain 0s). The address of this block is returned to _array.
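For readers who work in C# rather than VB, the three array shapes described above look roughly like this. The class and method names are assumptions for the example.

```csharp
using System;

public static class ArrayShapes
{
    // Single-dimension: C# takes the element count, so this holds 100 zeros.
    public static int[] Single()
    {
        return new int[100];
    }

    // Multidimensional (rectangular): every row has the same length.
    public static int[,] Grid()
    {
        return new int[2, 3];
    }

    // Jagged: an array of arrays, each row sized independently.
    public static int[][] Jagged()
    {
        var rows = new int[2][];
        rows[0] = new int[1];
        rows[1] = new int[4];
        return rows;
    }

    public static void Main()
    {
        Console.WriteLine(Single().Length);
        Console.WriteLine(Grid().GetLength(1));
        Console.WriteLine(Jagged()[1].Length);
    }
}
```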

According to the article, the Common Language Specification (CLS) requires that all arrays be zero-based. Arrays in .NET support non-zero-based arrays; however, this is less common. As a result of the "common-ness" of zero-based arrays, Microsoft has spent a lot of time optimizing their performance; therefore, single dimension, zero-based (SZ) arrays are "special" - and really the best implementation of an array (as opposed to multidimensional, etc.) - because SZ arrays have specific intermediate language (IL) instructions for manipulating them.
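A non-zero-based array cannot be written with the usual C# array syntax, but it can be created through Array.CreateInstance. The sketch below is only an illustration; the class name and the stored value are assumptions.

```csharp
using System;

public static class OneBasedArray
{
    // Creates a 5-element int array whose valid indexes are 1..5.
    public static Array Create()
    {
        Array oneBased = Array.CreateInstance(
            typeof(int),
            new[] { 5 },    // length of each dimension
            new[] { 1 });   // lower bound of each dimension
        oneBased.SetValue(42, 1);   // store at the first valid index
        return oneBased;
    }

    public static void Main()
    {
        Array a = Create();
        Console.WriteLine(a.GetLowerBound(0));
        Console.WriteLine(a.GetUpperBound(0));
        Console.WriteLine(a.GetValue(1));
    }
}
```

Such an array is not an int[] and has to be used through the Array API, which is part of why zero-based, single-dimension arrays are the common and better-optimized case.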

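Arrays are reference types, so passing an array passes a reference (a memory address) rather than a copy of its elements - an important piece of the Array puzzle to know. Element access is bounds-checked (an out-of-range index throws an exception), although bounds checking can be bypassed, for example with pointers in unsafe code.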
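A small sketch of what this means for callers, with hypothetical method names: changes to elements are visible outside the method, reassigning the parameter is not, and reading past the end throws.

```csharp
using System;

public static class ArrayReferences
{
    // The method receives a copy of the reference, so it sees the same elements.
    public static void ZeroFirst(int[] values)
    {
        values[0] = 0;
    }

    // Reassigning the parameter only changes the method's local reference.
    public static void TryReplace(int[] values)
    {
        values = new[] { 9, 9, 9 };
    }

    // Reading one past the end is caught by the runtime's bounds check.
    public static bool ReadPastEndThrows(int[] values)
    {
        try
        {
            int unused = values[values.Length];
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var data = new[] { 1, 2, 3 };
        ZeroFirst(data);
        TryReplace(data);
        Console.WriteLine(string.Join(",", data));
        Console.WriteLine(ReadPastEndThrows(data));
    }
}
```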

Again, the biggest hindrance to arrays is that they are not re-sizable. They have a "fixed" capacity. Introducing ArrayList and List(Of T) to our history:

ArrayList - non-generic list

The ArrayList (along with List(Of T) - though there are some critical differences, explained later) is perhaps best thought of as the next addition to collections (in the broad sense). ArrayList implements the IList interface (a descendant of 'ICollection'). ArrayLists themselves are bulkier, requiring more overhead, than Lists.

IList does enable the implementation to treat ArrayLists as fixed-size lists (like Arrays); however, beyond the additional functionality added by ArrayLists, there are no real advantages to using fixed-size ArrayLists, as ArrayLists (over Arrays) in this case are markedly slower.

From my reading, ArrayLists cannot be jagged: "Using multidimensional arrays as elements... is not supported". Again, another nail in the coffin of ArrayLists. ArrayLists are also not "typed" - meaning that, underneath everything, an ArrayList is simply a dynamic Array of Objects: Object[]. When the elements are value types such as Int32, this requires a lot of boxing (implicit) and unboxing (explicit), again adding to their overhead.
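Because an ArrayList stores everything as Object, a wrong assumption about an element's type only surfaces at run time. The following sketch, with assumed names and sample values, shows both the boxing and the failure mode.

```csharp
using System;
using System.Collections;

public static class UntypedPitfall
{
    // A value type stored as object is boxed; 'is' reveals its runtime type.
    public static bool IsBoxedInt(object value)
    {
        return value is int;
    }

    // Unboxing to the wrong type compiles but throws at run time.
    public static bool WrongCastThrows()
    {
        var items = new ArrayList();
        items.Add(56);          // boxed int
        items.Add("String");    // a reference type, stored as-is
        try
        {
            int unused = (int)items[1];   // element 1 is a string
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsBoxedInt(56));
        Console.WriteLine(WrongCastThrows());
    }
}
```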

Unsubstantiated thought: I think I remember either reading or having heard from one of my professors that ArrayLists are sort of the bastard conceptual child of the attempt to move from Arrays to List-type Collections, i.e. while once having been a great improvement to Arrays, they are no longer the best option as further development has been done with respect to collections

List(Of T): What ArrayList became (and hoped to be)

The difference in memory usage is significant: a List(Of Int32) consumed 56% less memory than an ArrayList containing the same primitive type (8 MB vs. 19 MB in the above gentleman's linked demonstration), though this result is compounded by the 64-bit machine. This difference really demonstrates two things: first (1), a boxed Int32-type "object" (ArrayList) is much bigger than a pure Int32 primitive type (List); second (2), the gap widens on a 64-bit machine, where every boxed value carries a larger object header and an 8-byte reference.

So, what's the difference and what is a List(Of T)? MSDN defines a List(Of T) as, "... a strongly typed list of objects that can be accessed by index." The importance here is the "strongly typed" bit: a List(Of T) 'recognizes' types and stores the objects as their type. So, an Int32 is stored as an Int32 and not an Object type. This eliminates the issues caused by boxing and unboxing.

MSDN specifies this difference only comes into play when storing value types and not reference types. Also, the difference really occurs on a large scale: over 500 elements. What's more interesting is that the MSDN documentation reads, "It is to your advantage to use the type-specific implementation of the List(Of T) class instead of using the ArrayList class...."

Essentially, List(Of T) is ArrayList, but better. It is the "generic equivalent" of ArrayList. Like ArrayList, it is not guaranteed to be sorted until sorted (go figure). List(Of T) also has some added functionality.

I sympathise with the question - I too found (find?) the choice bewildering, so I set out scientifically to see which data structure is the fastest (I did the test using VB, but I imagine C# would be the same, since both languages do the same thing at the CLR level). You can see some benchmarking results conducted by me here (there's also some discussion of which data type is best to use in which circumstances).

They're spelled out pretty well in IntelliSense. Just type System.Collections. or System.Collections.Generic. (preferred) and you'll get a list and short description of what's available.

Hashtables/Dictionaries are O(1) performance, meaning that performance is not a function of size. That's important to know.

EDIT: In practice, the average time complexity for Hashtable/Dictionary<> lookups is O(1).
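To illustrate what "not a function of size" means, the sketch below contrasts a linear scan, whose work grows with the collection, with a hashed lookup that goes more or less straight to the entry. The helper names and word list are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class LookupCost
{
    // Built once: maps each word to its position.
    public static Dictionary<string, int> BuildIndex(IList<string> words)
    {
        var index = new Dictionary<string, int>();
        for (int i = 0; i < words.Count; i++)
            index[words[i]] = i;
        return index;
    }

    // O(n): may have to inspect every element.
    public static int PositionByScan(IList<string> words, string target)
    {
        for (int i = 0; i < words.Count; i++)
            if (words[i] == target)
                return i;
        return -1;
    }

    // Average O(1): hashes the key and looks in one bucket.
    public static int PositionByHash(Dictionary<string, int> index, string target)
    {
        return index.TryGetValue(target, out int position) ? position : -1;
    }

    public static void Main()
    {
        var words = new List<string> { "alpha", "beta", "gamma" };
        var index = BuildIndex(words);
        Console.WriteLine(PositionByScan(words, "gamma"));
        Console.WriteLine(PositionByHash(index, "gamma"));
    }
}
```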

The generic collections will perform better than their non-generic counterparts, especially when iterating through many items of a value type. This is because boxing and unboxing no longer occur.

An important note about Hashtable vs Dictionary for high frequency systematic trading engineering: Thread Safety Issue

Hashtable is thread safe for use by multiple reader threads and a single writing thread. Dictionary public static members are thread safe, but any instance members are not guaranteed to be so.

So Hashtable remains the 'standard' choice in this regard.

There are subtle and not-so-subtle differences between generic and non-generic collections. They merely use different underlying data structures. For example, Hashtable guarantees one-writer-many-readers without sync. Dictionary does not.
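When several threads write at once, neither type is safe on its own. The sketch below shows two common options, under the assumption that simple locking is acceptable: the wrapper returned by Hashtable.Synchronized, and an explicit lock around a Dictionary. The class name, method names, and loop sizes are illustrative.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ConcurrentWrites
{
    // All access goes through the synchronized wrapper, which supports multiple writers.
    public static int FillSynchronizedHashtable(int count)
    {
        Hashtable table = Hashtable.Synchronized(new Hashtable());
        Parallel.For(0, count, i =>
        {
            table[i] = i;
        });
        return table.Count;
    }

    // Dictionary has no built-in synchronization, so every write takes a lock.
    public static int FillLockedDictionary(int count)
    {
        var map = new Dictionary<int, int>();
        var gate = new object();
        Parallel.For(0, count, i =>
        {
            lock (gate)
            {
                map[i] = i;
            }
        });
        return map.Count;
    }

    public static void Main()
    {
        Console.WriteLine(FillSynchronizedHashtable(1000));
        Console.WriteLine(FillLockedDictionary(1000));
    }
}
```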

Actually, I think MSDN helps provide pretty good answers to all these questions. Just look up .NET collections.

Most popular C# Data Structures and Collections

  • Array
  • ArrayList
  • List
  • LinkedList
  • Dictionary
  • HashSet
  • Stack
  • Queue
  • SortedList

C#.NET has a lot of different data structures, for example, one of the most common ones is an Array. However C# comes with many more basic data structures. Choosing the correct data structure to use is part of writing a well structured and efficient program.

In this article I will go over the built-in C# data structures, including the new ones introduced in C#.NET 3.5. Note that many of these data structures apply to other programming languages.

Array

Perhaps the simplest and most common data structure is the array. A C# array is basically a list of objects. Its defining traits are that all the objects are the same type (in most cases) and there is a specific number of them. The nature of an array allows for very fast access to elements based on their position within the list (otherwise known as the index). A C# array is defined like this:

[object type][] myArray = new [object type][number of elements]

Some examples:

 int[] myIntArray = new int[5];
 int[] myIntArray2 = { 0, 1, 2, 3, 4 };

As you can see from the example above, an array can be initialized with default values (zeros here) or from a set of existing values. Inserting values into an array is simple as long as they fit. The operation becomes costly when there are more elements than the array can hold: an array cannot grow in place, so a new, bigger array must be allocated and all the existing elements copied over to it.
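The copying step described above can be sketched by hand. The Append helper is hypothetical; Array.Copy and Array.Resize are the framework methods that do the actual work.

```csharp
using System;

public static class ArrayGrowth
{
    // Returns a new array one element longer; the original is left untouched.
    public static int[] Append(int[] source, int value)
    {
        var bigger = new int[source.Length + 1];
        Array.Copy(source, bigger, source.Length);   // copy every existing element
        bigger[source.Length] = value;
        return bigger;
    }

    // Array.Resize does the same allocate-and-copy and swaps the reference.
    public static int[] Grow(int[] source, int newSize)
    {
        Array.Resize(ref source, newSize);
        return source;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Append(new[] { 1, 2 }, 3)));
        Console.WriteLine(Grow(new[] { 1, 2 }, 4).Length);
    }
}
```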

ArrayList

The C# data structure ArrayList is a dynamic array. That means an ArrayList can hold any number of objects of any type. This data structure was designed to simplify the process of adding new elements to an array. Under the hood, an ArrayList is an array whose size is doubled every time it runs out of space. Doubling the size of the internal array is a very effective strategy that reduces the amount of element-copying in the long run. We won't get into the proof of that here. The data structure is very simple to use:

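    // Requires: using System.Collections; (and System.Windows.Forms for Form)
    ArrayList myArrayList = new ArrayList();
    myArrayList.Add(56);            // the int is boxed to object
    myArrayList.Add("String");
    myArrayList.Add(new Form());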
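The growth strategy mentioned earlier can be observed through the Capacity property. This is a small sketch with assumed names; the exact capacities are an implementation detail of the runtime, so the code only records them rather than relying on particular values.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class GrowthDemo
{
    // Adds 'count' items and records each new capacity of the internal array.
    public static List<int> CapacityChanges(int count)
    {
        var list = new ArrayList();
        var seen = new List<int>();
        int last = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != last)
            {
                last = list.Capacity;   // the internal array was reallocated
                seen.Add(last);
            }
        }
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", CapacityChanges(100)));
    }
}
```

Only a handful of reallocations happen for 100 additions, which is the point of growing geometrically instead of one slot at a time.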

The downside to the ArrayList data structure is that one must cast the retrieved values back into their original type:

int arrayListValue = (int)myArrayList[0];


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow