Question

After reading the excellent accepted answer in this question:

How is the c#/.net 3.5 dictionary implemented?

I decided to set my initial capacity to a large guess and then trim it after I have read in all values. How can I do this? That is, how can I trim a Dictionary so that the GC can collect the unused space later?

My goal with this is optimization. I often have large datasets, and the time penalty for small datasets is acceptable. I want to avoid the overhead of reallocating and copying the data that is incurred with small initial capacities on large datasets.


Solution 2

In .NET 5 (and, as far as I know, since .NET Core 2.1) Dictionary has a TrimExcess method that does exactly what you're asking. From the documentation:

Sets the capacity of this dictionary to what it would be if it had been originally initialized with all its entries.
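A minimal sketch of how this might look in practice, assuming a runtime where Dictionary<TKey, TValue>.TrimExcess and EnsureCapacity are available. The variable names and the sizes (1,000,000 reserved, 1,000 used) are made up for illustration. EnsureCapacity(0) is used here only as a way to read the current capacity without growing it.

```csharp
using System;
using System.Collections.Generic;

// Deliberately over-allocate, as described in the question.
var lookup = new Dictionary<int, string>(1_000_000);

// Far fewer entries arrive than were reserved.
for (int i = 0; i < 1_000; i++)
{
    lookup.Add(i, "value " + i);
}

// EnsureCapacity(0) never grows the dictionary; it just reports the current capacity.
int before = lookup.EnsureCapacity(0);

// Shrink the internal arrays to fit the entries actually stored.
// The old, larger arrays become unreachable and can be collected.
lookup.TrimExcess();

int after = lookup.EnsureCapacity(0);

Console.WriteLine($"entries: {lookup.Count}, capacity before: {before}, after: {after}");
```

Note that trimming allocates new, smaller arrays and moves the entries across, so it is a one-off cost that only pays off when the dictionary is kept around afterwards.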

OTHER TIPS

According to Reflector, the Dictionary class in .NET 3.5 never shrinks. void Resize() is hard-coded to always grow, to roughly double the size (a prime at or above twice the current count).

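You can probably create a new dictionary and use the constructor that takes an existing IDictionary<TKey, TValue> to copy over the items. This will be quite inefficient, because every entry has to be inserted again into the new dictionary.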
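The copy approach could be sketched roughly like this. The names oversized and compact and the sample keys are hypothetical; the only API relied on is the Dictionary constructor that accepts an existing dictionary and a comparer.

```csharp
using System;
using System.Collections.Generic;

// Dictionary created with a large guessed capacity.
var oversized = new Dictionary<string, int>(500_000);
oversized["alpha"] = 1;
oversized["beta"] = 2;
oversized["gamma"] = 3;

// The copy is sized from the number of entries, not from the source's capacity.
// Passing the comparer keeps key equality identical to the original.
var compact = new Dictionary<string, int>(oversized, oversized.Comparer);

// Drop the last reference to the big dictionary so the GC can reclaim it.
oversized = null;

Console.WriteLine(compact.Count); // 3
```

Until the old reference is dropped, both dictionaries are alive at the same time, so peak memory use is briefly higher than before the copy.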

Or, implement your own dictionary with the existing one as a blue-print. This is less work than you might think at first.

Be sure to benchmark both approaches.

You might consider putting your data in a list first. Then you know the list's size, and can create a dictionary with that capacity (now exactly right for the data you want) and populate it.

Allowing the list to dynamically resize (as you add the elements) should be cheaper than allowing a dictionary to resize. (But, as others have noted, test the performance yourself!) Resizing a dictionary means every entry has to be copied and placed into a bucket in the new data structure. As far as I can tell, the .NET implementation stores each entry's hash code, so GetHashCode is normally not called again, but the re-bucketing work remains. Resizing a list just means copying the references, so it should be cheaper.
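A rough sketch of the list-first idea. ReadRecords is a hypothetical stand-in for whatever actually produces the data, and the record count is arbitrary; the point is only that the dictionary is created once, with the exact capacity taken from the list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data source; the real one would read a file, a database, etc.
static IEnumerable<KeyValuePair<int, string>> ReadRecords()
{
    for (int i = 0; i < 10_000; i++)
    {
        yield return new KeyValuePair<int, string>(i, "record " + i);
    }
}

// Step 1: buffer everything in a list, which grows by plain array copies.
var buffer = new List<KeyValuePair<int, string>>();
foreach (var record in ReadRecords())
{
    buffer.Add(record);
}

// Step 2: the final size is now known, so allocate the dictionary once.
var index = new Dictionary<int, string>(buffer.Count);
foreach (var pair in buffer)
{
    index.Add(pair.Key, pair.Value);
}

Console.WriteLine(index.Count); // 10000
```

The trade-off is that the data is held twice until the list goes out of scope, so this suits cases where that temporary memory use is acceptable.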

Licensed under: CC-BY-SA with attribution