Is there a better way to use C# dictionaries than TryGetValue?

https://softwareengineering.stackexchange.com/questions/396567

01-03-2021
|

Pergunta

I find myself often looking up questions online, and many solutions include dictionaries. However, whenever I try to implement them, I get this horrible reek in my code. For example every time I want to use a value:

int x;
if (dict.TryGetValue("key", out x)) {
    DoSomethingWith(x);
}

That's 4 lines of code to essentially do the following: DoSomethingWith(dict["key"])

I've heard that using the out keyword is an anti pattern because it makes functions mutate their parameters.

Also, I find myself often needing a "reversed" dictionary, where I flip the keys and values.

Similarly, I often would like to iterate through the items in a dictionary and find myself converting keys or values to lists etc to do this better.

I feel like there's almost always a better, more elegant way of using dictionaries, But I'm at a loss.

Solução

Dictionaries (C# or otherwise) are simply a container where you look up a value based on a key. In many languages it's more correctly identified as a Map with the most common implementation being a HashMap.

The problem to consider is what happens when a key does not exist. Some languages behave by returning null or nil or some other equivalent value. Silently defaulting to a value instead of informing you that a value does not exist.

For better or for worse, the C# library designers came up with an idiom to deal with the behavior. They reasoned that the default behavior for looking up a value that does not exist is to throw an exception. If you want to avoid exceptions, then you can use the Try variant. It's the same approach they use for parsing strings into integers or date/time objects. Essentially, the impact is like this:

T count = int.Parse("12T45"); // throws exception

if (int.TryParse("12T45", out count))
{
    // Does not throw exception
}

And that carried forward to the dictionary, whose indexer delegates to Get(index):

var myvalue = dict["12345"]; // throws exception
myvalue = dict.Get("12345"); // throws exception

if (dict.TryGet("12345", out myvalue))
{
    // Does not throw exception
}

This is simply the way the language is designed.

Should out variables be discouraged?

C# isn't the first language to have them, and they have their purpose in specific situations. If you are trying to build a highly concurrent system, then you cannot use out variables at the concurrency boundaries.

In many ways, if there is an idiom that is espoused by the language and core library providers, I try to adopt those idioms in my APIs. That makes the API feel more consistent and at home in that language. So a method written in Ruby isn't going to look like a method written in C#, C, or Python. They each have a preferred way of building code, and working with that helps the users of your API learn it more quickly.

Are Maps in General an Anti-pattern?

They have their purpose, but many times they may be the wrong solution for the purpose you have. Particularly if you have a bi-directional mapping you need. There are many containers and ways of organizing data. There are many approaches that you can use, and sometimes you need to think a bit before you pick that container.

If you have a very short list of bi-directional mapping values, then you might only need a list of tuples. Or a list of structs, where you can just as easily find the first match on either side of the mapping.

Think of the problem domain, and pick the most appropriate tool for the job. If there isn't one, then create it.

Outras dicas

Some good answers here on the general principles of hashtables/dictionaries. But I thought I'd touch on your code example,

int x;
if (dict.TryGetValue("key", out x)) 
{
    DoSomethingWith(x);
}

As of C# 7 (which I think is around two years old), that can be simplified to:

if (dict.TryGetValue("key", out var x))
{
    DoSomethingWith(x);
}

And of course it could be reduced to one line:

if (dict.TryGetValue("key", out var x)) DoSomethingWith(x);

If you have a default value for when the key doesn't exist, it can become:

DoSomethingWith(dict.TryGetValue("key", out var x) ? x : defaultValue);

So you can achieve compact forms by using reasonably recent language additions.

This is neither a code smell nor an anti-pattern, as using TryGet-style functions with an out parameter is idiomatic C#. However, there are 3 options provided in C# to work with a Dictionary, so should you be sure you are using the correct one for your situation. I think I know where the rumor of there being a problem using an out parameter comes from, so I'll handle that at the end.

What features to use when working with a C# Dictionary:

If you are sure the key will be in the Dictionary, use the Item[TKey] property
If the key should generally be in the Dictionary, but it is bad/rare/problematic that it is not there, you should use Try...Catch so that an error will be raised and then you can try to handle the error gracefully
If you are not sure the key will be in the Dictionary, use TryGet with the out parameter

To justify this, one need only refer to the documentation for Dictionary TryGetValue, under "Remarks":

This method combines the functionality of the ContainsKey method and the Item[TKey] property.

...

Use the TryGetValue method if your code frequently attempts to access keys that are not in the dictionary. Using this method is more efficient than catching the KeyNotFoundException thrown by the Item[TKey] property.

This method approaches an O(1) operation.

The whole reason TryGetValue exists is to act as a more convenient way to use ContainsKey and Item[TKey], while avoiding having to search the dictionary twice - so pretending it doesn't exist and doing the two things it does manually is a rather awkward choice.

In practice, I rarely have ever used a raw Dictionary, due to this simple maxim: choose the most generic class/container that gives you the functionality you need. Dictionary was not designed to look up by value rather than by key (for example), so if that is something you want it may make more sense to use an alternate structure. I think I may have used Dictionary one time in the last year-long development project I did, simply because it was rarely the right tool for the job I was trying to do. Dictionary is certainly not the Swiss Army knife of the C# toolbox.

What's wrong with out parameters?

CA1021: Avoid out parameters

Although return values are commonplace and heavily used, the correct application of out and ref parameters requires intermediate design and coding skills. Library architects who design for a general audience should not expect users to master working with out or ref parameters.

I'm guessing that's where you heard that out parameters were something like an anti-pattern. As with all rules, you should read closer to understand the 'why', and in this case there is even an explicit mention of how the Try pattern does not violate the rule:

Methods that implement the Try pattern, such as System.Int32.TryParse, do not raise this violation.

There are at least two methods missing from C# dictionaries that in my opinion clean up code considerably in a lot of situations in other languages. The first is returning an Option, which lets you write code like the following in Scala:

dict.get("key").map(doSomethingWith)

The second is returning a user-specified default value if the key isn't found:

doSomethingWith(dict.getOrElse("key", "key not found"))

There is something to be said for using the idioms a language provides when appropriate, like the Try pattern, but that doesn't mean you have to only use what the language provides. We're programmers. It's okay to create new abstractions to make our specific situation easier to understand, especially if it eliminates a lot of repetition. If you frequently need something, like reverse lookups or iterating through values, make it happen. Create the interface you wish you had.

That's 4 lines of code to essentially do the following: DoSomethingWith(dict["key"])

I agree that this is inelegant. A mechanism that I like to use in this case, where the value is a struct type, is:

public static V? TryGetValue<K, V>(
      this Dictionary<K, V> dict, K key) where V : struct => 
  dict.TryGetValue(key, out V v) ? new V?(v) : new V?();

Now we have a new version of TryGetValue that returns int?. We can then do a similar trick for extending T?:

public static void DoIt<T>(
      this T? item, Action<T> action) where T : struct
{
  if (item != null) action(item.GetValueOrDefault());
}

And now put it together:

dict.TryGetValue("key").DoIt(DoSomethingWith);

and we're down to a single, clear statement.

I've heard that using the out keyword is an anti pattern because it makes functions mutate their parameters.

I would be slightly less strongly worded, and say that avoiding mutations when possible is a good idea.

I find myself often needing a "reversed" dictionary, where I flip the keys and values.

Then implement or obtain a bidirectional dictionary. They are straightforward to write, or there are plenty of implementations available on the internet. There are a bunch of implementations here, for example:

https://stackoverflow.com/questions/268321/bidirectional-1-to-1-dictionary-in-c-sharp

Similarly, I often would like to iterate through the items in a dictionary and find myself converting keys or values to lists etc to do this better.

Sure, we all do.

I feel like there's almost always a better, more elegant way of using dictionaries, But I'm at a loss.

Ask yourself "suppose I had a class other than Dictionary that implemented the exact operation that I wish to do; what would that class look like?" Then, once you've answered that question, implement that class. You're a computer programmer. Solve your problems by writing computer programs!

The TryGetValue() construct is only necessary if you don't know whether "key" is present as a key within the dictionary or not, otherwise DoSomethingWith(dict["key"]) is perfectly valid.

A "less dirty" approach might be to use ContainsKey() as a check instead.

Other answers contain great points, so I won't restate them here, but instead I'll focus on this part, which seems to be largely ignored so far:

Similarly, I often would like to iterate through the items in a dictionary and find myself converting keys or values to lists etc to do this better.

Actually, it's pretty easy to iterate over a dictionary, since it does implement IEnumerable:

var dict = new Dictionary<int, string>();

foreach ( var item in dict ) {
    Console.WriteLine("{0} => {1}", item.Key, item.Value);
}

If you prefer Linq, that too works out just fine:

dict.Select(x=>x.Value).WhateverElseYouWant();

All in all, I don't find Dictionaries to be an antipattern - they're just a specific tool with specific uses.

Also, for even more specifics, check out SortedDictionary (uses R-B trees for a more predictable performance) and SortedList (which is also a confusingly named dictionary that sacrifices insertion speed for lookup speed, but shines if you're staring out with a fixed, pre-sorted set). I've had a case where replacing a Dictionary with a SortedDictionary resulted in an order of magnitude faster execution (but it could happen the other way round too).

If you feel using a dictionary is awkward it may not be the right choice for your problem. Dictionaries are great but like one commenter noticed, often they are used as a shortcut for something that should have been a class. Or the dictionary itself may be right as a core storage method but there should be a wrapper class around it to provide the desired service methods.

A lot has been said about Dict[key] versus TryGet. What I use a lot is iterating over the dictionary using KeyValuePair. Apparently this is a less commonly known construct.

The main boon of a dictionary is that it is really fast compared to other collections as the number of items grows. If you have to check whether a key exists a lot you may want to ask yourself if your use of a dictionary is appropriate. As a client you should typically know what you put in there and thus what would be safe to query.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a softwareengineering.stackexchange