Is it common (or encouraged) practice to overload a function to accept IEnumerable<T>, ICollection<T>, IList<T>, etc.?

https://stackoverflow.com/questions/1760399

21-09-2019

Question

EDIT:

From the answers given, it's been made rather clear to me how the design I'm asking about below should actually be implemented. With those suggestions in mind (and in response to a comment politely pointing out that my example code does not even compile), I've edited the following code to reflect what the general consensus seems to be. The question that remains may no longer make sense in light of the code, but I'm leaving it as it is for posterity.

Suppose I have three overloads of a function, one taking IEnumerable<T>, one taking ICollection<T>, and one taking IList<T>, something like the following:

public static T GetMiddle<T>(IEnumerable<T> values) {
    IList<T> list = values as IList<T>;
    if (list != null) return GetMiddle(list);

    int count = GetCount<T>(values);

    T middle = default(T);
    int index = 0;

    foreach (T value in values) {
        if (index++ >= count / 2) {
            middle = value;
            break;
        }
    }

    return middle;
}

private static T GetMiddle<T>(IList<T> values) {
    int middleIndex = values.Count / 2;
    return values[middleIndex];
}

private static int GetCount<T>(IEnumerable<T> values) {
    // if values is actually an ICollection<T> (e.g., List<T>),
    // we can get the count quite cheaply
    ICollection<T> genericCollection = values as ICollection<T>;
    if (genericCollection != null) return genericCollection.Count;

    // same for ICollection (e.g., Queue<T>, Stack<T>)
    ICollection collection = values as ICollection;
    if (collection != null) return collection.Count;

    // otherwise, we've got to count values ourselves
    int count = 0;
    foreach (T value in values) count++;

    return count;
}

The idea here is that, if I've got an IList<T>, that makes my job easiest; on the other hand, I can still do the job with an ICollection<T> or even an IEnumerable<T>; the implementation for those interfaces just isn't as efficient.

I wasn't sure if this would even work (if the runtime would be able to choose an overload based on the parameter passed), but I've tested it and it seems to.

My question is: is there a problem with this approach that I haven't thought of? Alternately, is this in fact a good approach, but there's a better way of accomplishing it (maybe by attempting to cast the values argument up to an IList<T> first and running the more efficient overload if the cast works)? I'm just interested to know others' thoughts.

Solution

If you look at how the LINQ extension methods are implemented (using Reflector, for example), you can see that a few extension methods on IEnumerable<T>, such as Count(), attempt to cast the sequence to an ICollection<T> or an IList<T> to optimize the operation (for example, using the ICollection<T>.Count property instead of iterating through an IEnumerable<T> and counting the elements). So your best bet is most likely to accept an IEnumerable<T> and then apply this kind of optimization if ICollection<T> or IList<T> is available.
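
As a rough way to observe this behaviour without a decompiler, the sketch below wraps a list in a collection that records whether it was ever enumerated. TrackingCollection<T> and Demo are hypothetical names made up for this illustration; only Enumerable.Count() is the real LINQ method.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical read-only wrapper that records whether anyone enumerated it.
sealed class TrackingCollection<T> : ICollection<T>
{
    private readonly List<T> _items;
    public bool WasEnumerated { get; private set; }

    public TrackingCollection(IEnumerable<T> items) { _items = new List<T>(items); }

    public int Count { get { return _items.Count; } }
    public bool IsReadOnly { get { return true; } }
    public void Add(T item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Remove(T item) { throw new NotSupportedException(); }
    public bool Contains(T item) { return _items.Contains(item); }
    public void CopyTo(T[] array, int arrayIndex) { _items.CopyTo(array, arrayIndex); }

    public IEnumerator<T> GetEnumerator()
    {
        WasEnumerated = true;               // set as soon as an enumerator is requested
        return _items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

static class Demo
{
    static void Main()
    {
        var tracked = new TrackingCollection<int>(new[] { 1, 2, 3 });
        IEnumerable<int> seq = tracked;             // static type hides ICollection<T>

        Console.WriteLine(seq.Count());             // 3, read from ICollection<T>.Count
        Console.WriteLine(tracked.WasEnumerated);   // False: Count() never enumerated
    }
}
```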

OTHER TIPS

I think a single version accepting IEnumerable<T> would be the way to go, checking inside the method whether the argument is one of the more derived collection types. With three versions as you propose, you lose the efficiency benefit if someone passes you a (runtime) IList<T> that the compiler statically considers an IEnumerable<T>:

IList<string> stringList = new List<string> { "A", "B", "C" };
IEnumerable<string> seq = stringList;
Extensions.GetMiddle(stringList); // calls IList version
Extensions.GetMiddle(seq);        // calls IEnumerable version
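
A self-contained version of the same point, as a minimal sketch: the overload is picked from the static type of the argument at compile time, not from the runtime type of the object. Describe is a hypothetical stand-in for GetMiddle that simply reports which overload ran.

```csharp
using System;
using System.Collections.Generic;

static class Extensions
{
    // Hypothetical stand-ins for the two public overloads.
    public static string Describe<T>(IEnumerable<T> values) { return "IEnumerable<T> overload"; }
    public static string Describe<T>(IList<T> values) { return "IList<T> overload"; }
}

static class Program
{
    static void Main()
    {
        IList<string> stringList = new List<string> { "A", "B", "C" };
        IEnumerable<string> seq = stringList;   // same object, weaker static type

        Console.WriteLine(Extensions.Describe(stringList)); // IList<T> overload
        Console.WriteLine(Extensions.Describe(seq));        // IEnumerable<T> overload
    }
}
```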

I'd say it's uncommon, and potentially confusing, so would be unlikely to be a good choice for a public API.

You could accept an IEnumerable<T> parameter, and internally check if it is in fact an ICollection<T> or IList<T>, and optimize accordingly.

This might be analogous to some of the optimizations in some of the IEnumerable<T> extension methods in the .NET 3.5 Framework.

I am really indifferent. If I saw it your way I would not think anything of it. But Joe's idea has merit. It might look like the following.

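public static T GetMiddle<T>(IEnumerable<T> values)
{
  if (values is IList<T>) return GetMiddle((IList<T>)values);
  if (values is ICollection<T>) return GetMiddle((ICollection<T>)values);

  // Default implementation (one possibility): count by enumerating once,
  // then enumerate again up to the middle element.
  int count = 0;
  foreach (T value in values) count++;
  return ElementAtIndex(values, count / 2);
}

private static T GetMiddle<T>(ICollection<T> values)
{
  // Count is cheap here, but there is no indexer, so we still enumerate.
  return ElementAtIndex(values, values.Count / 2);
}

private static T GetMiddle<T>(IList<T> values)
{
  // Cheapest case: direct index access.
  return values[values.Count / 2];
}

// Hypothetical helper, not part of the original answer; default(T) if too short.
private static T ElementAtIndex<T>(IEnumerable<T> values, int index)
{
  foreach (T value in values)
    if (index-- == 0) return value;
  return default(T);
}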

While it is legal to overload a method to accept either a base type or a derived type, with all other parameters being otherwise identical, it is only advantageous to do so if the compiler will often be able to identify the latter form as being a better match. Because it would be very common for objects which implement ICollection<T> to be passed around by code which only needs an IEnumerable<T>, it would be very common for implementations of ICollection<T> to be passed into the IEnumerable<T> overload. Consequently, the IEnumerable<T> overload should probably check whether a passed-in object implements ICollection<T> and handle it specially if so.

If the most natural way of implementing the logic for an ICollection<T> would be to write a special method for it, there would be nothing particularly wrong with having a public overload which accepts an ICollection<T>, and having the IEnumerable<T> overload call the ICollection<T> one if given an object that implements ICollection<T>. Having such an overload be public wouldn't add much value, but it likely wouldn't hurt anything either. On the other hand, in situations where an object implements both IEnumerable<T> and ICollection, but not ICollection<T> (for example, a List<Cat> implements IEnumerable<Animal> and ICollection, but not ICollection<Animal>), one might want to use both interfaces, but that could not be done without either typecasting in the method that uses them, or passing the method which uses them both an ICollection reference and an IEnumerable<T> reference. The latter would be very ugly in a public method, and the former approach would lose the benefits of overloading.
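
The List<Cat> case can be checked directly. The following is a small sketch that assumes C# 4 or later (where IEnumerable<T> is covariant); Animal, Cat and Demo are hypothetical types defined only for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class Animal { }
class Cat : Animal { }

static class Demo
{
    static void Main()
    {
        // Allowed because IEnumerable<out T> is covariant.
        IEnumerable<Animal> animals = new List<Cat> { new Cat(), new Cat() };

        // ICollection<T> is not covariant, so this check fails...
        Console.WriteLine(animals is ICollection<Animal>);   // False

        // ...but the non-generic ICollection is still implemented,
        // so a cheap Count remains reachable through a cast.
        Console.WriteLine(animals is ICollection);           // True
        Console.WriteLine(((ICollection)animals).Count);     // 2
    }
}
```

This is why the GetCount helper in the question tests for the non-generic ICollection as well as ICollection<T>.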

Usually when designing interfaces you want to accept a 'lowest common denominator' type for the arguments. For return types it is a matter of some debate. I generally think creating the above overloads is overkill. Its biggest problem is the introduction of unneeded code paths that now must be tested. Better to have one method that performs the operation one way and works 100% of the time. With the overloads given above you might have an inconsistency in behavior and not even realize it, or worse yet you may accidentally introduce a change in one copy and not in the others.
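
To make the risk concrete, here is a condensed sketch of the question's GetMiddle with its two code paths inlined (the counting helper is simplified to a plain loop). Lazy and Demo are hypothetical names used only to drive the example. The two paths agree on non-empty input but behave differently on an empty sequence, which is the sort of divergence that is easy to miss.

```csharp
using System;
using System.Collections.Generic;

static class Extensions
{
    public static T GetMiddle<T>(IEnumerable<T> values)
    {
        IList<T> list = values as IList<T>;
        if (list != null) return list[list.Count / 2];   // indexer path

        int count = 0;
        foreach (T value in values) count++;

        T middle = default(T);
        int index = 0;
        foreach (T value in values)
        {
            if (index++ >= count / 2) { middle = value; break; }
        }
        return middle;                                    // enumeration path
    }
}

static class Demo
{
    // Iterator method: the result implements IEnumerable<int> but not IList<int>.
    public static IEnumerable<int> Lazy(params int[] items)
    {
        foreach (int item in items) yield return item;
    }

    static void Main()
    {
        // Non-empty input: both paths agree.
        Console.WriteLine(Extensions.GetMiddle(new List<int> { 1, 2, 3 })); // 2
        Console.WriteLine(Extensions.GetMiddle(Lazy(1, 2, 3)));             // 2

        // Empty input: the enumeration path returns default(int)...
        Console.WriteLine(Extensions.GetMiddle(Lazy()));                    // 0

        // ...while the indexer path throws.
        try { Extensions.GetMiddle(new List<int>()); }
        catch (ArgumentOutOfRangeException) { Console.WriteLine("threw"); }
    }
}
```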

If you can do it with IEnumerable<T>, use that; if not, use the least specific interface that gives you what you need.

No. It's certainly uncommon.

Anyway, since IList<T> inherits from ICollection<T> and IEnumerable<T>, and ICollection<T> inherits from IEnumerable<T>, your only concern would be performance with types that implement only IEnumerable<T>.

I just see no reason to overload the function in that way, providing different signatures that achieve exactly the same result and accept the same objects (anything you can pass to the IList<T> overload can also be passed to the other two); that would just cause confusion.

You overload a function to provide a way to pass a type of parameter that you could not pass if the function did not have that signature.

Don't optimize unless it's really necessary. If you want to optimize, do it under the covers. You wouldn't expect someone using your class to be aware of that "optimization" in order to decide which method signature to use, right?

Licensed under: CC-BY-SA with attribution
