What does the author mean by casting the interface reference to any implementation?

https://softwareengineering.stackexchange.com/questions/357163

19-01-2021
|

Question

I am currently in the process of trying to master C#, so I am reading Adaptive Code via C# by Gary McLean Hall.

He writes about patterns and anti-patterns. In the implementations versus interfaces part he writes the following:

Developers who are new to the concept of programming to interfaces often have difficulty letting go of what is behind the interface.

At compile time, any client of an interface should have no idea which implementation of the interface it is using. Such knowledge can lead to incorrect assumptions that couple the client to a specific implementation of the interface.

Imagine the common example in which a class needs to save a record in persistent storage. To do so, it rightly delegates to an interface, which hides the details of the persistent storage mechanism used. However, it would not be right to make any assumptions about which implementation of the interface is being used at run time. For example, casting the interface reference to any implementation is always a bad idea.

It might be the language barrier, or my lack of experience, but I don't quite understand what that means. Here is what I understand:

I have a free time fun project to practice C#. There I have a class:

public class SomeClass...

This class is used in a lot of places. While learning C#, I read that it is better to abstract with an interface, so I made the following

public interface ISomeClass <- Here I made a "contract" of all the public methods and properties SomeClass needs to have.

public class SomeClass : ISomeClass <- Same as before. All implementation here.

So I went into all some class references and replaced them with ISomeClass.

Except in the construction, where I wrote:

ISomeClass myClass = new SomeClass();

Am I understanding correctly that this is wrong? If yes, why so, and what should I do instead?

Solution

Abstracting your class into an interface is something you should consider if and only if you intend on writing other implementations of said interface or the strong possibility of doing so in the future exists.

So perhaps SomeClass and ISomeClass is a bad example, because it would be like having a OracleObjectSerializer class and a IOracleObjectSerializer interface.

A more accurate example would be something like OracleObjectSerializer and a IObjectSerializer. The only place in your program where you care what implementation to use is when the instance is created. Sometimes this is further decoupled by using a factory pattern.

Everywhere else in your program should use IObjectSerializer not caring how it works. Lets suppose for a second now that you also have a SQLServerObjectSerializer implementation in addition to OracleObjectSerializer. Now suppose you need to set some special property to set and that method is only present in OracleObjectSerializer and not SQLServerObjectSerializer.

There are two ways to go about it: the incorrect way and the Liskov substitution principle approach.

The incorrect way

The incorrect way, and the very instance referred to in your book, would be to take an instance of IObjectSerializer and cast it to OracleObjectSerializer and then call the method setProperty available only on OracleObjectSerializer. This is bad because even though you may know an instance to be an OracleObjectSerializer, you're introducing yet another point in your program where you care to know what implementation it is. When that implementation changes, and presumably it will sooner or later if you have multiple implementations, best case scenario, you will need to find all these places and make the correct adjustments. Worst case scenario, you cast a IObjectSerializer instance to a OracleObjectSerializer and you receive a runtime failure in production.

Liskov Substitution Principle approach

Liskov said that you should never need methods like setProperty in the implementation class as in the case of my OracleObjectSerializer if done properly. If you abstract a class OracleObjectSerializer to IObjectSerializer, you should encompass all the methods necessary to use that class, and if you can't, then something is wrong with your abstraction (trying to make a Dog class work as a IPerson implementation for instance).

The correct approach would be to provide a setProperty method to IObjectSerializer. Similar methods in SQLServerObjectSerializer would ideally work through this setProperty method. Better still, you standardize property names through an Enum where each implementation translates that enum to the equivalent for its own database terminology.

Put simply, using an ISomeClass is only half of it. You should never need to cast it outside the method that is responsible for its creation. To do so is almost certainly a serious design mistake.

OTHER TIPS

The accepted answer is correct and very useful, but I would like to briefly address specifically the line of code you asked about:

ISomeClass myClass = new SomeClass();

Broadly speaking, this isn't terrible. The thing that should be avoided whenever possible would be doing this:

void someMethod(ISomeClass interface){
    SomeClass cast = (SomeClass)interface;
}

Where your code is provided an interface externally, but internally casts it to a specific implementation, "Because I know it will only be that implementation". Even if that did end up being true, by using an interface and casting it to an implementation you are voluntarily giving up real type safety just so you can pretend to use abstraction. If somebody else were to work on the code later and see a method that accepts an interface parameter, then they're going to assume that any implementation of that interface is a valid option to pass in. It could even be yourself a bit down the line after you've forgotten that a specific method lies about what parameters it needs. If you ever feel the need to cast from an interface to a specific implementation, then either the interface, the implementation, or the code which references them is incorrectly designed and should change. For example, if the method only works when the argument passed in is a specific class, then the parameter should only accept that class.

Now, looking back to your constructor call

ISomeClass myClass = new SomeClass();

the problems from casting don't really apply. None of this appears to be exposed externally, so there isn't any particular risk associated with it. Essentially, this line of code itself is an implementation detail that interfaces are designed to abstract to begin with, so an external observer would see it working the same way regardless of what they do. However, this also doesn't GAIN anything from the existence of an interface. Your myClass has the type ISomeClass, but it doesn't have any reason since it's always assigned a specific implementation, SomeClass. There are some minor potential advantages, such as being able to swap the implementation in code by changing just the constructor call, or reassigning that variable later to a different implementation, but unless there's somewhere else that requires the variable be typed to the interface rather than the implementation this pattern makes your code look like interfaces were used only by rote, not out of actual understanding of the benefits of interfaces.

I think it's easier to show your code with a bad example:

public interface ISomeClass
{
    void DoThing();
}

public class SomeClass : ISomeClass
{
    public void DoThing()
    {
       // Mine for BitCoin
    }

}

public class AnotherClass : ISomeClass
{
    public void DoThing()
    {
        // Mine for oil
    }
    public Decimal Depth;
 }

 void main()
 {
     ISomeClass task = new SomeClass();

     task.DoThing(); //  This is good

     Console.WriteLine("Depth = {0}", ((AnotherClass)task).Depth); <-- The task object will not have this field
 }

The problem is that when you initially write the code, there is probably only one implementation of that interface, so that casting would still work, it's just that in the future, you might implement another class and then (as my example shows), you try and access data that doesn't exist in the object you are using.

Just for clarity, let's define casting.

Casting is forcibly converting something from one type to another. A common example is casting a floating point number to an integer type. A specific conversion may be specified when casting but the default is to simply reinterpret the bits.

Here's an example of casting from this Microsoft docs page.

// Create a new derived type.  
Giraffe g = new Giraffe();  

// Implicit conversion to base type is safe.  
Animal a = g;  

// Explicit conversion is required to cast back  
// to derived type. Note: This will compile but will  
// throw an exception at run time if the right-side  
// object is not in fact a Giraffe.  
Giraffe g2 = (Giraffe) a;

You could do the same thing and cast something that implements an interface to a particular implementation of that interface, but you should not because that will result in an error or unexpected behaviour if a different implementation than you expect is used.

My 5 cents:

All those examples are Ok, but they are not real world examples and did not shown the real world intentions.

I do not know C# so I will give abstract example (mix between Java and C++). Hope that's OK.

Suppose you have interface iList:

interface iList<Key,Value>{
   bool add(Key k, Value v);
   bool remove(Element e);
   Value get(Key k);
}

Now suppose there are lots of implementations:

DynamicArrayList - uses flat array, fast to insert and remove at the end.
LinkedList - uses double linked list, fast to insert in front and the end.
AVLTreeList - uses AVL Tree, fast to do everything, but uses lots of memory
SkipList - uses SkipList, fast to do everything, slower than AVL Tree, but uses less memory.
HashList - uses HashTable

One can think of lots of different implementations.

Now suppose we have following code:

uint begin_size = 1000;
iList list = new DynamicArrayList(begin_size);

It clearly show our intention that we want to use iList. Sure we no longer be able to perform DynamicArrayList specific operations, but we need a iList.

Consider following code:

iList list = factory.getList();

Now we do not even know what the implementation is. This last example is often used in image processing when you load some file from disk and you do not need its file type (gif, jpeg, png, bmp...), but all you want is to do some image manipulation (flip, scale, save as png at the end).

You have an interface ISomeClass, and an object myObject of which you know nothing from your code except that it is declared to implement ISomeClass.

You have a class SomeClass, which you know implements the interface ISomeClass. You know that because it's declared to implement ISomeClass, or you implemented it yourself to implement ISomeClass.

What's wrong with casting myClass to SomeClass? Two things are wrong. One, you don't really know that myClass is something that can be converted to SomeClass (an instance of SomeClass or a subclass of SomeClass), so the cast could go wrong. Two, you shouldn't have to do this. You should be working with myClass declared as an iSomeClass, and use ISomeClass methods.

The point where you get a SomeClass object is when an interface method is called. At some point you call myClass.myMethod(), which is declared in the interface, but has an implementation in SomeClass, and of course possibly in many other classes implementing ISomeClass. If a call ends up in your SomeClass.myMethod code, then you know that self is an instance of SomeClass, and at that point it is absolutely fine and actually correct to use it as a SomeClass object. Of course if it is actually an instance of OtherClass and not SomeClass, then you won't arrive at the SomeClass code.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange