Behaviour that depends on two sides

https://softwareengineering.stackexchange.com/questions/410672

11-03-2021

Question

I would like to have the following interface:

class Resource {
public:
    // Pure virtual: every concrete resource supplies its own copy logic.
    virtual void copyInto(Resource* src) = 0;
};

But in order to implement this, the implementation would need to know (or make assumptions about) the implementation that it is copying from. If I instead had a copyFrom method, the problem would just be reversed: the source would need to know about the target it is copying to.

I thought about two possible solutions: The simple one would be to have a staging-Resource with a defined form, into which a source-Resource copies, and from which the destination-Resource copies. This would create overhead, and make all Resources depend on the implementation of this staging-Resource.

The other one would be to define a CopyOperation class that takes information about the source from one side, and information about the destination from the other. It then resolves the copy based on that information.

Are there any go-to solutions/patterns for this problem (which doesn't seem too special)? If so, advice/resources and/or considerations with respect to the ideas I mentioned would be highly appreciated!

Solution

At first glance, it looks like you need double dispatch. In OO, one way to do double dispatch is the Visitor Pattern, but it might be overkill, and there's a tradeoff involved, so don't jump to that idea immediately. But that tradeoff is what I want to point out, as understanding it can help you decide what to do.

In the Visitor Pattern, you have a type hierarchy that represents different kinds of data structures ("Elements" - corresponding to the different kinds of resources in your code), and a hierarchy that represents operations on those elements ("Visitors" - operations represented by objects). The idea behind the name is that the operations "visit" some other object, but that's a bit confusing, IMO. The pattern's "Accept" method is better understood as "Do", as in element.Do(operation), where operation is a concrete instance of a visitor.

The pattern perhaps looks a bit intimidating, but it's not hard to understand how it works: an Element polymorphically accepts a visitor instance (element.Do(operation)), and immediately does operation.Visit(this), which causes the right overload of Visit to be called - and then in that overload, both concrete types are known.
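To make the mechanism concrete, here is a minimal C++ sketch. The names are assumptions made for illustration only: Buffer and Image stand in for your resource kinds, and ResourceVisitor, DescribeVisitor and describe() are made up. The operation only reports which concrete type it received.

```cpp
#include <string>

// Hypothetical resource kinds, used only for illustration.
class Buffer;
class Image;

// The "operation" side: one overload per concrete element type.
class ResourceVisitor {
public:
    virtual ~ResourceVisitor() = default;
    virtual void visit(Buffer& buffer) = 0;
    virtual void visit(Image& image) = 0;
};

// The "element" side: accept() is the first (virtual) dispatch.
class Resource {
public:
    virtual ~Resource() = default;
    virtual void accept(ResourceVisitor& visitor) = 0;
};

class Buffer : public Resource {
public:
    // *this has static type Buffer here, so visit(Buffer&) is selected.
    void accept(ResourceVisitor& visitor) override { visitor.visit(*this); }
};

class Image : public Resource {
public:
    // *this has static type Image here, so visit(Image&) is selected.
    void accept(ResourceVisitor& visitor) override { visitor.visit(*this); }
};

// A concrete operation: records which concrete type it was handed.
class DescribeVisitor : public ResourceVisitor {
public:
    std::string result;
    void visit(Buffer&) override { result = "buffer"; }
    void visit(Image&) override { result = "image"; }
};

// The caller only ever sees the Resource base class.
std::string describe(Resource& resource) {
    DescribeVisitor visitor;
    resource.accept(visitor);
    return visitor.result;
}
```

Calling describe() on a Buffer through a Resource& ends up in visit(Buffer&), even though the call site never names the concrete type. Adding a third resource kind would mean adding a visit overload to ResourceVisitor and to every existing visitor.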

An important thing to be aware of is that in this setup, you can easily add new operations (you simply derive a new visitor), but it's hard to add new kinds of elements (because you have to change the Visitor interface, and add another overload to make it work). Even if you aren't going to use the pattern, it's good to be aware of this tradeoff as it can come up in other circumstances (it's an example of the so-called expression problem). This property is not a drawback in and of itself, just another tool in your design toolbox that you may or may not choose to use.

In your case, there's an added problem that at the moment there's no clear separation of operations and the different resource types, so it could be hard to sort that out.

Generally speaking, if this constraint is in line with what you're trying to do, it might be simpler to drop the operation hierarchy, and just do a switch/if in functions that work on the elements (basically switching on the type, or some token that indicates the type). That's more or less your second idea:

"The other one would be to define a CopyOperation class that takes information about the source from one side, and information about the destination from the other. It then resolves the copy based on that information."

It has similar constraints, though - it's hard to add new kinds of resources, as you'd have to edit previously existing functions to add another case. Also, it's prone to combinatorial explosion. Yes, you can subclass the type to extend support to new resource types, but then your client code needs to know which subclass to use.
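A rough C++ sketch of that switch-on-a-token approach follows. ResourceKind, the kind field and copyResource() are assumptions made up for the example, and the returned string is only a placeholder for the actual copy routine.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical type token; each resource reports its own kind.
enum class ResourceKind { Buffer, Image };

struct Resource {
    ResourceKind kind;
};

// Resolves the copy from the (source, destination) pair.
// Every new ResourceKind means revisiting this function.
std::string copyResource(const Resource& src, const Resource& dst) {
    switch (src.kind) {
    case ResourceKind::Buffer:
        switch (dst.kind) {
        case ResourceKind::Buffer: return "buffer -> buffer";
        case ResourceKind::Image:  return "buffer -> image";
        }
        break;
    case ResourceKind::Image:
        switch (dst.kind) {
        case ResourceKind::Buffer: return "image -> buffer";
        case ResourceKind::Image:  return "image -> image";
        }
        break;
    }
    // Reached only if a kind is added without extending the switches.
    throw std::invalid_argument("unsupported resource combination");
}
```

With two kinds there are four cases; with N kinds there are N * N, which is the combinatorial explosion mentioned above.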

An alternative is to decouple by introducing an abstraction - not an abstract base class, but an exchange data-format of sorts; that's your first idea, and is not as bad as you think, provided that you can come up with a good exchange format.

"The simple one would be to have a staging-Resource with a defined form, into which a source-Resource copies, and from which the destination-Resource copies."

Organizing around a common exchange format solves the combinatorial explosion problem I mentioned before.

"This would create overhead"

This depends on what your resources actually are; more specifically, it depends on how expensive it is to make a copy, and whether or not you can make shallow copies (e.g., if some of the underlying data is immutable, or append only, or some such scheme that lets you share it). If it's not obvious whether the overhead is acceptable, you can run tests to gather some performance data.

"and make all Resources depend on the implementation of this staging-Resource."

Yes, but if the abstraction is good enough, then all the different Resource types will be decoupled from each other (you can easily add new kinds of resources). This is why it's important to come up with a good abstraction (one that's relatively stable in face of change) - or to refine towards one as you develop.
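As an illustration, here is a small C++ sketch of the exchange-format idea. Everything in it is an assumption made for the example: the format is just a flat byte array called Staging, and exportData(), importData() and copyFrom() are hypothetical names. A real format would likely need to carry more (dimensions, pixel format, and so on).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical exchange format: a flat byte array.
using Staging = std::vector<std::uint8_t>;

class Resource {
public:
    virtual ~Resource() = default;
    virtual Staging exportData() const = 0;
    virtual void importData(const Staging& data) = 0;

    // Written once against the exchange format; no concrete types involved.
    void copyFrom(const Resource& src) { importData(src.exportData()); }
};

class Buffer : public Resource {
public:
    std::vector<std::uint8_t> bytes;
    Staging exportData() const override { return bytes; }
    void importData(const Staging& data) override { bytes = data; }
};

// Stores 32-bit pixels rather than bytes, to show that the internal
// layouts of two resources can differ.
class Image : public Resource {
public:
    std::vector<std::uint32_t> pixels;

    // Serializes each pixel as four bytes, least significant byte first.
    Staging exportData() const override {
        Staging out;
        for (std::uint32_t p : pixels)
            for (int shift = 0; shift < 32; shift += 8)
                out.push_back(static_cast<std::uint8_t>(p >> shift));
        return out;
    }

    // Rebuilds pixels from groups of four bytes; leftover bytes are dropped.
    void importData(const Staging& data) override {
        pixels.clear();
        for (std::size_t i = 0; i + 4 <= data.size(); i += 4) {
            std::uint32_t p = 0;
            for (int b = 0; b < 4; ++b)
                p |= static_cast<std::uint32_t>(data[i + b]) << (8 * b);
            pixels.push_back(p);
        }
    }
};
```

Each resource type only knows how to convert to and from Staging, so a new resource type needs two functions rather than one per existing type. The price is the intermediate copy on every transfer.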

The question is: is the nature of the resources you are trying to model such that there's a good general representation? If there is, great. If there isn't, you may be trying to overgeneralize, and it may be better to do things in a more concrete way.

The kind of interface represented below (taken from Flater's answer) could prove to be perfectly fine in that case; it pushes the responsibility of knowing the concrete type to calling code, which might very well have that knowledge. It's also better to do this if having an intermediate representation is too expensive.

class Buffer;
class Image;

class Resource {
    public:
      virtual void copyInto(Buffer* src) = 0;
      virtual void copyInto(Image* src) = 0;
};

This still constrains the Resource class to a limited set of concrete resource types (hard to add support for new kinds of resources). You can approach this in two ways: (1) treat the set of supported resource types as the set of supported/standard exchange formats, or (2) pull these out into free functions that take two parameters - not everything has to be defined in a class. With (2), you can then just add another free function when you want. The combinatorial explosion problem is back, though.
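Option (2) could look roughly like this in C++. Buffer and Image are hypothetical stand-ins with no common base class, copyResource is a made-up name, and the returned strings are placeholders for the actual copy logic.

```cpp
#include <string>

// Hypothetical concrete types with no common base class.
struct Buffer {};
struct Image {};

// One free function per (source, destination) pair. Supporting a new pair
// means adding an overload; existing functions are left untouched.
std::string copyResource(const Buffer&, Buffer&) { return "buffer -> buffer"; }
std::string copyResource(const Buffer&, Image&)  { return "buffer -> image"; }
std::string copyResource(const Image&,  Buffer&) { return "image -> buffer"; }
std::string copyResource(const Image&,  Image&)  { return "image -> image"; }
```

Note that the overload is picked at compile time from the static types of the arguments, so this only works where the calling code knows the concrete types.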

Another thing to consider is how these resources are actually going to be used, and given their usage, if you really want to treat them all in the same way through an abstraction. E.g., if the calling code is essentially in a position to know the concrete type of the resource every time, you may choose to store different types of resources separately (e.g., in separate arrays), and potentially not have a common abstraction at all.

Other tips

In a statically typed language, this would be handled by the compiler, which picks the overload based on the types in the signatures.

For dynamically typed languages you can use the double dispatch approach. For example, the whole mixed-type arithmetic in Smalltalk is implemented that way.

In your case, a method copyInto in class Foo would call copyFromFoo of the target, which then can do the right thing based on the source class Foo. Of course, you can substitute interfaces for classes here when your language has them.
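A minimal C++ sketch of that copyInto / copyFromFoo arrangement, with Buffer and Image as assumed stand-ins for Foo and the returned strings as placeholders for the actual copy logic:

```cpp
#include <string>

class Buffer;
class Image;

class Resource {
public:
    virtual ~Resource() = default;
    // First dispatch: resolved on the dynamic type of the source.
    virtual std::string copyInto(Resource& target) const = 0;
    // Second dispatch: resolved on the dynamic type of the target.
    virtual std::string copyFromBuffer(const Buffer& src) = 0;
    virtual std::string copyFromImage(const Image& src) = 0;
};

class Buffer : public Resource {
public:
    // The source knows it is a Buffer, so it calls copyFromBuffer.
    std::string copyInto(Resource& target) const override {
        return target.copyFromBuffer(*this);
    }
    std::string copyFromBuffer(const Buffer&) override { return "buffer -> buffer"; }
    std::string copyFromImage(const Image&) override { return "image -> buffer"; }
};

class Image : public Resource {
public:
    // The source knows it is an Image, so it calls copyFromImage.
    std::string copyInto(Resource& target) const override {
        return target.copyFromImage(*this);
    }
    std::string copyFromBuffer(const Buffer&) override { return "buffer -> image"; }
    std::string copyFromImage(const Image&) override { return "image -> image"; }
};
```

Both objects can be held as plain Resource references; after the two virtual calls, the method that finally runs knows the concrete type of both sides.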

Your question is hard to understand. Based on some of your statements and expectations, I suspect that there's a misunderstanding on what exactly an interface does (as opposed to an implementation).

What I think has happened is that you've tried to boil it down to the essentials (and that's a good intention, don't get me wrong), but you've boiled the wrong parts down and have omitted relevant considerations for your interface definition.

"But in order to implement this, the implementation would need to know (or make assumptions about) the implementation that it is copying from."

That goes against the grain of what an interface is. An interface exists specifically so you don't have to know the implementation details and you can instead just focus on the contract itself.

A simple example to showcase the point (I'm using C# here as I'm most familiar with it)

public interface IClock
{
    DateTime GetTime();
}

Any class that implements IClock will be able to tell the time. How they tell the time (e.g. using system time, looking it up online, asking the user to input the time, a mocked clock for testing, ...) is irrelevant to the code that wishes to use an IClock object without knowing the specific implementing class, e.g.:

IClock myClock = GetClock(); //  intentionally obfuscated

Console.WriteLine($"It is currently {myClock.GetTime()}");

If you need to know what specific class is implementing the interface, and you can't work with just the interface itself, then the interface is pointless or at least not useful in its current state.

"Do I understand correctly that in this case you would also recommend against letting such classes derive from one interface when their behaviour/characteristics differ in this way?"

It depends on what you mean by "behavior". If by "behavior" you mean that the contract is the same but the implementation varies, that's exactly why you should be using the interface.

If by "behavior" you mean that there's no reusable contract since each implementation is so vastly different, then it's impossible to even define a usable interface, so you wouldn't be able to use one even if you wanted to.

"Resources may be Buffers or Images, which may reside in host-coherent memory or in local graphics-card memory. Therefore a copy operation will be different in each of these cases."

class Resource {
    public:
      virtual void copyInto(Resource* src) = 0;
};

You've defined an interface called Resource. You then stated that this interface requires a copyInto method, which itself takes a Resource as its parameter.

In other words, your interface declares that any Resource can be copied into any Resource, no exceptions or custom code needed.

That's presumably not what you want. However, you've actually stipulated two things here, and these can be separated. You can have the interface define methods for specific types ("copy into this specific type") instead of a generalized "copy into any resource" method. In effect:

class Buffer;
class Image;

class Resource {
    public:
      virtual void copyInto(Buffer* src) = 0;
      virtual void copyInto(Image* src) = 0;
};

This allows you to write custom method bodies for both of these methods.

You and I may know that it's the Buffer and Image classes that will themselves be implementing the Resource interface, but that's irrelevant for the contract as stipulated. There is no technical requirement for it; you could just as well be writing these methods to use parameters of types that don't implement Resource.

Note that any class that implements Resource will always have to implement both methods, i.e. there can be no Resource which can only be copied to a buffer. If that is something you need, then you need to separate your interface into two interfaces. This follows the Interface Segregation Principle (ISP):

class Buffer;
class Image;

class ICopyToBuffer {
    public:
      virtual void copyInto(Buffer* src) = 0;
};

class ICopyToImage {
    public:
      virtual void copyInto(Image* src) = 0;
};

Your classes can then implement whichever interface they need, and they can pick multiple. Again, using C# syntax:

public class CopiesIntoNothing { ... }

public class CopiesIntoBuffer : ICopyToBuffer { ... }

public class CopiesIntoEverything: ICopyToBuffer, ICopyToImage { ... }

These class names were used to showcase the point. In reality, the class names generally wouldn't reveal which interfaces they implement.

"The other one would be to define a CopyOperation class that takes information about the source from one side, and information about the destination from the other."

The interface example I just gave has defined different methods depending on the target type (copy to buffer, copy to image). When you implement this interface on your source types (i.e. the class you will copy from), you will be required to implement a method body.

This means that you are able to use completely different implementations in each source type (but using the same contract!):

public class Buffer : Resource
{
    public void copyInto(Buffer b)
    {
        // Buffer to buffer logic
    }

    public void copyInto(Image i)
    {
        // Buffer to image logic
    }
}

public class Image : Resource
{
    public void copyInto(Buffer b)
    {
        // Image to buffer logic
    }

    public void copyInto(Image i)
    {
        // Image to image logic
    }
}

As you can see, this neatly divides your implementations into all possible combinations that can be made.

The drawback here is that for every new type you create, you're going to have to create a new interface method, and every implementing class is going to have to implement this new method. This can be dramatically improved by using ISP and separating the interfaces, but if you actually require that every resource must be copyable to every other resource, then the drawback actually isn't a drawback, it's an enforcement of what you want.

There's one fringe case left: suppose there is a new resource whose implementations are the same for all types. Are you now stuck having to copy/paste this implementation for every method? Nope! Well, you're going to have to still implement all the methods, but you are free to work in some reusable logic, e.g.:

public class MagicData : Resource
{
    public void copyInto(Buffer b)
    {
        copyIntoResource(b);
    }

    public void copyInto(Image i)
    {
        copyIntoResource(i);
    }

    private void copyIntoResource(Resource r)
    {
        // MagicData to [ANY RESOURCE] logic      
    }
}

To summarize

Interfaces should be created so that consumers no longer need to know which specific type (that implements the interface) they are working with.
Interfaces inherently allow you to have different implementations on each implementing class (as long as they adhere to the same contract).
If you need to further divide your implementations, have your interface declare multiple methods.
If not every implementing class needs to implement every interface method, then you must separate your interfaces.
If you want every implementing class to always implement every interface method, then keep the interface in one piece.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange