Constructing an object: should I expose or hide parameters passed to the constructor?

https://softwareengineering.stackexchange.com/questions/404661

07-03-2021

Question

I have a habit that I follow mechanically, without thinking much about it.

Whenever a constructor takes parameters, I consider them public information that should be available to the calling code later on, if desired.

For example:

public class FooRepository : IFooRepository
{
    public FooRepository(IDbConnection dbConnection)
    {
        DbConnection = dbConnection ?? throw new ArgumentNullException(nameof(dbConnection));
    }

    public IDbConnection DbConnection { get; }
}

The calling code which instantiated a FooRepository object is passing an IDbConnection object and therefore has the right to access this information later on but can't modify it anymore (no set on the DbConnection property)

The dbConnection parameter could be passed explicitly or by dependency injection, it doesn't matter. The FooRepository shouldn't be aware of such details.

However, yesterday when doing pair programming with a coworker, he told me that any class I write should expose just the minimum useful information. He said developers shouldn't be able to analyse and mess with the internal state of an object.

I don't quite agree with him. In order to not waste too much time, I don't want to think a few minutes for each parameter to determine if this would be a good idea to expose it or not. In my opinion, there are some use cases we simply can't think of, when we first write a new class.

Whether or not the class will finally be included in a NuGet package doesn't really matter. I just don't want to limit users of my class from accessing information they explicitly passed when instantiating the object or could easily retrieve from the dependency injection framework.

Could someone explain to me what is considered a good practice here?

Should I really think whether each parameter makes sense to be exposed? Or is there a design pattern I can just instinctively apply without wasting too much time?

Any resources on the subject are welcome.

Solution

The calling code which instantiated a FooRepository object is passing an IDbConnection object and therefore has the right to access this information later on

This is not true when you're dealing with things like the factory pattern, where the instantiator of the object is not the handler of the object. Factory patterns quite often exist specifically because the object's construction is an implementation detail that should be abstracted away.

This applies to more cases than just the factory pattern. Essentially, it applies to any object that gets passed around at least once.
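A minimal sketch of that situation, under stated assumptions: the CountFoos member, the in-memory repository, FooRepositoryFactory and FooReport are all hypothetical names that do not appear in the question. It only illustrates that the code holding the repository never saw the constructor arguments, so it has no claim on them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract; the question does not show IFooRepository's members.
public interface IFooRepository
{
    int CountFoos();
}

// Hypothetical implementation standing in for a database-backed repository.
internal sealed class InMemoryFooRepository : IFooRepository
{
    private readonly IReadOnlyCollection<string> _foos;

    public InMemoryFooRepository(IReadOnlyCollection<string> foos)
    {
        _foos = foos ?? throw new ArgumentNullException(nameof(foos));
    }

    public int CountFoos() => _foos.Count;
}

// The factory is the only code that knows what the constructor needs.
public static class FooRepositoryFactory
{
    public static IFooRepository Create()
    {
        return new InMemoryFooRepository(new List<string> { "a", "b" });
    }
}

// The handler of the object: it receives an IFooRepository and never
// sees (or needs) whatever was passed to the constructor.
public class FooReport
{
    private readonly IFooRepository _repository;

    public FooReport(IFooRepository repository)
    {
        _repository = repository ?? throw new ArgumentNullException(nameof(repository));
    }

    public string Summary() => $"{_repository.CountFoos()} foos stored";
}
```

Here new FooReport(FooRepositoryFactory.Create()) works without FooReport knowing how the repository was built, which is the abstraction the factory exists to provide.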

but can't modify it anymore (no set on the DbConnection property)

This isn't true for reference types. It's true that you can't change which object is being referenced, but you can still alter its content. For example:

public class Foo
{
    public string Name { get; set; }
}

public class Baz
{
    public Foo Foo { get; } // allegedly: "can't modify it anymore"

    public Baz(Foo foo)
    {
        this.Foo = foo;
    }
}

var myFoo = new Foo() { Name = "Hello" };
var myBaz = new Baz(myFoo);

As per your claim, myBaz.Foo can no longer be modified. Yet this code is perfectly legal:

myBaz.Foo.Name = "a completely different name";

So exposing a get-only property still carries the risk that a consumer mutates the object your class relies on.

he told me that any class I write should expose just the minimum useful information.

I don't want to think a few minutes for each parameter to determine if this would be a good idea to expose it or not.

These two statements don't follow from each other. Exposing the minimum doesn't require you to deliberate over every parameter; it requires you to default to private instead of public, as you currently do. Unless there is a valid reason to expose something, don't.

This is an oversimplification as there are cases where you shouldn't start out on private (e.g. DTO properties), but if you're still struggling with evaluating this, it's already better to default to private instead of public.
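As a rough sketch of that default, assuming a hypothetical CountFoos member, a hypothetical FooDto and a table named Foo (none of which appear in the question): the repository keeps its constructor argument private, while the DTO starts out public because carrying data is its whole purpose.

```csharp
using System;
using System.Data;

// Hypothetical DTO: its properties are its contract, so they start out public.
public class FooDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical contract; the question does not show IFooRepository's members.
public interface IFooRepository
{
    int CountFoos();
}

public class FooRepository : IFooRepository
{
    // Private by default: the connection is needed to do the work,
    // but nothing in the class's purpose requires handing it back out.
    private readonly IDbConnection _dbConnection;

    public FooRepository(IDbConnection dbConnection)
    {
        _dbConnection = dbConnection ?? throw new ArgumentNullException(nameof(dbConnection));
    }

    public int CountFoos()
    {
        // Assumes the connection is already open and a table named Foo exists.
        using (IDbCommand command = _dbConnection.CreateCommand())
        {
            command.CommandText = "SELECT COUNT(*) FROM Foo";
            return Convert.ToInt32(command.ExecuteScalar());
        }
    }
}
```

If a real need to expose the connection ever appears, turning the private field into a public property is an additive change; going the other way is a breaking one.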

In my opinion, there are some use cases we simply can't think of when we first write a new class.

In my opinion, this is indicative of not quite understanding the class' responsibility and how it fits in the existing codebase.

In fact, that's sort of what you state in the question: you don't want to think about it. But you really should. For your example, what would ever be the purpose of a repository exposing its database connection? I can't think of any answer here that does not immediately violate good practice rules, can you?
Exposing the database connection is not part of the repository's purpose, which is all about providing access to a persistent data store.

In part, this is a matter of experience, which will come over time. Every time you have to change the access modifier on an existing property/method is an opportunity to learn why the previous choice was not the right one. Do it enough and you will get better at judging public contracts in the first design.

In my opinion, there are some use cases we simply can't think of when we first write a new class.

Don't forget OCP: "software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification".

If you are inherently accounting for needing to change the internals of classes as time passes, you're taking a stance at odds with OCP.

That's not to say internals can't be changed when e.g. bugs are found or breaking changes are implemented; but it does mean you should try to avoid it as best as you can. Changing existing (often central) logic is one of the most common sources of bugs, especially the crippling ones.

Whether or not the class will finally be included in a NuGet package doesn't really matter.

It really does matter. If your library is only being used in the same solution file, you can change things very quickly to your needs and can confirm it's working with a simple build.

But NuGet compounds the issue. If you change the contracts of the classes exposed in your NuGet package, every consumer of that package will have to deal with breaking changes.
From personal experience, the issue is further compounded by NuGet servers not keeping a record of who has consumed your package, which makes it hard to figure out who all your consumers are and warn them ahead of time that breaking changes are about to be released.

Had you defaulted to making things private, and then selectively exposed them, there would be less of a problem here. Adding to the contract without changing the existing parts does not break existing code.
Removing things from the contract, which is what would happen if you default to public, would always be liable to breaking code that depends on the thing you're now removing from the contract.

Should I really think for each parameter if it makes sense to expose it?

Yes. But it's not as complicated as you're making it out to be. Understanding what a certain class needs to expose or not is something you need to think about once per class. What is this class' purpose? How do I want this class to be used by its consumers?

After that, all properties/methods that you develop can easily be matched to the class' purpose, which is not a new evaluation but simply applying the decision you already made.

Or is there a design pattern I can just instinctively apply without wasting too much time?

If you were using interfaces on all your classes and using interface-based dependency injection, it would really help you in understanding how to separate a class' contract (things in the interface) from its implementation (things not in the interface).

Take for example:

public interface ISodaVendingMachine
{
    Soda GetDrink();
}

public class RegularVendingMachine : ISodaVendingMachine
{
    private Drinks drinks;

    public RegularVendingMachine(Drinks drinks)
    {
        this.drinks = drinks;
    }

    public Soda GetDrink()
    {
        return this.drinks.TakeOne();
    }
}

public class ConjuringVendingMachine : ISodaVendingMachine
{
    private PhilosophersStone philosophersStone;

    public ConjuringVendingMachine(PhilosophersStone philosophersStone)
    {
        this.philosophersStone = philosophersStone;
    }

    public Soda GetDrink()
    {
        return philosophersStone.PerformIncantation("Drinkum givum");
    }
}

The internals of each vending machine are up to that machine. It doesn't matter how it obtains a drink and dispenses it to the consumer. To the consumer, that's an irrelevant implementation detail. The customer doesn't want to know how the sausage gets made.

What matters for the public contracts is that they dispense a drink to the consumer, and thus the ISodaVendingMachine interface is built specifically for that purpose.

Notice how the interface doesn't care about anything other than what it was designed to ensure.

When you have that interface, you can already see that anything in your class that isn't part of that interface should most likely be private as it is an implementation detail, not a contract.

OTHER TIPS

Your coworker is right. Internal state should be encapsulated by default and only exposed where there is a good reason to. So when in doubt, hide.

Should I really think for each parameter if it makes sense to expose it? Or is there a design pattern I can just instinctively apply without wasting too much time?

Just hide them all by default; that is the easiest. It will be obvious to you if some property needs to be exposed.

In general, an object should expose as little surface as necessary in order to solve its own concern. This avoids accidental coupling and makes the code overall more modular and easier to change.

This is part of the more general pattern that access to any data or functionality should be as limited as possible:

The scope of a variable should be as small as possible (local is better than global, inner scope is better than outer scope, etc.).
Access to members should be as constrained as possible (this is why members default to private).
Any interface should be as small as possible.
The lifetime of any object should be as short as possible.
Immutable objects are better than mutable ones.

It all comes down to reducing the number of things which a given section of code may affect, and which may affect the code. The fewer dependencies and interactions, the fewer bugs and the easier it is to change the code and add new features without breaking stuff left and right.

I just don't want to limit users of my class from accessing information they explicitly passed when instantiating the object

I think you might have an antipattern going there. If the user of the class passed the information, then that user must already have it, right? So why would the user need to retrieve it again from the object it was just passed to?

In your example, the connection used by a repository is an implementation detail. The whole idea of a repository is that you can use it without concerns about the underlying storage. So the need to fetch the connection from a repository suggests a different problem in the design.

Think about why you are creating a repository in the first place. You're trying to abstract the concept of persistence so that clients don't need to know how data is stored. You do this so that you can shield clients from future changes to the persistence mechanism.

Exposing a DbConnection from your repository makes it part of your public interface. You're saying "I will always use a database, specifically an IDbConnection", which makes it difficult to switch the implementation (to in-memory or file-based for example).

If you really need to expose DbConnection, you should first create an interface IFooRepository that only contains generic persistence methods, and a concrete implementation DbFooRepository that additionally exposes the connection. Generally, you want clients to depend only on the IFooRepository interface so that they are not coupled to the database, but if a specific client really needs direct access to the database, they can use DbFooRepository instead.
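The split described above could look roughly like this. It is only a sketch: the CountFoos member, the Foo table and the two client classes (FooStatistics, FooDiagnostics) are hypothetical and chosen purely for illustration.

```csharp
using System;
using System.Data;

// Generic persistence contract: nothing here mentions a database.
public interface IFooRepository
{
    int CountFoos(); // hypothetical member, not taken from the question
}

// Concrete implementation that additionally exposes its connection.
public class DbFooRepository : IFooRepository
{
    public DbFooRepository(IDbConnection dbConnection)
    {
        DbConnection = dbConnection ?? throw new ArgumentNullException(nameof(dbConnection));
    }

    // Visible only to code that deliberately depends on DbFooRepository.
    public IDbConnection DbConnection { get; }

    public int CountFoos()
    {
        // Assumes the connection is already open and a table named Foo exists.
        using (IDbCommand command = DbConnection.CreateCommand())
        {
            command.CommandText = "SELECT COUNT(*) FROM Foo";
            return Convert.ToInt32(command.ExecuteScalar());
        }
    }
}

// Typical client: coupled to the interface only, so the storage can change.
public class FooStatistics
{
    private readonly IFooRepository _repository;

    public FooStatistics(IFooRepository repository)
    {
        _repository = repository ?? throw new ArgumentNullException(nameof(repository));
    }

    public bool HasAnyFoos() => _repository.CountFoos() > 0;
}

// Rare client that really needs the database: it opts into the coupling explicitly.
public class FooDiagnostics
{
    private readonly DbFooRepository _repository;

    public FooDiagnostics(DbFooRepository repository)
    {
        _repository = repository ?? throw new ArgumentNullException(nameof(repository));
    }

    public string DescribeStorage() => $"Database: {_repository.DbConnection.Database}";
}
```

Swapping in an in-memory or file-based implementation would then affect FooDiagnostics only, since it is the one class that asked for the concrete type.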

In my opinion, there are some use cases we simply can't think of when we first write a new class.

YAGNI. It's easier to add it when you need to (which may be never). Don't just expose a bunch of details "just in case" someone might need them in the future. Especially if you're going to publish a NuGet package, keep in mind that once you make something part of a public API, you can't remove it without breaking the consumers who depend on it.

You should expose what you need to in order to do what you want the class to do. Anything else is at best gold-plating, and at worst overwhelming the consumer with extraneous details. As an example of the latter, it's pretty unlikely that a user of your FooRepository will need to know what logger you are using, but that's something that gets injected into a lot of repositories.

Public information isn't merely interesting or even useful information; it should be required information. If you can't come up with a reason why your users would be demanding the information, saying the class is unusable without it, then leave it out. As you develop the class, you may come up with a reason why you need to expose some aspect of it, and that is fine. But even if the class is a Builder, it will most likely have lots of details which don't need to be shared. Don't overshare.

Licensed under: CC-BY-SA with attribution
