Open closed principle: code duplication and encapsulation

https://softwareengineering.stackexchange.com/questions/405635

08-03-2021
|

Question

In another question on this site, asking to clarify the open closed principle, @Kate Gregory answered this. I'm interested in this part specifically:

Imagine you wrote an Invoice class that works perfectly and has no bugs. It makes a PDF of an invoice. Then someone says they want an HTML invoice with links in it. You don't change any code in Invoice to satisfy this request. Instead, you make another class, HTMLInvoice, that does what they now want. You leverage inheritance so that you don't have to write a lot of duplicate code in HTMLInvoice.

I'm wondering how this view can be matched with proper encapsulation. In my view, a properly encapsulated Invoice class would take the data it needs, do all it's work privately, and then spit out a PDF. However, in that case HTMLInvoice would be unable to derive from Invoice in order to reduce "duplicate code". Without access to Invoice's private fields/methods there is nothing for HTMLInvoice to re-use.

Surely the idea can't be to make Invoice's private members protected (from the start) to facilitate this?

Solution

Invoice class that works perfectly and has no bugs. It makes a PDF of an invoice. Then someone says they want an HTML invoice...

The example used in the quoted answer is hypothetical and somewhat imprecise:

It is not clear if Invoice is an abstract business document that will generate a concrete representation (PDF, HTML, XML/EDIFACT electronic invoice ...) or if it is already a concrete representation (i.e. invoice it tightly coupled to the output, which the wording suggests).
It is not either clear what is the "contact" of Invoice, i.e. the promises made by its interfaces.

Your doubts are therefore perfectly justified: if Invoice promises a PDF output, you cannot derive an HTMLInvoice output unless you break Liskov's substitution principle.

The whole problem here, is that the wording suggests that Invoice is already coupled to its representation, which could be seen as an implementation detail. So, as you point out, it would not hide the implementation detail.

But this is a fictuous example and not real-world. First of all, the format of a business document is not an implementation detail: producing a self standing PDF which can be legally archived, or an HTM which allows hyperlink to additional information, or even an electronic invoice is nowadays often a requirement.

Keeping this in mind, you could think of a design in which Invoice produces an output, leaving the format open. So Invoice would be slightly more abstract than in the above mentioned example. If your invoice has a method ElectronicDocument ProduceOutput() you could then have class specialisations PDFInvoice, HTMLInvoice, EDIFACTInvoide that would return a PDF, and HTML or and XML/EDIFACT document provided those are all specialisations of ElectronicDocument: no encapsulation issue; Open/Close and LSP are safeguarded.

But most probably, you should favor composition over inheritance here. The invoice format object would indeed better be defined at runtime. Perhaps it could change for an invoice, if the customer calls and tells that he wants it in another format (that's daily business in many accounting departments). Or perhaps even you could allow an Invoice to work with several representations.

There are plenty of ways to decouple the invoice format from the invoice (e.g. Builder to construct different representations, output Strategy to select the right document generation algorithm, visitor if you'd prefer to have a document generator using the invoice rather than the other way round, etc...). So in reality, you'd first think about your design, and depending on your intents, you would then chose the most appropriate way to decouple elements and ensure proper separation of concerns.

OTHER TIPS

In my view, a properly encapsulated Invoice class would take the data it needs, do all it's work privately, and then spit out a PDF

If you make the class like that then it is "closed to extension"

If you are planning ahead and following the open/closed principle then the you would make an Invoice class and a InvoiceWriter class, or other ways of enabling you to switch out the PDF generation part of the code.

The example in the answer you quote is talking about inheriting from Invoice or some InvoiceBase class, because this would allow code that uses the Invoice class to be open to extention. As you could pass that code a HTMLinvoice or PDFinvoice and it would function accordingly without change.

You have to be able to recognise that writing a PDF is a different responsibility than whatever Invoice covers and where to draw the line between classes.

I think Kate's example is somewhat ill chosen. I would have interpreted Modification as "changing a thing that was there before to do something different or differently" and Extension as "changing a thing to do an additional thing". I would have put changing the output format as a Modification, because something that was there before isn't happening anymore and has been replaced by something else. This is not solved by inheritance.

I'd identify what pdf invoice and html invoice have in common, and refactor into a common base class. That would then apply OCP, the common base class is extended, not modified.

Imagine you wrote an Invoice class that works perfectly and has no bugs. It makes a PDF of an invoice.

Then someone says what if they want an HTML invoice?
Then someone says what if they want to invoice services rather than items?
Then someone says what if they want to invoice in a different country with different tax laws?
Then someone says what if they want to invoice items shipped in each box rather than in each sale?

The thing about change is you don't know what direction it going to come from until it's coming. You can make yourself crazy trying to accommodate it before it's here.

Because of that, do not expect whoever wrote the original Invoice to have any idea that someone might someday want an HTML invoice. Instead expect them to allow for Invoice to be done differently.

Maybe that means they already have the PDF knowledge broken out in a different class that can be substituted during construction so the whole thing works the same way. Maybe they don't. But they damn well better let you step in and change what it does without changing Invoice directly. Even if that means inheriting from a concrete implementation. Ick.

It's much nicer if your follow OCP in a way that supports LSP and DIP as well. Basically what you ensure is that the using code won't have to change because it doesn't know it's talking to InvoiceHTML or InvoicePDF. It's just talking to Invoice, whatever that is. This knowledge has to exist somewhere though, so we move it to construction of Invoice. This is why construction code rarely follows OCP. But since it's free of business logic it's less painful to change.

Whichever way they do it they've followed OCP if they let you change what actually happens when using code uses what it thinks is still Invoice without having to destroy the original Invoice

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange