
I recently read a blog post, which I can no longer find, about how we should "free the data". Its main point was that we overuse classes and encapsulation, since many problems can be solved with less overhead by using plain old (passive) data structures combined with function overloading. The post made me more aware of the cost of creating (and maintaining) classes and class hierarchies. To share this new awareness with my colleagues, I tried to pinpoint the conditions that justify creating a class. So far, I've found:

  • Presence of an invariant. For instance, a map should always contain the same number of keys and elements. You do not want the user to add a key and forget to add the corresponding element.
  • Implementation hiding to have the freedom to change it easily. For instance, a Point can be encoded with Cartesian coordinates (x,y) or with a radius and an angle.
  • Homogeneous manipulation. For instance, if you want Dog and Cat to be manipulated the same way because they are both specializations of the more general concept of Animal.
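To make the invariant case concrete, here is a minimal C++ sketch. The names `RawTable` and `Table` are made up for illustration, and the lookup is a naive linear search with no duplicate-key handling; the only point is that the class leaves no way to add a key without its element, while the plain struct does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Plain data: nothing stops a caller from adding a key without a value.
struct RawTable {
    std::vector<std::string> keys;
    std::vector<int> values;
};

// Class: data can only be added through insert(), so the invariant
// keys_.size() == values_.size() holds after every public call.
class Table {
public:
    void insert(const std::string& key, int value) {
        keys_.push_back(key);
        try {
            values_.push_back(value);
        } catch (...) {
            keys_.pop_back();  // roll back so the two vectors stay in sync
            throw;
        }
        assert(keys_.size() == values_.size());
    }

    std::size_t size() const { return keys_.size(); }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        for (std::size_t i = 0; i < keys_.size(); ++i) {
            if (keys_[i] == key) return &values_[i];
        }
        return nullptr;
    }

private:
    std::vector<std::string> keys_;
    std::vector<int> values_;
};
```

With `RawTable`, a single `r.keys.push_back("orphan")` is enough to break the pairing; with `Table`, that line does not compile because the vectors are private.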

What are the other reasons to create classes or class hierarchies?

Edit: By cost, I mean the time, money and technical debt required to create and maintain classes and class hierarchies. This cost should be compared to the cost of other solutions.

Edit 2: I realized that trying to make this question general was a mistake. I definitely have C++ in mind.



This is a very broad question, as it depends on the language used, and the features that language offers:

  • Can "plain old data structures" be made immutable?
  • Does the language enforce encapsulation of private functions and data, or is it by convention?
  • Is the language statically or dynamically typed?
  • Does it allow functions outside of static classes?
  • Does it treat functions as first class values?
  • Does it support interfaces?
  • Does it support records, structs etc?
  • Does it even support classes?

Depending on the answers to the above questions, the strategy used will vary. If, for example, the language doesn't support classes, you won't be using them...

Having said all that, there are some general rules that can be followed across languages:

  1. Avoid global state. If you have data that is globally accessible and mutable, you're on the path to debugging hell. Just don't do it.
  2. Avoid coupling. Whether it's through having objects spin up instances of other classes, or functions hard-coded to call other public functions, you're making the code harder to test and maintain. Use injection techniques and keep coupling as loose as possible.
  3. Avoid inheritance. Inheritance causes coupling problems (including the Fragile Base Class Problem), weakens encapsulation and makes testing harder. Unless you are using a language that can only achieve polymorphism via inheritance (i.e. one that doesn't support truly abstract classes or interfaces), don't use inheritance.
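As a rough C++ illustration of rule 2, the sketch below passes a dependency in through a purely abstract interface instead of creating it inside the function. `Clock`, `FixedClock` and `copyright_line` are hypothetical names chosen for this example only. C++ has no separate interface construct, so a class with only pure virtual functions plays that role here.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface: callers depend on this abstraction only.
class Clock {
public:
    virtual ~Clock() = default;
    virtual int current_year() const = 0;
};

// A trivial implementation with a fixed year, handy as a test double.
class FixedClock : public Clock {
public:
    explicit FixedClock(int year) : year_(year) {}
    int current_year() const override { return year_; }

private:
    int year_;
};

// The clock is injected rather than constructed inside the function,
// so the function is not coupled to any concrete time source.
std::string copyright_line(const Clock& clock, const std::string& owner) {
    return "(c) " + std::to_string(clock.current_year()) + " " + owner;
}

void usage_example() {
    FixedClock clock(2020);
    assert(copyright_line(clock, "ACME") == "(c) 2020 ACME");
}
```

Because the time source is a parameter, the function can be tested without touching the system clock.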

As a rule of thumb, for a typical modern language that supports static functions and classes:

  1. Keep data as immutable as possible,
  2. Keep data and functionality as separate as possible,
  3. But use objects to encapsulate state and provide methods to handle that state,
  4. Only use functions (static methods) when they can be made pure, i.e. they produce a result from the parameters in a deterministic fashion without side effects.
  5. Design to interfaces (or the equivalent) and use injection as much as possible.
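Here is how the first four points might look in C++. This is only a sketch under my own assumptions: `Price`, `with_tax` and `Basket` are invented names, and integer cents with truncating division stand in for real money handling.

```cpp
#include <cassert>

// Immutable data: the member is const, so a Price cannot change once created.
struct Price {
    const long cents;
};

// Pure function: the result depends only on the arguments, with no side effects.
Price with_tax(Price net, int tax_percent) {
    return Price{net.cents + net.cents * tax_percent / 100};
}

// Stateful object: the running total is private and changes only through add().
class Basket {
public:
    void add(Price p) { total_cents_ += p.cents; }
    long total_cents() const { return total_cents_; }

private:
    long total_cents_ = 0;
};

void usage_example() {
    Basket basket;
    basket.add(Price{1000});
    basket.add(with_tax(Price{500}, 10));  // 500 + 50
    assert(basket.total_cents() == 1550);
}
```

The data and the function that transforms it stay separate, and the only mutable state sits behind the methods of `Basket`.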

Other tips

Classes vs "free data" + functions

The freedom of data is always at the expense of the freedom of the programmer:

  • With a POD you're free to do whatever you want. And what you want, you will implement in a nicely overloaded function.
  • But other programmers (or your future self) are also free to do whatever they want and express their creativity. They are free to shoot themselves in the foot if they don't pay close attention to what they do.
  • In addition, as soon as you expose a POD publicly, it is no longer a freedom: it is instant technical debt! Why? Because you no longer control how the data is used, so you no longer have the power to change the structure as you wish (e.g. if you decided to replace the integer day, month and year with a single date field). Such a change would require careful analysis of all the code using this data.
  • In small systems with a couple of programmers, you can impose some discipline, which mitigates these risks. However, in programming in the large (spacecraft engines, nuclear power plants, telecommunication systems), with thousands of programmers, such visibility will inevitably cause bugs or undesired dependencies that hamper future maintainability. It's statistical.
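The date example can be sketched in C++ as follows. `RawDate`, `Date` and the packed `yyyymmdd` encoding are assumptions made purely for illustration, and the constructor only does simple range checks, not real calendar validation.

```cpp
#include <cassert>

// Exposed POD: every user reads d.day, d.month and d.year directly,
// so changing the layout means revisiting all of that code.
struct RawDate {
    int day;
    int month;
    int year;
};

// Class: callers only see year(), month() and day(), so the three
// integers could be replaced by a single field without affecting them.
class Date {
public:
    Date(int year, int month, int day)
        : packed_(year * 10000 + month * 100 + day) {
        // Range checks only; this is not a full calendar validation.
        assert(year >= 0 && year <= 9999);
        assert(month >= 1 && month <= 12);
        assert(day >= 1 && day <= 31);
    }

    int year() const { return packed_ / 10000; }
    int month() const { return packed_ / 100 % 100; }
    int day() const { return packed_ % 100; }

private:
    int packed_;  // hypothetical encoding: yyyymmdd
};
```

Code written against `Date` keeps compiling if the private field later changes again, which is exactly the control the owner of a public POD gives up.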

This is why controlling the visibility of data and functions has been an issue in large systems, and why so much effort has gone into programming languages to address it, since the early days of structured programming:

  • In the pre-object-oriented world, this led to the concept of the module, as found in languages like Modula-2 or Ada.
  • In the more recent object-oriented languages, this led to the concept of the class. The class also gives you, by the way, the liberty to create and manage multiple instances very easily, whereas modules offer no such facility.

What I am trying to say above is that the discipline of classes gives you more freedom than you think. It's just a question of point of view (i.e. the owner of the class vs. the consumer).

What classes can do more

As you've mentioned, classes can do more than free data with functions:

  • encapsulation and separation of concerns
  • data abstraction
  • inheritance going from a general class to more specific ones
  • polymorphism, allowing invocation of functions/methods that are specific to a class without knowing at compile time which class the object will be
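For the polymorphism point, a small C++ sketch using the Dog/Cat/Animal example from the question. The function `chorus` is a made-up name, and `std::make_unique` assumes C++14 or later.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Animal {
public:
    virtual ~Animal() = default;
    virtual std::string sound() const = 0;
};

class Dog : public Animal {
public:
    std::string sound() const override { return "Woof"; }
};

class Cat : public Animal {
public:
    std::string sound() const override { return "Meow"; }
};

// The caller does not know at compile time which concrete class each
// element is; the right sound() is selected at run time.
std::string chorus(const std::vector<std::unique_ptr<Animal>>& animals) {
    std::string out;
    for (const auto& animal : animals) {
        if (!out.empty()) out += " ";
        out += animal->sound();
    }
    return out;
}

void usage_example() {
    std::vector<std::unique_ptr<Animal>> zoo;
    zoo.push_back(std::make_unique<Dog>());
    zoo.push_back(std::make_unique<Cat>());
    assert(chorus(zoo) == "Woof Meow");
}
```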

Classes are also a building block for additional freedoms:

  • We witnessed in leading languages the emergence of generic programming, which allows you to define a generic function independently of the data that it has to manipulate.
  • The concept of design patterns was also a breakthrough. It's difficult to imagine such patterns implemented with POD and functions. (In fact I can imagine it, because I did it for years in a non-object-oriented language: it works, but at what cost in complexity...)
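To illustrate the generic programming point in C++: the function template below is written once and works for any type that provides `operator+`. `sum` and `Money` are names invented for this sketch.

```cpp
#include <cassert>
#include <vector>

// A user-defined type that supports addition.
struct Money {
    long cents;
};

Money operator+(Money a, Money b) { return Money{a.cents + b.cents}; }

// Generic function: defined independently of the element type T,
// which only needs to be copyable and to support operator+.
template <typename T>
T sum(const std::vector<T>& items, T start) {
    for (const T& item : items) {
        start = start + item;
    }
    return start;
}

void usage_example() {
    assert(sum(std::vector<int>{1, 2, 3}, 0) == 6);
    assert(sum(std::vector<Money>{{100}, {250}}, Money{0}).cents == 350);
}
```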


As a conclusion, a quote from Bjarne Stroustrup:

Do we really need multiple inheritance? Not really. We can do without multiple inheritance by using workarounds, exactly as we can do without single inheritance by using workarounds. We can even do without classes by using workarounds. (...) The reason languages provide inheritance (...) is that language-supported inheritance is typically superior to workarounds (e.g. use of forwarding functions to sub-objects or separately allocated objects) for ease of programming, for detecting logical problems, for maintainability, and often for performance.

Licensed under: CC-BY-SA with attribution