Does splitting a potentially monolithic application into several smaller ones help prevent bugs? [closed]

https://softwareengineering.stackexchange.com/questions/388461

22-02-2021

Question

Another way of asking this is: why do programs tend to be monolithic?

I am thinking of something like an animation package like Maya, which people use for various different workflows.

If the animation and modelling capabilities were split into their own separate application and developed separately, with files being passed between them, would they not be easier to maintain?

Solution

Yes. Generally, two smaller, less complex applications are much easier to maintain than a single large one.

However, you get a new type of bug when the applications all work together to achieve a goal. In order to get them to work together they have to exchange messages and this orchestration can go wrong in various ways, even though every application might function perfectly. Having a million tiny applications has its own special problems.
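
A minimal sketch of that kind of orchestration bug, assuming two hypothetical functions that stand in for the modelling and animation applications and a JSON message passed between them. Each side is internally consistent; the defect only exists in the hand-off, because the two sides disagree about units (an assumption made up for this illustration).

```python
import json


def export_model(height_m: float) -> str:
    """Hypothetical modelling app: writes the height in metres."""
    return json.dumps({"name": "cube", "height": height_m})


def import_model(payload: str) -> float:
    """Hypothetical animation app: assumes the height is in centimetres
    and converts it to metres."""
    data = json.loads(payload)
    return data["height"] / 100.0


original_height = 2.0
round_tripped = import_model(export_model(original_height))

# Neither function is "wrong" on its own, yet the value is corrupted.
mismatch = abs(round_tripped - original_height) > 1e-9
```

Neither function would fail a test written purely against its own specification; only a test that exercises both together exposes the problem.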

A monolithic application is really the default option you end up with when you add more and more features to a single application. It's the easiest approach when you consider each feature on its own. It's only once it has grown large that you can look at the whole and say "you know what, this would work better if we separated out X and Y".

OTHER TIPS

Does splitting a potentially monolithic application into several smaller ones help prevent bugs

Things are seldom that simple in reality.

Splitting up definitely does not help to prevent those bugs in the first place. It can sometimes help to find bugs faster. An application which consists of small, isolated components may allow more individual ("unit"-style) tests for those components, which can sometimes make it easier to spot the root cause of certain bugs, and so allow them to be fixed faster.

However,

even an application which appears to be monolithic from the outside may consist of a lot of unit-testable components inside, so unit testing is not necessarily harder for a monolithic app
as Ewan already mentioned, the interaction of several components introduces additional risks and bugs. And debugging an application system with complex interprocess communication can be significantly harder than debugging a single-process application
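
To illustrate the first point, here is a small sketch of a component that lives inside a single-process application but is still unit-testable on its own. The names `interpolate` and `run_app` are hypothetical and chosen only for this example.

```python
def interpolate(start: float, end: float, t: float) -> float:
    """Linear interpolation between two keyframe values (the 'component')."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return start + (end - start) * t


def run_app() -> list:
    """The 'monolith': one process that happens to use the component."""
    return [interpolate(0.0, 10.0, step / 4) for step in range(5)]


def test_interpolate() -> None:
    """Unit test that exercises the component without starting the app."""
    assert interpolate(0.0, 10.0, 0.0) == 0.0
    assert interpolate(0.0, 10.0, 0.5) == 5.0
    assert interpolate(0.0, 10.0, 1.0) == 10.0


test_interpolate()
frames = run_app()
```

Nothing here required a second executable; the testability comes from the function having a narrow, explicit interface.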

This also depends a lot on how well a larger app can be split up into components, how broad the interfaces between the components are, and how those interfaces are used.

In short, this is often a trade-off, not something where a "yes" or "no" answer is correct in general.

why do programs tend to be monolithic

Do they? Look around you, there are gazillions of Web apps in the world which don't look very monolithic to me, quite the opposite. There are also a lot of programs available which provide a plugin model (AFAIK even the Maya software you mentioned does).

would they not be easier to maintain

"Easier maintenance" here often comes from the fact that different parts of an application can be developed more easily by different teams, which means a better-distributed workload, specialized teams with a clearer focus, and so on.

I'll have to disagree with the majority on this one. Splitting up an application into two separate ones does not in itself make the code any easier to maintain or reason about.

Separating code into two executables just changes the physical structure of the code, but that's not what is important. What decides how complex an application is, is how tightly coupled the different parts that make it up are. This is not a physical property, but a logical one.

You can have a monolithic application that has a clear separation of different concerns and simple interfaces. You can have a microservice architecture that relies on implementation details of other microservices and is tightly coupled with all others.

What is true is that the process of splitting one large application into smaller ones is very helpful when trying to establish clear interfaces and requirements for each part. In DDD speak that would be coming up with your bounded contexts. But whether you then create lots of tiny applications or one large one that has the same logical structure is more of a technical decision.

Easier to maintain once you've finished splitting them, yes. But splitting them is not always easy. Trying to split off a piece of a program into a reusable library reveals where the original developers failed to think about where the seams should be. If one part of the application is reaching deep into another part of the application, it can be difficult to fix. Ripping the seams forces you to define the internal APIs more clearly, and this is what ultimately makes the code base easier to maintain. Reusability and maintainability are both products of well defined seams.
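
A rough sketch of what a well-defined seam can look like inside one process, assuming hypothetical `Modeller` and `Animator` classes. The animator depends only on a narrow interface (`MeshSource`, also an invented name), not on the modeller's internals, so either side can be changed or replaced without touching the other.

```python
from dataclasses import dataclass, field
from typing import List, Protocol, Tuple


class MeshSource(Protocol):
    """The seam: the only thing the animator is allowed to know."""

    def vertex_count(self) -> int: ...


@dataclass
class Modeller:
    """Hypothetical modelling component with its own internal state."""

    vertices: List[Tuple[float, float, float]] = field(default_factory=list)

    def vertex_count(self) -> int:
        return len(self.vertices)


class Animator:
    """Hypothetical animation component; never reaches into Modeller."""

    def __init__(self, source: MeshSource) -> None:
        self._source = source

    def frame_buffer_size(self, frames: int) -> int:
        # One stored position per vertex per frame.
        return frames * self._source.vertex_count()


class FakeMesh:
    """Test double: satisfies the seam without any modelling code."""

    def vertex_count(self) -> int:
        return 3


real = Animator(Modeller([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
fake = Animator(FakeMesh())
```

Whether `Modeller` and `Animator` later end up in separate libraries or separate executables is then a packaging decision; the seam is what makes either option possible.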

It's important to remember that correlation is not causation.

Building a large monolith and then splitting it up into several small parts may or may not lead to a good design. (It can improve the design, but it isn't guaranteed to.)

But a good design often leads to a system being built as several small parts rather than a large monolith. (A monolith can be the best design, it's just much less likely to be.)

Why are small parts better? Because they're easier to reason about. And if it's easy to reason about correctness, you're more likely to get a correct result.

To quote C.A.R. Hoare:

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.

If that's the case, why would anyone build an unnecessarily complicated or monolithic solution? Hoare provides the answer in the very next sentence:

The first method is far more difficult.

And later in the same source (the 1980 Turing Award Lecture):

The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay.

This is not a question with a yes or no answer. The question is not just ease of maintenance; it is also a question of efficient use of skills.

Generally, a well-written monolithic application is efficient. Inter-process and inter-device communication is not cheap. Breaking up a single process decreases efficiency. However, executing everything on a single processor can overload the processor and slow performance. This is the basic scalability issue. When the network enters the picture, the problem gets more complicated.

A well written monolithic application that can operate efficiently as a single process on a single server can be easy to maintain and keep free of defects, but still not be an efficient use of coding and architectural skills. The first step is to break the process into libraries that still execute as the same process, but are coded independently, following disciplines of cohesion and loose coupling. A good job at this level improves maintainability and seldom affects performance.

The next stage is to divide the monolith into separate processes. This is harder because you enter into tricky territory. It's easy to introduce race condition errors. The communication overhead increases and you must be careful of "chatty interfaces." The rewards are great because you break a scalability barrier, but the potential for defects also increases. Multi-process applications are easier to maintain on the module level, but the overall system is more complicated and harder to troubleshoot. Fixes can be devilishly complicated.
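
As a sketch of the "chatty interface" problem, the example below counts round trips against a stand-in for another process. `RemoteStore` is a hypothetical class; no real inter-process communication happens here, each method call simply represents one request/response pair.

```python
from typing import Dict, List


class RemoteStore:
    """Stand-in for a separate process; each call counts as one round trip."""

    def __init__(self, data: Dict[str, int]) -> None:
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key: str) -> int:
        self.round_trips += 1
        return self._data[key]

    def get_many(self, keys: List[str]) -> List[int]:
        self.round_trips += 1
        return [self._data[key] for key in keys]


DATA = {"x": 1, "y": 2, "z": 3}
KEYS = ["x", "y", "z"]

# Chatty: one round trip per value.
chatty = RemoteStore(DATA)
chatty_values = [chatty.get(key) for key in KEYS]

# Batched: the same values in a single round trip.
batched = RemoteStore(DATA)
batched_values = batched.get_many(KEYS)
```

Inside a single process the two styles cost about the same, which is why chatty call patterns often go unnoticed until the boundary becomes a process or network boundary.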

When the processes are distributed to separate servers or to a cloud style implementation, the problems get harder and the rewards greater. Scalability soars. (If you are considering a cloud implementation that does not yield scalability, think hard.) But the problems that enter at this stage can be incredibly difficult to identify and think through.

No, it does not make it easier to maintain. If anything, welcome to more problems.

Why?

The programs are not orthogonal; they need to preserve each other's work in so far as is reasonable, which implies a common understanding.
Much of the code in both programs is identical. Are you maintaining a common shared library, or maintaining two separate copies?
You now have two development teams. How are they communicating?
You now have two products that need:
- a common UI style, interaction mechanisms, etc... So you now have design problems. (How are the dev teams communicating again?)
- backward compatibility (can modeller v1 be imported into animator v3?)
- cloud/network integration (if it's a feature) now has to be updated across twice as many products.
You now have three consumer markets: Modellers, Animators and Modeller Animators
- They will have conflicting priorities
- They will have conflicting support needs
- They will have conflicting usage styles
Do the Modeller Animators have to open two separate applications to work on the same file? Is there a third application with both functions, or does one application load the functions of the other?
etc...

That being said, smaller code bases are easier to maintain at the application level; you're just not going to get a free lunch. This is the same problem at the heart of microservice (or any modular) architecture. It's not a panacea: maintenance difficulty at the application level is traded for maintenance difficulty at the orchestration level. Those issues are still issues; they just aren't in the code base any more, and they will need to be either avoided or solved.

If solving the problem at the orchestration level is simpler than solving it at each application level, then it makes sense to split it into two code bases and deal with the orchestration issues.

Otherwise, no, just do not do it; you would be better served by improving the internal modularity of the application itself. Push out sections of code into cohesive, easier-to-maintain libraries that the application acts as a plugin to. After all, a monolith is just the orchestration layer of a library landscape.

There were a lot of good answers but since there is almost a dead split I'll throw my hat into the ring too.

In my experience as a software engineer, I have found this to not be a simple problem. It really depends on the size, scale, and purpose of the application. Older applications, by virtue of the inertia involved in changing them, are generally monolithic, as this was a common practice for a long time (Maya would qualify in this category). I assume you're talking about newer applications in general.

In small enough applications that are more or less single-concern, the overhead required to maintain many separate parts generally exceeds the utility of having the separation. If it can be maintained by one person, it can probably be made monolithic without causing too many problems. The exception to this rule is when you have many different parts (a frontend, a backend, perhaps some data layers in between) that are conveniently separated (logically).

In very large applications, even single-concern ones, splitting up makes sense in my experience. You get the benefit of eliminating a subset of the possible classes of bugs in exchange for other (sometimes easier to solve) bugs. In general, you can also have teams of people working in isolation, which improves productivity. Many applications these days, however, are split pretty finely, sometimes to their own detriment. I have also been on teams where the application was unnecessarily split across so many microservices that it introduced a lot of overhead when things stopped talking to each other. Additionally, having to hold all of the knowledge of how each part talks to the other parts gets much harder with each successive split. There is a balance, and as you can tell by the answers here, the way to find it isn't very clear, and there is really no standard in place.

For UI apps, it is unlikely to decrease the overall number of bugs, but it will shift the balance of the bug mix toward problems caused by communication.

Speaking of user-facing UI applications and sites: users are extremely impatient and demand low response times. This turns any communication delay into a bug. As a result, one trades a potential decrease in bugs due to the reduced complexity of a single component for very hard bugs and the timing requirements of cross-process/cross-machine communication.

If the units of data the program deals with are large (e.g. images), then any cross-process delays will be longer and harder to eliminate. Something like "apply a transformation to a 10 MB image" instantly gains 20 MB of disk/network IO, in addition to two conversions: from the in-memory format to a serializable format and back. There is really not much you can do to hide the time this takes from the user.
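
The arithmetic can be sketched as follows. This is a simulation inside one process, assuming `pickle` as the serialization format and a byte-reversal as a placeholder "transformation"; a real cross-process call would add actual disk or network transfer on top of what is counted here.

```python
import pickle
from typing import Tuple

IMAGE_BYTES = 10 * 1024 * 1024  # a "10 MB" image, as raw bytes


def transform(image: bytes) -> bytes:
    """Placeholder transformation: reverse the byte order."""
    return image[::-1]


def transform_out_of_process(image: bytes) -> Tuple[bytes, int]:
    """Simulate sending the image to another process and getting it back.

    Returns the result and the number of serialized bytes that would
    have to cross the process boundary.
    """
    request = pickle.dumps(image)        # in-memory -> serialized
    received = pickle.loads(request)     # what the worker would do
    response = pickle.dumps(transform(received))
    result = pickle.loads(response)      # serialized -> in-memory
    return result, len(request) + len(response)


image = bytes(range(256)) * (IMAGE_BYTES // 256)
in_process_result = transform(image)
out_of_process_result, bytes_moved = transform_out_of_process(image)
```

The in-process call moves no serialized data at all, while the simulated out-of-process call moves at least twice the image size for the same result.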

Additionally, any communication, and especially disk IO, is subject to antivirus/firewall checks; this inevitably adds another layer of hard-to-reproduce bugs and even more delays.

Splitting a monolithic "program" shines where communication delays are not critical or are already unavoidable:

parallelizable bulk processing of information, where you can trade small extra delays for significant improvement of individual steps (sometimes eliminating the need for custom components by using off-the-shelf ones). A small footprint per step may, for example, let you use multiple cheaper machines instead of a single expensive one.
splitting monolithic services into less coupled microservices: calling several services in parallel instead of one will most likely not add extra delays (and may even decrease overall time if each individual one is faster and there are no dependencies)
moving out operations that users expect to take a long time: rendering a complicated 3D scene or movie, computing complex metrics about data, ...
all sorts of "auto-complete", "spell-check", and other optional aids can be, and often are, made external. The most obvious example is a browser's URL auto-suggestions, where your input is sent to an external service (a search engine) all the time.
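
The point about calling independent services in parallel can be sketched like this. The service names and the 50 ms latency are invented for the illustration, and `time.sleep` stands in for a real network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

SERVICES = ["auth", "catalog", "pricing"]  # hypothetical, independent services


def call_service(name: str, latency_s: float = 0.05) -> str:
    """Stand-in for a network call: wait, then return a response."""
    time.sleep(latency_s)
    return f"{name}: ok"


def call_sequentially() -> Tuple[List[str], float]:
    start = time.perf_counter()
    results = [call_service(name) for name in SERVICES]
    return results, time.perf_counter() - start


def call_in_parallel() -> Tuple[List[str], float]:
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        # map() returns results in the same order as the inputs.
        results = list(pool.map(call_service, SERVICES))
    return results, time.perf_counter() - start


sequential_results, sequential_time = call_sequentially()
parallel_results, parallel_time = call_in_parallel()
```

With no dependencies between the calls, the parallel version takes roughly as long as the slowest single call rather than the sum of all of them. That advantage disappears as soon as one call needs the result of another.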

Note that this applies to desktop apps as well as web sites: the user-facing portion of the program tends to be "monolithic". All user interaction code tied to a single piece of data usually runs in a single process (it is not unusual to split processes on a per-piece-of-data basis, such as an HTML page or an image, but that is orthogonal to this question). Even for the most basic site with user input, you'll see validation logic running on the client side, even if making it server-side would be more modular and reduce complexity and code duplication.

Does [it] help prevent bugs?

Prevent? Well, no, not really.

It helps detect bugs.
Namely all the bugs you didn't even know you had, that you only discovered when you tried to split that whole mess into smaller parts. So, in a way, it prevented those bugs from making their appearance in production — but the bugs were already there.
It helps reduce the impact of bugs.
Bugs in monolithic applications have the potential to bring down the whole system and keep the user from interacting with your application at all. If you split that application into components, most bugs will —by design— only affect one of the components.
It creates a scenario for new bugs.
If you want to keep the user experience the same, you will need to include new logic for all those components to communicate (via REST services, via OS system calls, what have you) so they can interact seamlessly from the user's POV.
As a simple example: your monolithic app let users create a model and animate it without leaving the app. You split the app in two components: modeling and animation. Now your users have to export the modeling app's model to a file, then find the file and then open it with the animation app... Let's face it, some users are not gonna like that, so you have to include new logic for the modeling app to export the file and automatically launch the animation app and make it open the file. And this new logic, as simple as it may be, can have a number of bugs regarding data serialization, file access and permissions, users changing the installation path of the apps, etc.
It is the perfect excuse to apply much needed refactoring.
When you decide to split a monolithic app into smaller components, you (hopefully) do so with a lot more knowledge and experience about the system than when it was first designed, and thanks to that you can apply a number of refactors to make the code cleaner, simpler, more efficient, more resilient, more secure. And this refactoring can, in a way, help prevent bugs. Of course, you could also apply the same refactoring to the monolithic app to prevent the same bugs, but you don't because it's so monolithic that you're afraid of touching something in the UI and breaking business logic ¯\_(ツ)_/¯
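
A minimal sketch of the "reduce the impact of bugs" point from the list above, assuming each component runs as its own process. The component sources are one-line hypothetical stand-ins launched with the current Python interpreter; a crash in one is reported to the caller but does not take the other down.

```python
import subprocess
import sys

# Hypothetical components, reduced to one-liners for the illustration.
BUGGY_ANIMATION = "raise RuntimeError('bug in the animation component')"
WORKING_MODELLING = "print('model saved')"


def run_component(source: str) -> bool:
    """Run a component in its own process and report whether it succeeded."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return completed.returncode == 0


animation_ok = run_component(BUGGY_ANIMATION)
modelling_ok = run_component(WORKING_MODELLING)
```

The orchestrating process survives the crash and can decide what to do next (retry, report, degrade), which is exactly the extra logic, and extra room for bugs, that the third point in the list describes.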

So I wouldn't say you're preventing bugs just by breaking a monolithic app into smaller components, but you're indeed making it easier to reach a point in which bugs can be more easily prevented.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange