Question

Although this is a Java-centric question, it really applies to any system utilizing a multi-tier architecture.

In 3-tier architectures, you typically have 3 tiers:

  • A client/presentation tier where the client code lives;
  • A middleware tier where business logic lives; and
  • A data/EIS tier where the RDBMS and other data-heavy systems live

In Java land, for a web application, this might look like:

  • An application server, such as GlassFish running both the "web tier" (WARs comprising the client tier in a web app) as well as the "business tier" (EJBs, middleware, etc.); and
  • A RDBMS server embodying the data tier

In a virtualized/clustered environment, these applications (GlassFish, an RDBMS such as Oracle or PostgreSQL, etc.) will run on VMs.

My question: What are the standard ways of allocating/distributing this 3-tier architecture across these VMs? Meaning, any one of the following "strategies" might be viable, but perhaps not preferable:

  1. One VM (let's say all VMs are Ubuntu Servers so cost/price doesn't factor into the equation) running both GlassFish and the RDBMS (all 3 tiers)
  2. Two VMs: an application server VM running GlassFish, and a database server VM running, say, PostgreSQL
  3. Three VMs: two app server VMs both running GlassFish, where one GlassFish instance runs only WARs (web tier) while the second GlassFish instance runs the middleware/business logic; plus a third VM as the DB server

Obviously if all the servers (all tiers) were running on the same VM, they might run faster or more efficiently because they wouldn't be bogged down by network latency. But they'd all be on one VM, which would need mega-hardware to support them. There might also be security concerns with this setup.

There will be pros and cons to each. I'm interested in which strategies would best accomplish the following goals: (1) maximizes throughput/speed, (2) is best suited for a clustered/cloud environment, and (3) maximizes security.

Thanks in advance!


Solution

(1) maximizes throughput/speed,

This depends entirely on your application. For example, the database could be your bottleneck, in which case what you do in the JVM doesn't matter so much.

(2) is best suited for a clustered/cloud environment

If you are going to distribute your system, you will most likely want to distribute your presentation layer. This is because the work it does grows with the number of clients, and the work done for each client is largely independent of the others (at least in the presentation layer). A sketch of what that independence looks like follows.
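To make that concrete, here is a minimal sketch of the kind of stateless presentation-tier component that distributes well (the OrderResource name and /orders path are made up for illustration). Because no conversational state lives in the component, any node behind a load balancer can serve any request:

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    // Hypothetical JAX-RS resource. Each request is self-contained:
    // no instance state survives between calls, so the load balancer
    // can route any request to any presentation-tier node.
    @Path("/orders")
    public class OrderResource {

        @GET
        @Path("{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public String getOrder(@PathParam("id") long id) {
            // Delegate real work to the business tier; the presentation
            // tier only translates between HTTP and business calls.
            return "{\"orderId\": " + id + "}";
        }
    }

The moment you introduce HTTP session state, you need sticky sessions or session replication, which is exactly what makes a presentation tier harder to distribute.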

and (3) maximizes security.

Having more VMs doesn't guarantee improved security. Your JVM should be set up so that the different applications running in it are fairly well separated anyway. If you want to mitigate denial-of-service attacks and your back-end services are used by other systems, you may want to separate them; otherwise, it doesn't make much difference.

OTHER TIPS

I think the answer to your questions is highly dependent on your application's behavior, database usage, etc., and nothing can be said without looking at actual performance metrics (as other answers have already mentioned). I'll post some of my thoughts as guidelines.

Database

Most organizations run the RDBMS on separate hosts, and some engineers choose never to virtualize these, depending on what their DB vendor's best practices say for their case (that said, I normally consider VMs equivalent to physical hosts and use them whenever possible).

Performance-wise, an RDBMS often requires kernel tuning or unconventional filesystem strategies, and having it on a separate host can help. If the DB ever needs to be set up in high-availability mode or in a cluster, having it separated from the application servers also makes things easier. Note that database tuning, if you ever need to do it, is a difficult topic if taken seriously: it often involves aligning partitions on disk, reducing disk head movement by cleverly allocating database data segments/files, considering DBMS and OS cache sizes and strategies, and so on. All of this can impact other applications running on the same host, so I'd rather leave the DB alone.

In addition, an RDBMS often serves several applications (there are good reasons for this: some integrations require access to more than one application's database). Having it separated from the application servers helps here too.
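One practical consequence: if the applications reach the database only through a container-managed connection pool, moving the RDBMS to its own host is just a datasource reconfiguration, with no code changes. A minimal sketch, assuming a JNDI name jdbc/sharedCustomerDb and a customer table that are made up for illustration:

    import javax.annotation.Resource;
    import javax.sql.DataSource;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Hypothetical managed component (e.g., an EJB or CDI bean) that
    // borrows connections from a container-managed pool. The pool's
    // JNDI name is the code's only coupling to the database's location.
    public class CustomerLookup {

        @Resource(lookup = "jdbc/sharedCustomerDb") // hypothetical JNDI name
        private DataSource dataSource;

        public String findName(long customerId) throws SQLException {
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT name FROM customer WHERE id = ?")) {
                ps.setLong(1, customerId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("name") : null;
                }
            }
        }
    }

Several applications on different hosts can define pools against the same database host this way, which matches the shared-RDBMS setup described above.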

Also, DB systems have their own upgrade, backup, distribution/clustering, and administration procedures, and are often maintained by different people than the application servers. The whole database administration topic is thus easier to deal with if you consider it separately. And if the database becomes a bottleneck, you can work on the database alone without wondering whether the other tiers are impacting its performance.

I do recommend keeping the RDBMS alone on its own host for reasonably sized production environments. But of course, if you don't have performance, administration, or availability requirements, you can consider using a shared server for everything.

GlassFish

In general terms, when you want to deploy a Java EE application to several servers (for load balancing or high availability), you install the same application server and artifacts on all of the nodes in the cluster. Then you can choose which artifacts to enable on each node. Some application servers can enable or disable components depending on server load. In this case, the application server is the "unit" that you distribute.

Now, there may be cases where you or your organization prefer completely separated network layers for the web tier and the business tier (e.g., for security reasons). In that case, you would use separate hosts. And if your web tier is really heavy and you need to scale it separately from the business tier (e.g., you find you need six web servers but can make do with one or two EJB containers), I'd separate the two tiers as well.

As a note: there are slight benefits to running the web and EJB tiers in the same GlassFish instance: since they share a JVM, calls between the web tier and the business tier can use call-by-reference semantics. Depending on your workload and the size and serialization cost of responses, this can result in a noticeable performance increase.
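A minimal sketch of that difference in EJB terms (the PricingService and Quote names are made up): the same bean exposed through a @Local view, usable when the WAR is co-located in the same GlassFish instance, and a @Remote view, needed once the tiers live on separate hosts. Local calls pass object references; remote calls serialize and copy every argument and return value:

    import javax.ejb.Local;
    import javax.ejb.Remote;
    import javax.ejb.Stateless;
    import java.io.Serializable;

    // Hypothetical business interfaces for the same bean.
    @Local
    interface PricingLocal {
        Quote quoteFor(String sku);
    }

    @Remote
    interface PricingRemote {
        Quote quoteFor(String sku);
    }

    // Anything crossing the @Remote boundary is copied (pass-by-value),
    // so it must be serializable. Through the @Local view in the same
    // JVM, the caller instead receives a reference to this very object:
    // no copy and no serialization cost.
    class Quote implements Serializable {
        String sku;
        double price;
    }

    @Stateless
    public class PricingService implements PricingLocal, PricingRemote {
        @Override
        public Quote quoteFor(String sku) {
            Quote q = new Quote();
            q.sku = sku;
            q.price = 9.99; // placeholder pricing logic
            return q;
        }
    }

Splitting the tiers onto separate GlassFish instances forces every call through the @Remote view, so the serialization cost mentioned above is paid on each request.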

In most cases, for many corporate applications, I'd use just one or two servers (depending on whether you need high availability) containing both tiers, because even if load rises you can still grow vertically (increase server power or VM resources) or horizontally (add another server and load-balance requests).

World-facing, high-throughput applications need to consider many other aspects in order to be scalable (simply adding nodes to a Java EE cluster won't cut it), so I don't think any of these setups is inherently better or worse for deployment to "the cloud". But in general terms, if you plan to deploy to elastic virtualization services and your requirements justify it, I do recommend separating the web and business tiers.

Security

In my opinion, the topics discussed don't have a direct impact on security.

Lastly, I'm sure that much more can be said about this topic, and my experience is limited, so please get more opinions ;).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow