Scalability and high availability of a Java standalone application

https://stackoverflow.com/questions/561188

05-09-2019

Question

We are currently running a Java integration application on a Linux box. First an overview of the application.

The Java application is a standalone application (not deployed on any Java EE application server such as OracleAS, WebLogic, or JBoss). By standalone I do NOT mean a desktop application: it is run from the command line from a Main class, and the user does not interact with it directly. Messages are put on a queue through an API and then read by my application, which runs 24/7. I wouldn't call this a desktop app since the user has no direct interaction with it (though I'm not sure that is the right criterion).

It uses Spring and connects to WebSphere MQ and an Oracle Database. We use a Spring listener (Spring Message Driven POJOs, MDPs) that listens to a queue on WebSphere MQ. Once there is a message in the queue, the application reads it from MQ and writes it (insert/update) to the database.

Now the question is:

1. How can we horizontally scale this application? Is simply adding more boxes and running multiple instances of the same application a viable approach?
2. Should we consider moving from Spring MDPs to EJB MDBs, and thereby deploying it on an application server? Is there any added benefit in doing so?
3. There is a request to make the application highly available (HA). What methodologies or strategies can be put in place to make a standalone application HA?

Solution

Does "standalone" == "desktop"?

How do users interact with the controller that owns the message-driven beans?

My opinions on your questions:

1. You can scale by adding more message listeners to the listener pool, since each one runs in its own thread. You should match the size of the database connection pool to the number of message listeners, so that would have to increase as well. Do that before adding more servers. Make sure you have enough RAM on hand.
2. I don't see what EJB MDBs buy you over Spring MDPs. You keep referring to "app servers". Do you specifically mean Java EE app servers like WebLogic, WebSphere, JBoss, or GlassFish? Because if you're deploying Spring on Tomcat, I'd consider Tomcat to be the "app server" in this conversation.
3. HA means load balancing and failover. You'll need databases that are either synchronized or hot redeployable. The same goes for the queues. F5 is a great hardware solution for load balancing. I'd talk to your infrastructure folks if you have some.
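
The first point (more listener threads, with a connection pool sized to match) can be sketched with plain JDK classes. This is only an in-process illustration, not the poster's actual configuration: the `BlockingQueue` stands in for the MQ queue, the `Semaphore` stands in for the database connection pool, and the class and method names are made up for the example. In a real Spring setup the corresponding knob would be the `concurrentConsumers` property of `DefaultMessageListenerContainer`.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in: a BlockingQueue plays the MQ queue,
// a Semaphore plays the database connection pool.
class ListenerPoolSketch {

    /** Runs `listeners` consumer threads against `messages`; returns how many were stored. */
    static int drain(BlockingQueue<String> messages, int listeners, int dbConnections) {
        ExecutorService pool = Executors.newFixedThreadPool(listeners);
        Semaphore connections = new Semaphore(dbConnections);
        AtomicInteger stored = new AtomicInteger();

        for (int i = 0; i < listeners; i++) {
            pool.submit(() -> {
                // poll() returns null once the queue is empty, which ends this listener
                while (messages.poll() != null) {
                    // Blocks when every "connection" is in use, i.e. when the pool
                    // is smaller than the number of listeners
                    connections.acquireUninterruptibly();
                    try {
                        stored.incrementAndGet(); // stand-in for the insert/update
                    } finally {
                        connections.release();
                    }
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return stored.get();
    }
}
```

Calling `ListenerPoolSketch.drain(queue, 8, 8)` gives every listener its own connection. With `drain(queue, 8, 2)` the work still completes, but six listeners spend most of their time waiting on the semaphore, which is why the two sizes should grow together.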

OTHER TIPS

Another option is Terracotta, a framework that does precisely what you want: running your app on several machines simultaneously and balancing the load among them.

Horizontal scaling for any application will eventually run into limits as demand for the data increases. Those limits are determined by load and server/database performance. At some point, if demand and load increase with scaling, the number of servers/databases will have to increase as well. Depending on the data that is being stored, the servers/databases will either have to be duplicated and synchronized, or some sort of hashing algorithm will need to be employed to split data across multiple servers. As you increase the number of synchronized data sources the cost of replicating/synchronizing those servers increases as well. That is why the hashed approach may be more appealing to minimize cost.
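
As a rough illustration of the hashed approach, the sketch below maps a record key to one of several databases. The class name, the method name, and the idea of keying on a string are assumptions made for the example; nothing in the question says how the data would actually be partitioned.

```java
// Hypothetical helper: picks which of `shardCount` databases owns a given key.
class ShardRouter {

    /** Returns a shard index in the range [0, shardCount). */
    static int shardFor(String key, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(key.hashCode(), shardCount);
    }
}
```

The same key always lands on the same shard, so no replication between shards is needed for lookups by key. The trade-off is that changing `shardCount` remaps most keys, so the number of shards has to be planned ahead or the data migrated.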

True High Availability solutions are very expensive to implement. I've seen various degrees of HA, but by definition it means minimal or no downtime of, or loss of access to, the data source. Achieving this requires a lot of redundant hardware, networking, and software that can use the redundant hardware without losing access to the data when one of the data sources fails. Hardware failure is inevitable, as are power outages and other random acts of nature. Depending on how critical the data is, an HA solution will also require multiple data centers on multiple independent power grids. That is obviously going to be very expensive, so it all depends on how critical this data is to the end user.

So, HA is an extreme scenario requiring an expensive architecture. I find that most of the time people are interested in just minimizing downtime, and depending on the size of the data source this can be achieved fairly inexpensively by adding hot spares of the data sources.

1. Horizontally scaling a message-driven app is easy... most of the time. You can certainly add another message listener operating on the same queue. Watch out, though, because you might have subtle dependencies on the ordering of messages. They might not be a problem now, with just one processor, but with more than one you are guaranteed that the messages will be processed "out of order" at some point.
2. EJB MDBs don't offer anything beyond Spring MDPs. Stick with what's working.
3. Horizontally scaling the processors is a start, but this one requires a bit more discussion.
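
One common way to keep ordering where it matters while still processing in parallel is to route every message with the same ordering key to the same single-threaded lane. The sketch below shows the idea inside one JVM. `KeyedDispatcher` and the notion of an "ordering key" (for example an account id) are assumptions for illustration; across several boxes the queue itself would have to provide an equivalent grouping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical dispatcher: messages sharing a key run on the same thread, in order.
class KeyedDispatcher {
    private final List<ExecutorService> lanes = new ArrayList<>();

    KeyedDispatcher(int laneCount) {
        for (int i = 0; i < laneCount; i++) {
            // One thread per lane, so work submitted to a lane runs in submission order
            lanes.add(Executors.newSingleThreadExecutor());
        }
    }

    /** Queues `work` on the lane that owns `orderingKey`. */
    void dispatch(String orderingKey, Runnable work) {
        int lane = Math.floorMod(orderingKey.hashCode(), lanes.size());
        lanes.get(lane).execute(work);
    }

    /** Stops accepting work and waits for each lane; returns false on timeout or interrupt. */
    boolean shutdownAndWait(long secondsPerLane) {
        lanes.forEach(ExecutorService::shutdown);
        try {
            for (ExecutorService lane : lanes) {
                if (!lane.awaitTermination(secondsPerLane, TimeUnit.SECONDS)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Messages with different keys can still overtake each other, which is usually fine. If every message depends on every other, no amount of partitioning helps and a single consumer is the only safe option.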

For HA, you need to clarify the requirements. "High availability" is an interesting question for a queue-based app. If your app goes down for a few minutes, messages pile up in the queue. As long as you can get your app back up and running, those messages will still get processed, just with a bit more latency. It's probably worth asking, "What is the maximum acceptable latency for a message?"
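
To put numbers on that question, a back-of-the-envelope estimate helps: the backlog after an outage is roughly the arrival rate times the downtime, and the time to catch up is that backlog divided by the spare processing capacity. The sketch below assumes steady rates, which real traffic rarely has. The class name and the rates in the usage note are invented for illustration.

```java
// Rough fluid approximation; assumes constant arrival and processing rates.
class BacklogEstimate {

    /** Messages waiting in the queue when the app comes back up. */
    static double backlog(double arrivalsPerSec, double outageSec) {
        return arrivalsPerSec * outageSec;
    }

    /** Seconds until the queue is empty again after the app restarts. */
    static double catchUpSeconds(double arrivalsPerSec, double processedPerSec, double outageSec) {
        if (processedPerSec <= arrivalsPerSec) {
            // No spare capacity: the backlog never shrinks
            return Double.POSITIVE_INFINITY;
        }
        return backlog(arrivalsPerSec, outageSec) / (processedPerSec - arrivalsPerSec);
    }
}
```

With made-up figures of 50 messages/s arriving, 200 messages/s of processing capacity, and a 300 s outage, `backlog` is 15000 messages and `catchUpSeconds` is 100. Under first-in-first-out processing the worst-hit message is the one that arrived just as the outage began, so its latency is roughly the outage length itself.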

There's probably some component of concern about hardware failures, loss of a datacenter, etc. These won't be addressed by horizontal scaling in the same location. You'll need to replicate all components at every layer: the queue itself, the processors, the backend database, and all network hardware connecting them.

It's an expensive proposition, so it's also worth asking, "What's the delta in annualized loss expectancy of downtime between an HA scenario and a non-HA scenario?" ALE incorporates both direct losses and regulatory or legal costs, so it's a good way to capture the cost of downtime.
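
ALE is conventionally computed as the expected loss per incident multiplied by the expected number of incidents per year. A minimal sketch of the comparison follows. The class name and every figure in the usage note are hypothetical; real inputs would have to come from your own business and risk analysis.

```java
// Hypothetical calculator for comparing downtime cost with and without HA.
class AleComparison {

    /** Annualized loss expectancy: loss per incident times incidents per year. */
    static double ale(double directLoss, double regulatoryOrLegalCost, double incidentsPerYear) {
        return (directLoss + regulatoryOrLegalCost) * incidentsPerYear;
    }

    /** Yearly loss avoided by going HA; compare this with the yearly cost of the HA setup. */
    static double delta(double aleWithoutHa, double aleWithHa) {
        return aleWithoutHa - aleWithHa;
    }
}
```

For example, with an invented loss of 15000 direct plus 5000 regulatory per outage, four outages a year without HA give an ALE of 80000. If HA cut that to one outage every two years, the ALE would be 10000 and the delta 70000. An HA architecture costing more than that per year would be hard to justify on these numbers alone.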

1. Creating more listeners on the queue can scale the number of consumers. If a consumer dies, the remaining consumers can keep running. Note: your MQ and database need to have high availability solutions as well.

2. Not sure what difference an application server would make in your case. Perhaps you could explain which features you intend to use?

3. See my answer to 1. for HA.

Have you tried running it on multiple boxes? I think you should check your MQ documentation: running on multiple boxes may need some configuration in your MQ, but it should run.

Licensed under: CC-BY-SA with attribution
