Question

At work, we are discussing two system models:

First one, with a centralized server (diagram omitted).

Second one, without a centralized server (diagram omitted).

The goal of the system is to allow clients to access, via a browser, information served by some remote Raspberry Pis that will be connected to different kinds of sensors.

As the pictures show, in the first case there will be a server working as middleware (Node + Express, for example) to manage client requests and get/set information from the sensors the clients wish to inspect (over socket communication). In the second case, the Raspberry Pis will also work as web servers, allowing clients to connect directly to the sensors they want.

So here is what I think: the first case is the standard way web systems are generally built. The second case, in my view, is an odd way to handle this communication and would generate a lot of problems.

What I want with this question is to collect the positive and negative aspects of each case.

Here is what I have already raised; please correct me if I said anything wrong.

  • Deploy. In the first case it is easy to maintain and deploy the middle server's code, since there is only one place to do it; in the second case, I would have to change each Raspberry Pi in the network. To avoid this hard task, I could create an automatic deploy process that would do this for me on each Raspberry Pi. Beyond the extra work, would there be any additional problem with doing this?
  • Tracking. In the first case, tracking Raspberry Pis is easy: as soon as they connect, they fill a list of connected devices, and clients can pick the sensor they want to inspect from this list (see the sketch after this list). In the second case, it would be more complex to find all Raspberry Pis, since there is no single place to store the list of connected devices. Every Raspberry Pi would have to hold a list of all connected devices (a lot of duplicated information), and this list would always need to be consistent across all devices. Clients need to be able to reach other Raspberry Pis by connecting to just a single one, and to avoid CORS issues the Raspberry Pis would need to intercommunicate. I don't know whether this intercommunication is a good idea, but perhaps it could be faster than the centralized case, since information can be accessed directly, Raspberry Pi to Raspberry Pi.
  • Speed. Theoretically, direct communication with the Raspberry Pis in the second case is faster than in the first case. But due to hardware limitations, I think the second case could easily overload any single Raspberry Pi. The application may require data streaming or a fast packet-sending process.
  • Authentication and security. In the second case, that would be a nightmare. Spreading tokens, for example, across all the Raspberry Pis a client needs to access would be very hard to implement.
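
To illustrate the tracking point above, here is a minimal sketch of how the centralized server could keep the list of connected devices. It assumes Node + Express; the route names and the in-memory store are made up for illustration:

```js
// registry.js - minimal sketch of a central device registry (Node + Express).
const express = require('express');
const app = express();
app.use(express.json());

// deviceId -> { address, sensors, lastSeen }
const devices = new Map();

// A Raspberry Pi announces itself (and re-announces periodically as a heartbeat).
app.post('/devices/register', (req, res) => {
  const { deviceId, address, sensors } = req.body;
  devices.set(deviceId, { address, sensors, lastSeen: Date.now() });
  res.sendStatus(204);
});

// Clients list the currently known devices; stale entries are filtered out.
app.get('/devices', (req, res) => {
  const maxAgeMs = 60_000;
  const now = Date.now();
  const online = [...devices.entries()]
    .filter(([, d]) => now - d.lastSeen < maxAgeMs)
    .map(([deviceId, d]) => ({ deviceId, address: d.address, sensors: d.sensors }));
  res.json(online);
});

app.listen(3000, () => console.log('registry listening on :3000'));
```

In the decentralized case, every Raspberry Pi would need to maintain (and keep consistent) an equivalent list itself.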

(Please tell me if I wasn't clear about anything)


The solution

First thoughts

Network configuration and tracking:

In the decentralized approach, you'd need some discovery protocol to ensure that the devices know about each other and can organize communication (e.g. routing of commands and queries across the mesh, storage of replicated information, etc.).

This includes discovering new devices, but also coping with devices that disappear (failure or removal) and with expired DHCP leases (i.e. devices that get a new IP address). It also includes managing network boundaries: will you search for neighbor devices across the company's worldwide network, or only in the local subnet?
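As a rough sketch of what such a discovery layer could look like, assuming plain UDP broadcast on the local subnet (the port number and message format are invented for illustration), each device could periodically announce itself and listen for announcements from its peers:

```js
// discovery.js - naive peer discovery over UDP broadcast (sketch, local subnet only).
const dgram = require('dgram');

const PORT = 41234;            // arbitrary port chosen for this example
const socket = dgram.createSocket('udp4');
const peers = new Map();       // deviceId -> { address, lastSeen }

socket.on('message', (msg, rinfo) => {
  try {
    const { deviceId } = JSON.parse(msg.toString());
    peers.set(deviceId, { address: rinfo.address, lastSeen: Date.now() });
  } catch {
    // ignore malformed announcements
  }
});

socket.bind(PORT, () => {
  socket.setBroadcast(true);
  // Announce ourselves every 10 s; peers that stop announcing can be expired.
  setInterval(() => {
    const announcement = Buffer.from(JSON.stringify({ deviceId: 'pi-01' }));
    socket.send(announcement, PORT, '255.255.255.255');
  }, 10_000);
});
```

Expiring peers whose lastSeen timestamp is too old covers devices that fail or come back with a new IP address, and the broadcast naturally limits discovery to the local subnet; reaching devices across the company network would need something more elaborate.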

Deploy:

In the decentralized approach you'd need to propagate updates from device to device. The main problem here is security: a rogue device could compromise your full sensor network. Typically this would require signing the updates and managing the set of acceptable certificates, so advanced skills in secure communication are required.
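As a sketch of the signature part, using Node's built-in crypto module (the key handling and file paths are assumptions), each device could refuse any update whose signature does not verify against a trusted publisher key shipped with its image:

```js
// verify-update.js - sketch: accept an update only if its signature checks out.
const crypto = require('crypto');
const fs = require('fs');

// Public key of the update publisher, shipped with the device image (assumption).
const publisherKey = fs.readFileSync('/etc/updates/publisher.pem', 'utf8');

function isUpdateTrusted(updatePath, signaturePath) {
  const update = fs.readFileSync(updatePath);
  const signature = fs.readFileSync(signaturePath);
  // Assumes the publisher signed the archive with the matching RSA private key.
  return crypto.createVerify('sha256').update(update).verify(publisherKey, signature);
}

if (!isUpdateTrusted('/tmp/update.tar.gz', '/tmp/update.tar.gz.sig')) {
  throw new Error('Update rejected: invalid signature');
}
```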

Performance:

The decentralized approach presupposes careful engineering of the communication layer mentioned above. If, for every user query on any device, every sensor starts to query all the others, you will end up with a mess and a risk of network congestion once you have many devices.

Security and authentication:

Yes, in the decentralized approach this will be a difficult topic: can every authorized user log on to any device? Or are access rights different for different subgroups of devices?

One approach could be to replicate changes to the user authorizations across all the relevant nodes. Another approach could use JWTs managed by some trusted authority (i.e. access control could stay decentralized on each device, but the user logon is handled by a centralized service which is not necessarily related to the devices).
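A minimal sketch of the second option, assuming the jsonwebtoken npm package and a central auth service whose public key has been distributed to every device:

```js
// auth-middleware.js - sketch: each device verifies JWTs issued by a central auth service.
const jwt = require('jsonwebtoken');
const fs = require('fs');

// Public key of the central auth service, distributed to all devices (assumption).
const authServiceKey = fs.readFileSync('/etc/auth/auth-service.pub', 'utf8');

// Express middleware: accept a request only if it carries a valid, unexpired token.
function requireAuth(req, res, next) {
  const token = (req.headers.authorization || '').replace(/^Bearer /, '');
  try {
    req.user = jwt.verify(token, authServiceKey, { algorithms: ['RS256'] });
    next();
  } catch (err) {
    res.status(401).json({ error: 'invalid or expired token' });
  }
}

module.exports = requireAuth;
```

Logon and token issuing stay centralized, while each device can check access locally without calling home on every request.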

More thoughts

Complexity

The decentralized approach is highly complex.

And it will limit future evolution. If tomorrow you discovered that your sensors need to communicate with a new kind of device, you'd need to organize decentralized exchanges with those devices as well. N kinds of devices could mean up to N·(N-1)/2 pairwise extensions to your protocols and applications.

My personal impression is that this is a lot of complexity, and it endangers the future sustainability of your architecture.

Complex event management

The decentralized solution also has its limits. In a corporate environment, you might want to be able to automatically monitor the sensor data (big data?) and organize some complex event processing beyond a single sensor's scope (e.g. if one sensor's temperature rises beyond normal and several neighboring sensors show the same trend, it might be the start of a fire, and a notification to the fireguards should be sent out).
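As a toy sketch of such a rule, evaluated centrally over the latest readings collected from the devices (the threshold, neighbour layout and notify callback are invented for illustration):

```js
// fire-rule.js - toy complex-event rule: a hot sensor plus hot neighbours -> alert.
// latestReadings: Map of deviceId -> { temperature, neighbours: [deviceId, ...] }
function checkFireRisk(latestReadings, notifyFireguards) {
  const HOT = 60; // °C, illustrative threshold
  for (const [deviceId, reading] of latestReadings) {
    if (reading.temperature <= HOT) continue;
    const hotNeighbours = reading.neighbours.filter(
      (id) => (latestReadings.get(id)?.temperature ?? 0) > HOT
    );
    // Several neighbouring sensors showing the same trend: escalate.
    if (hotNeighbours.length >= 2) {
      notifyFireguards({ deviceId, hotNeighbours });
    }
  }
}

module.exports = checkFireRisk;
```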

Layered approach

Centralization can also be envisaged in layers. One approach could be to have a centralized gateway to the sensors (or several gateways if there are many sensors): this could greatly simplify the communication layer and the administration. I'd recommend this practice.

On the application level, you'd still have the freedom to decide whether you want to centralize the data or prefer to forward queries to a specific sensor. Not knowing your application domain, it's difficult to guess which would be best.
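As a sketch of the "forward queries to a specific sensor" option, assuming the gateway holds a device-to-address map (filled by whatever registration or discovery mechanism you choose), that each Pi exposes a plain HTTP API, and Node 18+ for the global fetch:

```js
// gateway.js - sketch: a central gateway that forwards sensor queries to the right Pi.
const express = require('express');
const app = express();

// deviceId -> base URL of the Raspberry Pi; entries here are illustrative.
const deviceAddresses = new Map([['pi-01', 'http://10.0.0.21:8080']]);

// Clients only ever talk to the gateway; it proxies the query to the device.
app.get('/devices/:id/sensors/:sensor', async (req, res) => {
  const baseUrl = deviceAddresses.get(req.params.id);
  if (!baseUrl) return res.status(404).json({ error: 'unknown device' });
  try {
    const upstream = await fetch(`${baseUrl}/sensors/${req.params.sensor}`);
    res.status(upstream.status).json(await upstream.json());
  } catch (err) {
    res.status(502).json({ error: 'device unreachable' });
  }
});

app.listen(8000, () => console.log('gateway listening on :8000'));
```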

A proven and scalable approach is to organize a publisher/subscriber architecture, for example with Kafka. The idea is to forward the device information to a broker, which dispatches it to all the systems or applications that are interested. This would also avoid losing historical information, if that is relevant.
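A small sketch of that flow on the device side, assuming the kafkajs client and an illustrative topic and broker address:

```js
// publish-reading.js - sketch: a Raspberry Pi publishes sensor readings to a broker.
const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'pi-01', brokers: ['broker.example.local:9092'] });
const producer = kafka.producer();

async function publishReading(sensorId, value) {
  await producer.connect();
  await producer.send({
    topic: 'sensor-readings', // illustrative topic name
    messages: [{ key: sensorId, value: JSON.stringify({ sensorId, value, ts: Date.now() }) }],
  });
  await producer.disconnect();
}

publishReading('temp-3', 21.7).catch(console.error);
```

Consumers (dashboards, alerting, archiving) subscribe to the topic independently, so adding a new downstream system does not require touching the devices.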

Licensed under: CC-BY-SA with attribution