Question

I have a sensor at a remote location that generates a data vector roughly every second (adjustable up to 100 vectors per second). The location has a 4G modem to send the data. Due to power limitations I have chosen a Raspberry Pi 3+ to receive the data from the sensor, decode it and send it to my VPS server.

On my server running Ubuntu 18.04 I have installed TimescaleDB on top of PostgreSQL, which is a good combination for handling time-series data. I would like to store the sensor data there and publish it on my website. I'm keen to hear your thoughts on what the data flow could look like, and I have some questions.

  1. What would be the best way to insert the sensor data from the RPi into the server DB? Is it safe to insert it directly from the RPi using the libpq library in my C software, or should I transport the data to the server in a different way and insert it into the DB on the server side?

  2. On the RPi side, I would like to have a local buffer to prevent losing data if the connection to the server breaks. I could simply use a circular buffer in my C program or a named pipe, or I could install PostgreSQL/TimescaleDB locally, insert the data there, and then use postgres_fdw or dblink to transfer the data to my server DB and delete the entries from the local DB. What would you suggest?

Solution

I think installing Postgres on the RPi is overkill.

Buffering the sensor data in a plain text file should be good enough. The process could look something like this:

  1. Send the sensor data to a text file (append mode); a minimal C sketch follows this list.
  2. Have a cron job that runs every x minutes. As its first step it renames the text file; this makes step 1 create a new text file while you process the "old" one.
  3. Send the contents of the renamed file to your Postgres server using COPY ... FROM STDIN inside a transaction. If this fails (connection problem), retry until it succeeds.
  4. Remove the file that was just sent to the server. Alternatively keep it as an archive/backup, which requires creating a unique filename in step 2, e.g. by appending a timestamp.
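Here is a minimal sketch of step 1 in C. The buffer file path and the CSV record layout (an epoch timestamp plus three channel values) are assumptions for illustration, not something from the question:

```c
/* Sketch of step 1: append one sensor vector per line to a buffer file.
 * BUFFER_FILE and the CSV layout are hypothetical. */
#include <stdio.h>
#include <time.h>

#define BUFFER_FILE "/var/spool/sensor/buffer.csv"   /* hypothetical path */

int append_sample(double v1, double v2, double v3)
{
    /* Append mode: the cron job in step 2 can rename the file at any time,
     * and the next fopen() will simply create a fresh one. */
    FILE *f = fopen(BUFFER_FILE, "a");
    if (!f)
        return -1;

    /* One CSV row: epoch timestamp plus the measured values. */
    fprintf(f, "%ld,%f,%f,%f\n", (long) time(NULL), v1, v2, v3);

    fclose(f);   /* close after every sample so a crash loses at most one row */
    return 0;
}
```

Opening and closing the file for every sample is cheap at 1 to 100 rows per second, and it keeps the interaction with the rename in step 2 simple: a row lands either in the old file or in the new one, but is never lost.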

Using COPY ... FROM STDIN is far more efficient than running multiple single-row INSERT statements, and still more efficient than running a single multi-row INSERT statement. Additionally, the file then also acts as the buffer you want.
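Since the data already flows through a C program, steps 3 and 4 could be done with libpq's COPY support directly. A rough sketch, where the connection string, the table name sensor_data, and the file path are assumptions:

```c
/* Sketch of steps 3 and 4: stream the renamed buffer file to the server
 * with COPY ... FROM STDIN inside a transaction, and delete it on success.
 * CONNINFO, the table name and SPOOL_FILE are hypothetical.
 * Build with: gcc uploader.c -o uploader -lpq */
#include <stdio.h>
#include <libpq-fe.h>

#define CONNINFO   "host=my-vps dbname=sensors user=pi"   /* hypothetical */
#define SPOOL_FILE "/var/spool/sensor/buffer.csv.old"     /* renamed in step 2 */

static int send_file(PGconn *conn, const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    PGresult *res = PQexec(conn, "COPY sensor_data FROM STDIN (FORMAT csv)");
    if (PQresultStatus(res) != PGRES_COPY_IN) {
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        PQclear(res);
        fclose(f);
        return -1;
    }
    PQclear(res);

    /* Stream the file to the server in chunks. */
    char buf[8192];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        if (PQputCopyData(conn, buf, (int) n) != 1)
            break;
    fclose(f);

    if (PQputCopyEnd(conn, NULL) != 1)
        return -1;

    res = PQgetResult(conn);   /* final status of the COPY command */
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK) ? 0 : -1;
    PQclear(res);
    return ok;
}

int main(void)
{
    PGconn *conn = PQconnectdb(CONNINFO);
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;   /* file stays on disk; the next cron run retries */
    }

    PQclear(PQexec(conn, "BEGIN"));
    if (send_file(conn, SPOOL_FILE) == 0) {
        PQclear(PQexec(conn, "COMMIT"));
        remove(SPOOL_FILE);                /* step 4: drop (or archive) the file */
    } else {
        PQclear(PQexec(conn, "ROLLBACK"));
    }
    PQfinish(conn);
    return 0;
}
```

Because the file is only deleted after the COMMIT succeeds, a failed run leaves it in place and the next cron invocation retries, which gives you the retry-until-successful behavior of step 3.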

Licensed under: CC-BY-SA with attribution