TimescaleDB distributed setup strategy

https://dba.stackexchange.com/questions/283538

14-03-2021

Question

What is the best setup strategy for TimescaleDB?

My initial understanding, based on the TimescaleDB FAQ, was that it is both possible and recommended to keep time-series data and my other regular data in a single database.

I am just wondering what the strategy is when using multi-node. We can create a distributed hypertable, but what about the rest of my non-time-series data? Can I distribute it?

Does it make sense to keep TimescaleDB data in its own "cluster" of nodes and use separate instances (perhaps even with another extension such as Citus) for non-time-series data?

Answer

The current version of TimescaleDB, 2.0.0, doesn't support pushing aggregates down to data nodes (see limitations), and all joins are performed on the access node. So for queries that join a distributed hypertable with non-time-series data (which, I guess, is stored in regular tables), the data will be brought to the access node for the join. You might therefore want to store the non-time-series data on the access node. You can also distribute the non-time-series data manually, but there will be no performance benefit.
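As a rough illustration of the layout described above, the following Python sketch only builds the SQL you would run on the access node; it does not connect to a database. The table names `readings` and `devices`, their columns, and the helper function names are hypothetical. `create_distributed_hypertable` is the TimescaleDB 2.x function for creating a distributed hypertable, and the sketch assumes the data nodes have already been added.

```python
# Minimal sketch: build the SQL for keeping a regular table on the access node
# next to a distributed hypertable. All table and column names are hypothetical.

def access_node_layout(hypertable="readings", dimension="devices"):
    """Return the DDL to run on the access node, in order."""
    return [
        # Regular (non-time-series) table: lives on the access node only.
        f"CREATE TABLE {dimension} (device_id integer PRIMARY KEY, location text);",
        # Time-series table, created as an ordinary table first.
        f"CREATE TABLE {hypertable} (time timestamptz NOT NULL, "
        f"device_id integer NOT NULL, value double precision);",
        # Turn it into a distributed hypertable partitioned by time and device_id.
        f"SELECT create_distributed_hypertable('{hypertable}', 'time', 'device_id');",
    ]


def join_query(hypertable="readings", dimension="devices"):
    """Return a query joining the hypertable with the access-node table."""
    return (
        f"SELECT d.location, avg(r.value) "
        f"FROM {hypertable} r JOIN {dimension} d USING (device_id) "
        f"WHERE r.time > now() - interval '1 day' "
        f"GROUP BY d.location;"
    )


if __name__ == "__main__":
    for statement in access_node_layout():
        print(statement)
    print(join_query())
```

Because the join runs on the access node, keeping the time predicate selective limits how many hypertable rows have to be fetched from the data nodes.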

Note that if a hypertable refers to non-time-series data through a foreign key, the non-time-series table must exist on the access node and on all data nodes, and the time-series data must be partitioned across the data nodes in the same way as the hypertable.
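For the foreign-key case, one possible way to get the referenced table onto every node is TimescaleDB's `distributed_exec` procedure, which runs a statement on the data nodes. The Python sketch below only generates the statements, and it assumes that approach. The tables `devices` and `readings`, the seed row, and the helper names are hypothetical.

```python
# Minimal sketch: generate SQL that creates a referenced table on the access
# node and on every data node before the distributed hypertable is created.
# Table names, columns, and the sample row are hypothetical.

def on_all_nodes(statement):
    """Return the statement for the access node plus a distributed_exec call
    that repeats it on the data nodes."""
    if "$ddl$" in statement:
        raise ValueError("statement must not contain the $ddl$ quoting tag")
    return [statement, f"CALL distributed_exec($ddl${statement}$ddl$);"]


def foreign_key_setup():
    """Return the statements to run on the access node, in order."""
    create_devices = (
        "CREATE TABLE devices (device_id integer PRIMARY KEY, location text);"
    )
    seed_devices = "INSERT INTO devices VALUES (1, 'lab');"
    return [
        *on_all_nodes(create_devices),  # referenced table on every node
        *on_all_nodes(seed_devices),    # referenced rows on every node too
        "CREATE TABLE readings (time timestamptz NOT NULL, "
        "device_id integer NOT NULL REFERENCES devices (device_id), "
        "value double precision);",
        "SELECT create_distributed_hypertable('readings', 'time', 'device_id');",
    ]


if __name__ == "__main__":
    for statement in foreign_key_setup():
        print(statement)
```

Later changes to the referenced table would have to be repeated on the data nodes in the same way, which is part of the maintenance cost of this setup.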

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange