Question

I have a strong use case for mixing scientific data, i.e. double matrices and vectors, with relational data and using this as the data source for distributed computation, e.g. MapReduce, Hadoop etc. Up to now I have been storing my scientific data in HDF5 files with custom HDF5 schemas and the relational data in Postgres, but since this setup does not scale very well, I was wondering whether there is a more hybrid NoSQL approach that supports the heterogeneity of this data?

For example, my use case would be to distribute a complex process that involves:

  1. loading GBs of data from a time-series database provider
  2. linking the time series to static data, e.g. symbol information, expiry and maturity dates, etc.
  3. launching a series of scientific computations, e.g. covariance matrices, distribution fitting, MC simulations (a small sketch of steps 2-3 follows this list)
  4. distributing the computations across many separate HPC nodes and storing the intermediate results for traceability.
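
As a rough illustration of steps 2-3, here is a minimal sketch assuming the time series arrive as a pandas DataFrame and the static data comes from the relational side; all symbols, dates and column names are made up:

```python
import numpy as np
import pandas as pd

# Hypothetical time series (step 1): daily prices per symbol.
prices = pd.DataFrame({
    "symbol": ["ABC"] * 3 + ["XYZ"] * 3,
    "date": pd.to_datetime(["2015-01-02", "2015-01-05", "2015-01-06"] * 2),
    "price": [100.0, 101.5, 100.8, 50.0, 49.2, 49.9],
})

# Hypothetical static/relational data: contract metadata per symbol.
static = pd.DataFrame({
    "symbol": ["ABC", "XYZ"],
    "expiry": pd.to_datetime(["2015-12-18", "2016-03-18"]),
})

# Step 2: link the time series to the static data.
linked = prices.merge(static, on="symbol", how="left")

# Step 3: a simple scientific computation, e.g. the covariance matrix
# of the per-symbol return series.
returns = (linked.pivot(index="date", columns="symbol", values="price")
                 .pct_change()
                 .dropna())
cov = np.cov(returns.to_numpy(), rowvar=False)
print(cov)
```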

These steps require a distributed database that can handle both relational and scientific data. One possibility would be to keep the scientific data in HDF5 and put it into BLOB columns of a relational database, but that is a misuse of the database. Another would be to store the HDF5 files on disk and have the relational database link to them, but then we lose self-containment. Neither of these two approaches accounts for distributing the data for direct access on the HPC nodes: the data would have to be pulled from a central node, which is not ideal.

Solution

I am not sure if I can give a proper solution, but we have a similar setup.

We have the meta-information stored in an RDBMS (PostgreSQL) and the actual scientific data in HDF5 files.
We have a couple of analyses that are run on our HPC. The way it is done is as follows:

  1. User wants to run an analysis (from a web-frontend)
  2. A message is sent to a central message broker (AMQP, RabbitMQ) containing the type of analysis and some additional information
  3. A worker machine (VM) picks up the message from the central message broker. The worker retrieves the meta-information from the RDBMS via REST, stages the input files on the HPC and then creates a PBS job on the cluster (a rough sketch of such a worker follows this list).
  4. Once the PBS job is submitted, a message with the job id is sent back to the message broker and stored in the RDBMS.
  5. The HPC job runs the scientific analysis and stores the result in an HDF5 file.
  6. Once the job is finished, the worker machine stages the HDF5 files out onto an NFS share and stores the link in the RDBMS.
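
As a rough sketch of what such a worker might look like (not our actual code): it consumes a request from RabbitMQ with pika, submits a PBS job with qsub and reports the job id back to the broker. The queue names, the message format and the job script are all assumptions:

```python
import json
import subprocess

import pika

QUEUE = "analysis-requests"      # hypothetical queue name

def on_message(channel, method, properties, body):
    """Pick up an analysis request, submit a PBS job, report the job id."""
    request = json.loads(body)   # e.g. {"analysis": "covariance", "dataset": 42}

    # (Staging omitted: fetch meta-information over REST and copy the
    #  HDF5 input files to the cluster's scratch space.)

    # Submit the job to the PBS scheduler; qsub prints the job id.
    result = subprocess.run(
        ["qsub", "-v", f"DATASET={request['dataset']}", "run_analysis.pbs"],
        capture_output=True, text=True, check=True,
    )
    job_id = result.stdout.strip()

    # Send the job id back via the broker so it ends up in the RDBMS (step 4).
    channel.basic_publish(exchange="", routing_key="job-status",
                          body=json.dumps({"job_id": job_id, **request}))
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters("broker.example.org"))
channel = connection.channel()
channel.queue_declare(queue=QUEUE, durable=True)
channel.queue_declare(queue="job-status", durable=True)
channel.basic_consume(queue=QUEUE, on_message_callback=on_message)
channel.start_consuming()
```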

I would recommend against storing the binary files in the RDBMS as BLOBs; I would keep them in HDF5 format. That way you can have different backup policies for the database and the filesystem.
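
A minimal sketch of that pattern (write the result to HDF5, store only the path plus some metadata in the database); the NFS path, table and column names are hypothetical:

```python
import h5py
import numpy as np
import psycopg2

# Write the scientific result to an HDF5 file on the shared filesystem.
result_path = "/nfs/results/job_1234.h5"     # hypothetical NFS path
covariance = np.random.rand(10, 10)          # stand-in for a real result
with h5py.File(result_path, "w") as f:
    f.create_dataset("covariance", data=covariance)
    f.attrs["job_id"] = "1234"

# Store only the link (plus metadata) in the relational database.
conn = psycopg2.connect("dbname=analysis user=worker")   # hypothetical DSN
with conn, conn.cursor() as cur:
    cur.execute(
        "INSERT INTO results (job_id, hdf5_path) VALUES (%s, %s)",
        ("1234", result_path),
    )
conn.close()
```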

A couple of additional pointers:

  • You could hide everything (both the RDBMS and the HDF5 storage) behind a REST interface. This might solve your self-containment issue (a minimal sketch follows this list).
  • If you want to store everything in a NoSQL DB, I would recommend having a look at Elasticsearch. It works well with time-series data, it is distributed out of the box and it also has a Hadoop plugin.
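
A minimal sketch of such a REST facade, assuming Flask and the same hypothetical results layout as above; the client never needs to know whether the bytes come from the RDBMS or from an HDF5 file:

```python
import h5py
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical lookup: in practice this would query the RDBMS for the
# HDF5 path that belongs to a given job id.
HDF5_PATHS = {"1234": "/nfs/results/job_1234.h5"}

@app.route("/results/<job_id>/<dataset>")
def get_dataset(job_id, dataset):
    """Return one dataset from the job's HDF5 file as JSON."""
    path = HDF5_PATHS[job_id]
    with h5py.File(path, "r") as f:
        data = f[dataset][...]
    return jsonify(data.tolist())

if __name__ == "__main__":
    app.run(port=5000)
```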