I have a desktop app that has the concept of an entity called `Field`.
-----------------------
| Id | FieldName |
-----------------------
| 1 | "Field 1" |
-----------------------
| 2 | "Field 2" |
-----------------------
`Field`s are defined by the user, so there can be as many of them as the user wants. They are associated with another entity called `Employee`.

`Field`s have a value (a 16-bit integer calculated and stored by the app) for each day of the year.

`Field` values are stored in a table where each record holds the values for one full year of one `Employee` for one `Field`.
Said table, therefore, looks a bit like this:
---------------------------------------------
| FieldId | EmployeeId | FieldValues | Year |
---------------------------------------------
| 1 | 4 | byte[] | 2012 |
---------------------------------------------
| 2 | 4 | byte[] | 2012 |
---------------------------------------------
| 1 | 5 | byte[] | 2013 |
---------------------------------------------
| ... | ... | ... | ... |
---------------------------------------------
FieldValues holds the values as a byte array in a BLOB field, which is converted back to an array of 16-bit integers before being shown to the user in a grid.
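The round trip between the BLOB and the per-day integers can be sketched as below. This is a minimal illustration, not the app's actual code: the endianness and signedness of the 16-bit values are assumptions, so the real serialization routine should be checked before relying on them.

```python
import struct

def unpack_field_values(blob: bytes) -> list[int]:
    """Decode a FieldValues BLOB into one 16-bit integer per day.

    Assumes little-endian signed 16-bit values; the legacy app may
    use a different convention.
    """
    count = len(blob) // 2
    return list(struct.unpack(f"<{count}h", blob))

def pack_field_values(values: list[int]) -> bytes:
    """Inverse: encode the daily values back into a BLOB."""
    return struct.pack(f"<{len(values)}h", *values)

# A non-leap year fits in 365 * 2 = 730 bytes.
year_blob = pack_field_values([0] * 365)
print(len(year_blob))  # 730
```

So one record per `Field` per `Employee` per year carries at most 732 bytes of payload, instead of 365 or 366 separate rows.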
Now that we have some context, on to the real question.
This is a legacy app; I am not the original designer. It's easy to guess, though, that the goal of storing this data in a binary format was to limit the number of records that would otherwise be necessary to store 365 (or 366) values per year per `Employee` per `Field`.
What I'm doing now is a "sync" app that pulls this data from a local Access db (don't ask) and pushes it via a REST API to a web app on a remote server. The web app needs its own copy of this data, so I'll have to store it in its database.
Storing data in a binary format has the clear advantage of really limiting the number of records we need to store, but the disadvantage of being human-unreadable.
On the other hand, the web app is multi-tenant, so storing this data in any other way would mean storing a great number of records: just a couple thousand `Employee`s and an average of 20 `Field`s would mean storing upwards of 14 million records each year (and `Field`s are not the only entity that could generate millions of records). Plus, a large number of records per year wouldn't be a problem per se if, somewhere down the road, say every two or three years, we could throw them away; that, however, is not the case.
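The back-of-the-envelope arithmetic behind that 14-million figure, using the numbers above (a couple thousand employees, 20 fields on average), looks like this:

```python
employees = 2000  # "a couple thousand Employees"
fields = 20       # average number of Fields per tenant
days = 365

# One row per value per day (the "readable" layout):
per_day_rows = employees * fields * days
print(per_day_rows)   # 14600000

# One BLOB row per Employee per Field per year (the legacy layout):
per_year_rows = employees * fields
print(per_year_rows)  # 40000
```

That is roughly a 365× difference in row count per year, which is the whole trade-off in a nutshell.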
The real question, then, is how to store said data. Should I stick to the old format?
Can anyone think of a whole different way of going about it?
For the sake of completeness, even though I don't think it matters much, the destination db is Postgres.