The second solution has some problems:
- Since you need to access individual timestamps (see footnote 1), `serialized_timestamps_for_the_day` cannot be considered atomic and would violate 1NF, causing a bunch of problems.
- On top of that, you are introducing a redundancy: the `date` can be inferred from the contents of `serialized_timestamps_for_the_day`, and your application code would need to make sure the two never become "desynchronized", which is vulnerable to bugs (see footnote 2).
Therefore go with the first solution. If properly indexed, a modern database on modern hardware can handle much more than mere "well over a million records". In this specific case:
- A composite index on {person_id, timestamp} will allow you to query by person, or by a combination of person and date, with a simple index range scan, which can be very efficient.
- If you need just a "by date" query, you'll need an index on {timestamp}. You can then find all timestamps within a specific date by searching the range from 00:00 of that day up to, but not including, 00:00 of the next day.
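As a rough illustration of the two indexes and the range queries described above, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `person_timestamp`, the column name `ts`, the index names, the helper functions, and the sample data are all assumptions made for the example. Timestamps are stored as ISO 8601 text so that string comparison matches chronological order; your DBMS probably has a native timestamp type you would use instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_timestamp (      -- hypothetical table name
        person_id INTEGER NOT NULL,
        ts        TEXT    NOT NULL       -- ISO 8601 text, e.g. '2024-03-05 14:30:00'
    );
    -- Composite index: serves "by person" and "by person and date" lookups.
    CREATE INDEX idx_person_ts ON person_timestamp (person_id, ts);
    -- Separate index: serves "by date" lookups across all persons.
    CREATE INDEX idx_ts ON person_timestamp (ts);
""")

# Made-up sample rows, including values right on the day boundaries.
rows = [
    (1, "2024-03-04 23:59:59"),
    (1, "2024-03-05 00:00:00"),
    (1, "2024-03-05 14:30:00"),
    (2, "2024-03-05 09:15:00"),
    (1, "2024-03-06 00:00:00"),
]
conn.executemany("INSERT INTO person_timestamp VALUES (?, ?)", rows)


def timestamps_for_person_on(conn, person_id, day_start, next_day_start):
    """Timestamps of one person in the half-open range [day_start, next_day_start)."""
    cur = conn.execute(
        "SELECT ts FROM person_timestamp "
        "WHERE person_id = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (person_id, day_start, next_day_start),
    )
    return [row[0] for row in cur]


def timestamps_on(conn, day_start, next_day_start):
    """All (person_id, ts) pairs in the half-open range [day_start, next_day_start)."""
    cur = conn.execute(
        "SELECT person_id, ts FROM person_timestamp "
        "WHERE ts >= ? AND ts < ? ORDER BY ts, person_id",
        (day_start, next_day_start),
    )
    return cur.fetchall()


person_day = timestamps_for_person_on(
    conn, 1, "2024-03-05 00:00:00", "2024-03-06 00:00:00"
)
all_day = timestamps_on(conn, "2024-03-05 00:00:00", "2024-03-06 00:00:00")
```

The upper bound is exclusive, so a timestamp at exactly midnight is counted for the day that starts then and not for the previous day. Whether the query planner actually uses each index depends on your DBMS, so check the execution plan on your real data.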
1 Even if you don't query for individual timestamps, you still need to write them to the database one-by-one. If you have a serialized field, you first need to read the whole field to append just one value, and then write the whole result back to the database, which may become a performance problem rather quickly. And there are other problems, as mentioned in the link above.
2 As a general rule, what can be inferred should not be stored, unless there is a good performance reason to do so, and I don't see any here.
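To make the write-path point from footnote 1 concrete, the following sketch contrasts appending one timestamp under each design, again with Python's `sqlite3`. The table names `per_day` and `per_timestamp`, the column `ts`, and the helper functions are hypothetical, and JSON is only an assumed serialization format, since the question does not say how the field would be serialized.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE per_day (               -- second solution (hypothetical table name)
        person_id INTEGER NOT NULL,
        date      TEXT    NOT NULL,
        serialized_timestamps_for_the_day TEXT NOT NULL,
        PRIMARY KEY (person_id, date)
    );
    CREATE TABLE per_timestamp (         -- first solution (hypothetical table name)
        person_id INTEGER NOT NULL,
        ts        TEXT    NOT NULL
    );
""")


def append_serialized(conn, person_id, ts):
    """Second solution: read the whole field, append, write the whole field back."""
    date = ts[:10]  # the date is derivable from the timestamp itself
    row = conn.execute(
        "SELECT serialized_timestamps_for_the_day FROM per_day "
        "WHERE person_id = ? AND date = ?",
        (person_id, date),
    ).fetchone()
    values = json.loads(row[0]) if row else []
    values.append(ts)
    conn.execute(
        "INSERT OR REPLACE INTO per_day VALUES (?, ?, ?)",
        (person_id, date, json.dumps(values)),
    )


def append_row(conn, person_id, ts):
    """First solution: a single INSERT, independent of how many rows exist."""
    conn.execute("INSERT INTO per_timestamp VALUES (?, ?)", (person_id, ts))


for ts in ("2024-03-05 09:15:00", "2024-03-05 14:30:00"):
    append_serialized(conn, 1, ts)
    append_row(conn, 1, ts)

stored = json.loads(
    conn.execute(
        "SELECT serialized_timestamps_for_the_day FROM per_day"
    ).fetchone()[0]
)
row_count = conn.execute("SELECT COUNT(*) FROM per_timestamp").fetchone()[0]
```

In the serialized variant, every append reads and rewrites a field that grows over the course of the day, and concurrent writers would also need some form of locking to avoid losing updates. The one-row-per-timestamp variant has neither issue.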