Multidimensional Tables in PostgreSQL

Question

I would like to convince you, if possible, not to encode your data this way (independent of how terrible an idea it is).

Let's suppose you have a really hot post that goes viral, et cetera. That means all of your users are viewing it and many are trying to comment on it. With all of your nested discussion embedded in a single row, all updates must apply to that row. This in turn means that every update on that discussion competes with every other to update that one attribute. In PostgreSQL, concurrent updates to the same row queue up behind one another on the row lock, and each one writes a new version of the entire row. As you might imagine, this write contention will make your database slow way down.

A second reason is that it violates first normal form, in the sense that the comment attribute on the table you're showing contains more than one value. The motivation for this widely applied rule is that it makes a larger number of queries possible. In your design, it would be very difficult to delete from COMMENTS where USER = 'spammy-user', or even select * from COMMENTS where text like '%Trending Topic%'. In general, if you might ever want to look at part of a value in a column, rather than the whole thing, then you're probably looking at an opportunity for normalization.

The rule I try to use is "each 'kind of thing' gets its own table". As comments are a 'kind of thing', we'll split them out:

create table COMMENTS(
    COMMENT_ID serial primary key,
    POST_ID integer not null references POSTS(ID),
    PARENT_COMMENT_ID integer references COMMENTS(COMMENT_ID),
    CREATED_BY ...
    CONTENT ...
)

with the convention that comments having a null parent_comment_id are the roots of threaded discussions.
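
To make this concrete, here is a rough sketch of how writing and reading a thread might look with this table. It assumes things the schema above leaves open: that CREATED_BY and CONTENT are text columns, that a POSTS row with ID = 1 exists, and that the table is empty so the first inserted comment gets COMMENT_ID 1. THREAD, NESTING and SORT_PATH are names made up for the example.

```sql
-- Assumption: CREATED_BY and CONTENT are text columns, and POSTS has a row with ID = 1.

-- Each new comment is its own row, so concurrent commenters
-- are not all updating one shared value.
insert into COMMENTS (POST_ID, PARENT_COMMENT_ID, CREATED_BY, CONTENT)
values (1, null, 'alice', 'First!');            -- null parent: root of a thread

-- Assumption: the comment above received COMMENT_ID 1 (fresh table).
insert into COMMENTS (POST_ID, PARENT_COMMENT_ID, CREATED_BY, CONTENT)
values (1, 1, 'bob', 'Replying to alice');      -- child of comment 1

-- Read the whole nested discussion for one post with a recursive query.
with recursive THREAD as (
    -- start from the roots: comments with a null parent
    select COMMENT_ID, PARENT_COMMENT_ID, CREATED_BY, CONTENT,
           1 as NESTING,
           array[COMMENT_ID] as SORT_PATH
    from COMMENTS
    where POST_ID = 1
      and PARENT_COMMENT_ID is null
  union all
    -- then repeatedly attach the replies to what has been found so far
    select C.COMMENT_ID, C.PARENT_COMMENT_ID, C.CREATED_BY, C.CONTENT,
           T.NESTING + 1,
           T.SORT_PATH || C.COMMENT_ID
    from COMMENTS C
    join THREAD T on C.PARENT_COMMENT_ID = T.COMMENT_ID
)
select NESTING, COMMENT_ID, CREATED_BY, CONTENT
from THREAD
order by SORT_PATH;   -- each reply sorts directly under its parent
```

Sorting by the array of ancestor ids lists every reply immediately below the comment it answers, and NESTING can drive the indentation when rendering. An index on PARENT_COMMENT_ID would probably help the recursive step on large tables.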