Question

After going through the relational DB/NoSQL research debate, I've come to the conclusion that I will be moving forward with PG as my data store. A big part of that decision was the announcement of JSONB coming to 9.4. My question is what should I do now, building an application from the ground up knowing that I want to migrate to (I mean use right now!) jsonb? The DaaS options for me are going to be running 9.3 for a while.

From what I can tell, and correct me if I'm wrong, hstore would run quite a bit faster since I'll be doing a lot of queries of many keys in the hstore column and if I were to use plain json I wouldn't be able to take advantage of indexing/GIN etc. However I could take advantage of nesting with json, but running any queries would be very slow and users would be frustrated.

So, do I build my app around the current version of hstore or json data type, "good ol" EAV or something else? Should I structure my DB and app code a certain way? Any advice would be greatly appreciated. I'm sure others may face the same question as we await the next official release of PostgreSQL.

A few extra details on the app I want to build:

- Very relational (with one exception below)
- Strong social network aspect (groups, friends, likes, timeline, etc.)
- Based around a single object with variable user-assigned attributes, maybe 10 or 1000+ (this is where the need for a schema-less design comes into play)

Thanks in advance for any input!


Solution

It depends. If you expect to have a lot of users, a very high transaction volume, or an insane number of attribute fetches per query, I would say use HSTORE. If, however, your app will start small and grow over time, or have relatively few transactions that fetch attributes, or just fetch a few per query, then use JSON. Even in the latter case, if you're not fetching many attributes but checking one or two keys often in the WHERE clause of your queries, you can create a functional (expression) index to speed things up:

CREATE INDEX idx_foo_somekey ON foo((bar ->> 'somekey'));

Now, when a query filters on that same expression, for example WHERE bar ->> 'somekey' = 'some value', it should be able to use the index.
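
As a slightly fuller sketch of the same idea on 9.3, assuming a table foo with a json column bar (the id column and the value 'blue' are made up for illustration):

```sql
-- Hypothetical table; only the names foo and bar come from the index above.
CREATE TABLE foo (
    id  serial PRIMARY KEY,
    bar json NOT NULL
);

-- Expression index on the text value of one key.
CREATE INDEX idx_foo_somekey ON foo ((bar ->> 'somekey'));

-- The WHERE clause must use the same expression as the index definition.
SELECT id, bar
FROM foo
WHERE bar ->> 'somekey' = 'blue';

-- Check what the planner actually does with your data.
EXPLAIN
SELECT id FROM foo WHERE bar ->> 'somekey' = 'blue';
```

On a small or empty table the planner may still choose a sequential scan, so check the plan against a realistic amount of data.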

And of course, it will be easier to use nested data and to upgrade to jsonb when it becomes available to you.
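
A rough sketch of what that upgrade could look like once you are on 9.4, assuming the same table foo with a json column bar (the index name idx_foo_bar_gin and the value 'blue' are illustrative):

```sql
-- Convert the existing json column in place. This rewrites the table,
-- so expect a lock and some downtime on large tables.
ALTER TABLE foo
    ALTER COLUMN bar TYPE jsonb
    USING bar::jsonb;

-- A GIN index over the whole document supports containment
-- and key-existence operators.
CREATE INDEX idx_foo_bar_gin ON foo USING gin (bar);

-- Containment query that can use the GIN index.
SELECT id
FROM foo
WHERE bar @> '{"somekey": "blue"}';
```

After the conversion it is worth inspecting the table (for example with \d foo in psql) to confirm that any existing indexes still look the way you expect.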

So I would lean toward JSON unless you know for sure you're going to kick your server's ass with heavy use of key fetches before you have a chance to upgrade to 9.4. But to be sure of that, I would say do some benchmarking with your anticipated query volumes now and see what works best for you.
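
One possible starting point for such a benchmark is sketched below. The table names, key names, and row count are all assumptions; replace them with data shaped like your real workload.

```sql
-- Hypothetical benchmark tables, one per candidate type.
CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE bench_json   (id serial PRIMARY KEY, bar json);
CREATE TABLE bench_hstore (id serial PRIMARY KEY, bar hstore);

-- Load the same synthetic attributes into both tables.
INSERT INTO bench_json (bar)
SELECT ('{"somekey": "v' || g::text || '", "other": "x"}')::json
FROM generate_series(1, 100000) AS g;

INSERT INTO bench_hstore (bar)
SELECT hstore(ARRAY['somekey', 'v' || g::text, 'other', 'x'])
FROM generate_series(1, 100000) AS g;

-- Expression index for json, GIN index for hstore.
CREATE INDEX bench_json_somekey ON bench_json ((bar ->> 'somekey'));
CREATE INDEX bench_hstore_gin   ON bench_hstore USING gin (bar);

ANALYZE bench_json;
ANALYZE bench_hstore;

-- Compare plans and timings for an equivalent lookup.
EXPLAIN ANALYZE
SELECT id FROM bench_json WHERE bar ->> 'somekey' = 'v500';

EXPLAIN ANALYZE
SELECT id FROM bench_hstore WHERE bar @> 'somekey=>v500';
```

Single-query timings only tell part of the story, so run the queries you actually expect, at the concurrency you expect, before deciding.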

OTHER TIPS

You probably haven't given quite enough detail for a very specific answer, but I will say this: if your data is "very relational", then I believe your best course is to build it with a good relational design. If it's just one field with "variable assigned attributes", then that sounds like a good use for an hstore, which is pretty tried and true at this point. I've been doing some reading on 9.4 and jsonb sounds cool, but it won't be out for a while. I suspect that a good schema design in 9.3 plus a very targeted use of hstore will yield a good combination of performance and flexibility.
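
A minimal sketch of that combination, assuming a hypothetical item table where the relational parts are ordinary columns and only the variable attributes live in an hstore column (all table, column, and key names here are made up):

```sql
-- hstore ships as a contrib extension and must be enabled per database.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Fixed, relational data stays in normal columns;
-- user-assigned attributes go into attrs.
CREATE TABLE item (
    id       serial  PRIMARY KEY,
    owner_id integer NOT NULL,
    name     text    NOT NULL,
    attrs    hstore  NOT NULL DEFAULT ''
);

-- GIN index to support containment and key-existence lookups.
CREATE INDEX idx_item_attrs ON item USING gin (attrs);

INSERT INTO item (owner_id, name, attrs)
VALUES (1, 'widget', 'color=>red, size=>large');

-- Items whose attributes contain color = red.
SELECT id, name FROM item WHERE attrs @> 'color=>red';

-- Items that have a size attribute at all, with its value.
SELECT id, attrs -> 'size' AS size FROM item WHERE attrs ? 'size';
```

Keep in mind that hstore keys and values are flat text, so nested structures would need a different representation.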

Licensed under: CC-BY-SA with attribution