Question

In my project, I already have tables which have columns containing (nested) JSON data, but those columns are not of the JSON data type. For some JSON attributes, I also keep a separate column in the same table, so that I get decent performance when searching, ordering, etc., just as with relational data. For better understanding, here is an example of what I currently have:

lb_reg_marcacao_especial

        Column        |            Type             | Modifiers
----------------------+-----------------------------+-----------
 id                   | integer                     | not null
 json                 | character varying           | not null
 nm_marcacao          | character varying           |
 dt_inclusao          | timestamp without time zone |
Indexes:
    "lb_reg_marcacao_especial_pkey" PRIMARY KEY, btree (id_reg)
    "lb_reg_marcacao_especial_nm_marcacao_key" UNIQUE CONSTRAINT, btree (nm_marcacao)

The json column holds data like:

'{
  "dt_inclusao": "02/12/2013 11:05:27",
  "nm_user_inclusao": "Some name",
  "nested": {
    "nm_marcacao": "marc1",
    "dt_ultima_alteracao": ""
  },
  "nm_user_ultima_alteracao": "",
  "ds_marcacao": "marc1",
  "st_marcacao": true
}'

Some of these JSON documents have ~100 keys and about 3 levels of nesting. As you can see, data is duplicated, since some keys exist both in the JSON and in the table's columns (nm_marcacao and dt_inclusao).

So, I'm thinking about changing the json column's data type to JSON and removing the other columns. What do you think?


Solution

Switching to JSON instead of text has several benefits:

1) The data will be validated to conform to the actual JSON spec. Your server-side app might already be doing that, but type safety is always nice.
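As a rough sketch of what that validation looks like in practice, assuming the table and column from the question: the conversion itself acts as the check, because the cast fails if any existing row does not hold valid JSON. The column name is double-quoted only as a precaution, since it matches a type name.

```sql
-- A malformed document is rejected by the cast to json.
-- Uncommenting this raises an "invalid input syntax for type json" error:
-- SELECT '{"nm_marcacao": "marc1"'::json;

-- Convert the existing varchar column in place. The USING clause casts
-- every stored value; the whole ALTER fails if any row is not valid JSON.
ALTER TABLE lb_reg_marcacao_especial
    ALTER COLUMN "json" TYPE json USING "json"::json;
```

If the ALTER fails, no rows are changed, so it is safe to try it first and clean up offending rows afterwards.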

2) You have access to several JSON functions, such as getters of properties at arbitrary paths, conversion to records, and more:

http://www.postgresql.org/docs/9.3/static/functions-json.html

http://clarkdave.net/2013/06/what-can-you-do-with-postgresql-and-json/
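A few of the 9.3 getters applied to the sample document from the question, as an illustrative sketch. It assumes the "json" column has already been converted to the json type; the output column aliases are arbitrary.

```sql
-- Top-level key, returned as text.
SELECT id, "json" ->> 'ds_marcacao' AS ds_marcacao
FROM lb_reg_marcacao_especial;

-- Nested key addressed by path, returned as text.
SELECT id, "json" #>> '{nested,nm_marcacao}' AS nm_marcacao
FROM lb_reg_marcacao_especial;

-- Expand the top-level keys of each document into key/value rows.
SELECT t.id, kv.key, kv.value
FROM lb_reg_marcacao_especial AS t,
     json_each_text(t."json") AS kv;
```

The `->` and `#>` variants return json instead of text, which is what you want when the result is fed into another JSON operator.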

For setters, I suggest a simple plv8 function that takes three inputs (JSON object, path to the property, updated value) and returns the updated JSON object. It can be used directly in UPDATE statements.
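One possible shape for such a setter is sketched below. The function name `json_set_path` and its parameters are hypothetical, and it assumes the plv8 extension is installed and that the "json" column is of type json. The arguments are passed as text on purpose, so that the sketch does not depend on how a particular plv8 version converts json values; treat it as a starting point rather than a finished implementation.

```sql
-- Requires the plv8 extension: CREATE EXTENSION plv8;
-- Hypothetical helper: doc and new_value are JSON passed as text.
CREATE OR REPLACE FUNCTION json_set_path(doc text, path text[], new_value text)
RETURNS text AS $$
    // An empty path replaces the whole document.
    if (path.length === 0) {
        return new_value;
    }
    var obj = JSON.parse(doc);
    var node = obj;
    // Walk to the parent of the target key, creating objects as needed.
    for (var i = 0; i < path.length - 1; i++) {
        var child = node[path[i]];
        if (child === null || typeof child !== 'object') {
            node[path[i]] = {};
        }
        node = node[path[i]];
    }
    // new_value is itself JSON text, e.g. '"marc2"' or 'true'.
    node[path[path.length - 1]] = JSON.parse(new_value);
    return JSON.stringify(obj);
$$ LANGUAGE plv8 IMMUTABLE STRICT;

-- Usage: change nested.nm_marcacao for one row.
UPDATE lb_reg_marcacao_especial
SET "json" = json_set_path("json"::text,
                           ARRAY['nested', 'nm_marcacao'],
                           '"marc2"')::json
WHERE id = 1;
```

Note that the document is re-serialized by `JSON.stringify`, so the original whitespace of the stored text is not preserved.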

3) You can already index specific properties of the JSON for performance:

How to create index on json field in Postgres 9.3
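For the table in the question, an expression index on the nested nm_marcacao key might look like the sketch below. The index name is made up, and the "json" column is assumed to be of type json already.

```sql
-- Expression index on a nested property. The index expression needs its
-- own pair of parentheses inside the column list.
CREATE INDEX lb_reg_marcacao_especial_json_nm_marcacao_idx
    ON lb_reg_marcacao_especial ((("json" -> 'nested') ->> 'nm_marcacao'));

-- A query can use the index when it filters on the same expression.
SELECT id
FROM lb_reg_marcacao_especial
WHERE ("json" -> 'nested') ->> 'nm_marcacao' = 'marc1';
```

Declaring it with CREATE UNIQUE INDEX instead would keep the uniqueness guarantee that the separate nm_marcacao column currently provides.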

4) PostgreSQL 9.4 will introduce binary JSON storage (jsonb), which, along with index engine improvements, should drastically increase speed:

http://obartunov.livejournal.com/177247.html

Here is a nice talk on the state of JSON in 9.3:

http://www.slideshare.net/amdunstan/93json-26647827

Here is a performance report for an earlier version (based on the original hstore2 work), written by the jsonb author, comparing its performance with MongoDB:

http://obartunov.livejournal.com/175235.html

and an updated benchmark from someone else:

https://plus.google.com/+ThomBrownUK/posts/1JizRBGPYBq

Licensed under: CC-BY-SA with attribution