Caveats of OUTER JOIN on nested JSON value
04-03-2021
Question
I'm writing a query that is supposed to find the elements of a list that do NOT exist in the DB. My first attempt was a nested query: the inner query fetches the ids, then I right join
against it to get what I need. This works well:
select v.id from (
select distinct json_data ->> 'elementId' as elementId
from content
where json_data ->> 'elementId' in ('id1', 'id2', 'id3')
) as a
right join (values('id1'), ('id2'), ('id3')) as v(id)
on v.id = a.elementId
where a.elementId is null
The above query works perfectly, but I should be able to reduce the nested query to a regular select if I do the comparison on json_data ->> 'elementId' directly.
My attempt:
select v.id
from content a
right join (values('id1'), ('id2'), ('id3')) as v(id)
on json_data ->> 'elementId' = v.id
After some debugging I realized that this will never work because the content table will always contain a row even if json_data ->> 'elementId' is null.
Edit: I had an extra WHERE clause that wasn't shown in the question. Once I moved it after the ON, my query was fixed.
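One plausible reading of the edit above, sketched with made-up data: a filter on the outer-joined table placed in WHERE is applied after the join, so it discards the NULL-extended rows, while the same filter in the ON clause only restricts which rows can match. The table definition, the sample rows, and the 'type' key are assumptions for illustration; the original WHERE clause is not shown in the question.

```sql
-- Hypothetical setup; the real table definition is not part of the question.
CREATE TEMP TABLE IF NOT EXISTS content (json_data json);
INSERT INTO content (json_data) VALUES
  ('{"elementId": "id1", "type": "a"}'),
  ('{"elementId": "id2", "type": "b"}');

-- Filter in WHERE: for list entries without a match, every column of "a" is NULL,
-- so the predicate rejects them and the outer join behaves like an inner join.
-- Returns id1 only.
SELECT v.id
FROM content a
RIGHT JOIN (VALUES ('id1'), ('id2'), ('id3')) AS v(id)
       ON a.json_data ->> 'elementId' = v.id
WHERE a.json_data ->> 'type' = 'a';

-- Same filter in ON: unmatched list entries survive the join,
-- and the IS NULL test picks them out. Returns id2 and id3.
SELECT v.id
FROM content a
RIGHT JOIN (VALUES ('id1'), ('id2'), ('id3')) AS v(id)
       ON a.json_data ->> 'elementId' = v.id
      AND a.json_data ->> 'type' = 'a'
WHERE a.json_data IS NULL;
```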
My question is: is there a way to avoid a nested query when doing a left join or right join on JSON data?
Solution
Use NOT EXISTS:
SELECT *
FROM (VALUES ('id1'), ('id2'), ('id3')) AS v(id)
WHERE NOT EXISTS (SELECT FROM content WHERE json_data ->> 'elementId' = v.id);
Or if you prefer a join:
SELECT v.id
FROM (VALUES ('id1'), ('id2'), ('id3')) AS v(id)
LEFT JOIN content c ON c.json_data ->> 'elementId' = v.id
WHERE c.json_data IS NULL -- or use the PK column
Either is an "anti-join", technically; and both will probably result in the same query plan.
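To check whether both forms really get the same plan on a given installation, EXPLAIN can be run on each. This is a minimal sketch: the temporary table below is a stand-in for the real content table, and the plan you see depends on Postgres version, table size, and statistics. If the planner treats both as anti-joins, the output of each should contain a node with "Anti Join" in its name.

```sql
-- Hypothetical stand-in for the real table, so the statements run on their own.
CREATE TEMP TABLE IF NOT EXISTS content (json_data json);

-- Plan for the NOT EXISTS form.
EXPLAIN
SELECT v.id
FROM  (VALUES ('id1'), ('id2'), ('id3')) AS v(id)
WHERE NOT EXISTS (
   SELECT FROM content c
   WHERE  c.json_data ->> 'elementId' = v.id
   );

-- Plan for the LEFT JOIN / IS NULL form.
EXPLAIN
SELECT v.id
FROM  (VALUES ('id1'), ('id2'), ('id3')) AS v(id)
LEFT  JOIN content c ON c.json_data ->> 'elementId' = v.id
WHERE c.json_data IS NULL;
```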
Consider upgrading to a current version of Postgres. Version 9.4 reached EOL in February 2020.
Index
But even Postgres 9.4 already supports jsonb, which (unlike json) allows a GIN index to support your query. See:
- How to get particular object from jsonb array in PostgreSQL?
- What's the proper index for querying structures in arrays in Postgres jsonb?
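A rough sketch of the GIN route, assuming json_data is (or has been converted to) jsonb. A GIN index on the column does not serve the ->> comparison, so the lookup is rewritten with the containment operator @>. The index name is made up, and jsonb_build_object() requires Postgres 9.5 or later. Containment compares JSON values, so a numeric elementId would not match the string built here.

```sql
-- Assumption: the column is jsonb. A json column could be converted with:
-- ALTER TABLE content ALTER COLUMN json_data TYPE jsonb USING json_data::jsonb;

-- jsonb_path_ops supports the @> operator used below; the index name is hypothetical.
CREATE INDEX content_json_data_gin_idx ON content USING gin (json_data jsonb_path_ops);

SELECT v.id
FROM  (VALUES ('id1'), ('id2'), ('id3')) AS v(id)
WHERE NOT EXISTS (
   SELECT FROM content c
   -- matches rows whose top-level "elementId" key holds the string v.id
   WHERE  c.json_data @> jsonb_build_object('elementId', v.id)
   );
```

Whether the planner actually uses the index depends on table size and statistics, so it is worth confirming with EXPLAIN.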
Or if you are focused on this query exclusively, a plain btree on an expression should be the optimum:
CREATE INDEX ON content ((json_data ->> 'elementId'));