PostgreSQL - retrieve all IDs in a tree for a given subnode
-
18-01-2021 - |
Question
I have a non-binary tree of customer, and I need to obtain all the IDs in a tree for the given node.
The table is very simple, just an join table with a parent id and child id. This is a representation of the tree I stored in my db.
In this example if I search for node 17 I need in return 14-17. If I search for 11 I need in return 1-6-5-4-8-11-12-7-2-10-3.
The order is not important. I only need the ID to avoid circularity when adding children to a node.
I created this query. The ancestor part works fine, I retrieve all parent nodes, but for the descendants I have some trouble. I'm only able to retrieve some part of the tree. For example, with node 11 I retrieve 4-10-6-11-7-8, so all right part of the tree is missing.
WITH RECURSIVE
-- starting node(s)
starting (parent, child) AS
(
SELECT t.parent, t.child
FROM public.customerincustomer AS t
WHERE t.child = :node or t.parent = :node
)
,
ancestors (parent, child) AS
(
SELECT t.parent, t.child
FROM public.customerincustomer AS t
WHERE t.parent IN (SELECT parent FROM starting)
UNION ALL
SELECT t.parent, t.child
FROM public.customerincustomer AS t JOIN ancestors AS a ON t.child = a.parent
),
descendants (parent, child) AS
(
SELECT t.parent, t.child
FROM public.customerincustomer AS t
WHERE t.parent IN (SELECT parent FROM starting) or t.child in (select child from starting)
UNION ALL
SELECT t.parent, t.child
FROM public.customerincustomer AS t JOIN ancestors AS a ON t.parent = a.child
)
table ancestors
union all
table descendants
UPDATE
I see that many examples included in the tree table also the root in form (root_id, null).
In my case I don't have this record.
For example, taking the smallest tree 14->17, in my table I have only one record parent, child
14 17
Solution
A very primitive implementation:
It basically divides the problem into two subproblems:
- First find all the ancestors of the node in question (including the node itself). If the node has no parents, then this would be just itself.
- Then find the descendants of all those ancestors (including themselves). We may have several nodes in the
ancestors
result set, we may get duplicates here, so we useUNION
(and notUNION ALL
) to remove them. - Note that the query will work even if the input node is a root with has no children.
- It will also work if the data set is not a forest of trees but an arbitrary directed graph (where nodes can have more than one parent).
The query:
WITH RECURSIVE
ancestors (parent) AS
(
SELECT :node -- start with the given node
UNION ALL
SELECT t.parent -- and find all its ancestors
FROM public.customerincustomer AS t JOIN ancestors AS a ON t.child = a.parent
),
descendants (customer) AS
(
SELECT parent AS customer -- now start with all the ancestors
FROM ancestors
UNION
SELECT t.child -- and find all their descendants
FROM public.customerincustomer AS t JOIN descendants AS d ON t.parent = d.customer
)
SELECT customer
FROM descendants ;
OTHER TIPS
This function returns the parent level of node_id
:
There is a 'level' row due there isn't a row (id, null) for parent row.
CREATE FUNCTION get_parent(node_id int)
RETURNS integer AS
$$
WITH RECURSIVE get_parent AS
(
SELECT
t1.id,
t1.parent_id,
t1.name,
0 AS level
FROM
tree t1
WHERE
t1.id = node_id
UNION ALL
SELECT
t2.id,
t2.parent_id,
t2.name,
level+1
FROM
tree t2
INNER JOIN
get_parent ON get_parent.parent_id = t2.id
)
SELECT
id
FROM
get_parent
ORDER BY
level DESC
LIMIT 1 ;
$$
LANGUAGE SQL;
select get_parent(7);
| get_parent | | ---------: | | 6 |
Now, next query returns the whole tree structure based on a parent node.
WITH RECURSIVE childs AS
(
SELECT
t1.id,
t1.parent_id,
t1.name
FROM
tree t1
WHERE
t1.id = get_parent(7)
UNION ALL
SELECT
t2.id,
t2.parent_id,
t2.name
FROM
tree t2
INNER JOIN
childs ON childs.id = t2.parent_id
)
SELECT
id,
parent_id,
name
FROM
childs;
id | parent_id | name -: | --------: | :------ 6 | 1 | Node 6 4 | 6 | Node 4 8 | 6 | Node 8 11 | 6 | Node 11 7 | 11 | Node 7 10 | 7 | Node 10
db<>fiddle here