Question

I have a table that is used to store a hierarchy, which references itself. I am in need of a SQL statement that will determine the parent's node type. Below I have given the structure of my table as well as sample data to provide the best explanation of what I am trying to figure out.

NODES Table

CREATE TABLE IF NOT EXISTS NODES(id INTEGER PRIMARY KEY AUTOINCREMENT, type TEXT NOT NULL, parent_id INTEGER REFERENCES NODES(id) ON DELETE CASCADE);

NOTE: parent_id can be NULL to reference the ROOT node.

Sample Data

INSERT INTO NODES(type, parent_id) VALUES('GRP', NULL); -- id: 1, title: Hello, World!
INSERT INTO NODES(type, parent_id) VALUES('TXT', 1);    -- id: 2, title: Print
INSERT INTO NODES(type, parent_id) VALUES('RND', 1);    -- id: 3, title: Random Output
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);    -- id: 4, title: OUTPUT #1
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);    -- id: 5, title: OUTPUT #2
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);    -- id: 6, title: OUTPUT #3

Each node also has a title; to keep the example short I have only listed the titles in the comments. What I am looking for is a single SQL statement that returns every node except the OUTPUT #* nodes, filtering on the parent's attributes.

My Attempt

SELECT id
FROM NODES
WHERE parent_id NOT IN (SELECT id
                        FROM NODES
                        WHERE type = 'RND');

My attempt works for the most part, but parent_id can be NULL, and my research suggests that a JOIN would be a much better solution. I just cannot figure out how to make a self join work the way I want it to.


Solution

This query joins the table to itself based on the parent_id and displays all fields from the node and its parent node. Because a left join is used, the results will include all nodes with their parents, including the root nodes.

Because the same table is referenced twice in the same query, an alias must be used to distinguish between the two tables. The syntax "nodes as parent" creates the alias "parent".

SELECT nodes.*, parent.*
FROM nodes
LEFT JOIN nodes AS parent
ON nodes.parent_id = parent.id

To find nodes where the parent type is not equal to "RND", as in your query above, add the following WHERE clause to the query.

WHERE parent.type != 'RND' OR parent.type IS NULL

The important point, which I believe your query misses, is how NULL values and comparison operators work together. Comparing NULL with any other value using an ordinary comparison operator such as = or != never yields true; the result is NULL (unknown), and a WHERE clause discards the row just as it would for false. This is why the second condition is needed in the WHERE clause above. The IS keyword is a special operator that can be used to check for NULL values.
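To see this behaviour concretely, here is a minimal sketch that runs the statements through Python's sqlite3 module against an in-memory database. It assumes SQLite (the AUTOINCREMENT syntax in the question suggests it); the variable names are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Ordinary comparisons against NULL yield NULL (None in Python),
# while IS NULL yields 1 (true).
null_checks = conn.execute(
    "SELECT NULL != 'RND', NULL = NULL, NULL IS NULL"
).fetchone()
print(null_checks)  # (None, None, 1)

# Table and sample data from the question.
conn.executescript("""
CREATE TABLE NODES(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    type TEXT NOT NULL,
    parent_id INTEGER REFERENCES NODES(id) ON DELETE CASCADE
);
INSERT INTO NODES(type, parent_id) VALUES('GRP', NULL);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 1);
INSERT INTO NODES(type, parent_id) VALUES('RND', 1);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
""")

base = """
SELECT nodes.id
FROM nodes
LEFT JOIN nodes AS parent ON nodes.parent_id = parent.id
WHERE parent.type != 'RND'
"""

# Without the IS NULL condition the root node (id 1) is filtered out.
without_null_check = [
    r[0] for r in conn.execute(base + " ORDER BY nodes.id")
]
# With it, the root node is kept.
with_null_check = [
    r[0] for r in conn.execute(base + " OR parent.type IS NULL ORDER BY nodes.id")
]

print(without_null_check)  # [2, 3]
print(with_null_check)     # [1, 2, 3]
```

For the root node the left join leaves parent.type as NULL, so parent.type != 'RND' evaluates to NULL and the row only survives because of the IS NULL condition.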

Other tips

Your query indeed does not return the "Hello, World!" node, because a comparison with NULL never evaluates to true. You could modify the query to include nodes without a parent by adding this special case:

SELECT id
FROM NODES
WHERE parent_id NOT IN (SELECT id
                        FROM NODES
                        WHERE type = 'RND')
   OR parent_id IS NULL
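As a quick check, the following sketch compares the original NOT IN query with the modified one on the sample data from the question. It assumes SQLite, driven through Python's sqlite3 module; the ORDER BY clause and the variable names are additions for the sake of a deterministic demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and sample data from the question.
conn.executescript("""
CREATE TABLE NODES(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    type TEXT NOT NULL,
    parent_id INTEGER REFERENCES NODES(id) ON DELETE CASCADE
);
INSERT INTO NODES(type, parent_id) VALUES('GRP', NULL);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 1);
INSERT INTO NODES(type, parent_id) VALUES('RND', 1);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
""")

original = """
SELECT id
FROM NODES
WHERE parent_id NOT IN (SELECT id
                        FROM NODES
                        WHERE type = 'RND')
ORDER BY id
"""

modified = """
SELECT id
FROM NODES
WHERE parent_id NOT IN (SELECT id
                        FROM NODES
                        WHERE type = 'RND')
   OR parent_id IS NULL
ORDER BY id
"""

# For the root node, NULL NOT IN (...) evaluates to NULL, so the row is dropped.
original_ids = [r[0] for r in conn.execute(original)]
# The extra IS NULL condition brings the root node back.
modified_ids = [r[0] for r in conn.execute(modified)]

print(original_ids)  # [2, 3]
print(modified_ids)  # [1, 2, 3]
```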

It is possible to do this with a join, but

  1. you cannot use a normal (inner) join because this would not match for a NULL value, you have to use an outer join instead; and
  2. you still have to add the special case for parent-less nodes, because the outer join sets all values of the missing parent to NULL; and
  3. the table now appears twice in the query, so you have to use aliases and qualify each column name with its alias:

SELECT child.id
FROM nodes AS child
LEFT JOIN nodes AS parent ON child.parent_id = parent.id
WHERE parent.type <> 'RND'
   OR parent.type IS NULL
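The three points above can be checked with a short sketch on the sample data from the question. It assumes SQLite, run through Python's sqlite3 module; the ORDER BY clauses and the variable names are additions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and sample data from the question.
conn.executescript("""
CREATE TABLE NODES(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    type TEXT NOT NULL,
    parent_id INTEGER REFERENCES NODES(id) ON DELETE CASCADE
);
INSERT INTO NODES(type, parent_id) VALUES('GRP', NULL);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 1);
INSERT INTO NODES(type, parent_id) VALUES('RND', 1);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
INSERT INTO NODES(type, parent_id) VALUES('TXT', 3);
""")

# 1. An inner join drops the root node, because its parent_id is NULL
#    and therefore matches no parent row.
inner_ids = [r[0] for r in conn.execute("""
    SELECT child.id
    FROM nodes AS child
    JOIN nodes AS parent ON child.parent_id = parent.id
    ORDER BY child.id
""")]

# 2. A left join keeps the root node, with the parent's columns set to NULL.
left_rows = conn.execute("""
    SELECT child.id, parent.type
    FROM nodes AS child
    LEFT JOIN nodes AS parent ON child.parent_id = parent.id
    ORDER BY child.id
""").fetchall()

# 3. The final query, with the special case for parent-less nodes.
result_ids = [r[0] for r in conn.execute("""
    SELECT child.id
    FROM nodes AS child
    LEFT JOIN nodes AS parent ON child.parent_id = parent.id
    WHERE parent.type <> 'RND'
       OR parent.type IS NULL
    ORDER BY child.id
""")]

print(inner_ids)   # [2, 3, 4, 5, 6]
print(left_rows)   # [(1, None), (2, 'GRP'), (3, 'GRP'), (4, 'RND'), (5, 'RND'), (6, 'RND')]
print(result_ids)  # [1, 2, 3]
```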
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow