Question

[previous question]

I'm trying to add reddit-like comments to an app, and I decided to go with the closure table pattern for database organization. My app database looks somewhat like this:

posts

+----+-------+
| id | title |
+----+-------+
|  1 | Hello |
+----+-------+

comments

+----+-----------+---------+------+
| id | parent_id | post_id | text |
+----+-----------+---------+------+
|  1 |      NULL |       1 | ...  |
|  2 |         1 |       1 | ...  |
|  3 |         2 |       1 | ...  |
|  4 |         3 |       1 | ...  |
|  5 |         3 |       1 | ...  |
|  6 |         5 |       1 | ...  |
|  7 |      NULL |       1 | ...  |
|  8 |         7 |       1 | ...  |
|  9 |         4 |       1 | ...  |
+----+-----------+---------+------+

comment_paths

+-----------+----------+-------+
| parent_id | child_id | depth |
+-----------+----------+-------+
|         1 |        1 |     0 |
|         2 |        2 |     0 |
|         1 |        2 |     1 |
|         3 |        3 |     0 |
|         2 |        3 |     1 |
|         1 |        3 |     2 |
|         4 |        4 |     0 |
|         3 |        4 |     1 |
|         2 |        4 |     2 |
|         1 |        4 |     3 |
          [...snip...]

Right now, I'm running a this query:

SELECT c.*, p.*
FROM comments AS c
JOIN comment_paths AS p
    ON c.id = p.child_id
WHERE p.parent_id IN
    (SELECT c2.id FROM comments AS c2 WHERE c2.parent_id IS NULL AND c2.post_id = 1)

to get a list of the comments based on their post_id. The returned data is:

+------+-------------+-----------+--------+-------------+------------+---------+
| c.id | c.parent_id | c.post_id | c.text | p.parent_id | p.child_id | p.depth |
+------+-------------+-----------+--------+-------------+------------+---------+
|    1 |        NULL |         1 | ...    |           1 |          1 |       0 |
|    2 |           1 |         1 | ...    |           1 |          2 |       1 |
|    3 |           2 |         1 | ...    |           1 |          3 |       2 |
|    4 |           3 |         1 | ...    |           1 |          4 |       3 |
|    5 |           3 |         1 | ...    |           1 |          5 |       3 |
|    6 |           5 |         1 | ...    |           1 |          6 |       4 |
|    9 |           4 |         1 | ...    |           1 |          9 |       4 |
|    7 |        NULL |         1 | ...    |           7 |          7 |       0 |
|    8 |           7 |         1 | ...    |           7 |          8 |       1 |
+------+-------------+-----------+--------+-------------+------------+---------+

This represents the tree:

[1]
 |[2]
 | |[3]
 |   |[4]
 |   | |[9]
 |  [5]
 |   |[6]
[7]
 |[8]

However, I'm struggling to convert the returned data into a Python tree structure. Essentially, my goal is this question and this question in terms of final output (HTML) but I really don't want to resort to recursive SQL statements since I already have the information. I figure some sort of recursion is necessary, as I would like to end up with structure similar to this:

[
  {
    'id': 1,
    ...
    'children': [
                  {
                    'id': 2,
                    ...
                    'children': [ ... ]
                  }
                ]
  },
  {
    'id': 7,
    ...
    'children': [
                {
                  'id': 8,
                  ...
                }
              ]
  }
]

Basically a nested list of dictionaries so I can loop over them using Jinja's recursive loop. Does anyone have an idea?

Thanks!


Edit 2013-04-17

Messing around, I have a "working" solution, although it does a lot of iterations so I don't want to mark it as the answer to this question. The solution I used is:

comment_set = ... # previous query to grab data set

def create_tree(parent):
    parent['children'] = []
    for record in comment_set:
        if record['parent_id'] == parent['id']:
            parent['children'].append(create_tree(record))
    return parent

comment_tree = []
for record in comment_set:
    if record['parent_id'] is None: # if this is the start of a tree
        comment_tree.append(create_tree(record))

It's not ideal because it iterates over the comment_set every time create_tree() is called, which is every record in the set. However, it's the best I have right now. Anyone have any thoughts?

Was it helpful?

Solution

You don't need recursion, you just need to be able to process node objects by reference.

Here's a code example to create the nested data structure in linear time. Treat this as pseudocode, because I haven't tested this and I'm not fluent in python yet.

The reason I use two for-loops is that otherwise we'd have to assume nodes from the top of the tree are processed before nodes from deep in the tree. With two loops as shown below, no such assumption is needed.

for record in comment_set:
    nodes[record['id']] = record
for record in comment_set:
    if record['parent_id'] in nodes:
        nodes[record['parent_id']]['children'].append(record)
    else
        top = record;
return top

By the end of that loop:

  • nodes will be a one-dimensional dictionary of all the nodes in the hierarchy.
  • each node will have a list of its own immediate children.
  • top should be the only node that has no parent (i.e. the root node).

This is similar to an example I posted in a past SO post, Turn database result into array:

OTHER TIPS

you can make some small changes that helps you to improve your code::

comment_set = ... order by parent_id asc

def create_tree(parent, list):
    parent['children'] = []
    for record in comment_set:
        if record['parent_id'] == parent['id']:
            parent['children'].append(create_tree(record, list[1:]))
    return parent

comment_tree = []
for record in comment_set:
    if record['parent_id'] is None: # if this is the start of a tree
        comment_tree.append(create_tree(record, comment_set))
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top