Question

I'm curious to know what the best way (best practice) to handle hierarchies are in regards to database design. Here is a small example of how I usually handle them.

Node Table

NodeId int PRIMARY KEY
NodeParentId int NULL
DisplaySeq int NOT NULL
Title nvarchar(255)

Ancestor Table

NodeId int
AncestorId int
Hops int

with Indexes on NodeId, AncestorId, Hops

Tables look like this:

Node Table

NodeId    NodeParentId    DisplaySeq    Title
1         NULL            1             'Root'
2         1               1             'Child 1'
3         1               2             'Child 2'
4         2               1             'Grandchild 1'
5         2               2             'Grandchild 2'

Ancestor Table

NodeId    AncestorId    Hops
1         NULL          0
1         1             0
2         1             1
2         2             0
3         1             1
3         3             0
4         1             2
4         2             1
4         4             0
5         1             2
5         2             1
5         5             0

With this design, I've found that with large hierarchies I can get an entire section of the hierarchy very quickly by joining on the Ancestor table for AncestorId = target NodeId, like:

SELECT *
FROM Node n
INNER JOIN Ancestor a on a.NodeId=n.NodeId
WHERE a.AncestorId = @TargetNodeId

It's also easy to get direct children as well

SELECT *
FROM Node n
INNER JOIN Ancestor a on a.NodeId=n.NodeId
WHERE a.AncestorId = @TargetNodeId
AND Hops = 1

I'm interested in knowing what other solutions you may have used for this type of thing. In my experience, hierarchies can get pretty hairy, and any way to optimize their retrieval is very important.

Was it helpful?

Solution

As MarkusQ and n8wrl have already pointed out, Joe Celko has some good stuff on this. I'll just add that there are multiple ways to model a hierarchy (Joe's book contains several I believe, not just one that he considers the "best"). Your final decision will hopefully take into account your own specific needs. Some of the different ways to model it are better for write-intensive operations while others are better for frequent or fast reads up and down the hierarchy. Just keep in mind what your system will be doing with it.

OTHER TIPS

There are some vendor-specific extensions to do this, but my favorite db-neutral way comes from Joe Celko - google 'Joe Celko Trees and Hierarchies' or buy this book: link text

This is a very clever set-based way to go. Easy to query hierarchy. I added the 'parentID' field you have just because I ask the 'direct children' and 'parent' questions a lot and that speeds those up. But this is a wonderful way to get a 'ancestry' or 'descdent' query.

You may also want to check out the "nested sets" pattern:

http://www.intelligententerprise.com/001020/celko.jhtml (Broken Link)

Or you can Google for more.

P.S.: Curses, n8wrl, you type faster than I do!

SQL Server 2008 introduced the hierarchyid data type

In Oracle, you can use CONNECT BY/START WITH to query hierarchial data. In SQL Server, you can use a stored procedure, that calls itself recursively.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top