Handling Hierarchy Data in Database
-
06-09-2019 - |
Question
I'm curious to know what the best way (best practice) to handle hierarchies are in regards to database design. Here is a small example of how I usually handle them.
Node Table
NodeId int PRIMARY KEY
NodeParentId int NULL
DisplaySeq int NOT NULL
Title nvarchar(255)
Ancestor Table
NodeId int
AncestorId int
Hops int
with Indexes on NodeId, AncestorId, Hops
Tables look like this:
Node Table
NodeId NodeParentId DisplaySeq Title
1 NULL 1 'Root'
2 1 1 'Child 1'
3 1 2 'Child 2'
4 2 1 'Grandchild 1'
5 2 2 'Grandchild 2'
Ancestor Table
NodeId AncestorId Hops
1 NULL 0
1 1 0
2 1 1
2 2 0
3 1 1
3 3 0
4 1 2
4 2 1
4 4 0
5 1 2
5 2 1
5 5 0
With this design, I've found that with large hierarchies I can get an entire section of the hierarchy very quickly by joining on the Ancestor table for AncestorId = target NodeId, like:
SELECT *
FROM Node n
INNER JOIN Ancestor a on a.NodeId=n.NodeId
WHERE a.AncestorId = @TargetNodeId
It's also easy to get direct children as well
SELECT *
FROM Node n
INNER JOIN Ancestor a on a.NodeId=n.NodeId
WHERE a.AncestorId = @TargetNodeId
AND Hops = 1
I'm interested in knowing what other solutions you may have used for this type of thing. In my experience, hierarchies can get pretty hairy, and any way to optimize their retrieval is very important.
Solution
As MarkusQ and n8wrl have already pointed out, Joe Celko has some good stuff on this. I'll just add that there are multiple ways to model a hierarchy (Joe's book contains several I believe, not just one that he considers the "best"). Your final decision will hopefully take into account your own specific needs. Some of the different ways to model it are better for write-intensive operations while others are better for frequent or fast reads up and down the hierarchy. Just keep in mind what your system will be doing with it.
OTHER TIPS
There are some vendor-specific extensions to do this, but my favorite db-neutral way comes from Joe Celko - google 'Joe Celko Trees and Hierarchies' or buy this book: link text
This is a very clever set-based way to go. Easy to query hierarchy. I added the 'parentID' field you have just because I ask the 'direct children' and 'parent' questions a lot and that speeds those up. But this is a wonderful way to get a 'ancestry' or 'descdent' query.
You may also want to check out the "nested sets" pattern:
http://www.intelligententerprise.com/001020/celko.jhtml (Broken Link)
Or you can Google for more.
P.S.: Curses, n8wrl, you type faster than I do!
SQL Server 2008 introduced the hierarchyid data type
In Oracle, you can use CONNECT BY/START WITH to query hierarchial data. In SQL Server, you can use a stored procedure, that calls itself recursively.
I'd definitely recommend nested sets. They are great.
http://threebit.net/tutorials/nestedset/tutorial1.html http://www.dbmsmag.com/9603d06.html