SQL - 如何存储和导航层次结构？

https://stackoverflow.com/questions/38801

09-06-2019
|

题

您使用哪些方法来建模和检索数据库中的分层信息？

解决方案

关于这个主题的权威文章由 Joe Celko 撰写，他将其中的一些文章整理成一本名为《Joe Celko's Trees and Hierarchies in SQL for Smarties》的书。

他喜欢一种称为有向图的技术。关于他在这个主题上的工作的介绍可以找到这里

其他提示

我喜欢改进的先序树遍历算法。这种技术使得查询树变得非常容易。

但这里是我从 Zend Framework (PHP) 贡献者网页（由 Laurent Melmoux 于 2007 年 6 月 5 日 15:52 发布）复制的有关该主题的链接列表。

许多链接与语言无关：

有两种主要的表示方法和算法来表示数据库的层次结构：

嵌套集也称为改进的先序树遍历算法
邻接表模型

这里解释得很好：

以下是我收集的更多链接：

邻接表模型

http://www.sqlteam.com/item.asp?ItemID=8866

嵌套集

http://www.sqlsummit.com/AdjacencyList.htm
http://www.edutech.ch/contribution/nstrees/index.php
http://www.phpriot.com/d/articles/php/application-design/nested-trees-1/
http://www.dbmsmag.com/9604d06.html
http://en.wikipedia.org/wiki/Tree_traversal
http://www.cosc.canterbury.ac.nz/mukundan/dsal/BTree.html （java montrant le fonctionnement 小程序）

图形

http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html

班级：

嵌套集 DB 树 Adodb

http://www.phpclasses.org/browse/package/2547.html

访问模型 ADOdb

http://www.phpclasses.org/browse/package/2919.html

PEAR::DB_NestedSet

梨树

nstrees

http://www.edutech.ch/contribution/nstrees/index.php

在 SQL 数据库中表示层次结构的最佳方式是什么？一种通用的、便携式的技术？

我们假设层次结构大部分已被读取，但并不完全静态。假设这是一个家谱。

以下是不这样做的方法：

create table person (
person_id integer autoincrement primary key,
name      varchar(255) not null,
dob       date,
mother    integer,
father    integer
);

并像这样插入数据：

person_id   name      dob       mother father  
1           Pops      1900/1/1   null   null  
2           Grandma   1903/2/4   null   null  
3           Dad       1925/4/2   2      1  
4           Uncle Kev 1927/3/3   2      1
5           Cuz Dave  1953/7/8   null   4
6           Billy     1954/8/1   null   3

相反，将节点和关系拆分为两个表。

create table person (
person_id integer autoincrement primary key,
name      varchar(255) not null,
dob       date
);

create table ancestor (
ancestor_id   integer,
descendant_id integer,
distance      integer
);

数据是这样创建的：

person_id   name      dob       
1           Pops      1900/1/1  
2           Grandma   1903/2/4   
3           Dad       1925/4/2   
4           Uncle Kev 1927/3/3
5           Cuz Dave  1953/7/8   
6           Billy     1954/8/1   

ancestor_id  descendant_id  distance
1            1              0
2            2              0
3            3              0
4            4              0
5            5              0
6            6              0
1            3              1
2            3              1
1            4              1
2            4              1
1            5              2
2            5              2
4            5              1
1            6              2
2            6              2
3            6              1

现在，您可以运行不涉及将表重新连接到自身的任意查询，如果您在与节点相同的行中具有层次关系，则会发生这种情况。

谁有祖父母？

select * from person where person_id in 
    (select descendant_id from ancestor where distance=2);

您所有的后裔：

select * from person where person_id in 
    (select descendant_id from ancestor 
    where ancestor_id=1 and distance>0);

谁是叔叔？

select decendant_id uncle from ancestor 
    where distance=1 and ancestor_id in 
    (select ancestor_id from ancestor 
        where distance=2 and not exists
        (select ancestor_id from ancestor 
        where distance=1 and ancestor_id=uncle)
    )

您可以避免通过子查询将表与其自身连接的所有问题，常见的限制是 16 个子查询。

问题是，维护祖先表有点困难 - 最好使用存储过程来完成。

我不同意乔希的观点。如果您使用像公司组织这样的巨大层次结构，会发生什么。人们可以加入/离开公司、改变汇报路线等等……维护“距离”将是一个大问题，您必须维护两个数据表。

此查询（SQL Server 2005 及更高版本）可以让您查看任何人的完整行并计算他们在层次结构中的位置，并且只需要一个用户信息表。可以对其进行修改以查找任何子关系。

--Create table of dummy data
create table #person (
personID integer IDENTITY(1,1) NOT NULL,
name      varchar(255) not null,
dob       date,
father    integer
);

INSERT INTO #person(name,dob,father)Values('Pops','1900/1/1',NULL);  
INSERT INTO #person(name,dob,father)Values('Grandma','1903/2/4',null);
INSERT INTO #person(name,dob,father)Values('Dad','1925/4/2',1);
INSERT INTO #person(name,dob,father)Values('Uncle Kev','1927/3/3',1);
INSERT INTO #person(name,dob,father)Values('Cuz Dave','1953/7/8',4);
INSERT INTO #person(name,dob,father)Values('Billy','1954/8/1',3);

DECLARE @OldestPerson INT; 
SET @OldestPerson = 1; -- Set this value to the ID of the oldest person in the family

WITH PersonHierarchy (personID,Name,dob,father, HierarchyLevel) AS
(
   SELECT
      personID
      ,Name
      ,dob
      ,father,
      1 as HierarchyLevel
   FROM #person
   WHERE personID = @OldestPerson

   UNION ALL

   SELECT
    e.personID,
      e.Name,
      e.dob,
      e.father,
      eh.HierarchyLevel + 1 AS HierarchyLevel
   FROM #person e
      INNER JOIN PersonHierarchy eh ON
         e.father = eh.personID
)

SELECT *
FROM PersonHierarchy
ORDER BY HierarchyLevel, father;

DROP TABLE #person;

供参考：SQL Server 2008 引入了一个新的层次结构ID 这种情况的数据类型。让您可以控制行在“树”中的水平和垂直位置。

甲骨文：选择 ...从...开始 ...连接方式

Oracle 有一个 SELECT 扩展，可以轻松进行基于树的检索。也许 SQL Server 有一些类似的扩展？

该查询将遍历存储嵌套关系的表家长和孩子列。

select * from my_table
    start with parent = :TOP
    connect by prior child = parent;

http://www.adp-gmbh.ch/ora/sql/connect_by.html

我更喜欢乔什和马克·哈里森使用的技术的组合：

两个表，一个包含 Person 的数据，另一个包含层次结构信息（person_id，parent_id [， mother_id]）如果该表的 PK 是 person_id，则您将拥有一棵简单的树，只有一个父节点（这在以下情况下是有意义的）这种情况，但不是其他情况，如会计账户）

这个层次结构表可以通过递归过程来遍历，或者如果您的数据库通过诸如 SELECT... 之类的语句支持它。由先前（甲骨文）。

另一种可能性是，如果您知道要维护的层次结构数据的最大深度，则使用单个表，每个层次结构级别包含一组列

当我们实现树组件时，我们遇到了同样的问题 [灵活] 并使用 tharkun 中提到的嵌套集树模型方法 MySQL 文档。

除了（显着）加快速度之外，我们还使用了传播方法简单地意味着我们使用顶级右边界的最大 Long 值，这允许我们插入和移动节点而无需重新计算所有左值和右值。left 和 right 的值是通过将节点的范围除以 3 并使用内第三个作为新节点的边界。

可以看一个java代码示例这里.

如果您使用的是 SQL Server 2005 那么这个链接解释如何检索分层数据。

一旦您习惯使用通用表表达式 (CTE)，它们就会成为您的朋友。

许可以下： CC-BY-SA 和归因

不隶属于 StackOverflow