转载

MySQL实现嵌套集合模型

MySQL实现嵌套集合模型

译文主要是介绍如何用MySQL来存储嵌套集合数据。在其中会增加一些自己的理解,也会删除掉一些自认为无用的废话。这篇文章主要讲的是嵌套集合模型,所以邻接表不是本文的重点,简单略过就好。

也许这是 原文地址 ,因为我也不知道这是不是原文。

介绍

什么是分层数据?

MySQL实现嵌套集合模型

类似于树形结构,除了根节点和叶子节点外,所有节点都有用一个父节点和多个子节点。

那么,在MySQL中如何处理分层数据呢?

原文中介绍了两种分层结构模型: 邻接表模型嵌套集合模型

邻接表模型(The Adjacency List Model)

首先,建立测试表,导入测试数据,

CREATE TABLE category(  category_id INT AUTO_INCREMENT PRIMARY KEY,  name VARCHAR(20) NOT NULL,  parent INT DEFAULT NULL ); INSERT INTO category VALUES  (1,'ELECTRONICS',NULL),  (2,'TELEVISIONS',1),  (3,'TUBE',2),  (4,'LCD',2),  (5,'PLASMA',2),  (6,'PORTABLE ELECTRONICS',1),  (7,'MP3 PLAYERS',6),  (8,'FLASH',7),  (9,'CD PLAYERS',6),  (10,'2 WAY RADIOS',6); SELECT * FROM category ORDER BY category_id; +-------------+----------------------+--------+ | category_id | name   | parent | +-------------+----------------------+--------+ |    1 | ELECTRONICS   |   NULL | |    2 | TELEVISIONS   |      1 | |    3 | TUBE   |      2 | |    4 | LCD    |      2 | |    5 | PLASMA        |      2 | |    6 | PORTABLE ELECTRONICS |      1 | |    7 | MP3 PLAYERS   |      6 | |    8 | FLASH  |      7 | |    9 | CD PLAYERS    |      6 | |   10 | 2 WAY RADIOS  |      6 | +-------------+----------------------+--------+ 10 rows in set (0.00 sec) 

在邻接表中,所有的数据均拥有一个Parent字段,用来存储它的父节点。当前节点为根节点的话,它的父节点则为NULL。那么在遍历的时候,可以使用递归来实现查询整棵树,从根节点开始,不断寻找子节点(父节点->子节点->父节点->子节点)。

检索分层路径

一般需要获取一个分层结构的路径问题,那么

SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4 FROM category AS t1 LEFT JOIN category AS t2 ON t2.parent = t1.category_id LEFT JOIN category AS t3 ON t3.parent = t2.category_id LEFT JOIN category AS t4 ON t4.parent = t3.category_id WHERE t1.name = 'ELECTRONICS';  +-------------+----------------------+--------------+-------+ | lev1        | lev2                 | lev3         | lev4  | +-------------+----------------------+--------------+-------+ | ELECTRONICS | TELEVISIONS          | TUBE         | NULL  | | ELECTRONICS | TELEVISIONS          | LCD          | NULL  | | ELECTRONICS | TELEVISIONS          | PLASMA       | NULL  | | ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS  | FLASH | | ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS   | NULL  | | ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL  | +-------------+----------------------+--------------+-------+ 6 rows in set (0.00 sec)

检索叶子节点

SELECT t1.name FROM category AS t1 LEFT JOIN category as t2 ON t1.category_id = t2.parent WHERE t2.category_id IS NULL;  +--------------+ | name         | +--------------+ | TUBE         | | LCD          | | PLASMA       | | FLASH        | | CD PLAYERS   | | 2 WAY RADIOS | +--------------+

检索指定路径

SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4 FROM category AS t1 LEFT JOIN category AS t2 ON t2.parent = t1.category_id LEFT JOIN category AS t3 ON t3.parent = t2.category_id LEFT JOIN category AS t4 ON t4.parent = t3.category_id WHERE t1.name = 'ELECTRONICS' AND t4.name = 'FLASH';  +-------------+----------------------+-------------+-------+ | lev1        | lev2                 | lev3        | lev4  | +-------------+----------------------+-------------+-------+ | ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH | +-------------+----------------------+-------------+-------+ 1 row in set (0.01 sec)

邻接表的缺点

在检索路径的过程中,除了本层外,每一层都会对应一个 LEFT JOIN ,那么如果层数不定怎么办?或者层数过多?

在删除中间层的节点时,需要同时删除该节点下的所有节点,否则会出现孤立节点。

嵌套集合模型Nested Set Model

原文中主要的目的是介绍嵌套集合模型,如下

MySQL实现嵌套集合模型

通过集合的包含关系,嵌套结合模型可以表示分层结构,每一个分层可以用一个Set来表示(一个圈),父节点所在的圈包含所有子节点所在的圈。

为了用MySQL来表示集合关系,需要定义连个字段 leftright (表示一个集合的范围)。

CREATE TABLE nested_category (         category_id INT AUTO_INCREMENT PRIMARY KEY,         name VARCHAR(20) NOT NULL,         lft INT NOT NULL,         rgt INT NOT NULL );  INSERT INTO nested_category VALUES   (1,'ELECTRONICS',1,20),   (2,'TELEVISIONS',2,9),   (3,'TUBE',3,4),   (4,'LCD',5,6),   (5,'PLASMA',7,8),   (6,'PORTABLE ELECTRONICS',10,19),   (7,'MP3 PLAYERS',11,14),   (8,'FLASH',12,13),   (9,'CD PLAYERS',15,16),   (10,'2 WAY RADIOS',17,18);  SELECT * FROM nested_category ORDER BY category_id;  +-------------+----------------------+-----+-----+ | category_id | name                 | lft | rgt | +-------------+----------------------+-----+-----+ |           1 | ELECTRONICS          |   1 |  20 | |           2 | TELEVISIONS          |   2 |   9 | |           3 | TUBE                 |   3 |   4 | |           4 | LCD                  |   5 |   6 | |           5 | PLASMA               |   7 |   8 | |           6 | PORTABLE ELECTRONICS |  10 |  19 | |           7 | MP3 PLAYERS          |  11 |  14 | |           8 | FLASH                |  12 |  13 | |           9 | CD PLAYERS           |  15 |  16 | |          10 | 2 WAY RADIOS         |  17 |  18 | +-------------+----------------------+-----+-----+

由于 leftright 是MySQL的保留字,因此,字段名称用lft和rgt代替。每一个集合都是从lft开始到rgt结束,也就是集合的两个边界。

MySQL实现嵌套集合模型

在树中也同样适用,

MySQL实现嵌套集合模型

当为树状结构编号时,我们从左到右,一次一层,赋值按照从左到右的顺序遍历其子节点,这种方法称为 先序遍历算法

检索分层路径

由于子节点的lft值总在父节点的lft和rgt值之间,所以可以通过父节点连接到子节点上来检索整棵树。

SELECT node.name FROM nested_category AS node,  nested_category AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt  AND parent.name = 'ELECTRONICS' ORDER BY node.lft; +----------------------+ | name   | +----------------------+ | ELECTRONICS   | | TELEVISIONS   | | TUBE   | | LCD    | | PLASMA        | | PORTABLE ELECTRONICS | | MP3 PLAYERS   | | FLASH  | | CD PLAYERS    | | 2 WAY RADIOS  | +----------------------+</pre> 

这个方法并不需要考虑层数,而且不需要考虑节点的rgt。

检索所有叶子节点

由于每一个叶子节点的 rgt=lft+1 ,那么只需要这一个条件即可。

SELECT name FROM nested_category WHERE rgt = lft + 1;  +--------------+ | name         | +--------------+ | TUBE         | | LCD          | | PLASMA       | | FLASH        | | CD PLAYERS   | | 2 WAY RADIOS | +--------------+

检索节点路径

不再需要多个join连接操作。

SELECT parent.name FROM nested_category AS node,  nested_category AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt  AND node.name = 'FLASH' ORDER BY node.lft; +----------------------+ | name   | +----------------------+ | ELECTRONICS   | | PORTABLE ELECTRONICS | | MP3 PLAYERS   | | FLASH  | +----------------------+ 

检索节点深度

通过 COUNTGROUP BY 函数来获取父节点的个数。

SELECT node.name, (COUNT(parent.name) - 1) AS depth FROM nested_category AS node,  nested_category AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt GROUP BY node.name ORDER BY node.lft; +----------------------+-------+ | name   | depth | +----------------------+-------+ | ELECTRONICS   |     0 | | TELEVISIONS   |     1 | | TUBE   |     2 | | LCD    |     2 | | PLASMA        |     2 | | PORTABLE ELECTRONICS |     1 | | MP3 PLAYERS   |     2 | | FLASH  |     3 | | CD PLAYERS    |     2 | | 2 WAY RADIOS  |     2 | +----------------------+-------+ 

甚至可以得到分层的缩进结果,

SELECT CONCAT( REPEAT(' ', COUNT(parent.name) - 1), node.name) AS name FROM nested_category AS node,  nested_category AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt GROUP BY node.name ORDER BY node.lft; +-----------------------+ | name    | +-----------------------+ | ELECTRONICS    | |  TELEVISIONS   | |   TUBE  | |   LCD   | |   PLASMA       | |  PORTABLE ELECTRONICS | |   MP3 PLAYERS  | |    FLASH       | |   CD PLAYERS   | |   2 WAY RADIOS | +-----------------------+ 

检索子树的深度

考虑到检索中需要自连接的node或parent,因此需要增加一个额外的连接来作为子查询来限制子树。

SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth FROM nested_category AS node,  nested_category AS parent,  nested_category AS sub_parent,  (   SELECT node.name, (COUNT(parent.name) - 1) AS depth   FROM nested_category AS node,   nested_category AS parent   WHERE node.lft BETWEEN parent.lft AND parent.rgt   AND node.name = 'PORTABLE ELECTRONICS'   GROUP BY node.name   ORDER BY node.lft  )AS sub_tree WHERE node.lft BETWEEN parent.lft AND parent.rgt  AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt  AND sub_parent.name = sub_tree.name GROUP BY node.name ORDER BY node.lft; +----------------------+-------+ | name   | depth | +----------------------+-------+ | PORTABLE ELECTRONICS |     0 | | MP3 PLAYERS   |     1 | | FLASH  |     2 | | CD PLAYERS    |     1 | | 2 WAY RADIOS  |     1 | +----------------------+-------+ 

检索节点的直接子节点

假设一个场景,当用户点击网站上电子产品的一个分类时,将呈现该分类下的产品,同时需要列出所有子分类,并不是全部分类。

为了限制显示分类的层数,需要使用 HAVING 字句,

SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth FROM nested_category AS node,  nested_category AS parent,  nested_category AS sub_parent,  (   SELECT node.name, (COUNT(parent.name) - 1) AS depth   FROM nested_category AS node,    nested_category AS parent   WHERE node.lft BETWEEN parent.lft AND parent.rgt    AND node.name = 'PORTABLE ELECTRONICS'   GROUP BY node.name   ORDER BY node.lft  )AS sub_tree WHERE node.lft BETWEEN parent.lft AND parent.rgt  AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt  AND sub_parent.name = sub_tree.name GROUP BY node.name HAVING depth <= 1 ORDER BY node.lft; +----------------------+-------+ | name   | depth | +----------------------+-------+ | PORTABLE ELECTRONICS |     0 | | MP3 PLAYERS   |     1 | | CD PLAYERS    |     1 | | 2 WAY RADIOS  |     1 | +----------------------+-------+ 

增加新节点

上面已经介绍了如何检索结果,那么如何才能增加新的节点呢?

MySQL实现嵌套集合模型

如果希望在TELEVISIONS和PROTABLE ELECTRONICS节点之间增加一个新的节点,那么新节点的lft和rgt的值应该是10和11,那么所有大于10的节点(新节点右侧的节点)的lft和rgt都应该加2,如上图所示。

LOCK TABLE nested_category WRITE;  SELECT @myRight := rgt FROM nested_category WHERE name = 'TELEVISIONS';  UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myRight; UPDATE nested_category SET lft = lft + 2 WHERE lft > @myRight;  INSERT INTO nested_category(name, lft, rgt) VALUES('GAME CONSOLES', @myRight + 1, @myRight + 2);  UNLOCK TABLES

如果希望在叶子节点下增加节点,需要修改下查询语句,

LOCK TABLE nested_category WRITE;  SELECT @myLeft := lft FROM nested_category  WHERE name = '2 WAY RADIOS';  UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myLeft; UPDATE nested_category SET lft = lft + 2 WHERE lft > @myLeft;  INSERT INTO nested_category(name, lft, rgt) VALUES('FRS', @myLeft + 1, @myLeft + 2);  UNLOCK TABLES;```    ###删除节点  删除叶子节点比较容易,只需要删除自己,而删除一个中间层节点就需要删除其所有子节点。在这个模型中,所有子节点的节点正好在lft和rgt之间。 

LOCK TABLE nested_category WRITE;

SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1

FROM nested_category

WHERE name = 'GAME CONSOLES';

DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;

UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;

UNLOCK TABLES;

 在某些情况下,只需要删除某个节点,但是并不希望删除该节点下的子节点数据。 通过把右侧所有节点的左右值-2,当前节点的子节点左右值-1

LOCK TABLE nested_category WRITE;

SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1

FROM nested_category

WHERE name = 'PORTABLE ELECTRONICS';

DELETE FROM nested_category WHERE lft = @myLeft;

UPDATE nested_category SET rgt = rgt - 1, lft = lft - 1 WHERE lft BETWEEN @myLeft AND @myRight;

UPDATE nested_category SET rgt = rgt - 2 WHERE rgt > @myRight;

UPDATE nested_category SET lft = lft - 2 WHERE lft > @myRight;

UNLOCK TABLES;```

最后的思考

原作者推荐了一本名为《Joe Celko's Trees and Hierarchies in SQL for Smarties》的书籍,该书的作者是SQL领域的大神Joe Celko(嵌套几何模型的创造者)。这本书涵盖了本文中未涉及到的一些高级话题。

正文到此结束
Loading...