转载

通过自关联替代开窗函数实现SQL优化

数据库环境:SQL SERVER 2005

我们平时都是通过开窗析函数来取代自关联,减少表扫描,从而优化SQL。今天,反其道而行,

用自关联改写开窗函数。我们先来看一下SQL。

原SQL语句

通过自关联替代开窗函数实现SQL优化
SELECT  *  FROM    ( SELECT    MIN(a.line_no) OVER ( PARTITION BY a.id_clerk ) AS line_no ,   a.dslyipt_no ,   a.int_year ,   a.int_period ,   a.id_scheme ,   d.name_scheme ,   a.id_schhelp ,   e.name_schhelp ,   a.id_dept ,   b.name_dept ,   a.id_clerk ,   c.name_clerk ,   a.id_slyitem ,   a.name_slyitem ,   a.dec_value  FROM      dslyipt_03 a ,   ctlm1003 b ,   ctlm1007 c ,   ctlm7201 d ,   ctlm7219 e  WHERE     a.id_dept = b.id_dept   AND a.id_clerk = c.id_clerk   AND a.id_scheme = d.id_scheme   AND a.id_schhelp = e.id_schhelp   AND a.id_scheme = e.id_scheme         ) dslyipt_03 WHERE   dslyipt_no IN ( '201509000169' ) ------------------------相关表的数据量---------------------------------------- SELECT COUNT(*) FROM dslyipt_03 --2321920 SELECT COUNT(*) FROM dslyipt_03 WHERE   dslyipt_no IN ( '201509000169' )--16 SELECT COUNT(*) FROM ctlm1003 --125 SELECT COUNT(*) FROM ctlm1007 --11986 SELECT COUNT(*) FROM ctlm7201 --16 SELECT COUNT(*) FROM ctlm7219 --32 
View Code

dslyipt_03表是大表,有2321920条数据,过滤之后只有16条数据,原SQL总共执行了30多秒才出结果,

因此可以判定,SQL还有优化的余地。

对2KW的表进行开窗,慢是有原因的。那么,我们是否可以先过滤再开窗呢?我们来改一下SQL,改写之后

的SQL如下:

通过自关联替代开窗函数实现SQL优化
SELECT    MIN(a.line_no) OVER ( PARTITION BY a.id_clerk ) AS line_no ,  a.dslyipt_no ,  a.int_year ,  a.int_period ,  a.id_scheme ,  d.name_scheme ,  a.id_schhelp ,  e.name_schhelp ,  a.id_dept ,  b.name_dept ,  a.id_clerk ,  c.name_clerk ,  a.id_slyitem ,  a.name_slyitem ,  a.dec_value FROM      dslyipt_03 a ,  ctlm1003 b ,  ctlm1007 c ,  ctlm7201 d ,  ctlm7219 e WHERE     a.id_dept = b.id_dept  AND a.id_clerk = c.id_clerk  AND a.id_scheme = d.id_scheme  AND a.id_schhelp = e.id_schhelp  AND a.id_scheme = e.id_scheme  AND a.dslyipt_no IN ( '201509000169' ) 
View Code

嗖的一下,数据就出来了!但是,再核对一下数据,发现字段line_no的数据对不上,说明改错了。改写

后对line_no开窗的范围变小了,所以数据有问题。

现在通过自关联来实现开窗函数的功能,自关联改写的SQL如下:

通过自关联替代开窗函数实现SQL优化
SELECT  f.line_no ,  a.dslyipt_no ,  a.int_year ,  a.int_period ,  a.id_scheme ,  d.name_scheme ,  a.id_schhelp ,  e.name_schhelp ,  a.id_dept ,  b.name_dept ,  a.id_clerk ,  c.name_clerk ,  a.id_slyitem ,  a.name_slyitem ,  a.dec_value FROM    dslyipt_03 a ,  ctlm1003 b ,  ctlm1007 c ,  ctlm7201 d ,  ctlm7219 e ,  ( SELECT    a.id_clerk ,       MIN(a.line_no) AS line_no    FROM      dslyipt_03 a    GROUP BY  a.id_clerk  ) f WHERE   a.id_dept = b.id_dept  AND a.id_clerk = c.id_clerk  AND a.id_scheme = d.id_scheme  AND a.id_schhelp = e.id_schhelp  AND a.id_scheme = e.id_scheme  AND a.id_clerk = f.id_clerk  AND a.dslyipt_no IN ( '201509000169' ) OPTION(HASH JOIN) 
View Code

在语句后面,我们通过查询提示,建议SQL走哈希连接,1S之内就出结果了,经核对,数据无误!

我们来对比一下改写前后TIME、IO

改写前

改写后

正文到此结束
Loading...