`

《Pro Oracle SQL》10.3 Recursive Subqueries

阅读更多

Recursive Subqueries   递归子查询
    New to Oracle 11.2 is recursive subquery factoring (RSF for the remainder of this chapter). As you can
probably guess, the ANSI name for this feature is recursive common table expression. Regardless of what
you call it, Oracle has had a similar feature for a very long time in the form of the CONNECT BY clause of
the SELECTstatement. This feature has been enhanced in Oracle 11gR2.
    对Oracle 11.2陌生的是“递归子查询分解 ”(本章其余部分简称之为RSF )。你可能能猜出来,这个特性的ANSI名为“递归公用表表达式”。 无论你称它是什 么,Oracle已经有了一个很相似的,时间很长的特征,以SELECT语句的CONNECT BY子句的形式(为体现的)。该特征在Oracle 11gR2中得到了增强。
A CONNECT BY Example
    Let’s begin by looking at a traditional CONNECT BYquery such as in Listing 10-12. The emp inline view is
used to join the EMPLOYEE and DEPARTMENT tables, and then the single data set is presented to the SELECT… CONNECT BYstatement. The PRIORoperator is used to match the current EMPLOYEE_ID to rows where this value is in the MANAGER_ID column. Doing so iteratively creates a recursive query.
     让我们从传统的CONNECT BY查询,如列表10-12,开始查看。emp内联视图用于连接EMPLOYEE 和DEPARTMENT 表,接下来是由SELECT...CONNECT BY语句查出的单个数据集。PRIOR运算符用于匹配当前EMPLOYEE_ID到哪一行MANAGER_ID列。如此迭代创建一个递归查询
--------------------------------------------------------------
CONNECT BY was first available in Oracle Version 2, or in others words, from the very beginning.
CONNCET BY 在Oracle版本2开始有效,换句话说,非常早。
--------------------------------------------------------------
    Listing 10-12 contains a number of extra columns in the output to help explain how the PRIOR
operator works. Let’s take a look at the output beginning with the row for Lex De Haan. You can see that
the EMPLOYEE_ID for Lex is 102. The PRIOR operator will find all rows for which the MANAGER_ID is 102 and include them under the hierarchy for Lex De Haan. The only row that meets these criteria is the one for Alexander Hunold, with an EMPLOYEE_IDof 103. The process is then repeated for Alexander Hunold: are there any rows for which the MANAGER_IDis 103? There are four rows found with a MANAGER_IDof 103: those are for the employees Valli Pattaballa, Diana Lorentz, Bruce Ernst, and David Austin, so these are included in the output below Alexander Hunold. As there were no rows for which any of the EMPLOYEE_ID values for these four employees appears as a MANAGER_ID, Oracle moves back up to a level for which the rows have not yet been processed (in this case, for Alberto Errazuriz) and continues on to the end until all rows have been processed.
    列表10-12的输出包含很多额外的列帮助解释PRIOR运算符是如何工作的。让我们查看输出从行Lex De Haan开始。你可看出Lex的EMPLOYEE_ID是102。PRIOR 运算符将找出MANAGER_ID是102的所有行,再包含他们在Lex De Haan等级之下。 只有一行满足这个标准,是Alexander Hunold, EMPLOYEE_ID是103。这个过程接着在Alexander Hunold上重复:这里是否有任何行的MANAGER_ID是103?找到4行的 MANAGER_ID是103: 这些员工是 Valli Pattaballa, Diana Lorentz, Bruce Ernst, 和David Austin,因此输出在Alexander Hunold之下。因为这四个员工的EMPLOYEE_ID值不是任一行的MANAGER_ID,Oracle回到一没有处理过行的层上(在本例中是 Alberto Errazuriz)且持续到最后,直到所有的行处理完毕。
    The START WITH clause is instructed to begin with a value for which MANAGER_ID is null. As this is an
organizational hierarchy with a single person at the top of the hierarchy, this causes the query to start
with Stephen King.
As the CEO, Mr. King does not have a manager, so the MANAGER_ID column is set to
NULL for his row.
    START WITH子句指示MANAGER_ID的初始值是null。因为这是一个等级组织,某个人在这个等级的顶端,这使得查询开始于Stephen King。 CEO,King先生没有经理,因此这行的MANAGER_ID设成NULL。
    The LEVEL pseudocolumn holds the value for the depth of the recursion, allowing for a simple
method to indent the output so that the organizational hierarchy is visible.
    LEVEL伪列存有递归深度值,因此可用简单的方法缩进输出结果,使得组织的等级可见。

--Listing 10-12. Basic CONNECT BY
 select lpad(' ', level*2-1,' ') || emp.emp_last_name emp_last_name
 , emp.emp_first_name
 , emp.employee_id
 , emp.mgr_last_name, emp.mgr_first_name
 , emp.manager_id
 , department_name
 from (
 select /*+ inline gather_plan_statistics */
 e.last_name emp_last_name, e.first_name emp_first_name
 , e.employee_id, d.department_id
 , e.manager_id, d.department_name
 , es.last_name mgr_last_name, es.first_name mgr_first_name
 from hr.employees e
 left outer join hr.departments d on d.department_id = e.department_id
 left outer join hr.employees es on es.employee_id = e.manager_id
 ) emp
 connect by prior emp.employee_id = emp.manager_id
 start with emp.manager_id is null
 order siblings by emp.emp_last_name;

 

EMP_LAST_NAME                  EMP_FIRST_NAME       EMPLOYEE_ID MGR_LAST_NAME        MGR_FIRST_NAME    MANAGER_ID DEPARTMENT_NAME
------------------------------ -------------------- ----------- -------------------- -------------------- ---------- ----------------------------
 King                          Steven                       100                                              Executive
   Cambrault                   Gerald                       148 King                 Steven              100 Sales
     Bates                     Elizabeth                    172 Cambrault            Gerald              148 Sales
     Bloom                     Harrison                     169 Cambrault            Gerald              148 Sales
     Fox                       Tayler                       170 Cambrault            Gerald              148 Sales
     Kumar                     Sundita                      173 Cambrault            Gerald              148 Sales
     Ozer                      Lisa                         168 Cambrault            Gerald              148 Sales
     Smith                     William                      171 Cambrault            Gerald              148 Sales
   De Haan                     Lex                          102 King                 Steven              100 Executive
     Hunold                    Alexander                    103 De Haan              Lex                 102 IT
       Austin                  David                        105 Hunold               Alexander           103 IT
       Ernst                   Bruce                        104 Hunold               Alexander           103 IT
       Lorentz                 Diana                        107 Hunold               Alexander           103 IT
       Pataballa               Valli                        106 Hunold               Alexander           103 IT

...

 

The Example Using an RSF
    The example query on the EMPLOYEES table has been rewritten in Listing 10-13 to use RSF, where the
main subquery is emp_recurse. The anchor member in this case simply selects the top most row in the
hierarchy by selecting the only row where MANAGER_ID IS NULL. This is equivalent to START WITH
EMP.MANAGER_ID IS NULL in Listing 10-12. The recursive member references the defining query
emp_recurse by joining it to emp query.
This join is used to locate the row corresponding to each
employee’s manager, which is equivalent to CONNECT BY PRIOR EMP.EMPLOYEE_ID = EMP.MANAGER_ID in
Listing 10-12. The results in Listing 10-13 are identical to those in Listing 10-12.

    查询EMPLOYEES表的例子在列表10-13中使用RSF重写了,其中主要的子查询是emp_recurse。在该例中“锚成员”选择等级结构最上的行,通过选择仅有的一行MANAGER_ID IS NULL。这等价于列表10-12中的START WITH EMP.MANAGER_ID IS NULL。“递归成员”通过连接定义的查询emp_recurse到emp实现引用。 这个连接用于定位每个员工的经理的相应行,等价于列表10-12中的CONNECT BY PRIOR EMP.EMPLOYEE_ID = EMP.MANAGER_ID。列表10-13的结果与列表10-12的一致。

Listing 10-13. Basic Recursive Subquery Factoring
1 with emp as (
2 select /*+ inline gather_plan_statistics */
3 e.last_name, e.first_name, e.employee_id, e.manager_id, d.department_name
4 from hr.employees e
5 left outer join hr.departments d on d.department_id = e.department_id
6 ),
emp_recurse(last_name,first_name,employee_id,manager_id,department_name,lvl) as (
8 select e.last_name, e.first_name
9 , e.employee_id, e.manager_id
10 , e.department_name, 1 as lvl
11 from emp e where e.manager_id is null
12  union all
13 select emp.last_name, emp.first_name
14 , emp.employee_id, emp.manager_id
15 ,emp.department_name, empr.lvl + 1 as lvl
16 from emp
17 join emp_recurse  empr on empr.employee_id = emp.manager_id
18 )
19 search depth first by last_name set order1
20 select lpad(' ', lvl*2-1,' ') || er.last_name last_name

21 , er.first_name
22 , er.department_name
23 from emp_recurse er;

LAST_NAME                      FIRST_NAME           DEPARTMENT_NAME
------------------------------ -------------------- ------------------------------
 King                          Steven               Executive
   Cambrault                   Gerald               Sales
     Bates                     Elizabeth            Sales
     Bloom                     Harrison             Sales
     Fox                       Tayler               Sales
     Kumar                     Sundita              Sales
     Ozer                      Lisa                 Sales
     Smith                     William              Sales
   De Haan                     Lex                  Executive
     Hunold                    Alexander            IT
       Austin                  David                IT
       Ernst                   Bruce                IT
       Lorentz                 Diana                IT
       Pataballa               Valli                IT
   Errazuriz                   Alberto              Sales
     Ande                      Sundar               Sales
...
107 rows selected.

 

    While the new RSF method may at first appear verbose, the basis of how it works is simpler to
understand than CONNECT BY and allows for more complex queries. The recursive WITH clause requires
two query blocks, the anchor member and the recursive member. These two query blocks must be
combined with the UNION ALL set operator. The anchor member is the query prior to the UNION ALL,
while the recursive member is the query following. The recursive member must reference the defining
subquery— by doing so, it is recursive.

    虽然新的RSF方法可能初看上去显得冗长,它如何工作的基本原理较CONNECT BY更易于理解,且用于更复杂的查询。递归的WITH子句需要两个查询块,“锚成员”和“递归成员”。这两个查询块必须通过UNION ALL集合运算符连接在一块。锚成员是在UNION ALL之前的查询,而递归成员是接下来的查询块。递归成员必须引用定义的子查询--通过这样做,实现递归。


Restrictions on RSF
    As you might imagine, the use of RSF is quite a bit more flexible than CONNECT BY. There are some
restrictions on its use, however. As per the 11gR2 documentation for the SELECT statement, the
following elements cannot be used in the recursive member of an RSF:
•  The DISTINCT keyword or a GROUP BY clause
•  The model_clause
•  An aggregate function. However, analytic functions are permitted in the select list.
•  Subqueries that refer to query_name.
•  Outer joins that refer to query_nameas the right table.

   正如你可能想到,使用RSF比起CONNECT BY更为灵活。然而,对它也有些限制。依照 11gR2对SELECT语句的文档,下面的因素不能用于某一RSF的递归成员:

•  DISTINCT 关键字或 GROUP BY 子句
•  model子句
•  聚合函数,然而,分析函数允许放在SELECT列表中。
•  引用查询名称的子查询
•  引用查询名称作为右表的外连接。

 

Differences from CONNECT BY
    There are several differences when using RSF as compared to CONNECT BY, and some of them are
apparent in Listing 10-13. You may have wondered what happened to the LEVEL pseudocolumn, as it is
missing in this query, replaced by the LVL column. I’ll get to that one a little later on. Also notice that the
columns returned by an RSF query must be specified in the query definition as seen in line 7 of Listing
10-13. One more new feature is the SEARCH DEPTH FIRST seen on line 19. The default search is BREADTH
FIRST, which is not usually the output you want from a hierarchical query. Listing 10-14 shows the
output when the SEARCH clause is not used or it is set to BREADTH FIRST. This search returns rows of all
siblings at each level before returning any child rows. Specifying SEARCH DEPTH FIRST will return the
rows in hierarchical order.
The SET ORDER1 portion of the SEARCH clause sets the value of the ORDER1
pseudocolumn to the value of the order the rows are returned in, similar to what you might see with
ROWNUM, but you get to name the column. This will also be used in later examples.

    使用RSF相比CONNECT BY也有一些不同之处,其中的一些在列表10-13中有所体现。你可能奇怪LEVEL伪列怎么了,在这个查询中它不见了,而是用LVL列代替了。这点我之后解释。也请注意由某一RSF查询返回的列必须在查询定义中指定,如在列表10-13的第7行所见。还有一个新的特征是在行19中所见的SEARCH DEPTH FIRST。默认搜索是BREADTH FIRST,通常不是你想要从一等级查询中(返回的)输出。列表10-14展示了当SEARCH子句没有使用或者它被设置成 BREADTH FIRST的输出结果。搜索在返回所有子行之前返回各层所有同辈行。指定SEARCH DEPTH FIRST将按等级顺序返回行。 SEARCH子句的SET ORDER1 部分设置ORDER1伪列的值为返回的行顺序的值,类似于你可能见过的ROWNUM,但是你有列名。这也将在后续的例子中用到。


Listing 10-14. Default BREADTH FIRST Search

search breadth first by last_name set order1
select lpad(' ', lvl*2-1,' ') || er.last_name last_name


LAST_NAME                      FIRST_NAME           DEPARTMENT_NAME
------------------------------ -------------------- ------------------------------
 King                          Steven               Executive
   Cambrault                   Gerald               Sales
   De Haan                     Lex                  Executive
   Errazuriz                   Alberto              Sales
   Fripp                       Adam                 Shipping
   Hartstein                   Michael              Marketing
   Kaufling                    Payam                Shipping
   Kochhar                     Neena                Executive
   Mourgos                     Kevin                Shipping
   Partners                    Karen                Sales
   Raphaely                    Den                  Purchasing
   Russell                     John                 Sales
   Vollman                     Shanta               Shipping
   Weiss                       Matthew              Shipping
   Zlotkey                     Eleni                Sales
     Abel                      Ellen                Sales
     Ande                      Sundar               Sales
     Atkinson                  Mozhe                Shipping
     Baer                      Hermann              Public Relations
     Baida                     Shelli               Purchasing
     Banda                     Amit                 Sales
     Bates                     Elizabeth            Sales

...


    Notice that the SEARCH clause as it is used in Figures 10-13 and 10-14 specifies that the search be by
LAST_NAME. This could also be by FIRST_NAME, or by a column list, such as LAST_NAME,FIRST_NAME. Doing so controls the order of the rows within each level. The SEARCH clause ends with SET ORDER1. This
effectively adds the ORDER1 pseudocolumn to the column list returned by the recursive subquery.
You
will see it used more in some of the following examples.

    注意在图10-13和10-14中使用的SEARCH子句指定搜索是 by LAST_NAME。也可以是by FIRST_NAME,或者是一列的列表,如LAST_NAME,FIRST_NAME。 这样做控制每层行的顺序。 SEARCH子句以SET ORDER1结尾。ORDER1伪列加到列的列表之后可提高递归子查询的效率。 你将在后续的一些例子中看到它更多的应用。

 

1
7
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics