`

《Pro Oracle SQL》Chapter 8 -- 8.1-8.2 Anatomy of Analytic Functions -8.3

阅读更多

Riyaj Shamsudeen  (page 243)
    Online Analytic Processing (OLAP) queries perform multi-dimensional aggregation and are useful in
business decision-making processes in areas such as sales, marketing, and business intelligence. Fast
execution is important for OLAP queries as key decisions hinge on the outcome of OLAP queries and
reports. 
    在线分析处理(OLAP) 查询执行多维度的聚合,在诸如销售,市场,业务智能等领域的业务决策过程中是有用的。
对OLAP查询而言快速的执行是重要的,因为关键的决策取决于OLAP查询和报告的结果。
    Although conventional SQL statements can be used to implement OLAP queries, these statements
usually result in multiple self-joins, leading to poorly performing queries. In addition, multi-dimensional
analysis usually requires aggregation at various levels, and both aggregated and non-aggregated rows
must be returned. In the case of conventional SQL statements, fetching the aggregated and non-
aggregated values in the same row usually results in a multitude of self-joins. 
    虽然传统SQL语句能用于OLAP查询,这些语句通常会产生多次自连接,导致查询性能很糟糕。另一方面,多维分析通常需要在多个层次聚合,且聚合和非聚合的行都必须返回。若是传统的SQL语句,在同一行取出聚合的还有非聚合的值通常会导致多次自连接。
    Analytic function is a new class of functions introduced in Oracle Database version 9i and enhanced
in the later releases of Oracle Database. These functions provide the ability to reference values across
rows, multi-level aggregation, and granular control of sort order in a subset of data. In contrast to
grouping functions, analytic functions do not aggregate a result set into fewer rows. 
    分析函数是一新型函数,在Oracle 9i引入且在后续的版本中增强。这些函数具备跨行引用值,多层聚合和在子数据集中排序粒度控制的能力。相比分组函数,分析函数不是聚合结果集于几行。
    With the use of analytic functions, you can fetch both aggregated and non-aggregated values
without any self-joins. Consider that fetching the salary of an employee, with average salary by
department and average salary by location in the same row, would require multiple self-joins to
employees table. Analytic functions can be used to write this query without any self-joins.
    使用分析函数,你就能(同时)取聚合和非聚合的值而不需要自连接。考虑同一行中取一员工的薪水,部门的平均薪水,地点的平均薪水,将需要多次自连接employees表。分析函数能写这种查询而不需要任何自连接。
    Analytic functions are sometimes termed as window functions. Analytic functions perform
calculations within a subset of result set related to the current row by some means.  That subset can be
termed the window. 
    分析函数有时又被称之为窗口函数。分析函数通过某种方式在当前行相关的结果集的子集上执行计算。那些子集被称为“窗口”。


Example Data   例子数据
    To begin your investigation of the analytic SQL functions, you will create a denormalized fact table using the script in Listing 8-1. All the tables in this chapter refer to the objects in SH Schema supplied by
Oracle Corporation Example scripts.  
    为了开始你对分析SQL函数的研究,你要创建一非规范化 事实数据表,使用列表8-1中脚本。所有在本章中的表引用自Oracle公司例子脚本的SH模型下的对象。
■ NOTE  To install the Example schema, software can be downloaded from  http://download.oracle.com/otn/solaris/oracle11g/R2/solaris.sparc64_11gR2_examples.zip  for the 11gR2 Solaris platform. Refer to the
Readme document in the unzipped software directories for installation instructions. Zip files for other platforms and versions are also available at this Oracle web site. 
■ 注意,要安装例子模型,对11gR2 Solaris 平台而言可以从http://download.oracle.com/otn/solaris/oracle11g/R2 /solaris.sparc64_11gR2_examples.zip下载。参考解压缩目录中的Readme安装指导文档。在这个Oracle web站点还提供其他平台和版本的Zip文件。

Listing 8-1. Denormalized sales_fact Table    非规范化 的sales_fact表
drop table sales_fact;
CREATE table sales_fact AS 
SELECT country_name country,country_subRegion region, prod_name product, 
calendar_year year, calendar_week_number week, 
SUM(amount_sold) sale, 
sum(amount_sold*
  ( case   
         when mod(rownum, 10)=0 then 1.4
         when mod(rownum, 5)=0 then 0.6
         when mod(rownum, 2)=0 then 0.9
         when mod(rownum,2)=1 then 1.2
         else 1
    end )) receipts
FROM sales, times, customers, countries, products
WHERE sales.time_id = times.time_id AND
sales.prod_id = products.prod_id AND
sales.cust_id = customers.cust_id AND
customers.country_id = countries.country_id
GROUP BY 
country_name,country_subRegion, prod_name, calendar_year, calendar_week_number;

Anatomy of Analytic Functions    解剖分析函数
    Analytic  functions have three basic components: partitioning-clause, order-by-clause, and the
windowing clause.
Basic syntax of an analytic function is:
     分析函数有三个基本的组件:分区子句,排序子句,和开窗口子句。 一分析函数的基本语法如下:
function1 (argument1, argument2,..argumentN) 
   over ([partition-by-clause] [order-by-clause] [windowing-clause])

 
    Function1 is the analytic function to call which accepts zero or more arguments. The partitioning
clause groups the rows by partitioning column values. All rows with the same value for the partitioning
column are grouped as a data partition. 
    Function1是分析函数接收0个或多个参数。分区子句把按列值划分的行分组。所有具有相同分组值的行被归为一数据组。
    Operationally, rows are sorted by the partitioning columns and partitioned into data partitions. For
example, the SQL clause partition by product, country  partitions the data using the Product and
Country columns. Rows are sorted by both Product and Country columns and grouped into one
partition for each combination of product and country.
    运算时,行依照分区列分类且划分成数据组。例如,SQL子句partition by product,country使用列Product和Country列分区数据。对product和country的每一组合依照Product和 Country两列一并排序且归类成一组。
    The order-by clause sorts the rows in a data partition by a column or expression. In an analytic SQL
statement, the position of a row in the data partition is important and it is controlled by the order-by
clause. Rows are sorted by the sort columns within a data partition. Since the partitioning clause sorts
the rows by the partitioning columns, you actually end up with one sort that includes columns specified
in the partitioning clause and order-by clause.
    order-by 子句按列或表达式排序一数据区内的行。在分析SQL语句中,在数据分区中行的位置是关键的,且它是由order-by子句控制的。在数据分区组内行依照分 区列来分类。由于分区子句依照分区列来分类行,你最终得到一类(行)包含(所有)在分区子句和排序子句中指定的列。
    Sort order can be specified as ascending or descending order. Nulls can be specified to sort to the
top or bottom in a data partition using the clause  NULLS FIRST  or  NULLS LAST.
     分类顺序可悲指定为升序或降序。Null值可以指定排在数据分组的顶端还是末端,通过子句NULLS FIRST或NULLS LAST。
     The windowing clause specifies the subset of rows on which the analytic function operates. This
window can be dynamic and is aptly termed a sliding window . You can specify the top and bottom
boundary condition of the sliding window using the window specification clause. Syntax for the window
specification clause is:
     开窗子句指定行子集,在其上运算分析函数。这个窗口是动态的,可恰当的 称之为“滑动窗口”。通过使用窗口规范子句你能指定滑动窗口的上下边界条件。窗口规范子句的句法如下:
[ROWS | RANGE] BETWEEN <Start expr> AND <End expr>
Whereas 
<Start expr> is [UNBOUNDED PRECEDING | CURRENT ROW | n PRECEDING | n FOLLOWING]
<End expr> is [UNBOUNDED FOLLOWING | CURRENT ROW | n PRECEDING | n FOLLOWING]

 
    The keyword PRECEDING  specifies the top boundary condition, and the clause  FOLLOWING  or  CURRENT ROW  specifies the bottom boundary condition for the window. A sliding window provides the ability to compute complex metrics with ease. For example, you can compute the running sum of the Sale column by the clause rows between unbounded preceding and current row. In this example, the top row in the window is the first row in the current partition and the bottom row in the window is the current row. 
    关键字PRECEDING指定上边界条件,而子句FOLLOWING或CURRENT ROW指定窗口的下边界条件。滑动的窗口提供了轻松计算复杂矩阵的能力。例如,你能通过子句rows between unbounded preceding and current row 计算Sale列累计的和。在本例中,窗口的顶行是当前分区的第一行,而窗口的底行是当前行。


    ■ NOTE  The windowing clause is not supported by all analytic functions. 
    注意window子句不被所有的分析函数支持。


    Analytic functions may not be nested. But a nesting effect can be achieved by placing the
encompassing SQL statement in an inline view, and then applying analytic functions outside the view.
Analytic functions can be used in deeply nested inline views, too.
     分析函数可能不能嵌套。但是可以通过在内联视图中封装SQL取得嵌套的效果,然后在视图的外层应用分析函数,分析函数也能用于深层嵌套的内联视图。

   
List of Functions   函数的列表
The following section tabulates the analytic functions for easy reference.  下面一段将分析函数制成表以备方便索引。
Table 8-1.  Analytic Functions
Sl.      No                     Function  Description
1        Lag                    To access prior row in a partition or result set.
2        Lead                  To access later row in a partition or result set
3        First_value          To access first row in a partition or result set.
4        Last_value          To access last row in a partition or result set.
5        Nth_value           To access any arbitrary row in a partition or result set.
6        Rank                  To rank the rows in a sort order. Ranks are skipped in the case of ties.
                                   行按顺序分等级,如果相等等级被忽略。
7        Dense_rank        To rank the rows in a sort order. Ranks are not skipped in the case of ties.
                                    行按顺序分等级,如果相等等级不被忽略。
8        Row_number      To sort the rows and add unique number to each row. This is a non-deterministic function.
                                    排序行和对每一行加唯一数字。这是一个不确定的函数。
9        Ratio_to_report   To compute the ratio of value to the report.
10      Percent_rank      To compute the rank of value normalized to a value between 0 and 1.
                                   将秩值规范成0到1之间的一个值
11      Percentile_cont   To retrieve the value matching with the specified percent_rank. Reverse of percent_rank function.
                                   检索匹配特定的percent_rank值。percent_rank的反运算。
12      Percentile_dist    To retrieve the value matching with the specified percent_rank. Assumes discreet distribution model.
                                   检索匹配特定的percent_rank值。假设离散分布模型。
13      Ntile                   To group rows in to units.
14      Listagg               To convert column values from different rows in to a list format.
                                    从不同的行转换列值成一列表格式。

0
0
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics