Слайд 1
Lecture 8-10
Optimization.
New DML operators
CTE- expressions
CTE- expressions are for
easier queries readability
Single CTE call
CTE use for recursive tree-walking
PIVOT & UNPIVOT operators
PIVOT operator use
UNPIVOT operator use
APPLY operator
Function creation
CROSS- union query
OUTER- union query
TOP
Subject: “Database Management Systems 2”
Instructor’s full name: Dr. Lyazat Naizabayeva
Слайд 2
Common Table Expression /CTE/
The CTE was introduced into standard
SQL in order to simplify various classes of SQL Queries
for which a derived table just wasn't suitable.
Introduced in SQL Server 2005, the common table expression (CTE) is a temporary named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement.
You can also use a CTE in a CREATE VIEW statement, as part of the view’s SELECT query.
In addition, as of SQL Server 2008, you can add a CTE to the new MERGE statement.
SQL Server supports two types of CTEs
recursive and nonrecursive.
Слайд 3You define CTEs by adding a WITH clause directly before
your SELECT, INSERT, UPDATE, DELETE, or MERGE statement.
The WITH
clause can include one or more CTEs, as shown in the following syntax:
Working with Common Table Expressions
[WITH [,...]]
::=
cte_name [(column_name [,...])]
AS (cte_query)
...which can be represented like this...
Слайд 4WITH
cteTotalSales (SalesPersonID, NetSales)
AS
(
SELECT SalesPersonID, ROUND(SUM(SubTotal), 2)
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY
SalesPersonID
)
SELECT
sp.FirstName + ' ' + sp.LastName AS FullName,
sp.City + ', ' + StateProvinceName AS Location,
ts.NetSales
FROM Sales.vSalesPerson AS sp
INNER JOIN cteTotalSales AS ts
ON sp.BusinessEntityID = ts.SalesPersonID
ORDER BY ts.NetSales DESC
Creating a Nonrecursive Common Table Expression
Слайд 5After I specify the CTE name, I provide two column
names, SalesPersonID and NetSales, which are enclosed in parentheses and
separated by a comma.
Next, I provide the AS keyword, then a set of parentheses that enclose the CTE query. In this case, the SELECT statement returns the total sales for each sales person (total sales grouped by salesperson ID).
I can now reference cteTotalSales in the statement that immediately follows. For this example, I create a SELECT statement that joins the Sales.vSalesPerson view to cteTotalSales, based on the salesperson ID.
I then pull the names and locations from the view and the net sales from the CTE.
Слайд 6The following table shows the results returned by this statement.
Слайд 7Creating a Recursive Common Table Expression
A recursive CTE is one
that references itself within that CTE. The recursive CTE is
useful when working with hierarchical data because the CTE continues to execute until the query returns the entire hierarchy.
Note that a CTE created incorrectly could enter an infinite loop. To prevent this, you can include the MAXRECURSION hint in the OPTION clause of the primary SELECT, INSERT, UPDATE, DELETE, or MERGE statement. For information about using query hints, see the topic “Query Hints (Transact-SQL)” in SQL Server Books Online.
Слайд 8IF OBJECT_ID('Employees', 'U') IS NOT NULL
DROP TABLE dbo.Employees
GO
CREATE TABLE dbo.Employees
(
EmployeeID int NOT NULL PRIMARY KEY,
FirstName varchar(50) NOT NULL,
LastName varchar(50) NOT NULL,
ManagerID int NULL
)
GO
INSERT INTO Employees VALUES (101, 'Ken', 'Sánchez', NULL)
INSERT INTO Employees VALUES (102, 'Terri', 'Duffy', 101)
INSERT INTO Employees VALUES (103, 'Roberto', 'Tamburello', 101)
INSERT INTO Employees VALUES (104, 'Rob', 'Walters', 102)
INSERT INTO Employees VALUES (105, 'Gail', 'Erickson', 102)
INSERT INTO Employees VALUES (106, 'Jossef', 'Goldberg', 103)
INSERT INTO Employees VALUES (107, 'Dylan', 'Miller', 103)
INSERT INTO Employees VALUES (108, 'Diane', 'Margheim', 105)
INSERT INTO Employees VALUES (109, 'Gigi', 'Matthew', 105)
INSERT INTO Employees VALUES (110, 'Michael', 'Raheem', 106)
Слайд 9As you might realize, the AdventureWorks2008 database already includes the
HumanResources.Employee table.
However, that table now uses the hierarchyid data
type to store hierarchical data, which would introduce unnecessary complexity when trying to demonstrate a recursive CTE. For that reason, I created my own table.
However, if you want to try out a recursive CTE without creating and populating a new table, you can use the AdventureWorks sample database that shipped with SQL Server 2005.
The HumanResources.Employee table in that database stores the data in a way similar to the table I create above.
After I created the Employees table, I created the following SELECT statement, which is preceded by a WITH clause that includes a CTE named cteReports:
Слайд 10WITH
cteReports (EmpID, FirstName, LastName, MgrID, EmpLevel)
AS
(
SELECT
EmployeeID, FirstName, LastName, ManagerID, 1
FROM Employees WHERE ManagerID IS
NULL
UNION ALL
SELECT e.EmployeeID, e.FirstName, e.LastName, e.ManagerID, r.EmpLevel + 1
FROM Employees e
INNER JOIN cteReports r ON e.ManagerID = r.EmpID
)
SELECT
FirstName + ' ' + LastName AS FullName, EmpLevel,
(SELECT FirstName + ' ' + LastName FROM Employees
WHERE EmployeeID = cteReports.MgrID) AS Manager
FROM cteReports
ORDER BY EmpLevel, MgrID
Слайд 11The CTE query is itself made up of two SELECT
statements, connected with the UNION ALL operator.
A recursive CTE
query must contain at least two members (statements), connected by the UNION ALL, UNION, INTERSECT, or EXCEPT operator.
In this example, the first SELECT statement is the anchor member, and the second statement is the recursive member.
All anchor members must precede the recursive members, and only the recursive members can reference the CTE itself. In addition, all members must return the same number of columns with corresponding data types.
Слайд 12Now lets look closer at the statements themselves.
The first
statement, the anchor member, retrieves the employee ID, first name,
last name, and manager ID from the Employees table, where the manager ID is null.
This would be the employee at the top of the hierarchy, which means this person reports to no one. Consequently, the manager ID value is null.
To reflect that this person is at the top of the hierarchy, I assign a value of 1 to the EmpLevel column.
Слайд 13The second statement in the CTE query—the recursive member—also retrieves
the employee ID, first name, last name, and manager ID
for employees in the Employees table.
However, notice that I join the Employees table to the CTE itself. In addition, the join is based on the manager ID in the Employees table and the employee ID in the CTE.
By doing this, the CTE will loop through the Employees table until it returns the entire hierarchy.
Слайд 14After I define my WITH clause, I create a SELECT
statement that retrieves the data from the CTE.
Note, however,
that for the Manager column, I retrieve the first and last name of the employee associated with the manager ID in the CTE.
This allows me to display the full name of the manager for each employee.
The following table shows the result set returned by the SELECT statement and its CTE.
Слайд 16As you can see, the CTE, whether recursive or nonrecursive,
can be a useful tool when you need to generate
temporary result sets that can be accessed in a SELECT, INSERT, UPDATE, DELETE, or MERGE statement.
In a sense, a CTE is like a derived table: it’s not stored as an object and is valid only during the execution of the primary statement.
However, unlike the derived table, a CTE can be referenced multiple times within a query and it can be self-referencing. And best of all, CTEs are relatively easy to implement.
Слайд 17Example2
USE AdventureWorks2008R2;
GO
-- Define the CTE expression name and column list.
WITH
Sales_CTE (SalesPersonID, SalesOrderID, SalesYear)
AS
-- Define the CTE query.
(
SELECT SalesPersonID, SalesOrderID, YEAR(OrderDate) AS SalesYear FROM Sales.SalesOrderHeader WHERE SalesPersonID IS NOT NULL
)
-- Define the outer query referencing the CTE name.
SELECT SalesPersonID, COUNT(SalesOrderID) AS TotalSales, SalesYear
FROM Sales_CTE
GROUP BY SalesYear, SalesPersonID
ORDER BY SalesPersonID, SalesYear;
GO
Слайд 18
The following example shows the components of the CTE structure:
expression name, column list, and query.
The CTE expression Sales_CTE
has three columns (SalesPersonID, SalesOrderID, and OrderDate) and is defined as the total number of sales orders per year for each salesperson.
Here is a partial result set:
SalesPersonID TotalSales SalesYear
------------- ----------- -----------
274 4 2001
274 20 2002
274 14 2003
274 10 2004
275 56 2001
275 139 2002
275 169 2003
Слайд 19Example3
Say, for instance, you have a query like this:
SELECT
* FROM (
SELECT A.Address, E.Name, E.Age From Address A
Inner join Employee E on E.EID = A.EID) T
WHERE T.Age > 50
ORDER BY T.NAME
The query looks really a mess. Even if I need to write something that wraps around the entire query, it would gradually become unreadable. CTE allows you to generate Tables beforehand and use it later when we actually bind the data into the output.
Слайд 20Rewriting the query using CTE expressions would look like:
With
T(Address, Name, Age) --Column names for Temporary table
AS
(
SELECT A.Address, E.Name,
E.Age from Address A
INNER JOIN EMP E ON E.EID = A.EID
)
SELECT * FROM T --SELECT or USE CTE temporary Table
WHERE T.Age > 50
ORDER BY T.NAME
Yes as you can see, the second query is much more readable using CTE. You can specify as many query expressions as you want and the final query which will output the data to the external environment will eventually get reference to all of them.
Слайд 21With T1(Address, Name, Age) --Column names for Temporary table
AS
(
SELECT A.Address,
E.Name, E.Age from Address A
INNER JOIN EMP E ON E.EID
= A.EID
),
T2(Name, Desig)
AS
(
SELECT NAME, DESIG FROM Designation)
SELECT T1.*, T2.Desig FROM T1 --SELECT or USE CTE temporary Table
WHERE T1.Age > 50 AND T1.Name = T2.Name
ORDER BY T1.NAME
So the queries are separated using commas. So basically you can pass as many queries as you want and these queries will act as a subqueries, getting you the data and name it as a temporary table in the query.
Слайд 22CTEs used to recursively enumerate hierarchical data.
WITH ShowMessage(STATEMENT, LENGTH)
AS
(
SELECT
STATEMENT = CAST('I Like ' AS VARCHAR(300)), LEN('I Like ')UNION
ALLSELECT
CAST(STATEMENT + 'CodeProject! ' AS VARCHAR(300))
, LEN(STATEMENT) FROM ShowMessage
WHERE LENGTH < 300
)
SELECT STATEMENT, LENGTH FROM ShowMessage
Слайд 24Pivot Table
Pivot tables are used to summarize and display
the data, specially in case of report data by means
of aggregating the values.
Pivot table can be used if we want to display the unique values of the column of a table as the columns of another table.
It turns the unique values of a specified column into another table columns.
Слайд 25The syntax for the Pivot is given below:
SELECT non-pivoted column,
firstpivotedcolumn AS column name,
secondpivotedcolumn AS column name,
lastpivotedcolumn AS column name
FROM
(SELECT query that produces the data>)
AS aliasforsourcequery
PIVOT
(
aggregation function(column being aggregated)
FOR
column that contains the values that will become column headers
IN ( firstpivotedcolumn,secondpivotedcolumn,
last pivoted column)
)
AS aliasforthepivottable (optional ORDER BY clause)
Слайд 26For example, suppose we have a table called tbl_student which
contains the columns studentname, grade and marks.
Слайд 27SELECT studentname, [I], [II], [III], [IV] , [V]
FROM
tbl_student)
AS sourcetable
PIVOT
( AVG(marks)
FOR grade in ([I],[II],[III],[IV],[V]))
AS
PIVOTABLE
ORDER BY
V desc,IV desc,III desc,II desc,I desc
Слайд 28Unpivot table
Unpivot table is reverse of Pivot table as
it rotate the columns of a table into the value
of a column.
For example, suppose we have a table say tbl_stdmarksdata whose structure us given below:
Слайд 29The Query for the Unpivot table will be as follow:
SELECT
studentname,Marks,Grade
FROM tbl_stdmarksdata
UNPIVOT
(
Marks for Grade in (I,II,III,IV,V)
)
AS tblunpvt
Слайд 31Example2
There are two ways to pivot data:
We can convert multiple
rows into a single row with repeating groups of columns.
We
can convert columns on a single row into multiple rows with a generic column and a row type discriminator.
CREATE TABLE Sales (State CHAR(2), SalesAmt DECIMAL(18,2))
INSERT INTO Sales VALUES ('ND',10000)
INSERT INTO Sales VALUES ('SD',30000)
INSERT INTO Sales VALUES ('TN',2500.50)
INSERT INTO Sales VALUES ('OR',5500.50)
INSERT INTO Sales VALUES ('VA',6500.50)
INSERT INTO Sales VALUES ('SD',7000)
INSERT INTO Sales VALUES ('ND',8000)
SELECT * FROM Sales
Слайд 33GO
SELECT [ND],[SD],[TN],[OR],[VA]
FROM (SELECT State,SalesAmt FROM Sales) p
PIVOT
(
SUM (SalesAmt)
FOR State IN
([ND],[SD],[TN],[OR],[VA])
) AS pvt
Run the following query,
Here is the result set,
Слайд 34UNPIVOT performs almost the reverse operation of PIVOT, by rotating
columns into rows.
Use the following scripts,
CREATE TABLE StudentMarks(
[Name] VARCHAR(50),
Subject1 VARCHAR(10),
Mark1 INT,
Subject2 VARCHAR(10),
Mark2 INT,
Subject3 VARCHAR(10),
Mark3 INT)
Слайд 35
INSERT INTO StudentMarks([Name],Subject1,Mark1,Subject2,Mark2,Subject3,Mark3)
VALUES('AAA','Science',98,'Maths',89,'English',76)
INSERT INTO StudentMarks([Name],Subject1,Mark1,Subject2,Mark2,Subject3,Mark3)
VALUES('XXX','Biology',78,'Chemistry',85,'Physics',67)
INSERT INTO StudentMarks([Name],Subject1,Mark1,Subject2,Mark2)
VALUES('YYY','Batany',60,'Zoology',54)
INSERT INTO StudentMarks([Name],Subject1,Mark1,Subject2,Mark2)
VALUES('ZZZ','Maths',67,'Physics',78)
SELECT * FROM StudentMarks
Слайд 37
SELECT [Name], SubjectName,
case when Subject='Subject1' then Mark1
when Subject='Subject2' then Mark2
when Subject='Subject3' then Mark3
else NULL end as Marks
FROM
(
SELECT [Name], Subject1,Mark1, Subject2,Mark2, Subject3,Mark3
FROM StudentMarks
) p
UNPIVOT
(
SubjectName FOR Subject IN
(Subject1, Subject2, Subject3)
)
AS unpvt;
Слайд 39Note that UNPIVOT is not the exact reverse of PIVOT.
PIVOT performs an aggregation and hence merges possible multiple rows
into a single row in the output.
UNPIVOT does not reproduce the original table-valued expression result because rows have been merged.
Besides, NULL values in the input of UNPIVOT disappear in the output, whereas there may have been original NULL values in the input before the PIVOT operation.
Слайд 40The APPLY operator
Problem
SQL Server 2005 introduced the APPLY operator,
which is very much like a join clause and which
allows joining between two table expressions i.e. joining a left/outer table expression with a right/inner table expression.
The difference between join and APPLY operator becomes evident when you have a table-valued expression on the right side and you want this table-valued expression to be evaluated for each row from the left table expression.
Слайд 41Solution
The APPLY operator allows you to join two table expressions;
the right table expression is processed every time for each
row from the left table expression.
As you might have guessed, the left table expression is evaluated first and then right table expression is evaluated against each row of the left table expression for final result-set.
The final result-set contains all the selected columns from the left table expression followed by all the columns of right table expression.
Слайд 42The APPLY operator comes in two variants, the CROSS APPLY
and the OUTER APPLY.
The CROSS APPLY operator returns only
those rows from left table expression (in its final output) if it matches with right table expression. In other words, the right table expression returns rows for left table expression match only.
Whereas the OUTER APPLY operator returns all the rows from left table expression irrespective of its match with the right table expression.
For those rows for which there are no corresponding matches in right table expression, it contains NULL values in columns of right table expression.
So you might now conclude, the
CROSS APPLY is semantically equivalent to INNER JOIN (or to be more precise its like a CROSS JOIN with a correlated sub-query) with a implicit join condition of 1=1 whereas
OUTER APPLY is semantically equivalent to LEFT OUTER JOIN.
Слайд 43SELECT * FROM Department D
CROSS APPLY
(
SELECT * FROM Employee E
WHERE E.DepartmentID = D.DepartmentID
) A
GO
SELECT * FROM Department D
INNER JOIN Employee E ON D.DepartmentID = E.DepartmentID
GO
CROSS APPLY and INNER JOIN
Слайд 44If you look at the results they produced, it is
the exact same result-set; not only that even the execution
plan for these queries are similar to each other and has equal query cost, as you can see in the image below. So what is the use of APPLY operator? How does it differ from a JOIN and how does it help in writing more efficient queries. I will discuss this later, but first let me show you an example of OUTER APPLY also.
Слайд 45SELECT * FROM Department D
OUTER APPLY
(
SELECT * FROM Employee E
WHERE E.DepartmentID = D.DepartmentID
) A
GO
SELECT * FROM Department D
LEFT OUTER JOIN Employee E ON D.DepartmentID = E.DepartmentID
GO
OUTER APPLY and LEFT OUTER JOIN
Слайд 46Even though the above two queries return the same information,
the execution plan is a bit different. Although cost wise
there is not much difference, the query with the OUTER APPLY uses a Compute Scalar operator (which has an estimated operator cost of 0.0000103 or almost 0% of total query cost) before Nested Loops operator to evaluate and produce the columns of Employee table.
Слайд 47Now comes the time to see where the APPLY operator
is really required. In Script #4, I am creating a
table-valued function which accepts DepartmentID as its parameter and returns all the employees who belong to this department.
The next query selects data from Department table and uses CROSS APPLY to join with the function we created.
It passes the DepartmentID for each row from the outer table expression (in our case Department table) and evaluates the function for each row similar to a correlated subquery.
The next query uses the OUTER APPLY in place of CROSS APPLY and hence unlike CROSS APPLY which returned only correlated data, the OUTER APPLY returns non-correlated data as well, placing NULLs into the missing columns.
Слайд 48APPLY with table-valued function
IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[fn_GetAllEmployeeOfADepartment]') AND type IN (N'IF'))
BEGIN
DROP FUNCTION dbo.fn_GetAllEmployeeOfADepartment
END
GO
CREATE FUNCTION dbo.fn_GetAllEmployeeOfADepartment(@DeptID AS INT)
RETURNS TABLE
AS
RETURN
(
SELECT * FROM Employee E
WHERE E.DepartmentID = @DeptID
)
GO
SELECT * FROM Department D
CROSS APPLY dbo.fn_GetAllEmployeeOfADepartment(D.DepartmentID)
GO
SELECT * FROM Department D
OUTER APPLY dbo.fn_GetAllEmployeeOfADepartment(D.DepartmentID)
GO
Слайд 50So now if you are wondering, can we use a
simple join in place of the above queries?
Then the
answer is NO, if you replace CROSS/OUTER APPLY in the above queries with INNER JOIN/LEFT OUTER JOIN, specify ON clause (something as 1=1) and run the query, you will get "The multi-part identifier "D.DepartmentID" could not be bound." error.
This is because with JOINs the execution context of outer query is different from the execution context of the function (or a derived table), and you can not bind a value/variable from the outer query to the function as a parameter.
Hence the APPLY operator is required for such queries.
Слайд 51APPLY with Dynamic Management Function (DMF)
USE master
GO
SELECT DB_NAME(database_id) AS [Database], [text] AS [Query]
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.plan_handle) st
WHERE session_Id > 50
-- Consider spids for users only, no system spids.
AND session_Id NOT IN (@@SPID)
-- Don't include request from current spid.
Please note the [text] column in the
above query returns the all queries submitted in a batch; if you want to see only active (currently executing) query you can use statement_start_offset and statement_end_offset columns to trim the active part of the query.
Слайд 52TOP
Along the way to making this solution work we'll discuss
a few new features of SQL Server 2005.
The first
is that the TOP clause can now take a variable:
DECLARE @Rows INTSET @Rows = 10
SELECT TOP ( @Rows ) *FROM Sales.SalesOrderHeader
This will return the top 10 rows from SalesOrderHeader.
You can also replace @Rows with anything that evaluates to a number. The following query looks odd but runs just fine:
Слайд 53SELECT TOP (
SELECT COUNT(*)
FROM Sales.Customer
) *
FROM Sales.SalesOrderHeader
There are 19,185 rows in the Customer table and this query returns the top 19,185 rows from SalesOrderHeader.
You can also use the TOP clause for INSERT, UPDATE and DELETE statements.
If you wanted to DELETE in batches of 500 you can now do that using the TOP clause.
Слайд 54Next we need a function to return the TOP X
rows from SalesOrderHeader based on the total sales amount. That
function looks like this:
CREATE FUNCTION dbo.fn_GetTopOrders(@custid AS int, @n AS INT)
RETURNS TABLE
AS
RETURN
SELECT TOP(@n) *
FROM Sales.SalesOrderHeader
WHERE CustomerID = @custid
ORDER BY TotalDue DESC
GO
My Function
Слайд 55Notice that it accepts the number of orders to return
as a parameter.
Also notice that I'm using SELECT *
all over the place. I really encourage you to explicitly list out the columns when you write this for real.
If you call this with a CustomerID and a number of rows it will return that many rows ordered by the total amount of the order in descending order.
Also notice that there is an ORDER BY clause in this function.