Dynamic T-SQL Pivot

In a prior blog post (Standard T-SQL PIVOT) I showed how we can use a PIVOT query to take distinct values in a column and make those values new columns. However,  data changes and new columns may be needed. Looking at the data from the prior post, we only had three types of candy bars. What happens when a new candy bar shows up in our transactions? Well, we could update our query to use the new column value. If another new candy bar shows up we could update the query again. This process could go on and on and on. I am going to propose an alternative: using a dynamic PIVOT query. In this blog post I am going to show how we can use dynamic SQL to implement a dynamic PIVOT query.

Let’s start out by looking at the original data:

CustomerID ProductName Quantity
1 Snickers 1
1 Butterfinger 1
1 Rolo 2
1 Snickers 1
1 Butterfinger 1
2 Rolo 2

After using a PIVOT query we will see the data as described in the table below:

CustomerIDButterfingerRoloSnickers

CustomerID Butterfinger Rolo Snickers
1 2 2 2
2 NULL 2 NULL

Now update the original data with a few new products:

CustomerID ProductName Quantity
1 Snickers 1
1 Butterfinger 1
1 Rolo 2
1 Snickers 1
1 Butterfinger 1
2 Rolo 2
1 MilkyWay 3
2 Twix 2

If we apply that same query from the prior blog post, listed below, we see the same results as before.

SELECT   [CustomerID], 
         [Butterfinger], [Milkyway], [Rolo], [Snicker]
FROM    #CustomerProducts P 
PIVOT ( SUM(Quantity) FOR P.ProductName IN (
                          [Butterfinger], [Milkyway], [Rolo], [Snicker])
) AS PVT
CustomerID Butterfinger Rolo Snickers
1 2 2 2
2 NULL 2 NULL

But what we really wanted was

CustomerID Butterfinger Milkyway Rolo Snickers Twix
1 2 3 2 2 NULL
2 NULL NULL 2 NULL 2

As I mentioned before, we could simply update our query and this blog post is over. However, we are going to push on and create our dynamic version. Let’s take a look at creating the dynamic portion of the PIVOT query. The key part of the dynamic query is the distinct column list. Getting the distinct columns is just a simple DISTINCT query. To get the comma separated column list we will employ a trick that involves a COALESCE or ISNULL statement.

SELECT @PivotList = COALESCE(@PivotList + ',', '')

Once we have the distinct comma separated list, we can start to build out the dynamic query. We will create an nvarchar variable that we will use to store our dynamic SQL statement, and then we will execute that statement with the sp_executesql function.

Looking at our PIVOT query again, what we will want to do is replace the distinct columns with our comma separated list of values. Then we will take and store the new query into our variable

SELECT   [CustomerID], 
         [Butterfinger], [Milkyway], [Rolo], [Snicker]
FROM    #CustomerProducts P 
PIVOT ( SUM(Quantity) FOR P.ProductName IN (
                          [Butterfinger], [Milkyway], [Rolo], [Snicker])
) AS PVT

Converts to this

SET @Output = 'SELECT  [CustomerID]
	, ' + @PivotList + '
FROM    #CustomerProducts P 
PIVOT ( SUM(Quantity) FOR P.ProductName IN (' + @PivotList + ') ) AS PVT'

Now all that is left is to execute the stored query using the sp_executesql function. Here is the entire Dynamic SQL Pivot Script:

CREATE TABLE #CustomerProducts (
   CustomerID int
 , ProductName nvarchar(50)
 , Quantity int)
 
INSERT   INTO #CustomerProducts
         (CustomerID, ProductName, Quantity)
VALUES   (1, N'Snicker', 1),
         (1, N'Butterfinger', 1),
         (1, N'Rolo', 2),
         (1, N'Milkyway', 1),
         (1, N'Snicker', 1),
         (1, N'Butterfinger', 1),
         (2, N'Rolo', 2),
         (1, N'Milkyway', 3),
         (2, N'Twix', 2);
 
DECLARE @Output nvarchar(max) = N''
 , @PivotList varchar(max)
 
SELECT   @PivotList = COALESCE(@PivotList + ', ', N'') + N'[' + ProductName + N']'
 
FROM     (
          SELECT DISTINCT
                  ProductName
          FROM    #CustomerProducts
         ) AS CustProds;
 
SET @Output = 'SELECT  [CustomerID]
	, ' + @PivotList + '
FROM    #CustomerProducts P 
PIVOT ( SUM(Quantity) FOR P.ProductName IN (' + @PivotList + ') ) AS PVT'
 
EXEC sp_executesql @Output;
 
IF OBJECT_ID('tempdb..#CustomerProducts') IS NOT NULL	
	DROP TABLE #CustomerProducts

Standard T-SQL PIVOT

Pivoting data is a common practice while analyzing data. The basic idea is to take a single column’s data and pivot the unique values into separate columns. In this post I am going to show a simple example of how we can implement a PIVOT statement in a query. In a future post I will discuss how we can create a dynamic PIVOT based on the functionality below. SQL Server 2005 introduced functionality to help with pivoting data. You can PIVOT data using the following syntax

SELECT  [AnyColumn], [PivotCol1],[PivotCol2],...,[PivotColN]
FROM    (
	   SELECT  [AnyColumn]
                 , [DataColumn]
		 , [AggregateColumn]
	   FROM    SomeTable	   
	  ) P PIVOT ( SUM(AggregateColumn) FOR DataColumn IN ([PivotCol1],[PivotCol2],...,[PivotColN]) ) AS PVT

So, let’s take a look at a bit of data. The temp table #CustomerProducts below stores a snippet of transaction data. This data is describing the number of candy bars ordered per transaction by a given customer, over time.

CREATE TABLE #CustomerProducts (
   CustomerID int
 , ProductName nvarchar(50)
 , Quantity int)
 
INSERT   INTO #CustomerProducts
         (CustomerID, ProductName, Quantity)
VALUES   (1, N'Snickers', 1),
         (1, N'Butterfinger', 1),
         (1, N'Rolo', 2),
         (1, N'Snickers', 1),
         (1, N'Butterfinger', 1),
         (2, N'Rolo', 2);
CustomerID ProductName Quantity
1 Snickers 1
1 Butterfinger 1
1 Rolo 2
1 Snickers 1
1 Butterfinger 1
2 Rolo 2

However, we would like to see how many candy bars each customer has ordered total, as described in the table below:

CustomerID Butterfinger Rolo Snickers
1 2 2 2
2 NULL 2 NULL

So how do we accomplish this task? Simple, we will implement a PIVOT query…

Before we can develop the PIVOT query we need to determine the PIVOT columns. These columns will come from the data located in the product name column. According the the table above we want to use the following columns: Butterfinger, Rolo, and Snickers. Next we will want to determine the column that we are going to aggregate. In our example, we will use the Quantity column and we will summarize the results. With this information we can plug it into the a PIVOT query and get the query listed below.

CREATE TABLE #CustomerProducts (
   CustomerID int
 , ProductName nvarchar(50)
 , Quantity int)
 
INSERT   INTO #CustomerProducts
         (CustomerID, ProductName, Quantity)
VALUES   (1, N'Snickers', 1),
         (1, N'Butterfinger', 1),
         (1, N'Rolo', 2),         
         (1, N'Snickers', 1),
         (1, N'Butterfinger', 1),
         (2, N'Rolo', 2);
 
 SELECT  [CustomerID]
	, [Butterfinger],[Rolo],[Snickers]
FROM    (
	   SELECT  [CustomerID]
			 , [ProductName]
			 , [Quantity]
	   FROM    #CustomerProducts	   
	  ) P PIVOT ( SUM(Quantity) FOR P.ProductName IN ([Butterfinger],[Rolo],[Snickers]) ) AS PVT
 
IF OBJECT_ID('tempdb..#CustomerProducts') IS NOT NULL	
	DROP TABLE #CustomerProducts

So, whats next? Well, what if another candy bar type is sold? What if 10 different types are sold? We can continue to update the PIVOT query above, or we can use a Dynamic PIVOT Query. In another post I will describe how to create a Dynamic PIVOT Query.

Deleting Records from Large tables

Database maintenance is a necessary evil for a database administrator and deleting records is one of these necessary evils that we need to deal with. Deleting records can be tedious task especially in Very Large Databases (VLDB). This post will demonstrate different ways to handle deleting records from a database table.

Basic Delete

The easiest way to delete records from a database table is to use a DELETE FROM statement

DELETE FROM VLDB_Table WHERE ColumnName = 'Value';

As the number of records grow, the amount of time it takes to delete records from our table increases. While the records are being deleted, the table is locked and other processes will be blocked until the completion of the delete statement.

Truncate Table

The truncate statement is a much faster delete process, which removes all records from a table.

TRUNCATE TABLE VLDB_Table;

A truncate statement is not always an option on a table, if the table has a foreign key relationship with another table, the table is used in an indexed view or used in replication, the truncate option is not an option.

Batch Delete

With the limitation of the truncate statement and potential for blocking using the basic delete, that leaves us looking for an optimal solution for cleaning up records. Using a basic delete in conjunction with the TOP option and a loop, we can delete a records in a batch.

DECLARE @BatchSize int  = 1000;
 
WHILE EXISTS (SELECT 1 FROM VLDB_Table)
BEGIN
   DELETE TOP (@BatchSize) FROM VLDB_Table WHERE ColumnName = 'Value';
END

Another batch delete scrip that I have come across uses the ROWCOUNT as the factor for the loop.

DECLARE @BatchSize int  = 1000;
-- Delete the initial batch outside the loop, this is needed to set @@ROWCONT
DELETE TOP (@BatchSize) FROM VLDB_Table WHERE ColumnName = 'Value';
 
WHILE (@@ROWCOUNT > 0)
BEGIN
   DELETE TOP (@BatchSize) FROM VLDB_Table WHERE ColumnName = 'Value';
END

View table data with Query Shortcuts Part – 2

In Part 1 of the Query Shortcuts, I demonstrated a quick SELECT * FROM keyboard shortcut. The problem with this approach is if your table is not in the default schema, you have to select the schema and table name. In this post I am going to demonstrate how you can use a custom stored procedure in the Query Shortcuts to return table data without selecting the schema.

First let’s look at the stored procedure.

Stored Procedure:

The idea behind the stored procedure is to find the schema for the table name that is passed in. If the same name is used in multiple schemas, the stored procedure will output both tables and all their data.

CREATE PROCEDURE [sp_GetTableData] (
   @TableName varchar(200))
AS 
   -- Remove any square brackets, we only want the table name
   SET @TableName = REPLACE(REPLACE(@TableName, '[', ''), ']', '')
 
   DECLARE @SelectFrom nvarchar(max);
   -- If temp table exists remove it
   IF OBJECT_ID('tempdb..#TableNames') IS NOT NULL 
      DROP TABLE #TableNames;
 
   -- Get a list of all tables that match in all schemas
   SELECT   TableName = s.NAME + '.' + o.name
   INTO     #TableNames
   FROM     sys.objects o
   JOIN     sys.schemas s ON o.schema_id = s.schema_id
   WHERE    o.name = @TableName
            AND o.type = 'U'
 
   -- Loop through all tables and output table results
   WHILE ( SELECT COUNT (1) FROM #TableNames ) >= 1 
      BEGIN
         SELECT TOP 1 @SelectFrom = N'SELECT * FROM ' + TableName
         FROM     #TableNames T
 
         EXECUTE sp_executesql @SelectFrom;
 
         DELETE TOP (1) FROM #TableNames
      END
 
   -- clean up
   DROP TABLE #TableNames

To use this query shortcut in all databases in the current server instance you can store this stored procedure in the Master database. Typically I would not recommend using the prefix “sp_”, however in this instance we need the procedure to behave as a system stored procedure. Also by using the prefix “sp_” SQL Server will look in the master database for the procedure before looking in the current database.

Query Shortcut Setup:

In order to setup a query short cut you will need to navigate to the SSMS Options Dialog:
(SQL 2012)

  1. Open SSMS navigate to Tools –> Options…
  2. After the dialog opens navigate to Environment –> Keyboard –> Query Shortcuts. This will open the Options dialog window shown below.
  3. Enter “sp_GetTableData ” in any of the open text fields under Stored Procedure.
  4. Click [OK]

NOTE: The new query shortcut will be available in all SSMS windows opened after the query shortcut is saved.

Query Shortcut for Stored Procedure sp_GetTableData

Now open any query, highlight any table name and press Ctrl+F1. Voila you have the contents of the highlighted table. Below we are looking at a snippet of the [dbo].[uspGetEmployeeManagers] procedure in the adventure works database.

Output After Pressing Ctrl+F1

Notice that this time the schema is not selected as a part of the table name. This gives us the ability to double click to highlight the table, and run our shortcut. This feature is handy when learning/debugging a stored procedure, function, view or any other db object.

A downfall with using the stored procedure version is the stored procedure has to be setup on each instance of SQL Server that you connect to in SSMS. This is because the stored procedure lives in the instance where as the Query Shortcut lives in SSMS. Part 1 does not have this same limitation.

View table data with Query Shortcuts

Query shortcuts are one of the great features in SSMS (SQL Server Management Studio) that is commonly overlooked. One of my favorite query shortcuts for TSQL development is the ability to select all records from a table, by simply by highlighting the table name and pressing Ctrl+F1. In this two part blog post series I will demonstrate how to setup a couple variations of my favorite query shortcut.

Simple Query:

In order to setup a query short cut you will need to navigate to the SSMS Options Dialog:
(SQL 2012)

  1. Open SSMS navigate to Tools –> Options…
  2. After the dialog opens navigate to Environment –> Keyboard –> Query Shortcuts. This will open the Options dialog window shown below.
  3. Enter “SELECT * FROM ” in any of the open text fields under Stored Procedure.
  4. Click [OK]

NOTE: The new query shortcut will be available in all SSMS windows opened after the query shortcut is saved.

Query Shortcut for Simple SELECT * FROM

Now open any query highlight any table name and press Ctrl+F1. Voila you have the contents of the highlighted table. Below we are looking at a snippet of the [dbo].[uspGetEmployeeManagers] procedure in the adventure works database.

Output After Pressing Ctrl+F1

Notice that the schema is selected as a part of the table name. In part 2 of the Query Shortcuts I will demonstrate another approach where the schema does not need to be highlighted. Also we will use a stored procedure rather than a query.