Monday, September 1, 2008

Cross Tab queries , Converting Rows to Columns

Hi all,

While browsing on the net I got a really good article by Jeff Moden on Cross Tab queries , Converting Rows to Columns . Hats off to jeff for this beautiful article. Some part of it that I feel was really very helpful for me I am reproducing it here for my future reference and I recommend you guys to read the complete series here

Cross Tab queries , Converting Rows to Columns
Sometimes it is necessary to rotate results so that [the data in] columns are presented horizontally and [the data in] rows are presented vertically. This is known as creating a cross-tab report, or rotating data.

In other words, you can use a Cross Tab or Pivot to convert or transpose information from rows to columns.
A simple introduction to Cross Tabs:

The Cross Tab Report example from Books Online is very simple and easy to understand. I've shamelessly borrowed from it to explain this first shot at a Cross Tab

The Test Data

Basically, the table and data looks like this...

--===== Sample data #1 (#SomeTable1)
--===== Create a test table and some data
CREATE TABLE #SomeTable1
(
Year SMALLINT,
Quarter TINYINT,
Amount DECIMAL(2,1)
)
GO
INSERT INTO #SomeTable1
(Year, Quarter, Amount)
SELECT 2006, 1, 1.1 UNION ALL
SELECT 2006, 2, 1.2 UNION ALL
SELECT 2006, 3, 1.3 UNION ALL
SELECT 2006, 4, 1.4 UNION ALL
SELECT 2007, 1, 2.1 UNION ALL
SELECT 2007, 2, 2.2 UNION ALL
SELECT 2007, 3, 2.3 UNION ALL
SELECT 2007, 4, 2.4 UNION ALL
SELECT 2008, 1, 1.5 UNION ALL
SELECT 2008, 3, 2.3 UNION ALL
SELECT 2008, 4, 1.9
GO

Every row in the code above is unique in that each row contains ALL the information for a given quarter of a given year. Unique data is NOT a requirement for doing Cross Tabs... it just happens to be the condition that the data is in. Also, notice that the 2nd quarter for 2008 is missing.
The goal is to make the data look more like what you would find in a spreadsheet... 1 row for each year with the amounts laid out in columns for each quarter with a grand total for the year. Kind of like this...

... and, notice, we've plugged in a "0" for the missing 2nd quarter of 2008.

Year 1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 0.0 2.3 1.9 5.7

The KEY to Cross Tabs!
Let's start out with the most obvious... we want a Total for each year. This isn't required for Cross Tabs, but it will help demonstrate what the key to making a Cross Tab is.
To make the Total, we need to use the SUM aggregate and a GROUP BY
... like this...

--===== Simple sum/total for each year
SELECT Year,
SUM(Amount) AS Total
FROM #SomeTable1
GROUP BY Year
ORDER BY Year

And, that returns the following...

Year Total
------ ----------------------------------------
2006 5.0
2007 9.0
2008 5.7

Not so difficult and really nothing new there. So, how do we "pivot" the data for the Quarter?
Let's do this by the numbers...
1. How many quarters are there per year? Correct, 4.
2. How many columns do we need to show the 4 quarters per year? Correct, 4.
3. How many times do we need the Quarter column to appear in the SELECT list to make it show up 4 times per year? Correct, 4.
4. Now, look at the total column... it gives the GRAND total for each year. What would we have to do to get it to give us, say, the total just for the first quarter for each year? Correct... we need a CASE statement inside the SUM.
Number 4 above is the KEY to doing this Cross Tab... It should be a SUM and it MUST have a CASE to identify the quarter even though each quarter only has 1 value. Yes, if each quarter had more than 1 value, this would still work! If any given quarter is missing, a zero will be substituted.

To emphasize, each column for each quarter is just like the Total column, but it has a CASE statement to trap info only for the correct data for each quarter's column. Here's the code...

--===== Each quarter is just like the total except it has a CASE
-- statement to isolate the amount for each quarter.
SELECT Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS [1st Qtr],
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS [2nd Qtr],
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS [3rd Qtr],
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS [4th Qtr],
SUM(Amount) AS Total
FROM #SomeTable1
GROUP BY Year

... and that gives us the following result in the text mode (modified so it will fit here)...
Year 1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 .0 2.3 1.9 5.7

Also notice... because there is only one value for each quarter, we could have gotten away with using MAX instead of SUM.
For most applications, that's good enough. If it's supposed to represent the final output, we might want to make it a little prettier. The STR function inherently right justifies, so we can use that to make the output a little prettier. Please, no hate mail here! I'll be one of the first that formatting of this nature is supposed to be done in the GUI!

--===== We can use the STR function to right justify data and make it prettier.
-- Note that this should really be done by the GUI or Reporting Tool and
-- not in T-SQL
SELECT Year,
STR(SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END),5,1) AS [1st Qtr],
STR(SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END),5,1) AS [2nd Qtr],
STR(SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END),5,1) AS [3rd Qtr],
STR(SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END),5,1) AS [4th Qtr],
STR(SUM(Amount),5,1) AS Total
FROM #SomeTable1
GROUP BY Year

The code above gives us the final result we were looking for...
Year 1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 0.0 2.3 1.9 5.7