Washington State University



718739640465600Featherman’s Advanced T-SQL Analytics – Window FunctionsThis document reviews T-SQL Window functions which solve a data analytics limitation of GROUP BY() queries. GROUP BY() calculations are performed for the entire dataset based on the dimensions specified. What if you want to produce the analytics but have many reporting needs such as more than one country, color, year, etc.? You would need many GROUP BY queries because there is no way to sub-group the sliced by another grouped data. What we want to do is write one big query that produces the dataset that solves all the analytics needs. For this we need the concept of partitions of data within the resultset. These partitions create windows of data. “Often you want the query result set sectioned into partitions. The window function is applied to each partition separately and computation restarts for each partition.”For example while you may produce a dataset of sales by model by month for one year including a % of total column, what if you want to show more than one year and even compare the years? You will need many queries for this, and maybe several arrays right? Well WINDOW function can solve this problem. BTW WINDOW functions have nothing to do with MSFT Windows. The innovation is that the metric is calculated by defining groupings and sub-groupings of data records called windows. The % of total analytics (and others) are calculated within the defined windows. Window Functions are used when creating metrics that examine groups of data within a larger dataset (such as a running total by week or month). Window functions are an option when creating columns of metrics to place records within groups and determine how the calculations are performed within the groups. For example the invoices of a sorted customer list can be displayed with a running total, and when the next customer is shown, a new window is created, and the running total starts again for that customer. Unlike the examples shown in a prior module, you do not have to use variables, to calculate grand totals for a % of total calculation within each window, SQL does that for you.To date the calculations that were typically performed were for one group or several groups defined by a GROUP BY (e.g., ‘Mountain Bikes’, ‘Road Bikes’, ‘Touring Bikes’). This is a useful dataset, however there were no running totals, or comparison of time periods, which are needed to measure business performance. This document demonstrates some uses of the SQL syntax OVER() and PARTITION BY() that build can segment and calculate data (into groups called windows). We add new types of analytics (e.g. month over month analysis) that we have not used to date. Window functions are to perform calculations within a group of rows. The unique feature is that the metric will restart its calculation for every new group (i.e. product subcategory within product category). For example a moving average should start over when the product subcategory changes from road bikes to mountain bikes.Window functions are used to build datasets for in-depth analysis, and as part of an extract, transform and load data process, extracting data from database tables, segmenting and organizing it and loading the compiled resultset into the data warehouse. From Postgre: “A?window function?performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row — the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result.”So this document introduces the participant to ‘window functions’ of SQL. There actually isn’t specific T-SQL syntax ‘Window’ rather we use the OVER() clause is used to define a window (a sub-grouping) of records within which to perform analytics on a group. In the examples provided we start by consolidating overly granular data with a GROUP BY and then add window functions which create new columns of metrics. This is the same type of adding columns as we did with CASE() statements. Ok saddle up!! Base Query – Copy this base query into a new query in SSMS and run it. We will add several calculated columns in subsequent pages. USE [Featherman_Analytics];SELECT [Sub Category], (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate) AS [YearMonth], YEAR(i.OrderDate) as [Year]--above fields are the dimensions of interest, below are the measures to compile and calculate28455455298000, SUM(i.[OrderQuantity]) AS [Total Units], COUNT(DISTINCT([SalesOrderNumber])) as [# TA], FORMAT(SUM(i.[SalesAmount]), 'N0') AS [Monthly Revenue]FROM [dbo].[FactInternetSales] as iINNER JOIN [dbo].[AW_Products_Flattened] as p ON p.[Productkey] = i.[Productkey]WHERE [Category] = 'Bikes'GROUP BY [Sub Category], YEAR(i.OrderDate), (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)ORDER BY [Sub Category], [YEAR], [YearMonth]-548474161693100-1105066140390200Again we can see the ability of the [YearMonth] calculated datetime field to organize data for analysis. The sales are lined up by year and month within the sub-categories. Also notice the WHERE filter uses a field that is not in the GROUP BY() or SELECT section, proving that you can filter on any field that is in the linked tables. The field s used to filter the WHERE statement do not have to be in the GROUP BY field. FYI – this fact table of transactions is 60k. We use the GROUP BY() to change the granularity of the data for easier analysis. WHEN looking at a query your eyes should first go to the GROUP BY() to identify the dimension(s) of analysis.Window Function Example #1 - Running Totals - Add this code before the FROM statement of the base query to add two running totals. Running totals are great if you want to create a combo chart with monthly totals in the bars and running totals in the line. You may say hey, I don’t need this difficult content, I can create charts from downloaded data any time I want. Sure says the DBA, but can you automate the process? Analysts really don’t want to write SQL they want to solve problems, so automation is the litmus test. If an analytics process can be packaged up and automated, it is useful. If we create the SQL queries and then call them from reporting or dashboard software, then the process can be automated, removing error and non-value labor. DBA’s can also set the data to refresh for a report or spreadsheet at pre-determined time intervals.-- #1 this running total resets for each SubCategory ,FORMAT(SUM(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)227774518986600) ,'N0' ) AS [Sub-Cat Total] -- #2 this running total resets for each SubCat & Year combination,FORMAT(SUM(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category], YEAR(i.OrderDate)ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) ,'N0' ) 20999453746500AS [Sub-Cat Total by Year]99695122555--- Window Functions Running Totals Challenge Can you change the above code to calculate profit, running total columns by a) sub-category, b) sub-category and yearc) add a column to calculate profit by category by year?00--- Window Functions Running Totals Challenge Can you change the above code to calculate profit, running total columns by a) sub-category, b) sub-category and yearc) add a column to calculate profit by category by year?This first window function creates a calculated field that has a running total that restarts every year and every product subcategory. So the Partition By has 2 columnsNotice the difference in the second window function and the derived calculations from that. It is a similar running total but it only resets per product subcategory not per year. So the partition by is performed using only 1 column. Since there is more data in the running total and the same month numbers can repeat we need to add year to the ORDER BY phrase Should we try to move the data to Excel and create a combo chart? Should we convert the data to weekly totals?Line by line explanation brought to you by Featherman:,SUM(SUM([SalesAmount])) The first sum (outside) indicates that we are summing the values in the column. The second (inside) SUM([Field name]) combines with the rest of the GROUP BY query. This calculation is being performed on the results of a GROUP BY query. If the data was already summarized correctly then this second SUM() would not be necessary. The overall effect is that the SUM([Field name]) calculation is performed in sections called partitions called windows.OVERThe OVER () clause includes the PARTITION BY and the ORDER BY statements. They are broken out seperately here for clarityThe OVER term defines the set of rows that is to be calculated, summed in this example. The Partition by set of columns specifies the set of rows that is to be calculated. This is similar to a GROUP BY() specifying the dimensions that are used for the the calculations. If you specify an empty OVER () term then the calculation is for the entire column, you might use this in a denominator or a larger calculation. The term window refers to a set of rows for which you want to calculate an analytic, typically a running total, % of total, or comparison of the current row (e.g., month) from the prior row (e.g. prior month). While rare, to include all the rows of the dataset into the window calculation then leave the term blank OVER() as we did before when using NTILE().OVER The OVER “Determines the partitioning and ordering of a rowset before the associated window function [the sum or count or other calculation] is applied. That is, the OVER clause defines a window or user-specified set of rows within a query result set. A window function then computes a value for each row in the window. You can use the OVER clause with functions to compute aggregated values such as moving averages, cumulative aggregates, running totals, or a top N per group results.”#1 OVER(PARTITION BY [Sub Category])#2 OVER(PARTITION BY [Sub Category], YEAR(i.OrderDate))Two examples are shown #1, when the sub-category changes, or the year within a sub-category changes. Often you want the query result set sectioned into partitions. The window function is applied to each partition separately and computation restarts for each partition.PARTITION BY subdivides the column and calculations into sections called partitions or windows. The partition can be based on one or more attributes of a dimension. These should match the underlying GROUP BY dimensions.PARTITION BY “Divides the query result set into partitions [groups of rows]. The window function (the moving average etc.) is applied to each partition separately and computation restarts for each partition.”ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)Specifies the sorting within the window, these should match the fields in the GROUP BY()AS [Running Total]This entire code block was used to create a calculated column. Was it worth it? Well, new functionality was provided and many different reports and charts can be derived from this query, and you did not have to do the work manually in Excel.Window Function #2 - % of Total – often business performance indicators are used to compare entities within groups, for example to analyze a group of data on an attribute (what % of bike sales are blue?) or to see the scope (size) of a problem (what % of regional sales are derived from the Lewiston store?). While other methodologies were used to do these calculations, there was no ceoncept of performing the calculations on sections (partitions) of data within a larger dataset. What if you wanted to build a dataset that analyzed activity on many groups – for example combinations of two dimensions (as here the combinations of year and sub-category are analyzed) Here is where the PARTITION BY really shines, it handles a lot of work for us! Here are four columns of metrics that all calculate a % of total value for the SUM(i.[SalesAmount])/SUM(SUM(i.[SalesAmount]))calculation. Each column is a different % of total calculation. You can tell how the % calculation is performed by looking at the different OVER() statements. Demonstrated are calculations that use 1, 2, 1, and 0 fields in the partition section. Each makes an insightful line chart.,FORMAT(SUM(i.[SalesAmount])/SUM(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category]), 'P2') AS [% of Sub-Cat] ,FORMAT(SUM(i.[SalesAmount])/SUM(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category], YEAR(i.OrderDate)), 'P2') AS [% of Sub-Cat by Year],FORMAT(SUM(i.[SalesAmount])/SUM(SUM(i.[SalesAmount])) OVER(PARTITION BY YEAR(i.OrderDate)), 'P2') AS [% of All Bikes by Year],FORMAT(SUM(i.[SalesAmount])/SUM(SUM(i.[SalesAmount])) OVER(), 'P2') AS [% of All Bikes]713102384889Remove the prior calculated fields replacing them with these calcualtions 0Remove the prior calculated fields replacing them with these calcualtions OVER(PARTITION BY [Sub Category]), calculates the % of total for each YearMonth row across the years within each sub-category.OVER(PARTITION BY [Sub Category], YEAR(i.OrderDate)), calculates the % of total within each category, and within each year. This is the most detailed analysis, which examines for each year, and within each sub-category, which months had the most sales.OVER(PARTITION BY YEAR(i.OrderDate)), calculates the % of total for each row in the dataset within each year. The different sub-categories are compared to each other within each year. For a specific year, which were the revenue producing sub-categories and months?OVER(),calculates the % of total across all time periods and all sub-categories (the entire dataset which is based on the GROUP BY dimensions. This is analysis useful to identify any outlier months for any sub-category.Prove it to yourself! Add this code to the base query and total up the percentages to verify that the calculations do add up to 100% based on the window specified in the PARTITION statement. Next draw some line charts in PowerBI or Excel.Window Function #3 - Moving averages We now turn our attention to creating calculated columns that displays moving average. Moving averages are used to smooth out temporary fluctuations in values. Managers that can’t focus on the strategic level and ‘growing the pie’ often micromanage and can ‘knee-jerk’ react to temporary dips in performance. To maintian some perspective it may be prudent to step back and keep partially focused on different summary trends. Moving averages smooth out the peak and valley fluctuations in overly granular data. Moving averages are insightful and are often used to improve excel line charts. Remove and replace the prior calcs., FORMAT(AVG(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW), 'N0') 207094410842300AS [Sub-Cat Moving Avg.] -- The metric above is like a baseball player’s batting average, calculated over the entire season. A good or bad week barely changes the season’s avg. The metric below is the classic 3-month moving average. That’s defined by this ROWS code.ROWS BETWEEN 2 PRECEDING and CURRENT ROWCompare the performance metric on line 18, these two columns tell different stories. More perspective for the decision maker is usually preferred. , FORMAT(AVG(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)ROWS BETWEEN 2 PRECEDING and CURRENT ROW) , 'N0') AS [Sub-Cat 3 Mo. Moving Avg.]-722289226350000The ROWS or RANGE option further limits the rows within the partition to use in the calculation by specifying start and end points within the partition. This is done by specifying a range of rows with respect to the current row. The ROWS clause limits the rows within a partition by specifying a fixed number of rows preceding or following the current row. Alternatively, the RANGE clause logically limits the rows within a partition by specifying a range of values with respect to the value in the current row. Preceding and following rows are defined based on the ordering in the ORDER BY clause. The window frame "RANGE ... CURRENT ROW ..." includes all rows that have the same values in the ORDER BY expression as the current row. For example, ROWS BETWEEN 2 PRECEDING AND CURRENT ROW means that the window of rows that the function operates on is three rows in size, starting with 2 rows preceding until and including the current row.Window Function #4 - Side by Side Comparison of Current and Prior Record, and Numeric Change from Prior PeriodAdd this code after your syntax before the FROM statement to add two calculated fields that provide comparison analysis such as month over month, qtr over qtr, year over year, or even retailrack# over rack#. The LAG keyword provides access to a prior row at a specified offset that comes before the current row. Notice and be sure to understand all the () signs. While the first [Prior Month] formula is instructive, it is not presented in the image below. Rather a second LAG funtion to produce a Year over Year analysis is presented., FORMAT(LAG(SUM(i.[SalesAmount]), 1, NULL)OVER (PARTITION BY [Sub category]223329616382900ORDER BY 300482014287400 (YEAR(i.OrderDate)*100)+ MONTH(i.OrderDate)) , 'C0') AS , FORMAT(SUM(i.[SalesAmount]) - LAG(SUM(i.[SalesAmount]), 1, NULL)OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) ,'C0') as [Month over Month Revenue] 1714593346The first image shows the results of a LAG 1 command to show the value for the prior month (MOM month over month analysis)The first image shows the results of a LAG 12 command to show the value for the same month of the prior year (YOY year over year analysis)You can hand calculate the YOY to compare line 13 to line 1 (111,165 – 105,075 = 6,090)00The first image shows the results of a LAG 1 command to show the value for the prior month (MOM month over month analysis)The first image shows the results of a LAG 12 command to show the value for the same month of the prior year (YOY year over year analysis)You can hand calculate the YOY to compare line 13 to line 1 (111,165 – 105,075 = 6,090)127825538036500130365552641500495304889500140208059690003561080495300013322306223000,CASE WHEN SUM([SalesAmount]) > LAG(SUM([SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate)*100 ) + MONTH(i.OrderDate)) THEN 'Up' WHEN SUM([SalesAmount]) < LAG(SUM([SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(OrderDate) *100 ) + MONTH(OrderDate)) THEN 'down' END as [trend]Case statements are useful to insert a new column of characters or symbols which is cognitively easier than reading numbers. LAG(SUM(i.[SalesAmount]), 1, NULL) means For each row take the sum of the sales amount (grouped up by the fields specified in the GROUP BY statement) and subtract out the sum of sales for the prior period (Lag 1 month). If no value is found (outside the partition) then return NULL895350626110, Case WHEN current month > prior month THEN 'Up' WHEN current month < prior month THEN 'Down' End as [new column name] 00, Case WHEN current month > prior month THEN 'Up' WHEN current month < prior month THEN 'Down' End as [new column name] You can see that the LAG specifier is really useful. There are more options using LEAD/LAG/FIRST/LAST specifiers. For example, finance reports may build a window of data and create an analytic using the first or last record in the window, such as day # in the month., IIF(SUM([SalesAmount]) > LAG(SUM([SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(OrderDate) *100 ) + MONTH(OrderDate)) , 'Up', 'Down') as [UpDown]This is the closest syntax to an IF statement that will work inside a SELECT statement and Window function.The yellow code is the test condition that gets evaluated as true or false. In fact everything in front of the first comma, is the test condition. If the test condition equates to true then the word 'Up'is writtent into the cell, else when the test condition equates to false the word 'Down'is written into the cell.This module also includes another document that demonstrates some PowerBI/Excel DAX formulas that can calculate the same MOM and YOY values. DAX includes useful functions such as PREVIOUSMONTH(), PREVIOUSQUARTER(), and PREVIOUSYEAR for this calculation., CASE WHEN SUM(i.[SalesAmount]) > 1.25 * (LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate))) THEN 'Up big' WHEN SUM(i.[SalesAmount]) > LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate))1369696-317600 THEN 'Up' WHEN (1.25 * SUM(i.[SalesAmount])) < LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) THEN 'Down big' WHEN SUM(i.[SalesAmount]) < LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) THEN 'down' END as [UpDown3]This is an attempt to make a more useful textual analytic column. Rather than a simple up/down term, here we analyze each row’s month over month (MOM) revenue change and place it into the first category ‘Up big’ when the increase over the prior month is over 25%. We label the row ‘Up’ if the MOM revenue change is higher than the prior month but does not reach the 25% threshold. The way case statements work is that the first WHEN statement that equates to true, the THEN statement is executed, then the END statement, so placement of the categories is important, reversing the order of these two WHEN statements would not work.When the current month is down over 25% from the prior month then the colunm will read ‘Down big’. Finally if the current month is lower than the prior, but does not reach the 25% threshold then the column will read ‘down’Interestingly if we change the THEN statements to numbers then Up Big = condition 1, Up = condition 2, Down = condition 3, Down big = condition 4. If the data is copied into Excel you can use conditional formatting to draw icons onto the screen. Here is how you could set up the conditional formatting rule. Select the column of data, then select Conditional formatting form the Home menu as show. Choose your icon set and set up your formatting rule.250272156651006239841255754001307796159766000213288216680620020297071517428002967962166055300655375188495600484568555689500Many people add icons to their Excel sheets to speed up analysis and draw the eye to the rows of data that need it most. The same functionality can be created in PowerBI or Tableau dashboards.Window Function #5 - Month Over Month Variance % Change - Add this code after the prior syntax to show the % variance change in sales as compared to the prior month. This column is perfect data sources for an SSRS or Excel-based KPI with conditional formatting. You could similarly write a year over year analysis or week over week analysis.--comparison to prior month revenue = Month over Month (MOM). The general formula is (new time period – old time period)/ old time periodor (Current month value- Previous month value)/ Previous month value , FORMAT( (SUM(SalesAmount) - LAG(SUM(SalesAmount), 1, NULL) OVER (PARTITION BY ([Sub category]) ORDER BY [Sub category]) ) / LAG(SUM(SalesAmount), 1, NULL) OVER (PARTITION BY ([Sub category]) ORDER BY [Sub category]) , 'P2') AS [MOM % Variance] Specifying the LAG as 1 in this case since the granularity is YearMonth, gives us the total sales for the prior month. This is the classic month over month analysis. Specifying the LAG as 12 gives us a year over year analysis. The code is exactly the same, only the LAG was changed from 1 to 12. -20497801036319002757170706120356997023063201455420541020149352023126706045202331720598170528320 Window Function #6 - Breaking records into equal sized groups such as quartiles. You can use NTILE() to quickly break your records into groups of roughly equal size. For example an NTILE(4) would create four ‘bins’ each of which would hold roughly 25% of the records. Statisticians call these quartiles. Categorizing data into groups, allows quick look at different segments. Excel’s conditional formatting can read the numbers 1-4 and use them to render a indicant on a dashboard, (eg., stoplight graphic, or up-down arrows). , NTILE(4) OVER ( PARTITION BY [Sub Category], YEAR(i.OrderDate)15601954381400 ORDER BY SUM(i.[SalesAmount]) DESC) AS [Sales Quartile]-- next we combine NTILE with our good friend the CASE statement. The CASE statement reads the numeric output (1-4) from the NTILE(4) and coverts the numbers to text into a new column., CASE NTILE(4) OVER ( PARTITION BY [Sub Category], YEAR(i.OrderDate) ORDER BY SUM(i.[SalesAmount]) DESC) WHEN 1 THEN 'Gangbusters' WHEN 2 THEN 'Great month' WHEN 3 THEN 'Break-even month' WHEN 4 THEN 'Sub-par month'END AS [Sales Quartile 2]-171740096664100Line by line explanation:, CASE NTILE(4) OVER (PARTITION BY [Sub Category], YEAR(i.OrderDate)This line specifies that the data is being broken into 4 quartiles for each year in the dataset. The months are assigned 3 per quartile based on the ORDER BY SUM(i.[SalesAmount]) and a number 1-4 is assigned to each row for the new column. 1 is the highest ranked month based on sales being in the top quartile (top 25%).ORDER BY SUM(SalesAmount) DESC)This line specifies that the records will be summed according to the GROUP BY expression criteria. The SUM(fieldname) totals the records based on the SalesAmount column values and sorts them in descending order highest to lowest. WHEN 1 THEN 'Gangbusters' WHEN 2 THEN 'Great month' WHEN 3 THEN 'Break-even month' WHEN 4 THEN 'Sub-par month'END AS [Sales Quartile 2] AS [Sales Rating]The WHEN statements are used to assign text to a column, which makes for a better slicer in reports, and for sorting. We are writing these labels into each row of the column. Conclusion: While more can be done with window functions, we need to move on to demonstrate the content thus far in Tableau, and also DAX. Its useful to also spend some time in Excel and PowerBI creating data visualizations using the epic metrics created here. See you in the next module. Never forget, anyone can write SQL, but only an intuitive analyst can create a useful KPI. Portions quoted from entire query is proviuded here for use in future training:USE [Featherman_Analytics];SELECT [Sub Category], (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate) AS [YearMonth], YEAR(i.OrderDate) as [Year], SUM(i.[OrderQuantity]) AS [Total Units], COUNT(DISTINCT([SalesOrderNumber])) as [# TA], FORMAT(SUM(i.[SalesAmount]), 'N0') AS [Monthly Revenue]--- Introducing the LSG function so that you can access a different row in the window frame , FORMAT(LAG(SUM(i.[SalesAmount]), 1, NULL)OVER (PARTITION BY [Sub category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)), 'C0') AS [Prior Month] , FORMAT(SUM(i.[SalesAmount]) - LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)), 'C0') as [Month over Month Revenue] , FORMAT((SUM(SalesAmount) - LAG(SUM(SalesAmount), 12, NULL) OVER (PARTITION BY ([Sub category]) ORDER BY [Sub category]) ) / LAG(SUM(SalesAmount), 1, NULL) OVER (PARTITION BY ([Sub category]) ORDER BY [Sub category]) , 'P2')AS [YOY % Variance] ------ add two textual column metrics, IIF(SUM(i.[SalesAmount]) >LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) , 'Up', 'Down') as [UpDown], CASE WHEN SUM(i.[SalesAmount]) > 1.25 * (LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate))) THEN 'Up big' WHEN SUM(i.[SalesAmount]) > LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) THEN 'Up' WHEN (1.25 * SUM(i.[SalesAmount])) < LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) THEN 'Down big' WHEN SUM(i.[SalesAmount]) < LAG(SUM(i.[SalesAmount]), 1, NULL) OVER (PARTITION BY [Sub category] ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) THEN 'down' END as [UpDown2] ---comparison to prior month revenue, FORMAT(SUM(i.[SalesAmount]) - LAG(SUM(i.[SalesAmount]), 1, NULL)OVER (PARTITION BY [Sub category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)), 'N0') AS [Month Over Month] --comparison to prior month revenue = (new-old)/ old , FORMAT((SUM(SalesAmount) - LAG(SUM(SalesAmount), 1, NULL) OVER (PARTITION BY ([Sub category]) ORDER BY [Sub category]) ) / LAG(SUM(SalesAmount), 1, NULL) OVER (PARTITION BY ([Sub category]) ORDER BY [Sub category]) , 'P2')AS [MOM % Variance] -- this is a running total that resets for each SubCategory , FORMAT(AVG(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW) , 'N0') AS [Sub-Cat Moving Avg.] , FORMAT(AVG(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)ROWS BETWEEN 2 PRECEDING and CURRENT ROW) , 'N0') AS [Sub-Cat 3 Mo. Moving Avg.]-- this is a running total that resets for each SubCategory and Year,FORMAT(SUM(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category], YEAR(i.OrderDate)ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) ,'N0' ) AS [Sub-Cat Total by Year] -- this is a running total that resets for each SubCategory ,FORMAT(SUM(SUM(i.[SalesAmount])) OVER(PARTITION BY [Sub Category]ORDER BY (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)) ,'N0' ) AS [Sub-Cat Total]-- here records are categorized into quartiles. , NTILE(4) OVER (PARTITION BY [Sub Category], YEAR(i.OrderDate)ORDER BY SUM(i.[SalesAmount]) DESC )AS [Sales Quartile], CASE NTILE(4) OVER (PARTITION BY [Sub Category], YEAR(i.OrderDate)ORDER BY SUM(i.[SalesAmount])DESC) WHEN 1 THEN 'Gangbusters'WHEN 2 THEN 'Great month'WHEN 3 THEN 'Break-even month'WHEN 4 THEN 'Sub-par month'END AS [Sales Quartile 2]-- All these columns of analytics need to matchup with the GROUP BY fields.FROM [dbo].[FactInternetSales] as iINNER JOIN [dbo].[AW_Products_Flattened] as p ON p.[Productkey] = i.[Productkey]WHERE [Category] = 'Bikes'GROUP BY [Sub Category], YEAR(i.OrderDate), (YEAR(i.OrderDate) *100 ) + MONTH(i.OrderDate)ORDER BY [Sub Category], [YEAR], [YearMonth] ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download