Analysis Services ROLAP for SQL Server Data Warehouses



SQL Server Technical Case Study

Writers: Denny Lee, Kay Unkroth
Investigators: Denny Lee, Thomas Kejser
Contributors: James Podgorski, Richard Tkachuk, Mosha Pasumansky, Akshai Mirchandani, Edward Melomed, Jonathan Ehren, Mark Tameling, Mukesh Kumar
Technical Reviewers: Akshai Mirchandani, Kevin Cox, Nicolae Mogoreanu
Published: July 2010
Applies to: Microsoft® SQL Server® 2008 and SQL Server 2008 R2

Summary: This technical case study describes how the SQL Server Customer Advisory Team (SQLCAT), in collaboration with SQL Server developers, tested and optimized a large Online Analytical Processing (OLAP) solution based on SQL Server 2008 Analysis Services by using the Relational OLAP (ROLAP) storage mode. The study examines ROLAP system requirements and usage scenarios, highlights advantages and disadvantages of ROLAP in comparison with Multidimensional OLAP (MOLAP), and evaluates various ROLAP-related data warehouse (DW) optimization techniques regarding their effectiveness and limitations.

This case study is for data warehouse architects, database administrators, and storage engineers, and assumes the audience is already familiar with the concepts of large-scale data warehouse designs for servers, storage subsystems, and databases. A high-level understanding of Analysis Services optimization techniques for cube processing and query performance is also helpful. Detailed information is available in the SQL Server 2008 Analysis Services Performance Guide.

For security reasons, the actual names of servers, cubes, tables, views, columns, and other resources have been replaced with fictitious names. The sample names mentioned in this paper do not represent real resource names and are for illustration purposes only.

Copyright

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This case study is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred.

© 2010 Microsoft Corporation.
All rights reserved.

Microsoft and SQL Server are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

Contents

Introduction
Scoping the Project and Building the Test Lab
Performing Initial Data Warehouse Optimization
Deploying the Lab Cube and Defining the Test Cases
Lab Cube Deployment
Cube Test Scenarios
Selecting Test Tools and Methods
Establishing a Performance Baseline
Implementing Data Warehouse Aggregations
Cube-Based ROLAP Aggregations
Transparent ROLAP Aggregations
Choosing an Aggregation Strategy
Creating Transparent Aggregations
Table Binding versus Query Binding
Reorganizing the Lab Data Warehouse
Replacing Table Partitions with Individual Base Tables
Loading the Facts Data
Realigning Cube Partitions
Reviewing Measure Groups and Measures
Zooming in on Measures
Row Counting and Query Binding
Benchmarking the ROLAP Cube
TopCount Customer Queries
Additional SQLCAT MDX Queries
Understanding Query Concurrency and Processor Utilization
Usage-Based Optimization of Long-Running Queries
Dealing with Skewed Distribution of Data
Lessons Learned and Best Practices
Conclusion

Introduction

Data warehouse architects and storage engineers face exciting and challenging times as emerging hardware and software make it possible to build large-scale data warehouses with hundreds of terabytes of data at affordable prices. AMD Opteron and Intel Xeon Nehalem-EX servers with up to 256 logical processors and hundreds of gigabytes of memory offer enormous potential to achieve high performance and scalability. Such features let organizations keep increasing amounts of analytical data online, enabling users to readily analyze massive datasets, drill down into intricate details, and uncover actionable insights with a microscopic level of granularity. The challenge is to stay on top of this data mountain, which keeps accumulating at a rapid pace in the enterprise.

SQL Server Analysis Services is a crucial technology for most Microsoft BI solutions, yet building Analysis Services cubes that keep up with growing data sizes is difficult. Gigantic MOLAP cubes with hundreds of terabytes of data are out of the question. Even the increased cube processing performance available with the latest versions of Analysis Services can't help manage this enormous volume. Full cube processing would take weeks, and there are query-related performance issues to consider as well. A MOLAP cube of 5 terabytes is stretching the limit. A more reasonable maximum ranges between 1.5 and 3 terabytes.
The adCenter cubes, documented in the white paper "Accelerating Microsoft adCenter with Microsoft SQL Server 2008 Analysis Services," can serve as a reference.

With MOLAP (at this time) impractical for data volumes larger than 5 terabytes, ROLAP becomes an interesting option. ROLAP lets data and aggregates remain in the relational data warehouse. This is attractive because the SQL Server 2008 relational engine can be amazingly fast on top-end server models with storage subsystems designed for fast serial input/output (I/O), which means that DW architects can take full advantage of the latest hardware and software trends. Microsoft SQL Server Fast Track Data Warehouse even supports the deployment of such systems with reference architectures for HP, Dell, Bull, EMC, and IBM. In contrast to the relational engine, Analysis Services is currently limited to 64 logical processors. Plus, a ROLAP cube can, in some cases, fan out SQL queries across multiple database servers and combine the results for analysis. ROLAP can also provide low-latency data delivery for real-time solutions. In short, there are many good reasons to evaluate ROLAP.

Of course, there are also complicating factors, which are equally good reasons to evaluate this storage mode carefully. In comparison with MOLAP, ROLAP handles attribute relationships less efficiently, requires more physical scans to get the data, and does not benefit from Analysis Services data compression. Hardware requirements are therefore substantially higher. ROLAP is also more susceptible to data quality issues, which MOLAP can automatically detect during cube processing. And there is a significant disadvantage relating to aggregation design and management: DW architects have to build ROLAP aggregates directly into the relational data warehouse. The relational engine of SQL Server includes all necessary capabilities to support aggregations seamlessly, yet optimizing a ROLAP cube for query performance and scalability remains complex. Repeated testing, tweaking, and aggregation building are essential.

For all of these reasons, SQLCAT decided to investigate typical ROLAP design and optimization tasks in a realistic lab setting representative of an enterprise environment. A large Microsoft customer agreed to help by providing SQLCAT with a copy of a 1.35 terabyte production data warehouse. During the project, SQLCAT converted an existing Analysis Services cube from MOLAP to ROLAP, compared the ROLAP performance with the MOLAP baseline, analyzed performance issues, optimized the ROLAP design to increase performance, and engaged SQL Server developers to determine the best solutions and workarounds for difficult issues. The result of this work is the collection of ROLAP optimization techniques, lessons learned, and best practices summarized in this paper.

Scoping the Project and Building the Test Lab

One of the first decisions the SQLCAT team had to make in the ROLAP investigation concerned the overall scope of the project. Important questions revolved around the preferred sizes and complexities of the cubes as well as the desired scalability scenarios.

On the one hand, SQLCAT wanted to test with the largest possible cubes, multiple servers, and high query concurrency, but it also wanted to provide actionable information and guidelines without getting lost in implementation complexity or troubleshooting minutiae. For this reason, SQLCAT decided to rule out scalability testing and constrained the project scope by starting with a single cube of low complexity.
This approach enabled SQLCAT to tackle typical ROLAP issues efficiently and to quickly develop effective optimization techniques. The ultimate goal was to benchmark the original query performance of a given MOLAP cube and then try to achieve similar or better results with ROLAP.

Figure 1 illustrates the lab environment that SQLCAT deployed according to the overall project scope. This lab was based on a single quad-socket server with four dual-core processors and 64 GB of memory. With a total of eight processor cores, the lab system was able to sustain the ROLAP processing demand, which is generally higher than that of MOLAP, as discussed in the section "Understanding Query Concurrency and Processor Utilization" later in this case study.

Figure 1: Server Configuration for ROLAP Performance Tests and Benchmarking

Particularly noteworthy is the lab server's storage design. Because I/O performance greatly influences query performance, SQLCAT optimized the storage subsystem by using three separate SANs. In particular, SQLCAT connected the logical unit number (LUN) drives of each SAN to different ports so that two SAN ports handled the workload for every pair of LUNs (that is, O/P, S/T, and U/V). This configuration doubled the maximum I/O throughput and ensured that the storage subsystem did not create a bottleneck during performance testing. With 64 KB block sizes (except for the OLAP LUNs, which were set to 32 KB block sizes) and every port delivering 600-900 MB per second, SQLCAT achieved an I/O throughput of between 1.2 and 1.9 GB per second in this configuration, with minimal latencies.

SQLCAT implemented the following storage design for the lab server:

- Fact tables for the ROLAP cube: Stored in a file group that was evenly split across the LUNs S, T, U, and V, formatted with a 64 KB cluster size according to SQL Server best practices.
- TempDB: Placed on its own two separate LUNs, O and P, so that TempDB usage did not generate write requests to the data LUNs during querying.
- SQL Server log folder: Placed on LUN H to ensure that the sequential writing of the transaction log would occur on a separate, dedicated drive.
- Windows page file: Placed on LUN M to isolate most of the paging activity on a separate, dedicated drive. Only a small page file of 800 MB remained on the C drive. Using multiple page files placed on separate physical disks is a best practice to increase system performance, and by leaving a small page file on the system drive, SQLCAT ensured that critical operating system processes, such as the Local Security Authority Service (Lsass.exe), are able to access a page file without requiring the SAN environment.
- OLAP folder: Placed on LUN N, which was optimized for random I/O requests according to the typical I/O pattern of MOLAP measures and was formatted with a 32 KB cluster size. SQLCAT performance tests show that a 32 KB cluster size can benefit Analysis Services performance more than a 64 KB cluster size.
- OLAP log folder: Placed on LUN H, which also hosted the SQL Server log folder. Storing the SQL Server and OLAP logs on the same LUN did not impact test results because SQLCAT performed the MOLAP and ROLAP tests in separate cycles.
- OLAP temp folder: Placed on LUN O, which also hosted the TempDB database. Again, it was acceptable to share a LUN between the OLAP temp folder and TempDB because SQLCAT performed MOLAP and ROLAP tests separately and there was very little disk activity.

Note: SQLCAT hosted all resources on a single server to run the MOLAP and ROLAP performance tests with exactly the same system configuration. However, the single-server design depicted in Figure 1 is not a recommended configuration for production environments. For large-scale production designs, refer to the SQL Server Best Practices article "Scale-Out Querying with Analysis Services," available on Microsoft TechNet, as well as the white paper "Accelerating Microsoft adCenter with Microsoft SQL Server 2008 Analysis Services."

Performing Initial Data Warehouse Optimization

With the lab server design and deployment completed, SQLCAT was ready to build and load the relational data warehouse, which marked the next important optimization milestone. Figure 2 illustrates the steps performed at this stage.

Figure 2: Cube Design in Relationship to the Source Facts Table

Most importantly, SQLCAT streamlined the source facts table in the lab data warehouse to reduce the size of the data rows. Smaller data rows lead to smaller indexes and lower I/O overhead to retrieve the data, which ultimately benefits query performance. After all, ROLAP performance depends on the capabilities of the relational data warehouse to return results quickly.

SQLCAT carried out the following steps to optimize the source facts table:

- Removal of unnecessary columns: The original source facts table contained more columns than the lab cube needed. The columns shown in red in Figure 2 were not necessary. By removing these columns, SQLCAT was able to reduce the size of the facts table by more than 50 percent.
- Conversion of data types to save space: By converting the numeric columns in the original facts table to the bigint data type, SQLCAT was able to save 11 bytes per value (see the sketch after this list). This optimization also potentially improved the performance of Multidimensional Expressions (MDX) queries. For example, the DistinctCount function is optimized for integer values.
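A minimal sketch of such a conversion, assuming the original columns used a wider numeric type (the table and column names follow the fictitious names used throughout this paper):

-- Convert a wide numeric measure column to bigint to reduce the
-- row size; repeat for each measure column.
ALTER TABLE [dbo].[FactLabDataOptimized]
    ALTER COLUMN [TransmittedMB] bigint NOT NULL;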
Table 1 compares the original and optimized lab data warehouse versions.

Table 1: Original and Optimized Lab Data Warehouse Sizes

Dataset                                   Original    Optimized
MOLAP cube (30 days worth of data)        230 GB      230 GB
SQL Server DW: Dimension data             0.1 GB      0.1 GB
SQL Server DW: Facts data                 1.35 TB     521 GB
SQL Server DW: Indexed views*             0           3.2 GB

* After ROLAP performance optimization

Note: As a best practice to achieve high ROLAP performance, SQLCAT recommends avoiding complex data models and simplifying the relational data warehouse as much as possible, such as by removing unnecessary columns and, if possible, by replacing wide data types with narrower alternatives to increase storage efficiency. SQLCAT also recommends using a straightforward star schema, avoiding snowflake designs and many-to-many attribute relationships in the dimensions.

Deploying the Lab Cube and Defining the Test Cases

Having optimized the storage subsystem for fast serial I/O and reduced the size of the data warehouse by more than 50 percent, SQLCAT felt confident that the lab environment would reach satisfactory performance levels. It was time to deploy the original MOLAP cube and take it for a test run.

Lab Cube Deployment

To deploy the lab cube, SQLCAT reused the customer's Business Intelligence Development Studio (BIDS) solution. In this context, it is important to note that SQLCAT did not modify any aspect of the MOLAP cube design and even refrained from altering aggregations to keep the lab cube closely aligned with the customer's original. It was not necessary to apply design changes or cube optimizations because the cube already featured a clean star schema without any parent/child or many-to-many relationships in the dimension tables, as illustrated in Figure 2. This straightforward cube was an ideal candidate for a later conversion to ROLAP.

Note: Parent-child and many-to-many attribute relationships can have a substantial performance impact in both MOLAP and ROLAP modes. And unlike MOLAP, ROLAP can't determine the individual cube partitions that can satisfy the query if the query involves complex dimensions, which implies that Analysis Services must query all cube partitions. The section "Understanding Query Concurrency and Processor Utilization" later in this paper discusses the resulting performance impact and design implications.

The only extra optimization that SQLCAT applied to the MOLAP cube concerned the OLAP drive N, which SQLCAT formatted with a 32 KB cluster size for best performance, as mentioned earlier.

Cube Test Scenarios

For the test scenarios, SQLCAT planned to use a set of original customer queries. These queries promised realistic test runs, but a quick inspection revealed that the queries all used similar MDX expressions. Essentially, all of the queries the customer provided relied on the TopCount() function and analyzed three days or seven days of data.

The TopCount() function can help to highlight storage mode differences because it requires a full scan of the data across all relevant cube partitions, which also implies a full scan of the underlying facts tables in ROLAP mode. Still, SQLCAT deemed it necessary to create additional sets of three-day and seven-day queries, as well as advanced queries that had more variety and did not rely on the TopCount() function. This expanded portfolio of test cases enabled SQLCAT to simulate standard queries as well as complex ad-hoc queries that individual analysts might generate in a production environment.

Table 2 provides an overview of the test scenarios and a breakdown of the corresponding MDX queries that SQLCAT decided to use for the ROLAP investigation.

Table 2: Query Breakdown for SQLCAT Test Cases

Test case: MDX subqueries restricting the main query's multidimensional set based on various date/time ranges, then using the TopCount() function to sort and return the top-ten members for a variety of dimensions and measure groups.
Query IDs: Three-day queries: 3\01, 3\02, 3\03, 3\04, 3\05, 3\06, 3\07, and 3\08. Seven-day queries: 7\01, 7\02, 7\03, 7\04, 7\05, 7\06, 7\07, 7\08, 7\09, 7\10, and 7\11.
Comments: All queries use similar constructs with a cross product of multiple sets in the MDX subquery.

Test case: SELECT statements with members from the date dimension on columns and all members from another dimension on rows.
Query IDs: Three-day queries: 3\09, 3\10, 3\11, and 3\12.
Comments: Queries use different members or cross joins of multiple member sets on rows.

Test case: SELECT statements with results filtered on various criteria in the WHERE clause.
Query IDs: One-day query: 1\01. Two-day queries: 2\01 and 2\02. Other queries: O\01.
Comments: Queries include all measure groups and shaped slices or highly selective criteria in the WHERE clause. Query O\01 slices by attributes related to cube partitions.

Test case: Complex SELECT statements with calculated members and WHERE clauses.
Query IDs: Additional queries: A\01, A\02, A\03, and A\04.
Comments: Queries calculate standard deviations, average values, and year-to-date (YTD) values for all or selected date ranges, such as for all, three, or seven selected calendar days.

Selecting Test Tools and Methods

The standard tools to measure and benchmark Analysis Services performance are SQL Server Profiler and Performance Monitor. Another useful tool, and a SQLCAT favorite, is the Analysis Services command-line tool (ascmd.exe). ASCMD can run XML for Analysis (XMLA) scripts, MDX queries, and Data Mining Extensions (DMX) statements while tracking results and trace information in a log file.

Note: The ASCMD source code is available as part of the SQL Server 2008 Samples. With the SQL Server 2008 Samples installed, the path to the source code is %ProgramFiles%\Microsoft SQL Server\100\Samples\Analysis Services\Administrator\ASCMD. The source code can be compiled using Microsoft Visual Studio 2008 or Microsoft Visual C# 2008 Express Edition, which is available as a free download.

Figure 3 shows how SQLCAT automated the test runs by using a batch file to start the ASCMD command-line tool for each individual MDX query in a FOR…DO loop.

Figure 3: MOLAP and ROLAP Performance Test Method

For the MOLAP and ROLAP performance tests, SQLCAT used the following tools and methods:

- ASCMD command-line tool: SQLCAT fully automated the test runs for all MDX queries by using a batch file (see Figure 3). In each loop, SQLCAT first cleared the Analysis Services data cache by running the ClearCache.xmla script as documented in the ASCMD readme file, and then executed the currently selected MDX query, recording the start time, end time, and query duration in milliseconds.
- SQL Server Profiler: During the early stages of the ROLAP tests, SQLCAT double-checked the accuracy of the ASCMD results with Profiler traces. A Profiler trace can provide detailed information about where Analysis Services spends its time processing an MDX query, starting with the command parser and ending with the formula engine finishing query execution. SQLCAT also used Profiler traces to examine the SQL queries that the lab ROLAP cube generated.
- Performance Monitor: SQLCAT used Performance Monitor to keep an eye on CPU load, memory utilization, and I/O throughput to ensure that no unexpected factor or bottleneck distorted the query performance measurements.

Note: All performance results mentioned in this case study are based on a cold Analysis Services cache. By clearing the data cache prior to running each MDX query, SQLCAT ensured accurate tests. A warm cache would boost query performance but would also distort test results, as Analysis Services could then answer queries directly from memory without going to disk or querying the relational data warehouse.

Establishing a Performance Baseline

Figure 4 shows the performance baseline for the original MOLAP cube, which SQLCAT established directly on the lab server using the tools and methods described in the previous section. This cold-cache baseline served as the yardstick to evaluate ROLAP performance during the subsequent investigation.
Note: The figures in this paper use a logarithmic scale to help emphasize the difference between MOLAP and ROLAP on a per-query basis. A linear scale would emphasize the difference between short- and long-running queries in each storage mode, but it would make it very difficult to compare MOLAP and ROLAP queries that finish in less than 1 second as well as those that take more than 20 minutes.

Figure 4: MOLAP Performance Baseline on a Logarithmic Scale

As illustrated, the majority of the MDX queries finished in less than one minute, the additional SQLCAT queries O\01, A\02, A\03, and A\04 required more than five minutes, and query A\03 took the longest with 20 minutes and 11 seconds. Query A\03 required this time to calculate the average of the cross product of two large datasets over a particular measure for 30 days of data (25 billion rows of facts).

Implementing Data Warehouse Aggregations

Although SQLCAT also ran preliminary tests against the ROLAP cube, it did not benchmark the initial performance because the relational data warehouse did not include aggregations yet. Putting a MOLAP cube that includes aggregations side by side with a ROLAP cube that does not include aggregations is an apples-to-oranges comparison. For a useful investigation, it is necessary to apply at least the same aggregation design in both MOLAP and ROLAP modes. In fact, ROLAP requires a higher degree of optimization to compensate for the performance advantages MOLAP offers out of the box, such as automatic two-level bitmap indexing on every attribute and lower I/O overhead due to data compression.

Cube-Based ROLAP Aggregations

In MOLAP and HOLAP modes, aggregations reside in the multidimensional database maintained by Analysis Services. In ROLAP mode, on the other hand, aggregations are based on indexed views in the relational data warehouse. Analysis Services supports ROLAP aggregations insofar as it can create the indexed views that contain the aggregations, and then uses these indexed views instead of the source tables where possible to answer queries (see Figure 5). However, Analysis Services is not involved in materializing these views or storing their result sets. Analysis Services delegates these tasks to the SQL Server relational engine by submitting CREATE VIEW and CREATE UNIQUE CLUSTERED INDEX commands to the data warehouse during cube partition processing. The relational engine then creates and maintains the indexed views in the database in much the same way as tables with a clustered index.

To highlight how Analysis Services benefits from indexed views created by means of a cube-based aggregation design, Figure 5 describes the following scenarios:
- SQL query without aggregations in place (red arrow, left side of figure): The SQL query goes with the lowest possible grain directly against the source facts table. Although SQL Server can rapidly scan large data sets on a fast storage subsystem (for example, by aggressively performing synchronous and asynchronous I/O operations and using 512 KB block sizes), performance suffers because of the time spent scanning across unnecessarily detailed data in the database (25 billion rows of facts in the SQLCAT lab environment).
- SQL query with aggregations in place (purple arrow, right side of figure): Analysis Services can generate a SQL query against the indexed view, and the relational engine can process the query much faster because the indexed view has already summarized the data and persisted the result set.

Figure 5: Analysis Services Querying with and without Cube-Based ROLAP Aggregations

Transparent ROLAP Aggregations

For a ROLAP cube to query an indexed view instead of a source table, Analysis Services must be able to recognize that the indexed view exists and contains aggregations that can satisfy the MDX query. This is only the case if Analysis Services created the indexed view based on a ROLAP aggregation design, such as a design created using the aggregation wizards in BIDS. In contrast, indexed views created manually by using SQL Server Management Studio (SSMS) or other database tools are transparent to Analysis Services. These database objects are not directly associated with a ROLAP cube. Nonetheless, Analysis Services can still benefit from transparent aggregations, which offer DW architects additional flexibility for ROLAP performance tuning.

The key to leveraging transparent aggregations is the relational engine's ability to choose an optimal execution plan for parsed SQL queries. If the relational engine's query optimizer finds a match between the query elements (such as columns, search conditions, joins, and aggregate functions) and the elements of an indexed view, it considers the view index as a possible option. The query optimizer then estimates the cost of each possible option, chooses the query plan with the lowest cost, and keeps track of the choice for subsequent queries. As a result, if an index of a view has the lowest cost of any considered access option, the query optimizer chooses this view index even though the original SQL query references a table in the FROM clause.

The execution plan in Figure 6 shows the query optimizer at work. The SQL query references the FactLabDataOptimized table, but the relational engine transparently chooses the clustered index on the Agg_Fact Lab Data_SW_ID view to answer the query. Microsoft introduced this query optimization feature based on indexed views 10 years ago with SQL Server 2000 to increase the performance of database applications without requiring T-SQL code modifications, and has continued to improve the view-matching capabilities with every subsequent SQL Server release.

Note: The query optimizer does not differentiate between client applications and is not aware of Analysis Services, and Analysis Services is not aware of the query optimizer.

Figure 6: Query Execution Plan Selecting a Transparent Aggregation
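The following minimal sketch illustrates this view-matching behavior, assuming an indexed view like the Agg_Facts By Software By Date example created later in this paper. Automatic view matching requires an edition that supports it, such as Enterprise Edition; the NOEXPAND hint in the second query forces the view index and is also a handy way to verify the behavior:

-- The optimizer may answer this base-table query from a matching
-- indexed view, rolling the view's per-day rows up to one row per
-- Software_ID.
SELECT [Software_ID], SUM([TransmittedMB]) AS [TransmittedMB]
FROM [dbo].[FactLabDataOptimized] AS f
JOIN [dbo].[DimDate] AS d ON f.[Date_ID] = d.[Date_ID]
GROUP BY [Software_ID];

-- Referencing the indexed view directly with NOEXPAND returns the
-- same result and confirms what the view index delivers.
SELECT [Software_ID_4], SUM([TransmittedMB_0]) AS [TransmittedMB]
FROM [dbo].[Agg_Facts By Software By Date] WITH (NOEXPAND)
GROUP BY [Software_ID_4];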
Choosing an Aggregation Strategy

For Analysis Services to leverage its sophisticated algorithms for calculating summary data based on aggregations and for selecting aggregated values for the calculation of other aggregated values, it is necessary to create the aggregations through Analysis Services. However, these cube-based ROLAP aggregations also have disadvantages (see Table 3). Transparent aggregations can be a better choice to build a foundation for all cubes in the data warehouse, while cube-based aggregations can then help to increase the performance of each cube individually. By using transparent aggregations in combination with cube-based aggregations, DW architects can implement an efficient aggregation design.

Table 3 compares the general advantages and disadvantages of cube-based and transparent aggregations. These considerations are particularly important for environments with multiple ROLAP cubes accessing the same relational data warehouse.

Table 3: Advantages and Disadvantages of ROLAP Aggregation Types

Cube-based aggregations

Advantages:
- Efficient query processing: Analysis Services can use cube-based aggregations even if the query and aggregation granularities do not exactly match. For example, a query on [Month] can use an aggregation on [Day], which requires only the summarization of up to 31 numbers.
- Aggregation design efficiency: Analysis Services includes the Aggregation Design Wizard and the Usage-Based Optimization Wizard to create aggregation designs based on storage and percentage constraints or on queries submitted by client applications.

Disadvantages:
- Storage inefficiency: Multiple ROLAP cubes using the same data warehouse do not share cube-based aggregations. Analysis Services creates separate indexed views per cube even if the cubes include the same or similar aggregation designs.
- Processing overhead: Analysis Services drops and recreates the indexed views associated with cube-based aggregations during cube partition processing. Dropping and recreating the indexes can take an excessive amount of time in a large-scale data warehouse.

Transparent aggregations

Advantages:
- Reuse of existing indexes across cubes: Multiple cubes using the same data warehouse can benefit from the same set of indexed views. Sharing indexed views among cubes mitigates aggregation-related storage requirements.
- Less overhead during cube processing: Analysis Services is unaware of the aggregations and does not drop the indexed views during partition processing. There is no need to drop indexed views because the relational engine maintains the indexes continuously, such as during INSERT, UPDATE, and DELETE operations against the base tables.

Disadvantages:
- No sophisticated aggregation algorithms: Indexed views must match query granularity. The query optimizer doesn't consider dimension hierarchies or aggregation granularities in the query execution plan. For example, a SQL query with GROUP BY on [Month] can't use an index on [Day].
- Maintenance overhead: Database administrators must maintain the aggregations by using SSMS or other tools. It is difficult to keep track of the relationships between indexed views and ROLAP cubes.
- Design complexity: Database Engine Tuning Advisor can help to facilitate aggregation design tasks by analyzing Profiler traces, but it can't identify all possible candidates. Moreover, DW architects must manually study Profiler traces to determine effective aggregations.

Figure 7 highlights the relationship between cube-based and transparent aggregations in a centralized data warehouse shared among multiple ROLAP cubes. As illustrated, each cube maintains its own aggregations even if the data warehouse already contains suitable indexed views. For this reason, SQLCAT recommends reviewing the indexed views in the data warehouse before and after creating or updating cube-based aggregation designs through the Aggregation Design Wizard or the Usage-Based Optimization Wizard to avoid inefficient use of storage space.

Figure 7: Cube-Based and Transparent ROLAP Aggregations
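One way to perform such a review is to enumerate the indexed views that currently exist in the data warehouse; a minimal sketch:

-- An indexed view is a view with a clustered index (index_id = 1).
SELECT v.name AS view_name, i.name AS index_name
FROM sys.views AS v
JOIN sys.indexes AS i
    ON i.object_id = v.object_id
WHERE i.index_id = 1
ORDER BY v.name;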
Creating Transparent Aggregations

Taking into consideration that transparent aggregations can be an important ROLAP design element, SQLCAT decided to create a basic set of indexed views manually in order to add transparent aggregations to the lab data warehouse. It was not hard to find good aggregation candidates, given that a considerable number of test cases relied on similar MDX queries to analyze three or seven days of data along a given dimension, such as software or segment. By studying Profiler traces, SQLCAT determined that the lab ROLAP cube translated most of these queries into SQL statements according to the following pattern:

SELECT SUM([TransmittedMB]), SUM([GatheredMB]),
       SUM([Duration]), SUM([IncidentCount]),
       [<Dimension Key>]
FROM [dbo].[FactLabDataOptimized], [dbo].[DimDate]
WHERE (([FactLabDataOptimized].[Date_ID] = [DimDate].[Date_ID])
  AND ([Year] = <…>) AND ([Quarter] = <…>)
  AND ([Month] = <…>) AND ([Day] = <…>))
GROUP BY [<Dimension Key>]

The dimension key was the only relevant variable, and with seven different dimension keys in the source facts table, SQLCAT was able to cover the most basic test cases with seven indexed views that each joined the facts data to the date dimension on the date key and then aggregated the values by an additional dimension key and by date dimension attributes. The clustered indexes then physically ordered the results by dimension key, year, quarter, month, and day so that the aggregated values for consecutive days were located in close physical proximity in the database.

The following T-SQL listing shows the corresponding CREATE VIEW and CREATE UNIQUE CLUSTERED INDEX statements with placeholders for dimension-specific information (for more information about creating indexed views, refer to the topic "Creating Indexed Views" in SQL Server 2008 R2 Books Online):

CREATE VIEW [dbo].[Agg_Facts By <Dimension> By Date]
(
    [TransmittedMB_0], [GatheredMB_1], [Duration_2],
    [IncidentCount_3], [<Dimension Key>_4],
    [Year_5], [Quarter_6], [Month_7], [Day_8],
    [COUNT_BIG]
)
WITH SCHEMABINDING
AS
(
    SELECT SUM([TransmittedMB]), SUM([GatheredMB]),
           SUM([Duration]), SUM([IncidentCount]),
           [<Dimension Key>],
           [Year], [Quarter], [Month], [Day],
           COUNT_BIG(*)
    FROM [dbo].[FactLabDataOptimized], [dbo].[DimDate]
    WHERE ([FactLabDataOptimized].[Date_ID] = [DimDate].[Date_ID])
    GROUP BY [<Dimension Key>], [Year], [Quarter], [Month], [Day]
)
GO
CREATE UNIQUE CLUSTERED INDEX [Agg_Facts By <Dimension> By Date]
ON [dbo].[Agg_Facts By <Dimension> By Date]
(
    [<Dimension Key>_4] ASC, [Year_5] ASC, [Quarter_6] ASC,
    [Month_7] ASC, [Day_8] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
      SORT_IN_TEMPDB = ON, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,
      ONLINE = OFF, ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF,
      FILLFACTOR = 100)
ON [PRIMARY]
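For concreteness, here is a hypothetical instantiation of the pattern for the Software dimension; every name is derived mechanically from the placeholders above, and the index options are abbreviated to the non-default ones:

CREATE VIEW [dbo].[Agg_Facts By Software By Date]
(
    [TransmittedMB_0], [GatheredMB_1], [Duration_2],
    [IncidentCount_3], [Software_ID_4],
    [Year_5], [Quarter_6], [Month_7], [Day_8],
    [COUNT_BIG]
)
WITH SCHEMABINDING
AS
(
    SELECT SUM([TransmittedMB]), SUM([GatheredMB]),
           SUM([Duration]), SUM([IncidentCount]),
           [Software_ID],
           [Year], [Quarter], [Month], [Day],
           COUNT_BIG(*)
    FROM [dbo].[FactLabDataOptimized], [dbo].[DimDate]
    WHERE ([FactLabDataOptimized].[Date_ID] = [DimDate].[Date_ID])
    GROUP BY [Software_ID], [Year], [Quarter], [Month], [Day]
)
GO
CREATE UNIQUE CLUSTERED INDEX [Agg_Facts By Software By Date]
ON [dbo].[Agg_Facts By Software By Date]
(
    [Software_ID_4] ASC, [Year_5] ASC, [Quarter_6] ASC,
    [Month_7] ASC, [Day_8] ASC
)
WITH (SORT_IN_TEMPDB = ON, ALLOW_ROW_LOCKS = OFF,
      ALLOW_PAGE_LOCKS = OFF, FILLFACTOR = 100)
ON [PRIMARY]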
Table Binding versus Query Binding

Much to SQLCAT's dismay, creating indexed views on top of a facts table with 25 billion rows took far longer than expected. After more than 24 hours, SQLCAT cancelled this process as soon as the relational engine completed the first view and index. This first indexed view now had to suffice for a small ROLAP test with a subset of MDX queries. However, not a single query showed performance improvements. It turned out that the facts table design and the corresponding cube partition design did not permit the relational engine to utilize indexed views for SQL queries originating from Analysis Services. Figure 8 shows the relevant design aspects.

Figure 8: Relational Facts Table Partitions and Corresponding Cube Partitions

According to a best practice in large-scale data warehousing, the design of the facts table relied on partitions organized by calendar day to facilitate data pruning along a sliding time window. In this particular case, the partition function used the Date_ID in the format yyyyMMdd. Every day, the data warehouse would drop one expired table partition and add a new partition for the current day. An XMLA script would then drop the expired cube partition as well and create a new cube partition with a source query that referenced the new table partition by specifying the current day's Date_ID in the WHERE clause. This design works well for MOLAP cubes, but it creates issues for ROLAP.

The primary problem is that Analysis Services must use query binding to reference individual facts table partitions, so the SQL queries do not reference the underlying source facts table directly. Instead, Analysis Services uses the source query of the cube partition in the FROM clause to access the relevant facts data, and because the FROM clause now references a subquery instead of the base table, the query optimizer can't choose a view index to optimize the execution plan. The query optimizer only considers view indexes if the SQL query references a base table or if the FROM clause specifies an indexed view directly. Therefore, ROLAP cube partitions must use table binding to benefit from transparent aggregations in the data warehouse.

Figure 9 confirms this requirement. As the query execution plans indicate, the first query uses table binding and leverages the indexed view, while the second query uses query binding and performs all joining, sorting, and aggregating dynamically. Both queries return the same results, but the second is much less efficient because it does not use the aggregations that already exist in the data warehouse.

Figure 9: Query Plans for Table Binding and Query Binding
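The difference is easiest to see in the shape of the SQL itself. A simplified sketch of the two query shapes compared in Figure 9 (the Date_ID value and column choices are illustrative):

-- Table binding: the FROM clause references a base table, so the
-- optimizer can consider a matching indexed view.
SELECT SUM([TransmittedMB]), [Software_ID]
FROM [dbo].[FactLabDataOptimized]
GROUP BY [Software_ID];

-- Query binding: the FROM clause wraps the cube partition's source
-- query in a subquery, so view indexes are not considered.
SELECT SUM([TransmittedMB]), [Software_ID]
FROM (
    SELECT *
    FROM [dbo].[FactLabDataOptimized]
    WHERE [Date_ID] = 20100701
) AS [PartitionSource]
GROUP BY [Software_ID];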
Reorganizing the Lab Data Warehouse

SQLCAT learned two lessons from this first attempt to implement ROLAP aggregations. Most importantly, Analysis Services must not use subqueries, which means that Data Source Views (DSVs) must reference source tables (not views or named queries), and cube partitions must use table binding as the binding type to support ROLAP aggregations. Second, the relational data warehouse should not maintain facts data in one large table. While it is possible to create views and indexes that span an entire partitioned table before loading the data, changing any aspect of the table or aggregation design throughout the cube's lifecycle leads to lengthy processing time. For example, switch operations on partitioned tables require dropping and recreating any associated indexed views.

Taking these dependencies into account, SQLCAT concluded that the lab data warehouse required a substantial overhaul. It was necessary to replace the facts table partitions for individual days of data with separate daily tables, load the facts data from the original source table into the daily tables, create separate indexed views for each daily table, and then reconfigure the cube partitions to access the daily tables instead of the original source table.

Note: Partitioning a large database by using separate daily tables used to be a best practice for Microsoft SQL Server 2000 and earlier versions, but it is not a best practice for SQL Server 2005 Enterprise Edition and higher because these editions support table partitions based on a partition function. SQLCAT reverted to the earlier method to achieve optimal ROLAP query performance, but does not recommend this approach for other partitioning scenarios because of the incurred complexity and administrative overhead.

Replacing Table Partitions with Individual Base Tables

The following T-SQL listing conveys how SQLCAT created the daily facts tables. Each table mirrored the design of the original facts table, with the exception of the CHECK constraint on the Date_ID column. This CHECK constraint limited the valid data range per table to the correct calendar day. As the placeholders indicate, SQLCAT also appended the calendar-day information to the table name. This common best practice helps to avoid confusion, particularly during the configuration of the cube partitions in a subsequent step.

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
CREATE TABLE [dbo].[FactLabDataOptimized_<yyyyMMdd>]
(
    [Base_ID] [bigint] NOT NULL,
    [Segment_ID] [int] NOT NULL,
    [Software_ID] [int] NOT NULL,
    [Date_ID] [int] NOT NULL CHECK ([Date_ID] = <yyyyMMdd>),
    [Time_ID] [smallint] NOT NULL,
    [ClientSocket_ID] [int] NOT NULL,
    [TargetSocket_ID] [int] NOT NULL,
    [LayerThree_ID] [smallint] NOT NULL,
    [TransmittedMB] [bigint] NOT NULL,
    [GatheredMB] [bigint] NOT NULL,
    [Duration] [bigint] NOT NULL,
    [IncidentCount] [bigint] NOT NULL,
    CONSTRAINT [FK_FactLabDataOptimized_<yyyyMMdd>_DimSegments]
        FOREIGN KEY ([Segment_ID]) REFERENCES [dbo].[DimSegments] ([Segment_ID]),
    CONSTRAINT [FK_FactLabDataOptimized_<yyyyMMdd>_DimSoftware]
        FOREIGN KEY ([Software_ID]) REFERENCES [dbo].[DimSoftware] ([Software_ID]),
    CONSTRAINT [FK_FactLabDataOptimized_<yyyyMMdd>_DimDate]
        FOREIGN KEY ([Date_ID]) REFERENCES [dbo].[DimDate] ([Date_ID]),
    CONSTRAINT [FK_FactLabDataOptimized_<yyyyMMdd>_DimSockets]
        FOREIGN KEY ([ClientSocket_ID]) REFERENCES [dbo].[DimSockets] ([Socket_ID]),
    CONSTRAINT [FK_FactLabDataOptimized_<yyyyMMdd>_DimSockets1]
        FOREIGN KEY ([TargetSocket_ID]) REFERENCES [dbo].[DimSockets] ([Socket_ID]),
    CONSTRAINT [FK_FactLabDataOptimized_<yyyyMMdd>_DimLayerThree]
        FOREIGN KEY ([LayerThree_ID]) REFERENCES [dbo].[DimLayerThree] ([LayerThree_ID]),
    CONSTRAINT [FK_FactLabDataOptimized_<yyyyMMdd>_DimTime]
        FOREIGN KEY ([Time_ID]) REFERENCES [dbo].[DimTime] ([Time_ID]),
    CONSTRAINT [PK_FactLabDataOptimized_<yyyyMMdd>]
        PRIMARY KEY CLUSTERED ([Base_ID] ASC)
        WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
              IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
              ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Note: An alternative approach is to switch out partitions to new tables. However, switch operations on partitioned tables require dropping and recreating any associated indexed views, which renders this approach impractical for large-scale data warehouses with ROLAP aggregations based on indexed views.

Loading the Facts Data

Considering the large data volume, SQLCAT opted to use SQL Server Integration Services (SSIS) for bulk-loading the facts data into the daily tables. The SSIS packages that SQLCAT created for this purpose included a Data Flow task with an OLE DB Source and a SQL Server Destination. The OLE DB Source used a SQL command with the Date_ID in the WHERE clause to restrict the extracted source data to the relevant day. The SQL Server Destination then mapped the source's output columns in a straightforward way to the destination's input columns.

Figure 10 displays the resulting distribution of facts data across the new collection of daily tables in the lab data warehouse.

Figure 10: Bulk Loading Daily Facts Using SSIS

Although the SQL Server Destination object is less flexible than other Extract, Transform, and Load (ETL) objects, it lets you customize bulk-load operations for high performance. Specifically, SQLCAT disabled the option to check constraints and ran five packages in parallel, achieving a throughput of 32 MB per second. By increasing the network packet size to 32 KB, SQLCAT increased the bulk-load performance even further, to 42 MB per second. For detailed information about SSIS-based high-performance bulk loading, refer to "We Loaded 1TB in 30 Minutes with SSIS, and So Can You," available on Microsoft MSDN. See also "Top 10 SQL Server Integration Services Best Practices" on the SQLCAT Web site.
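The extraction query in each package followed a shape like this sketch (the Date_ID value is illustrative; the column list matches the daily-table listing above):

-- Per-day extraction used by the OLE DB Source; the WHERE clause
-- restricts the copied rows to a single calendar day.
SELECT [Base_ID], [Segment_ID], [Software_ID], [Date_ID], [Time_ID],
       [ClientSocket_ID], [TargetSocket_ID], [LayerThree_ID],
       [TransmittedMB], [GatheredMB], [Duration], [IncidentCount]
FROM [dbo].[FactLabDataOptimized]
WHERE [Date_ID] = 20100701;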
After loading the facts data, SQLCAT created the full set of indexed views for each daily table with the options SORT_IN_TEMPDB = ON, ALLOW_PAGE_LOCKS = OFF, ALLOW_ROW_LOCKS = OFF, and FILLFACTOR = 100. For more information about how SQLCAT created indexed views for ROLAP aggregations, see the section "Creating Transparent Aggregations" earlier in this case study.

Realigning Cube Partitions

For maintenance and other reasons, it is a best practice to partition cubes in exactly the same way as the underlying facts data. Applied to the lab environment, this meant that the ROLAP cube required 31 cube partitions, each mapping directly to a corresponding daily table in the lab data warehouse, as illustrated in Figure 11. Given this large number of cube partitions, SQLCAT opted to automate the task using XMLA scripts.

Figure 11: Aligning Cube Partitions with Daily Facts Tables

To bring the ROLAP cube back into alignment with the lab data warehouse, SQLCAT performed the following steps:

- Adding all daily facts tables to the DSV: By using the Add/Remove Tables wizard in BIDS, SQLCAT added all 31 daily facts tables to the Analysis Services database in a single step. The Add/Remove Tables wizard established the relationships between facts and dimension tables automatically because the daily tables included all necessary foreign key definitions. It is worth pointing out that SQLCAT did not remove the original facts table from the DSV because it served as the source table for the measures in the Fact Lab Data Optimized measure group.
- Removing all existing cube partitions: By deleting the existing cube partitions associated with the original table partitions, SQLCAT excluded the original facts table from further querying and processing in the Fact Lab Data Optimized measure group.
- Adding daily cube partitions: By using the following XMLA script, SQLCAT added all 31 daily partitions to the ROLAP cube with consistent configuration settings. Particularly important for query performance is the Slice property, which Analysis Services uses during query execution to determine the partitions that contain relevant data. SQLCAT configured each cube partition's Slice property according to each base table's calendar day (see also Figure 11).

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ParentObject>
    <DatabaseID>SQLCAT</DatabaseID>
    <CubeID>SQLCAT</CubeID>
    <MeasureGroupID>Fact Lab Data Optimized</MeasureGroupID>
  </ParentObject>
  <ObjectDefinition>
    <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Name>Fact Lab Data Optimized_<yyyyMMdd></Name>
      <ID>Fact Lab Data Optimized_<yyyyMMdd></ID>
      <Source xsi:type="DsvTableBinding">
        <DataSourceViewID>SQLCAT</DataSourceViewID>
        <TableID>dbo_FactLabDataOptimized_<yyyyMMdd></TableID>
      </Source>
      <StorageMode>Rolap</StorageMode>
      <ProcessingMode>Regular</ProcessingMode>
      <Slice>
        [Dim Date].&amp;[<yyyy>]&amp;[<qq>]&amp;[<MM>]&amp;[<d>]
      </Slice>
      <ProactiveCaching>
        <SilenceInterval>-PT1S</SilenceInterval>
        <Latency>P0D</Latency>
        <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
        <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
        <Enabled>false</Enabled>
        <AggregationStorage>MolapOnly</AggregationStorage>
        <OnlineMode>Immediate</OnlineMode>
        <Source xsi:type="ProactiveCachingInheritedBinding">
          <NotificationTechnique>Server</NotificationTechnique>
        </Source>
      </ProactiveCaching>
    </Partition>
  </ObjectDefinition>
</Create>

- Processing the measure group: To process all daily cube partitions and any unprocessed dimensions for the Fact Lab Data Optimized measure group in a single step, SQLCAT reprocessed the measure group programmatically by using the following XMLA script. This script enabled SQLCAT to process the data quickly using the ASCMD command-line tool.

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>SQLCAT</DatabaseID>
    <CubeID>SQLCAT</CubeID>
    <MeasureGroupID>Fact Lab Data Optimized</MeasureGroupID>
    <PartitionID>$(Partition)</PartitionID>
  </Object>
  <Type>ProcessFull</Type>
  <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
</Process>

Note: The ASCMD command-line tool supports a -v command parameter for different variables. In this case, the different variables corresponded to the different partitions (that is, different dates) within the cube.

Reviewing Measure Groups and Measures

With the successful completion of the data warehouse overhaul and subsequent cube alignment, SQLCAT felt confident that the ROLAP cube now met all aggregation requirements, but Analysis Services did not concur. The ROLAP cube continued to generate SQL queries with subqueries in the FROM clause, despite the fact that all cube partitions now used table binding and mapped directly to daily facts tables.
SQLCAT needed to investigate and solve this issue because as long as a ROLAP cube uses subqueries to retrieve data from the underlying data warehouse, it can't benefit from relational aggregations in the data warehouse.

Zooming in on Measures

Figure 12 shows the hierarchy of the most important Analysis Services objects that influence how a ROLAP cube communicates with an underlying data warehouse.

Figure 12: Analysis Services Object Hierarchy

Based on this hierarchy and the following considerations, SQLCAT was able to narrow down the root cause and focus attention on the measure group:

- Data source: The Data Source object of an Analysis Services database specifies the connection string, database isolation level, and maximum number of concurrent connections, which have no bearing on the binding type.
- Data source view: The DSV on top of the Data Source included no views or named queries and only referenced base tables, so the DSV was not the reason.
- Cube partitions: The cube partitions also showed the correct configuration. The XMLA script to create the partitions set the Type parameter of the Source property correctly to DsvTableBinding (see the first XMLA script under "Realigning Cube Partitions" earlier in this case study).
- Cube-based aggregations: The cube partitions were not associated with an aggregation design, so cube-based aggregations did not yet exist at this stage.
- Measure groups: The next logical place to look was the Fact Lab Data Optimized measure group with its measures Transmitted MB, Gathered MB, Duration, Incident Count, and Fact Lab Data Optimized Count. The first four measures relied on the SUM aggregation function to calculate summary data over their corresponding source columns (that is, Transmitted MB, Gathered MB, Duration, and Incident Count in the daily facts tables), but the fifth measure was different. It used the Count of Rows function, which calculates the row count over the entire source table, not just a single column. This Count of Rows measure was the likely cause of the issue.

The Fact Lab Data Optimized Count measure was also different in the sense that it did not exist in the original MOLAP cube. When SQLCAT created the ROLAP cube, the Cube Wizard in BIDS automatically added this measure to the measure group. As soon as SQLCAT deleted this measure and processed the cube, Analysis Services switched to table binding in the SQL queries, as SQLCAT verified through Profiler traces.

Row Counting and Query Binding

To understand the query-binding puzzle created by Count of Rows measures, it is important to note that Analysis Services attempts to calculate the row count in a way that is independent of the underlying Database Management System (DBMS). Basically, Analysis Services constructs a SQL query that dynamically adds a column with a constant value of 1 in each row to the result set. As the Profiler trace in Figure 13 reveals, this query then serves as the subquery of an outer SQL query that uses the SQL aggregation function COUNT_BIG on this dynamic column to return the number of rows as a bigint. Because all rows have a value of 1 in this column, there is no need to pay special attention to the handling of NULL values by the DBMS, but the drawback is query binding. If a ROLAP cube is supposed to use aggregations based on indexed views, it can't include Count of Rows measures in its measure groups.
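The generated SQL follows a pattern like this simplified sketch (the alias, column name, and date are illustrative, not the exact identifiers Analysis Services emits):

-- The subquery adds a constant column to every row; the outer query
-- counts the rows with COUNT_BIG. The subquery in the FROM clause is
-- what rules out indexed-view matching.
SELECT COUNT_BIG([RowCount0])
FROM (
    SELECT 1 AS [RowCount0]
    FROM [dbo].[FactLabDataOptimized_20100701]
) AS [FactTable];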
Hence, for optimized ROLAP, SQLCAT recommends disabling this Count of Rows measure.

Figure 13: Calculating the Row Count with the Help of a Subquery

Benchmarking the ROLAP Cube

At last, SQLCAT's redesign and optimization efforts paid off. The ROLAP cube was finally ready for performance testing, and thanks to the amazingly fast performance of the relational SQL Server engine on top of a super-fast storage subsystem, the results looked better than expected. To everybody's surprise, the ROLAP cube outpaced the MOLAP cube in 45 percent of all queries right from the start (see Figure 14). Only 39 percent of the queries showed substantially slower response times in ROLAP mode (more than twice the MOLAP time), and 16 percent showed moderate performance degradation (less than twice the MOLAP time). Admittedly, the lab cube featured a low number of attributes and low cardinality, and the date dimension was relatively simple, but the performance results proved that ROLAP can be a viable option for simple cubes with large data volumes but low dimensional complexity.

Figure 14: ROLAP Performance in Comparison to a MOLAP Baseline on a Logarithmic Scale

TopCount Customer Queries

In particular, the three-day and seven-day queries provided by the customer exhibited remarkable performance behavior. For these TopCount() queries, ROLAP was on par with MOLAP and in many cases faster. For example, query 7\11's average response time dropped from 9 minutes and 50 seconds to 2 minutes and 39 seconds (see Figure 14). Across all TopCount() customer queries, average response times dropped from 1 minute and 28 seconds with MOLAP to 58 seconds with ROLAP. The average performance was better with ROLAP because the three-day and seven-day TopCount() queries were able to leverage the transparent aggregations in ROLAP mode, while MOLAP is naturally unable to exploit these types of aggregations in the relational data warehouse to increase query performance. The original MOLAP cube did not include corresponding aggregation designs.

The ROLAP performance also benefited from the daily cube partition design. As explained earlier, the ROLAP cube included 31 partitions that mapped to 31 daily facts tables, each with its own set of indexed views that satisfied the TopCount() queries. In this configuration, when executing an MDX query for a given range of days, Analysis Services first checks the Slice property of the cube partitions to determine the partitions that contain relevant data, and then submits SQL queries to the data warehouse for only those facts tables that correspond to the relevant cube partitions. If the relational engine can process these SQL queries in parallel and fully leverage multiple indexed views, and if the storage subsystem is optimized for fast serial I/O, the data warehouse can return the results to Analysis Services in the shortest possible time.

Figure 15 shows a seven-day MDX query example. The seven selected days corresponded to seven cube partitions, each owning a Slice within the specified day range. Accordingly, Analysis Services submitted seven SQL queries to the SQLCAT lab data warehouse.

Figure 15: ROLAP Query Processing and TopCount Performance Results

Additional SQLCAT MDX Queries

For the most part, the additional queries that SQLCAT defined for a greater variety of test cases also performed well in ROLAP mode, although query performance suffered noticeably over larger data sets. As Figure 16 reveals, query A\03 was the worst performer.
It calculated average values for 30 days of data, and ROLAP required no less than 2 hours, 34 minutes, and 57 seconds to deliver the results. The next section, "Understanding Query Concurrency and Processor Utilization," deals with query A\03 in more detail.

Figure 16: Additional MDX Query Performance Results

Understanding Query Concurrency and Processor Utilization

An interesting aspect of query A\03 is that its MDX expression was very similar to that of query A\02, yet query A\02 performed much better. The difference was that query A\02 analyzed data over seven days whereas query A\03 covered 30 days of data.

By studying Performance Monitor traces, SQLCAT noticed the following query behavior:

- Seven-day MDX query A\02: Generated seven SQL queries, which the relational engine was able to execute quickly because the number of queries did not exceed the number of processor cores on the server. The server had eight cores, and thanks to the high I/O performance of the storage subsystem, the relational engine was able to reach a very high level of CPU utilization while executing all seven queries in parallel.
- 30-day MDX query A\03: Generated 30 SQL queries, which far exceeded the number of available cores, thus flooding the relational engine and causing a substantial backlog of queries attempting to execute. The relational engine parallelized each of these 30 SQL queries and utilized all available cores to distribute the query threads, which caused excessive thread thrashing and I/O congestion.

To increase the performance of query A\03 and to ensure that the cores would not become a bottleneck for a wider range of ROLAP queries, SQLCAT concluded that it would be necessary to limit parallel query plan generation and scale out the lab environment, as illustrated in Figure 17. By restricting the maximum degree of parallelism to 1 (MAXDOP = 1), SQL queries run single-threaded, so 30 cores would be necessary to run all SQL queries of MDX query A\03 in parallel. For more information about the MAXDOP setting, refer to the section "Max Degree of Parallelism Option" in SQL Server 2008 Books Online. Note that the MAXDOP setting can also be enforced with Resource Governor.
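For reference, a minimal sketch of the server-wide setting (Resource Governor workload groups offer a more targeted way to enforce the same limit):

-- Restrict every query plan to a single thread, as in the lab tests.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;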
Figure 17: ROLAP SQL Query Fan-Out

Note: The SQL query fan-out design depicted in Figure 17 primarily benefits ROLAP cubes with a high percentage of MDX queries spanning all cube partitions, such as MDX queries on attributes that are not associated with partition slices. This design does not apply to MOLAP, which has substantially lower hardware requirements. For example, MOLAP was able to execute query A\03 much faster than ROLAP on a single server with eight cores because it did not require as much CPU power to parse through 30 days of data (see the results in Figure 16).

Usage-Based Optimization of Long-Running Queries

Up to this point, SQLCAT had implemented ROLAP aggregations based on the assumption that all MDX queries had equal weight. The next logical step was to increase cube performance specifically for long-running queries, such as queries 3\10 and 3\11, by using the Usage-Based Optimization (UBO) Wizard.

The Usage-Based Optimization Wizard can create smart cube-based aggregations based on statistical information about MDX queries tracked in the Analysis Services query log. To generate suitable query log entries, SQLCAT enabled the query log as discussed in the TechNet article "Configuring the Analysis Services Query Log" and then ran the selected queries via the ASCMD command-line tool (see the sketch at the end of this section).

Figure 18 shows a breakdown of the response times for queries 3\10 and 3\11 before and after UBO. As the results reveal, the long-running queries benefited substantially from UBO.

Figure 18: Query Times Before and After UBO on a Logarithmic Scale

Particularly in ROLAP mode, response times dropped from almost 3 minutes to roughly 300 milliseconds. Prior to UBO, the ROLAP queries lagged far behind their MOLAP counterparts. After UBO, MOLAP and ROLAP response times were on par, only fractions of a second apart. This proves that a smart aggregation design applied through UBO can benefit both MOLAP and ROLAP cubes. However, UBO should not be applied indiscriminately to all queries, to avoid inadvertent effects on storage and processing overhead, as discussed in the Technical Note "Reintroducing Usage-Based Optimization in SQL Server 2008 Analysis Services" on the SQLCAT Web site.
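As a minimal sketch, assuming the query log was directed to a SQL Server table with the default name OlapQueryLog, a query such as the following can surface the most expensive logged requests that are candidates for usage-based optimization:

-- Aggregate the logged dataset requests to find the slowest and most
-- frequent candidates for UBO; Duration is recorded in milliseconds.
SELECT TOP (20)
       MSOLAP_Database,
       MSOLAP_ObjectPath,
       Dataset,                  -- encoded attribute vector of the request
       COUNT(*)      AS Hits,
       AVG(Duration) AS AvgDurationMs
FROM dbo.OlapQueryLog
GROUP BY MSOLAP_Database, MSOLAP_ObjectPath, Dataset
ORDER BY AvgDurationMs DESC;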
Dealing with Skewed Distribution of Data

In addition to aggregations, SQLCAT was also interested in studying the effect of data skew on ROLAP performance. The lab data warehouse facilitated this investigation because the Software dimension data was very unevenly distributed across the facts data. As illustrated in Figure 19, the vast majority of the Software_IDs in the source facts tables were unknown values (-1). Only a small portion of the IDs referred to known entities.

Figure 19: Skewed Distribution of Software_IDs in a Lab Facts Table

In particular, SQLCAT wanted to examine the impact of data skew on the query optimizer. The query optimizer uses a complex optimization algorithm based on table and index statistics to formulate the query execution plan. This algorithm works well in most cases, but it is not perfect. Skewed data can cause the query optimizer to generate an inefficient query execution plan, which affects ROLAP performance while the effect remains virtually nonexistent for MOLAP.

To perform the tests, SQLCAT ran arbitrary ad-hoc queries against the MOLAP and ROLAP cubes. These ad-hoc queries did not benefit from aggregations or indexed views, and the performance results validated the assumption that data skew requires more careful compensation with ROLAP than with MOLAP. The MOLAP queries performed well with an average response time of 49 seconds, while the ROLAP queries, on average, took approximately two minutes longer (see Figure 19).

One way to compensate for data skew in a relational data warehouse is to cover the most relevant queries with indexed views. Another is to ensure that appropriate, up-to-date column statistics exist, as outlined in the sketch below. For information about concepts and guidelines for using query optimization statistics, refer to the topic "Using Statistics to Improve Query Performance" in SQL Server 2008 Books Online.
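As a minimal sketch, assuming a hypothetical daily facts table named Facts_20100701, the following T-SQL creates explicit statistics on the skewed column, refreshes the table's statistics with a full scan, and displays the histogram to verify that the unknown member (-1) dominates the distribution:

-- Create explicit statistics on the skewed column so the optimizer's
-- histogram reflects the dominance of the unknown member (-1).
CREATE STATISTICS Stats_Facts_20100701_Software_ID
ON dbo.Facts_20100701 (Software_ID) WITH FULLSCAN;

-- Refresh all statistics on the table after subsequent data loads.
UPDATE STATISTICS dbo.Facts_20100701 WITH FULLSCAN;

-- Inspect the histogram to confirm the skew that misleads plan choices.
DBCC SHOW_STATISTICS ('dbo.Facts_20100701', Stats_Facts_20100701_Software_ID)
WITH HISTOGRAM;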
Lessons Learned and Best Practices

In reviewing the investigation project, SQLCAT concluded that the efforts to achieve similar or better performance with ROLAP over MOLAP had been successful. The test results exceeded expectations in many cases and gave SQLCAT a good grasp of hardware requirements, implementation details, and design implications. Table 4 summarizes the lessons learned during the ROLAP investigation and the best practices for large-scale ROLAP implementations, which SQLCAT determined in discussions with SQL Server and Analysis Services developers.

Table 4: ROLAP—Lessons Learned and Best Practices

ROLAP is a viable choice for cubes with low complexity.
Lessons learned: For large cubes with a low level of complexity and low attribute cardinality, ROLAP can be a good choice, but ROLAP is less attractive than MOLAP for cubes with many dimensions (wide facts tables), a large number of attributes (wide dimension tables), and complex attribute relationships (snowflake schemas or many-to-many relationships).
Best practices: Avoid complex data models and simplify the relational data warehouse as much as possible, such as by removing unnecessary columns and, if possible, by replacing wide data types with narrower alternatives to increase storage efficiency.

ROLAP must use table binding.
Lessons learned: Analysis Services must reference database tables in the SQL queries to support relational aggregations based on indexed views, because SQL Server does not support building indexed views on top of other views or subqueries, and the query optimizer does not leverage indexed views if the SQL queries do not reference tables directly.
Best practices: Use table binding (not query binding) for ROLAP cube partitions, include only tables (not views or named queries) in the DSVs, and remove any Count of Rows measures from ROLAP measure groups.

ROLAP requires more hardware resources than MOLAP.
Lessons learned: In comparison to MOLAP, ROLAP has substantially higher hardware requirements, such as a higher demand for CPU power and I/O performance.
Best practices: Provide sufficient CPU power to sustain the high number of concurrent SQL queries, and optimize the storage subsystem to deliver enough I/O throughput to reach a very high level of CPU utilization. Correctly implemented, ROLAP should consistently generate sequential scans, so the storage subsystem should be optimized for a high volume of serial I/O requests.

ROLAP impacts the size of relational databases.
Lessons learned: Although ROLAP avoids copying source facts data into multidimensional databases, it does place a high demand on storage capacity in the relational data warehouse for indexed views and summary tables. For example, indexing on every attribute can lead to excessive storage consumption in ROLAP mode, while MOLAP is much more efficient with its automatic two-level bitmap indexes.
Best practices: Use transparent aggregations in combination with cube-based aggregations to avoid inefficient use of storage resources, particularly in environments with multiple cubes sharing the same data warehouse. Avoid overlapping cube-based aggregations and cubes with different levels of granularity, because they require different sets of tables for each level of granularity.

ROLAP increases design complexity and management overhead.
Lessons learned: MOLAP automatically optimizes many design aspects that require manual attention in ROLAP mode. Queries that don't hit aggregations require a very clean, normalized schema, and other data warehouse optimization techniques also come into play, such as restricting the maximum degree of parallelism to 1 and compensating for data skew. Also, because the cube can't rely on partitioned facts tables, the data warehouse requires individual daily tables and separate sets of indexed views, which can substantially increase management overhead.
Best practices: Create scripts or other custom solutions to automate the provisioning of facts tables, indexed views, and corresponding cube partitions (see the sketch after this table). Directly map the facts tables in the relational data warehouse to cube partitions in Analysis Services. Also, use the ASCMD command-line tool and Profiler traces to test applicable queries with original aggregations, new UBO aggregations, handmade aggregations, and no aggregations to better understand the query characteristics of the cube.

ROLAP can leverage cube-based aggregations even if the granularities do not exactly match.
Lessons learned: MOLAP and ROLAP can use aggregations even when the query granularity doesn't exactly match the aggregation granularity. However, ROLAP can only use cube-based aggregations on the leaf level of a dimension when querying higher levels.
Best practices: Create cube-based aggregations at the leaf level to help known cross-join patterns. Note that Analysis Services can only leverage aggregations for different granularities if the aggregations are part of the ROLAP cube design, such as aggregations created by using UBO. Transparent aggregations do not fall into this category.

ROLAP might be affected by data problems and referential integrity issues.
Lessons learned: MOLAP can catch referential integrity errors and other data problems during cube processing, but these issues can affect ROLAP during query execution. ROLAP cubes do not support Snapshot isolation with multiple active result sets on a SQL Server connection, so DW architects must use the ReadCommitted isolation level, which allows nonrepeatable reads (reading the same row twice delivers different results) and phantoms (re-executing the same query delivers a different set of rows).
Best practices: Focus on data cleanup and data quality during the load process. Confine data changes to the most current facts table, such as the most recent daily table, to avoid any impact of referential integrity issues on past data.

ROLAP relies on the query optimizer's ability to make the best decisions for the query execution plan.
Lessons learned: ROLAP performance depends on the capabilities of the query optimizer to formulate an efficient query execution plan. An inefficient query execution plan affects ROLAP performance, while the effect remains virtually nonexistent for MOLAP.
Best practices: Add foreign key and other constraints to the facts tables. Keep the statistics up to date by running regular UPDATE STATISTICS jobs after updates (or turn on automatic statistics updates).
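As a rough illustration of such an automation script, the following T-SQL stamps out one daily facts table from a date-based naming convention. The schema and naming convention are assumptions for illustration only, not the actual lab design; the same pattern can emit the matching indexed views and the XMLA for the corresponding cube partition.

-- Build the CREATE TABLE statement for one day from a yyyymmdd naming
-- convention and execute it via dynamic SQL.
DECLARE @day  date = '2010-07-01';
DECLARE @name sysname = N'Facts_' + CONVERT(nchar(8), @day, 112);
DECLARE @sql  nvarchar(max) = N'
CREATE TABLE dbo.' + QUOTENAME(@name) + N' (
    Date_ID      int   NOT NULL,
    Customer_ID  int   NOT NULL,
    Software_ID  int   NOT NULL,
    Sales_Amount money NOT NULL
);';
EXEC sys.sp_executesql @sql;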
Conclusion

ROLAP can be a viable option for simple cubes with large data volumes but low dimensional complexity, provided that DW architects and database administrators have the knowledge to fine-tune the data warehouse manually. It is important to realize that MOLAP optimizes many design aspects automatically that must be handled by hand in ROLAP mode. However, given sufficient CPU power, memory capacity, a storage subsystem optimized for high I/O throughput, and useful relational aggregations, ROLAP can reach very high performance levels thanks to the performance of the SQL Server 2008 relational engine, exceeding MOLAP performance in some cases.

In comparison to the gold standard of MOLAP, ROLAP clearly places a higher demand on hardware resources, increases the size of relational databases, increases design complexity and management overhead, comes with higher processing overhead, and might be affected by data problems and referential integrity issues. For instance, to benefit from aggregations, a ROLAP cube can't use query binding, which limits cube design flexibility and prevents the data warehouse from using table partitioning. For partitioning purposes, the data warehouse must instead rely on the old-style method of individual facts tables organized by attributes related to cube partitioning, which noticeably increases management overhead. Furthermore, if ROLAP can't determine which partitions contain relevant data, perhaps because the query did not specify any of the slice attributes or because of complex attribute relationships that ROLAP can't analyze, ROLAP includes all available cube partitions in the query execution, potentially flooding the relational engine and causing a substantial backlog of queries as well as excessive thread thrashing and I/O congestion. Limiting parallel query plan generation and implementing a SQL query fan-out design can help mitigate this issue, but the resources required to implement a fan-out are substantial. In comparison, MOLAP can determine the relevant cube partitions even with parent-child and many-to-many relationships in the dimensions and can execute queries much more efficiently and with much less CPU power. MOLAP offers many performance advantages out of the box, such as automatic two-level bitmap indexing on every attribute and lower I/O overhead due to data compression.

On the other hand, ROLAP undeniably has the advantage that it doesn't need to copy the source data into a multidimensional database. With data volumes beginning to exceed the maximum MOLAP capabilities, this advantage can trump all of the ROLAP disadvantages, especially considering the powerful features of the SQL Server relational engine for managing large data volumes. Smart aggregation designs that combine the advantages of transparent and cube-based aggregations can mitigate some of the performance disadvantages of ROLAP. It is also possible to fan out SQL queries across multiple data warehouse servers, and such a scale-out would also benefit cube partition processing. Finally, XMLA scripts and custom solutions can help to lower the administrative burden of maintaining a large number of facts tables and cube partitions.

The keys to a successful ROLAP implementation are a fast underlying storage subsystem, a streamlined relational data warehouse, and thorough testing, tweaking, and usage-based optimization. It is imperative to study Profiler traces and analyze the SQL queries generated by Analysis Services. If these SQL queries reference views or subqueries in the FROM clause, the ROLAP cube will not be able to leverage relational aggregations: the query optimizer will not use transparent aggregations created manually in the data warehouse, and the creation of cube-based aggregations will fail during partition processing if the cube uses query binding for any reason. It is therefore important to review the design of DSVs, cube partitions, and measure groups to ensure that Analysis Services uses table binding.

Although SQLCAT deemed its first lab-based customer ROLAP investigation a great success, a few questions remain to be answered. For example, SQLCAT did not examine ROLAP behavior for multiple concurrent users and did not study how ROLAP would benefit from a warm Analysis Services cache. In theory, the ROLAP storage mode exhibits the same caching behavior as MOLAP and should therefore show similar performance improvements, but this still needs to be verified in a realistic test lab. Furthermore, SQLCAT still needs to clarify the impact of attribute cardinality, distinct count measures, and complex attribute relationships on ROLAP performance.
It also remains to be seen how ROLAP performs over very wide dimensions and attributes with one million or more members. For all of these reasons, customers are encouraged to use ROLAP with caution, apply the optimization techniques discussed in this case study, and perform additional acceptance tests with the full number of concurrent users in their environments.

For more information:
SQLCAT Web site
SQL Server Web site
SQL Server TechCenter
SQL Server DevCenter

Did this paper help you? Please give us your feedback. Tell us on a scale of 1 (poor) to 5 (excellent) how you would rate this paper and why you have given it this rating. For example:
Are you rating it high because it has good examples, excellent screen shots, clear writing, or another reason?
Are you rating it low because the examples are poor, the screen shots fuzzy, or the writing unclear?
This feedback will help us improve the quality of the case studies we release.
Send feedback.