subject: Data Warehouse Performance And Scalability [print this page] Most data warehouses are mission-critical, serving in an increasingly mixed workload capacity, including as a data source for online applications. Deep mining analysts and business analysts are running complex queries and fast-running tactical queries, each with differing service-level expectations driven by business users, customers and executives of their Company. These differing workloads are all competing for CPU, memory and disk access. At the same time, data latency continues to progress from batch to continuous loading demands.
The complex mixed workload consists of:
Continuous data loading, similar to an OLTP workload
Batch data loading
Summary and aggregate management to support dashboards and prebuilt reports
Large numbers of standard reports ranging in the thousands per day requiring SQL tuning, index creation, new types of storage partitioning and other types of optimization structures in the data warehouse
Business analytics in which business process professionals with limited query language experience use prebuilt analytic data objects with aggregated data (pre-joins) and designated dimensional drill downs (summary)
An increasing number of query users with a random, unpredictable use of the data
Automation, Performance and Scalability
The Business Intelligence tool is a customer-facing interface so its the initial focus
Automate Regression testing of the Business Intelligence tool
Automate Performance/Scalability testing of the Business Intelligence tool
Performance may be an issue at the Data Warehouse or any other data source component
Characterize each source and subsystems response under load