Not surprisingly, many e-commerce sites experience periods of high activity around the holidays. For instance, 1-800-Flowers.com Inc. (1-800-flowers.com) sees bursts of ordering activity around Valentine's Day, Mother's Day, Christmas, Easter, Thanksgiving, Secretary's Day and Team Appreciation Week. Most other online retailers experience peak volumes starting on Thanksgiving Day and lasting until December 26.
So what can you learn from companies that always need to maintain highly available, fast-performing databases? I'll discuss some ways you can prepare for peak activity with this overview and checklist.
|TABLE OF CONTENTS
Availability methods
Understand the workflow
Load test
Lock down
Checklist: Prepare for peak workload
Summary
|Availability methods||Return to Table of Contents|
High availability typically involves clustering. When you need high levels of uptime, you can cluster SQL Servers in a single-instance cluster, where several nodes are grouped together and present themselves as a single server to the client. If one node in the cluster goes offline (because of SQL Server faults, hardware failures or maintenance), another node automatically takes over the workload. The client does not have to reconnect to the other node, because clients connect to a virtual server name that floats between the active nodes.
Clustering provides automatic tolerance to hardware and software faults, but it typically does not provide tolerance to site-level faults (i.e., a power failure in the building or room that houses the cluster). Care should be taken to eliminate such single points of failure -- for example, with a redundant power supply or standby generator.
Geospatial clustering and load balancing
Other high-availability methods include geospatial clustering, where the cluster nodes are located in different physical locations, and geospatial load balancing, where client IP traffic can be switched between a main data center and a disaster-recovery site.
EMC Corp., Hitachi Data Systems and the now Symantec-acquired Veritas Software offer hardware data-mirroring tools that provide continuous replication, so a disaster-recovery site retains a real-time copy of your data. Hardware data-mirroring tools can be used in conjunction with geospatial IP solutions to provide automatic failover to a disaster-recovery site.
You could also take advantage of scale-out methods to distribute your data over several worker machines. So instead of having 1,000 clients connect to a single SQL Server, 100 clients could connect to each of 10 SQL Servers. Your data-access patterns have to be such that it doesn't matter which SQL Server the clients connect to, or you must have sticky sessions enabled, so that each client connects to the same SQL Server for the length of its session.
For example, if the 10 SQL Servers in your federation provide catalog information, and the data is the same across all 10 SQL Servers, it will not matter which SQL Server the client connects and reconnects to. A SQL Server could be added to or removed from such a federation, with the remaining SQL Servers redistributing the load. Peer-to-peer replication in SQL Server 2005 is designed for precisely this type of scale-out.
Note that SQL Server does not automatically distribute the load among other SQL Servers. You need network load balancing to distribute the load over a multitude of Web servers and have one or more SQL Servers in the federation serve each Web server.
|Understand the workflow||Return to Table of Contents|
E-commerce companies prepare all year for their peak seasons. Systems architects study workflow to understand which processes must be done during the transaction, which can be batched and which can benefit from parallel processing on other machines.
Consider a typical order-taking operation. A credit card number is entered and validated on the Web page, ensuring the number starts with a certain sequence and meets a certain length. This step is often performed in the browser so as not to chew up Web server processor cycles. The credit card charge is typically not authorized during the transaction, because the Web service call would slow the transaction to the point where overall scalability would degrade. Without a Web service call to authorize each credit card transaction, the e-commerce site can serve several thousand more page requests. Credit cards are processed later as part of a batch process.
As demonstrated in the above example, by carefully looking at workflow, systems architects can identify processes to be done asynchronously so overall scalability increases.
|Load test||Return to Table of Contents|
Extensive representative load tests are done on machines that clone production machines. These load tests are carefully analyzed to identify and eliminate bottlenecks. As the bottlenecks are eliminated the load test is repeated to identify and eliminate new bottlenecks. This iterative process is continued as resources permit.
|Lock down||Return to Table of Contents|
Typically, all development on e-commerce sites ceases weeks in advance of a peak season, and the sites go into lock-down mode, where no changes are made on any of the production machines. Automatic patch management is disabled, and patches are applied only to address real vulnerabilities. The impact of these emergency patches is assessed in the QA environments before they are promoted to production.
|Checklist: Prepare for peak workload||Return to Table of Contents|
What would you as a DBA do to prepare your SQL Server for these peak loads? Here are some steps to follow.
Checklist: How to prepare your SQL Server for peak workload

If you have any maintenance window at all:

- Archive as much unnecessary data as possible.
- Run DBCC DBREINDEX to rebuild your indexes and re-establish fill factors.
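For the reindex step, a pass along these lines could be scripted during the maintenance window. This is only a sketch: the database and table names are placeholders, and the fill factor of 90 is just an example value.

```sql
USE MyDatabase;
-- Rebuild all indexes on the table and re-establish a 90% fill factor.
-- Passing '' as the index name means all indexes on the table.
DBCC DBREINDEX ('dbo.Orders', '', 90);
```

Note that DBCC DBREINDEX is an offline operation, so run it only inside the maintenance window, not during peak activity.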
On VLDBs (very large databases), you will probably not have this luxury. If that's the case, take the following steps.
Turn off auto-update statistics
By default, SQL Server auto-updates statistics for a table when a threshold of roughly 20% of the table has been modified. To turn off auto-update/auto-create statistics, issue the following commands:
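Assuming a database named MyDatabase and a table named dbo.Orders (both placeholder names), the commands look like this:

```sql
-- Disable automatic statistics maintenance database-wide.
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS OFF;
ALTER DATABASE MyDatabase SET AUTO_CREATE_STATISTICS OFF;

-- Or disable auto-update statistics on a single table instead:
EXEC sp_autostats 'dbo.Orders', 'OFF';
```

Remember to re-enable these settings (or update statistics manually) once the peak season is over, or plan quality will slowly degrade.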
Turn off Auto Shrink

Shrinking a transaction log or database file causes performance degradation. Please refer to advice by SQL Server MVP Tibor Karaszi on the consequences of shrinking your database or transaction-log files.
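Auto Shrink is a database-level option; with MyDatabase as a placeholder name, it is turned off like this:

```sql
-- Prevent the database engine from shrinking files in the background.
ALTER DATABASE MyDatabase SET AUTO_SHRINK OFF;
```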
Turn off Auto Grow

Turn off Auto Grow and size your database's data files for peak loads. Forcing the database to grow its data or transaction-log files during a transaction can degrade performance and serialize transactions. Please refer to this Microsoft article for more information on the consequences of Auto Grow.
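File size and growth are controlled per file with ALTER DATABASE ... MODIFY FILE. A sketch follows; the database name, logical file name and target size are all placeholders, and sp_helpfile will list the real logical names:

```sql
USE MyDatabase;
-- Find the logical file names for this database first.
EXEC sp_helpfile;

-- Pre-size the data file for the expected peak, then disable autogrow.
ALTER DATABASE MyDatabase
MODIFY FILE (NAME = MyDatabase_Data, SIZE = 50GB, FILEGROWTH = 0);
```

One caution: with autogrow disabled, a full file will stop transactions outright, so size generously and monitor free space throughout the peak period.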
Turn off index defragging and indexing optimizations

Index defragging, although an online operation (i.e., one that does not take long-term locks on your tables), will still cause considerable locking on your tables and indexes, and it will degrade overall performance on your SQL Server. The impact of fragmented indexes is minimized on VLDBs; fragmentation is consequential only when you do index scans as opposed to index lookups. Please refer to this white paper for more information on index-maintenance operations.
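Before deciding to skip index maintenance for the season, you can gauge fragmentation cheaply. DBCC SHOWCONTIG with the FAST option is one way to get a quick read (the table name is a placeholder):

```sql
-- A scan density near 100% and low logical fragmentation mean the
-- index is in good enough shape to leave alone during the peak.
DBCC SHOWCONTIG ('dbo.Orders') WITH FAST;
```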
Maintain your transaction log

A neglected transaction log will have a large number of VLFs (virtual log files). You will get better performance with a small number of VLFs. You can reduce their number by backing up the transaction log frequently (i.e., every five minutes).
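To see how many VLFs your log has, DBCC LOGINFO returns one row per VLF. The database name and backup path below are placeholders:

```sql
USE MyDatabase;
-- One row per virtual log file; dozens are fine, thousands are not.
DBCC LOGINFO;

-- Back up the log frequently to keep it from growing and adding VLFs.
BACKUP LOG MyDatabase TO DISK = 'E:\Backups\MyDatabase_log.trn';
```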
Use a fast database backup solution

Reduce the performance impact of your backups by using a third-party SQL backup product.
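Even with native backup, striping the backup across several files on separate physical drives can shorten the backup window. A sketch, with placeholder database name and paths:

```sql
-- SQL Server reads the database once and writes to both devices
-- in parallel, which is usually faster than a single-file backup.
BACKUP DATABASE MyDatabase
TO DISK = 'D:\Backups\MyDatabase_1.bak',
   DISK = 'E:\Backups\MyDatabase_2.bak'
WITH INIT;
```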
Recompile stored procedures

Recompile your stored procedures to ensure that the optimal execution plan will be selected.
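sp_recompile marks an object so a fresh plan is compiled on its next execution. The procedure name here is a hypothetical placeholder:

```sql
-- Marks the procedure for recompilation on its next use. Passing a
-- table name instead marks every procedure that references the table.
EXEC sp_recompile 'dbo.usp_TakeOrder';
```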
Run warm-up scripts

Run warm-up scripts against your database to ensure that your queries will get maximum benefit from your cache.
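A warm-up script can be as simple as touching the hot tables and running the most common procedures, so their data pages and plans are cached before the load arrives. All object names below are placeholders:

```sql
-- Pull the catalog tables' pages into the buffer cache.
SELECT COUNT(*) FROM dbo.Products;
SELECT COUNT(*) FROM dbo.Categories;

-- Execute the most frequently called procedures once to cache their plans.
EXEC dbo.usp_GetCatalogPage @CategoryID = 1;
```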
Be vigilant in performance monitoring

You can run SQL Server Profiler using the Standard template with minimal performance impact on your system. Tools from vendors such as Quest Software, Imdeca Software Inc. and Idera also provide a window into your SQL Server for assessing the health and performance of your system, and many of them have a low memory and processor footprint. Performance Monitor likewise has a low footprint and provides an alternate window into your database.
Time batch jobs

Time your batch jobs or DTS packages to occur at times of low load, or defer them until the peak part of the holiday season is over.
|Summary||Return to Table of Contents|
This completes our look at proactive measures you can take to improve the performance of your SQL Server solution for peak loads. There is no substitute for careful planning and testing. We reviewed a series of tips on settings that can degrade performance on production systems under high load. Determine which of these tips will work for you and test them under a representative environment. Here's hoping that your holidays will be happy and profitable ones!
About the author:
Hilary Cotter has been involved in IT for more than 20 years as a Web and database consultant. Microsoft first awarded Cotter the Microsoft SQL Server MVP award in 2001. Cotter received his bachelor of applied science degree in mechanical engineering from the University of Toronto and studied economics at the University of Calgary and computer science at UC Berkeley. He is the author of a book on SQL Server transactional replication and is currently working on books on merge replication and Microsoft search technologies.
More information from SearchDataManagement.com
- Learning Guide: SQL Server 2005
- Tip: Tips for choosing the best DBMS
- Checklist: How to maintain an effective SQL Server DR strategy