Introduction to Planning a Data Center Cold Shutdown

A primary goal of any data center, whether it's a little server room in your basement or a large server farm with thousands of servers, is to provide continuous operations for the network and information technology services it supports. However, this is unachievable without adequate maintenance from time to time, including the occasional cold shutdown.

Although in theory a data center is supposed to keep its networks running without interruption, it must make room for scheduled maintenance windows. Occasionally, a cold shutdown may be needed. So when does a cold shutdown become necessary?

1. Data center move: If your data center is moved from location A to B, then most likely you'll need to schedule a window of time to switch things off at A and bring them up at B. Moving everything at once is rare, since it often results in a service interruption that may not be within the tolerable range. Instead, such moves are often done in small chunks of a few servers at a time.

2. Data center expansion/growth: From time to time, you might need to grow the data center, which may call for additional wiring or construction that requires a cold shutdown for safety reasons.

3. Original faulty construction or wiring: If the original design of the data center was deficient or faulty, you may need to shut it down cold and fix the design before bringing it back up. An example of this is a data center whose UPS battery backup covers only part of the data center area. For instance, in a recent case at one of our clients, VoIP equipment was in a closet that had no backup power. In a power outage, phone lines and voice mail systems would fail. To rectify this, additional electrical work was needed, which required a data center shutdown for the safety of the electricians.

Cold shutdowns are no piece of cake. The shorter the window and the lower the tolerance for downtime, the harder the project becomes. A weekend is usually the least amount of time you'll need to cold shut down a data center with 50 or more servers and bring it back up fully operational.

Here's the reason why. When you shut down a data center with, say, 50 servers, chances are that at least one or two of them are old and possibly on their last legs. When they are brought back up after a shutdown, they may not operate as expected. The server or equipment owners will then have to either activate a backup system or restore or rebuild from a backup. Such work can take a long time, depending on the complexity and sensitivity of the systems.
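One way to get ahead of this is to go through the server inventory before the shutdown and flag the machines most likely to cause trouble. The sketch below is only an illustration: the `Server` record, its fields, the example host names and the five-year age threshold are all assumptions made for the example, not figures from any real inventory or standard.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical inventory record; the field names are assumptions.
    name: str
    age_years: float
    has_verified_backup: bool


def at_risk(servers, max_age_years=5.0):
    """Servers old enough that they may not come back cleanly after power-off.

    The default threshold is an arbitrary placeholder; pick your own.
    """
    return [s for s in servers if s.age_years >= max_age_years]


def needs_attention(servers, max_age_years=5.0):
    """Names of at-risk servers with no verified backup to restore from."""
    return [
        s.name
        for s in at_risk(servers, max_age_years)
        if not s.has_verified_backup
    ]


# Made-up example inventory.
inventory = [
    Server("web-01", 2.0, True),
    Server("db-legacy", 8.5, False),
    Server("fax-01", 6.0, True),
]

print(needs_attention(inventory))  # prints ['db-legacy']
```

Anything that shows up in this list is worth a fresh, tested backup before the power goes off.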

Shutdown usually occurs top-down and restart occurs bottom-up. What this means is that shutdown occurs in the following order:

1. Presentation layer servers and web servers

2. Application servers including virtual server hosts

3. Database servers

4. Storage and backup servers

5. Load balancers and other equipment

6. Communication equipment such as phone servers, fax servers, etc.

7. Network equipment such as routers, switches and access servers

8. Low-level network, cabling and infrastructure equipment

They are brought back up in reverse order.
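The ordering above can be written down as a simple runbook, so nobody has to work out the restart sequence by hand in the middle of the night. This is a minimal sketch: the tier names are shortened from the list above, and `restart_order` and `build_runbook` are hypothetical helper names made up for this example.

```python
# Tiers in shutdown order (top-down), shortened from the list above.
SHUTDOWN_ORDER = [
    "presentation and web servers",
    "application servers and virtual server hosts",
    "database servers",
    "storage and backup servers",
    "load balancers and other equipment",
    "communication equipment",
    "network equipment",
    "low-level network, cabling and infrastructure",
]


def restart_order(shutdown_order):
    """Restart is bottom-up: the exact reverse of the shutdown sequence."""
    return list(reversed(shutdown_order))


def build_runbook(shutdown_order):
    """Return (step, action, tier) tuples covering the whole window."""
    steps = []
    for tier in shutdown_order:
        steps.append((len(steps) + 1, "shut down", tier))
    for tier in restart_order(shutdown_order):
        steps.append((len(steps) + 1, "bring up", tier))
    return steps


if __name__ == "__main__":
    for number, action, tier in build_runbook(SHUTDOWN_ORDER):
        print(f"{number:2d}. {action}: {tier}")
```

Keeping the order in a single list means the restart sequence is always derived from the shutdown sequence, so the two cannot drift apart when a tier is added or renamed.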

When planning for data center maintenance, it is important to note that a cold shutdown is often risky and involves significant stress if any of the levels mentioned above suffers an extended outage. It is often prudent to maintain a list of on-call technicians who can help if any portion of the shutdown does not go as planned. Also note that critical documentation may need to be printed or kept ready locally on a machine, since most regular documentation might live on systems that are affected by the shutdown. If you are new to data center maintenance, you may want to hire or consult professionals to ensure that everything goes off without a hitch.
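The on-call list and the offline documentation can be checked the same way before the maintenance window starts. The following is a rough sketch under stated assumptions: `readiness_gaps`, the layer names, the technician names and the document titles are all invented for illustration.

```python
def readiness_gaps(on_call, offline_docs, required_layers, required_docs):
    """List what is still missing before the shutdown can start.

    on_call maps a layer name to the technicians on call for it;
    offline_docs is the set of documents printed or copied locally.
    """
    gaps = []
    for layer in required_layers:
        if not on_call.get(layer):
            gaps.append(f"no on-call technician for: {layer}")
    for doc in required_docs:
        if doc not in offline_docs:
            gaps.append(f"not available offline: {doc}")
    return gaps


# Made-up example data.
on_call = {"database servers": ["Pat"], "network equipment": []}
offline_docs = {"restart runbook"}

for gap in readiness_gaps(
    on_call,
    offline_docs,
    required_layers=["database servers", "network equipment"],
    required_docs=["restart runbook", "vendor support contacts"],
):
    print(gap)
```

With the example data this reports one uncovered layer and one document that has not been printed, which is exactly the kind of gap you want to find before the shutdown rather than during it.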



