subject: Key Factors to Find Resource BottleNeck in Linux Server Overloading
It's very common, despite affordable hardware, to have load issues on a server. High load can have a number of causes, such as inadequate RAM or CPU, slow hard disk drives, or just unoptimized software. This article will help you identify where the bottleneck is and what you need to invest in. Please do not, however, take it as a replacement for professional advice or service. You should always seek professional service if you can afford the associated costs.
I) First of all, are you really in trouble?
People usually check the load in a control panel, or with the "uptime" or "top" command. You can run "uptime" in your root shell to find out the load, but I would like you to use "top" for the moment (pretty please), because it also helps you identify how many CPUs are being reported*. You should be able to see something like cpu00, cpu01, etc. (in newer versions of top, press "1" to toggle the per-CPU lines, which are labelled Cpu0, Cpu1, etc.).
A load of ~1 for each cpu is reasonable. For example, you're fine if the load's 3.50 and you have 4 CPUs.
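If you'd rather not do the division in your head, here's a minimal sketch in Python of that rule of thumb. The function names are my own, and the cut-off of exactly 1.0 per reported CPU is just the rough guideline from above, not a hard limit.

```python
def load_per_cpu(load_average, reported_cpus):
    """Return the load average divided by the number of reported CPUs."""
    if reported_cpus < 1:
        raise ValueError("reported_cpus must be at least 1")
    return load_average / reported_cpus


def is_reasonable(load_average, reported_cpus):
    """Rule of thumb: up to about 1 unit of load per reported CPU is fine."""
    return load_per_cpu(load_average, reported_cpus) <= 1.0


# The example from the text: a load of 3.50 on 4 reported CPUs.
print(load_per_cpu(3.50, 4))   # 0.875
print(is_reasonable(3.50, 4))  # True
```

On a live server you could feed it os.getloadavg()[0] and os.cpu_count() from Python's standard library instead of typing the numbers in.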
When you look at the load via uptime or top, you also need to understand what it shows. For instance, on a server with 2 HT CPUs (reported as 4):
load average: 3.76, 2.97, 2.62
The first figure (3.76) is the load average over the last 1 minute, while the second (2.97) and third (2.62) are the averages over the last 5 and 15 minutes respectively. It's probably a spike here, which I wouldn't be too worried about (a bit carefree?), but if you are, then just read on!
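To read those three figures programmatically, here's a small Python sketch. Treat it as an illustration only: the sample line is hypothetical apart from the three load values, and the 1.25 "spike" ratio is an arbitrary number I picked, not a standard.

```python
import re


def parse_load_averages(uptime_line):
    """Pull the 1, 5 and 15 minute load averages out of a line of uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    one, five, fifteen = (float(value) for value in match.groups())
    return one, five, fifteen


def looks_like_spike(one, five, fifteen, ratio=1.25):
    """True when the 1 minute figure is well above both longer averages.

    The ratio is an assumption chosen for illustration only.
    """
    return one > five * ratio and one > fifteen * ratio


# Hypothetical uptime line; only the three load values come from the article.
sample = " 14:02:11 up 10 days,  3:04,  2 users,  load average: 3.76, 2.97, 2.62"
averages = parse_load_averages(sample)
print(averages)                     # (3.76, 2.97, 2.62)
print(looks_like_spike(*averages))  # True
```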
Pretty happy about how you were able to identify that your server is really overloaded? Sorry to hear that, but you never know, because servers can sometimes handle much more load than the figures suggest. Load averages aren't that accurate after all (on Linux they also count processes stuck waiting on disk I/O, not only those using the CPU) and cannot always be the ultimate deciding factor. Confused? It's just some technical background you don't need to be too bothered about. Move ahead if your loads are something to worry over.
* Note the use of the term "reported". I use it because a P4 CPU with HT (Hyper-Threading) technology is reported as 2 CPUs even though you know your server has one.
II) Where's the problem?
To identify the problem, you need to run a series of logical tests (OK, it isn't as scary as it may sound). All you need is some free time, probably 30-45 minutes, and root access to your server (expect no magic ;)). Ready to start? Let's go!
Note: Perform the checks multiple times to reach a sound conclusion.
1. Check for RAM (most common bottleneck!).
# free -m
The output should look similar to this:
# free -m
             total       used       free     shared    buffers     cached
Mem:          1963       1912         50          0         28        906
-/+ buffers/cache:        978        985
Swap:         1027        157        869
Any reaction like, "Oh gosh, almost all the RAM is used up"? Don't panic. Have a look at the "-/+ buffers/cache" line, which says that 985 MB of RAM is still available once buffers and cache are discounted. As long as you have enough memory available there, and your server isn't using much swap, you're pretty fine on RAM. When RAM runs short, your server starts to use swap (much like the Windows pagefile), which is part of your disk mapped as memory. It is comparatively very slow and can further slow down your system if you have a busy hard disk (which you probably do if you're using so much RAM). In short, aim for at least 175 MB available on the buffers/cache line and no more than 200 MB of swap in use.
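Here's a rough Python sketch of that RAM check, using the sample output from above. It assumes the older "free -m" layout with a "-/+ buffers/cache" line; newer versions of free drop that line and show an "available" column instead, so the parser would need adjusting there. The function names are mine, and the 175 MB and 200 MB limits are just the rule of thumb from this article.

```python
SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:          1963       1912         50          0         28        906
-/+ buffers/cache:        978        985
Swap:         1027        157        869
"""


def parse_free(text):
    """Pick out the two figures the rule of thumb needs (old-style free -m)."""
    figures = {}
    for line in text.splitlines():
        fields = line.split()
        if line.strip().startswith("-/+ buffers/cache:"):
            # Fields are: '-/+', 'buffers/cache:', used, free
            figures["available_mb"] = int(fields[3])
        elif line.strip().startswith("Swap:"):
            # Fields are: 'Swap:', total, used, free
            figures["swap_used_mb"] = int(fields[2])
    if len(figures) != 2:
        raise ValueError("output is not in the old free -m layout")
    return figures


def ram_looks_fine(figures, min_available_mb=175, max_swap_used_mb=200):
    """Apply the article's thresholds; both limits are rules of thumb."""
    return (figures["available_mb"] >= min_available_mb
            and figures["swap_used_mb"] <= max_swap_used_mb)


figures = parse_free(SAMPLE)
print(figures)                  # {'available_mb': 985, 'swap_used_mb': 157}
print(ram_looks_fine(figures))  # True
```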
If RAM is the issue, you should probably look into optimizing your PHP/Perl scripts, your MySQL queries and server settings, and Apache.
2. Check if I/O (input/output) usage is excessive
If there are too many read/write requests on a single hard disk drive, it will become slow and you'll have to upgrade to a faster drive (with higher RPM and more cache). The alternative to a single faster drive is to split the load by spreading the most requested content across multiple drives, which can easily be accomplished using symlinks (soft links to files/folders). To find out whether I/O is making your server lag:
# top
Read the value under "iowait" (shown as "wa" in newer versions of top) for each CPU. Ideally, it should be near 0%. If you are looking during a load spike, recheck these values several times before reaching a conclusion. Anything above 15% is worrisome. Next, you can check the speed of your hard disk drive to see if it's really lagging:
Execute "df -h" if you need to check which drive your data resides on. If you know your hard disk is at /dev/sda or /dev/hda, run a timing test against it, for example "hdparm -tT /dev/sda". The buffered-read line of the output looks like this:
Timing buffered disk reads: 62 MB in 3.00 seconds = 20.66 MB/sec
The buffer-cache reads were awesome, but that figure mostly reflects memory and CPU throughput rather than the disk itself. The buffered disk reads, however, are at just 20.66 MB/sec. Anything below 25 MB/sec is something you should worry about.
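Both disk checks in this step can be sketched in Python. This is only an illustration under a few assumptions: the iowait figure is computed from two readings of the first "cpu" line of /proc/stat (kernel 2.6 or later, where the fifth counter is iowait), the two sample lines are made-up numbers, and the function names are my own.

```python
import re


def iowait_percent(before_line, after_line):
    """Percentage of CPU time spent in iowait between two /proc/stat 'cpu' lines."""
    # Use only the first 8 counters (user, nice, system, idle, iowait, irq,
    # softirq, steal); guest time is already included in user time.
    before = [int(value) for value in before_line.split()[1:9]]
    after = [int(value) for value in after_line.split()[1:9]]
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("the two samples must be taken at different times")
    return 100.0 * deltas[4] / total


def buffered_read_rate(hdparm_line):
    """Extract the MB/sec figure from a 'Timing buffered disk reads' line."""
    match = re.search(r"=\s*([\d.]+)\s*MB/sec", hdparm_line)
    if match is None:
        raise ValueError("no MB/sec figure found")
    return float(match.group(1))


# Made-up /proc/stat samples, taken a few seconds apart.
before = "cpu  1000 0 500 8000 400 0 100 0 0 0"
after = "cpu  1100 0 550 8500 750 0 100 0 0 0"
iowait = iowait_percent(before, after)
print(iowait, iowait > 15)  # 35.0 True

# The line quoted in the article.
rate = buffered_read_rate(
    "Timing buffered disk reads: 62 MB in 3.00 seconds = 20.66 MB/sec"
)
print(rate, rate < 25)      # 20.66 True
```

On a real server you would read the first line of /proc/stat twice, a few seconds apart, and pass both readings in.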
3. CPU power is all consumed?
# top
Check the top output to find out if you're using too much CPU power. You should be looking at the "idle" value beside each CPU entry. Anything below 45% is something you should really worry about.
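If you want the idle figure per CPU without watching top, here's a minimal Python sketch that compares two snapshots of /proc/stat. The snapshot text below is made up for illustration, the function names are my own, and the 45% limit is the rule of thumb from this article.

```python
def cpu_counters(stat_text):
    """Map each per-CPU label (cpu0, cpu1, ...) to its first 8 time counters."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep cpu0, cpu1, ... and skip the aggregate 'cpu' line and other rows.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            counters[fields[0]] = [int(value) for value in fields[1:9]]
    return counters


def idle_percent_per_cpu(before_text, after_text):
    """Idle percentage of each CPU between two /proc/stat snapshots."""
    before = cpu_counters(before_text)
    after = cpu_counters(after_text)
    result = {}
    for name, start in before.items():
        deltas = [b - a for a, b in zip(start, after[name])]
        total = sum(deltas)
        if total > 0:
            # The 4th counter (index 3) is idle time.
            result[name] = 100.0 * deltas[3] / total
    return result


# Made-up snapshots, taken a few seconds apart.
BEFORE = """cpu  3000 0 1000 16000 0 0 0 0
cpu0 1000 0 500 8500 0 0 0 0
cpu1 2000 0 500 7500 0 0 0 0
intr 12345"""
AFTER = """cpu  3700 0 1150 17150 0 0 0 0
cpu0 1600 0 600 8800 0 0 0 0
cpu1 2100 0 550 8350 0 0 0 0
intr 12999"""

idle = idle_percent_per_cpu(BEFORE, AFTER)
print(idle)                                         # {'cpu0': 30.0, 'cpu1': 85.0}
print([name for name in idle if idle[name] < 45])   # ['cpu0']
```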
III) Problem identified, What's the solution?
To wrap it up, let me offer a few solutions for each problem:
A global solution to all problems is to optimize MySQL and the web server, including your PHP/Perl scripts and queries. At the very least, you can tune the Apache and MySQL server parameters to perform better.
1. Too much CPU usage
In "ps auxf" or "top", look for processes that use too much CPU. If it's HTTP or MySQL, you had better optimize your scripts and queries, if possible. In most cases it's extremely difficult to optimize all the scripts and queries, and a better option is to just go for a CPU change/upgrade. A dual CPU should perform better, but what kind of upgrade to look for depends on your current CPU.
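To pick the CPU hogs out of a long process list, something like the following Python sketch could help. It assumes the usual "ps aux" column order (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND), and the process list below is entirely hypothetical.

```python
def top_cpu_processes(ps_output, limit=3):
    """Return (cpu_percent, pid, command) tuples for the busiest processes."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        # Split into 11 columns so that the command keeps its own spaces.
        fields = line.strip().split(None, 10)
        if len(fields) < 11:
            continue
        rows.append((float(fields[2]), int(fields[1]), fields[10]))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


# Hypothetical process list in "ps aux" layout.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1   2064   624 ?        Ss   Jan01   0:01 init [3]
mysql     2345 61.5 12.3 250000 48000 ?        Sl   Jan01 120:10 /usr/sbin/mysqld --user=mysql
apache    3456 22.0  3.1  60000 12000 ?        S    10:02   0:45 /usr/sbin/httpd
apache    3457  1.5  3.0  60000 11800 ?        S    10:02   0:03 /usr/sbin/httpd
"""

for cpu, pid, command in top_cpu_processes(SAMPLE, limit=2):
    print(cpu, pid, command)
# 61.5 2345 /usr/sbin/mysqld --user=mysql
# 22.0 3456 /usr/sbin/httpd
```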
2. RAM's all exhausted
You're in much the same situation as with the CPU. Optimize HTTP, MySQL, scripts, etc., or go for a RAM upgrade. You may install opcode cache software like APC (from PECL) for PHP to make it perform better while decreasing the load.
3. Disk's all used (eh, I don't mean space)
Here you have two options. The first is to go for a faster disk, like SATA over normal IDE, or SCSI over SATA. Well, I was just speaking generally. You have to consider factors like RPM and cache to end up with an upgrade that's worth it. The second option is to get multiple drives of the same class and spread the load across them. One common approach is to serve MySQL from a second drive.
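Spreading content across drives with symlinks, as mentioned in the I/O check, can be sketched in Python like this. It's a minimal illustration, not a migration tool: the function name and the "disk1"/"disk2" paths are hypothetical, the demo runs inside a temporary directory, and on a real server you would stop the service that owns the data (MySQL, for instance) before moving anything.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(source_dir, target_parent):
    """Move source_dir under target_parent and leave a symlink at the old path."""
    source_dir = os.path.abspath(source_dir)
    new_location = os.path.join(
        os.path.abspath(target_parent), os.path.basename(source_dir)
    )
    if os.path.islink(source_dir):
        raise ValueError("%s is already a symlink" % source_dir)
    if os.path.exists(new_location):
        raise FileExistsError(new_location)
    shutil.move(source_dir, new_location)  # move the data to the other drive
    os.symlink(new_location, source_dir)   # keep the old path working
    return new_location


# Demo in a throwaway directory; "disk1" and "disk2" stand in for two drives.
with tempfile.TemporaryDirectory() as root:
    old_path = os.path.join(root, "disk1", "data")
    os.makedirs(old_path)
    os.makedirs(os.path.join(root, "disk2"))
    with open(os.path.join(old_path, "table.db"), "w") as handle:
        handle.write("rows")

    moved_to = relocate_with_symlink(old_path, os.path.join(root, "disk2"))
    print(os.path.islink(old_path))                            # True
    print(os.path.exists(os.path.join(old_path, "table.db")))  # True
```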
IV) Conclusion
That wasn't of much help? My article might be flawed, ahh, excuse me. It's my first article and this thing really consumed quite a few of my brain cells. That's a bit personal, isn't it? Let's get back to business.
FYI, in the example, the problem was I/O usage and a slow hard disk.
A guide can never be complete in itself or offer you everything you need to reach expert level (you need to keep learning to get there). Whenever in doubt, please DO hire experts to look over your server. And if you don't have the money to spend, you're still safe! You can head to our Server optimization help section to get help with your server optimization.
Asad N