One of the more ambiguous, but frequently seen, errors that results in support tickets relates to high server load. Because the error itself does not identify the cause, some investigation is required.
Generally speaking, this investigation will need to be done by the server owner, their system administrator, or the server provider, because of the level of access it requires.
What Causes High Server Loads?
The vast majority of high server load errors are caused by an excessive or persistent utilization of one, or a combination, of the following system resources:
- CPU
- RAM / Memory (including swap space)
- Disk I/O
All of these resources work together to ensure the proper functioning of any computer; a web server is no exception. If any process running on the machine pushes one of these resources beyond its normal parameters, or prevents it from returning to normal limits within an acceptable amount of time, the result is often a high server load error.
Imagine keeping the pedal to the metal in a car for too long, so that one of the systems within the engine exceeds its operational limits. Fortunately, in this case we get an error instead of a fire, but the principle is the same: a machine has simply been pushed past its limit.
How To Investigate High Server Load
Before proceeding, decide whether you need to examine the current resource usage or review the historical usage from a specific date or time. The former is necessary to resolve an issue occurring in real time, while the latter is a forensic investigation into what caused a prior issue. For the sake of checking every box, we will cover both scenarios below.
Historical resource usage can be viewed using the “sar” utility, which should exist by default on all cPanel servers from the sysstat package. Statistics are collected when sysstat runs via cron (/etc/cron.d/sysstat). If crond is not running, sysstat will not be able to collect this historical data.
To view resource usage with sar, you must provide the path to the file that corresponds to the date in question. For example, if you wanted to view the load averages for your server from the 23rd of the month, you would run this command:
$ sar -q -f /var/log/sa/sa23
The above command uses ‘-q’ to obtain the load average information, and ‘-f’ to specify the sar file from which to read it. Keep in mind that sar may not have historical data going back more than a week or so.
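Because the daily files are named after the day of the month, the path can be built rather than typed by hand. The helper below is a minimal sketch: the name sa_file_for_day is hypothetical, and the /var/log/sa directory is assumed from the example above (other distributions may keep these files elsewhere).

```shell
# Hypothetical helper: print the sar data file for a given day of the month.
# Assumes the /var/log/sa/saDD layout used in the example above.
sa_file_for_day() {
  day=${1#0}                      # drop a leading zero so "08" is not read as octal
  printf '/var/log/sa/sa%02d\n' "$day"
}

sa_file_for_day 23                # prints /var/log/sa/sa23
```

The result could then be handed to sar, for example: sar -q -f "$(sa_file_for_day 23)"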
You do not need to specify the date when viewing the statistics for the current day. As such, this command would show the load average for today:
$ sar -q
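A load average is easier to judge relative to the number of CPU cores in the machine. As a rough rule of thumb (an assumption added here, not something sar reports), a sustained value above 1.00 per core suggests that work is queuing. The sketch below only does the division; the name load_per_core is hypothetical.

```shell
# Hypothetical helper: divide a load average by the number of CPU cores.
# $1 = load average (for example the ldavg-1 column from sar -q)
# $2 = number of CPU cores
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

load_per_core 6.00 4              # prints 1.50
```

On a Linux system, the live inputs could come from /proc/loadavg and the nproc command, assuming both are available.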
As with any command with which you are unfamiliar, it is always advisable to read the documentation:
$ man sar
Current CPU Usage
The real-time CPU usage can be viewed by running the “top” command. On the line that says “Cpu(s),” check the “%id” section to see the percentage at which your CPUs are idle; the higher the number, the better. A 99% idle CPU is doing almost nothing, whereas a 1% idle CPU is heavily tasked at that moment.
$ top c
Tip: hit “P” to sort by processes that are currently consuming the most CPU.
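For scripts or quick checks, the idle figure can be pulled out of the “Cpu(s)” line instead of being read by eye. The following is a sketch under assumptions: the function name cpu_idle_from_line is hypothetical, and the pattern only covers the two line layouts shown in its comments, which may not match every version of top.

```shell
# Hypothetical helper: extract the idle percentage from top's "Cpu(s)" line.
# Handles both the "98.3%id" layout and the "98.3 id" layout.
cpu_idle_from_line() {
  printf '%s\n' "$1" | sed -n 's/.*[, ]\([0-9][0-9.]*\)[ %]*id.*/\1/p'
}

cpu_idle_from_line 'Cpu(s):  1.0%us,  0.5%sy,  0.0%ni, 98.3%id,  0.2%wa'   # prints 98.3
```

A live line could be captured by running top in batch mode for a single iteration (top -b -n 1) and selecting the line that contains “Cpu(s)”.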
Historical CPU Usage
As noted above, we will use the “sar” command to view the historical statistics. Use the ‘-u’ option to report CPU utilization, and check the “%idle” column:
$ sar -u
Current RAM Usage
To view how much memory the server currently has unutilized, or free, use the “free” command:
$ free -m
Tip: run “top c” and hit “M” to see which processes are consuming the most memory.
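On Linux, the “free” column can look alarmingly low simply because the kernel uses spare memory for cache. Kernels that expose a MemAvailable field in /proc/meminfo (assumed here to be 3.14 or later) provide an estimate of how much memory new programs could actually claim. The sketch below reads that field; the name mem_available_mib is hypothetical.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# the MemAvailable value converted from kB to whole MiB.
mem_available_mib() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }'
}

# Sample input in the /proc/meminfo format:
printf 'MemTotal: 2097152 kB\nMemAvailable: 1048576 kB\n' | mem_available_mib   # prints 1024
```

On a live server, the function would be fed from the real file: mem_available_mib < /proc/meminfo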
Historical RAM Usage
One thing to note when checking the historical memory usage is that the version of sar being used will determine the specific command. Older versions of sar used ‘-r’ to show both %memused and %swpused (swap memory used), but more current versions of sar require the additional use of ‘-S’ to show %swpused.
Older versions of sar:
$ sar -r
Newer versions of sar:
$ sar -r
$ sar -S
Current Disk I/O Usage
The final resource to check when investigating high server load errors is the overall read and write activity of the hard drive itself. The following command will display extended disk usage statistics once per second, ten times in total. Note that this command will not work on OpenVZ/Virtuozzo containers:
$ iostat -x 1 10
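The extended report is wide, so it can help to filter it down to the devices that are close to saturation. The sketch below locates the %util column by its header name rather than by position, since the column layout differs between sysstat versions. The name busy_devices and the threshold of 90 are illustrative assumptions.

```shell
# Hypothetical helper: read iostat -x output on stdin and print each device
# whose %util value is at or above the threshold given as $1.
busy_devices() {
  awk -v limit="$1" '
    /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
    col && NF >= col && $col + 0 >= limit { print $1, $col }
  '
}

# Abbreviated sample input (real reports have many more columns):
printf 'Device r/s w/s %%util\nsda 1.00 2.00 95.50\nsdb 0.10 0.20 3.00\n' | busy_devices 90   # prints sda 95.50
```

With live data, the same filter would be applied as: iostat -x 1 10 | busy_devices 90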
Historical Disk I/O Usage
As with the above examples, we use the sar command:
$ sar -d
An Ounce of Prevention
At the end of the day, your best defense will always be a good offense. With that in mind, remember that good system administration includes keeping an eye on your server resources and being aware of any unusual increase in utilization before it actually becomes a problem.