Troubleshooting memory & swap usage

headout

Well-Known Member
Aug 20, 2003
78
0
156
Hi all,

We are having issues with high load and high memory and swap usage.
Code:
Apr 23 14:59:22 s01 /kernel: pid 87516 (httpd), uid 65534, was killed: out of swap space
Apr 23 15:01:46 s01 /kernel: swap_pager_getswapspace: failed
Apr 23 15:02:00 s01 last message repeated 287 times
Apr 23 15:02:00 s01 /kernel: pid 86953 (httpd), uid 65534, was killed: out of swap space
Apr 23 15:06:58 s01 /kernel: swap_pager_getswapspace: failed
Apr 23 15:07:04 s01 last message repeated 465 times
Apr 23 15:07:05 s01 /kernel: pid 53556 (httpd), uid 65534, was killed: out of swap space
This has been happening for a couple of days now. Restarting httpd solves it for just a few hours.
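For anyone who wants to tally these kill messages across a longer log, here is a minimal Python sketch. It only assumes the message format shown in the excerpt above; the function name `tally_swap_kills` and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Matches FreeBSD kernel lines such as:
#   "... /kernel: pid 87516 (httpd), uid 65534, was killed: out of swap space"
KILL_RE = re.compile(
    r"pid (\d+) \(([^)]+)\), uid (\d+), was killed: out of swap space"
)

def tally_swap_kills(lines):
    """Count out-of-swap kills per (command, uid) pair."""
    counts = Counter()
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            counts[(match.group(2), int(match.group(3)))] += 1
    return counts

# Sample lines copied from the log excerpt above.
sample = [
    "Apr 23 14:59:22 s01 /kernel: pid 87516 (httpd), uid 65534, was killed: out of swap space",
    "Apr 23 15:01:46 s01 /kernel: swap_pager_getswapspace: failed",
    "Apr 23 15:02:00 s01 /kernel: pid 86953 (httpd), uid 65534, was killed: out of swap space",
]
print(tally_swap_kills(sample))
```

Feeding it the full system log instead of the sample would show whether anything other than httpd is being killed.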
Top:
Code:
last pid:  2588;  load averages:  2.16,  2.16,  2.35                                                                                up 34+02:06:25  15:31:19
94 processes:  3 running, 86 sleeping, 4 stopped, 1 zombie
CPU states: 59.6% user,  0.0% nice, 34.9% system,  0.0% interrupt,  5.5% idle
Mem: 1171M Active, 172M Inact, 576M Wired, 88M Cache, 199M Buf, 3004K Free
Swap: 2048M Total, 934M Used, 1114M Free, 45% Inuse, 340K In

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
49195 mysql     43   0   386M 74244K RUN     69:33 19.87% 19.87% mysqld
89881 nobody     2   0   415M   284M sbwait   1:37 11.04% 11.04% httpd
89877 nobody     2   0   206M   176M sbwait   0:47 10.89% 10.89% httpd
 2285 nobody     2   0   220M   204M sbwait   0:43 10.40% 10.40% httpd
 2511 nobody    36   0 51512K 35748K RUN      0:07 10.12%  9.91% httpd
 2463 root      33   0  2064K  1084K RUN      0:05  3.24%  3.22% top
89889 nobody     2   0 32200K 14172K sbwait   0:06  2.14%  1.66% httpd
90249 nobody     2   0 30860K 10256K sbwait   0:06  1.44%  1.12% httpd
89909 nobody     2   0   296M 24376K sbwait   1:00  1.15%  0.88% httpd
 1951 nobody     2   0 32564K 15964K sbwait   0:04  0.88%  0.88% httpd
 2380 nobody     2   0 25476K  9748K sbwait   0:01  0.73%  0.73% httpd
 2314 root       2   0 65612K 41896K poll     0:02  0.81%  0.68% perl
89910 nobody     2   0 31116K  9304K sbwait   0:07  0.59%  0.49% httpd
89884 nobody     2   0 29228K 13428K sbwait   0:05  0.44%  0.44% httpd
89874 nobody     2   0   377M 22932K sbwait   1:19  0.34%  0.34% httpd
 2482 nobody     2   0 26528K 10848K sbwait   0:01  0.30%  0.29% httpd
89875 nobody     2   0 30808K 10332K sbwait   0:05  0.24%  0.24% httpd
 2584 nobody     2   0 25668K  9824K sbwait   0:00  2.56%  0.24% httpd
89886 nobody     2   0 32060K  8712K sbwait   0:07  0.15%  0.15% httpd
 2457 nobody     2   0 22664K  6852K sbwait   0:00  0.10%  0.10% httpd
 1953 nobody     2   0   338M   323M sbwait   1:05  0.05%  0.05% httpd
  484 root       2   0  3864K   944K select  84:17  0.00%  0.00% <snmpd>
93868 bind       2   0 10448K  3240K select   5:52  0.00%  0.00% named
  124 root       2   0   984K   284K select   4:52  0.00%  0.00% syslogd
  140 root       2   0  3008K     0K select   4:31  0.00%  0.00% <sshd>
  549 root       2   0  2392K     0K accept   0:47  0.00%  0.00% <pure-authd>
 4240 mailman    2   0  8756K  1084K poll     0:28  0.00%  0.00% python2.4
 4244 mailman    2   0 10120K  1060K poll     0:26  0.00%  0.00% python2.4
  138 root      10   0  1024K   252K nanslp   0:26  0.00%  0.00% cron
 4245 mailman    2   0 10016K  1020K poll     0:24  0.00%  0.00% python2.4
 4242 mailman    2   0 10136K  1032K poll     0:24  0.00%  0.00% python2.4
 4241 mailman    2   0  8720K  1000K poll     0:23  0.00%  0.00% python2.4
 4239 mailman    2   0  8820K   976K poll     0:23  0.00%  0.00% python2.4
 4243 mailman    2   0  8112K   948K poll     0:23  0.00%  0.00% python2.4
24730 root       2   0 63272K  7188K poll     0:14  0.00%  0.00% perl
 3357 mailnull   2   0  5712K   756K poll     0:13  0.00%  0.00% exim-4.64-0
  547 root       2   0  2632K     0K select   0:12  0.00%  0.00% <pure-ftpd>
89883 nobody     2   0 29264K  9040K sbwait   0:07  0.00%  0.00% httpd
47997 headout    2   0  5708K     0K select   0:07  0.00%  0.00% <sshd>
89879 nobody     2   0 31348K  8676K sbwait   0:07  0.00%  0.00% httpd
89888 nobody     2   0 32056K  9028K sbwait   0:07  0.00%  0.00% httpd
89887 nobody     2   0 30544K  3344K select   0:05  0.00%  0.00% httpd
89878 nobody     2   0 27684K  8632K sbwait   0:05  0.00%  0.00% httpd
89907 nobody     2   0 30640K  3128K sbwait   0:05  0.00%  0.00% httpd
  136 root       2   0  1096K     0K select   0:04  0.00%  0.00% <inetd>
64681 headout    2   0  5712K   404K select   0:04  0.00%  0.00% sshd
25032 root       2   0 15276K     0K select   0:02  0.00%  0.00% <cpsrvd-ssl>
26980 root       2  19 10856K     0K poll     0:01  0.00%  0.00% <perl5.8.7>
90106 root       2   0  9812K  3016K select   0:01  0.00%  0.00% cppop
89828 root       2   0 22540K  6112K select   0:01  0.00%  0.00% httpd
90424 root       2   0 65028K     0K poll     0:01  0.00%  0.00% <perl>
 4246 mailman    2   0  8096K     0K poll     0:01  0.00%  0.00% <python2.4>
We recently restarted httpd, so swap usage is only 45% for now.

Info about our system:
Code:
Server Version: Apache/1.3.37 (Unix) mod_gzip/1.3.26.1a mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.4.4 FrontPage/5.0.2.2635.SR1.2 mod_ssl/2.8.28 OpenSSL/0.9.7d
PHP 4.4.4 (cgi) (built: Oct 11 2006 00:09:37)
Copyright (c) 1997-2006 The PHP Group
Zend Engine v1.3.0, Copyright (c) 1998-2004 Zend Technologies
    with Zend Extension Manager v1.0.8, Copyright (c) 2003-2005, by Zend Technologies
    with Zend Optimizer v2.5.10, Copyright (c) 1998-2005, by Zend Technologies
Mysql 4.1.22
FreeBSD 4.9-RELEASE
Processor #1 Name: Intel(R) Pentium(R) 4 CPU 2.80GHz
real memory  = 2147418112 (2097088K bytes)
ad0: 78167MB  [158816/16/63] at ata0-master UDMA100
ad1: 78167MB  [158816/16/63] at ata0-slave UDMA100
Can anyone give us advice on where to look?
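As a rough aid for reading top output like the one above, the following Python sketch sums the RES column per command. It assumes the 11-column row layout shown in this post; the helper names `to_kib` and `resident_by_command` are hypothetical. Note that summing RES overstates real usage, because pages shared between httpd children are counted once per process.

```python
from collections import defaultdict

def to_kib(field):
    """Convert a top(1) size field such as '386M' or '74244K' to KiB."""
    units = {"K": 1, "M": 1024, "G": 1024 * 1024}
    return int(field[:-1]) * units[field[-1]]

def resident_by_command(top_lines):
    """Sum the RES column per command name from top(1) process rows."""
    totals = defaultdict(int)
    for line in top_lines:
        cols = line.split()
        # Process rows have 11 columns and start with a numeric PID;
        # header and summary lines are skipped.
        if len(cols) != 11 or not cols[0].isdigit():
            continue
        # Commands shown in angle brackets (swapped-out processes, as far
        # as I know) are folded into the plain command name.
        command = cols[10].strip("<>")
        totals[command] += to_kib(cols[5])
    return dict(totals)

# A few rows copied from the top output above.
rows = [
    "  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND",
    "49195 mysql     43   0   386M 74244K RUN     69:33 19.87% 19.87% mysqld",
    "89881 nobody     2   0   415M   284M sbwait   1:37 11.04% 11.04% httpd",
    "89877 nobody     2   0   206M   176M sbwait   0:47 10.89% 10.89% httpd",
    "  140 root       2   0  3008K     0K select   4:31  0.00%  0.00% <sshd>",
]
print(resident_by_command(rows))
```

Swapping `cols[5]` for `cols[4]` sums the SIZE column instead, which is where the very large httpd children stand out.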
 

nxds

Well-Known Member
Jan 6, 2006
53
0
156
Limit Apache memory usage with "Modify Apache Memory Usage" from the WHM Security menu.
 

headout

nxds said: Limit Apache memory usage with "Modify Apache Memory Usage" from the WHM Security menu.
We did that, and it didn't actually make any difference. I removed the lines from httpd.conf, but I will try again.
 

headout

No difference at all:
Code:
last pid: 93670; load averages: 0.98, 1.46, 1.55 up 34+22:21:24 11:46:23
78 processes: 5 running, 72 sleeping, 1 stopped
CPU states: 67.5% user, 0.0% nice, 32.0% system, 0.5% interrupt, 0.0% idle
Mem: 978M Active, 173M Inact, 549M Wired, 80M Cache, 199M Buf, 229M Free
Swap: 2048M Total, 2015M Used, 33M Free, 98% Inuse

PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
69329 nobody 18 0 467M 51248K lockf 5:13 0.00% 0.00% httpd
69327 nobody 2 0 419M 28552K sbwait 2:26 0.05% 0.05% httpd
69339 nobody 2 0 410M 14884K sbwait 4:39 0.00% 0.00% httpd
49195 mysql 37 0 396M 84656K RUN 295:31 10.50% 10.50% mysqld
69336 nobody 2 0 322M 16288K sbwait 4:26 0.05% 0.05% httpd
69340 nobody 18 0 295M 32048K lockf 3:10 0.05% 0.05% httpd
69331 nobody 18 0 267M 20176K lockf 2:21 0.29% 0.29% httpd
86677 nobody 2 0 261M 16428K sbwait 1:34 0.00% 0.00% httpd
87613 nobody 18 0 239M 51596K lockf 1:10 0.00% 0.00% httpd
93224 root 2 0 64992K 43080K poll 0:02 0.00% 0.00% perl
71424 nobody 36 0 63888K 44544K RUN 1:02 9.57% 9.57% httpd
93452 root 2 0 63288K 9140K poll 0:00 0.00% 0.00% perl
51703 root 2 0 63284K 7668K poll 0:11 0.00% 0.00% perl
69332 nobody 2 0 41780K 15292K sbwait 1:29 0.44% 0.44% httpd
69330 nobody 2 0 38468K 15164K sbwait 1:24 0.49% 0.49% httpd
70723 nobody 2 0 34060K 13216K sbwait 0:45 0.20% 0.20% httpd
85719 nobody 2 0 33488K 14880K sbwait 0:57 0.00% 0.00% httpd
84469 nobody 2 0 33480K 12864K sbwait 0:33 0.05% 0.05% httpd
88448 nobody 2 0 33360K 10872K sbwait 0:51 0.05% 0.05% httpd
88691 nobody 18 0 33124K 12032K lockf 0:24 0.54% 0.54% httpd
93356 nobody 2 0 32096K 15376K sbwait 0:01 0.00% 0.00% httpd
91069 nobody 18 0 32040K 10244K lockf 0:09 0.10% 0.10% httpd
92030 nobody 2 0 31184K 10940K sbwait 0:04 0.00% 0.00% httpd
92116 nobody 2 0 30488K 10100K sbwait 0:10 0.00% 0.00% httpd
93166 nobody 2 0 30396K 13488K sbwait 0:01 0.00% 0.00% httpd
93168 nobody 2 0 26484K 9468K select 0:01 0.00% 0.00% httpd
93355 nobody 18 0 26036K 9560K lockf 0:02 0.15% 0.15% httpd
93357 nobody 2 0 25916K 8888K sbwait 0:01 0.20% 0.20% httpd
93173 nobody 2 0 25836K 8456K sbwait 0:01 0.20% 0.20% httpd
93263 nobody 18 0 25624K 8380K lockf 0:01 0.00% 0.00% httpd
93172 nobody 18 0 25612K 8296K lockf 0:01 0.34% 0.34% httpd
93266 nobody 2 0 25604K 8820K sbwait 0:01 0.00% 0.00% httpd
69318 root 2 0 22564K 4776K select 0:04 0.00% 0.00% httpd
68060 root 2 0 15280K 2476K select 0:02 0.00% 0.00% cpsrvd-ssl
53928 root 2 19 10856K 0K poll 0:01 0.00% 0.00% <perl5.8.7>
93868 bind 2 0 10448K 4112K select 6:13 0.00% 0.00% named
4242 mailman 2 0 10136K 3628K poll 0:36 0.00% 0.00% python2.4
4244 mailman 2 0 10120K 3732K poll 0:38 0.00% 0.00% python2.4
4245 mailman 2 0 10080K 3388K poll 0:36 0.00% 0.00% python2.4
90106 root 2 0 9812K 2868K select 0:41 0.00% 0.00% cppop
4239 mailman 2 0 9056K 1012K poll 0:34 0.00% 0.00% python2.4
4240 mailman 2 0 8752K 3452K poll 0:41 0.00% 0.00% python2.4
4241 mailman 2 0 8720K 3428K poll 0:34 0.00% 0.00% python2.4
4238 mailman 2 0 8144K 0K poll 0:00 0.00% 0.00% <python2.4>
4243 mailman 2 0 8112K 984K poll 0:33 0.00% 0.00% python2.4
4246 mailman 2 0 8096K 1032K poll 0:01 0.00% 0.00% python2.4
91984 root 2 20 7472K 6688K STOP 0:01 0.00% 0.00% perl5.8.7
93547 mailnull 2 0 5712K 1420K poll 0:00 0.00% 0.00% exim-4.64-0
3357 mailnull 2 0 5712K 748K poll 0:20 0.00% 0.00% exim-4.64-0
88210 headout 28 0 5708K 412K RUN 0:00 0.00% 0.00% sshd
88157 root 2 0 5708K 0K sbwait 0:00 0.00% 0.00% <sshd>
3360 mailnull 2 0 5676K 0K poll 0:00 0.00% 0.00% <exim-4.64-0>
When we run "lsof -p PID" on the top memory-consuming processes, we only see the domlogs. Why do they keep rotating all day? None of the domlogs is above the 2 GB size limit.
 

jayh38

Well-Known Member
Mar 3, 2006
1,212
0
166
You seem to have a fair amount of activity with mysql being a significant usage by clients. Are you hosting forums? Also what are the box specs, single core?

Anyway, for your current process usage, 1 gig ram will not cut it. Especially since your swap is building to 2 gig over the available ram. I would suggest upgrading to 2 gig ram, 3 or 4 better yet.

Also, be rid of mailman service. If your box is small, one user will really suck you dry of resources in a hurry.

As far as your domlogs, yes, they may be delaying on your system due to loading and much of the loading of course is coming from your swap file which of course is due to unavailable ram.

So overall, consider more ram. Perhaps you will not need as much by disabling services you do not really need. Also, offer only one stats suite, such as awstats and disable the others if you have no need.
 

headout

jayh38 said: You seem to have a fair amount of activity with mysql being a significant usage by clients. Are you hosting forums? Also what are the box specs, single core?
We have hosted a reasonable number of forums on this box for three years now. The issue only came up a few days ago.
Specs: P4 2.8 GHz and 2 GB RAM.
jayh38 said: Anyway, for your current process usage, 1 gig ram will not cut it. Especially since your swap is building to 2 gig over the available ram. I would suggest upgrading to 2 gig ram, 3 or 4 better yet.
The server already has 2 GB of RAM, so maybe we will upgrade it to 4.
jayh38 said: Also, be rid of mailman service. If your box is small, one user will really suck you dry of resources in a hurry.
One customer has been using it for 2 years (maybe longer), so it seems unlikely to be causing a problem now. But I will keep your advice in mind.
jayh38 said: Also, offer only one stats suite, such as awstats and disable the others if you have no need.
OK, thanks.

About MySQL and the usage by clients: how can we track down the customer who is using an unreasonable amount of resources? The processlist shows nothing in particular.

MyTop:
Code:
MySQL on localhost (4.1.22)                                                                                                          up 1+04:52:12 [13:32:56]
 Queries: 278.9M  qps: 2814 Slow:   872.0         Se/In/Up/De(%):    100/00/00/00
             qps now:   30 Slow qps: 0.0  Threads:    2 (   1/  91) 65/03/16/04
 Cache Hits: 277.7M Hits/s: 2801.8 Hits now:  12.1  Ratio: 99.8% Ratio now: 61.0%
 Key Efficiency: 99.5%  Bps in/out: 28.9k/230.5   Now in/out:  4.7k/45.6k

      Id      User         Host/IP         DB      Time    Cmd Query or State
      --      ----         -------         --      ----    --- ----------
  113642      root       localhost   gsf_gsf2         0  Query show full processlist
  113628 colorman_       localhost colorman_p        32  Sleep
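
A single processlist snapshot rarely catches the heavy user, so one option is to take many snapshots and count who shows up with active queries. The Python sketch below only does the counting. It assumes rows in the column order of SHOW FULL PROCESSLIST (Id, User, Host, db, Command, Time, State, Info) and assumes the account name is the part of the MySQL user before the first underscore. The function name `busy_accounts` is hypothetical, and the second snapshot is invented purely to show the counting.

```python
from collections import Counter

def busy_accounts(snapshots):
    """Count non-sleeping threads per account across processlist snapshots.

    Each snapshot is a list of tuples in SHOW FULL PROCESSLIST order:
    (Id, User, Host, db, Command, Time, State, Info).
    """
    seen = Counter()
    for rows in snapshots:
        for _id, user, _host, _db, command, _time, _state, _info in rows:
            if command == "Sleep":
                continue  # idle connections are not doing any work
            # Assumption: the account name precedes the first underscore.
            seen[user.split("_", 1)[0]] += 1
    return seen.most_common()

snapshots = [
    # First snapshot: the two rows from the mytop output above.
    [
        (113642, "root", "localhost", "gsf_gsf2", "Query", 0, None, "show full processlist"),
        (113628, "colorman_", "localhost", "colorman_p", "Sleep", 32, "", None),
    ],
    # Second snapshot: an invented row, for illustration only.
    [
        (113650, "colorman_", "localhost", "colorman_p", "Query", 3, "Sending data", "SELECT 1"),
    ],
]
print(busy_accounts(snapshots))
```

Collected every few seconds over an hour or so, the accounts at the top of this list are the ones worth a closer look.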
 

niccell

Well-Known Member
Aug 10, 2005
46
0
156
Track memory/cpu Usage

Hello!

I'm certainly no expert but I've run across this a time or twelve.. :(

If you install mytop, it will show you the MySQL processes. But I don't think the MySQL processes are the issue; it's the web page(s) that are driving them. Some of those pages look like they are processing for an awfully long time, and are probably driving the MySQL database into a fit.

If you log in as root and run ls -la on /proc/PID (where PID is the process ID) for some of the big resource-hungry httpd processes, it should show you which user is causing you grief. To be certain it is them, suspend them, restart httpd and see if the resources stay within normal parameters. Sometimes it is more than one account.
 

headout

We fixed it. In "WHM ---> CPU/Memory/MySQL Usage" we saw an account that was consuming huge resources. We suspended this account about 12 hours ago, and that has fixed it for now.

Thanks to all for the suggestions made!