
FreeBSD shows high load, cannot find bottleneck































So we have set up a server (11.0-RELEASE-p2) that hosts around 150-200 jails. The server has 24 cores and 192 GB of RAM. top shows no sign of stress, except for the high load. All jails reside on NFS mounts, and each jail mounts its own directory upon creation.
The server does not feel slow in any way; it's rather snappy. The one thing that bothers us is the high load we get.



Output from top:



last pid: 71841; load averages: 320.13, 131.33, 79.28 up 27+17:45:03 10:37:48
5325 processes:1 running, 5324 sleeping
CPU: 4.4% user, 0.0% nice, 1.6% system, 0.4% interrupt, 93.6% idle
Mem: 3116M Active, 23G Inact, 23G Wired, 900M Buf, 138G Free
ARC: 10G Total, 2612M MFU, 4553M MRU, 37M Anon, 89M Header, 2742M Other
Swap: 4096M Total, 4096M Free


As you can see, the load is high, memory has 138G free, and the CPU is 94% idle.
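For anyone wondering how a 1-minute load of 320 can coexist with a 94% idle CPU, here is a minimal sketch of an exponentially damped load average. It assumes the usual BSD-style scheme (the run queue is sampled roughly every 5 seconds and folded into a decaying average); the 5-second interval, the function names and the burst workload are illustrative assumptions, not something measured on this server.

```python
import math

SAMPLE_INTERVAL_S = 5.0  # assumed sampling period of the run-queue length
WINDOW_S = 60.0          # time constant of the 1-minute average


def update_load(load, runnable, interval=SAMPLE_INTERVAL_S, window=WINDOW_S):
    """Fold one run-queue sample into an exponentially damped average."""
    decay = math.exp(-interval / window)
    return load * decay + runnable * (1.0 - decay)


def simulate(samples, start=0.0):
    """Return the load average after each run-queue sample."""
    load = start
    history = []
    for runnable in samples:
        load = update_load(load, runnable)
        history.append(load)
    return history


# Hypothetical workload: 400 threads runnable at every sample for one
# minute, then only 1 runnable thread for the following minute.
burst = [400] * 12 + [1] * 12
history = simulate(burst)
print(f"after burst: {history[11]:.1f}, one minute later: {history[-1]:.1f}")
```

The point of the sketch is only that the figure is an average of sampled run-queue lengths: if many threads happen to be runnable at the sampling instants, the number climbs even when each thread needs very little CPU time.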



Output from systat -vmstat



 3 users Load 92.59 105 73.97 Feb 1 10:39
Mem usage: 26%Phy 6%Kmem
Mem: KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 21491k 223884 120800k 555864 144668k count
All 22230k 836948 142997k 4351592 pages
Proc: Interrupts
r p d s w Csw Trp Sys Int Sof Flt ioflt 3595 total
104 5k 13k 5848 20k 1362 127 1646 147 cow atkbd0 1
730 zfod 1 ata1 15
1.8%Sys 0.3%Intr 3.0%User 0.0%Nice 94.9%Idle ozfod ohci0 ohci
| | | | | | | | | | %ozfod ehci0 ohci
=>> daefr 107 cpu0:timer
dtbuf 622 prcfr 722 bce0 259
Namei Name-cache Dir-cache 3237762 desvn 2014 totfr 619 bce1 260
Calls hits % hits % 3237760 numvn react pcib7 263
41265 41201 100 2713450 frevn pdwak 21 mps0 264
1290 pdpgs ciss0 265
Disks da0 da1 cd0 pass0 pass1 pass2 intrn 74 cpu13:time
KB/t 13.33 14.76 0.00 0.00 0.00 0.00 24315624 wire 112 cpu4:timer
tps 10 17 0 0 0 0 3192008 act 147 cpu2:timer
MB/s 0.14 0.24 0.00 0.00 0.00 0.00 23921440 inact 54 cpu3:timer
%busy 0 0 0 0 0 0 cache 132 cpu5:timer
144669k free 52 cpu1:timer
921954 68 cpu19:time
99 cpu21:time
54 cpu20:time
59 cpu18:time
59 cpu22:time
82 cpu23:time
67 cpu12:time
68 cpu6:timer
79 cpu14:time
88 cpu15:time
111 cpu16:time
93 cpu17:time
49 cpu8:timer
251 cpu7:timer
102 cpu9:timer
176 cpu10:time
49 cpu11:time


As far as I can tell, nothing looks really strange there either. Sure, there are some interrupts, but from what I found by searching, the number we get is nothing compared to what people with actual interrupt problems report, which is more along the lines of 350,000 interrupts.



iostat -w 1



 tty da0 da1 cd0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
1 571 14.51 11 0.15 14.56 11 0.15 0.00 0 0.00 1 0 1 0 99
0 231 10.29 90 0.90 11.26 102 1.12 0.00 0 0.00 3 0 1 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 1 0 96
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 7 0 1 0 92
0 79 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 2 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 6 0 2 0 93
0 77 13.63 128 1.71 11.97 123 1.44 0.00 0 0.00 2 0 2 0 96
0 79 36.00 1 0.04 14.86 7 0.10 0.00 0 0.00 2 0 1 0 97
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 76 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 80 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 97
0 75 9.98 117 1.15 18.43 129 2.32 0.00 0 0.00 3 0 1 0 96
0 81 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 96


vmstat -w 1



procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id
3 0 0 115G 138G 297 0 2 0 653 373 0 0 224 59 1405 1 1 99
2 0 0 115G 138G 75 0 0 0 2017 1368 118 109 2299 23370 18920 6 2 92
2 0 0 115G 138G 1397 0 2 0 2839 1434 0 0 2665 30985 23294 5 4 91
2 0 0 115G 138G 1113 0 0 0 666 1373 0 0 2222 23078 17157 5 2 93
1 0 0 115G 138G 7 0 0 0 597 1368 0 0 590 18529 10477 2 1 96
1 0 0 115G 138G 0 0 2 0 194 2773 83 81 1269 26734 19190 3 3 94
1 0 0 115G 138G 9 0 0 0 90 1404 0 0 833 18907 11455 2 2 96
2 0 0 115G 138G 13 0 0 0 1309 1374 0 0 3185 25773 20054 3 3 94
1 0 0 115G 138G 1419 0 0 0 2750 1369 0 0 3899 25403 23252 7 4 90
0 0 0 115G 138G 776 0 1 0 164 1368 75 58 837 26261 16368 3 3 94
1 0 0 115G 138G 2336 0 5 0 2562 1367 0 0 1337 23287 13288 3 3 94
0 0 0 115G 138G 560 0 0 0 1193 2785 0 0 608 27176 14512 5 5 90
1 0 0 115G 138G 0 0 2 0 249 1369 0 0 702 18533 10700 1 2 97
1 0 0 115G 138G 3290 0 0 0 2313 1369 91 96 1461 22049 14726 6 3 91
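One thing worth comparing is the instantaneous run queue (the `r` column above) with the load average. A small sketch, assuming the output has been saved as text; only the first three sample rows from above are embedded, and the function name is made up for illustration:

```python
VMSTAT_SAMPLE = """\
r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id
3 0 0 115G 138G 297 0 2 0 653 373 0 0 224 59 1405 1 1 99
2 0 0 115G 138G 75 0 0 0 2017 1368 118 109 2299 23370 18920 6 2 92
2 0 0 115G 138G 1397 0 2 0 2839 1434 0 0 2665 30985 23294 5 4 91
"""


def run_queue_lengths(text):
    """Collect the first column (r) from every numeric vmstat row."""
    values = []
    for line in text.splitlines():
        fields = line.split()
        # Header rows start with a letter, data rows with a number.
        if fields and fields[0].isdigit():
            values.append(int(fields[0]))
    return values


lengths = run_queue_lengths(VMSTAT_SAMPLE)
mean_r = sum(lengths) / len(lengths)
print(f"samples: {lengths}, mean run queue: {mean_r:.2f}")
```

If the mean stays in the low single digits while the load average reports hundreds, the two numbers are clearly sampling different moments, and short bursts of runnable threads between the one-second vmstat samples would be one possible explanation.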


About NFS, I really don't know how to look for problems there. But here is the output from



nfsstat -c



Client Info:
Rpc Counts:
Getattr Setattr Lookup Readlink Read Write Create Remove
44956931 1020943 93567574 167 23609403 879028 514647 665228
Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access
36867 1387 1 24655 21955 6118822 0 26166205
Mknod Fsstat Fsinfo PathConf Commit
0 5489407 1 2270 830867
Rpc Info:
TimedOut Invalid X Replies Retries Requests
0 0 0 0 203906224
Cache Info:
Attr Hits Misses Lkup Hits Misses BioR Hits Misses BioW Hits Misses
-719986429 44956925 -1243965171 93531884 66678251 22460288 981123 879028
BioRLHits Misses BioD Hits Misses DirE Hits Misses Accs Hits Misses
144 167 14572148 5721030 5124486 1455 -1123294109 26165764
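The negative hit counts in the cache section look like counters that have wrapped around a signed 32-bit integer. That is an assumption on my part, and a counter may have wrapped more than once, so the recovered values below are lower bounds at best. A minimal sketch of undoing a single wrap and estimating the attribute cache hit ratio:

```python
def unwrap_signed32(value):
    """Reinterpret a signed 32-bit counter value as unsigned (one wrap only)."""
    return value % (1 << 32)


# Values copied from the Cache Info section above.
attr_hits = unwrap_signed32(-719986429)
attr_misses = 44956925
lookup_hits = unwrap_signed32(-1243965171)
access_hits = unwrap_signed32(-1123294109)

attr_hit_ratio = attr_hits / (attr_hits + attr_misses)
print(f"attr hits (unwrapped): {attr_hits}")
print(f"attr cache hit ratio (lower bound): {attr_hit_ratio:.2%}")
```

Positive values pass through unchanged, so the helper can be applied to every column without checking the sign first.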


and from



nfsstat -w 1 -c



GtAttr Lookup Rdlink Read Write Rename Access Rddir
5 0 0 5 0 0 0 2
9 342 0 9 0 0 42 9
12 91 0 21 0 0 21 4
0 2 0 0 0 0 2 0
0 1 0 0 0 0 0 0
0 5 0 0 0 0 2 0
5 124 0 5 0 0 0 2
6 12 0 5 0 0 12 2
4 0 0 5 0 0 0 2
9 0 0 10 0 0 0 4
4 0 0 5 0 0 0 2
50 1 0 14 0 0 0 7


and finally output from



systat -ifstat



 /0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 29.6

Interface Traffic Peak Total
lo0 in 34.285 KB/s 291.936 KB/s 69.263 GB
out 34.285 KB/s 291.936 KB/s 69.263 GB

bce1 in 792.808 KB/s 5.382 MB/s 707.266 GB
out 56.828 KB/s 238.912 KB/s 91.154 GB

bce0 in 21.711 KB/s 21.711 KB/s 17.338 GB
out 13.799 KB/s 287.402 KB/s 64.000 GB


As requested, here is the output from dmesg:



[larsemil@prison01 ~]$ dmesg
Limiting open port RST response from 213 to 200 packets/sec
Limiting open port RST response from 2636 to 200 packets/sec
pid 22548 (php-fpm), uid 10000: exited on signal 11
pid 26938 (wkhtmltopdf), uid 10000: exited on signal 6 (core dumped)
[zone: pf states] PF states limit reached
Limiting icmp ping response from 9592 to 200 packets/sec
Limiting icmp ping response from 611 to 200 packets/sec
Limiting icmp ping response from 1792 to 200 packets/sec
Limiting icmp ping response from 2650 to 200 packets/sec
Limiting icmp ping response from 316 to 200 packets/sec
Limiting icmp ping response from 1758 to 200 packets/sec
Limiting icmp ping response from 2478 to 200 packets/sec
Limiting icmp ping response from 578 to 200 packets/sec
Limiting icmp ping response from 2028 to 200 packets/sec
Limiting icmp ping response from 3175 to 200 packets/sec
Limiting icmp ping response from 245 to 200 packets/sec
Limiting icmp ping response from 536 to 200 packets/sec
Limiting icmp ping response from 229 to 200 packets/sec
Limiting icmp ping response from 546 to 200 packets/sec
Limiting icmp ping response from 2239 to 200 packets/sec
Limiting icmp ping response from 3414 to 200 packets/sec
Limiting icmp ping response from 3033 to 200 packets/sec
Limiting icmp ping response from 1018 to 200 packets/sec
Limiting icmp ping response from 270 to 200 packets/sec
pid 34239 (php-fpm), uid 10000: exited on signal 11
pid 68427 (php-fpm), uid 10000: exited on signal 11


Any ideas are welcome!









































  • This info might be totally wrong, but I have seen unusually high load caused by many subshells/processes being created every second (we had a case where one server started ~4-5k valid processes per second). After some fine-tuning we dropped that number to ~1k. There was no visible direct impact anywhere, similar to what you are describing. This was on a Debian Linux server, so I'm uncertain whether that can be an issue on BSD as well.

    – Dennis Nolte
    Feb 1 '17 at 10:09











  • Be careful, FreeBSD appears to calculate load average differently from how Linux does it. That said, I'm wondering whether there is any actual problem here at all, aside from the "load average" numbers feeling high without any other indication of excessive load?

    – a CVn
    Feb 1 '17 at 10:18











  • @MichaelKjörling this is basically my question as well. As the system in general feels snappy, I don't know if there really is a problem. Still, a load that occasionally climbs towards 300-400 seems rather excessive.

    – larsemil
    Feb 1 '17 at 10:19











  • It's still less than 10% of the processes running on the system at the time of your snapshot (320/5325=6.01%). FWIW, I just posted How is load average calculated on FreeBSD? on our sister site Unix & Linux because I was unable to actually locate any concrete information on how FreeBSD calculates the load average numbers, and your question piqued my curiosity.

    – a CVn
    Feb 1 '17 at 10:23
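    The ratio quoted in the comment above can be reproduced from the two header lines of the top output in the question. A rough sketch; the regular expressions are only assumed to match this particular header layout, and the function name is made up:

```python
import re

TOP_HEADER = """\
last pid: 71841; load averages: 320.13, 131.33, 79.28 up 27+17:45:03 10:37:48
5325 processes:1 running, 5324 sleeping
"""


def parse_top_header(text):
    """Return ([1min, 5min, 15min] load averages, process count)."""
    loads = re.search(
        r"load averages:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text
    )
    procs = re.search(r"(\d+) processes", text)
    if loads is None or procs is None:
        raise ValueError("unrecognised top header")
    return [float(x) for x in loads.groups()], int(procs.group(1))


loads, process_count = parse_top_header(TOP_HEADER)
ratio = loads[0] / process_count
print(f"1-minute load is {ratio:.2%} of the process count")
```

    This only puts the load figure in proportion to the number of processes; it says nothing about whether those processes were actually waiting for CPU.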












  • I don't have much experience with FreeBSD, but I know that until recently Ubuntu had a hard time calculating the real load on a host as soon as virtualisation came into play. I'm pretty sure your real load is much lower than the system shows you. BSD has a habit of adopting updates very late in non-experimental versions for stability reasons. Also, could it be that BSD restricts the load info reported by each jail for security reasons and therefore assumes a higher load?

    – Broco
    Feb 1 '17 at 10:46

















2















So we have set up a server(11.0-RELEASE-p2) that hosts around 150-200 jails. The server has 24 cores and 192gb of ram. When using top it shows no sign of stress - except the high load. All jails reside on NFS mounts and each jail mounts its own directory upon creation.
The server does not feel slow in any way, its rather snappy. The one thing that bothers us is the high load we get.



Output from top:



last pid: 71841; load averages: 320.13, 131.33, 79.28 up 27+17:45:03 10:37:48
5325 processes:1 running, 5324 sleeping
CPU: 4.4% user, 0.0% nice, 1.6% system, 0.4% interrupt, 93.6% idle
Mem: 3116M Active, 23G Inact, 23G Wired, 900M Buf, 138G Free
ARC: 10G Total, 2612M MFU, 4553M MRU, 37M Anon, 89M Header, 2742M Other
Swap: 4096M Total, 4096M Free


As you can see, the load is high, memory has 138G free and cpu is 94% idle.



Output from systat -vmstat



 3 users Load 92.59 105 73.97 Feb 1 10:39
Mem usage: 26%Phy 6%Kmem
Mem: KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 21491k 223884 120800k 555864 144668k count
All 22230k 836948 142997k 4351592 pages
Proc: Interrupts
r p d s w Csw Trp Sys Int Sof Flt ioflt 3595 total
104 5k 13k 5848 20k 1362 127 1646 147 cow atkbd0 1
730 zfod 1 ata1 15
1.8%Sys 0.3%Intr 3.0%User 0.0%Nice 94.9%Idle ozfod ohci0 ohci
| | | | | | | | | | %ozfod ehci0 ohci
=>> daefr 107 cpu0:timer
dtbuf 622 prcfr 722 bce0 259
Namei Name-cache Dir-cache 3237762 desvn 2014 totfr 619 bce1 260
Calls hits % hits % 3237760 numvn react pcib7 263
41265 41201 100 2713450 frevn pdwak 21 mps0 264
1290 pdpgs ciss0 265
Disks da0 da1 cd0 pass0 pass1 pass2 intrn 74 cpu13:time
KB/t 13.33 14.76 0.00 0.00 0.00 0.00 24315624 wire 112 cpu4:timer
tps 10 17 0 0 0 0 3192008 act 147 cpu2:timer
MB/s 0.14 0.24 0.00 0.00 0.00 0.00 23921440 inact 54 cpu3:timer
%busy 0 0 0 0 0 0 cache 132 cpu5:timer
144669k free 52 cpu1:timer
921954 68 cpu19:time
99 cpu21:time
54 cpu20:time
59 cpu18:time
59 cpu22:time
82 cpu23:time
67 cpu12:time
68 cpu6:timer
79 cpu14:time
88 cpu15:time
111 cpu16:time
93 cpu17:time
49 cpu8:timer
251 cpu7:timer
102 cpu9:timer
176 cpu10:time
49 cpu11:time


As far as i can tell nothing looks really strange there either. Sure, there are some interrupts but googling shows that interrupts in the amount we get there is nothing compared to what other people get when they have interrupt problems which are more in the line of 350 000 interrupts.



iostat -w 1



 tty da0 da1 cd0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
1 571 14.51 11 0.15 14.56 11 0.15 0.00 0 0.00 1 0 1 0 99
0 231 10.29 90 0.90 11.26 102 1.12 0.00 0 0.00 3 0 1 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 1 0 96
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 7 0 1 0 92
0 79 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 2 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 6 0 2 0 93
0 77 13.63 128 1.71 11.97 123 1.44 0.00 0 0.00 2 0 2 0 96
0 79 36.00 1 0.04 14.86 7 0.10 0.00 0 0.00 2 0 1 0 97
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 76 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 80 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 97
0 75 9.98 117 1.15 18.43 129 2.32 0.00 0 0.00 3 0 1 0 96
0 81 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 96


vmstat -w 1



procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id
3 0 0 115G 138G 297 0 2 0 653 373 0 0 224 59 1405 1 1 99
2 0 0 115G 138G 75 0 0 0 2017 1368 118 109 2299 23370 18920 6 2 92
2 0 0 115G 138G 1397 0 2 0 2839 1434 0 0 2665 30985 23294 5 4 91
2 0 0 115G 138G 1113 0 0 0 666 1373 0 0 2222 23078 17157 5 2 93
1 0 0 115G 138G 7 0 0 0 597 1368 0 0 590 18529 10477 2 1 96
1 0 0 115G 138G 0 0 2 0 194 2773 83 81 1269 26734 19190 3 3 94
1 0 0 115G 138G 9 0 0 0 90 1404 0 0 833 18907 11455 2 2 96
2 0 0 115G 138G 13 0 0 0 1309 1374 0 0 3185 25773 20054 3 3 94
1 0 0 115G 138G 1419 0 0 0 2750 1369 0 0 3899 25403 23252 7 4 90
0 0 0 115G 138G 776 0 1 0 164 1368 75 58 837 26261 16368 3 3 94
1 0 0 115G 138G 2336 0 5 0 2562 1367 0 0 1337 23287 13288 3 3 94
0 0 0 115G 138G 560 0 0 0 1193 2785 0 0 608 27176 14512 5 5 90
1 0 0 115G 138G 0 0 2 0 249 1369 0 0 702 18533 10700 1 2 97
1 0 0 115G 138G 3290 0 0 0 2313 1369 91 96 1461 22049 14726 6 3 91


About NFS i really dont know how to look for problems there. But here is a output from



nfsstat -c



Client Info:
Rpc Counts:
Getattr Setattr Lookup Readlink Read Write Create Remove
44956931 1020943 93567574 167 23609403 879028 514647 665228
Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access
36867 1387 1 24655 21955 6118822 0 26166205
Mknod Fsstat Fsinfo PathConf Commit
0 5489407 1 2270 830867
Rpc Info:
TimedOut Invalid X Replies Retries Requests
0 0 0 0 203906224
Cache Info:
Attr Hits Misses Lkup Hits Misses BioR Hits Misses BioW Hits Misses
-719986429 44956925 -1243965171 93531884 66678251 22460288 981123 879028
BioRLHits Misses BioD Hits Misses DirE Hits Misses Accs Hits Misses
144 167 14572148 5721030 5124486 1455 -1123294109 26165764


and from



nfsstat -w 1 -c



GtAttr Lookup Rdlink Read Write Rename Access Rddir
5 0 0 5 0 0 0 2
9 342 0 9 0 0 42 9
12 91 0 21 0 0 21 4
0 2 0 0 0 0 2 0
0 1 0 0 0 0 0 0
0 5 0 0 0 0 2 0
5 124 0 5 0 0 0 2
6 12 0 5 0 0 12 2
4 0 0 5 0 0 0 2
9 0 0 10 0 0 0 4
4 0 0 5 0 0 0 2
50 1 0 14 0 0 0 7


and finally output from



systat -ifstat



 /0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 29.6

Interface Traffic Peak Total
lo0 in 34.285 KB/s 291.936 KB/s 69.263 GB
out 34.285 KB/s 291.936 KB/s 69.263 GB

bce1 in 792.808 KB/s 5.382 MB/s 707.266 GB
out 56.828 KB/s 238.912 KB/s 91.154 GB

bce0 in 21.711 KB/s 21.711 KB/s 17.338 GB
out 13.799 KB/s 287.402 KB/s 64.000 GB


As requested dmesg:



[larsemil@prison01 ~]$ dmesg
Limiting open port RST response from 213 to 200 packets/sec
Limiting open port RST response from 2636 to 200 packets/sec
pid 22548 (php-fpm), uid 10000: exited on signal 11
pid 26938 (wkhtmltopdf), uid 10000: exited on signal 6 (core dumped)
[zone: pf states] PF states limit reached
Limiting icmp ping response from 9592 to 200 packets/sec
Limiting icmp ping response from 611 to 200 packets/sec
Limiting icmp ping response from 1792 to 200 packets/sec
Limiting icmp ping response from 2650 to 200 packets/sec
Limiting icmp ping response from 316 to 200 packets/sec
Limiting icmp ping response from 1758 to 200 packets/sec
Limiting icmp ping response from 2478 to 200 packets/sec
Limiting icmp ping response from 578 to 200 packets/sec
Limiting icmp ping response from 2028 to 200 packets/sec
Limiting icmp ping response from 3175 to 200 packets/sec
Limiting icmp ping response from 245 to 200 packets/sec
Limiting icmp ping response from 536 to 200 packets/sec
Limiting icmp ping response from 229 to 200 packets/sec
Limiting icmp ping response from 546 to 200 packets/sec
Limiting icmp ping response from 2239 to 200 packets/sec
Limiting icmp ping response from 3414 to 200 packets/sec
Limiting icmp ping response from 3033 to 200 packets/sec
Limiting icmp ping response from 1018 to 200 packets/sec
Limiting icmp ping response from 270 to 200 packets/sec
pid 34239 (php-fpm), uid 10000: exited on signal 11
pid 68427 (php-fpm), uid 10000: exited on signal 11


Any ideas are welcome!










share|improve this question
















bumped to the homepage by Community 2 days ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.















  • this info might be totally wrong but i have seen an unusual high load because of many subshells/procs beeing created every seconds (we had a case where one server did start up ~4-5k valid processes per second). After some finetuning we dropped that number to ~1k. No visible direct impact anywhere, similiar to what you are describing. This was on a debian linux server so uncertain if that can be an issue on BSD as well.

    – Dennis Nolte
    Feb 1 '17 at 10:09











  • Be careful, FreeBSD appears to calculate load average different from how Linux does it. That said, I'm wondering if there is any actual problem at all here, aside from that the "load average" numbers feel high but no other indication of any excessive load?

    – a CVn
    Feb 1 '17 at 10:18











  • @MichaelKjörling this is basicly my question as well. As the system in general feels snappy i dont know if there really is a problem. Still a load on occation towards 300-400 seems rather excessive.

    – larsemil
    Feb 1 '17 at 10:19











  • It's still less than 10% of the processes running on the system at the time of your snapshot (320/5325=6.01%). FWIW, I just posted How is load average calculated on FreeBSD? on our sister site Unix & Linux because I was unable to actually locate any concrete information on how FreeBSD calculates the load average numbers, and your question piqued my curiosity.

    – a CVn
    Feb 1 '17 at 10:23












  • I don't have much experience in FreeBSD, but I know that until recently Ubuntu had a hard time calculating the real load on a host as soon as virtualisation came into play. I'm pretty sure your real load is much lower than the system shows you. BSD has the habit of implementing updates very late in non-experimental versions due to stability reasons. Also can it be that BSD restricts the load info given by each jail due to security reasons and so it assumes a higher load?

    – Broco
    Feb 1 '17 at 10:46













2












2








2








So we have set up a server(11.0-RELEASE-p2) that hosts around 150-200 jails. The server has 24 cores and 192gb of ram. When using top it shows no sign of stress - except the high load. All jails reside on NFS mounts and each jail mounts its own directory upon creation.
The server does not feel slow in any way, its rather snappy. The one thing that bothers us is the high load we get.



Output from top:



last pid: 71841; load averages: 320.13, 131.33, 79.28 up 27+17:45:03 10:37:48
5325 processes:1 running, 5324 sleeping
CPU: 4.4% user, 0.0% nice, 1.6% system, 0.4% interrupt, 93.6% idle
Mem: 3116M Active, 23G Inact, 23G Wired, 900M Buf, 138G Free
ARC: 10G Total, 2612M MFU, 4553M MRU, 37M Anon, 89M Header, 2742M Other
Swap: 4096M Total, 4096M Free


As you can see, the load is high, memory has 138G free and cpu is 94% idle.



Output from systat -vmstat



 3 users Load 92.59 105 73.97 Feb 1 10:39
Mem usage: 26%Phy 6%Kmem
Mem: KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 21491k 223884 120800k 555864 144668k count
All 22230k 836948 142997k 4351592 pages
Proc: Interrupts
r p d s w Csw Trp Sys Int Sof Flt ioflt 3595 total
104 5k 13k 5848 20k 1362 127 1646 147 cow atkbd0 1
730 zfod 1 ata1 15
1.8%Sys 0.3%Intr 3.0%User 0.0%Nice 94.9%Idle ozfod ohci0 ohci
| | | | | | | | | | %ozfod ehci0 ohci
=>> daefr 107 cpu0:timer
dtbuf 622 prcfr 722 bce0 259
Namei Name-cache Dir-cache 3237762 desvn 2014 totfr 619 bce1 260
Calls hits % hits % 3237760 numvn react pcib7 263
41265 41201 100 2713450 frevn pdwak 21 mps0 264
1290 pdpgs ciss0 265
Disks da0 da1 cd0 pass0 pass1 pass2 intrn 74 cpu13:time
KB/t 13.33 14.76 0.00 0.00 0.00 0.00 24315624 wire 112 cpu4:timer
tps 10 17 0 0 0 0 3192008 act 147 cpu2:timer
MB/s 0.14 0.24 0.00 0.00 0.00 0.00 23921440 inact 54 cpu3:timer
%busy 0 0 0 0 0 0 cache 132 cpu5:timer
144669k free 52 cpu1:timer
921954 68 cpu19:time
99 cpu21:time
54 cpu20:time
59 cpu18:time
59 cpu22:time
82 cpu23:time
67 cpu12:time
68 cpu6:timer
79 cpu14:time
88 cpu15:time
111 cpu16:time
93 cpu17:time
49 cpu8:timer
251 cpu7:timer
102 cpu9:timer
176 cpu10:time
49 cpu11:time


As far as i can tell nothing looks really strange there either. Sure, there are some interrupts but googling shows that interrupts in the amount we get there is nothing compared to what other people get when they have interrupt problems which are more in the line of 350 000 interrupts.



iostat -w 1



 tty da0 da1 cd0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
1 571 14.51 11 0.15 14.56 11 0.15 0.00 0 0.00 1 0 1 0 99
0 231 10.29 90 0.90 11.26 102 1.12 0.00 0 0.00 3 0 1 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 1 0 96
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 7 0 1 0 92
0 79 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 2 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 6 0 2 0 93
0 77 13.63 128 1.71 11.97 123 1.44 0.00 0 0.00 2 0 2 0 96
0 79 36.00 1 0.04 14.86 7 0.10 0.00 0 0.00 2 0 1 0 97
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 76 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 80 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 97
0 75 9.98 117 1.15 18.43 129 2.32 0.00 0 0.00 3 0 1 0 96
0 81 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 96


vmstat -w 1



procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id
3 0 0 115G 138G 297 0 2 0 653 373 0 0 224 59 1405 1 1 99
2 0 0 115G 138G 75 0 0 0 2017 1368 118 109 2299 23370 18920 6 2 92
2 0 0 115G 138G 1397 0 2 0 2839 1434 0 0 2665 30985 23294 5 4 91
2 0 0 115G 138G 1113 0 0 0 666 1373 0 0 2222 23078 17157 5 2 93
1 0 0 115G 138G 7 0 0 0 597 1368 0 0 590 18529 10477 2 1 96
1 0 0 115G 138G 0 0 2 0 194 2773 83 81 1269 26734 19190 3 3 94
1 0 0 115G 138G 9 0 0 0 90 1404 0 0 833 18907 11455 2 2 96
2 0 0 115G 138G 13 0 0 0 1309 1374 0 0 3185 25773 20054 3 3 94
1 0 0 115G 138G 1419 0 0 0 2750 1369 0 0 3899 25403 23252 7 4 90
0 0 0 115G 138G 776 0 1 0 164 1368 75 58 837 26261 16368 3 3 94
1 0 0 115G 138G 2336 0 5 0 2562 1367 0 0 1337 23287 13288 3 3 94
0 0 0 115G 138G 560 0 0 0 1193 2785 0 0 608 27176 14512 5 5 90
1 0 0 115G 138G 0 0 2 0 249 1369 0 0 702 18533 10700 1 2 97
1 0 0 115G 138G 3290 0 0 0 2313 1369 91 96 1461 22049 14726 6 3 91


About NFS i really dont know how to look for problems there. But here is a output from



nfsstat -c



Client Info:
Rpc Counts:
Getattr Setattr Lookup Readlink Read Write Create Remove
44956931 1020943 93567574 167 23609403 879028 514647 665228
Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access
36867 1387 1 24655 21955 6118822 0 26166205
Mknod Fsstat Fsinfo PathConf Commit
0 5489407 1 2270 830867
Rpc Info:
TimedOut Invalid X Replies Retries Requests
0 0 0 0 203906224
Cache Info:
Attr Hits Misses Lkup Hits Misses BioR Hits Misses BioW Hits Misses
-719986429 44956925 -1243965171 93531884 66678251 22460288 981123 879028
BioRLHits Misses BioD Hits Misses DirE Hits Misses Accs Hits Misses
144 167 14572148 5721030 5124486 1455 -1123294109 26165764


and from



nfsstat -w 1 -c



GtAttr Lookup Rdlink Read Write Rename Access Rddir
5 0 0 5 0 0 0 2
9 342 0 9 0 0 42 9
12 91 0 21 0 0 21 4
0 2 0 0 0 0 2 0
0 1 0 0 0 0 0 0
0 5 0 0 0 0 2 0
5 124 0 5 0 0 0 2
6 12 0 5 0 0 12 2
4 0 0 5 0 0 0 2
9 0 0 10 0 0 0 4
4 0 0 5 0 0 0 2
50 1 0 14 0 0 0 7


and finally output from



systat -ifstat



 /0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 29.6

Interface Traffic Peak Total
lo0 in 34.285 KB/s 291.936 KB/s 69.263 GB
out 34.285 KB/s 291.936 KB/s 69.263 GB

bce1 in 792.808 KB/s 5.382 MB/s 707.266 GB
out 56.828 KB/s 238.912 KB/s 91.154 GB

bce0 in 21.711 KB/s 21.711 KB/s 17.338 GB
out 13.799 KB/s 287.402 KB/s 64.000 GB


As requested dmesg:



[larsemil@prison01 ~]$ dmesg
Limiting open port RST response from 213 to 200 packets/sec
Limiting open port RST response from 2636 to 200 packets/sec
pid 22548 (php-fpm), uid 10000: exited on signal 11
pid 26938 (wkhtmltopdf), uid 10000: exited on signal 6 (core dumped)
[zone: pf states] PF states limit reached
Limiting icmp ping response from 9592 to 200 packets/sec
Limiting icmp ping response from 611 to 200 packets/sec
Limiting icmp ping response from 1792 to 200 packets/sec
Limiting icmp ping response from 2650 to 200 packets/sec
Limiting icmp ping response from 316 to 200 packets/sec
Limiting icmp ping response from 1758 to 200 packets/sec
Limiting icmp ping response from 2478 to 200 packets/sec
Limiting icmp ping response from 578 to 200 packets/sec
Limiting icmp ping response from 2028 to 200 packets/sec
Limiting icmp ping response from 3175 to 200 packets/sec
Limiting icmp ping response from 245 to 200 packets/sec
Limiting icmp ping response from 536 to 200 packets/sec
Limiting icmp ping response from 229 to 200 packets/sec
Limiting icmp ping response from 546 to 200 packets/sec
Limiting icmp ping response from 2239 to 200 packets/sec
Limiting icmp ping response from 3414 to 200 packets/sec
Limiting icmp ping response from 3033 to 200 packets/sec
Limiting icmp ping response from 1018 to 200 packets/sec
Limiting icmp ping response from 270 to 200 packets/sec
pid 34239 (php-fpm), uid 10000: exited on signal 11
pid 68427 (php-fpm), uid 10000: exited on signal 11


Any ideas are welcome!










share|improve this question
















So we have set up a server(11.0-RELEASE-p2) that hosts around 150-200 jails. The server has 24 cores and 192gb of ram. When using top it shows no sign of stress - except the high load. All jails reside on NFS mounts and each jail mounts its own directory upon creation.
The server does not feel slow in any way, its rather snappy. The one thing that bothers us is the high load we get.



Output from top:



last pid: 71841; load averages: 320.13, 131.33, 79.28 up 27+17:45:03 10:37:48
5325 processes:1 running, 5324 sleeping
CPU: 4.4% user, 0.0% nice, 1.6% system, 0.4% interrupt, 93.6% idle
Mem: 3116M Active, 23G Inact, 23G Wired, 900M Buf, 138G Free
ARC: 10G Total, 2612M MFU, 4553M MRU, 37M Anon, 89M Header, 2742M Other
Swap: 4096M Total, 4096M Free


As you can see, the load is high, memory has 138G free and cpu is 94% idle.



Output from systat -vmstat



 3 users Load 92.59 105 73.97 Feb 1 10:39
Mem usage: 26%Phy 6%Kmem
Mem: KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 21491k 223884 120800k 555864 144668k count
All 22230k 836948 142997k 4351592 pages
Proc: Interrupts
r p d s w Csw Trp Sys Int Sof Flt ioflt 3595 total
104 5k 13k 5848 20k 1362 127 1646 147 cow atkbd0 1
730 zfod 1 ata1 15
1.8%Sys 0.3%Intr 3.0%User 0.0%Nice 94.9%Idle ozfod ohci0 ohci
| | | | | | | | | | %ozfod ehci0 ohci
=>> daefr 107 cpu0:timer
dtbuf 622 prcfr 722 bce0 259
Namei Name-cache Dir-cache 3237762 desvn 2014 totfr 619 bce1 260
Calls hits % hits % 3237760 numvn react pcib7 263
41265 41201 100 2713450 frevn pdwak 21 mps0 264
1290 pdpgs ciss0 265
Disks da0 da1 cd0 pass0 pass1 pass2 intrn 74 cpu13:time
KB/t 13.33 14.76 0.00 0.00 0.00 0.00 24315624 wire 112 cpu4:timer
tps 10 17 0 0 0 0 3192008 act 147 cpu2:timer
MB/s 0.14 0.24 0.00 0.00 0.00 0.00 23921440 inact 54 cpu3:timer
%busy 0 0 0 0 0 0 cache 132 cpu5:timer
144669k free 52 cpu1:timer
921954 68 cpu19:time
99 cpu21:time
54 cpu20:time
59 cpu18:time
59 cpu22:time
82 cpu23:time
67 cpu12:time
68 cpu6:timer
79 cpu14:time
88 cpu15:time
111 cpu16:time
93 cpu17:time
49 cpu8:timer
251 cpu7:timer
102 cpu9:timer
176 cpu10:time
49 cpu11:time


As far as i can tell nothing looks really strange there either. Sure, there are some interrupts but googling shows that interrupts in the amount we get there is nothing compared to what other people get when they have interrupt problems which are more in the line of 350 000 interrupts.



iostat -w 1



 tty da0 da1 cd0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
1 571 14.51 11 0.15 14.56 11 0.15 0.00 0 0.00 1 0 1 0 99
0 231 10.29 90 0.90 11.26 102 1.12 0.00 0 0.00 3 0 1 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 1 0 96
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 7 0 1 0 92
0 79 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 3 0 2 0 95
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 6 0 2 0 93
0 77 13.63 128 1.71 11.97 123 1.44 0.00 0 0.00 2 0 2 0 96
0 79 36.00 1 0.04 14.86 7 0.10 0.00 0 0.00 2 0 1 0 97
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 76 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 80 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 97
0 75 9.98 117 1.15 18.43 129 2.32 0.00 0 0.00 3 0 1 0 96
0 81 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 4 0 2 0 94
0 78 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 2 0 1 0 96


vmstat -w 1



procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id
3 0 0 115G 138G 297 0 2 0 653 373 0 0 224 59 1405 1 1 99
2 0 0 115G 138G 75 0 0 0 2017 1368 118 109 2299 23370 18920 6 2 92
2 0 0 115G 138G 1397 0 2 0 2839 1434 0 0 2665 30985 23294 5 4 91
2 0 0 115G 138G 1113 0 0 0 666 1373 0 0 2222 23078 17157 5 2 93
1 0 0 115G 138G 7 0 0 0 597 1368 0 0 590 18529 10477 2 1 96
1 0 0 115G 138G 0 0 2 0 194 2773 83 81 1269 26734 19190 3 3 94
1 0 0 115G 138G 9 0 0 0 90 1404 0 0 833 18907 11455 2 2 96
2 0 0 115G 138G 13 0 0 0 1309 1374 0 0 3185 25773 20054 3 3 94
1 0 0 115G 138G 1419 0 0 0 2750 1369 0 0 3899 25403 23252 7 4 90
0 0 0 115G 138G 776 0 1 0 164 1368 75 58 837 26261 16368 3 3 94
1 0 0 115G 138G 2336 0 5 0 2562 1367 0 0 1337 23287 13288 3 3 94
0 0 0 115G 138G 560 0 0 0 1193 2785 0 0 608 27176 14512 5 5 90
1 0 0 115G 138G 0 0 2 0 249 1369 0 0 702 18533 10700 1 2 97
1 0 0 115G 138G 3290 0 0 0 2313 1369 91 96 1461 22049 14726 6 3 91


As for NFS, I really don't know how to look for problems there, but here is the output from



nfsstat -c



Client Info:
Rpc Counts:
Getattr Setattr Lookup Readlink Read Write Create Remove
44956931 1020943 93567574 167 23609403 879028 514647 665228
Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access
36867 1387 1 24655 21955 6118822 0 26166205
Mknod Fsstat Fsinfo PathConf Commit
0 5489407 1 2270 830867
Rpc Info:
TimedOut Invalid X Replies Retries Requests
0 0 0 0 203906224
Cache Info:
Attr Hits Misses Lkup Hits Misses BioR Hits Misses BioW Hits Misses
-719986429 44956925 -1243965171 93531884 66678251 22460288 981123 879028
BioRLHits Misses BioD Hits Misses DirE Hits Misses Accs Hits Misses
144 167 14572148 5721030 5124486 1455 -1123294109 26165764


and from



nfsstat -w 1 -c



GtAttr Lookup Rdlink Read Write Rename Access Rddir
5 0 0 5 0 0 0 2
9 342 0 9 0 0 42 9
12 91 0 21 0 0 21 4
0 2 0 0 0 0 2 0
0 1 0 0 0 0 0 0
0 5 0 0 0 0 2 0
5 124 0 5 0 0 0 2
6 12 0 5 0 0 12 2
4 0 0 5 0 0 0 2
9 0 0 10 0 0 0 4
4 0 0 5 0 0 0 2
50 1 0 14 0 0 0 7
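A note on the negative "Hits" values in the nfsstat -c Cache Info rows above: the post does not explain them. One plausible reading, which is an assumption here and not something stated in the source, is that the counters are printed as signed 32-bit integers and have wrapped. The sketch below reinterprets the three negative readings as unsigned values; the helper name as_unsigned32 is made up for illustration.

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit counter reading as unsigned (modulo 2**32)."""
    return value & 0xFFFFFFFF


# Negative "Hits" readings copied from the nfsstat -c output above.
readings = {
    "Attr Hits": -719986429,
    "Lkup Hits": -1243965171,
    "Accs Hits": -1123294109,
}

# Assumption: the counters are 32 bits wide and wrapped at least once.
unsigned = {name: as_unsigned32(value) for name, value in readings.items()}

for name, value in unsigned.items():
    print(f"{name}: {value}")
```

If that assumption holds, the results are only known modulo 2**32, since a counter may have wrapped more than once. Either way, the negative numbers would be a display artifact and not an NFS fault in themselves.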


and finally output from



systat -ifstat



 /0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 29.6

Interface Traffic Peak Total
lo0 in 34.285 KB/s 291.936 KB/s 69.263 GB
out 34.285 KB/s 291.936 KB/s 69.263 GB

bce1 in 792.808 KB/s 5.382 MB/s 707.266 GB
out 56.828 KB/s 238.912 KB/s 91.154 GB

bce0 in 21.711 KB/s 21.711 KB/s 17.338 GB
out 13.799 KB/s 287.402 KB/s 64.000 GB


As requested dmesg:



[larsemil@prison01 ~]$ dmesg
Limiting open port RST response from 213 to 200 packets/sec
Limiting open port RST response from 2636 to 200 packets/sec
pid 22548 (php-fpm), uid 10000: exited on signal 11
pid 26938 (wkhtmltopdf), uid 10000: exited on signal 6 (core dumped)
[zone: pf states] PF states limit reached
Limiting icmp ping response from 9592 to 200 packets/sec
Limiting icmp ping response from 611 to 200 packets/sec
Limiting icmp ping response from 1792 to 200 packets/sec
Limiting icmp ping response from 2650 to 200 packets/sec
Limiting icmp ping response from 316 to 200 packets/sec
Limiting icmp ping response from 1758 to 200 packets/sec
Limiting icmp ping response from 2478 to 200 packets/sec
Limiting icmp ping response from 578 to 200 packets/sec
Limiting icmp ping response from 2028 to 200 packets/sec
Limiting icmp ping response from 3175 to 200 packets/sec
Limiting icmp ping response from 245 to 200 packets/sec
Limiting icmp ping response from 536 to 200 packets/sec
Limiting icmp ping response from 229 to 200 packets/sec
Limiting icmp ping response from 546 to 200 packets/sec
Limiting icmp ping response from 2239 to 200 packets/sec
Limiting icmp ping response from 3414 to 200 packets/sec
Limiting icmp ping response from 3033 to 200 packets/sec
Limiting icmp ping response from 1018 to 200 packets/sec
Limiting icmp ping response from 270 to 200 packets/sec
pid 34239 (php-fpm), uid 10000: exited on signal 11
pid 68427 (php-fpm), uid 10000: exited on signal 11


Any ideas are welcome!







freebsd high-load














edited Feb 15 '17 at 19:47 – larsemil

asked Feb 1 '17 at 10:01 – larsemil





bumped to the homepage by Community 2 days ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.





















  • This info might be totally wrong, but I have seen unusually high load caused by many subshells/processes being created every second (we had a case where one server started ~4-5k valid processes per second). After some fine-tuning we dropped that number to ~1k. There was no visible direct impact anywhere, similar to what you are describing. This was on a Debian Linux server, so I am uncertain whether it can be an issue on BSD as well.

    – Dennis Nolte
    Feb 1 '17 at 10:09











  • Be careful, FreeBSD appears to calculate load average differently from how Linux does it. That said, I'm wondering whether there is any actual problem here at all, aside from the "load average" numbers feeling high with no other indication of excessive load?

    – a CVn
    Feb 1 '17 at 10:18











  • @MichaelKjörling this is basically my question as well. As the system in general feels snappy, I don't know if there really is a problem. Still, a load that occasionally goes towards 300-400 seems rather excessive.

    – larsemil
    Feb 1 '17 at 10:19











  • It's still less than 10% of the processes running on the system at the time of your snapshot (320/5325=6.01%). FWIW, I just posted How is load average calculated on FreeBSD? on our sister site Unix & Linux because I was unable to actually locate any concrete information on how FreeBSD calculates the load average numbers, and your question piqued my curiosity.

    – a CVn
    Feb 1 '17 at 10:23












  • I don't have much experience with FreeBSD, but I know that until recently Ubuntu had a hard time calculating the real load on a host as soon as virtualisation came into play. I'm pretty sure your real load is much lower than the system shows you. BSD has the habit of implementing updates very late in non-experimental versions for stability reasons. Also, could it be that BSD restricts the load info given by each jail for security reasons, and so it assumes a higher load?

    – Broco
    Feb 1 '17 at 10:46
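The ratio quoted in the comments above can be checked with a few lines of arithmetic, along with a per-CPU view of the same load figure. The values 320 and 5325 are taken from the comments; the CPU count of 24 is read off the cpu0 to cpu23 timer entries in the systat output. This is only a sanity check of the numbers, not a diagnosis.

```python
load_average = 320       # load figure quoted in the comments
total_processes = 5325   # process count quoted in the comments
cpu_count = 24           # cpu0..cpu23 timers appear in the systat output

share_of_processes = load_average / total_processes
load_per_cpu = load_average / cpu_count

print(f"load as share of processes: {share_of_processes:.2%}")  # 6.01%
print(f"load per CPU: {load_per_cpu:.1f}")                       # 13.3
```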



























2 Answers

Can you post dmesg output and any log messages from /var/log/messages?



What I see is that you have a 196 GB RAM machine that is trying to do everything in 3 GB of RAM... it is probably swapping furiously.



Mem: 3116M Active, 23G Inact, 23G Wired, 900M Buf, 138G Free
ARC: 10G Total, 2612M MFU, 4553M MRU, 37M Anon, 89M Header, 2742M Other



Free RAM is bad. You need to use the RAM in the machine.
Please post the output of sysctl vfs.zfs.arc_max
Check here for ZFS tuning for the ARC



Jails themselves do basically nothing. Processes in the jails will show up in top if they are running - looks like not much is going on.



FreeBSD top is different, yes; the LA should be read relative to the number of cores (24). Your LA is high, but this is only because something cannot get the memory it needs.

















answered Feb 1 '17 at 15:52 – Stefan Caunter












  • I think it's the way FreeBSD reports load. A blog post gives more info (undeadly.org/cgi?action=article&sid=20090715034920) and says: 'On BSD, load is the number of processes which have (wanted to) run at least once in the most recent 5-second window, with a degradation over time. So, if you have a process that wakes up every 5 seconds and prints the time on your console, you have a load average of 1. Load is not the number of cpu cycles used.' That would explain the load, as there are so many jails running. The machine is not swapping and is barely using its disks at all.

    – larsemil
    Feb 1 '17 at 17:36












  • sysctl vfs.zfs.arc_max returns vfs.zfs.arc_max: 199730868224

    – larsemil
    Feb 1 '17 at 17:38











  • How about dmesg? Anything in the logs? 320 LA still means something is pinning the CPU...

    – Stefan Caunter
    Feb 1 '17 at 18:47











  • Updated with dmesg...

    – larsemil
    Feb 15 '17 at 19:47
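The load definition quoted in the first comment above can be illustrated with a toy model. It assumes the commonly described scheme: the number of processes wanting to run is sampled every 5 seconds and folded into an exponentially damped average. The process count (300), the CPU time per wakeup (1 ms), and the function names are made up for illustration; only the 24 CPUs come from the thread. It is a sketch of why load can be high while the CPUs are almost idle, not a model of this particular server.

```python
import math
from typing import Tuple

SAMPLE_INTERVAL = 5.0   # seconds between load samples
WINDOW = 60.0           # time constant of the 1-minute load average
DECAY = math.exp(-SAMPLE_INTERVAL / WINDOW)


def update_load(load: float, runnable: int) -> float:
    """One step of an exponentially damped average of the runnable count."""
    return load * DECAY + runnable * (1.0 - DECAY)


def simulate(processes: int, cpu_seconds_per_wakeup: float,
             cpus: int, samples: int) -> Tuple[float, float]:
    """Every process wants to run at each sample instant, but only briefly.

    Returns (load average, CPU utilisation as a fraction of all CPUs).
    """
    load = 0.0
    for _ in range(samples):
        load = update_load(load, processes)
    # CPU time consumed per interval, spread over all CPUs.
    busy = processes * cpu_seconds_per_wakeup
    utilisation = busy / (cpus * SAMPLE_INTERVAL)
    return load, utilisation


# Hypothetical workload: 300 processes, each needing 1 ms of CPU per wakeup.
load, utilisation = simulate(processes=300, cpu_seconds_per_wakeup=0.001,
                             cpus=24, samples=120)
print(f"load average ~ {load:.1f}, CPU utilisation ~ {utilisation:.2%}")
```

Under these assumptions the load settles near 300 while CPU utilisation stays around a quarter of a percent, which is the same shape as the numbers in the question.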

















try:



sysctl kern.eventtimer.timer=HPET














edited Feb 13 '17 at 20:00 – chicks

answered Feb 13 '17 at 18:18 – Allan Jude







  • Can you explain what this is doing or why?

    – chicks
    Feb 13 '17 at 20:00











  • Will this simply give better data or does it change anything else?

    – larsemil
    Feb 14 '17 at 8:32
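The answer gives no rationale, so the following is only background. As far as the FreeBSD eventtimers(4) manual describes it, kern.eventtimer.timer names the hardware event timer the kernel uses, and the read-only kern.eventtimer.choice lists the available timers with a quality value each. Before switching, it may help to confirm that HPET is actually offered on the host. The sketch below parses a kern.eventtimer.choice string; the sample string and the function name are illustrative and are not output from this server.

```python
import re


def parse_timer_choices(text: str) -> dict:
    """Parse 'NAME(quality) NAME(quality) ...' into {name: quality}."""
    pairs = re.findall(r"(\S+?)\((-?\d+)\)", text)
    return {name: int(quality) for name, quality in pairs}


# Illustrative value only; on a real host, take the string from
# `sysctl -n kern.eventtimer.choice`.
sample = "LAPIC(600) HPET(550) HPET1(440) i8254(100) RTC(0)"
choices = parse_timer_choices(sample)

print("available timers:", ", ".join(choices))
print("HPET available:", "HPET" in choices)
```

Reading the current setting with sysctl kern.eventtimer.timer first also makes it easy to switch back if the change has no effect.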











