High iowait is causing severe performance degradation. There were no changes to the server software or configuration. Sometimes the server runs very slowly and the number of Apache processes is high, yet there are no processes or scripts that would use a lot of disk resources.
The sites hosted on the customer's server do not generate high load and are not subject to any attacks.
I received this response from the DC:
"I did some research on this, and it seems other people have experienced similar issues and report that it is a kernel bug. Have you run yum update to get the latest kernel? If not, you can try updating and see whether that resolves your issue."
CENTOS 5.2
Cpanel / WHM
Linux *** 2.6.18-92.1.22.el5PAE #1 SMP Tue Dec 16 12:36:25 EST 2008 i686 i686 i386 GNU/Linux
I tried to run "yum update kernel", but it seems the installed kernel is already up to date.
  PID USER PRI NI SIZE  RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
 7775 root  15  0 1344 1344   916 S     0.2  0.0 0:02   0 top
14220 root  15  0 1216 1216   920 R     0.2  0.0 0:00   1 top -c
I have a server with two cPanel accounts, each running gallery2. Load has been high:
[url] [url] [url]
Sar output (done at a less busy time):
[url]
Hardware:
- 1 Intel 2.4 GHz 1066FSB - Conroe Xeon 3060 (Dual Core)
- 2 Generic 1024 MB DDR2 667 ECC
- 1 Maxtor 250GB SATA 7200RPM MaxLine Plus II
- 1 Unknown Onboard SATA
- 1 Dell Single socket 1067FSB - Quad Core Capable PowerEdge 840
How much %iowait is too much?
Since it's using less than 1GB of swap, is there enough RAM?
How do I get more specific than httpd as the cause of the load? The gallery script is the only thing running, so is the execution time of PHP the bottleneck? If PHP and Apache are single-threaded, would a quad core reduce load? Does lighttpd run PHP faster?
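To put a number on %iowait outside of sar, one option is to sample the aggregate "cpu" line of /proc/stat twice and compare the counters. The sketch below is a minimal illustration, not a diagnosis tool: the function names and the two sample lines are made up for the example, and it assumes the usual column order (user, nice, system, idle, iowait, irq, softirq, steal). It does not say what value is "too much"; it only shows how the percentage is derived.

```python
def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples, in percent."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Columns are user, nice, system, idle, iowait, irq, softirq, steal;
    # any later guest columns are already counted inside user/nice, so skip them.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


# Hypothetical samples taken a few seconds apart (not from the poster's server).
SAMPLE_BEFORE = "cpu 100 0 50 800 50 0 0 0"
SAMPLE_AFTER = "cpu 200 0 100 1200 500 0 0 0"

print(iowait_percent(SAMPLE_BEFORE, SAMPLE_AFTER))
```

On a live system the two strings would come from reading the first line of /proc/stat at two points in time.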
Programs compiled with the Intel C++ compiler run 20-30% (up to 450% in some rare cases) faster than their GCC counterparts.
I have successfully compiled MySQL 5, PHP 5, eAccelerator and nginx with ICC 10.1.015. I was wondering whether anyone has compiled the Linux kernel with icc. There are some Linux vendors who supply icc-compiled kernels, and there are some docs on Intel's site about compiling the kernel with icc, so it's doable.
But I want to know if anyone has done it on a VPS? What was your experience? Did the system run faster than before? Or did you face some weird situations?
I will be starting a GSP (Game Server Provider) within the next 4-6 months.
One big factor I will need to tackle is optimizing the Linux kernel on CentOS for game servers.
This entails tuning the kernel to allow for 1000 fps in Counter-Strike and Counter-Strike: Source, as well as tailoring the kernel to provide very reliable server-side fps.
My question is: what would be some good sources on doing this, how would I go about rebuilding the kernel, and is it worth learning to do this myself even though my knowledge of Linux is fairly minimal?
Also, does anyone know the specifics of what I need to modify in order to do this?
Fedora Core 6

In a first step, the yum repository must be configured. The .repo file is available in Ingo Molnar's project directory [url].

# cd /etc/yum.repos.d
# wget http://people.redhat.com/mingo/realtime-preempt/rt.repo

Only the first time, yum must be called in installation mode:

# yum install kernel-rt

Later on, the command

# yum update kernel-rt

will update the kernel patch, should a more recent version be available. By default, the newly installed kernel is made the default boot kernel. The realtime-enabled kernel is, therefore, immediately active after the system has been restarted.

To check if enabled type:

# cat /proc/sys/kernel/kernel_preemption
1

Then this:

# cat /proc/sys/kernel/preempt_max_latency
39

As a debugging aid, the condition that led to this latency is also available:

# cat /proc/latency_trace
preemption latency trace v1.1.5 on 2.6.19-1.rt10.0001
--------------------------------------------------------------------
 latency: 39 us, #2/2, CPU#0 | (M:rt VP:0, KP:0, SP:1 HP:1 #P:1)
    -----------------
    | task: posix_cpu_timer-3 (uid:0 nice:0 policy:1 rt_prio:99)
    -----------------

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
   <...>-3     0...1   39us : __schedule (__schedule)

As a rule of thumb, the maximum (worst-case) latency amounts to approximately 10^5 / clock frequency on an idle system. For example, a maximum latency of about 100 microseconds can be expected on a 1 GHz CPU. If the value displayed is off by more than one order of magnitude, something is not working correctly.
Totally lost here! Can someone explain this latency figure of 39? How do I know whether it is a good or a bad figure?
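One way to judge the figure is to apply the quoted rule of thumb directly. The sketch below reads the rule as 10^5 / clock frequency (in Hz), which is what the "100 microseconds at 1 GHz" example implies, and treats a value within a factor of ten of that estimate as plausible. The function names are invented for this example, and the 2.4 GHz clock is only an assumed value, since the post does not state the CPU speed.

```python
def expected_max_latency_us(clock_hz):
    """Rule-of-thumb worst-case latency in microseconds: 10^5 / clock frequency."""
    # 1e5 / clock_hz is in seconds; multiply by 1e6 to get microseconds.
    return 1e5 * 1e6 / clock_hz


def within_one_order_of_magnitude(measured_us, clock_hz):
    """True if the measured latency is within a factor of 10 of the estimate."""
    ratio = measured_us / expected_max_latency_us(clock_hz)
    return 0.1 <= ratio <= 10.0


# 39 us is the value from preempt_max_latency above; 2.4 GHz is an assumed clock.
ASSUMED_CLOCK_HZ = 2.4e9
print(round(expected_max_latency_us(ASSUMED_CLOCK_HZ), 1))
print(within_one_order_of_magnitude(39, ASSUMED_CLOCK_HZ))
```

Under that assumption the estimate is roughly 42 microseconds, so 39 would sit in the expected range; a reading in the milliseconds would be the kind of value the rule flags.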
I was wondering if anyone has had luck with 3ware 9650's in a RAID 10 to help bring down iowait?
I currently have a VPS node that sees spikes in iowait and I'd like to bring it down. I've been considering getting a 9650 in a JBOD setup (so that I don't need to reinstall the whole node) with a battery backup to enable the cache.
My two questions are:
- In JBOD, does the write cache enable at all?
- Would the write cache help bring down the iowait much?
Drive-wise, we have 4 x 500GB drives in RAID 10 (software for now).
I am currently researching the options open to me for Virtualisation, the two main ones I have seen are Xen or KVM.
I mainly use CentOS (RHEL), but have read that the version of Xen shipped with it is very old, broken and unstable. KVM isn't included in the kernel that ships with CentOS because that kernel is too old; apparently KVM first appeared in kernel v2.6.20. There isn't likely to be an update until RHEL6, which is due for release in the first quarter of 2010. I can't wait over a year, so I need to find another distro to use as the host OS/hypervisor.
I have built a pretty powerful server with an Intel Xeon 3230, which has VT, so I might be better off using KVM over Xen. I am going to colocate this server, so realistically I can make this decision only once, as it would be a PITA to re-install a host Linux distro remotely.
I did a search on distrowatch for distros with the latest version of the kernel, and Slackware came up as being just one minor version behind the most current (v2.6.27.7).
Now this distro is very mature, so should be a fairly safe bet, but it is a 32bit version and can't host 64bit VMs. I have 8GB of ram so want to be able to use it all, and offer the choice of 32/64bit VMs. So that's that out of the window.
I have used Arch Linux on and off for a couple of years as a workstation OS, but because it is so bleeding edge, when pacman updates it can break itself. But I suppose if I just use it as the Host OS, and never let it update/reboot, then it won't break. It should be fairly lightweight and stable, as I will be installing the bare minimum packages. I have a management card, so if the server fails to boot, then I can still remote in to fix it.
If I do want to update the kernel, is it possible to update without rebooting? I think it is somehow... unless I can just reboot during an unused time at 3am or something.
As you can tell I am leaning towards KVM on Arch Linux (x86_64). Is this a good plan?
Is it safe to upgrade to the latest Linux kernel version, 2.6.19.2 (released on January 10th, 2007)? Are there any reported problems, or has anyone faced issues after upgrading?
[mysqlhotcopy]
interactive-timeout

Whenever there are a lot of members online on my forum, the iowait shoots up to over 30%, often hovering around 60%. It eventually drops back to normal levels. However, during the high iowait there is over 2.5GB of free memory according to free -m.
Is it due to MySQL not being optimized, or to other processes? I don't think the bottleneck is the CPU or RAM.
We have one box at hivelocity.net that has been down so many times this month that we were forced to remove our links to siteuptime, where we were once so proud of showing 99.7% uptime over 3 years at theplanet.
syslog shows that just before crashing, these entries were made:
kernel: kernel BUG at mm/rmap.c:479
kernel: invalid operand:0000 [#1]
dmesg also shows this:
...
Brought up 2 CPUs
zapping low mappings.
checking if image is initramfs... it is
Freeing initrd memory: 482k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xf9f20, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
...
I've googled these messages and they point to RAM problems.
hivelocity.net claims to have run diagnostics on the box and says no problems were reported.
They said this is the result of a system configuration problem made by us.
Last year I ordered a new server with CentOS 4.3, and it had kernel 2.6.9-34.0.2ELsmp installed. It ran fine and I haven't updated any packages since then.
Today I started getting a problem where both mysqld and kswapd0 use very high amounts of CPU, spiking up to 100%, and my memory usage is at 99% all the time. The problem seems exactly the same as the one mentioned in this thread.
In that thread the exact same kernel is said to be insecure and to cause this problem. I also came across a CentOS bug that reports this problem, with high CPU and memory usage and mysql and kswapd0 consuming all resources.
In the linked thread the person solved the problem by upgrading to kernel 2.6.9-42 using rpms but others recommended a newer kernel or a custom compiled kernel for CentOS.
Apparently when they used yum it said 34.0.2 was the latest kernel.
What should I do to upgrade the kernel, which version should I upgrade to, and where do I get it from? I won't be able to compile a custom kernel, and I've only installed basic rpm packages before.
I have been running into a high load problem lately. I have one of those cheap 1and1 servers, which was running fine until 2 weeks ago. After I rebooted it accidentally, it did not come back because of some unrepairable kernel errors, and I had to re-image it.
I chose to re-image the server with CentOS 5, for better support. The new image worked fine for some days, or so I thought, and now I am seeing high loads. The server crashes if it is not monitored constantly, as the load is unpredictable.
Just restarting Apache brings the server back to normal, but I am not sure whether Apache or some other script is to blame. I have been monitoring through Apache server-status, but I cannot recognize anything unusual in the high-load moments.
12:00:29 AM CPU %user %nice %system %iowait %steal %idle
12:10:01 AM all 9.14 0.00 5.52 44.66 0.00 40.68
12:20:14 AM all 6.83 0.00 3.98 27.88 0.00 61.32
12:30:10 AM all 6.44 0.00 4.20 81.25 0.00 8.11
12:40:09 AM all 5.25 0.00 4.09 81.93 0.00 8.73
12:50:15 AM all 5.11 0.00 3.79 90.74 0.00 0.36
01:00:07 AM all 7.22 0.00 4.52 57.11 0.00 31.15
01:10:13 AM all 6.89 0.00 4.01 55.38 0.00 33.71
01:20:14 AM all 4.37 0.00 3.27 41.88 0.00 50.48
01:30:25 AM all 4.26 0.00 3.29 63.42 0.00 29.03
01:40:06 AM all 27.18 0.00 4.75 58.27 0.00 9.80
01:50:03 AM all 29.64 0.00 6.61 51.50 0.00 12.25
02:00:07 AM all 27.00 0.00 8.48 55.49 0.00 9.03
02:10:10 AM all 19.29 0.00 4.97 73.80 0.00 1.94
02:20:04 AM all 37.85 0.00 6.78 40.70 0.00 14.67
02:30:05 AM all 15.65 0.00 4.80 68.47 0.00 11.08
02:40:08 AM all 9.06 0.00 5.60 37.49 0.00 47.86
02:50:07 AM all 5.36 0.00 3.62 42.29 0.00 48.73
03:00:02 AM all 6.05 0.00 4.08 47.27 0.00 42.60
03:10:02 AM all 4.22 0.00 3.68 38.17 0.00 53.93
03:20:02 AM all 4.06 0.00 3.75 41.37 0.00 50.82
03:30:22 AM all 4.42 0.00 3.93 45.25 0.00 46.41
03:40:11 AM all 4.34 0.00 3.95 39.58 0.00 52.13
03:50:02 AM all 4.67 0.00 4.01 32.53 0.00 58.80
04:00:08 AM all 3.72 0.00 3.87 28.40 0.00 64.02
04:10:02 AM all 13.49 0.00 6.58 20.82 0.00 59.10
04:20:01 AM all 6.70 0.00 4.63 6.06 0.00 82.61
04:30:02 AM all 1.44 0.00 1.21 4.75 0.00 92.59
04:40:01 AM all 12.42 0.00 8.12 7.65 0.00 71.81
04:50:02 AM all 1.43 0.00 1.07 4.02 0.00 93.47
05:00:02 AM all 1.60 0.00 1.40 8.62 0.00 88.38
05:10:10 AM all 3.80 0.00 3.02 17.86 0.00 75.32
05:20:06 AM all 5.10 0.00 4.22 23.34 0.00 67.34
05:30:02 AM all 1.54 0.00 1.40 11.22 0.00 85.85
05:40:05 AM all 1.75 0.00 1.89 13.12 0.00 83.23
05:50:12 AM all 2.15 0.00 2.22 18.92 0.00 76.72
06:00:02 AM all 1.92 0.00 2.01 12.87 0.00 83.20
06:10:02 AM all 2.27 0.00 2.16 11.53 0.00 84.04
06:20:03 AM all 3.56 0.00 3.02 25.26 0.00 68.16
06:30:10 AM all 2.66 0.00 2.05 18.13 0.00 77.16
06:40:02 AM all 2.58 0.00 2.25 22.87 0.00 72.30
06:50:02 AM all 2.68 0.00 1.92 15.77 0.00 79.63
07:00:03 AM all 3.06 0.00 2.48 26.01 0.00 68.46
07:10:03 AM all 3.65 0.00 3.20 36.54 0.00 56.61

07:10:03 AM CPU %user %nice %system %iowait %steal %idle
07:20:03 AM all 4.40 0.00 3.28 43.86 0.00 48.46
07:30:02 AM all 4.10 0.00 3.17 31.30 0.00 61.43
07:40:06 AM all 7.67 0.00 3.95 50.79 0.00 37.59
07:50:02 AM all 4.72 0.00 3.11 44.30 0.00 47.86
08:00:03 AM all 5.57 0.00 3.72 47.15 0.00 43.56
08:10:07 AM all 10.66 0.00 3.59 71.62 0.00 14.13
08:20:17 AM all 5.67 0.00 3.42 58.81 0.00 32.10
08:30:10 AM all 11.12 0.00 3.49 76.71 0.00 8.67
08:40:03 AM all 7.00 0.00 3.36 47.94 0.00 41.71
Average: all 7.53 0.00 3.76 38.90 0.00 49.81

Some configurations: The reimage partitioning looks like this:
processor  : 1
vendor_id  : GenuineIntel
cpu family : 15
model      : 3
model name : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping   : 4
cpu MHz    : 2793.324
cache size : 1024 KB
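To see at a glance when the sar figures above were worst, the intervals can be filtered by their %iowait column. This is only a sketch: it assumes the 12-hour "sar -u" layout shown in this post (time, AM/PM, "all", then six percentages), the function name is made up, and the 50% cut-off is an arbitrary illustration rather than a recommended limit. The sample rows are copied from the output above.

```python
def busy_intervals(sar_text, threshold=50.0):
    """Return (timestamp, %iowait) for rows whose %iowait meets the threshold."""
    hits = []
    for line in sar_text.splitlines():
        parts = line.split()
        # Data rows look like: HH:MM:SS AM all %user %nice %system %iowait %steal %idle
        # Header rows have "CPU" in the third column and Average rows are shorter.
        if len(parts) != 9 or parts[2] != "all":
            continue
        iowait = float(parts[6])
        if iowait >= threshold:
            hits.append((parts[0] + " " + parts[1], iowait))
    return hits


SAMPLE = """\
12:00:29 AM CPU %user %nice %system %iowait %steal %idle
12:10:01 AM all 9.14 0.00 5.52 44.66 0.00 40.68
12:30:10 AM all 6.44 0.00 4.20 81.25 0.00 8.11
12:50:15 AM all 5.11 0.00 3.79 90.74 0.00 0.36
04:30:02 AM all 1.44 0.00 1.21 4.75 0.00 92.59
Average: all 7.53 0.00 3.76 38.90 0.00 49.81
"""

for when, value in busy_intervals(SAMPLE):
    print(when, value)
```

Lining the flagged timestamps up against cron jobs, backups or the Apache access log is one way to narrow down what was running at the time.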
Here is what I saw when I installed kernel-2.6.20-1.2948.fc6.src.rpm:

rpm -ivh kernel-2.6.20-1.2948.fc6.src.rpm
1:kernel warning: user brewbuilder does not exist - using root
warning: group brewbuilder does not exist - using root
warning: user brewbuilder does not exist - using root
########################################### [100%]
warning: user brewbuilder does not exist - using root
warning: group brewbuilder does not exist - using root

Then when I ran:

rpmbuild -bp --target=$(uname -m) /usr/src/redhat/SPECS/kernel-2.6.spec

I saw this error:

+ Arch=x86_64
+ make ARCH=x86_64 nonint_oldconfig
In file included from /usr/include/sys/socket.h:35,
                 from /usr/include/netinet/in.h:24,
                 from /usr/include/arpa/inet.h:23,
                 from scripts/basic/fixdep.c:117:
/usr/include/bits/socket.h:310:24: error: asm/socket.h: No such file or directory
make[1]: *** [scripts/basic/fixdep] Error 1
make: *** [scripts_basic] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.93770 (%prep)

I need this installed in order to get an app installed. Any suggestions or ideas? Thanks.
I have a Xen VPS. I started with a Debian 4 image and have since upgraded to Debian 5. Firstly was this advisable? Secondly what Kernel version should I be running, or rather is it set by my installation or by the Xen server?
As part of a project I have lately been looking into various aspects of kernel tuning, most notably tuning the TCP stack for more efficient memory usage and throughput.
I thought I would start this thread to mention some of the tools I'd found for testing and to see what anyone else had to recommend.
So far my favorite of the bunch is nuttcp. It's easy to use and gives a very good idea of how much of your bandwidth you are able to utilize.
A few interesting web pages are as follows for anyone interested in the topic:
[url] - Tuning TCP for High Bandwidth Delay networks
[url] - TCP Tuning Cook book, some interesting information in there as well
[url]...formanceTuning - Performance Tuning TWiki. Has a list of useful tools, flags for existing tools and ways to monitor network performance from a system level, along with some suggestions of things to correct
What is the best way to find out which filesystems and hard drive drivers you can remove? Obviously I need ext2/ext3, but how do you find out which hard drive driver is the only one you need?
I cannot find any important info in /var/log/messages,
but I find some records repeated many times in it, like these:
----------------------------------
Jun 15 05:30:40 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jun 15 05:30:40 server kernel: ata1.00: (irq_stat 0x40000001)
Jun 15 05:30:40 server kernel: ata1.00: cmd 25/00:08:42:23:d2/00:00:2c:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jun 15 05:30:40 server kernel: res 51/40:00:42:23:d2/00:00:2c:00:00/e0 Emask 0x9 (media error)
Jun 15 05:30:40 server kernel: ata1.00: configured for UDMA/133
Jun 15 05:30:40 server kernel: ata1: EH complete
Jun 15 05:30:42 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jun 15 05:30:42 server kernel: ata1.00: (irq_stat 0x40000001)
Jun 15 05:30:42 server kernel: ata1.00: cmd 25/00:08:42:23:d2/00:00:2c:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jun 15 05:30:42 server kernel: res 51/40:00:42:23:d2/00:00:2c:00:00/e0 Emask 0x9 (media error)
Jun 15 05:30:42 server kernel: ata1.00: configured for UDMA/133
Jun 15 05:30:42 server kernel: ata1: EH complete
Jun 15 05:30:44 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jun 15 05:30:51 server kernel: ata1.00: (irq_stat 0x40000001)
Jun 15 05:30:51 server kernel: ata1.00: cmd 25/00:08:42:23:d2/00:00:2c:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jun 15 05:30:51 server kernel: res 51/40:00:42:23:d2/00:00:2c:00:00/e0 Emask 0x9 (media error)
Jun 15 05:30:51 server kernel: ata1.00: configured for UDMA/133
Jun 15 05:30:51 server kernel: ata1: EH complete
Jun 15 05:30:51 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jun 15 05:30:51 server kernel: ata1.00: (irq_stat 0x40000001)
Jun 15 05:30:51 server kernel: ata1.00: cmd 25/00:08:42:23:d2/00:00:2c:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jun 15 05:30:51 server kernel: res 51/40:00:42:23:d2/00:00:2c:00:00/e0 Emask 0x9 (media error)
Jun 15 05:30:51 server kernel: ata1.00: configured for UDMA/133
Jun 15 05:30:51 server kernel: ata1: EH complete
Jun 15 05:30:51 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jun 15 05:30:51 server kernel: ata1.00: (irq_stat 0x40000001)
Jun 15 05:30:51 server kernel: ata1.00: cmd 25/00:08:42:23:d2/00:00:2c:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jun 15 05:30:51 server kernel: res 51/40:00:42:23:d2/00:00:2c:00:00/e0 Emask 0x9 (media error)
Jun 15 05:30:51 server kernel: ata1.00: configured for UDMA/133
Jun 15 05:30:52 server kernel: ata1: EH complete

Jun 15 05:31:26 server kernel: ata1.00: configured for UDMA/133
Jun 15 05:31:30 server kernel: sd 0:0:0:0: SCSI error: return code = 0x08000002
Jun 15 05:31:33 server kernel: sda: Current [descriptor]: sense key: Medium Error
Jun 15 05:31:36 server kernel: Add. Sense: Unrecovered read error - auto reallocate failed
Jun 15 05:31:36 server kernel:
Jun 15 05:31:39 server kernel: Descriptor sense data with sense descriptors (in hex):
Jun 15 05:31:46 server kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Jun 15 05:31:51 server kernel: 2c d2 23 42
Jun 15 05:31:56 server kernel: end_request: I/O error, dev sda, sector 751969090
Jun 15 05:31:57 server kernel: ata1: EH complete
Jun 15 05:31:57 server kernel: SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
Jun 15 05:31:58 server kernel: sda: Write Protect is off
Jun 15 05:31:58 server kernel: SCSI device sda: drive cache: write back
Jun 15 05:31:59 server kernel: SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
Jun 15 05:32:03 server kernel: sda: Write Protect is off
Jun 15 05:32:04 server kernel: SCSI device sda: drive cache: write back
-------------------
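The repeated entries above all appear to point at the same place on the disk, which can be checked by pulling the sector numbers out of the log. The sketch below does two things: it collects "end_request: I/O error" sectors, and it decodes the LBA from the "cmd" taskfile dump. The taskfile field order used here (command / feature:nsect:lbal:lbam:lbah / the same five "hob" bytes / device) is my understanding of libata's error output and should be treated as an assumption; the function names are invented for the example.

```python
import re

IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
TASKFILE = re.compile(
    r"cmd [0-9a-f]{2}/([0-9a-f]{2}(?::[0-9a-f]{2}){4})"
    r"/([0-9a-f]{2}(?::[0-9a-f]{2}){4})/[0-9a-f]{2}"
)


def failed_sectors(log_text):
    """Return unique (device, sector) pairs from end_request I/O error lines."""
    seen = []
    for device, sector in IO_ERROR.findall(log_text):
        pair = (device, int(sector))
        if pair not in seen:
            seen.append(pair)
    return seen


def lba48_from_taskfile(line):
    """Decode the 48-bit LBA from a libata 'cmd' line, or None if absent."""
    match = TASKFILE.search(line)
    if match is None:
        return None
    # Assumed order: feature, nsect, lbal, lbam, lbah (then the high-order set).
    _feature, _nsect, lbal, lbam, lbah = match.group(1).split(":")
    _hfeature, _hnsect, hob_lbal, hob_lbam, hob_lbah = match.group(2).split(":")
    return int(hob_lbah + hob_lbam + hob_lbal + lbah + lbam + lbal, 16)


SAMPLE_LOG = """\
Jun 15 05:30:40 server kernel: ata1.00: cmd 25/00:08:42:23:d2/00:00:2c:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jun 15 05:31:56 server kernel: end_request: I/O error, dev sda, sector 751969090
"""

print(failed_sectors(SAMPLE_LOG))
print(lba48_from_taskfile(SAMPLE_LOG.splitlines()[0]))
```

Both values come out as sector 751969090 on sda, which also matches the "2c d2 23 42" bytes in the sense data (0x2cd22342), so every logged failure refers to one unreadable sector.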
I copied the default config file and renamed it as .config but I get this:
Code:
WARNING: No module dm-mem-cache found for kernel 2.6.27.10-grsec, continuing anyway
WARNING: No module dm-region_hash found for kernel 2.6.27.10-grsec, continuing anyway
WARNING: No module dm-message found for kernel 2.6.27.10-grsec, continuing anyway
WARNING: No module dm-raid45 found for kernel 2.6.27.10-grsec, continuing anyway
My current kernel version is "2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:17:21 EST 2007 i686 athlon i386 GNU/Linux". I want it to be upgraded since it is old. I have been told by our server management company that the latest kernel distributed from yum is kernel.i686 0:2.6.9-78.0.22.E. Can anyone tell me if this version is safe and secure enough? It is a CentOS release 4.7 (Final) server with cPanel installed.
When building 2.6.26+ (or whatever it is), how do you enable conntrack? What options do I need to enable under make menuconfig?
net.netfilter.nf_conntrack_acct = 1
net.netfilter.nf_conntrack_generic_timeout = 120
error: "net.netfilter.nf_conntrack_icmp_timeout" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_close" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_time_wait" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_last_ack" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_close_wait" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_fin_wait" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_established" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_syn_recv" is an unknown key
error: "net.netfilter.nf_conntrack_tcp_timeout_syn_sent" is an unknown key
error: "net.netfilter.nf_conntrack_udp_timeout" is an unknown key
error: "net.netfilter.nf_conntrack_udp_timeout_stream" is an unknown key
net.netfilter.nf_conntrack_max = 262144
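When sysctl reports an unknown key, the matching file under /proc/sys does not exist on the running kernel. A small sketch like the one below can list the rejected keys and the paths to look for after changing the kernel configuration or loading modules. The helper names are made up for this example, the dot-to-slash mapping is the ordinary sysctl convention (keys that themselves contain dots, such as some interface names, are not handled), and the code makes no claim about which menuconfig option is missing.

```python
import re

UNKNOWN_KEY = re.compile(r'error: "([^"]+)" is an unknown key')


def unknown_keys(sysctl_output):
    """Return the keys that sysctl rejected, in the order they were reported."""
    return UNKNOWN_KEY.findall(sysctl_output)


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to the file that would back it under /proc/sys."""
    return root + "/" + key.replace(".", "/")


SAMPLE_OUTPUT = """\
net.netfilter.nf_conntrack_acct = 1
error: "net.netfilter.nf_conntrack_icmp_timeout" is an unknown key
error: "net.netfilter.nf_conntrack_udp_timeout" is an unknown key
net.netfilter.nf_conntrack_max = 262144
"""

for key in unknown_keys(SAMPLE_OUTPUT):
    print(key, "->", sysctl_path(key))
```

Once the relevant conntrack support is present in the kernel, those paths should appear and the same sysctl settings should stop producing errors.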
And how do I know which hardware/devices I can remove?