[kernel-xen] kernel page allocation failure on numa
Glenn Enright
glenn at rimuhosting.com
Thu Apr 2 12:41:14 AEDT 2015
Hi all
We are seeing the kind of error attached below on some of our larger Xen
hosts. The pattern seems to relate to an old issue around NUMA settings
in the kernel, e.g.
http://rhaas.blogspot.co.nz/2014/06/linux-disables-vmzonereclaimmode-by.html
We are using 3.14.33-1. The thing that seems to fix it is the
following sysctl setting:
sysctl -w 'vm.zone_reclaim_mode=1'
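For anyone else hitting this: the runtime change above does not survive a
reboot. Something like the following should make it persistent, assuming a
standard /etc/sysctl.conf setup (nothing here is specific to our hosts):

# Check the current value (we see 0 by default on these hosts)
sysctl vm.zone_reclaim_mode

# Persist the change across reboots
echo 'vm.zone_reclaim_mode = 1' >> /etc/sysctl.conf
sysctl -p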
All our VMs are pinned to specific CPUs, and we are mainly using Intel
CPUs. To my understanding the pinning is best practice, so it makes
sense to have that setting on by default.
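For reference, this is roughly how we check the NUMA layout and the
pinning on a host. The domain name below is just a placeholder and the
output will obviously vary by machine:

# Show NUMA nodes, their CPUs and memory (requires numactl)
numactl --hardware

# Show which physical CPUs each guest vCPU is currently allowed to run on
xl vcpu-list

# Pin a single vCPU of a guest to a physical CPU, e.g. vCPU 0 -> pCPU 4
xl vcpu-pin exampledomu 0 4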
Given this kernel is tuned for Xen servers, I wonder if it is worth
patching it to set that to 1 by default? Especially with so many
multi-processor machines coming out now, this is going to be an
increasingly common setting.
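If someone does want to look at changing the default, I believe the knob is
defined in the mm code; a grep like the one below over the kernel tree should
turn up the definition and the sysctl hook. The file names are just my
reading of the 3.14 source, so treat them as a starting point rather than
gospel:

# Run from the top of the kernel source tree
grep -rn "zone_reclaim_mode" mm/vmscan.c mm/page_alloc.c kernel/sysctl.c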
Does anyone know if this could affect ARM-based machines as well?
Regards, Glenn
http://ri.mu - Startups start here.
Hosting. DNS. Web Programming. Email. Backups. Monitoring.
kswapd0: page allocation failure: order:1, mode:0x204020
CPU: 0 PID: 37 Comm: kswapd0 Tainted: GF 3.14.33-1.el6xen.x86_64 #1
Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 4.6.5 02/08/2012
0000000000000001 ffff88007a203618 ffffffff8161e672 0000000000000010
0000000000204020 ffff88007a2036a8 ffffffff811415db fffffffe00203638
ffff880101b6cb38 0100000000000000 0000000000000041 fffffffe00203601
Call Trace:
<IRQ> [<ffffffff8161e672>] dump_stack+0x49/0x5f
[<ffffffff811415db>] warn_alloc_failed+0xeb/0x150
[<ffffffff811447bb>] __alloc_pages_nodemask+0x74b/0xaa0
[<ffffffff811447bb>] ? __alloc_pages_nodemask+0x74b/0xaa0
[<ffffffff8161e5f4>] ? printk+0x4d/0x4f
[<ffffffff8118fba8>] kmem_getpages+0x78/0x170
[<ffffffff81190703>] fallback_alloc+0x193/0x240
[<ffffffff81190494>] ____cache_alloc_node+0x94/0x170
[<ffffffff8119197b>] kmem_cache_alloc+0x11b/0x1c0
[<ffffffff81541a48>] sk_prot_alloc+0x48/0x150
[<ffffffff81541c60>] sk_clone_lock+0x20/0x320
[<ffffffff8159de66>] inet_csk_clone_lock+0x16/0xd0
[<ffffffff815b8893>] tcp_create_openreq_child+0x23/0x4e0
[<ffffffff815b5111>] tcp_v4_syn_recv_sock+0x41/0x2b0
[<ffffffff815b85c7>] tcp_check_req+0x237/0x4e0
[<ffffffff815b6739>] tcp_v4_do_rcv+0x339/0x4a0
[<ffffffff815b8196>] tcp_v4_rcv+0x696/0x700
[<ffffffff81592a40>] ? ip_rcv+0x3a0/0x3a0
[<ffffffff81592ae8>] ip_local_deliver_finish+0xa8/0x230
[<ffffffff81592cf8>] ip_local_deliver+0x88/0x90
[<ffffffff815922f9>] ip_rcv_finish+0x119/0x380
[<ffffffff81592965>] ip_rcv+0x2c5/0x3a0
[<ffffffff815bbe32>] ? tcp4_gro_receive+0xf2/0x140
[<ffffffff81558a6e>] __netif_receive_skb_core+0x5fe/0x7a0
[<ffffffff81558c37>] __netif_receive_skb+0x27/0x70
[<ffffffff81558ebd>] netif_receive_skb_internal+0x2d/0x90
[<ffffffff81559b98>] napi_gro_receive+0x98/0x100
[<ffffffffa00ee226>] igb_poll+0x6b6/0x1040 [igb]
[<ffffffff810c0429>] ? handle_irq_event_percpu+0xc9/0x200
[<ffffffff81559271>] net_rx_action+0x111/0x2a0
[<ffffffff8107253c>] __do_softirq+0xfc/0x2b0
[<ffffffff810727fd>] irq_exit+0xbd/0xd0
[<ffffffff813b88e5>] xen_evtchn_do_upcall+0x35/0x50
[<ffffffff8162d0be>] xen_do_hypervisor_callback+0x1e/0x30