[kernel-xen] Greetings, Bug, and Broken Link, and Small Kernel Config Change Request

Sat Apr 27 20:57:34 EST 2013

On 27/04/2013 11:21, Steven Haigh wrote:

>> Finally - I believe I have found a bug. The last version of
>> xen-hypervisor where I had PCI/VGA passthrough working was 4.2.1-6.
>>
>> The later versions result in error 22: invalid argument error when
>> starting the VM. So:
>>
>> Works:
>> xen-hypervisor-4.2.1-6.el6.x86_64.rpm
>>
>> Don't work:
>> xen-hypervisor-4.2.1-7.el6.x86_64.rpm
>> xen-hypervisor-4.2.2-1.el6.x86_64.rpm
>>
>> It is this specific package that seems to be responsible (/boot/xen.gz).
>> I am running the rest of the xen packages of the latest 4.2.2-1 version.
>>
>> Any idea what is going wrong here?
>
> I know many people have problems with pci and vga passthru. We see
> people from all distros / versions in ##xen on freenode IRC. It seems to
> be very dependent on hardware - and even small things can cause it to
> break.
>
> What I would suggest is to post the config information to the xen-users
> list - then if no success to the xen-devel list with as much information
> as possible. Maybe even try in the ##xen IRC channel first.

I am not convinced this particular issue is hardware related. An issue 
with identical symptoms has been reported on various Xen versions going 
back at least 5 years, e.g.:
http://old-list-archives.xen.org/xen-users/2007-04/msg00323.html

I have never actually seen an answer regarding what causes it or how to 
debug it further. But the fact that

xen-hypervisor-4.2.1-6.el6.x86_64.rpm works

and

xen-hypervisor-4.2.1-7.el6.x86_64.rpm doesn't

might provide some insight into at least this specific instance, since 
the differences between the two builds can be bisected.

This is completely reproducible on my system. One always works, the 
other always fails.

> I'd also like it if you could file it as a bug at:
> http://xen.crc.id.au/bugs
>
> This way, if we can nail down a fix / working solution, we can easily
> document and then transfer it to a support article later on.

I can do, but I would rather find out how to debug this further first, 
otherwise the bug report is going to have very little "meat" on it. 
Hmm... I might do a diff between the two package versions and see what 
that yields.
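
A minimal sketch of that kind of comparison, assuming the spec files (or
any other text files) from the two package versions have already been
extracted and read into strings. The function name and the sample spec
contents are hypothetical and only illustrate the idea; they are not the
real contents of either package.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return only the added ('+') and removed ('-') lines between two texts."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    )
    body = list(diff)[2:]  # skip the '---' / '+++' file header lines
    # Hunk markers start with '@@', so they are filtered out here.
    return [line for line in body if line.startswith(("+", "-"))]


if __name__ == "__main__":
    # Hypothetical spec fragments, made up for illustration.
    old_spec = "Release: 6\nPatch1: a.patch\n"
    new_spec = "Release: 7\nPatch1: a.patch\nPatch2: b.patch\n"
    for line in changed_lines(old_spec, new_spec):
        print(line)
```

Any patch that appears only on the "+" side would be a first candidate
to revert and retest.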

>> Finally - I would like to request the following change to the kernel
>> configuration, if it wouldn't break things for too many people:
>>
>> < CONFIG_NR_CPUS=8
>> > CONFIG_NR_CPUS=32
>>
>> This would make dom0 able to use all CPUs on dual 8-core/16-thread
>> systems in dom0 if required. If this is deemed undesirable, the nr_cpus
>> kernel boot parameter could be used to limit it (but this doesn't appear
>> to work to increase the number of available cores past what is set in
>> CONFIG_NR_CPUS). This had me scratching my head for a bit figuring out
>> why I could only see 8 CPUs on my 24-thread machine.
>
> There is a misconception of what this actually does. The current setting
> is set to 8 CPUs - which is actually 8 vCPUs for the Dom0. Xen can still
> use the number of CPUs detected by the hypervisor (refer xm info) -
> however you can only bind 8 to Dom0. In reality, this is probably WAAAY
> more than required for most configurations - the best practice is to pin
> between 1-4 for exclusive Dom0 use - most people use 1 or 2.
>
> Once again, what you can see in Dom0 is not related to how many vCPUs
> Xen can utilise.

I am aware of that, but my use-case is somewhat unusual. I use dom0 to 
try to avoid some of the virtualization performance tax, while using 
domU for things like Windows with VGA passthrough for gaming and 
similar. My main motivation is to never have to reboot the machine just 
because I need to use a different OS, while still getting as close to 
full bare metal performance as possible for normal work (doing full 
kernel rebuilds is slow enough as it is).

My reasoning for bumping the CONFIG_NR_CPUS to a higher number is that 
it can be limited at boot-time if required, but cannot be increased past 
the CONFIG_NR_CPUS value without rebuilding the kernel.
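
The reasoning can be sketched as a simple model: the build-time value is
a hard ceiling, and the boot parameter can only lower the count beneath
it. This is a simplification under that assumption, not the kernel's
actual logic, and the function and parameter names are made up for
illustration.

```python
def usable_cpus(config_nr_cpus, detected_cpus, boot_nr_cpus=None):
    """Simplified model of how many CPUs the kernel ends up using.

    config_nr_cpus: build-time CONFIG_NR_CPUS value (hard ceiling)
    detected_cpus:  CPUs presented to the kernel
    boot_nr_cpus:   optional nr_cpus boot parameter (can only lower the count)
    """
    limit = min(config_nr_cpus, detected_cpus)
    if boot_nr_cpus is not None:
        limit = min(limit, boot_nr_cpus)
    return limit


if __name__ == "__main__":
    print(usable_cpus(8, 24))       # current config on a 24-thread machine
    print(usable_cpus(8, 24, 32))   # nr_cpus=32 cannot raise the ceiling
    print(usable_cpus(32, 24))      # requested config: all 24 threads usable
    print(usable_cpus(32, 24, 4))   # and still limitable at boot time
```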

>> < CONFIG_HOTPLUG_PCI_PCIE=y
>> > CONFIG_HOTPLUG_PCI_PCIE=m
>>
>> Unfortunately, there is some buggy hardware out there (including the
>> EVGA SR-2 I'm running Xen on) that suffers from this bug:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=908023
>
> This gives me a permission denied error.

It is the same bug as the Fedora one below.

>> https://bugzilla.redhat.com/show_bug.cgi?id=529153
>>
>> With pciehp built into the kernel, the only workaround I have found that
>> works is as listed in the bug report, but it does involve manually
>> editing init in the initramfs to drop the offending device out of pciehp
>> binding at the earliest possible time.
>>
>> With pciehp built as a module, it could either be blacklisted or handled
>> in a way that ensures that it doesn't just kill the machine as soon as
>> it starts, while at the same time still being available to people who
>> actually need to use it.
>
> Just trying to check here, do you mean this is on the Dom0 or DomU? What
> is the actual effect? The machine just stops? Fails to boot? something
> else?

This affects Dom0.

The effect is that a poorly initialized PCIe device flaps between being 
detected and being unplugged many times per second when the pciehp 
driver tries to bind to it. Every flap triggers an interrupt and logs a 
message, which floods both the console and the message log. The 
interrupt load slows the machine down to the point where it takes hours 
to finish booting (sometimes it crashes, too), and because the messages 
are generated faster than the console can display them, the machine 
never really recovers.

The workaround is to doctor the initramfs to unbind the PCIe device from 
the pciehp driver at the earliest possible opportunity (as soon as sysfs 
is mounted). But having it as a module instead would mean there are no 
ugly manual initramfs mods required, and none of the functionality would 
actually be lost because those who want pciehp can still modprobe it.
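
For illustration, the unbind step amounts to writing the device name
into the driver's "unbind" file in sysfs. The sketch below assumes the
pciehp driver is exposed under bus/pci_express/drivers/pciehp, and the
device name shown is a placeholder; both need checking against the
actual system. In a real initramfs this would be a one-line shell write
in init; Python is used here only to show the mechanism.

```python
import os


def unbind_from_pciehp(device_id, sysfs_root="/sys"):
    """Write device_id to the pciehp driver's unbind file.

    The sysfs layout used here is an assumption; verify it on the
    target system before relying on it.
    """
    unbind_path = os.path.join(
        sysfs_root, "bus", "pci_express", "drivers", "pciehp", "unbind"
    )
    with open(unbind_path, "w") as f:
        f.write(device_id)
    return unbind_path


if __name__ == "__main__":
    # Placeholder device name; the real one is listed in the pciehp
    # driver directory on the affected machine. Requires root.
    unbind_from_pciehp("0000:00:1c.0:pcie04")
```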

> Please lodge a bug at http://xen.crc.id.au/bugs - it'll be easier to
> track through there...

OK, will do. It's more a feature request than a bug - the bug is most 
likely in the BIOS, but motherboard manufacturers are notoriously 
useless at fixing such things.

Gordan