Xen Paravirt Ops
Xen paravirt_ops for upstream Linux kernel
What is paravirt_ops?
paravirt_ops (pv-ops for short) is a piece of Linux kernel infrastructure that allows the kernel to run paravirtualized on a hypervisor. It currently supports VMware's VMI, Rusty's lguest, and most interestingly, Xen.
The infrastructure allows you to compile a single kernel binary which will either boot native on bare hardware (or in HVM mode under Xen), or boot fully paravirtualized in any of the environments you've enabled in the kernel configuration, and more recently also as a Xen dom0 (see below for patches).
It uses various techniques, such as binary patching, to make sure that the performance impact when running on bare hardware is effectively unmeasurable when compared to a non-paravirt_ops kernel.
At present paravirt_ops is available for x86_32, x86_64 and ia64 architectures.
Xen pv_ops (domU) support has been in mainline Linux since 2.6.23, and is the basis of all ongoing Linux/Xen development (the old Xenlinux patches officially ended with 2.6.18.x-xen, though various distros have their own forward-ports of them). Red Hat has decided to base all its future Xen-capable products on the in-kernel Xen support, starting with Fedora 9.
Although the support was merged in 2.6.23, it is probably first usable in 2.6.24. Recent Linux kernels (2.6.27 and newer) are good for domU use. Starting with Fedora 9, every new Fedora release includes a pv_ops-based Xen domU kernel. Ubuntu 10.04 ("Lucid Lynx") and Debian 6.0 ("Squeeze") also include a Xen PV domU kernel, as does Red Hat Enterprise Linux 6.0.
- Features in 2.6.26:
- x86-32 support
- Console (hvc0)
- Blockfront (xvdX)
- Balloon (reversible contraction only)
- paravirtual framebuffer + mouse (pvfb)
- 2.6.26 onwards pv domU is PAE-only (on x86-32)
- Features added in 2.6.27:
- x86-64 support
- Further pvfb enhancements
- Features added in 2.6.28:
- ia64 (itanium) pv_ops xen domU support
- Various bug fixes and cleanups
- Expand Xen blkfront for > 16 xvd devices
- Implement CPU hotplugging
- Add debugfs support
- Features added in 2.6.29:
- performance improvements
- swiotlb (required for dom0 support)
- Features added in 2.6.30:
- Features added in 2.6.31:
- Features added in 2.6.32:
- Features added in 2.6.33:
- save/restore/migration bugfixes. These bugfixes can also be found in the 126.96.36.199 update.
- Features added in 2.6.34:
- Features added in 2.6.35:
- Features added in 2.6.36:
- Xen-SWIOTLB (required for Xen PCI frontend driver and Xen dom0 support).
- Xen PV-on-HVM optimized paravirtualized drivers for fully virtualized (HVM) guest VMs.
- Xen VBD (Virtual Block Device) online dynamic resize support for resizing guest disks (xvd*) on-the-fly.
- Features added in 2.6.37:
- Core Xen dom0 support (no backend drivers yet).
- Xen PCI frontend driver required for Xen PCI Passthru to PV guest/domU.
- Enhanced PV-on-HVM drivers: pirq remappings. Deliver IRQs as Xen event channels for better performance. Requires Xen 4.1 (or newer) hypervisor.
- Features added in 2.6.38:
- Generic Xen dom0 backend bits, required by all xen backend drivers.
- xen-gntdev driver (grant device).
- Features added in 2.6.39:
- xen-netback backend driver to be used in dom0 to serve virtual networks to VMs. Currently unoptimized, optimizations will be added in later kernel versions.
- Many dom0 related bugfixes and improvements.
- PV-on-HVM driver fixes and improvements (xen balloon driver and PV spinlocks support for HVM guests).
- xen-gntalloc driver for userspace grant allocation between Xen domains.
- xen-gntdev support for HVM guests.
- Xen watchdog driver.
- 1-1 identity mapping in P2M. Allows us to automatically figure out if a page is for an I/O hole or not based on the E820. Fixes some device drivers.
- IRQ code rework to support dynamic IRQs so that we're not limited to running 155 VMs.
- Balloon driver has been prepared for memory hotplug and gntalloc.
- save/restore bugfixes.
- Dom0 startup crash fix when certain CONFIG_ options were set.
- Bug fixes in xen-kbdfront, xen-netfront, xen-pcifront, and xen-blkfront.
- Handling of guest events is now round-robin, fixes starvation issue of later guests not having their services served.
- Many cleanups and bugfixes.
- Features queued for 2.6.40:
- xen-blkback backend driver to be used in dom0 to serve virtual block devices (disks) to VMs.
- xen-pciback backend driver to be used in dom0 to support PCI passthru to VMs.
- Work in progress:
- Balloon expansion (using memory hotplug) to grow bigger than the initial domU memory size. Patches were posted to the xen-devel mailing list in July and August 2010.
- upstream Linux compatible Xen dom0 ACPI power management patches.
- To be done:
- Device hotplug
- Other device drivers
- pvscsi backend (dom0), patch has been posted to xen-devel for review.
- pvscsi frontend (domU), patch has been posted to xen-devel for review.
- pvusb backend (dom0), patch has been posted to xen-devel for review.
- pvusb frontend (domU), patch has been posted to xen-devel for review.
Building with domU support
- Get a current kernel. The latest kernel.org kernel is generally a good choice.
- Configure as normal; you can start with your current .config file
- If building a 32-bit kernel, make sure you have CONFIG_X86_PAE enabled (which is set by selecting CONFIG_HIGHMEM64G)
- non-PAE mode doesn't work in 2.6.25, and has been dropped altogether from 2.6.26 and newer kernel versions.
- Enable these core options:
- And Xen pv device support
- CONFIG_HVC_DRIVER and CONFIG_HVC_XEN
- And build as usual
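Before building, the options named above can be checked against a .config file. The following is a minimal shell sketch and not part of the kernel tree: the function name check_config is hypothetical, and CONFIG_PARAVIRT and CONFIG_XEN in the usage comment are assumed core options that this page does not list explicitly.

```shell
#!/bin/sh
# Report which of the given CONFIG_ options are enabled (=y or =m) in a
# kernel .config file. Returns non-zero if any option is missing.
# check_config is a hypothetical helper name used for illustration only.
check_config() {
    config_file=$1
    shift
    missing=0
    for opt in "$@"; do
        if grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "ok      $opt"
        else
            echo "MISSING $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (CONFIG_X86_PAE only matters for 32-bit builds;
# CONFIG_PARAVIRT and CONFIG_XEN are assumptions, see the lead-in):
# check_config .config CONFIG_X86_PAE CONFIG_PARAVIRT CONFIG_XEN \
#     CONFIG_HVC_DRIVER CONFIG_HVC_XEN
```

Anchoring the match on "=y" or "=m" means that lines such as "# CONFIG_X86_PAE is not set" are reported as missing.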
Building with dom0 support
In addition to the config options above you also need to enable:
The kernel build process will build two kernel images: arch/x86/boot/bzImage and vmlinux. They are two forms of the same kernel and are functionally identical. However, only relatively recent versions of the Xen tools stack (post-Xen 3.2) can load bzImage files; with older tools you must use the vmlinux form of the kernel (gzipped, if you prefer). If you've built a modular kernel, the modules are the same either way.
Some aspects of the kernel configuration have changed:
- The console is now /dev/hvc0, so put "console=hvc0" on the kernel command line
- Disk devices are always /dev/xvdX. If you want to dual-boot a system on both Xen and native, it's best to use LVM, LABEL or UUID to refer to your filesystems in your /etc/fstab.
Xen/paravirt_ops has not had wide use or testing, so any testing you do is extremely valuable. If you have an existing Xen configuration, try updating the kernel to a current pv-ops kernel and using it as you usually would; any feedback on how well that works (success or failure) would be very interesting. In particular, information about:
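To find fstab entries that would break when switching between native boot and Xen, a small check like the following may help. It is only a sketch: the function name raw_device_entries is hypothetical, and the pattern assumes the common /dev/sdXN, /dev/hdXN and /dev/xvdXN naming.

```shell
#!/bin/sh
# List fstab entries whose first field is a raw disk device name such as
# /dev/sda1, /dev/hda2 or /dev/xvda1. These names differ between native
# boot and Xen, so they are candidates for conversion to LABEL=, UUID=
# or an LVM path. raw_device_entries is a hypothetical helper name.
raw_device_entries() {
    fstab=${1:-/etc/fstab}
    awk '$1 ~ /^\/dev\/(sd|hd|xvd)[a-z]+[0-9]*$/ { print }' "$fstab"
}

# Example usage:
# raw_device_entries /etc/fstab
```

LVM paths such as /dev/vg00/lv01 and LABEL=/UUID= entries are deliberately not matched, since they stay valid in both environments.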
- performance: better/worse/same?
- bugs: outright crash, or something just not right?
- missing features: what can't you live without?
If you do encounter problems, then getting as much information as possible is very helpful. If the domain crashes very early, before any output appears on the console, then booting with: "earlyprintk=xen" should provide some useful information. Note that "earlyprintk=xen" only works for domU if you have Xen hypervisor built in debug mode! If you are running a debug build of Xen hypervisor (set "debug = y" in Config.mk in the Xen source tree), then you should get crash dumps on the Xen console. You can view those with "xm dmesg". Also, CTRL+O can be used to send SysRq (not really specific to pv_ops, but can be handy for kernel debugging).
Xen/paravirt_ops is very much a work in progress, and there are still feature gaps compared to 2.6.18-xen. Many of these gaps are not a huge amount of work to fill in.
The Xen device model is more or less unchanged in the pv-ops kernel. Converting a driver from the xen-unstable or 2.6.18-xen tree should mostly be a matter of getting it to compile. There have been changes in the Linux device model between 2.6.18 and 2.6.26, so converting a driver will mostly be a matter of forward-porting to the new kernel, rather than any Xen specific issues.
All the mechanism should already be in place to support CPU hotplug; it should just be a matter of making it work.
In principle device hotplug is already implemented and should work. I'm not sure, however, that it's all plumbed through properly, so that hot-adding a device generates the appropriate udev events to cause devices to appear.
Device unplug/module unload
The 2.6.18-xen patches don't really support device unplug (and driver module unload), mainly because of the difficulties in dealing with granted pages. This should be fixed in the pvops kernel. The main thing to implement is to make sure that on driver termination, rather than freeing granted pages back into the kernel heap, they should be added to a list; that list is polled by a kernel thread which periodically tries to ungrant the pages and return them to the kernel heap if successful.
Getting the current development version
All x86 Xen/pv-ops changes queued for upstream (Linus's tree) are in Ingo Molnar's tip.git tree. You can get general information about fetching and using this tree in his README. The x86/xen topic branch contains most of the Xen-specific work, though changes in other branches may be necessary too. The auto-latest branch is the merged product of all the other topic branches.
The current day-to-day development is happening in a git repository. This repo has numerous topic branches to track individual lines of development, and a couple of roll-up branches which contain everything merged together for easy compilation and running.
NOTE! All active git branches require at least Xen 4.0.1. Using an older version (4.0.0 or older) will cause problems such as xend not starting.
Current active branches are:
xen/stable-2.6.32.x - this is the long term maintained branch, tracking upstream kernel.org 2.6.32.x stable updates. This branch has Xen dom0 patches added. This is the recommended branch for most users. Changelog: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=xen/stable-2.6.32.x .
xen/next-2.6.32 - This is a branch for next 2.6.32 version and gets migrated to the stable branch once automatic tests have succeeded. Changelog: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=xen/next-2.6.32 .
devel/next-2.6.39 - this is the current development branch based on Linux 2.6.39. Changelog: http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=shortlog;h=devel/next-2.6.39 .
Old legacy and obsolete branches with known bugs, don't use these:
xen/stable-2.6.31.x - this branch is based on Linux 188.8.131.52 and was the default branch for Xen 4.0.0 development. Changelog: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=xen/stable-2.6.31.x
- xen/master - this is a link to xen/stable-2.6.31.x
Jeremy's view of the status of pv_ops dom0 kernel (June 2009): http://lists.xensource.com/archives/html/xen-devel/2009-06/msg01193.html
Jeremy's roadmap update (August 2009): http://lists.xensource.com/archives/html/xen-devel/2009-08/msg00510.html
Jeremy's status update (September 2009): http://lists.xensource.com/archives/html/xen-devel/2009-09/msg00806.html
Presentation by Jeremy about pv_ops dom0 kernel at Xen Summit Asia 2009 (November 2009): http://www.xen.org/files/xensummit_intel09/xensummit-asia-2009-talk.pdf
Short update (03 December 2009): http://lists.xensource.com/archives/html/xen-devel/2009-12/msg00190.html
Status update (22 December 2009): http://lists.xensource.com/archives/html/xen-devel/2009-12/msg01127.html
Status update (03 March 2010): http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00162.html
Status update from Xen Summit 2010 NA (April 2010): http://www.slideshare.net/xen_com_mgr/xen-summit-amdpvopsupdate4
Downloading the git tree
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git linux-2.6-xen
$ cd linux-2.6-xen
That will automatically check out the default 'xen/master' branch. Note that you need at least 256 MB of free memory, otherwise the "git clone" will fail.
Changing the branch: You most probably want to use the "xen/stable-2.6.32.x" branch, so do this:
$ cd linux-2.6-xen
$ git reset --hard
$ git checkout -b xen/stable-2.6.32.x origin/xen/stable-2.6.32.x
Branch xen/stable-2.6.32.x set up to track remote branch refs/remotes/origin/xen/stable-2.6.32.x.
Switched to a new branch "xen/stable-2.6.32.x"
$ git pull
$ git log | less
Later when you want to update the tree use:
$ cd linux-2.6-xen
$ make clean
$ git pull
Configuring the kernel:
NOTE 0: Make sure you have the correct CPU type (Processor Family) set in the kernel configuration. The Xen dom0 options won't show up at all if the selected CPU is too old, meaning one that doesn't support PAE (the Pentium Pro was the first CPU to have PAE).
NOTE 1: If you're building a 32-bit kernel, you first need to enable PAE support, since Xen only supports 32-bit PAE kernels nowadays. For 32-bit builds the Xen kernel build options won't show up at all until you've enabled PAE (Processor type and features -> High Memory Support (64GB) -> PAE (Physical Address Extension) Support). PAE is not needed for 64-bit kernels.
NOTE 2: If building a 32-bit PAE dom0 kernel, make sure you have CONFIG_HIGHPTE=n. There's a known race/bug that causes dom0 kernel crashes if you have CONFIG_HIGHPTE=y.
NOTE 3: Xen dom0 support depends on ACPI support. Make sure you enable ACPI support or you won't see the dom0 options at all.
Then enable the Xen Dom0 option:
Symbol: XEN_DOM0 [=y]
Prompt: Enable Xen privileged domain support
Defined at arch/x86/xen/Kconfig:41
Depends on: PARAVIRT_GUEST && XEN && X86_IO_APIC && ACPI
Location:
  -> Processor type and features
    -> Paravirtualized guest support (PARAVIRT_GUEST [=y])
      -> Xen guest support (XEN [=y])
For reference, the Xen config options of a working dom0 (feel free to edit and explain any options that you use below, to help others):
If you're using RHEL5 or CentOS5 as a dom0 (i.e. you have an old udev version), make sure you enable the following options as well:
For more current Xen related config options check the example .config files from the troubleshooting section, and check the 2.6.18-to-2.6.31-and-higher wiki page.
The XENFS and XEN_COMPAT_XENFS config options are needed for /proc/xen support. If CONFIG_XEN_DEV_EVTCHN is compiled as a module, make sure to load the xen-evtchn.ko module or xend will not start.
Xen 3.4.2 and newer mount /proc/xen automatically when /etc/init.d/xend is started, so those systems need no xenfs entry in /etc/fstab. On older versions you may need to add this line to /etc/fstab:
none /proc/xen xenfs defaults 0 0
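Adding the entry can be made repeatable with a small helper. The sketch below is an illustration only: the function name ensure_xenfs_entry is hypothetical, and it assumes the fstab line shown above is the one you want.

```shell
#!/bin/sh
# Append the xenfs mount line to an fstab file unless a non-comment entry
# mounting xenfs on /proc/xen is already present. Safe to run repeatedly.
# ensure_xenfs_entry is a hypothetical helper name.
ensure_xenfs_entry() {
    fstab=${1:-/etc/fstab}
    if awk '$1 !~ /^#/ && $2 == "/proc/xen" && $3 == "xenfs" { found = 1 }
            END { exit !found }' "$fstab"; then
        return 0
    fi
    echo 'none /proc/xen xenfs defaults 0 0' >> "$fstab"
}

# Example usage (as root):
# ensure_xenfs_entry /etc/fstab
# mount /proc/xen
```

Checking the mount point and filesystem type fields, rather than grepping for the whole line, tolerates entries that use different whitespace or options.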
Working example grub.conf with VGA text console:
title Xen 4.0, dom0 Linux kernel 184.108.40.206
    root (hd0,0)
    kernel /boot/xen-4.0.gz dom0_mem=512M
    module /boot/vmlinuz-220.127.116.11 root=/dev/sda1 ro nomodeset
    module /boot/initrd.img-18.104.22.168
NOTE! You need to give the correct root= parameter, i.e. replace /dev/sda1 with your actual root device. Check your earlier grub kernel entries for the correct value. You also need the "nomodeset" option for the time being.
Working example grub.conf with serial console output (good for debugging since you can easily log the full kernel boot messages even if it crashes):
title pv_ops dom0 (22.214.171.124) with serial console
    root (hd0,0)
    kernel /xen-4.0.gz dom0_mem=1024M loglvl=all guest_loglvl=all sync_console console_to_ring com1=19200,8n1 console=com1
    module /vmlinuz-126.96.36.199 ro root=/dev/vg00/lv01 console=hvc0 earlyprintk=xen nomodeset
    module /initrd-188.8.131.52.img
For more information about using a serial console with Xen please check the XenSerialConsole wiki page.
Xen requirements for using pv_ops dom0 kernel
Xen hypervisor and tools need to have support for pv_ops dom0 kernels. In general it means:
- The ability for the Xen hypervisor to load and boot bzImage pv_ops dom0 kernel.
- The ability for the Xen tools to use the sysfs memory ballooning support provided by pv_ops dom0 kernel.
- Current recommended 2.6.32.x version of pvops dom0 kernel requires new IOAPIC setup hypercall from Xen hypervisor.
This means you need to have at least Xen version 4.0.1.
Older Xen versions are known to be problematic. For example, the Xen 4.0.0 libraries have problems with recent 2.6.32.x kernels, making xend fail to start due to evtchn/gntdev device node creation issues. Xen 3.4.2 or older won't work at all, since those hypervisor versions lack the required IOAPIC setup hypercall and boot will fail with IRQ-related issues.
It's recommended to run the latest Xen 4.0.x version, at least Xen 4.0.1.
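The version requirement can be checked from a script with a dotted-version comparison. This is a minimal sketch under stated assumptions: the function name version_ge is hypothetical, it handles purely numeric dotted versions only (a suffix such as "-rc1" would need stripping first), and the commented usage assumes the xen_major, xen_minor and xen_extra fields printed by "xm info".

```shell
#!/bin/sh
# version_ge HAVE WANT: succeed when dotted version HAVE >= WANT.
# Components are compared numerically, left to right; missing components
# count as 0 (so 4.1 >= 4.0.1, and 4.0 < 4.0.1).
# version_ge is a hypothetical helper name; POSIX sh has no local
# variables, so the names below are global.
version_ge() {
    have=$1
    want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        h=${have%%.*}
        w=${want%%.*}
        h=${h:-0}
        w=${w:-0}
        if [ "$h" -gt "$w" ]; then
            return 0
        elif [ "$h" -lt "$w" ]; then
            return 1
        fi
        # Drop the leading component, or empty the string if it was the last.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
    done
    return 0
}

# Example usage in dom0 (field names assumed from "xm info" output):
# major=$(xm info | awk '/^xen_major/ { print $3 }')
# minor=$(xm info | awk '/^xen_minor/ { print $3 }')
# extra=$(xm info | awk '/^xen_extra/ { print $3 }')
# version_ge "$major.$minor$extra" 4.0.1 || echo "Xen too old for pv_ops dom0"
```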
Linux distribution support for pv_ops dom0 kernels
Fedora: Fedora 14 includes Xen 4.0.1 hypervisor and is able to run pvops dom0 kernel out-of-the-box. Fedora 14 does not ship with a dom0 capable kernel in the default distribution, but xendom0 kernel rpms are available from developer repositories. Fedora 13 and earlier versions ship with Xen 3.4.x and are not recommended for pvops dom0 usage, unless you update the Xen hypervisor to 4.0.x version. See this tutorial for more help: Fedora13Xen4Tutorial , and also check the Fedora dom0 wiki page: http://fedoraproject.org/wiki/Features/XenPvopsDom0 .
Debian: Debian 6.0 ("Squeeze") includes Xen 4.0.x hypervisor, and also dom0 capable kernel based on the pvops tree.
Other distributions: When using pvops dom0 kernel 2.6.32 or newer you need to have Xen hypervisor 4.0.1 or newer version.
Which kernel image to boot as dom0 kernel from your custom built kernel source tree?
If you have Xen hypervisor with bzImage dom0 kernel support, ie. xen 3.4 or later version, use "linux-2.6-xen/arch/x86/boot/bzImage" as your dom0 kernel (exactly the same kernel image you use for baremetal Linux).
If you have Xen hypervisor without bzImage dom0 kernel support, ie. any official Xen release up to at least Xen 3.3.1, or most of the Xen versions shipped with Linux distributions (before 2009-03), use "linux-2.6-xen/vmlinux" as your dom0 kernel. (Note that "vmlinux" is huge, so you can also gzip it, if you want to make it a bit smaller).
Also read the previous paragraphs for other requirements.
Are there other Xen dom0 kernels available?
Yes. See this wiki page for more information: http://wiki.xensource.com/xenwiki/XenDom0Kernels
Also check XenKernelFeatures wiki page for more information about available features in different Xen enabled kernels.
Xend (and/or xenstored) does not start when using pv_ops dom0 kernel?
You need to have the Xen event channel (evtchn) driver included in your dom0 kernel for xend to be able to start. If Xen event channel support is compiled as a module, use "lsmod" to check that you have the xen-evtchn (or xen_evtchn) driver module loaded. If not, use "modprobe" to load it. If Xen event channel support is statically built into the dom0 kernel then you don't need to load any modules.
You also need xen-gntdev driver loaded.
In December 2009 pv_ops dom0 kernel modules were renamed to have a "xen-" prefix in them, ie. "evtchn.ko" became "xen-evtchn.ko", and "gntdev.ko" became "xen-gntdev.ko".
This makes Xen 3.4.x xend fail to start, because it tries to load "evtchn.ko", which no longer exists. You need to load "xen-evtchn.ko" and then start xend. Fedora 12 xen-3.4.2-2 rpms have this problem fixed. If your xend does not automatically load the evtchn or xen-evtchn driver module, please load it manually with modprobe.
Also make sure you have xenfs mounted on "/proc/xen"; that's needed as well. Do you have files under the "/proc/xen/" directory? You should have for example:
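The module check can be scripted so that it works with either spelling of the module name. The following is a sketch only: the function name module_loaded is hypothetical, and it relies on the kernel listing loaded modules with underscores in /proc/modules.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if module NAME appears in a
# /proc/modules-style file. Hyphens in NAME are normalized to
# underscores, so xen-evtchn and xen_evtchn are treated the same.
# module_loaded is a hypothetical helper name.
module_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    modules_file=${2:-/proc/modules}
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

# Example usage (as root, before starting xend):
# module_loaded xen-evtchn || modprobe xen-evtchn
# module_loaded xen-gntdev || modprobe xen-gntdev
```

If the drivers are built into the kernel they will not appear in /proc/modules, so a failed check does not by itself mean the driver is unavailable.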
# ls /proc/xen/
capabilities privcmd xenbus xsd_kva xsd_port
# cat /proc/xen/capabilities
control_d
"control_d" value in "/proc/xen/capabilities" means you're in Xen dom0 ("control domain").
Do you have files under "/dev/xen/" ? You should have there at least the following device nodes:
# ls -la /dev/xen
total 0
drwxr-xr-x 2 root root 80 Jun 19 19:17 .
drwxr-xr-x 20 root root 4580 Jun 19 19:33 ..
crw-rw---- 1 root root 10, 58 Jun 19 19:17 evtchn
crw-rw---- 1 root root 10, 57 Jun 19 19:17 gntdev
udev should create those device nodes automatically when the xen-evtchn and xen-gntdev driver modules are loaded. If the nodes are missing, you can use "mknod" to create them manually.
You can see the correct minor numbers for the device nodes in "/proc/misc" (misc devices share character major number 10):
# cat /proc/misc
57 xen/gntdev
58 xen/evtchn
Note how they match the values from the previous "ls -la /dev/xen". If the major/minor numbers don't match, xenstored and xend won't be able to start or function. The minor numbers for these device nodes are assigned dynamically nowadays, so you need to verify they're properly set up for your system.
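When the nodes have to be created by hand, the matching mknod commands can be derived from /proc/misc instead of typed from memory. This is a minimal sketch: the function name xen_mknod_commands is hypothetical, and it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print mknod commands for the Xen device nodes, taking the minor numbers
# from a /proc/misc-style file. Misc devices use character major 10.
# xen_mknod_commands is a hypothetical helper name; nothing is created,
# the commands are only printed.
xen_mknod_commands() {
    misc_file=${1:-/proc/misc}
    awk '$2 ~ /^xen\// { printf "mknod /dev/%s c 10 %s\n", $2, $1 }' "$misc_file"
}

# Example usage:
# mkdir -p /dev/xen
# xen_mknod_commands            # review the output, then run it as root
```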
Troubleshooting, what to do if the custom built pv_ops dom0 kernel doesn't work/boot?
You could try these example .config files:
64bit x86_64 (branch: xen/stable-2.6.32.x):
xen/stable-2.6.31.x (184.108.40.206): http://pasik.reaktio.net/xen/kernel-config/config-220.127.116.11-pvops-dom0-xen-master-x86_32
xen/stable-2.6.32.x (18.104.22.168): http://pasik.reaktio.net/xen/kernel-config/config-22.214.171.124-pvops-dom0-xen-stable-x86_32
xen/stable-2.6.32.x (126.96.36.199): http://pasik.reaktio.net/xen/kernel-config/config-188.8.131.52-pvops-dom0-xen-stable-x86_32
Those kernel configs are based on Fedora 11/12 kernel configuration, with some modifications. They've been tested to work on multiple systems. Note that these .config files have various debugging options enabled which will decrease performance so don't use these .config files for performance testing!
Example how to compile/build the pv_ops dom0 kernel:
cd linux-2.6-xen
make clean
cp -a .config .config-old
wget -O .config
make oldconfig
make menuconfig   # if you need to change something
make bzImage
make modules
make modules_install
# In the following lines replace "version" with the actual kernel version you're compiling.
cp -a .config /boot/config-version
cp -a System.map /boot/System.map-version
cp -a arch/x86/boot/bzImage /boot/vmlinuz-version
# Then generate an initrd/initramfs image for your dom0 kernel, example for Fedora/RHEL/CentOS:
mkinitrd -f /boot/initrd-version.img version
Then edit /boot/grub/grub.conf and make sure you have a correct grub entry to boot the Xen hypervisor with the dom0 kernel (examples above).
In grub.conf it's a good idea to enable all the logging options for Xen ("loglvl=all guest_loglvl=all sync_console console_to_ring") and for the pv_ops dom0 kernel ("earlyprintk=xen"), and to set up a serial console so you can see and capture the full boot messages from Xen and from the dom0 kernel in case the system doesn't start up properly or crashes.
So for debugging and testing you should be using a computer with a built-in serial port on the motherboard (com1), or add a PCI serial card if your motherboard lacks a built-in serial port. You can also use SOL (Serial Over Lan) for logging the Xen hypervisor and dom0 kernel messages. Most server-class machines have SOL available through their management processor or IPMI. SOL device looks like a normal serial port for the OS/Xen, but enables you to connect to the serial console over a network, through the management processor.
If you want to read more about Xen and using a serial console, please check the XenSerialConsole wiki page.
Are there more debugging options I could enable to troubleshoot problems with Xen and/or dom0 kernel?
Yes, try these options in grub.conf:
title pv_ops dom0 debug (184.108.40.206) with serial console
    root (hd0,0)
    kernel /xen-4.0.gz dom0_mem=1024M loglvl=all guest_loglvl=all sync_console console_to_ring com1=115200,8n1 console=com1 lapic=debug apic_verbosity=debug apic=debug iommu=off
    module /vmlinuz-220.127.116.11 ro root=/dev/vg00/lv01 console=hvc0 earlyprintk=xen nomodeset initcall_debug debug loglevel=10
    module /initrd-18.104.22.168.img
Is there more information available how to debug and troubleshoot using a serial console?
Please see the XenSerialConsole wiki page.
I have graphics card (DRM/TTM/KMS/Xorg) related problems with the pv_ops dom0 kernel
Please see the XenPVOPSDRM wiki page.
Dom0 console gets all weird and corrupted in the end of the boot process
Is the last line on the console something like "Setting console screen modes and fonts"? If so, you might want to disable the "console-screen.sh" service from starting automatically, which should work around the problem.
Is there more help about troubleshooting Xen and/or pvops dom0 problems?
Yes, please see the XenCommonProblems wiki page.
Is there help available for migrating from linux-2.6.18-xen dom0 kernel to pvops dom0?
Yes, please see the 2.6.18-to-2.6.31-and-higher wiki page.
Contact and bug reports
Before submitting bug reports to the xen-devel mailing list, please read this wiki page: XenParavirtOpsHelp. It lists all the information you NEED to provide so the problem can be diagnosed and debugged!
Please mail questions/answers/patches/etc to the Xen-devel mailing list.
Suggestion: This page should be merged with: Kernel.org Linux on Xen Alternatively, one of the pages could be used for the high level overview, theory, and quick status and the other could be used for the "howto"-style using it.