blissfulness be with me.
Thursday, 22 September 2016
Xen and pfSense
On the virtualisation host, two networks would be set up:
A 'front' network, bridged with the physical adaptor and directly accessible to the internet.
The virtualisation host would have an IPv6 address on this network.
The pfSense instance would be connected to this network and would have the public IPv4 address and an IPv6 address.
A 'back' network, not bridged to any physical adaptor. Traffic between this network and the wider internet would pass through the pfSense router only. This network would use a private IPv4 subnet (e.g. 192.168.0.0/24).
The virtualisation host would have an IPv4 and an IPv6 address on this network.
All VMs would also have an IPv4 and an IPv6 address on this network.
Choosing Xen for virtualisation
I considered VMware ESXi, Citrix XenServer and direct use of Xen as options for the virtualisation host. I ruled out ESXi as the Realtek ethernet controller on the server rented from Hetzner is not supported on recent versions of ESXi, which would complicate installation and upgrades on this server.
I tried installing XenServer using QEMU but the host failed to boot after the installation had completed. The installation process was adapted from instructions for installing Windows using QEMU on a Hetzner server (available here). I considered trying to debug the boot issue but decided to first explore direct use of Xen.
The installation of Xen was remarkably simple: I installed Ubuntu Server 14.04 using Hetzner's Robot interface and logged into the newly installed system via ssh. From here, it was simply a case of installing the Xen hypervisor packages as root.
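A minimal sketch of this step, assuming the stock Ubuntu 14.04 metapackage xen-hypervisor-amd64 (the package name and the RUN dry-run helper are assumptions for illustration). The commands are only printed by default; run the script with RUN= (empty) as root to execute them:

```shell
# Dry-run helper (an assumption for illustration): prints each command by default.
# Invoke the script with RUN= (empty) as root to actually execute the commands.
RUN="${RUN-echo}"

install_xen() {
    $RUN apt-get update
    # Assumed package: the Xen hypervisor metapackage for 64-bit hosts.
    $RUN apt-get install -y xen-hypervisor-amd64
}

install_xen
```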
Grub is configured to boot Xen by default when it is installed and so a reboot dropped me into the same Ubuntu Server instance, now running as dom0 under the Xen hypervisor. To confirm that Xen worked, I listed the running domains.
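Assuming the xl toolstack is in use (older installations use xm instead), the check could be wrapped as follows. On a freshly booted host with no VMs defined, only Domain-0 should be listed:

```shell
# Lists running Xen domains; assumes the xl toolstack is installed.
list_domains() {
    if command -v xl >/dev/null 2>&1; then
        xl list
    else
        echo "xl not found: is this host running Xen?"
    fi
}

list_domains
```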
However, the dom0 instance was using all 16GB of available RAM. Xen does seem to support dynamically reducing this as VMs are created (which I didn't try) but research suggested it was much better to allocate a minimal amount of memory to dom0 and leave the rest unallocated for use by VMs. This was accomplished by changing the Grub configuration to pass additional options to Xen. The file /etc/default/grub was edited and a line was added to pass the dom0_mem option to Xen, allocating 1GB of RAM to dom0.
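A sketch of the kind of line involved, assuming Xen's dom0_mem option and the GRUB_CMDLINE_XEN_DEFAULT variable read by Ubuntu's Xen Grub script. The max: cap is an extra assumption:

```shell
# /etc/default/grub (fragment): options passed to the Xen hypervisor at boot.
# dom0_mem=1024M gives dom0 1GB; max:1024M (assumed addition) caps it there.
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=1024M,max:1024M"
```

Changes to /etc/default/grub only take effect once update-grub has regenerated the boot configuration.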
After regenerating the Grub configuration with update-grub and rebooting once more, the system was ready to go!
I'm going to gloss over the network configuration as I don't have clear notes on how it was achieved and it involved a bit of trial and error. In the end the 'front' bridge, br0, was defined by an entry in /etc/network/interfaces, with the 'back' network on a separate bridge, vnet0.
Bits 64-79 of the IPv6 address for vnet0 differed from those for br0 so that the two networks were on different subnets.
I used LVM to manage disk space on the virtualisation host. The disk for each VM would be an LV on the host system allowing the use of LVM features like snapshots. A further post will probably explain my backup strategy which makes use of LVM snapshots and zbackup.
The directory /home/store was created for storing ISO images and the pfSense installation CD was downloaded to this location. A 4GB LV named 'pf' was created in the volume group vg0. virt-install was then used to begin installation of pfSense in a new VM with two vCPUs and 512MB of RAM.
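A minimal sketch of these two steps, assuming virt-install talking to libvirt's Xen driver. The ISO file name, the VM name 'pf', the xen:/// connection URI and the use of vnet0 as the 'back' bridge are assumptions for illustration. The commands are only printed unless the script is run with RUN= (empty) as root:

```shell
RUN="${RUN-echo}"   # dry-run helper: prints commands unless RUN is set to empty

ISO="/home/store/pfSense.iso"   # hypothetical file name for the downloaded install CD

create_pf_vm() {
    # 4GB logical volume named 'pf' in volume group vg0.
    $RUN lvcreate -L 4G -n pf vg0
    # HVM guest with two vCPUs and 512MB of RAM, installed from the CD over VNC.
    $RUN virt-install --connect xen:/// --name pf --hvm \
        --vcpus 2 --ram 512 \
        --disk path=/dev/vg0/pf \
        --cdrom "$ISO" \
        --network bridge=br0 --network bridge=vnet0 \
        --graphics vnc --noautoconsole
}

create_pf_vm
```

The first --network option attaches the VM to the 'front' bridge and the second to the 'back' network, so pfSense sees one interface on each.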
As X11 and a VNC viewer were not installed on the virtualisation host, some other way of connecting to the running VNC server was needed. On my desktop I ran:
vncviewer -via <ipv6 address of br0 on virtualisation host> localhost:0
This gave me a VNC connection which I could use to carry out the pfSense installation from the install CD.
Once pfSense was installed and initial configuration had been performed, I created a second VM and installed Windows Server 2012 R2 for use as my domain controller. This VM connected to the network correctly and could ping remote hosts on the internet, with traffic passing through the pfSense router. However, it was not possible to run Windows Update: when checking for updates, error 80072EFE kept appearing, indicating an inability to connect.
After a few hours of debugging and googling I hadn't really found anything useful. I had been assuming that the error was related to the use of Windows Server 2012 R2 on Xen. When I branched out and started looking into networking issues with pfSense 2.2 on Xen, things started to fall into place. A few forum and blog posts reported networking issues and suggested fixes, such as https://forum.pfsense.org/index.php?topic=85797.0. It seems that there is a bug somewhere which causes the TX checksum to be calculated incorrectly on pfSense 2.2/FreeBSD 10.1 when the checksum calculation is offloaded to the networking hardware and the 'xn' paravirtualised network driver is in use. The simplest comprehensive solution to this problem was to disable hardware checksum offloading everywhere that could affect this pfSense instance, both on the pfSense instance itself and on the virtualisation host.
Within pfSense, disabling hardware checksum offloading was simple. Navigating to 'System->Advanced' and looking at the 'Networking' tab, there is a checkbox which can be ticked to disable hardware checksum offloading. Ticking this checkbox alone, however, was not sufficient to fix the networking issues I was seeing.
On the virtualisation host, ethtool was used to change the configuration for each networking interface, both physical and virtual. For each of 'eth0' (the physical adaptor), 'br0', 'vnet0-nic', 'vnet0', 'vif1.0', 'vif1.0-emu', 'vif1.1' and 'vif1.1-emu', the following command was executed:
ethtool --offload <if name> tx off
The 'vifx.y' interface names will need to be changed to match those assigned to the pfSense instance if this is performed on a different virtualisation host. Also, it may be overkill to do this for all the interfaces and network bridges I listed and you may get away with changing this setting for a more limited list of interfaces. However, after running these commands my networking issues were solved: I could successfully run Windows Update on my Domain Controller.
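The per-interface commands can be wrapped in a small loop. This is a sketch using the interface names listed above; the ETHTOOL override is a hypothetical dry-run switch, so the commands are only printed unless the script is run with ETHTOOL=ethtool as root:

```shell
# Interfaces from the list above; the vifX.Y names depend on the pfSense domain ID.
IFACES="eth0 br0 vnet0-nic vnet0 vif1.0 vif1.0-emu vif1.1 vif1.1-emu"

# Dry-run by default: prints each command instead of executing it.
ETHTOOL="${ETHTOOL:-echo ethtool}"

disable_tx_offload() {
    for ifname in "$@"; do
        $ETHTOOL --offload "$ifname" tx off
    done
}

# Word splitting of $IFACES is intentional here.
disable_tx_offload $IFACES
```

Settings applied with ethtool are not persistent, so the loop needs rerunning after a reboot of the virtualisation host.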
Running a pfSense instance on top of the Xen hypervisor seems to work pretty well now that the networking issues are resolved. There was definitely a lot to learn as I hadn't used Xen much before, but the setup was achievable within a week, with a couple of hours each day spent researching, configuring and debugging. I now have an offsite virtual network within which I can deploy both public-facing and private services.