Friday, 16 November 2012

Android and Cisco IPsec VPN


Background

Android has a bad reputation for not being enterprise friendly. One example of a missing enterprise feature is support for Cisco IPsec based VPN solutions. Android natively supports some VPNs but not the Cisco ones, which are after all fairly common. There is a feature request for this, with over 1000 comments and nearly 6000 people having starred the issue. Even though the ticket was created back in 2009 and is obviously of great importance to a lot of people, it remains unresolved. We decided to do some research to see how difficult it really would be to implement this feature.
Before getting started, it is worth noting that a solution of sorts already exists. get-a-robot-vpnc is available from Android Market (under the name VPN Connections) and is supposed to work simply by installing it on a rooted phone (with tun support), specifying your connection settings, and pressing “Connect”. For me it didn’t work quite that easily, though: I only got it working after a couple of hours of work, which included replacing the ifconfig and route commands and tweaking the vpnc-script file. For anyone lucky enough to get get-a-robot-vpnc working out of the box, it is a relatively good workaround for the lack of platform support for Cisco IPsec VPN; once running, it works quite nicely.
Behind the scenes get-a-robot-vpnc uses vpnc, probably the most common application for connecting to Cisco IPsec VPNs among Linux users. vpnc was built specifically to support Cisco IPsec VPN (though it currently supports some others as well) and contains only the bare minimum needed to get that job done. vpnc runs completely in userspace, making use of the tun module.
The fact that get-a-robot-vpnc requires a rooted phone and the tun module (which is not available on many, if not most, Android phones and needs to be obtained separately) makes it an unappealing alternative for regular enterprise users. It also does not integrate well with the existing VPN functionality of the phone: instead of managing the connections through Settings -> Wireless & networks -> VPN settings, as with the natively supported VPNs, you have to use a separate UI. You also don’t get the connection uptime counter and the connection status change notifications that you get with the other VPNs. Because the connection is managed separately, you could also end up trying to use multiple VPNs simultaneously, though that is probably quite an uncommon issue. Finally, as mentioned earlier, vpnc runs in userspace, which means there is a certain amount of overhead involved in passing packets between the kernel and vpnc.
Some of the VPNs Android supports out of the box use a VPN application named Racoon, which is part of IPsec-Tools. Racoon is a somewhat complex piece of software that can work as a client as well as a server and supports several different VPN solutions. Unlike vpnc, Racoon uses the Linux kernel’s IPsec support.
One interesting thing about Racoon is that it also supports Cisco IPsec VPN. This is apparently not a widely known fact: I could find only one forum post where anybody even claimed to have successfully used Racoon against a Cisco VPN Concentrator, and I couldn’t find any official documentation except one chart saying that Cisco is supported. There is fairly decent documentation on how to use Racoon as a server for official Cisco VPN clients, but that is not useful when trying to make Racoon work as the client itself.
Because Android is already using Racoon and Racoon supports Cisco, making Android support Cisco does not sound especially difficult. As it turned out, there were some complications, but I did eventually manage to make everything work as desired, and I’m now running an Android phone that has Cisco IPsec VPN nicely integrated with the normal VPN framework. Because this was done as a proof of concept, the solution isn’t completely finalized and it only supports the exact scenario I happened to need myself. Here are some pictures showing the feature in action.


Racoon and Cisco VPN Concentrator

The first thing I needed to do was to set up Racoon so that I could connect to a Cisco IPsec VPN Concentrator from a normal Linux laptop. Once I got that working it should have been, I thought, straightforward to make the same thing work in Android.
So I needed to come up with a configuration file for Racoon that tells it to connect to the server and perform the appropriate steps to set up the VPN. With vpnc this is trivial because the configuration file consists of little more than the server name, group name, group secret and Xauth username. With Racoon the configuration file syntax is a lot more elaborate, and Racoon in fact uses yacc and lex to implement the parser for the configuration file.
A forum post by Matiass Nissler back from 2006 did provide a starting point, but unfortunately his configuration file did not work for me, and he also didn’t provide the shell scripts referenced by his configuration. This meant that I had to take the same route Matiass apparently had: read through a lot of source code and analyze packet dumps until everything finally worked.
After making both Racoon and vpnc dump their packets to a file so that I could compare them (the first few packets could have been read directly using Wireshark, but the packets soon got encrypted), I was able to start analyzing what Racoon did differently compared to vpnc. The first discovery was that our server required different encryption and hash algorithms than the server Matiass was using; changing “encryption_algorithm 3des;” to “encryption_algorithm 3des, aes;” and “hash_algorithm md5;” to “hash_algorithm md5, sha1;” took care of that issue.
Next the Xauth phase was failing. This turned out to be a missing implementation on Racoon’s part: Racoon only supported authentication using a password. When a SecurID token is used, the server requests the client to provide XAUTH_PASSCODE instead of XAUTH_USER_PASSWORD. The actual logic is exactly the same, so this was fixed simply by making the isakmp_xauth.c switch statements process XAUTH_PASSCODE the same way they already did XAUTH_USER_PASSWORD.
Once the Xauth phase was done, Racoon expected to get a notification from the kernel before moving on. This notification (SADB_ACQUIRE) was not being received, so the handshake never finished. After quite a bit of investigation I realized that just commenting out the configuration options that specified the up and down shell scripts (which I didn’t have) probably wasn’t such a great idea. After a bit of searching I noticed that Racoon came with sample scripts for the roadwarrior setup, which turned out to work for Cisco as well, fixing this issue.
Now the connection failed because Racoon for some reason lost its current state when the shell script added a new network interface matching the IP the server told the client to use. I was using Racoon version 0.7.3 at the time because that is what Android uses as well. The bug wasn’t obvious enough for me to pinpoint it in 15 minutes, so I instead just tried the latest version, 0.8.0, which seemed to fix this particular problem, and I could move on.
Next up, packets were not being delivered between Racoon and the Cisco VPN. The problem here turned out to be the fact that NAT traversal had to be forced. Changing “nat_traversal on;” to “nat_traversal force;” fixed the packet routing.
Finally, the ISAKMP Quick handshake that was now being performed did not succeed. This was caused by an incorrect Diffie-Hellman group. Adding an explicit “pfs_group modp1024;” to the end of the “sainfo anonymous” group took care of this, and I was finally able to establish a VPN connection to the Cisco IPsec VPN Concentrator using Racoon.
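The in-place substitutions mentioned above can be applied to an existing configuration file with sed. This is only a sketch: apply_cisco_tweaks is a hypothetical helper name, and it assumes the original directives appear in the file exactly as quoted in the text. The pfs_group line is left out because it has to be added by hand inside the “sainfo anonymous” block.

```shell
# apply_cisco_tweaks CONFIG_FILE
# Hypothetical helper (not part of ipsec-tools): prints CONFIG_FILE on stdout
# with the three substitutions described in the text applied.
apply_cisco_tweaks() {
    sed -e 's/encryption_algorithm 3des;/encryption_algorithm 3des, aes;/' \
        -e 's/hash_algorithm md5;/hash_algorithm md5, sha1;/' \
        -e 's/nat_traversal on;/nat_traversal force;/' \
        "$1"
}
```

Because the function writes to stdout, the original file stays untouched; redirect the output to a new file and review the differences before using it.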
Here is a zip containing the configuration file and scripts I used. The keyid.txt and psk_nixu_internal_wlan.txt files are not included, but you can easily create your own. keyid.txt must contain your group id (“IPSec ID” in vpnc terms), and the pre-shared key file must contain the VPN server IP address and the group secret (“IPSec secret”) separated by a space (e.g. 192.168.3.1 TheGroupSecret). Make sure that keyid.txt contains nothing but the group id: no linefeed, no terminating null character. Also make sure that the pre-shared key file is owned by root and has its mode set to 0400. These are the commands I used to start Racoon and to tell it to connect (the second one is run in another terminal because -F makes Racoon run in the foreground):
sudo src/racoon/racoon -f ./nixu_internal_wlan.conf -C -F
sudo src/racoon/racoonctl -s /var/run/racoon.sock vpn-connect -u rikonen 192.168.3.1
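The two key files described above are easy to get subtly wrong (a stray linefeed, loose permissions), so here is a minimal sketch for creating them. write_racoon_secrets is a hypothetical helper and psk.txt is a placeholder name; use whatever file names your configuration refers to. Changing the owner to root is left as a separate step because it needs root privileges.

```shell
# write_racoon_secrets DIR GROUP_ID SERVER_IP GROUP_SECRET
# Hypothetical helper; the psk.txt file name is a placeholder.
write_racoon_secrets() {
    dir=$1 group_id=$2 server=$3 secret=$4
    # printf '%s' writes the group id with no trailing linefeed.
    printf '%s' "$group_id" > "$dir/keyid.txt"
    # Remove any old read-only copy, then create the key file so that it is
    # never readable by other users, even briefly.
    rm -f "$dir/psk.txt"
    (umask 077 && printf '%s %s\n' "$server" "$secret" > "$dir/psk.txt")
    chmod 0400 "$dir/psk.txt"
}
# Afterwards, as a separate privileged step:
#   sudo chown root psk.txt
```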

Compiling Racoon for Android

I started the Android side by checking what kind of modifications Google had made to their version of Racoon. The changes were, unfortunately, quite massive. They had stripped away almost all the code that wasn’t required to implement the exact functionality provided by Android, including a lot of what is required to make things work against Cisco. Apparently this was done to make the binary smaller. They had also written their own simple communication protocol, used by the Dalvik side to command Racoon, and dropped support for the existing, more elaborate protocol used by racoonctl. I clearly needed to make my own build of Racoon for Android, and for a proof of concept it seemed easier to modify the mainline 0.8.0 version so that it worked on Android than to put all the removed functionality back into the Android 0.7.3 version.
The simplest way to build native applications for Android is by using the Android NDK. It allows setting up a cross-compilation environment with just the following commands after decompressing the package somewhere:
export ANDROID_NDK_ROOT=/path/to/android/ndk
export ANDROID_9_TOOLCHAIN=/tmp/android-9-toolchain
$ANDROID_NDK_ROOT/build/tools/make-standalone-toolchain.sh \
--platform=android-9 --install-dir=$ANDROID_9_TOOLCHAIN
export PATH=$ANDROID_9_TOOLCHAIN/bin:$PATH
You might want to use some other directory than /tmp if you plan to play around with Android native applications more often but as long as you don’t do anything that would clean your /tmp directory this will work just fine. We also need the source code for Racoon 0.8.0, which can be obtained from here.
Normally, compiling a native Linux application is just a matter of running ./configure, installing whatever dependencies it complains about, and running make once ./configure succeeds. With Android things are often not that simple, and any moderately complex existing application will require a lot of tweaking to make it compile. This is because although the Android kernel is very close to a normal Linux kernel, the userland is nowhere near: some header files are not present, some function prototypes are not present, some function prototypes are present but not implemented, some headers and libraries need to be retrieved separately, some definitions are missing, and so on.
A table linked a bit further down lists all the different issues one encounters when trying to build Racoon for Android and the solution for each of them. Many of the solutions listed in the table mention racoon_android.diff. This file can be found in this zip; it contains all the changes I made to Racoon to make it compile for Android, as well as the changes required to make it work with the Android VPN framework. The Android VPN framework related changes are all within ifdef blocks and will only be compiled in if you define ANDROID_CHANGES in your config.h. You can apply racoon_android.diff by decompressing the Racoon 0.8.0 code somewhere and then running the following command in the directory you decompressed the package into:
patch -p1 < racoon_android.diff
When running the Racoon configuration script you’ll need to pass it some extra parameters to get it compiled for Android and to include all the features required to talk to a Cisco VPN server. The Cisco-specific options are --enable-hybrid, --enable-natt, --enable-dpd and --enable-natt-versions=00,01,02,rfc, and the full configure command you should run is this:
export CPP=arm-linux-androideabi-cpp
export CXX=arm-linux-androideabi-g++
export CC=arm-linux-androideabi-gcc
./configure --prefix=$ANDROID_9_TOOLCHAIN/sysroot/usr/ \
--host=arm-linux-eabi --enable-adminport --enable-hybrid \
--disable-ipv6 --enable-natt --enable-dpd \
--enable-natt-versions=00,01,02,rfc
After running ./configure and making all the changes required to get it to complete successfully, you can simply type make to compile the code. Click here to view a table with a detailed list of all the errors you are likely to encounter when trying to run ./configure or make, and how to fix them. The items in bold require something more than just applying racoon_android.diff.
Here are pre-built binaries if you don’t want to go through the trouble of compiling them yourself. The binaries are compiled without the ANDROID_CHANGES definition (the one that makes Racoon play with Android’s existing VPN framework), so this is basically normal Racoon 0.8.0 for Android.
The list of solutions for the build errors mentions adb on a couple of occasions. adb, the Android Debug Bridge, is a utility that comes with the Android SDK and can be used for a lot of things, such as pulling files from a phone (adb pull <src> <dst>), pushing files to the phone (adb push <src> <dst>), and running a shell on the phone (adb shell). You’ll need to enable USB debugging on the phone (Settings -> Applications -> Development) in order to use adb.
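Conversely, to build with the Android VPN framework integration compiled in, the ANDROID_CHANGES definition has to end up in config.h. A minimal sketch, assuming it is run after ./configure has generated config.h (re-running ./configure would regenerate the file and drop the line); enable_android_changes is a hypothetical helper name.

```shell
# enable_android_changes CONFIG_H
# Hypothetical helper: appends the definition once, and does nothing if it
# is already present.
enable_android_changes() {
    grep -q '^#define ANDROID_CHANGES' "$1" ||
        printf '#define ANDROID_CHANGES\n' >> "$1"
}
```

Running `enable_android_changes config.h` followed by make should then include the ifdef blocks from racoon_android.diff.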

Android and Racoon

Now that I had my custom build of Racoon it was clear that I needed a rooted phone. The existing Racoon application on the device couldn’t be replaced without root access and a copy elsewhere could not have been started with sufficient privileges on a non-rooted phone. I had an HTC Desire Z so I followed the instructions here. I did also flash the engineering HBoot as I suspected there might be need to replace the kernel and the engineering HBoot makes that easy (the non-engineering HBoot won’t let you flash the boot partition using fastboot). I would recommend following the “Install a Custom Recovery Image” instructions here before flashing the engineering HBoot, though, because even after following the instructions to the letter the device refused to start after I had flashed the engineering HBoot and I had to reflash the ROM from ClockworkMod Recovery in order to get the device working again.
To test things out we can start by copying racoon, racoonctl, setkey, the configuration file, the up and down shell scripts, the id file, and the psk file to /data/local/racoon. Fire up racoon and use racoonctl to tell it to connect. And… it doesn’t work. Well, it does work up to a point, but then the handshake fails because we don’t get the SADB_ACQUIRE notification from the kernel. This time the problem wasn’t missing shell scripts, although you do have to remember to change the shell used to execute the scripts (the first line of each file) to get them running on Android.
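Changing the first line of the scripts can be done in one go before pushing them to the device. This is a sketch under the assumption that the scripts start with a #! line and that /system/bin/sh is the shell to use on the phone; use_android_shell is a hypothetical helper, and the adb lines in the comments are only examples of how the files might then be copied.

```shell
# use_android_shell SCRIPT...
# Hypothetical helper: rewrites the #! line of each script to Android's shell.
use_android_shell() {
    for script in "$@"; do
        tmp="$script.tmp"
        # Replace only line 1, and only if it is a #! line.
        sed '1s|^#!.*$|#!/system/bin/sh|' "$script" > "$tmp" &&
            cat "$tmp" > "$script" &&   # write back in place, keeping the file mode
            rm -f "$tmp"
    done
}
# With a device connected, for example:
#   adb push phase1-up.sh /data/local/racoon/phase1-up.sh
#   adb shell chmod 755 /data/local/racoon/phase1-up.sh
```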
To understand why we weren’t getting the SADB_ACQUIRE one needed to understand the entire process a bit better, so I spent a while reading the kernel code in the relevant area, reading ISAKMP specs, and also checked out what the setkey application, which is invoked by the up and down shell scripts, does exactly. The flow of events turned out to be something like this:
  1. Racoon connects to the VPN gateway, makes basic handshake during which user is authenticated, key material is established, and Racoon learns the virtual network details, such as nameservers and the IP it should use for itself.
  2. Racoon runs the “up” script, which creates new virtual network interface with appropriate IP address, marks this interface as the default route, configures the nameservers, and runs setkey with specific parameters.
  3. setkey tells the kernel that all traffic going through the virtual network interface should be routed through the real network interface and encapsulated using ESP. The same should be done for incoming traffic, vice versa. Setkey does not, however, tell the kernel the encryption key to use.
  4. When the kernel needs to pass data through the route specified with setkey and it doesn’t have the encryption key required to encrypt the traffic, it sends a SADB_ACQUIRE message to everyone listening.
  5. When Racoon gets the SADB_ACQUIRE message, it finalizes the handshake with the VPN gateway and passes the encryption key to the kernel.
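To make step 3 more concrete, the policies handed to setkey look roughly like the output of the sketch below. It is modelled on the general shape of the roadwarrior sample scripts rather than copied from the scripts used here, print_tunnel_policies is a hypothetical helper, and all addresses are placeholders. With NAT traversal the endpoint addresses may additionally need a port suffix, which this sketch ignores.

```shell
# print_tunnel_policies VIRTUAL_IP LOCAL_IP GATEWAY_IP
# Hypothetical helper: prints one outbound and one inbound ESP tunnel policy
# in setkey's input syntax. No keys are installed; those come later from Racoon.
print_tunnel_policies() {
    vip=$1 local_ip=$2 gw=$3
    printf 'spdadd %s/32 0.0.0.0/0 any -P out ipsec esp/tunnel/%s-%s/require;\n' \
        "$vip" "$local_ip" "$gw"
    printf 'spdadd 0.0.0.0/0 %s/32 any -P in ipsec esp/tunnel/%s-%s/require;\n' \
        "$vip" "$gw" "$local_ip"
}
```

The output is meant to be piped into setkey, for example `print_tunnel_policies 10.0.0.5 192.168.1.20 192.168.3.1 | setkey -c`.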
The problem turned out to be that the kernel only sends SADB_ACQUIRE when it really needs the keys, not immediately when a policy is added using setkey. When running Racoon on normal Linux it looked as if the SADB_ACQUIRE message was sent immediately in response to the setkey call, but that was not in fact the case. The message only seemed to come immediately because a normal Linux installation has so much network traffic that it took at most a few hundred milliseconds before the kernel really required the key. In Android the network traffic has been reduced dramatically, and there wasn’t anything to send or receive before the Cisco VPN server got tired of waiting and told Racoon to go away. Simply adding a ping to a random server at the end of the “up” script ensures the kernel has something to route and immediately sends the SADB_ACQUIRE message. Just adding the ping didn’t fix the problem, however, and some additional debugging was required to see what else went wrong.
To investigate the problem further I needed to add some debug statements to the kernel. HTC does provide the source code for their kernel, but building it failed because it tried to use header files that were not included and not available anywhere on the Internet. To work around this I decided to flash the CyanogenMod ROM to the device. I would have needed that at some point anyway because I also wanted to modify the Android settings application to support Cisco VPN, and HTC doesn’t provide source code that would have allowed me to recompile the appropriate version of the settings application or the services framework.
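The ping at the end of the “up” script could look something like the following sketch. kick_acquire and PING_CMD are hypothetical names introduced only for illustration, and 192.0.2.1 is a placeholder address; any address whose traffic matches the new policy should do.

```shell
# kick_acquire HOST
# Hypothetical helper: sends a single packet so the kernel has traffic that
# matches the new policy and therefore emits SADB_ACQUIRE. It runs in the
# background and ignores the result, so the script is not held up waiting
# for a reply that cannot arrive before the tunnel is up.
kick_acquire() {
    "${PING_CMD:-ping}" -c 1 "$1" >/dev/null 2>&1 &
}
# As the last line of the "up" script, for example:
#   kick_acquire 192.0.2.1
```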
Installing CyanogenMod (7.0.0) was very easy using the ROM Manager. Next I needed the CyanogenMod version of the kernel source code for HTC Desire Z. For compiling the kernel I used the ARM compiler environment that you get by following the CyanogenMod ROM build instructions. The environment used previously for building Racoon might have worked as well but since I needed to build the CyanogenMod ROM as well I decided to go with that environment instead. To get the kernel built with correct configuration I used my current configuration when building the kernel:
adb pull /proc/config.gz ./config.gz && gunzip config.gz && mv config .config
Once the .config file is in the kernel source code root directory, building the kernel should be as easy as running the following command:
make ARCH=arm CROSS_COMPILE=~/android/prebuilt/linux-x86/toolchain/arm-eabi-4.4.3/bin/arm-eabi-
Replace ~/android with whatever directory you put the ARM compilation environment into. Note that the build might fail if you’re using a 64-bit operating system. I first tried building on Fedora 14 x64, and after that failed I tried building on Ubuntu 10.04 x86, which was successful.
Because the kernel version string is slightly different when you build the kernel yourself, you might notice that WLAN stops working after switching to your own kernel. I fixed this simply by changing kernel/module.c (around line 2202) slightly so that it loads the WiFi module even though there is a version mismatch.
So how do you actually get the kernel onto the phone? You cannot do it by simply putting the generated zImage somewhere on the device. Instead you’ll first need to obtain the fastboot tool required to flash the boot partition of the device, the mkbootimg tool required to combine a ramdisk and a kernel into a boot image, and the split_bootimg Perl script required to extract the ramdisk from your current boot image.
The current boot image could have been extracted from the device but I chose to get it from the CyanogenMod zip file for HTC Desire Z, which you can download from here. Syntax for split_bootimg.pl is simply
split_bootimg.pl boot.img
Now that you have the new kernel you built a moment ago (zImage) and the ramdisk (boot.img-ramdisk.gz) from the CyanogenMod boot image, you can use mkbootimg to create a new boot image:
mkbootimg --cmdline 'no_console_suspend=1 console=null' \
--kernel /path/to/kernel/zImage --ramdisk boot.img-ramdisk.gz -o new_boot.img
After this you should be able to use fastboot to flash the boot partition of the device using new_boot.img. However, for some reason the newly created boot image didn’t work for me. Comparing it against the original boot image revealed that using a hex editor to change the following bytes in new_boot.img might fix the problem:
00000010 04 10
00000018 05 11
00000020 04 10
00000024 04 10
Here the first column is the offset of the byte to change, the second column is the value you should expect to see in the original new_boot.img, and the last column is the value you should replace the original value with. After doing this hex editing magic the boot image should work fine. You can apply it by turning off the device, holding down power and volume down buttons (or whatever combination takes you to the boot menu with your device), selecting the FASTBOOT option from the boot menu, connecting the device to your computer through USB, and running the following command:
fastboot boot new_boot.img
This won’t yet flash the boot partition, it just boots the device using the given boot image. Once you have ensured that everything works as expected, you can do the actual permanent flashing part with the following command while in FASTBOOT menu:
fastboot flash boot new_boot.img
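The hex edits listed earlier can also be scripted instead of done by hand in a hex editor. The sketch below checks that each byte currently holds the expected value before overwriting it, so it refuses to touch an image that does not match the table. patch_byte is a hypothetical helper; run it on a copy of the image first.

```shell
# patch_byte FILE OFFSET_HEX EXPECTED_HEX NEW_HEX
# Hypothetical helper: replaces one byte, but only if the byte currently at
# that offset has the expected value (two lowercase hex digits).
patch_byte() {
    file=$1 offset=$((0x$2)) expected=$3 new=$4
    current=$(od -An -tx1 -j "$offset" -N 1 "$file" | tr -d ' \n')
    if [ "$current" != "$expected" ]; then
        echo "offset 0x$2: found $current, expected $expected" >&2
        return 1
    fi
    # Turn the hex value into an octal escape, which printf handles portably,
    # and write that single byte in place without truncating the file.
    printf "\\$(printf '%03o' "0x$new")" |
        dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}
# The table from the text would then become:
#   patch_byte new_boot.img 00000010 04 10
#   patch_byte new_boot.img 00000018 05 11
#   patch_byte new_boot.img 00000020 04 10
#   patch_byte new_boot.img 00000024 04 10
```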
So now we’re able to run our own kernel in the device and do some debugging. To cut the story short, I found out that the Android kernel was working fine but it just hadn’t been compiled with all the appropriate options to make Racoon work properly. The options that needed to be turned on are these (modify the .config file):
CONFIG_XFRM_USER=y
CONFIG_INET_XFRM_MODE_TUNNEL=y
You should also make sure that CONFIG_INET_ESP is enabled and I also enabled CONFIG_INET_AH, although it shouldn’t have been necessary in this case.
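Before spending time on a kernel build it is worth checking the .config file for these options. A minimal sketch; check_ipsec_config is a hypothetical helper, and treating “enabled” as built in (=y) rather than built as a module is an assumption on my part.

```shell
# check_ipsec_config CONFIG_FILE
# Hypothetical helper: reports every required option that is not set to y
# and returns non-zero if any is missing.
check_ipsec_config() {
    status=0
    for opt in CONFIG_XFRM_USER CONFIG_INET_XFRM_MODE_TUNNEL CONFIG_INET_ESP; do
        if ! grep -q "^$opt=y" "$1"; then
            echo "not enabled: $opt"
            status=1
        fi
    done
    return $status
}
```

Run it as `check_ipsec_config .config` in the kernel source root after pulling the configuration from the device.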
Now, finally, after building a new kernel with these configuration options enabled, I was able to connect to a Cisco VPN using Racoon from the Android device. Of course I was doing this from the command line, which isn’t much of a user interface, so it was time to take a look at the Dalvik side of things.
Here are the Android versions of the configuration file and the up and down scripts. Setting up the nameservers needed to be handled differently, and the scripts already contain the bit that will be required later to notify the Dalvik side that the VPN is up or down. The properties used for setting the nameservers in the scripts don’t in fact work unless the VPN is being set up from the Dalvik side; for a standalone native application with no integration with the existing VPN framework, one needs to set net.dns1 and net.dns2 instead of vpn.dns1 and vpn.dns2. These are the commands I used to start Racoon (after running su to switch to root) and to tell it to connect (the second one is run in another terminal because -F makes Racoon run in the foreground):
./racoon -f ./nixu_wlan.conf -C -F
./racoonctl -s /data/local/racoon/racoon.sock vpn-connect -u rikonen 192.168.3.1

Android Settings Application and Service Framework

In order to change the actual Android UI and the service logic so that they can handle a new type of VPN, we must first be able to build the entire ROM. Or, to be exact, we need to be able to build the settings application, the core framework library, and the VPN services, but it is easiest to start by just compiling the whole thing. The CyanogenMod Wiki has good instructions on how to do the build, so I’m not going to duplicate them here. See this page for more information if you’re using an HTC Desire Z, or a similar page for other devices.
Here is a zip containing diffs with all the necessary changes that need to be done to the source tree. You can apply these by running
cd ~/android/packages/apps/Settings
patch -p1 < /some/path/packages_apps_Settings.diff
cd ~/android/frameworks/base
patch -p1 < /some/path/frameworks_base.diff
I was using CyanogenMod 7.0.0 when making the changes so they might not work directly with other versions. I did try applying the changes to CyanogenMod 7.0.3, though, and that worked without any visible errors at least.
After the build is complete you can either reflash the entire ROM or replace only the individual jar and apk files that really need to be replaced: /system/app/VpnServices.apk, /system/app/Settings.apk, /system/framework/framework-res.apk, and all jar files under /system/framework. You will need to reboot the phone for the framework changes to take effect. Exercise some caution when replacing files under /system/framework, as leaving that directory in an inconsistent state could make the phone unusable (until you reflash it, at least); see this page for some more information.
Now that the Java code has been modified to support invoking Racoon with all the required parameters, we also need to modify Racoon so that it can receive and process the parameters. Google had modified their version of Racoon so that it didn’t use configuration files at all but instead just set its internal state according to the received parameters. Because my build still had support for reading configuration files, I decided to make Racoon create a configuration file based on the received parameters and then continue as if it had been started normally with that configuration. This was a relatively straightforward thing to do. The code is included in the diff mentioned earlier.
After making the changes, compiling Racoon again, and throwing it to the correct location on the device (which does require you to remount /system so that it is writable), things worked beautifully and I was able to connect the VPN by just clicking the VPN entry and entering my SecurID token.
If you don’t want to go through all the trouble of building Racoon for Android yourself, you can just grab these pre-built binaries. They have been built with “#define ANDROID_CHANGES” and only work correctly when invoked by the Android VPN framework. Use the binaries provided in the “Compiling Racoon for Android” section if you want to run Racoon from the command line. Also note that this build of Racoon expects the up and down scripts to be present on the phone under the following names:
/data/local/racoon/phase1-up.sh
/data/local/racoon/phase1-down.sh

Summing it up

So after all this we finally have an Android phone where Cisco IPsec VPN works just like the other VPNs work on normal Android phones. Some of the code isn’t pretty or finalized, and some other VPNs stop working if these changes are applied, but this does at least prove it can be done.
There were quite a few bumps on the way, and even with these long instructions I wouldn’t expect many people to dare or bother to jump through all the hoops required to make this work. Using get-a-robot-vpnc is certainly a lot easier (for those for whom it happens to work), but the point of this exercise wasn’t to make something easier than get-a-robot-vpnc. It was to see how difficult it really would be for Google or the phone manufacturers to add this feature. A bit tricky perhaps, but certainly doable. The biggest challenge is how to add the functionality required by Cisco back into Racoon after Google stripped it down. The other parts, while challenging for end users, would be fairly simple for Google or any of the phone manufacturers.


from https://www.nixuopen.org/blog/2011/5/android-and-cisco-ipsec-vpn/
----------------------------------------------------------------------