Sunday, 7 March 2021

NetBSD IPsec FAQ


This page is under development, and we welcome any comments or suggestions.

IPsec FAQ

Getting Started

IPsec (IP security protocol) is part of the NetBSD distribution. It provides per-packet authenticity and confidentiality guarantees between peers that communicate using IPsec. IPsec is available for both IPv6 and IPv4.

Note, however, that kernel reconfiguration is necessary to use IPsec. It is not enabled in the default GENERIC kernel.

Userland code includes IPsec support by default wherever possible, so no rebuild of userland is necessary even if you switch between kernels with and without IPsec.

Note

We sometimes use the term IP security in a broader sense, covering IP firewalls, packet filtering, and so forth.

IPsec = AH + ESP + IPcomp + IKE

IPsec consists of several separate protocols, listed below:

  • Authentication Header (AH): provides an authenticity guarantee for packets by attaching a strong cryptographic checksum to them. If you receive a packet with AH and the checksum verification succeeds, you can be sure of two things, provided that you and the peer share a secret key and no other party knows it:
    • The packet was originated by the expected peer, not by an impersonator.
    • The packet was not modified in transit.
    Unlike other protocols, AH covers the whole packet, from the IP header to the end of the packet.
  • Encapsulating Security Payload (ESP): provides a confidentiality guarantee for packets by encrypting them. If you receive a packet with ESP and decrypt it successfully, you can be sure that the packet was not wiretapped in transit, provided that you and the peer share a secret key and no other party knows it.
  • IP payload compression (IPcomp): ESP encrypts packets, but encryption tends to defeat compression on the wire (such as PPP compression). IPcomp provides a way to compress a packet before ESP encrypts it. (Of course, you can also use IPcomp alone if you wish.)
  • Internet Key Exchange (IKE): as noted above, AH and ESP need a secret key shared between peers. For communication between distant locations, we need a way to negotiate keys in secrecy. IKE makes this possible.

AH, ESP and IPcomp are implemented in the kernel. IKE is implemented as a daemon process in userland. The kernel part and the userland part cooperate through a key management table that sits between them.

IKE is actually optional: you can configure secret keys for AH/ESP manually. However, please understand that you cannot use the same secret key forever. The longer you use the same secret key, the more likely your traffic is to be compromised.

Note

The security of the IPsec protocols depends on the secrecy of the secret keys. If the secret keys are compromised, the IPsec protocols can no longer be secure. Take care with the permission modes of configuration files, key database files, and anything else that may leak key information.

Two sets of RFCs have been published: the old IPsec suite starts from RFC1825, and the new IPsec suite starts from RFC2401. Though NetBSD implements both, the new IPsec suite is recommended.

        userland programs               IKE daemon
          ^ | AF_INET{,6} socket          ^ | PF_KEY socket
========= | | =========================== | | ======== Kernel/user boundary
          | v                             | v
        transport layer, TCP/UDP        key management table
          ^ |                             ^ | key information
          | |                             | |
          | v                             | v
        IP input/output logic <-------> AH/ESP/IPcomp logic
          ^ |
          | v
        Network drivers (ethernet)

Transport mode and tunnel mode

AH, ESP and IPcomp have two modes of operation: transport mode and tunnel mode. Transport mode protects normal communication between peers. Tunnel mode encapsulates the packet in a new IPv4/v6 header and is designed to be used by VPN gateways.

[[transport mode]]
my host ======== peer's host
        transport
        mode

packets: [IP: me->peer] ESP payload
                        <---------> encrypted


[[tunnel mode]]
        (a)                  (b)                        (c)
my host ---- my VPN gateway ======== peer's VPN gateway ---- peer's host
                            tunnel mode

packets on (a): [IP: me->peer] payload
packets on (b): [IP: mygw->peergw] ESP [IP: me->peer] payload
                                   <------------------------> encrypted
packets on (c): [IP: me->peer] payload

IPsec policy management

Though the kernel knows how to secure packets, it does not know which packets require security. We need to tell the kernel which packets to secure. IPsec policy configuration allows us to specify this.

IPsec policy can be configured per packet or per socket:

  • Per-packet: configured into the kernel just like packet filters. You can specify rules such as "encrypt outgoing packets if I'm sending to 10.1.1.0/24". This works well when you are running an IPsec router.
  • Per-socket: configured via setsockopt(2) for a certain socket. You can specify rules such as "encrypt outgoing packets from this socket". This works well when you would like to run an IPsec-aware server program.

IPsec policy decides which IPsec protocols (AH, ESP or IPcomp) are applied to a packet. You can configure the kernel to use any combination of AH, ESP and IPcomp on a packet. You can even apply the same protocol multiple times, such as multiple ESP operations on a single packet. It is questionable whether multiple ESP operations have any benefit, but they are certainly interesting for test/debug use.

Configuring IPsec kernel

Refer to tracking NetBSD-current for more details of the build process.

  1. In your kernel configuration file, enable the following options.
    options IPSEC
    pseudo-device swcrypto
    	    
    Optionally you can also enable debugging:
    options IPSEC_DEBUG
    	    
  2. Build a new kernel as usual.
  3. Replace the kernel and reboot.

Userland tools include IPsec support by default, and no userland rebuild is necessary.

Additionally, you may want to use racoon(8), which comes with NetBSD, or install security/isakmpd.

Configuration examples: host-to-host encryption

If you would like to run host-to-host (transport mode) encryption with manually configured secret keys, the following configuration should be enough. We use setkey(8) to configure the manual keys.

#! /bin/sh
#
# packet will look like this: IPv4 ESP payload
# the node is on 10.1.1.1, peer is on 20.1.1.1
setkey -c <<EOF
add 10.1.1.1 20.1.1.1 esp 9876 -E 3des-cbc "hogehogehogehogehogehoge";
add 20.1.1.1 10.1.1.1 esp 10000 -E 3des-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 10.1.1.1 20.1.1.1 any -P out ipsec esp/transport//use;
EOF

The first two lines configure the secret keys to be used by ESP. The decimal number that appears as the fourth word is called the SPI (security parameter index). The value is attached to each ESP packet and lets the receiving side look up the secret key from the packet. The number needs to be unique for a node.

  • From 10.1.1.1 to 20.1.1.1, we use the 3DES-CBC algorithm with secret key "hogehogehogehogehogehoge". The traffic is identified by SPI 9876.
  • From 20.1.1.1 to 10.1.1.1, we use the 3DES-CBC algorithm with secret key 0xdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef. The traffic is identified by SPI 10000.

The last line configures the per-packet IPsec policy for the node. With this configuration, the node (10.1.1.1) transmits packets to the peer (20.1.1.1) encrypted whenever a secret key is configured in the kernel. The configuration does not prohibit unencrypted packets from 20.1.1.1 from reaching 10.1.1.1. If you would like to reject unencrypted packets, add the following line:

spdadd 20.1.1.1 10.1.1.1 any -P in ipsec esp/transport//require;

On the other end (20.1.1.1), the configuration will be like this. Note that the addresses need to be swapped on the spdadd line, but not the add lines.

#! /bin/sh
#
# packet will look like this: IPv4 ESP payload
# the node is on 20.1.1.1, peer is on 10.1.1.1
setkey -c <<EOF
add 10.1.1.1 20.1.1.1 esp 9876 -E 3des-cbc "hogehogehogehogehogehoge";
add 20.1.1.1 10.1.1.1 esp 10000 -E 3des-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 20.1.1.1 10.1.1.1 any -P out ipsec esp/transport//use;
EOF

The syntax for policy configuration is documented in ipsec_set_policy(3).

Try running tcpdump(8) to see the packets on the wire. They are encrypted, so it is no longer possible to wiretap their contents.
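
As a rough sketch of what to look for, the helper below builds a capture filter that matches ESP (IP protocol 50), AH (IP protocol 51) and IKE (UDP port 500) traffic between the two example hosts. The function name ipsec_filter is hypothetical, and the filter as written covers IPv4 only.

```sh
#!/bin/sh
# Hypothetical helper: print a pcap filter for IPsec traffic between two IPv4 hosts.
# Protocol 50 is ESP, protocol 51 is AH; UDP port 500 carries IKE.
ipsec_filter() {
    printf 'host %s and host %s and (ip proto 50 or ip proto 51 or udp port 500)\n' "$1" "$2"
}

# Print the filter for the example node and its peer.
ipsec_filter 10.1.1.1 20.1.1.1
```

The output can be handed to tcpdump(8), for example tcpdump -n -i wm0 "$(ipsec_filter 10.1.1.1 20.1.1.1)", where wm0 is a placeholder for your interface name.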

The above example uses human-readable secret keys. However, the specification discourages human-readable secret keys, since they are more likely to be compromised than binary keys. You should use binary keys for real operation.

Key length is determined by the algorithm. For 3des-cbc, the secret key MUST be 192 bits (= 24 bytes). If you specify a shorter or longer key, you will get an error from setkey(8).
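
One way to obtain a binary key of the right length is to read random bytes and print them in the 0x... hex form used in the examples. The following is a minimal sketch, assuming /dev/urandom is an acceptable randomness source for your purposes; the function name genkey is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: print a random key of $1 bytes in the 0x... hex form
# that the examples above pass to setkey(8).
genkey() {
    printf '0x'
    # Read exactly $1 random bytes and dump them as contiguous lowercase hex.
    dd if=/dev/urandom bs=1 count="$1" 2>/dev/null | od -An -v -tx1 | tr -d ' \n'
    printf '\n'
}

genkey 24   # 24 bytes = 192 bits, the length 3des-cbc requires
```

For rijndael-cbc with a 128-bit key, the same helper would be called as genkey 16. Both peers must of course be given the same generated value.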

If you wish to use other algorithms, the configuration is very similar. Here's an example with rijndael-cbc (also known as AES). rijndael-cbc takes secret keys of 128, 192 or 256 bits. Here we use 128-bit keys.

#! /bin/sh
#
# packet will look like this: IPv4 ESP payload
# the node is on 10.1.1.1, peer is on 20.1.1.1
# rijndael-cbc with 128bit key
setkey -c <<EOF
add 10.1.1.1 20.1.1.1 esp 9876 -E rijndael-cbc "hogehogehogehoge";
add 20.1.1.1 10.1.1.1 esp 10000 -E rijndael-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 10.1.1.1 20.1.1.1 any -P out ipsec esp/transport//use;
EOF

Configuration examples: host-to-host authentication

Just like ESP, you can configure AH.

#! /bin/sh
#
# packet will look like this: IPv4 AH payload
# the node is on 10.1.1.1, peer is on 20.1.1.1
setkey -c <<EOF
add 10.1.1.1 20.1.1.1 ah 9877 -A hmac-md5 "hogehogehogehoge";
add 20.1.1.1 10.1.1.1 ah 10001 -A hmac-md5 "mogamogamogamoga";
spdadd 10.1.1.1 20.1.1.1 any -P out ipsec ah/transport//use;
EOF

Configuration examples: host-to-host encryption+authentication

If you configure secret keys for both AH and ESP, you can use both of them. The IPsec documents suggest applying AH after ESP.

#! /bin/sh
#
# packet will look like this: IPv4 AH ESP payload
# the node is on 10.1.1.1, peer is on 20.1.1.1
setkey -c <<EOF
add 10.1.1.1 20.1.1.1 esp 9876 -E 3des-cbc "hogehogehogehogehogehoge";
add 20.1.1.1 10.1.1.1 esp 10000 -E 3des-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef;
add 10.1.1.1 20.1.1.1 ah 9877 -A hmac-md5 "hogehogehogehoge";
add 20.1.1.1 10.1.1.1 ah 10001 -A hmac-md5 "mogamogamogamoga";
spdadd 10.1.1.1 20.1.1.1 any -P out ipsec esp/transport//use ah/transport//use;
EOF

Configuration examples: IPsec VPN

First of all, here are a couple of issues with IPsec VPN configuration.

  • Routing setup must be done properly.
  • Do not try to make an IPsec tunnel device behave as a NAT box or filtering firewall at the same time. IPsec and NAT are inherently incompatible. Also, due to implementation and specification limitations in 1.5, they do not play nicely together. We are trying to improve this situation. See "Interaction with NPF" for more details.
  • VPN configuration differs from installation to installation. Actually, there is no clear definition of what VPN means. If you ask questions on mailing lists, you need to describe your needs, your current situation and your network configuration as a whole.

The following example assumes the following network configuration. The goals of the example are:

  • To somehow connect machines inside two private-address clouds (10.0.1.0/24 and 10.0.2.0/24; think of them as the Tokyo branch of your company and the NY headquarters).
  • The traffic between the two clouds needs to be exchanged securely between the gateways.
  • We do not want to pay transpacific leased-line charges, so we contract locally with ISPs (in Tokyo and in NY) and tunnel traffic between the gateways.

((( 10.0.1.0/24 )))     VPN'ed network, Tokyo branch office
  |10.0.1.1
gateway 1
  |20.0.0.1
  |IPsec tunnel
  |20.0.0.2
gateway 2
  |10.0.2.1
((( 10.0.2.0/24 )))     VPN'ed network, NY headquarters

The following text presents configuration for gateway 1.

#! /bin/sh
#
# Note that routing should be set up in advance, i.e. for this example:
#       route -n add -net 10.0.2.0 10.0.2.1
#       route -n add 10.0.2.1 10.0.1.1
# packet will look like this: IPv4 ESP IPv4 payload
# the node is on 10.0.1.1/20.0.0.1, peer is on 10.0.2.1/20.0.0.2
setkey -c <<EOF
add 20.0.0.1 20.0.0.2 esp 13245 -E blowfish-cbc "blowfishtest.001" ;
add 20.0.0.2 20.0.0.1 esp 13246 -E blowfish-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 10.0.1.0/24 10.0.2.0/24 any -P out ipsec esp/tunnel/20.0.0.1-20.0.0.2/require ;
spdadd 10.0.2.0/24 10.0.1.0/24 any -P in ipsec esp/tunnel/20.0.0.2-20.0.0.1/require ;
EOF

(contributed by Per Harald Myrvang)
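
Gateway 2 mirrors this configuration. As a sketch derived from the gateway 1 script (it is not part of the original contribution), the add lines stay the same, while the spdadd lines swap their directions and tunnel endpoints. Routing on gateway 2 is assumed to be set up analogously in advance.

```
add 20.0.0.1 20.0.0.2 esp 13245 -E blowfish-cbc "blowfishtest.001" ;
add 20.0.0.2 20.0.0.1 esp 13246 -E blowfish-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 10.0.2.0/24 10.0.1.0/24 any -P out ipsec esp/tunnel/20.0.0.2-20.0.0.1/require ;
spdadd 10.0.1.0/24 10.0.2.0/24 any -P in ipsec esp/tunnel/20.0.0.1-20.0.0.2/require ;
```

These statements go between setkey -c <<EOF and EOF, exactly as in the gateway 1 script.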

Configuration examples: Leaf-node tunnel

Tunnel mode can be used in situations where all traffic from a given leaf node is to be encrypted to the next-hop router, and unencrypted from there (for example, a wireless node to a router, because 802.11 WEP is inadequate).

For the leaf node, use:

#! /bin/sh
#
# the node is on 10.0.1.5, router is on 10.0.1.1
setkey -c <<EOF
add 10.0.1.1 10.0.1.5 esp 1011 -E rijndael-cbc "rijndaeltest.001" ;
add 10.0.1.5 10.0.1.1 esp 1012 -E rijndael-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 10.0.1.5/32 0.0.0.0/0 any -P out ipsec esp/tunnel/10.0.1.5-10.0.1.1/require;
spdadd 0.0.0.0/0 10.0.1.5/32 any -P in ipsec esp/tunnel/10.0.1.1-10.0.1.5/require;
EOF

For the router, swap the out and in keywords on the spdadd commands.
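
Written out under that rule, the router side (10.0.1.1) would look like the following sketch. The add lines are unchanged, and only the direction keywords on the spdadd lines differ from the leaf-node script.

```
add 10.0.1.1 10.0.1.5 esp 1011 -E rijndael-cbc "rijndaeltest.001" ;
add 10.0.1.5 10.0.1.1 esp 1012 -E rijndael-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 10.0.1.5/32 0.0.0.0/0 any -P in ipsec esp/tunnel/10.0.1.5-10.0.1.1/require;
spdadd 0.0.0.0/0 10.0.1.5/32 any -P out ipsec esp/tunnel/10.0.1.1-10.0.1.5/require;
```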

Configuring AH/ESP keys by using IKE

Here we describe the following configuration:

  • Node A and B will use transport-mode ESP.
  • It is required for both ends to use ESP to exchange packets, for all protocols.
  • During IKE, node A and B authenticate each other by using shared secret exchange.

Please follow the steps carefully. Run tcpdump(8) to check how the packets are exchanged between the two nodes. The statistics from netstat -sn are also useful for seeing how the kernel IPsec code is working.

  1. Copy /usr/share/examples/racoon/racoon.conf.sample to /etc/racoon/racoon.conf. Modify the parameters declared in racoon.conf as necessary. It is VERY critical that both ends use the same configuration, so keep the differences between the two racoon.conf files to a minimum.
  2. racoon obeys the IPsec policy settings in the kernel when it negotiates IPsec keys. Therefore, we need to configure the IPsec policy in the kernel by using setkey(8). On node A, configure the IPsec policy like this. In the example, A and B are IPv4/v6 numeric addresses.
    A# setkey -c
    spdadd A B any -P out ipsec esp/transport//require;
    spdadd B A any -P in ipsec esp/transport//require;
    ^D
    
  3. On node B, configure IPsec policy like this by using setkey(8):
    B# setkey -c
    spdadd B A any -P out ipsec esp/transport//require;
    spdadd A B any -P in ipsec esp/transport//require;
    ^D
    
  4. On both nodes, prepare a pre-shared key file. It is VERY critical to set the file permissions properly; otherwise using IPsec is worthless and does nothing but waste your CPU time (racoon(8) will not read files with weak permissions). Again, A and B are numeric IPv4/v6 addresses.
    A# cat >/etc/racoon/psk.txt
    B       spamspamspam
    ^D
    A# chmod 600 /etc/racoon/psk.txt
    
    B# cat >/etc/racoon/psk.txt
    A       spamspamspam
    ^D
    B# chmod 600 /etc/racoon/psk.txt
    
  5. Run racoon. If you wish to see the debug trace, use arguments like those below:
    # racoon -f /etc/racoon/racoon.conf -dddddd
    
  6. Try to exchange packets between A and B. You will see some messages from racoon on the console, and keys will be established.
    A# ping -n B
    (with some delay, you will start seeing replies)
    ^C
    A# setkey -D
    (you will see keys exchanged by racoon)
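
The pre-shared key file from step 4 exists with default permissions for a moment between cat and chmod. A minimal sketch of one way to avoid that window is to tighten the umask before the file is created; the function name write_psk is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: write_psk FILE PEER SECRET
# Creates the pre-shared key file with owner-only permissions from the start
# by tightening the umask inside a subshell before the file is written.
write_psk() {
    (
        umask 077
        printf '%s\t%s\n' "$2" "$3" > "$1"
    ) && chmod 600 "$1"   # also fixes the mode if the file already existed
}
```

On node A this would be invoked as write_psk /etc/racoon/psk.txt B spamspamspam, with B replaced by the peer's numeric address, and correspondingly on node B with A.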
    

racoon negotiates keys based on the policy definition. By changing the policy definition, we can easily configure other cases. The next example configures keys for the following situation:

  • A is a mail server. A wishes to enforce the use of transport mode AH for everyone who contacts A with the POP protocol (TCP port 110). B is a client which wishes to contact A.
    1. The policy configuration on A avoids the use of AH for local traffic (note that racoon cannot negotiate keys with itself). The order of the policies is highly important. If you reorder them, the configuration will not work.
      A# setkey -c
      spdadd A[110] A tcp -P out none;
      spdadd A A[110] tcp -P in none;
      spdadd A[110] 0.0.0.0/0 tcp -P out ipsec ah/transport//require;
      spdadd 0.0.0.0/0 A[110] tcp -P in ipsec ah/transport//require;
      ^D
      
      B# setkey -c
      spdadd B A[110] tcp -P out ipsec ah/transport//require;
      spdadd A[110] B tcp -P in ipsec ah/transport//require;
      ^D
      
    2. Other than policy configuration part, configure just like the previous example.

If you have any problem in configuring it, be sure to look at full debug logs (racoon -dddddd) and see where it chokes. Every configuration difference leads to unsuccessful negotiation.

Setting up IPsec manual keys and policies on bootstrap

rc.conf(5) has an entry for IPsec, named ipsec. Setting ipsec=YES will run the following command at bootstrap time, before any network activity:

/sbin/setkey -f /etc/ipsec.conf

For example, you can perform encrypted NFS mount for /usr. /etc/ipsec.conf should contain valid commands to setkey(8); similar to the configuration examples above without the setkey -c <<EOF ... EOF sections.
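
As an illustration, an /etc/ipsec.conf for node 10.1.1.1 from the host-to-host encryption example could look like the sketch below. The leading flush and spdflush statements, which clear existing key and policy entries before loading, are an addition of this sketch and not part of the earlier examples.

```
# /etc/ipsec.conf for 10.1.1.1 (peer is 20.1.1.1)
flush;
spdflush;
add 10.1.1.1 20.1.1.1 esp 9876 -E 3des-cbc "hogehogehogehogehogehoge";
add 20.1.1.1 10.1.1.1 esp 10000 -E 3des-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef;
spdadd 10.1.1.1 20.1.1.1 any -P out ipsec esp/transport//use;
```

Because this file contains secret keys, it should be readable by root only.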

Interaction with NPF

NetBSD implements npf(7). NPF filters packets, and IPsec policy processing is inherently similar to packet filtering, so the two implement conflicting functionality. The NPF/IPsec interaction is specified as follows: NPF looks at packets in native wire format only. NPF looks at packets before IPsec processing on inbound, and after IPsec processing on outbound.

Even with the processing order, please be aware of the following:

  • If you want IPsec packets to go through NPF, you should not drop them by npf.conf(5) rules. You need to let IP packets with relevant protocol number (50 for ESP, 51 for AH) go through.

    Note

    Protocol numbers are a completely different thing from TCP/UDP port numbers.

  • Packets coming from tunnel devices (eg gif(4)) will still go through npf(7). You may need to identify these packets by using interface name directive in npf.conf(5).

Processing order

The following diagram summarizes new inbound processing order:

inbound processing:
        userland programs               IKE daemon
          ^ AF_INET{,6} socket            ^ | PF_KEY socket
========= | ============================= | | ======== Kernel/user boundary
          |                               | v
        transport layer, TCP/UDP        key management table
          ^                               ^ | key information
          |                               | |
          |                               | v
  +-----IP input/output logic <-------> AH/ESP/IPcomp logic
  v       ^          ^                      |
tunnel    |          +----------------------+ decapsulated IPsec packets
devices   |
  |     NPF rule
  |       ^
  +------>|
          |
        Network drivers (ethernet)

The following diagram summarizes new outbound processing order:

outbound processing:
        userland programs               IKE daemon
            | AF_INET{,6} socket          ^ | PF_KEY socket
=========== | =========================== | | ======== Kernel/user boundary
            v                             | v
        transport layer, TCP/UDP        key management table
            |                             ^ | key information
            |                             | |
            v                             | v
  +---->IP input/output logic <-------> AH/ESP/IPcomp logic
  |         |                           (incl. IPsec tunnel encapsulation)
tunnel      |
devices     |
  |     NPF rules
  |         |
  +---------+
            v
        Network drivers (ethernet)

Common pitfalls, and debugging techniques

  • Some people mix up the following three items. Take care if you try to interoperate with other implementations. If you mix them up, you will never be able to build an interoperable configuration. Documentation may use different words for them (sigh).
    • IPsec with manual key
      In the NetBSD case, this uses setkey(8) to configure the IPsec secret key. The IPsec secret key does not change over time.
    • IPsec with IKE, with pre-shared secret
      In the NetBSD case, this uses racoon(8). We authenticate the peer with a pre-shared secret. racoon(8) negotiates IPsec keys dynamically and installs them into the kernel. The IPsec secret keys change over time.
    • IPsec with IKE, with certificates
      In the NetBSD case, this uses racoon(8). We authenticate the peer with certificate files. racoon(8) negotiates IPsec keys dynamically and installs them into the kernel. The IPsec secret keys change over time.
  • The configuration of IPsec is NOT EASY. There are way too many knobs to play with, and debugging is very hard due to the wiretap-resistant nature of IPsec. Basically, we can't guess what is going on from a packet trace. Try reading some books and standards documents/RFCs, hire consultants or whatever, before you try to configure it.
  • Always run tcpdump while you debug the network. Even though the traffic is encrypted, you can get some idea of whether the packets are really on the wire or not.
  • netstat(1) is your friend. Run netstat -sn and check the IPsec packet counters.
  • If you have trouble running racoon(8), try running it with maximum debugging output (command line argument -dddddd) and look at the output.
  • You really, really need to configure your NetBSD device exactly the same as the peer's device to make them interoperate. Your packets need to be generated using exactly the same protocol and encryption algorithm that the other end is expecting. If you fail to do so, you will experience very hard-to-track errors. In IPsec, encryption/authentication failures are modelled as packet drops, so configuration failures will cause your packets to be dropped on the floor with no error indication. tcpdump(8) will not help you much, since the content of the packets is no longer decipherable. Make very, very sure that your configuration matches the other end.
  • On slow machines, you may not be able to negotiate keys with the racoon IKE daemon, as IKE negotiation has to finish within the time given by the net.key.larval_lifetime sysctl MIB, which is 30 seconds by default. Try raising the value if you have a really slow machine.

Known issues

  • Tunnel mode AH does not work as you might expect, due to restrictions in the kernel IPsec policy engine. Do not try to use tunnel mode AH.
  • IPsec and npf(7) code do not play nicely together. See "Interaction with NPF" for details.
  • IPsec policy rules are not tested enough for explicit protocol specifications other than tcp/udp. Use protocol any (= address match only) if you would like to be on the safe side. The issue here is generic to any packet filter: normal packet filter descriptions do not play nicely with header chains.

Conformance to standard, interoperability

The KAME IPsec implementation (which is included in the NetBSD tree) conforms to the latest set of IPsec standards. KAME's NetBSD Implementation Note has a comprehensive list of the standards documents to which the implementation conforms.

Interoperability with other implementations has been confirmed on various occasions. KAME's NetBSD Implementation Note includes a list of implementations with which we have confirmed interoperability in the past. Note, however, that either side may change its code after interoperability tests, so they may no longer interoperate. It is also possible that a NetBSD device and a peer's device interoperate in certain configurations only.

If you try to configure a NetBSD device with another implementation, please note that IPsec specifications/implementations have too many knobs to play with. You need to configure your NetBSD device exactly the same as the peer's device to make them interoperate.

API compatibility with other IPsec stacks

If you write userland code that is aware of IPsec, you may become curious about API compatibility across IPsec platforms.

We have the RFC2367 PF_KEY API for manipulating the secret key database in the kernel. The basic portion of this API is available on other UNIX-based IPsec stacks as well, and may be compatible to a certain degree (for example, OpenBSD implements the PF_KEY API as well). The KAME IPsec stack extends this in certain ways, just like other parties do. The extended portion is not compatible with other (non-KAME) IPsec stacks.

There is no document that specifies an IPsec policy management API. Therefore, we can expect no compatibility with (non-KAME) IPsec stacks in the IPsec policy management API.

There is no standard for configuration file syntax. You will need to convert them if you would like to copy configuration from/to non-NetBSD IPsec devices.

Since NetBSD and FreeBSD share an IPsec codebase from the same origin (KAME), there is a good chance of API compatibility. Note, however, that there are differences between the NetBSD and FreeBSD IPsec code, since they merged in KAME code of different dates. As of writing, normal userland applications do not need to worry about the difference. However, if you plan to implement IPsec key management daemons, you will need to worry about differences in the PF_KEY API.

  • NetBSD 1.5 incorporates KAME IPsec stack of early June 2000.
  • FreeBSD 4.0-RELEASE incorporates KAME IPsec stack of early November 1999.
  • There is no difference in manual ipsec key configuration, kernel behavior on AH/ESP operation, or ipsec_set_policy(3) API.
  • There are differences in behavior of PF_KEY socket, libipsec API for PF_KEY wrapper functions and several other locations. The difference may bite you if you want to implement application that manipulates PF_KEY socket directly (i.e. IKE daemon like racoon(8) or key config program like setkey(8)).

During NetBSD-current development between NetBSD 1.4 and NetBSD 1.5, we imported the KAME IPsec portion three times. Those imports contain backward-incompatible changes in the API. Please make sure to use the latest code if you are on NetBSD-current between 1.4 and 1.5. With NetBSD 1.5 shipped, we will provide complete binary compatibility, or an API version number check, for the API present in NetBSD 1.5.

From https://www.netbsd.org/docs/network/ipsec/

--------

Troubleshooting IPsec VPNs

Due to the finicky nature of IPsec, it isn’t unusual for trouble to arise. Thankfully there are some basic (and some not so basic) troubleshooting steps that can be employed to track down potential problems.

IPsec Logging

Examples presented in this chapter have logs edited for brevity but significant messages remain.

Logging for IPsec may be configured to provide more useful information. To configure IPsec logging for diagnosing tunnel issues with pfSense®, the following procedure yields the best balance of information:

  • Navigate to VPN > IPsec on the Advanced Settings tab

  • Set IKE SA, IKE Child SA, and Configuration Backend to Diag

  • Set all other log settings to Control

  • Click Save

Note

Changing logging options is not disruptive to IPsec activity and there is no need to enter a specific “debug mode” for IPsec on current versions of pfSense.

Tunnel does not establish

First check the service status at Status > Services. If the IPsec service is stopped, double check that it is enabled at VPN > IPsec. Also, if using mobile clients, ensure that on the Mobile clients tab, the enable box is also checked.

If the service is running, check the firewall logs (Status > System Logs, Firewall tab) to see if the connection is being blocked, and if so, add a rule to allow the blocked traffic. Rules are normally added automatically for IPsec, but that feature can be disabled.

The single most common cause of failed IPsec tunnel connections is a configuration mismatch. Often it is something small, such as a DH group set to 1 on side A and 2 on side B, or perhaps a subnet mask of /24 on one side and /32 on the other. Some routers (Linksys, for one) also like to hide certain options behind “Advanced” buttons or make assumptions. A lot of trial and error may be involved, and a lot of log reading, but ensuring that both sides match precisely will help the most.

Depending on the Internet connections on either end of the tunnel, it is also possible that a router involved on one side or the other does not properly handle IPsec traffic. This is a larger concern with mobile clients, and networks where NAT is involved outside of the actual IPsec endpoints. The problems are generally with the ESP protocol being blocked or mishandled along the way. NAT Traversal (NAT-T) encapsulates ESP in UDP port 4500 traffic to work around these issues.

Tunnel establishes but no traffic passes

The top suspect if a tunnel comes up but won’t pass traffic is the IPsec firewall rules. If Site A cannot reach Site B, check the Site B firewall log and rules. Conversely, if Site B cannot contact Site A, check the Site A firewall log and rules. Before looking at the rules, inspect the firewall logs at Status > System Logs, on the Firewall tab. If blocked entries are present which involve the subnets used in the IPsec tunnel, then move on to checking the rules. If there are no log entries indicating blocked packets, revisit the section on IPsec routing considerations in Routing and gateway considerations.

Blocked packets on the IPsec or enc0 interface indicate that the tunnel itself has established but traffic is being blocked by firewall rules. Blocked packets on the LAN or other internal interface may indicate that an additional rule may be needed on that interface ruleset to allow traffic from the internal subnet out to the remote end of the IPsec tunnel. Blocked packets on WAN or OPT WAN interfaces would prevent a tunnel from establishing. Typically this only happens when the automatic VPN rules are disabled. Adding a rule to allow the ESP protocol and UDP port 500 from that remote IP address will allow the tunnel to establish. In the case of mobile tunnels, allow traffic from any source to connect to those ports.

Rules for the IPsec interface can be found under Firewall > Rules, on the IPsec tab. Common mistakes include setting a rule to only allow TCP traffic, which means things like ICMP ping and DNS would not work across the tunnel. See Firewall for more information on how to properly create and troubleshoot firewall rules.

In some cases it is possible that a setting mismatch can also cause traffic to fail passing the tunnel. In one instance, a subnet defined on one non-pfSense firewall was 192.0.2.1/24, and on the pfSense firewall it was 192.0.2.0/24. The tunnel established, but traffic would not pass until the subnet was corrected.
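
To check for this kind of mismatch, it can help to reduce each side's subnet definition to its canonical network address before comparing them. The following is a small sketch in POSIX shell arithmetic for IPv4; the function name network_of is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: network_of ADDR/PREFIXLEN prints the canonical network,
# i.e. the address with all host bits cleared.
network_of() {
    addr=${1%/*}
    len=${1#*/}
    # Split the dotted quad into $1..$4.
    old_ifs=$IFS; IFS=.
    set -- $addr
    IFS=$old_ifs
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    net=$(( ip & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$len"
}

network_of 192.0.2.1/24   # prints 192.0.2.0/24
```

If the value a device is configured with differs from what network_of prints for it, that definition carries host bits and is worth correcting so that both sides state the subnet identically.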

Routing issues are another possibility. Running a traceroute (tracert on Windows) to an IP address on the opposite side of the tunnel can help track down these types of problems. Repeat the test from both sides of the tunnel. Check the Routing and gateway considerations section in this chapter for more information. When using traceroute, traffic which enters and leaves the IPsec tunnel will seem to be missing some interim hops. This is normal, and part of how IPsec works. Traffic which does not properly enter an IPsec tunnel will appear to leave the WAN interface and route outward across the Internet, which would point to either a routing issue such as pfSense not being the gateway (as in Routing and gateway considerations), an incorrectly specified remote subnet on the tunnel definition, or to a tunnel which has been disabled.

Some hosts work, but not all

If traffic between some hosts over the VPN functions properly, but some hosts do not, this is commonly one of four things:

Missing, incorrect or ignored default gateway

If the device does not have a default gateway, or has one pointing to something other than the pfSense firewall, it does not know how to properly get back to the remote network on the VPN (see Routing and gateway considerations). Some devices, even with a default gateway specified, do not use that gateway. This has been seen on various embedded devices, including IP cameras and some printers. There isn’t anything that can be done about that other than getting the software on the device fixed. This can be verified by running a packet capture on the inside interface of the firewall connected to the network containing the device. Troubleshooting with tcpdump is covered in Examples of using tcpdump on the command line, and an IPsec-specific example can be found in IPsec tunnel will not connect. If traffic is observed leaving the inside interface of the firewall, but no replies return, the device is not properly routing its reply traffic or could potentially be blocking it via a local client firewall.

Incorrect subnet mask

If the subnet in use on one end is 10.0.0.0/24 and the other is 10.254.0.0/24, and a host has an incorrect subnet mask of 255.0.0.0 or /8, it will never be able to communicate across the VPN because it thinks the remote VPN subnet is part of the local network and hence routing will not function properly. The system with the broken configuration will attempt to contact the remote system via ARP instead of using the gateway.
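The effect of the wrong mask can be illustrated with a short sketch using Python's standard `ipaddress` module. The host addresses `10.0.0.50` and `10.254.0.20` are made up for illustration, and `treats_as_local` is a hypothetical helper, not part of any pfSense tooling.

```python
import ipaddress

def treats_as_local(host_cidr, remote_ip):
    """True if a host configured with host_cidr considers remote_ip on-link,
    meaning it would ARP for it instead of sending to its gateway."""
    local_net = ipaddress.ip_interface(host_cidr).network
    return ipaddress.ip_address(remote_ip) in local_net

# Correct /24 mask: the remote VPN host is off-link, so the gateway is used
print(treats_as_local("10.0.0.50/24", "10.254.0.20"))
# Incorrect /8 mask: the remote VPN host looks local, so the gateway is skipped
print(treats_as_local("10.0.0.50/8", "10.254.0.20"))
```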

Host firewall

If there is a firewall on the target host, it may not be allowing the connections. Check for things like Windows Firewall, iptables, or similar utilities that may be preventing the traffic from being processed by the host.

Firewall rules on pfSense

Ensure the rules on both ends allow the desired network traffic.

Connection Hangs

IPsec does not gracefully handle fragmented packets. Many of these issues have been resolved over the years, but there may be some lingering problems. If hangs or packet loss are seen only when using specific protocols (SMB, RDP, etc.), MSS clamping for the VPN may be necessary. MSS clamping can be activated under VPN > IPsec on the Advanced Settings tab. On that screen, check Enable MSS clamping on VPN traffic and then enter a value. A good starting point would be 1400, and if that works slowly increase the MSS value until the breaking point is hit, then back off a little from there.

“Random” Tunnel Disconnects/DPD Failures on Low-End Routers

If IPsec tunnels are dropped on low-end hardware that is pushing the limits of its CPU, DPD on the tunnel may need to be disabled. Such failures tend to correlate with times of high bandwidth usage. This happens when the CPU on a low-power system is tied up with sending IPsec traffic or is otherwise occupied. Due to the CPU overload it may not take the time to respond to DPD requests or see a response to a request of its own. As a consequence, the tunnel will fail a DPD check and be disconnected. This is a clear sign that the hardware is being driven beyond its capacity. If this happens, consider replacing the firewall with a more powerful model.

Tunnels Establish and Work but Fail to Renegotiate

In some cases a tunnel will function properly but once the phase 1 or phase 2 lifetime expires the tunnel will fail to renegotiate properly. This can manifest itself in a few different ways, each with a different resolution.

DPD Unsupported, One Side Drops but the Other Remains

Consider this scenario, which DPD is designed to prevent, but can happen in places where DPD is unsupported:

  • A tunnel is established from Site A to Site B, from traffic initiated at Site A.

  • Site B expires the phase 1 or phase 2 before Site A.

  • Site A will believe the tunnel is up and continue to send traffic as though the tunnel is working properly.

  • Only when Site A’s phase 1 or phase 2 lifetime expires will it renegotiate as expected.

In this scenario, the two likely resolutions are: Enable DPD, or Site B must send traffic to Site A which will cause the entire tunnel to renegotiate. The easiest way to make this happen is to enable a keep alive mechanism on both sides of the tunnel.

Tunnel Establishes When Initiating, but not When Responding

If a tunnel will establish sometimes, but not always, generally there is a mismatch on one side. The tunnel may still establish because if the settings presented by one side are more secure, the other may accept them, but not the other way around. For example if there is an Aggressive/Main mode mismatch on an IKEv1 tunnel and the side set for Main initiates, the tunnel will still establish. However, if the side set to Aggressive attempts to initiate the tunnel it will fail.

Lifetime mismatches do not cause a failure in Phase 1 or Phase 2.

To track down these failures, configure the logs as shown in IPsec Logging and attempt to initiate the tunnel from each side, then check the logs.

IPsec Log Interpretation

The IPsec logs available at Status > System Logs, on the IPsec tab contain a record of the tunnel connection process and some messages from ongoing tunnel maintenance activity. Some typical log entries are listed in this section, both good and bad. The main things to look for are key phrases that indicate which part of a connection worked. If “IKE_SA … established” is present in the log, that means phase 1 was completed successfully and a Security Association was negotiated. If “CHILD_SA … established” is present, then phase 2 has also been completed and the tunnel is up.
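The two key phrases described above can be checked mechanically when reading a long log. The following is a minimal sketch, assuming the log lines have been copied out of the IPsec tab into a list of strings. The function name `tunnel_phase_status` is hypothetical, and the sample lines follow the format of the log samples in this section.

```python
import re

# Patterns for the two key phrases: "IKE_SA ... established" and "CHILD_SA ... established"
IKE_SA_UP = re.compile(r"\bIKE_SA \S+ established\b")
CHILD_SA_UP = re.compile(r"\bCHILD_SA \S+ established\b")

def tunnel_phase_status(log_lines):
    """Report whether phase 1 and phase 2 completion messages appear in the log."""
    lines = list(log_lines)
    return {
        "phase1": any(IKE_SA_UP.search(line) for line in lines),
        "phase2": any(CHILD_SA_UP.search(line) for line in lines),
    }

sample = [
    "charon: 09[IKE] IKE_SA con2000[11] established between 192.0.2.90[192.0.2.90]...192.0.2.74[192.0.2.74]",
    "charon: 09[IKE] CHILD_SA con2000{2} established with SPIs cf4973bf_i c1cbfdf2_o and TS 192.168.48.0/24|/0 === 10.42.42.0/24|/0",
]
print(tunnel_phase_status(sample))
```

A result with phase 1 true and phase 2 false suggests looking at the Phase 2 settings first.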

In the following examples, the logs have been configured as listed in IPsec Logging and irrelevant messages may be omitted. Bear in mind that these are samples and the specific ID numbers, IP addresses, and so forth will vary.

Successful Connections

When a tunnel has been successfully established both sides will indicate that an IKE SA and a Child SA have been established. When multiple Phase 2 definitions are present with IKEv1, a child SA is negotiated for each Phase 2 entry.

Log output from the initiator:

charon: 09[IKE] IKE_SA con2000[11] established between 192.0.2.90[192.0.2.90]...192.0.2.74[192.0.2.74]
charon: 09[IKE] CHILD_SA con2000{2} established with SPIs cf4973bf_i c1cbfdf2_o and TS 192.168.48.0/24|/0 === 10.42.42.0/24|/0

Log output from the responder:

charon: 03[IKE] IKE_SA con1000[19] established between 192.0.2.74[192.0.2.74]...192.0.2.90[192.0.2.90]
charon: 16[IKE] CHILD_SA con1000{1} established with SPIs c1cbfdf2_i cf4973bf_o and TS 10.42.42.0/24|/0 === 192.168.48.0/24|/0

Failed Connection Examples

These examples show failed connections for varying reasons. In most cases, the initiator does not receive messages about the specific items that do not match, so the responder logs are much more informative. This is done to protect the security of the tunnel: it would be insecure to give a potential attacker information about how the tunnel is configured.
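Because the responder logs carry the useful detail, one possible approach is to scan them for the key phrases that appear in the samples in this section. The sketch below is a troubleshooting aid only: the phrase-to-cause table is assembled from the examples in this section, the names `KEY_PHRASES` and `likely_causes` are hypothetical, and a phrase such as "decryption failed?" can have other causes in practice.

```python
# Phrases taken from the responder log samples in this section.
# The cause labels are this sketch's own wording, not strongSwan output.
KEY_PHRASES = [
    ("Aggressive Mode PSK disabled", "Phase 1 Main/Aggressive mode mismatch"),
    ("no peer config found", "Phase 1 identifier mismatch"),
    ("decryption failed?", "Phase 1 pre-shared key mismatch"),
    ("no proposal found", "Phase 1 proposal mismatch (encryption, hash, or DH group)"),
    ("no matching CHILD_SA config found", "Phase 2 network mismatch"),
    ("no acceptable ENCRYPTION_ALGORITHM found", "Phase 2 encryption algorithm mismatch"),
    ("no acceptable INTEGRITY_ALGORITHM found", "Phase 2 hash algorithm mismatch"),
    ("no acceptable DIFFIE_HELLMAN_GROUP found", "Phase 2 PFS mismatch"),
    ("does not match to", "Identifier mismatch, possibly due to NAT"),
]

def likely_causes(log_lines):
    """Return the likely causes suggested by key phrases, in order of appearance."""
    causes = []
    for line in log_lines:
        for phrase, cause in KEY_PHRASES:
            if phrase in line and cause not in causes:
                causes.append(cause)
    return causes

responder_log = [
    "charon: 11[CFG] selecting proposal:",
    "charon: 11[CFG] no acceptable INTEGRITY_ALGORITHM found",
    "charon: 11[IKE] no matching proposal found, sending NO_PROPOSAL_CHOSEN",
]
print(likely_causes(responder_log))
```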

Phase 1 Main / Aggressive Mismatch

In this example, the initiator is set for Aggressive mode while the responder is set for Main mode.

Log output from the initiator:

charon: 15[IKE] initiating Aggressive Mode IKE_SA con2000[1] to 203.0.113.5
charon: 15[IKE] received AUTHENTICATION_FAILED error notify
charon: 13[ENC] parsed INFORMATIONAL_V1 request 1215317906 [ N(AUTH_FAILED) ]
charon: 13[IKE] received AUTHENTICATION_FAILED error notify

Log output from the responder:

charon: 13[IKE] Aggressive Mode PSK disabled for security reasons
charon: 13[ENC] generating INFORMATIONAL_V1 request 2940146627 [ N(AUTH_FAILED) ]

Note that the logs on the responder state clearly that Aggressive mode is disabled, which is a good clue that the mode is mismatched.

In the reverse case, if the side set for Main mode initiates, the tunnel to a pfSense firewall will establish since Main mode is more secure.

Phase 1 Identifier Mismatch

When the identifier does not match, the initiator only shows that the authentication failed, but does not give a reason. The responder states that it is unable to locate a peer, which indicates that it could not find a matching Phase 1, which implies that no matching identifier could be located.

Log output from the initiator:

charon: 10[ENC] parsed INFORMATIONAL_V1 request 4216246776 [ HASH N(AUTH_FAILED) ]
charon: 10[IKE] received AUTHENTICATION_FAILED error notify

Log output from the responder:

charon: 12[CFG] looking for pre-shared key peer configs matching 203.0.113.5...198.51.100.3[someid]
charon: 12[IKE] no peer config found
charon: 12[ENC] generating INFORMATIONAL_V1 request 4216246776 [ HASH N(AUTH_FAILED) ]

Phase 1 Pre-Shared Key Mismatch

A mismatched pre-shared key can be tough to diagnose. No error stating that the key is mismatched is printed in the log. Instead, these messages are shown:

Log output from the initiator:

charon: 09[ENC] invalid HASH_V1 payload length, decryption failed?
charon: 09[ENC] could not decrypt payloads
charon: 09[IKE] message parsing failed

Log output from the responder:

charon: 09[ENC] invalid ID_V1 payload length, decryption failed?
charon: 09[ENC] could not decrypt payloads
charon: 09[IKE] message parsing failed

When the above log messages are present, check the Pre-Shared Key value on both sides to ensure they match.

Phase 1 Encryption Algorithm Mismatch

Log output from the initiator:

charon: 14[ENC] parsed INFORMATIONAL_V1 request 3851683074 [ N(NO_PROP) ]
charon: 14[IKE] received NO_PROPOSAL_CHOSEN error notify

Log output from the responder:

charon: 14[CFG] received proposals: IKE:AES_CBC_128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024
charon: 14[CFG] configured proposals: IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024
charon: 14[IKE] no proposal found
charon: 14[ENC] generating INFORMATIONAL_V1 request 3851683074 [ N(NO_PROP) ]

In this case, the log entry shows the problem exactly: The initiator was set for AES 128 encryption, and the responder is set for AES 256. Set both to matching values and then try again.
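Comparing the "received proposals" and "configured proposals" values by eye gets harder as proposals grow longer. A minimal sketch of one way to diff them follows. It assumes each log line carries a single proposal in the slash-separated form shown above, and `proposal_diff` is a hypothetical helper name.

```python
def proposal_diff(received, configured):
    """Return (transforms only in received, transforms only in configured)."""
    def transforms(proposal):
        # Drop the protocol prefix ("IKE:" or "ESP:") and split the rest on "/"
        _, _, body = proposal.partition(":")
        return set(body.split("/"))

    recv, conf = transforms(received), transforms(configured)
    return sorted(recv - conf), sorted(conf - recv)

received = "IKE:AES_CBC_128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024"
configured = "IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024"
print(proposal_diff(received, configured))
```

Anything listed on either side of the result is a setting to reconcile between the two ends of the tunnel.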

Phase 1 Hash Algorithm Mismatch

Log output from the initiator:

charon: 10[ENC] parsed INFORMATIONAL_V1 request 2774552374 [ N(NO_PROP) ]
charon: 10[IKE] received NO_PROPOSAL_CHOSEN error notify

Log output from the responder:

charon: 14[CFG] received proposals: IKE:AES_CBC_256/MODP_1024
charon: 14[CFG] configured proposals: IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024
charon: 14[IKE] no proposal found
charon: 14[ENC] generating INFORMATIONAL_V1 request 2774552374 [ N(NO_PROP) ]

The Hash Algorithm is indicated by the HMAC portion of the logged proposals. As can be seen above, the received and configured proposals do not have matching HMAC entries.

Phase 1 DH Group Mismatch

Log output from the initiator:

charon: 11[ENC] parsed INFORMATIONAL_V1 request 316473468 [ N(NO_PROP) ]
charon: 11[IKE] received NO_PROPOSAL_CHOSEN error notify

Log output from the responder:

charon: 14[CFG] received proposals: IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_8192
charon: 14[CFG] configured proposals: IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024
charon: 14[IKE] no proposal found
charon: 14[ENC] generating INFORMATIONAL_V1 request 316473468 [ N(NO_PROP) ]

DH group is indicated by the “MODP” portion of the listed proposal. As indicated by the log messages, the initiator was set for 8192 (Group 18) and the responder was set for 1024 (Group 2). This error can be corrected by setting the DH group setting on both ends of the tunnel to a matching value.
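Since the logs show the MODP bit length while the configuration screens typically show a group number, a small lookup can help translate between them. The sketch below covers only the classic MODP groups defined in RFC 2409 and RFC 3526, and the names `MODP_TO_DH_GROUP` and `dh_group_from_proposal` are hypothetical.

```python
import re

# MODP bit length -> DH group number (RFC 2409 and RFC 3526 groups only)
MODP_TO_DH_GROUP = {
    768: 1, 1024: 2, 1536: 5, 2048: 14,
    3072: 15, 4096: 16, 6144: 17, 8192: 18,
}

def dh_group_from_proposal(proposal):
    """Return the DH group number for a logged proposal, or None if it has no plain MODP entry."""
    for transform in proposal.split("/"):
        match = re.fullmatch(r"MODP_(\d+)", transform)
        if match:
            return MODP_TO_DH_GROUP.get(int(match.group(1)))
    return None

print(dh_group_from_proposal("IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_8192"))
print(dh_group_from_proposal("IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024"))
```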

Phase 2 Network Mismatch

In the following example, the Phase 2 entry on the initiator side is set for 10.3.0.0/24 to 10.5.0.0/24. The responder is not set to match as it lists 10.5.1.0/24 instead.

Log output from the initiator:

charon: 08[CFG] proposing traffic selectors for us:
charon: 08[CFG] 10.3.0.0/24|/0
charon: 08[CFG] proposing traffic selectors for other:
charon: 08[CFG] 10.5.0.0/24|/0
charon: 08[ENC] generating QUICK_MODE request 316948142 [ HASH SA No ID ID ]
charon: 08[NET] sending packet: from 198.51.100.3[500] to 203.0.113.5[500] (236 bytes)
charon: 08[NET] received packet: from 203.0.113.5[500] to 198.51.100.3[500] (76 bytes)
charon: 08[ENC] parsed INFORMATIONAL_V1 request 460353720 [ HASH N(INVAL_ID) ]
charon: 08[IKE] received INVALID_ID_INFORMATION error notify

Log output from the responder:

charon: 08[ENC] parsed QUICK_MODE request 2732380262 [ HASH SA No ID ID ]
charon: 08[CFG] looking for a child config for 10.5.0.0/24|/0 === 10.3.0.0/24|/0
charon: 08[CFG] proposing traffic selectors for us:
charon: 08[CFG] 10.5.1.0/24|/0
charon: 08[CFG] proposing traffic selectors for other:
charon: 08[CFG] 10.3.0.0/24|/0
charon: 08[IKE] no matching CHILD_SA config found
charon: 08[IKE] queueing INFORMATIONAL task
charon: 08[IKE] activating new tasks
charon: 08[IKE] activating INFORMATIONAL task
charon: 08[ENC] generating INFORMATIONAL_V1 request 1136605099 [ HASH N(INVAL_ID) ]

The responder logs list both the networks received from the initiator (“child config” line in the log) and what the responder has configured locally (“proposing traffic selectors for…” lines in the log). By comparing the two, a mismatch can be spotted. The “no matching CHILD_SA config found” line in the log will always be present when this mismatch occurs, and that directly indicates that it could not find a Phase 2 definition to match what it received from the initiator.
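The comparison described above can be sketched in a few lines. This assumes the "child config" line has the format shown in the responder log, and it uses an exact string comparison of networks, which is a simplification of how a daemon actually matches traffic selectors. The function names are hypothetical.

```python
import re

def requested_selectors(line):
    """Extract (local, remote) networks from a 'looking for a child config for' line."""
    match = re.search(r"child config for (\S+) === (\S+)", line)
    if match is None:
        return None
    # Drop the "|/0" suffix shown in the logs, keeping only the network
    return tuple(selector.split("|")[0] for selector in match.groups())

def mismatched_sides(requested, configured):
    """Return which sides ('local', 'remote') differ between request and config."""
    sides = ("local", "remote")
    return [side for side, req, conf in zip(sides, requested, configured) if req != conf]

line = "charon: 08[CFG] looking for a child config for 10.5.0.0/24|/0 === 10.3.0.0/24|/0"
requested = requested_selectors(line)
configured = ("10.5.1.0/24", "10.3.0.0/24")  # from the responder's "proposing traffic selectors" lines
print(requested, mismatched_sides(requested, configured))
```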

Phase 2 Encryption Algorithm Mismatch

Log output from the initiator:

charon: 14[CFG] configured proposals: ESP:AES_CBC_128/HMAC_SHA1_96/NO_EXT_SEQ
charon: 14[ENC] generating QUICK_MODE request 759760112 [ HASH SA No ID ID ]
charon: 14[NET] sending packet: from 198.51.100.3[500] to 203.0.113.5[500] (188 bytes)
charon: 14[NET] received packet: from 203.0.113.5[500] to 198.51.100.3[500] (76 bytes)
charon: 14[ENC] parsed INFORMATIONAL_V1 request 1275272345 [ HASH N(NO_PROP) ]
charon: 14[IKE] received NO_PROPOSAL_CHOSEN error notify

Log output from the responder:

charon: 13[CFG] selecting proposal:
charon: 13[CFG] no acceptable ENCRYPTION_ALGORITHM found
charon: 13[CFG] received proposals: ESP:AES_CBC_128/HMAC_SHA1_96/NO_EXT_SEQ
charon: 13[CFG] configured proposals: ESP:AES_CBC_256/HMAC_SHA1_96/NO_EXT_SEQ
charon: 13[IKE] no matching proposal found, sending NO_PROPOSAL_CHOSEN
charon: 13[IKE] queueing INFORMATIONAL task
charon: 13[IKE] activating new tasks
charon: 13[IKE] activating INFORMATIONAL task
charon: 13[ENC] generating INFORMATIONAL_V1 request 1275272345 [ HASH N(NO_PROP) ]

In this case, the initiator receives a message that the responder could not find a suitable proposal (“received NO_PROPOSAL_CHOSEN”), and from the responder logs it is obvious this was due to the sites being set for different encryption types, AES 128 on one side and AES 256 on the other.

Phase 2 Hash Algorithm Mismatch

Log output from the initiator:

charon: 10[CFG] configured proposals: ESP:AES_CBC_256/HMAC_SHA2_512_256/NO_EXT_SEQ
charon: 10[ENC] generating QUICK_MODE request 2648029707 [ HASH SA No ID ID ]
charon: 10[NET] sending packet: from 198.51.100.3[500] to 203.0.113.5[500] (188 bytes)
charon: 10[NET] received packet: from 203.0.113.5[500] to 198.51.100.3[500] (76 bytes)
charon: 10[ENC] parsed INFORMATIONAL_V1 request 757918402 [ HASH N(NO_PROP) ]
charon: 10[IKE] received NO_PROPOSAL_CHOSEN error notify

Log output from the responder:

charon: 11[CFG] selecting proposal:
charon: 11[CFG] no acceptable INTEGRITY_ALGORITHM found
charon: 11[CFG] received proposals: ESP:AES_CBC_256/HMAC_SHA2_512_256/NO_EXT_SEQ
charon: 11[CFG] configured proposals: ESP:AES_CBC_256/HMAC_SHA1_96/NO_EXT_SEQ
charon: 11[IKE] no matching proposal found, sending NO_PROPOSAL_CHOSEN
charon: 11[IKE] queueing INFORMATIONAL task
charon: 11[IKE] activating new tasks
charon: 11[IKE] activating INFORMATIONAL task
charon: 11[ENC] generating INFORMATIONAL_V1 request 757918402 [ HASH N(NO_PROP) ]

Similar to a Phase 1 Hash Algorithm mismatch, the HMAC values in the log entries do not line up. However the responder also logs a clearer message “no acceptable INTEGRITY_ALGORITHM found” when this happens in Phase 2.

Phase 2 PFS Mismatch

Log output from the initiator:

charon: 06[ENC] generating QUICK_MODE request 909980434 [ HASH SA No KE ID ID ]
charon: 06[NET] sending packet: from 198.51.100.3[500] to 203.0.113.5[500] (444 bytes)
charon: 06[NET] received packet: from 203.0.113.5[500] to 198.51.100.3[500] (76 bytes)
charon: 06[ENC] parsed INFORMATIONAL_V1 request 3861985833 [ HASH N(NO_PROP) ]
charon: 06[IKE] received NO_PROPOSAL_CHOSEN error notify

Log output from the responder:

charon: 08[CFG] selecting proposal:
charon: 08[CFG] no acceptable DIFFIE_HELLMAN_GROUP found
charon: 08[CFG] received proposals: ESP:AES_CBC_256/HMAC_SHA1_96/MODP_2048/NO_EXT_SEQ
charon: 08[CFG] configured proposals: ESP:AES_CBC_256/HMAC_SHA1_96/NO_EXT_SEQ
charon: 08[IKE] no matching proposal found, sending NO_PROPOSAL_CHOSEN
charon: 08[ENC] generating INFORMATIONAL_V1 request 3861985833 [ HASH N(NO_PROP) ]

Perfect Forward Secrecy (PFS) works like DH Groups on Phase 1, but is optional. When chosen PFS options do not match, a clear message is logged indicating this fact: “no acceptable DIFFIE_HELLMAN_GROUP found”.

Note

In some cases, if one side has PFS set to off, and the other side has a value set, the tunnel may still establish and work. The mismatch shown above may only be seen if the values mismatch, for example 1 vs. 5.

Mismatched Identifier with NAT

In this case, pfSense is configured for a Peer Identifier of Peer IP address, but the remote device is actually behind NAT. strongSwan then expects the actual private before-NAT IP address as the identifier. The racoon daemon used on older versions was much more relaxed and would match either address, but strongSwan is more formal and requires a correct match.

Log output from the responder:

   charon: 10[IKE] remote host is behind NAT
   charon: 10[IKE] IDir '192.0.2.10' does not match to '203.0.113.245'
[...]
   charon: 10[CFG] looking for pre-shared key peer configs matching 198.51.100.50...203.0.113.245[192.0.2.10]

To correct this condition, change the Peer Identifier setting to IP Address and then enter the pre-NAT IP address, which in this example is 192.0.2.10.

Disappearing Traffic

If IPsec traffic arrives but never appears on the IPsec interface (enc0), check for conflicting routes/interface IP addresses. For example, if an IPsec tunnel is configured with a remote network of 192.0.2.0/24 and there is a local OpenVPN server with a tunnel network of 192.0.2.0/24 then the ESP traffic may arrive, strongSwan may process the packets, but they never show up on enc0 as arriving to the OS for delivery.
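Conflicts of this kind can be caught ahead of time by checking the configured networks against each other. The following is a minimal sketch using Python's standard `ipaddress` module. The network labels are descriptive names chosen for illustration, the LAN entry is hypothetical, and `find_overlaps` is not part of pfSense.

```python
import ipaddress
from itertools import combinations

def find_overlaps(named_networks):
    """Return pairs of names whose networks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_networks.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]

networks = {
    "IPsec remote network": "192.0.2.0/24",
    "OpenVPN tunnel network": "192.0.2.0/24",
    "LAN": "10.0.0.0/24",  # hypothetical, for contrast
}
print(find_overlaps(networks))
```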

Resolve the duplicate interface/route and the traffic will begin to flow.

IPsec Status Page Issues

If the IPsec status page prints errors such as:

Warning: Illegal string offset 'type' in /etc/inc/xmlreader.inc on line 116

That is a sign that the incomplete xmlreader XML parser is active, which is triggered by the presence of the file /cf/conf/use_xmlreader. This alternate parser can be faster for reading large config.xml files, but lacks certain features necessary for other areas to function well. Removing /cf/conf/use_xmlreader will return the system to the default parser immediately, which will correct the display of the IPsec status page.

From https://docs.netgate.com/pfsense/en/latest/troubleshooting/ipsec.html

--------------

Related posts:

https://briteming.blogspot.com/2015/07/strongswanfreeradiusikev2-vpn.html
