Before the openvswitch module in the korg (kernel.org) kernel supported GRE, we configured GRE with the kernel's native GRE tunnel; now the openvswitch module in the korg kernel supports GRE tunnels as well. If you are interested, take a look at the two commits openvswitch: Add tunneling interface. and openvswitch: Add gre tunnel support.
Adding GRE to OVS is actually quite simple: the operations on the GRE header and the outer IP header were abstracted out of the existing code into kernel "library functions" that OVS can call directly. The hard part was extracting those "library functions" from the old ip_gre module code; see GRE: Refactor GRE tunneling code. for details.
Note that an OVS GRE tunnel does not register a network device, which means you cannot see it with ip link; it is just a vport, so it only shows up in ovs-vsctl show. This is by design, and although it simplifies things for the user, it can feel a little odd when you first notice it.
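A quick way to see this for yourself, once a GRE vport has been added (gre1 in the configuration below):
ip link show gre1
ovs-vsctl list interface gre1
The first command reports that the device does not exist, because no net device is registered; the second shows the vport, including its type=gre and options:remote_ip settings, because the port lives only in the OVS database.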
The most popular online tutorial on configuring an OVS GRE tunnel is this article; following it, I ended up with the configuration below:
ovs-vsctl add-br grebr0
ovs-vsctl add-br phybr0
ovs-vsctl add-port phybr0 p1p1
ovs-vsctl add-port phybr0 tep0 -- set Interface tep0 type=internal
ifconfig tep0 192.168.88.1/24
ifconfig p1p1 0.0.0.0
ovs-vsctl add-port grebr0 vnet0
ovs-vsctl add-port grebr0 gre1 -- set Interface gre1 type=gre options:remote_ip=192.168.88.2
But if you look at it more closely, there is really no need for two bridges: packets going through gre1 can enter p1p1, the physical NIC, directly. The simplified configuration is:
ovs-vsctl add-br grebr0
ifconfig p1p1 192.168.88.1/24
ovs-vsctl add-port grebr0 vnet0
ovs-vsctl add-port grebr0 gre1 -- set Interface gre1 type=gre options:remote_ip=192.168.88.2
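For completeness, the other host is configured symmetrically; a sketch, assuming its physical NIC is also named p1p1 and its VM tap device is vnet0 (adjust the names to your setup):
ovs-vsctl add-br grebr0
ifconfig p1p1 192.168.88.2/24
ovs-vsctl add-port grebr0 vnet0
ovs-vsctl add-port grebr0 gre1 -- set Interface gre1 type=gre options:remote_ip=192.168.88.1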
Packets that go through the GRE tunnel are re-injected into the network stack, so they flow directly to p1p1 and eventually out onto the physical layer.
Note that we are not done yet. With this configuration you can already ping the VM on the other host, but if you run a netperf test you will see that the throughput is very low. This is something the online tutorials do not mention.
The reason is that many of the packets coming out of vnet0 are already MTU-sized (1500 in my case). After going through the GRE tunnel, a GRE header and an outer IP header are added, so the packets become larger than 1500, while the physical NIC's MTU is also 1500. Moreover, these packets are not GSO packets, so they end up being fragmented by the IP layer, and performance suffers badly.
There are two ways to fix this (example commands follow the list):
1) Lower the MTU of the NIC inside the VM, for example to 1400, so that even after the host adds the GRE and outer IP headers the packet does not exceed 1500;
2) Keep the packets coming out of the VM as GSO packets, so that the packets received on the host are GSO as well; they will then be segmented rather than fragmented. This can be done by passing vnet_hdr=on to qemu (I have not tried this; I only analyzed the source code).
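For the first option, a minimal sketch inside the guest (assuming its interface is named eth0; 1400 leaves headroom for the GRE and outer IP headers, which add at least 24 bytes):
ip link set dev eth0 mtu 1400
Put the same value in the guest's normal network configuration if you want it to survive a reboot.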
For further discussion of this issue, see this question.
------------
Using GRE Tunnels with Open vSwitch
I’m back with another “how to” article on Open vSwitch (OVS), this time taking a look at using GRE (Generic Routing Encapsulation) tunnels with OVS. OVS can use GRE tunnels between hosts as a way of encapsulating traffic and creating an overlay network. OpenStack Quantum can (and does) leverage this functionality, in fact, to help separate different “tenant networks” from one another. In this write-up, I’ll walk you through the process of configuring OVS to build a GRE tunnel to build an overlay network between two hypervisors running KVM.
Naturally, any sort of “how to” such as this always builds upon the work of others. In particular, I found a couple of Brent Salisbury’s articles (here and here) especially useful.
This process has 3 basic steps:
- Create an isolated bridge for VM connectivity.
- Create a GRE tunnel endpoint on each hypervisor.
- Add a GRE interface and establish the GRE tunnel.
These steps assume that you’ve already installed OVS on your Linux distribution of choice. I haven’t explicitly done a write-up on this, but there are numerous posts from a variety of authors (in this regard, Google is your friend).
We’ll start with an overview of the topology, then we’ll jump into the specific configuration steps.
Reviewing the Topology
The graphic below shows the basic topology of what we have going on here:
We have two hypervisors (CentOS 6.3 and KVM, in my case), both running OVS (an older version, 1.7.1). Each hypervisor has one OVS bridge with at least one physical interface attached to it (shown as br0 connected to eth0 in the diagram). As part of this process, you'll create the other internal interfaces (the tep and gre interfaces), as well as the second, isolated bridge to which the VMs will connect. You'll then create a GRE tunnel between the hypervisors and test VM-to-VM connectivity.
Creating an Isolated Bridge
The first step is to create the isolated OVS bridge to which the VMs will connect. I call this an “isolated bridge” because the bridge has no physical interfaces attached. (Side note: this idea of an isolated bridge is fairly common in OpenStack and NVP environments, where it’s usually called the integration bridge. The concept is the same.)
The command is very simple, actually:
ovs-vsctl add-br br2
Yes, that's it. Feel free to substitute a different name for br2 in the command above, if you like, but just make note of the name as you'll need it later.
To make things easier for myself, once I’d created the isolated bridge I then created a libvirt network for it so that it was dead-easy to attach VMs to this new isolated bridge.
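For reference, a rough sketch of what such a libvirt network definition can look like (ovs-isolated is an arbitrary name of my own, br2 is the isolated bridge, and this assumes a libvirt build with Open vSwitch support):
cat > ovs-isolated.xml <<'EOF'
<network>
  <name>ovs-isolated</name>
  <forward mode='bridge'/>
  <bridge name='br2'/>
  <virtualport type='openvswitch'/>
</network>
EOF
virsh net-define ovs-isolated.xml
virsh net-start ovs-isolated
virsh net-autostart ovs-isolated
VMs attached to this libvirt network then show up as ports on br2.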
Configuring the GRE Tunnel Endpoint
The GRE tunnel endpoint is an interface on each hypervisor that will, as the name implies, serve as the endpoint for the GRE tunnel. My purpose in creating a separate GRE tunnel endpoint is to separate hypervisor management traffic from GRE traffic, thus allowing for an architecture that might leverage a separate management network (which is typically considered a recommended practice).
To create the GRE tunnel endpoint, I’m going to use the same technique I described in my post on running host management traffic through OVS. Specifically, we’ll create an internal interface and assign it an IP address.
To create the internal interface, use this command:
ovs-vsctl add-port br0 tep0 -- set interface tep0 type=internal
In your environment, you'll substitute br2 with the name of the isolated bridge you created earlier. You could also use a different name than tep0. Since this name is essentially for human consumption only, use what makes sense to you; since this is a tunnel endpoint, tep0 made sense to me.
Once the internal interface is established, assign it an IP address using ifconfig or ip, whichever you prefer. I'm still getting used to using ip (more on that in a future post, most likely), so I tend to use ifconfig, like this:
ifconfig tep0 192.168.200.20 netmask 255.255.255.0
Obviously, you’ll want to use an IP addressing scheme that makes sense for your environment. One important note: don’t use the same subnet as you’ve assigned to other interfaces on the hypervisor, or else you can’t control that the GRE tunnel will originate (or terminate) on the interface you specify. This is because the Linux routing table on the hypervisor will control how the traffic is routed. (You could use source routing, a topic I plan to discuss in a future post, but that’s beyond the scope of this article.)
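As an aside, the iproute2 equivalent of the ifconfig command above, plus a quick check of which interface the tunnel traffic will use, might look like this (192.168.200.21 is a made-up placeholder for the other hypervisor's tunnel endpoint):
ip addr add 192.168.200.20/24 dev tep0
ip link set tep0 up
ip route get 192.168.200.21
The last command should show the route leaving via tep0 with 192.168.200.20 as the source; if it picks a different interface, revisit your addressing.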
Repeat this process on the other hypervisor, and be sure to make note of the IP addresses assigned to the GRE tunnel endpoint on each hypervisor; you'll need those addresses shortly. Once you've established the GRE tunnel endpoint on each hypervisor, test connectivity between the endpoints using ping or a similar tool. If connectivity is good, you're clear to proceed; if not, you'll need to resolve that before moving on.
Establishing the GRE Tunnel
By this point, you’ve created the isolated bridge, established the GRE tunnel endpoints, and tested connectivity between those endpoints. You’re now ready to establish the GRE tunnel.
Use this command to add a GRE interface to the isolated bridge on each hypervisor:
ovs-vsctl add-port br2 gre0 -- set interface gre0 type=gre \
options:remote_ip=<GRE tunnel endpoint on other hypervisor>
Substitute the name of the isolated bridge you created earlier here for br2, and feel free to use something other than gre0 for the interface name. I think using gre as the base name for the GRE interfaces makes sense, but run with what makes sense to you.
Once you repeat this command on both hypervisors, the GRE tunnel should be up and running. (Troubleshooting the GRE tunnel is one area where my knowledge is weak; anyone have any suggestions or commands that we can use here?)
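For what it's worth, a few generic starting points (a sketch; eth0 stands in for whatever physical uplink carries the encapsulated traffic):
ovs-vsctl show
ovs-ofctl dump-ports br2
tcpdump -ni eth0 ip proto 47
ovs-vsctl show confirms the gre0 port and its remote_ip on each side, ovs-ofctl dump-ports gives per-port packet counters on the isolated bridge, and the tcpdump filter (GRE is IP protocol 47) shows whether encapsulated packets are actually leaving and arriving on the wire.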
Testing VM Connectivity
As part of this process, I spun up an Ubuntu 12.04 server image on each hypervisor (using virt-install as I outlined here), attached each VM to the isolated bridge created earlier on that hypervisor, and assigned each VM an IP address from an entirely different subnet than the physical network was using (in this case, 10.10.10.x).
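In case it's useful, assigning the overlay address inside a guest can be as simple as this (assuming the guest interface is eth0; shown for the first VM):
ip addr add 10.10.10.1/24 dev eth0
ip link set eth0 up
For anything longer-lived you'd put this in the guest's normal network configuration instead.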
Here's the output of the route -n command on the Ubuntu guest, to show that it has no knowledge of the "external" IP subnet; it knows only about its own interfaces:
ubuntu:~ root$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.10.10.254 0.0.0.0 UG 100 0 0 eth0
10.10.10.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
Similarly, here's the output of the route -n command on the CentOS host, showing that it has no knowledge of the guest's IP subnet:
centos:~ root$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.2.0 0.0.0.0 255.255.255.0 U 0 0 0 tep0
192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 mgmt0
0.0.0.0 192.168.1.254 0.0.0.0 UG 0 0 0 mgmt0
In my case, VM1 (named web01) was given 10.10.10.1; VM2 (named web02) was given 10.10.10.2. Once I went through the steps outlined above, I was able to successfully ping VM2 from VM1, as you can see in this screenshot:
(Although it’s not shown here, connectivity from VM2 to VM1 was obviously successful as well.)
“OK, that’s cool, but why do I care?” you might ask.
In this particular context, it’s a bit of a science experiment. However, if you take a step back and begin to look at the bigger picture, then (hopefully) something starts to emerge:
- We can use an encapsulation protocol (GRE in this case, but it could have just as easily been STT or VXLAN) to isolate VM traffic from the physical network and from other VM traffic. (Think multi-tenancy.)
- While this process was manual, think about some sort of controller (an OpenFlow controller, perhaps?) that could help automate this process based on its knowledge of the VM topology.
- Using a virtualized router or virtualized firewall, I could easily provide connectivity into or out of this isolated (encapsulated) private network. (This is probably something I’ll experiment with later.)
- What if we wrapped some sort of orchestration framework around this, to help deploy VMs, create networks, add routers/firewalls automatically, all based on the customer’s needs? (OpenStack Networking, anyone?)
Anyway, I hope this is helpful to someone. As always, I welcome feedback and suggestions for improvement, so feel free to speak up in the comments below. Vendor disclosures, where appropriate, are greatly appreciated. Thanks!
from http://blog.scottlowe.org/2013/05/07/using-gre-tunnels-with-open-vswitch/
------------------------------------------------------------------------
eth0---eth3--------------------eth3---eth0
Option 1: GRE VPN (between Linux hosts)
VPN server (201.1.2.5); the steps on both ends are almost identical.
1) Enable the GRE module
#ping 201.1.2.10
#lsmod //list the loaded kernel modules
#lsmod | grep ip_gre //check whether the ip_gre module is already loaded
#modprobe ip_gre //load the module
#modinfo ip_gre //show information about the module
2) Create the VPN tunnel
#ip tunnel add tun0 mode gre remote 201.1.2.10 local 201.1.2.5
//ip tunnel add creates the tunnel (named tun0 here); ip tunnel help shows the usage
//mode sets the tunnel to GRE mode
//local is followed by this host's IP address, remote by the IP address of the peer host the tunnel is built with
3) Bring the tunnel up (just like bringing a NIC up)
#ip link show
#ip link set tun0 up
#ip link show
4) Assign an IP address to the VPN tunnel
#ip addr add 10.10.10.5/24 peer 10.10.10.10/24 dev tun0
#ip addr show
#echo "1" > /proc/sys/net/ipv4/ip_forward //开启路由转发
Client (201.1.2.10):
#lsmod | grep ip_gre
#modprobe ip_gre
#ip tunnel add tun0 mode gre remote 201.1.2.5 local 201.1.2.10
#ip link show
#ip link set tun0 up
#ip link show
#ip addr add 10.10.10.10/24 peer 10.10.10.5/24 dev tun0
#ip addr show
Test connectivity:
#ping 10.10.10.5
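Beyond ping, the tunnel itself can be inspected on either end with standard iproute2 commands (not part of the original write-up):
#ip tunnel show tun0
//shows the gre mode plus the local and remote addresses
#ip -s link show tun0
//per-interface packet counters, to confirm traffic really goes through tun0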
from https://blog.csdn.net/weixin_41619143/article/details/88530758