
Tuesday, 27 February 2018

Goodbye, Deep Packet Inspection (GoodbyeDPI)

GoodbyeDPI—Passive Deep Packet Inspection blocker and Active DPI circumvention utility (for Windows) 

This software is designed to bypass the Deep Packet Inspection systems that many Internet Service Providers use to block access to certain websites.
It handles DPI connected through an optical splitter or port mirroring (Passive DPI), which does not block any data but simply replies faster than the requested destination, and Active DPI connected in sequence.
Windows 7, 8, 8.1 or 10 with administrator privileges is required.

How to use

Download the latest version from the Releases page and run it.
Usage: goodbyedpi.exe [OPTION...]
 -p          block passive DPI
 -r          replace Host with hoSt
 -s          remove space between host header and its value
 -m          mix Host header case (test.com -> tEsT.cOm)
 -f [value]  set HTTP fragmentation to value
 -k [value]  enable HTTP persistent (keep-alive) fragmentation and set it to value
 -n          do not wait for first segment ACK when -k is enabled
 -e [value]  set HTTPS fragmentation to value
 -a          additional space between Method and Request-URI (enables -s, may break sites)
 -w          try to find and parse HTTP traffic on all processed ports (not only on port 80)
 --port        [value]    additional TCP port to perform fragmentation on (and HTTP tricks with -w)
 --ip-id       [value]    handle additional IP ID (decimal, drop redirects and TCP RSTs with this ID).
                          This option can be supplied multiple times.
 --dns-addr    [value]    redirect UDP DNS requests to the supplied IP address (experimental)
 --dns-port    [value]    redirect UDP DNS requests to the supplied port (53 by default)
 --dnsv6-addr  [value]    redirect UDPv6 DNS requests to the supplied IPv6 address (experimental)
 --dnsv6-port  [value]    redirect UDPv6 DNS requests to the supplied port (53 by default)
 --dns-verb               print verbose DNS redirection messages
 --blacklist   [txtfile]  perform HTTP tricks only to host names and subdomains from
                          supplied text file. This option can be supplied multiple times.

 -1          -p -r -s -f 2 -k 2 -n -e 2 (most compatible mode, default)
 -2          -p -r -s -f 2 -k 2 -n -e 40 (better speed for HTTPS yet still compatible)
 -3          -p -r -s -e 40 (better speed for HTTP and HTTPS)
 -4          -p -r -s (best speed)
To check whether your ISP's DPI can be circumvented, run 3_all_dnsredir_hardcore.cmd first. This is the most aggressive mode and shows whether this program is suitable for your ISP and DPI vendor at all. If you can open blocked websites with this mode, your ISP has DPI that can be circumvented. This mode is the slowest and the most likely to break websites, but it is suitable for most DPI.
Then try goodbyedpi.exe -1 to see if it works too.
Then try goodbyedpi.exe -2. It should be faster for HTTPS sites. Mode -3 speeds up HTTP websites.
Use goodbyedpi.exe -4 if it works for your ISP's DPI. This is the fastest mode, but it is not compatible with every DPI.

How does it work

Passive DPI

Most Passive DPI systems send an HTTP 302 Redirect if you try to access a blocked website over HTTP, and a TCP Reset in the case of HTTPS, and they do so faster than the destination website can respond. Packets sent by the DPI usually have an IP Identification field equal to 0x0000 or 0x0001, as seen with Russian providers. GoodbyeDPI blocks these packets if they redirect you to another website (a censorship page).
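The check described above can be sketched in a few lines of Python. This is only an illustration of the heuristic (suspicious IP ID plus a 302 redirect or a TCP RST), not GoodbyeDPI's actual filter, which runs inside WinDivert; the function names and the build_packet helper are hypothetical and exist only for this example.

```python
import struct

SUSPECT_IP_IDS = {0x0000, 0x0001}   # IP ID values named in the text above
TCP_FLAG_RST = 0x04
IPPROTO_TCP = 6


def parse_ipv4_tcp(packet: bytes):
    """Return (ip_id, tcp_flags, payload) for an IPv4/TCP packet, else None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    ihl = (packet[0] & 0x0F) * 4                    # IPv4 header length in bytes
    ip_id = struct.unpack_from("!H", packet, 4)[0]  # Identification field
    if packet[9] != IPPROTO_TCP or len(packet) < ihl + 20:
        return None
    data_offset = (packet[ihl + 12] >> 4) * 4       # TCP header length in bytes
    tcp_flags = packet[ihl + 13]
    return ip_id, tcp_flags, packet[ihl + data_offset:]


def looks_like_passive_dpi(packet: bytes) -> bool:
    """Heuristic: suspicious IP ID and either a TCP RST or an HTTP 302."""
    parsed = parse_ipv4_tcp(packet)
    if parsed is None:
        return False
    ip_id, tcp_flags, payload = parsed
    if ip_id not in SUSPECT_IP_IDS:
        return False
    is_rst = bool(tcp_flags & TCP_FLAG_RST)
    is_redirect = payload.startswith((b"HTTP/1.0 302", b"HTTP/1.1 302"))
    return is_rst or is_redirect


def build_packet(ip_id: int, tcp_flags: int, payload: bytes = b"") -> bytes:
    """Hypothetical helper: a minimal IPv4/TCP packet for trying the check."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload), ip_id,
                     0, 64, IPPROTO_TCP, 0, bytes(4), bytes(4))
    tcp = struct.pack("!HHIIBBHHH", 80, 50000, 0, 0, 5 << 4, tcp_flags, 0, 0, 0)
    return ip + tcp + payload
```

A real filter would also have to decide whether a redirect actually points at a censorship page; that part is left out here.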

Active DPI

Active DPI is trickier to fool. Currently the software uses 6 methods to circumvent Active DPI:
  • TCP-level fragmentation for first data packet
  • TCP-level fragmentation for persistent (keep-alive) HTTP sessions
  • Replacing Host header with hoSt
  • Removing space between header name and value in Host header
  • Adding additional space between HTTP Method (GET, POST etc) and URI
  • Mixing case of Host header value
These methods should not break any website, as they are fully compatible with the TCP and HTTP standards, yet they are sufficient to prevent DPI data classification and to circumvent censorship. The additional space may break some websites, although the HTTP/1.1 specification allows it (see 19.3 Tolerant Applications).
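As a rough illustration of the header tricks and the fragmentation listed above, here is a Python sketch that rewrites an HTTP request and splits a payload in two. It is written for this post and is not taken from GoodbyeDPI, which is a C program; the function names are hypothetical. The case-mixing rule (alternate over letters only) is an assumption chosen to reproduce the test.com -> tEsT.cOm example from the usage text, and fragment only shows the resulting split, not how GoodbyeDPI causes it on the wire.

```python
from typing import List


def mix_case(value: bytes) -> bytes:
    """Upper-case every second letter, skipping non-letters (assumed rule)."""
    out = bytearray()
    upper = False
    for c in value.lower():
        is_letter = ord("a") <= c <= ord("z")
        out.append(c - 32 if (is_letter and upper) else c)
        if is_letter:
            upper = not upper
    return bytes(out)


def mangle_http_request(request: bytes, *, replace_host: bool = True,
                        remove_space: bool = True, mix_host_case: bool = False,
                        extra_space: bool = False) -> bytes:
    """Apply the Host-header and request-line tricks to a raw HTTP request."""
    head, sep, body = request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    if extra_space:
        # Additional space between the method and the request URI.
        method, _, rest = lines[0].partition(b" ")
        lines[0] = method + b"  " + rest
    for i in range(1, len(lines)):
        name, colon, value = lines[i].partition(b":")
        if not colon or name.lower() != b"host":
            continue
        value = value.strip()
        if mix_host_case:
            value = mix_case(value)
        if replace_host:
            name = b"hoSt"
        lines[i] = name + (b":" if remove_space else b": ") + value
    return b"\r\n".join(lines) + sep + body


def fragment(payload: bytes, size: int) -> List[bytes]:
    """Split a payload so that the first piece carries only `size` bytes."""
    if size <= 0 or size >= len(payload):
        return [payload]
    return [payload[:size], payload[size:]]
```

With a fragment size of 2, a request starting with "GET" is split into b"GE" and the rest, so a classifier that only looks at the first segment never sees a complete method or Host header.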
The program loads the WinDivert driver, which uses the Windows Filtering Platform to set filters and redirect packets to userspace. It runs as long as the console window is visible and terminates when you close the window.

How to build from source

This project can be built using GNU Make and mingw. The only dependency is WinDivert.
To build the x86 exe, run:
make CPREFIX=i686-w64-mingw32- WINDIVERTHEADERS=/path/to/windivert/include WINDIVERTLIBS=/path/to/windivert/x86
And for x86_64:
make CPREFIX=x86_64-w64-mingw32- WINDIVERTHEADERS=/path/to/windivert/include WINDIVERTLIBS=/path/to/windivert/amd64

How to install as Windows Service

Use service_install_russia_blacklist.cmd, service_install_russia_blacklist_dnsredir.cmd and service_remove.cmd scripts.
Modify them according to your own needs.

Similar projects

zapret by @bol-van (for Linux).

Kudos

Thanks @basil00 for WinDivert. That's the main part of this program.
Thanks to every BlockCheck contributor. It would have been impossible to understand DPI behaviour without this utility.

from https://github.com/ValdikSS/GoodbyeDPI
-------

https://rutracker.org/forum/viewtopic.php?t=5403670
--------

Trial version of DPI system "Carbon Reductor DPI X" available for download.



In a thread at the NTC forum, @ValdikSS posted about written evaluations of various DPI systems done by Roskomnadzor. One of these systems, Carbon Reductor DPI X, is available for trial download as an ISO image. I haven't tested it, but it looks like it's meant to work on standard PC hardware. The download directory is here:
A while ago we discussed acquiring a DPI box to analyze. Well, this may be the chance! This could be a fascinating opportunity to test and understand a real DPI system in a controlled environment. (You would want to install it on an isolated network to prevent it phoning home.) I'm particularly interested in another observation of @ValdikSS's:
It is able to detect unknown protocols and sends information about them to the developer
I'm curious to know what kinds of unknown protocols cause this reporting to happen.
Here is the Roskomnadzor report (Russian):
You would want to install it on an isolated network to prevent it phoning home
This is a trial version of premium software; it needs internet access at least to activate the trial period.
And please remember this is not a generic DPI solution. It's built with Russian censorship in mind, to automate Russian censorship.

from https://github.com/net4people/bbs/issues/15
-------------------------------------------

Automatic Discrepancy Discovery for DPI Elusion.

SymTCP - Automatic Discrepancy Discovery for DPI (Deep Packet Inspection) Elusion

SymTCP is a tool used to automatically discover subtle discrepancies between two TCP implementations, e.g., in how they accept and drop packets. Specifically, it can find the discrepancies between a server and a DPI and use them to elude the DPI, e.g., with a packet that is accepted by the server but ignored by the DPI. It first runs symbolic execution on the server's TCP implementation (whitebox) and collects program execution paths labeled as either an "accept" path or a "drop" path. Symbolic execution generates the input packet sequence for each execution path. It then probes the DPI (blackbox) with the generated packet sequences and checks whether the DPI processes them the same way as the server. You can find more information in our NDSS paper.

Source

├── bin                    Binaries used to send/receive packets
├── data                   Sampled test cases generated using Linux kernel v4.9.3
├── docker                 Docker files used in early stage testing
├── kernel_info            Debug info extracted from Linux kernel v4.9.3
├── patches                Changes made to S2E symbolic execution engine
├── scripts                Python scripts used for data analysis, DPI probing, ...
│   └── eval               Scripts used in evaluation
├── static                 Some static analysis scripts not used in final version
└── tools
    ├── aws_remote_ctl     Scripts used to control AWS instances 
    ├── data_processing    Result analysis scripts used in early stage
    ├── dpi_sys_confs      Configuration files of Bro and Snort
    ├── sym_packet_sender  Symbolic packet sender
    └── web_addr2line      Web application used to translate memory address to source code line number

Usage

Requirement

  • Ubuntu 16.04/18.04 (other Linux-based systems might work as well)
  • Z3 Solver with Python library (v4.8.5)
  • root privilege (for sending raw packets)

Setup S2E

We use an S2E 2.0 version fetched in April 2019 (it can be found in the release). Using a newer version of S2E requires porting the code in the patches folder to that version.
To set up the S2E environment, you may use the s2e-env tool (by following the instructions here). If you want to reuse the exact S2E version used in this project, consider replacing the source code downloaded by s2e-env with the one we provide.
The S2E project we created can also be found in the release. It should be put in the projects folder of the S2E environment.

Configure network interfaces in the guest and host OS

Because we need to run the guest OS in QEMU with bridge mode, we also need to add a tun/tap interface in the host OS.
mkdir -p /dev/net    # -p: no error if the directory already exists
mknod /dev/net/tun c 10 200

ip link add name qemubr0 type bridge
ip addr add 172.20.0.1/24 dev qemubr0
ip link set qemubr0 up
And use setuid to give qemu-bridge-helper root privilege:
sudo chown root:root $S2E_DIR/install/libexec/qemu-bridge-helper 

sudo chmod u+s  $S2E_DIR/install/libexec/qemu-bridge-helper
In the guest OS, we need to configure the network interface to use a static IP address 172.20.0.2.

Run symbolic execution

To launch the S2E project, run the launch-s2e.sh shell script in the tcp project folder. Check the path variables first.
The results are stored in the debug.log file in the S2E output folder, e.g., s2e-out-0. This file is used later by the Python analysis scripts.

Test case generation

You may use scripts/get_concrete_examples.py to extract test cases from the symbolic execution results.

Probe the DPI

We provide two datasets with 1,000 and 10,000 test cases respectively in our repository, which are both sampled from the same run of symbolic execution with 3 symbolic packets of 40-byte length (20-byte header + 20-byte TCP options and payload). The smaller 1,000 dataset is in the data folder, and the larger 10,000 dataset is in the release. You may also use your own dataset generated by running symbolic execution.
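To make the "20-byte header + 20-byte TCP options and payload" layout concrete, the following Python sketch splits a 40-byte TCP segment according to its data offset field. Because that field is symbolic in the generated test cases, the same 20 trailing bytes can end up as options, payload, or a mix of both. The helper names are hypothetical and are not part of the SymTCP scripts.

```python
import struct


def make_segment(data_offset_words: int, tail: bytes = bytes(20)) -> bytes:
    """Hypothetical helper: 20-byte fixed TCP header followed by `tail`."""
    header = struct.pack(
        "!HHIIBBHHH",
        40000, 80,               # source port, destination port
        1, 0,                    # sequence number, acknowledgment number
        data_offset_words << 4,  # data offset in 32-bit words (upper 4 bits)
        0x18,                    # flags: PSH + ACK
        8192, 0, 0,              # window, checksum (left at 0), urgent pointer
    )
    return header + tail


def split_tcp_segment(segment: bytes):
    """Split a raw TCP segment into (fixed header, options, payload)."""
    if len(segment) < 20:
        raise ValueError("a TCP header needs at least 20 bytes")
    header_len = (segment[12] >> 4) * 4
    if header_len < 20 or header_len > len(segment):
        raise ValueError("data offset points outside the segment")
    return segment[:20], segment[20:header_len], segment[header_len:]
```

With a data offset of 5 all 20 trailing bytes are payload, with 10 they are all options, and with 7 the split is 8 bytes of options and 12 bytes of payload.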
To probe a DPI with the test cases, you may use scripts/probe_dpi.py.
Probe DPI with test cases generated from symbolic execution.

positional arguments:
  test_case_file        test case file

optional arguments:
  -h, --help            show this help message and exit
  -P, --dump-pcaps      dump pcap files for each test case
  -G, --gfw             probing the gfw
  -I INT, --int INT     interface to listen on
  -F, --tcp-flags-fuzzing
                        switch of tcp flags fuzzing
  --tcp-flags TCP_FLAGS
                        Use specific TCP flags for testing
  -D, --debug           turn on debug mode
  -p PAYLOAD_LEN        TCP payload length (because header len is symbolic,
                        payload may be counted as optional header
  -N NUM_INSTS
  -S SPLIT_ID
  -t TEST_CASE_IDX      test case index
  --packet-idx PACKET_IDX
                        packet index in the test case
  --replay REPLAY       replay a list of successful cases
Examples:
./probe_dpi.py -p 20 -P -F             (20-byte options and payload, dump packets, try different flags)
./probe_dpi.py -p 20 -P -F -N 50 -S 0  (Split the dataset into 50 chunks and use the first chunk)
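The second example splits the dataset into chunks so that several probing runs can share the work. How probe_dpi.py partitions the test cases internally is not stated here, so the following Python sketch simply assumes contiguous, nearly equal chunks; select_chunk is a hypothetical name.

```python
def select_chunk(test_cases, num_chunks, chunk_id):
    """Return the chunk_id-th of num_chunks contiguous, near-equal slices."""
    if num_chunks <= 0 or not 0 <= chunk_id < num_chunks:
        raise ValueError("chunk_id must be in range(num_chunks)")
    size, extra = divmod(len(test_cases), num_chunks)
    # The first `extra` chunks get one additional test case each.
    start = chunk_id * size + min(chunk_id, extra)
    end = start + size + (1 if chunk_id < extra else 0)
    return test_cases[start:end]
```

Under this assumption, 1,000 test cases with 50 chunks and chunk index 0 yield the first 20 test cases.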
The results can be found in the probe_dpi_result file in the current folder. Detailed logs are written to the probe_dpi.log file.
probe_dpi.py needs to read DPI logs from local paths in order to check whether a probe succeeded. You will need to configure the path variables in the script.
probe_dpi.py also loads a list of server IP addresses from a server_list file (for probing the GFW) or a server_list.local file (for probing DPIs) in order to balance work loads. In such a file, each line is a server's IP address.
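A server list in this format (one IP address per line) could be read and used roughly as follows. This Python sketch is an assumption about how such a file might be consumed; the round-robin assignment in particular is a guess at what "balance work loads" means, and neither function is taken from probe_dpi.py.

```python
import ipaddress
import itertools


def parse_server_list(text: str):
    """Parse one IP address per line, skipping blank lines."""
    servers = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # ip_address() raises ValueError on anything that is not a valid IP.
        servers.append(str(ipaddress.ip_address(line)))
    return servers


def assign_servers(test_case_ids, servers):
    """Pair each test case with a server in round-robin order (assumed policy)."""
    if not servers:
        raise ValueError("server list is empty")
    return list(zip(test_case_ids, itertools.cycle(servers)))
```

The file contents would be passed in as a string, for example the result of reading server_list.local.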

Applying SymTCP to another version of Linux kernel

In our research, we used Linux kernel v4.9.3 as the target server implementation. Our tool can also be applied to other versions of the Linux kernel (or even other OSes), and it can be used to probe other DPI systems with little additional effort.
To apply it to another version of the Linux kernel, you need the kernel binary (the vmlinux file) in order to label drop points and/or accept points on it. The drop points used in S2E are at the binary level and are configured in s2e-config.lua in the S2E project folder. A typical approach is to label drop points and/or accept points in the source code first, by backtracing from sinks such as tcp_drop, kfree_skb, etc., and then map each source code line to a binary address with tools such as objdump. There are also some hard-coded memory addresses in the S2E plugin we wrote; you will need to update those addresses for the new kernel version.

Publication

Check our NDSS'20 paper for more technical details [PDF]
SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery. Zhongjie Wang, Shitong Zhu, Yue Cao, Zhiyun Qian, Chengyu Song, Srikanth Krishnamurthy, Tracy D. Braun, Kevin S. Chan. DOI:https://dx.doi.org/10.14722/ndss.2020.24083
-----

SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery



Zhongjie Wang, Shitong Zhu, Yue Cao, Zhiyun Qian, Chengyu Song, Srikanth V. Krishnamurthy, Kevin S. Chan, Tracy D. Braun
https://censorbib.nymity.ch/#Wang2020a
https://github.com/seclab-ucr/SymTCP
This paper presents SymTCP, a system for automatically discovering packet sequences that desynchronize DPI middleboxes. A middlebox is desynchronized when its notion of the state of a TCP connection differs from that of the client and server. The core idea is to use symbolic execution to explore code paths that lead to state changes in an actual implementation of TCP. Implementations of TCP are complicated, and middlebox simulations of endpoint TCP state tend to be simplified approximations. Even though a middlebox may not be directly inspectable, a diverse set of packets that exercises most of an endpoint's code paths is also likely to exercise most of a middlebox's code paths, and some of those code paths will lead the endpoint and middlebox to different internal states. The output of SymTCP is a set of packet sequences that terminate in either an evasion packet (a packet that is ignored by the middlebox but interpreted by the server) or an insertion packet (a packet that is interpreted by the middlebox but ignored by the server). Either of these cases (made formal in Section III) is good for desynchronizing the middlebox.
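The two outcomes can be written down as a small decision function. The Python sketch below only restates the definitions from the paragraph above; the Verdict enum and the classify function are hypothetical names, not part of SymTCP.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"   # the packet was interpreted and changed connection state
    DROP = "drop"       # the packet was ignored


def classify(server: Verdict, middlebox: Verdict) -> str:
    """Name the discrepancy, if any, for the last packet of a sequence."""
    if server is Verdict.ACCEPT and middlebox is Verdict.DROP:
        return "evasion"      # the server interprets it, the middlebox ignores it
    if server is Verdict.DROP and middlebox is Verdict.ACCEPT:
        return "insertion"    # the middlebox interprets it, the server ignores it
    return "consistent"       # both sides agree, so there is nothing to exploit
```

In practice the middlebox verdict is not directly observable and has to be inferred from a follow-up probe, which the next paragraph describes.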
The process begins by manually annotating accept points and drop points in a TCP implementation: places in the code where a packet either modifies the state of a connection or is removed from consideration. The authors label 6 accept points and 38 drop points in the Linux 4.9.3 TCP server implementation. The next step is the "offline" phase: symbolic execution to find constraints on packets that lead to known accept and drop points. Section V discusses the challenges involved in symbolically executing a complicated piece of code like the Linux kernel TCP implementation. After that comes the "online" phase: solving the constraints to generate packet sequences, and sending them through the middlebox. In the authors' experience, there were too many packet sequences to test effectively, so they sub-sampled the list while retaining all the distinct accept and drop points. The middlebox is presumed to be a black box whose state is not directly knowable, so there is a final probe step that sends packets containing a keyword designed to provoke a reaction (e.g. blocking) by the middlebox. The output of executing this process for many sets of constraints is a set of packet sequences, each of which terminates in an evasion packet or an insertion packet. These sequences can then be manually examined to understand how they work.
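One simple way to sub-sample while keeping every distinct accept and drop point is to pick one sequence per point first and then fill the remaining budget at random. The Python sketch below shows that idea under the assumption that each sequence has a unique ID and is labeled with the point it reaches; it is not the authors' sampling procedure, which this post does not describe in detail.

```python
import random
from collections import defaultdict


def subsample(sequences, budget, seed=0):
    """Pick sequence IDs so that every accept/drop point is still covered.

    `sequences` is a list of (sequence_id, point_label) pairs with unique IDs.
    """
    rng = random.Random(seed)
    by_point = defaultdict(list)
    for seq_id, point in sequences:
        by_point[point].append(seq_id)
    # Step 1: one representative per distinct accept or drop point.
    chosen = [rng.choice(ids) for ids in by_point.values()]
    # Step 2: spend whatever budget is left on random remaining sequences.
    chosen_set = set(chosen)
    remaining = [seq_id for seq_id, _ in sequences if seq_id not in chosen_set]
    extra = max(0, budget - len(chosen))
    chosen += rng.sample(remaining, min(extra, len(remaining)))
    return chosen
```

If the budget is smaller than the number of distinct points, this sketch still returns one sequence per point, so coverage takes priority over the budget.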
They tested using the Linux TCP server implementation and three middleboxes: Zeek, Snort, and the Great Firewall of China. SymTCP found evasion and insertion strategies against all, some new and some previously known (Tables IV, V, and VI).
Thanks to Zhongjie Wang for commenting on a draft of this post.

SymTCP was the subject of the Tor anti-censorship team's reading group on 2020-04-16. There is a transcript of the discussion:
http://meetbot.debian.net/tor-meeting/2020/tor-meeting.2020-04-16-17.59.log.html#l-49

from https://github.com/net4people/bbs/issues/31
----

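SymTCP: Automated Evasion of Deep Packet Inspection


"In 2017, Wang et al. of the University of California, Riverside published a paper at an ACM conference on the TCP state machine of the Great Firewall and open-sourced the tool INTANG (original news item). In February 2020, Wang et al., together with researchers from the U.S. Army Research Laboratory, published the paper "SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery" at NDSS 2020 and presented a new tool, SymTCP. The paper explores the differences between the protocol state machines of DPI devices and of endpoint implementations; these differences allow a client to send unusual traffic that confuses the DPI. Compared with the earlier paper, this time they used software analysis techniques to explore the state machine of a common TCP implementation and tested evasion against several DPI systems. This week the Tor Project's anti-censorship meeting discussed the paper; its limitation is that it considers only the Linux implementation. Similar work includes Geneva, presented at an ACM conference in 2019 by Kevin Bock et al. of the University of Maryland, but in contrast to SymTCP, Geneva's algorithm has not been made public."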

