client-side censorship evasion engine from the Geneva AI. https://censorship.ai
from https://geneva.cs.umd.edu/
---------------------------
https://censorbib.nymity.ch/pdf/Bock2019a.pdf
-----------------------------
Geneva is an artificial intelligence tool that defeats censorship by exploiting bugs in censors, such as those in China, India, and Kazakhstan. Unlike many other anti-censorship solutions which require assistance from outside the censoring regime (Tor, VPNs, etc.), Geneva runs strictly on the client.
Under the hood, Geneva uses a genetic algorithm to evolve censorship evasion strategies and has found several previously unknown bugs in censors. Geneva's strategies manipulate the client's packets to confuse the censor without impacting the client/server communication. This makes Geneva effective against many types of in-network censorship (though it cannot be used against IP-blocking censorship).
This code release specifically contains the strategy engine used by Geneva, its Python API, and a subset of published strategies, so users and researchers can test and deploy Geneva's strategies. To learn more about how Geneva works, see the How it Works section below. We will be releasing the genetic algorithm at a later date.
Setup
Geneva has been developed and tested on CentOS and Debian-based systems. Due to limitations of netfilter and raw sockets, Geneva does not work on OS X or Windows at this time. It requires Python 3.6 (with more versions coming soon).
Install netfilterqueue dependencies:
# sudo apt-get install build-essential python-dev libnetfilter-queue-dev libffi-dev libssl-dev iptables python3-pip
Install Python dependencies:
# python3 -m pip install -r requirements.txt
Running it
# python3 engine.py --server-port 80 --strategy "[TCP:flags:PA]-duplicate(tamper{TCP:dataofs:replace:10}(tamper{TCP:chksum:corrupt},),)-|" --log debug
2019-10-14 16:34:45 DEBUG:[ENGINE] Engine created with strategy \/ (ID bm3kdw3r) to port 80
2019-10-14 16:34:45 DEBUG:[ENGINE] Configuring iptables rules
2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A OUTPUT -p tcp --sport 80 -j NFQUEUE --queue-num 1
2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A INPUT -p tcp --dport 80 -j NFQUEUE --queue-num 2
2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A OUTPUT -p udp --sport 80 -j NFQUEUE --queue-num 1
2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A INPUT -p udp --dport 80 -j NFQUEUE --queue-num 2
Note that if you have stale iptables rules or other rules that rely on Geneva's default queues, this will fail. To fix this, remove those rules.
Strategy Library
Geneva has found dozens of strategies that work against censors in China, Kazakhstan, and India. We include several of these strategies in strategies.md. Note that this file contains success rates for each individual country; a strategy that works in one country may not work as well in other countries.
Researchers have observed that strategies may have differing success rates based on your exact location. Although we have not observed this from our vantage points, you may find that some strategies work differently in a country we have tested. If this is the case, don't be alarmed. However, please feel free to reach out to a member of the team directly or open an issue on this page so we can track how the strategies work from other geographic locations.
Disclaimer
Running these strategies may place you at risk if you use them within a censoring regime. Geneva takes overt actions that interfere with the normal operations of a censor, and its strategies are detectable on the network. Geneva is not an anonymity tool, nor does it encrypt any traffic. Understand the risks of running Geneva in your country before trying it.
How it Works
See our paper for an in-depth read on how Geneva works. Below is a rundown of the format of Geneva's strategy DNA.
Strategy DNA
Geneva's strategies can be arbitrarily complicated, so Geneva defines a well-formatted syntax for expressing strategies to the engine.
A strategy is simply a description of how network traffic should be modified. A strategy is not code; it is a description that tells the engine how it should operate over traffic.
A strategy handles outbound and inbound packets separately: the two halves are separated in the DNA by a "\/". Specifically, the strategy format is
<outbound forest> \/ <inbound forest>
If \/ is not present in a strategy, all of the action trees are in the outbound forest.
Both forests are composed of action trees, and each forest is allowed arbitrarily many trees.
An action tree is composed of a trigger and a tree. The trigger describes when the strategy should run, and the tree describes what should happen when the trigger fires. Recall that Geneva operates at the packet level; therefore, all triggers are packet-level triggers. Action trees start with a trigger and always end with a -|.
Triggers operate as exact matches and are formatted as follows:
[<protocol>:<field>:<value>]
For example, the trigger [TCP:flags:S] will run its corresponding tree whenever it sees a SYN TCP packet. If the corresponding action tree is [TCP:flags:S]-drop-|, this action tree will cause the engine to drop any SYN packets. [TCP:flags:S]-duplicate-| will cause the engine to duplicate the SYN packet.
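To make the exact-match behaviour concrete, here is a minimal Python sketch of a trigger. It is not Geneva's implementation: the Trigger class, its method names, and the dictionary used to stand in for a packet are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """Hypothetical stand-in for a Geneva trigger such as [TCP:flags:S]."""
    protocol: str
    field: str
    value: str

    @classmethod
    def parse(cls, text: str) -> "Trigger":
        # Strip the surrounding brackets, then split into three parts.
        if not (text.startswith("[") and text.endswith("]")):
            raise ValueError(f"not a trigger: {text!r}")
        protocol, field, value = text[1:-1].split(":", 2)
        return cls(protocol, field, value)

    def matches(self, packet: dict) -> bool:
        # Exact match: the named field must equal the trigger value.
        layer = packet.get(self.protocol)
        return layer is not None and str(layer.get(self.field)) == self.value


# Packets are modelled as nested dicts here (an assumption, not Geneva's type).
syn = {"TCP": {"flags": "S", "dport": 80}}
syn_ack = {"TCP": {"flags": "SA", "dport": 80}}

trigger = Trigger.parse("[TCP:flags:S]")
print(trigger.matches(syn))      # True
print(trigger.matches(syn_ack))  # False: "SA" is not exactly "S"
```

Because the comparison is an exact match, a SYN+ACK packet does not fire a trigger written for plain SYN packets in this sketch.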
Depending on the type of action, some actions can have up to two children. These are represented with the following syntax:
[TCP:flags:S]-duplicate(<left_child>,<right_child>)-|
where <left_child> and <right_child> are themselves trees. If (,) is not specified, any packets that emerge from the action will be sent on the wire.
Any action that has parameters associated with it contains those parameters in {}. Consider the following strategy with tamper:
[TCP:flags:A]-duplicate(tamper{TCP:flags:replace:R},)-| \/
This strategy takes outbound ACK packets and duplicates them. It tampers with the first duplicate by replacing the TCP flags field with RST, and does nothing to the second duplicate.
Note that due to NFQueue limitations, actions that introduce branching (fragment, duplicate) are disabled for incoming action forests.
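The layout of the strategy DNA described above can be sketched with a few lines of string handling. This is only an illustration under simplifying assumptions: the function names split_strategy and split_tree are hypothetical, the code assumes that "-|" and "]-" never appear inside an action's parameters, and it does not parse the nested action syntax the way the real engine does.

```python
def split_strategy(dna: str):
    """Return (outbound_trees, inbound_trees) as lists of action-tree strings."""
    # Everything before "\/" is the outbound forest, everything after is inbound.
    # If "\/" is absent, partition() leaves the inbound part empty.
    outbound, _, inbound = dna.partition("\\/")

    def trees(forest: str):
        # Every action tree ends with "-|", so split on that terminator.
        return [t.strip() + "-|" for t in forest.split("-|") if t.strip()]

    return trees(outbound), trees(inbound)


def split_tree(tree: str):
    """Split one action tree into its trigger and its action text."""
    trigger, _, rest = tree.partition("]-")
    return trigger + "]", rest[: -len("-|")]


dna = r"[TCP:flags:A]-duplicate(tamper{TCP:flags:replace:R},)-| \/"
outbound, inbound = split_strategy(dna)
print(outbound)  # ['[TCP:flags:A]-duplicate(tamper{TCP:flags:replace:R},)-|']
print(inbound)   # []
print(split_tree(outbound[0]))
# ('[TCP:flags:A]', 'duplicate(tamper{TCP:flags:replace:R},)')
```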
Citation
If you like the work or plan to use it in your projects, please follow the guidelines in citation.bib.
Paper
See our paper from CCS for an in-depth dive into how it works.
------------------------------------------------------------------
Automating Evasion
Researchers and censoring regimes have long engaged in a cat-and-mouse game, leading to increasingly sophisticated Internet-scale censorship techniques and methods to evade them. In this work, we take a drastic departure from the previously manual evade/detect cycle by developing techniques to automate the discovery of censorship evasion strategies.
Our Approach
We developed Geneva (Genetic Evasion), a novel experimental genetic algorithm that evolves packet-manipulation-based censorship evasion strategies against nation-state level censors. Geneva re-derived virtually all previously published evasion strategies, and has discovered new ways of circumventing censorship in China, India, Iran, and Kazakhstan.
How it works
Geneva runs exclusively on one side of the connection: it does not require a proxy, bridge, or assistance from outside the censoring regime. It defeats censorship by modifying network traffic on the fly (by injecting traffic, modifying packets, etc) in such a way that censoring middleboxes are unable to interfere with forbidden connections, but without otherwise affecting the flow. Since Geneva works at the network layer, it can be used with any application; with Geneva running in the background, any web browser can become a censorship evasion tool. Geneva cannot be used to circumvent blocking of IP addresses.
Geneva composes four basic packet-level actions (drop, duplicate, fragment, tamper) together to represent censorship evasion strategies. By running directly against real censors, Geneva’s genetic algorithm evolves strategies that evade the censor.
Real World Deployments
Geneva has been deployed against real-world censors in China, India, Iran, and Kazakhstan. It has discovered dozens of strategies to defeat censorship, and found previously unknown bugs in censors.
All of these strategies and Geneva’s strategy engine are open source: check them out on our Github page.
---------------------------
https://censorbib.nymity.ch/pdf/Bock2019a.pdf
-----------------------------
Geneva is a genetic algorithm that automatically discovers censorship evasion strategies by combining primitive operations in various ways and evaluating the combinations against a network censor (real or simulated). The strategies it discovers are packet-level manipulations like those of Khattak et al. 2013, lib·erate, and INTANG: things like sending overlapping segments or dropping certain packets. In fact, Geneva automatically rediscovers most of the evasions that prior work had found manually, as well as new and updated ones that manual analysis probably would not have found. The authors train and evaluate Geneva in the lab against simulated censors, and in the wild against real censors in China, India, and Kazakhstan.
An evasion strategy consists of paired triggers and actions. A trigger is a predicate over packets; for example, [TCP:flags:R] matches TCP RST packets. An action is an operation on a single packet, such as duplicate, which makes a copy of a packet and allows you to modify the original and the copy separately; fragment, which breaks a packet or segment into two parts and likewise permits further processing on both parts; and tamper, which sets a field in the packet to a static or a random value, while updating dependent fields such as the checksum. Whenever a trigger is true, it causes its associated action to happen. Actions may recursively invoke other actions, forming a tree structure: for example, duplicate has two action subtrees that say what to do with each of the two copies. At the leaves of the tree appear the special terminal actions send and drop. There are separate lists of triggers and actions in the incoming and outgoing directions.
A sample strategy is:
[TCP:flags:A]-duplicate(
send,
tamper{TCP:flags:replace:R}(
tamper{TCP:chksum:corrupt}(
send
)
)
)-|
\/
The trigger [TCP:flags:A] matches outgoing ACK packets. duplicate makes two copies of the packet. The first one is sent unmodified, but the second is changed into a RST packet with a bad checksum before being sent. The \/ separates the outgoing and incoming trigger–action lists; in this case the incoming section is empty. This strategy, currently effective against the GFW, tricks the middlebox into thinking the connection has been terminated because of the injected RST packet, but the RST actually has no effect on the remote server because of its incorrect checksum. There’s a catalog of strategies in Table 1 and at https://github.com/Kkevsterrr/geneva/blob/master/strategies.md.
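The sample strategy can be traced by hand with a small Python model of the action tree. This is a toy sketch and not Geneva's engine: packets are plain dictionaries, the function names are hypothetical, tamper here does not recompute dependent fields, and "corrupt" is modelled as flipping every bit of a 16-bit checksum so the result is guaranteed to differ (the summary above describes tamper as able to set a random value instead).

```python
import copy


def send(packet):
    # Terminal action: the packet goes on the wire.
    return [packet]


def drop(packet):
    # Terminal action: the packet is discarded.
    return []


def duplicate(left, right):
    # Run each subtree on its own independent copy of the packet.
    def run(packet):
        return left(copy.deepcopy(packet)) + right(copy.deepcopy(packet))
    return run


def tamper_replace(field, value, child=send):
    # Overwrite one field with a static value, then continue down the tree.
    def run(packet):
        packet = dict(packet, **{field: value})
        return child(packet)
    return run


def tamper_corrupt(field, child=send):
    # Toy corruption: flip all 16 bits so the checksum is certainly wrong.
    def run(packet):
        packet = dict(packet, **{field: packet[field] ^ 0xFFFF})
        return child(packet)
    return run


# duplicate(send, tamper{flags:replace:R}(tamper{chksum:corrupt}(send)))
strategy = duplicate(
    send,
    tamper_replace("flags", "R", tamper_corrupt("chksum", send)),
)

ack = {"flags": "A", "seq": 1000, "chksum": 0x1C46}
for out in strategy(ack):
    print(out)
```

Two packets come out: the original ACK, followed by a copy whose flags are R and whose checksum no longer matches.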
Geneva starts with a population of strategies that may be initialized randomly or seeded with known strategies. Individuals in the population undergo mutation (random changes to their actions) and crossover (swapping subtrees of actions with another individual). The fitness of an individual is primarily determined by its effectiveness against the censor—Geneva tries it out to see if it works—with penalties for large action trees or high network overhead. High-fitness individuals are more likely to be selected and survive into the next generation. The authors report that computing each new generation takes 5–10 minutes, and full training 4–8 hours.
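The loop described above (population, mutation, crossover, fitness with a size penalty, selection) can be sketched in Python as follows. Everything here is a simplified assumption rather than the authors' algorithm, which was not released at the time of writing: strategies are flat lists of action names instead of trees, the "censor" is a made-up rule, and selection simply keeps the top half of each generation.

```python
import random

ACTIONS = ["drop", "duplicate", "fragment", "tamper"]


def toy_censor_evaded(strategy):
    # Made-up stand-in for testing against a real or simulated censor.
    return "duplicate" in strategy and "tamper" in strategy


def fitness(strategy):
    # Reward evasion, penalise large strategies.
    reward = 100 if toy_censor_evaded(strategy) else 0
    return reward - len(strategy)


def mutate(strategy, rng):
    # Randomly insert, replace, or delete one action.
    s = list(strategy)
    choice = rng.random()
    if choice < 0.4 or not s:
        s.insert(rng.randrange(len(s) + 1), rng.choice(ACTIONS))
    elif choice < 0.7:
        s[rng.randrange(len(s))] = rng.choice(ACTIONS)
    else:
        del s[rng.randrange(len(s))]
    return s


def crossover(a, b, rng):
    # Join a prefix of one parent with a suffix of the other.
    return a[: rng.randrange(len(a) + 1)] + b[rng.randrange(len(b) + 1):]


def evolve(generations=30, size=20, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.choice(ACTIONS) for _ in range(rng.randint(1, 4))]
        for _ in range(size)
    ]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: size // 2]  # truncation selection
        children = []
        while len(survivors) + len(children) < size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(best, fitness(best))
```

Because the top half of each generation survives unchanged, the best fitness in this sketch never decreases from one generation to the next.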
The bulk of the authors’ validation was in China against the Great Firewall. Geneva finds a number of strategies that confuse the firewall’s notion of what the correct TCP sequence number is or whether the connection has been closed. It also finds a few weird and unexpected strategies that seem to expose previously unknown and subtle characteristics of the GFW’s classification algorithms. Take for example Strategy 7 in Section 5.2: splitting a TCP segment at offset 8 doesn’t work, but splitting it at offsets 8 and 12 does—even when the censored keyword is not split across segments. They additionally tested in India (on the Airtel ISP) and Kazakhstan (during the time when the TLS MITM was still happening), where Geneva found effective strategies that were comparatively simpler than the China ones.
There’s a project home page:
As of this writing, the genetic training algorithm is not yet available to download, but there is source code for the client-side software that implements pre-trained strategies.
from https://ntc.party/t/paper-summary-geneva-evolving-censorship-evasion-strategies-ccs-19/298
------------------
https://github.com/net4people/bbs/issues/23