I2P Tunnels: Garlic encryption and unidirectional data transfer

Teacher · Oct 11, 2021

Anonymity of I2P network participants is achieved by using tunnels. An important feature of I2P is that only the creator of a tunnel knows its length, its beginning, and its end.

Each node of the network has its own incoming and outgoing tunnels, and by default it also acts as a transit node in the chains of other participants. I2P tunnels are unidirectional: traffic goes only in one direction through each tunnel. In an anonymous network, the user does not have direct access to the subscriber on the other side; only information about the first node of that subscriber's incoming tunnel is available. Access to the incoming tunnel of another network participant goes through the user's own anonymizing chain, that is, through their outgoing tunnel. After contact is established, the addressee is informed about the incoming tunnel of the network member that contacted it. For more information about how nodes find each other's incoming tunnels for the first access, see the article about floodfills.

Chains of multiple nodes are, if not the most basic, then one of the most important logical parts of I2P. In this article, we will explain the principle of their construction and speculatively prove that the level of anonymity in I2P can be trusted.

Transports
Popular low-level network protocols today are TCP and UDP. They provide the logic of information delivery between subscribers, but they are absolutely not responsible for the privacy of this information - anyone can intercept it along the way. This makes it necessary for higher-level application protocols to encrypt information.

I2P uses the transport protocols NTCP2, which is analogous to TCP, and SSU, which is analogous to UDP. In fact, these protocols are cryptographic wrappers of their older brothers. Information within the "invisible Internet" is transmitted via these protocols, which allows you to hide all transmitted information from your home provider.

NTCP2 and SSU not only encrypt information, but also add a random number of extra bytes. This "garbage" leaves traffic analysis systems no chance, since the size of packets is random and does not actually mean anything. After the packet is decrypted at the destination, the padding bytes are simply discarded.

NTCP2 and SSU are tools for secure peer-to-peer connections, that is, direct communication with other routers on the network. At the beginning of this article, it was said that anonymity within I2P is achieved through tunnels consisting of several intermediate nodes. In view of this, we will not focus on how nodes communicate directly with each other, but will move closer to the topic of anonymizing tunnels. Read more about the NTCP2 transport protocol in Russian here.
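The padding idea can be illustrated with a minimal sketch. It is not the NTCP2 or SSU frame format: the two-byte length prefix, the `MAX_PADDING` limit and both function names are assumptions made for this example, and encryption is left out entirely.

```python
import secrets
import struct

MAX_PADDING = 64  # hypothetical upper bound, chosen only for this sketch


def add_padding(payload: bytes) -> bytes:
    """Prefix the payload length, then append a random number of random bytes."""
    pad_len = secrets.randbelow(MAX_PADDING + 1)
    return struct.pack(">H", len(payload)) + payload + secrets.token_bytes(pad_len)


def strip_padding(frame: bytes) -> bytes:
    """Read the length prefix and discard everything after the real payload."""
    (length,) = struct.unpack(">H", frame[:2])
    return frame[2:2 + length]


message = b"hello, I2P"
frames = [add_padding(message) for _ in range(5)]
# The same payload produces frames of varying size, yet always round-trips.
assert all(strip_padding(frame) == message for frame in frames)
```

In a real transport the padded frame would be encrypted as a whole, so an observer cannot tell where the payload ends.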

What is a tunnel
I2P routers, i.e. all network nodes, communicate with each other using the I2NP protocol. You can see the actual list of I2NP message types in the source code, or refer to the official documentation for a detailed explanation.

In fact, I2NP contains an exhaustive set of possible message types required for I2P routers to communicate with each other. Note the types whose names contain the word Tunnel. They are exactly what we are interested in in this article. All user information is transmitted inside the tunnels in the form of encrypted blocks from router to router, so that the original source of information is lost, and its real recipient remains unknown, since it is not known on which of the transit routers the tunnel is interrupted.

There are two main requirements for tunnel logic: anonymity of the tunnel creator with respect to its participants, and transport compatibility between the transit nodes of the chain. A brief description of tunnel construction is given in a large article about I2P, but now we will delve into the depths of this complex mechanism and once again make sure that the level of anonymity in I2P is unprecedented.

Analogy of the I2P architecture with a regular network
So that the following story does not seem like a dry specification that cannot be understood even after two readings, we will slightly simplify the initial understanding of the I2P network architecture by comparing it with the usual network.

Your device connects to the Internet via a router. This can be a home router connected via wire or Wi-Fi, or a mobile operator's router reached over 3G/4G/5G networks; the essence does not change. Before establishing a logical IP connection to the router, the device exchanges service information at a low level - in the illustration it is designated as "Ethernet frames" (simplified). After establishing a logical IP connection with the router, you can use it to access computers around the world. When you open a site whose server is located in a neighboring city, your traffic passes through many transit nodes of Internet service providers, while you have a direct connection only with the first router. All intermediate nodes work with you at the level of IP connections - they transmit packets of information from point A to point B. Each logical unit along the path of your traffic builds an IP connection with its neighbors on top of a lower level, just like you and your home router.

It turns out that many computers and routers around the world establish low-level communication with their physical neighbors via wires (or other transmission media) and, on the basis of this connection, are combined into IP networks, where the usual routing with IPv4 or IPv6 addresses reigns. The highest level of TCP/IP communication is labeled in the illustration as "application". This is the top of the pyramid: user traffic, for example, a browser request to a site over HTTPS. The lower levels know nothing about your site or browser; they just work to deliver some binary information to the destination, where it will be read in its original form and processed.

I2P works on top of TCP/IP, but has its own additional structure, which partly mirrors the usual network, but with a strong bias towards privacy and anonymity. An I2P router is a software client on your device that provides all the internal network logic. I2NP messages are the basic communication tool for I2P routers. Tunnels are analogous to IP connections: they rely on direct communication with neighboring routers and, through them, reach nodes with which there is no direct contact. The highest level in the illustration - the "garlic message" - is user information that is delivered through tunnels, but has meaning only for the final recipient, who can decipher it.

Hidden network messages pass through the regular Internet, but their content can only be understood at a higher level, which requires an I2P router to access. This property is called an overlay. Strictly speaking, the term "deep Internet" as applied to I2P and other hidden networks should really be the opposite, for example, "high-level Internet".

Garlic message forming a tunnel
Each I2P router publishes information about itself on floodfills - reference nodes of the network. The full address of the router is called "Router Info", or simply "RI". In addition to reachability information (IP addresses and introducer addresses), as well as some service information, the RI contains the router's public encryption key.

The I2P router's local network database (NetDB) consists entirely of RI files of other network participants. When a router needs to build a tunnel, the first step is to select its future participants. Candidates are searched for in the local network database. The declared bandwidth of the router, transport compatibility with neighbors, and data from the profile, if available, are taken into account. Profiling means a local record of the router's interaction with a specific network participant and an assessment of its stability - an unstable router will not be used when building a tunnel.

When candidates are selected, a garlic message is created. If the tunnel being built is no longer than four hops, the garlic always contains four cloves, i.e. four messages. Otherwise, the garlic consists of eight messages (matching the maximum possible tunnel length). By default, tunnels have a length of three transit nodes.
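A rough sketch of candidate selection under these criteria is shown below. The `Peer` fields, both thresholds and the uniform random choice are assumptions made for illustration; a real router uses far more elaborate profiling and tiering.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

MIN_BANDWIDTH = 100   # hypothetical threshold, arbitrary units
MIN_STABILITY = 0.5   # hypothetical threshold for the local profile score


@dataclass
class Peer:
    name: str
    bandwidth: int              # declared bandwidth taken from the RI
    stability: Optional[float]  # local profile score, None if no profile exists yet


def select_candidates(netdb: List[Peer], hops: int) -> List[Peer]:
    """Pick `hops` peers that are fast enough and not known to be unstable."""
    usable = [
        peer for peer in netdb
        if peer.bandwidth >= MIN_BANDWIDTH
        and (peer.stability is None or peer.stability >= MIN_STABILITY)
    ]
    if len(usable) < hops:
        raise ValueError("not enough suitable peers in the local database")
    return random.sample(usable, hops)
```

Transport compatibility between neighbors, the third criterion from the text, is omitted here to keep the sketch short.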
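The rule for the number of cloves is simple enough to write down directly. The function and constant names are hypothetical; the numbers come from the paragraph above.

```python
SHORT_GARLIC_CLOVES = 4  # used for tunnels of up to four hops
LONG_GARLIC_CLOVES = 8   # equals the maximum tunnel length


def garlic_clove_count(hops: int) -> int:
    """Return how many cloves the tunnel build garlic contains."""
    if not 1 <= hops <= LONG_GARLIC_CLOVES:
        raise ValueError("tunnel length must be between 1 and 8 hops")
    return SHORT_GARLIC_CLOVES if hops <= 4 else LONG_GARLIC_CLOVES


# A default three-hop tunnel still travels as a four-clove garlic.
assert garlic_clove_count(3) == 4
assert garlic_clove_count(5) == 8
```

Because the clove count is fixed at four or eight, the size of the garlic does not reveal the exact tunnel length.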

The garlic message principle is that each tunnel participant receives a complete set of messages (the garlic), but can only read the one clove intended for them. Then the entire garlic message is transmitted further according to the instructions received in the clove. Recipients identify their cloves by the first 16 bytes, which are the beginning of the hash of their encryption key. After reading the garlic message, the participant replaces the contents of the clove with their response. To keep the information secret, it is encrypted with a symmetric key.

The most interesting nuance is that a particular node can only see its own clove in the garlic. This makes it impossible to guess at other tunnel participants based on their key hashes. This is achieved by additional symmetric encryption of the whole garlic. Perhaps there will be a separate note about this someday, but here is the source code for the most daring:
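How a participant finds its own clove can be sketched as follows. What exactly is hashed follows the wording of this article; the helper names and the clove contents (random bytes here) are assumptions, and the real record layout is defined in the tunnel creation specification.

```python
import hashlib
import os
from typing import List, Optional

IDENT_LEN = 16  # leading bytes of the hash that label a clove


def clove_ident(key: bytes) -> bytes:
    """First 16 bytes of the SHA-256 hash of the participant's key."""
    return hashlib.sha256(key).digest()[:IDENT_LEN]


def find_own_clove(garlic: List[bytes], own_key: bytes) -> Optional[int]:
    """Return the index of the clove addressed to us, or None if there is none."""
    ident = clove_ident(own_key)
    for index, clove in enumerate(garlic):
        if clove[:IDENT_LEN] == ident:
            return index
    return None


keys = [os.urandom(32) for _ in range(3)]
# Three real cloves plus one filler clove that belongs to nobody.
garlic = [clove_ident(key) + os.urandom(32) for key in keys] + [os.urandom(48)]
assert find_own_clove(garlic, keys[1]) == 1
```

A router whose key matches no label simply finds nothing, which is also what every participant sees for the cloves of the others.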

libi2pd/Tunnel.cpp
To date, garlic messages come in three types and have corresponding differences:

Old. Used when all transit routers in the tunnel use ElGamal encryption. Each message is 528 bytes long and contains AES keys: a one-time key for encrypting the response, an initialization vector (IV), an IV encryption key, and the main key used during the tunnel lifetime for onion (multi-layer symmetric) encryption.
Transitional. Used when the transit routers include both nodes with the old ElGamal encryption and new ones with ECIES. Each message is 528 bytes long, but for ECIES nodes a different encryption is used for the response: the symmetric encryption key is not transmitted directly, but is calculated using the Noise (Noise_N) protocol.
New - "garlic with short messages". Used when all transit routers use ECIES encryption and run at least version 2.39.0 of i2pd or 1.5.0 of the Java router. Each message is 218 bytes long. The cloves do not contain the aforementioned keys, because in this case they are all computable. AES is used only for the main tunnel encryption key; everywhere else the AEAD/ChaCha20/Poly1305 and ChaCha20 algorithms are used. For more information about short garlic messages, see the specification.

What makes the new type of garlic message special is its size: a standard garlic of four cloves fits in one kilobyte. At the time of publication of the article, all tunnel messages of the I2P network are divided into fragments of one kilobyte each. This standard size is one of the measures against traffic analysis.

A minimum-sized garlic with 528-byte cloves requires sending three tunnel messages. Despite the obfuscation of traffic, it is hypothetically possible to identify a pattern: three one-kilobyte messages in a row are a standard garlic. If we see this characteristic traffic, the observed user is either building a tunnel or acting as a transit link in another user's tunnel. In practice this threat is far-fetched, because tunnels live for only ten minutes, and active nodes constantly take part in more than two thousand transit tunnels. You can imagine how much garlic flies through such a router in just one minute...

When using short messages, all the garlic fits in one tunnel message, so the new garlic blends completely into the rest of the information flow. The spy's last hope of analyzing traffic for three kilobyte messages in a row is lost. But it's not all about paranoia, ladies and gentlemen! One tunnel message instead of three also speeds up tunnel construction: fewer packets are sent, so there is less chance that one of them is lost, and less information is transmitted faster.
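The arithmetic behind "three tunnel messages versus one" can be checked with a few lines. The sketch treats a tunnel message as exactly 1024 bytes of usable space and ignores headers, which is a simplification; the function name is hypothetical.

```python
TUNNEL_MESSAGE_SIZE = 1024  # "one kilobyte"; header overhead is ignored here


def tunnel_messages_needed(cloves: int, clove_size: int) -> int:
    """Number of one-kilobyte tunnel messages needed to carry the garlic."""
    total = cloves * clove_size
    return -(-total // TUNNEL_MESSAGE_SIZE)  # ceiling division


# Old and transitional garlic: 4 * 528 = 2112 bytes, three tunnel messages.
assert tunnel_messages_needed(4, 528) == 3
# Short garlic: 4 * 218 = 872 bytes, one tunnel message.
assert tunnel_messages_needed(4, 218) == 1
```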

Outbound and inbound tunnels are constructed in a similar way:

When building an incoming tunnel, the creator router, in order to remain incognito, passes the garlic to the first participant through its outgoing tunnel. The garlic is then passed from node to node and comes back to the tunnel creator. For the last member of the chain, the creator is indistinguishable from the next transit node.
When building an outgoing tunnel, the picture is reversed: the creator router transmits garlic directly to the router closest in the chain, which cannot know whether the router that transmitted the garlic is the owner of the garlic, or the same transit node. The clove of the last transit node contains information for sending garlic back to the creator. After completing the entire chain, garlic is returned to the creator through one of the incoming tunnels.

At startup, the I2P router has no real tunnels, only two zero-length tunnels. When creating the first full-fledged tunnels, the necessary accesses do not occur through anonymizing chains of nodes, but directly through a zero-length tunnel. Due to the fact that this is a rare event and access through a zero-length tunnel is indistinguishable from normal, this feature is not considered a weak point of the I2P architecture.

Tunnels in the practical context of the network
Garlic cloves, in addition to encryption keys, contain two tunnel numbers (each one is 4 bytes). One is the tunnel number under which the router will receive messages, and the other is the tunnel number at the next transit node, whose address is attached and to which the current transit router will forward incoming information.

Garlic messages contain only the short addresses of the routers to which the garlic must be sent further down the chain. The router's address is the SHA-256 hash of its router identity, which is part of its full Router Info. If the desired router is not in the local database, the transit node needs to contact a floodfill to get its full address (RI). When creating the garlic, the tunnel creator takes into account the transport compatibility of neighbors with each other, which reduces the probability of an unsuccessful attempt to build a tunnel. For example, it is silly to ask a router without an IPv6 address to contact a network member who has only IPv6.
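The routing part of a clove can be pictured as a small fixed structure. The layout below (two 4-byte tunnel numbers followed by a 32-byte router hash) is a simplified assumption for illustration and not the real build record format, which carries more fields.

```python
import hashlib
import struct

# Hypothetical layout: receive tunnel ID, next tunnel ID, next router's 32-byte hash.
HOP_FIELDS = struct.Struct(">II32s")


def pack_hop_fields(receive_id: int, next_id: int, next_router_hash: bytes) -> bytes:
    """Serialize the routing instructions for one transit node."""
    return HOP_FIELDS.pack(receive_id, next_id, next_router_hash)


def unpack_hop_fields(raw: bytes):
    """Return (receive tunnel ID, next tunnel ID, next router hash)."""
    return HOP_FIELDS.unpack(raw)


next_hash = hashlib.sha256(b"placeholder identity of the next router").digest()
raw = pack_hop_fields(0x1234ABCD, 0x0BADF00D, next_hash)
assert len(raw) == 40  # 4 + 4 + 32 bytes
assert unpack_hop_fields(raw) == (0x1234ABCD, 0x0BADF00D, next_hash)
```

A transit node only ever learns its own receive number and where to forward, never the numbers used elsewhere in the chain.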
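The IPv6 example can be turned into a small compatibility check over a candidate chain. The `Router` model with separate "can send over" and "is reachable over" sets is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Router:
    name: str
    can_send: FrozenSet[str]   # address families it can open connections over
    reachable: FrozenSet[str]  # address families it publishes addresses for


def can_forward(sender: Router, receiver: Router) -> bool:
    """True if the sender shares at least one address family with the receiver."""
    return bool(sender.can_send & receiver.reachable)


def chain_is_compatible(chain: List[Router]) -> bool:
    """Check every pair of neighbors in the proposed tunnel, in order."""
    return all(can_forward(a, b) for a, b in zip(chain, chain[1:]))


v4 = Router("v4-only", frozenset({"ipv4"}), frozenset({"ipv4"}))
v6 = Router("v6-only", frozenset({"ipv6"}), frozenset({"ipv6"}))
dual = Router("dual", frozenset({"ipv4", "ipv6"}), frozenset({"ipv4", "ipv6"}))

assert not chain_is_compatible([v4, v6])    # the "silly" case from the text
assert chain_is_compatible([v4, dual, v6])  # a dual-stack hop bridges the gap
```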

A tunnel is considered failed if the response doesn't arrive for a long time, or garlic has returned, but one of the surveyed routers has included a refusal to participate in its response (this may be due to the limit on the number of transit tunnels, etc.).

When a tunnel is created, its life cycle is limited to ten minutes. After this time, all transit nodes stop accepting packets within the old tunnel, and its creator needs to have a new tunnel by then - it is created shortly before the old one expires. After incoming tunnels are updated, if we are talking about a server endpoint that is waiting for requests from other network participants, its LeaseSet is also updated - the contact information on floodfills, which includes information about incoming tunnels.
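The timing rule can be sketched as two checks. The ten-minute lifetime comes from the text; the one-minute lead time for building a replacement is a hypothetical value, since the article only says "shortly before".

```python
TUNNEL_LIFETIME = 600  # seconds, ten minutes
REBUILD_LEAD = 60      # hypothetical: start the replacement a minute early


def is_expired(created_at: float, now: float) -> bool:
    """Transit nodes stop accepting packets once the lifetime has passed."""
    return now - created_at >= TUNNEL_LIFETIME


def needs_replacement(created_at: float, now: float) -> bool:
    """The creator should already be building a successor at this point."""
    return now - created_at >= TUNNEL_LIFETIME - REBUILD_LEAD


created = 1_000.0
assert not needs_replacement(created, created + 300)
assert needs_replacement(created, created + 550) and not is_expired(created, created + 550)
assert is_expired(created, created + 600)
```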

As such, garlic encryption is used in I2P only when creating tunnels. In operational mode, the tunnel uses only onion encryption (plus end-to-end encryption from user to user).

Onion encryption is a term that means multi-layer symmetric key encryption. In case someone forgot or doesn't know: symmetric encryption uses a single key for encryption and decryption, as opposed to asymmetric algorithms, where encryption is performed with a public key and decryption is performed with a private one. Asymmetric encryption is used for end-to-end encryption.

All tunnels are unidirectional, and transit nodes do not know anything: whether the tunnel is incoming or outgoing, how many participants there are in it, and so on. Their task is to encrypt the passing information with the received (or mathematically derived) symmetric key and transfer the resulting packet to the next node.

Only end nodes in tunnels have special roles: for an outgoing tunnel, the last node is the Endpoint, and for an incoming tunnel, the first node is the Gateway. Unlike the usual "middle" nodes, these two know their place in the chain thanks to the special flags they receive in the garlic. The Endpoint's task is to collect kilobyte tunnel messages into a larger packet (up to 64 kilobytes) and transmit it further according to instructions (to an incoming tunnel, directly to another router, or to local processing). The Gateway's task is the reverse: split received messages into standard fragments of one kilobyte each and send these fragments further along the incoming tunnel.

When sending a packet through an outgoing tunnel, the creator router decrypts it in turn with the keys of all transit nodes. This is done so that, once every transit node has encrypted the information, it appears in its original form. This is a feature of the symmetric AES and ChaCha20 encryption algorithms. It sounds a bit complicated, but the bottom line is that encryption and decryption are mirror operations: "decrypting" the original information actually scrambles it, and to recover it later the operation must be performed in the opposite direction, that is, the data must be encrypted.

On the last router of the outgoing tunnel, the Endpoint, the last layer of onion encryption is removed and the information is transmitted to the incoming tunnel of the other side. Even though all onion encryption is removed at this point, user information is not at risk, since end-to-end encryption was applied to it first.

In an incoming tunnel, everything is simpler for the uninitiated: each transit node adds encryption with its own key (and this really is just a new layer of encryption), then passes the resulting packet to the next node. The final recipient, having removed all onion encryption, decrypts the information with its asymmetric key and checks the signature, after which the decrypted information is passed up to a higher level and given to the external application in its original form.
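The split and reassembly duties can be sketched in a few lines. Real tunnel messages carry fragment headers and are padded to a fixed size; this sketch omits both, and the function names are hypothetical.

```python
from typing import List

FRAGMENT_SIZE = 1024            # one kilobyte per tunnel message
MAX_MESSAGE_SIZE = 64 * 1024    # upper bound mentioned in the text


def split_at_gateway(message: bytes) -> List[bytes]:
    """Cut a message into one-kilobyte fragments, in order."""
    if len(message) > MAX_MESSAGE_SIZE:
        raise ValueError("message exceeds 64 kilobytes")
    return [message[i:i + FRAGMENT_SIZE] for i in range(0, len(message), FRAGMENT_SIZE)]


def join_at_endpoint(fragments: List[bytes]) -> bytes:
    """Reassemble the original message from its fragments."""
    return b"".join(fragments)


message = bytes(range(256)) * 10  # 2560 bytes
fragments = split_at_gateway(message)
assert [len(f) for f in fragments] == [1024, 1024, 512]
assert join_at_endpoint(fragments) == message
```

In this simplified form the last fragment is shorter than the rest; on the real network every tunnel message has the same size, which is what keeps them indistinguishable.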
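Both directions can be demonstrated with a toy cipher. The sketch deliberately does not use AES or ChaCha20: `toy_encrypt` and `toy_decrypt` are a made-up, insecure stand-in (byte-wise addition of a SHA-256 based keystream), chosen only because encryption and decryption are distinct mirror operations, as in the text. All names are hypothetical.

```python
import hashlib
from typing import List


def keystream(key: bytes, length: int) -> bytes:
    """Derive a repeatable pseudo-random byte string from the key (toy only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    return bytes((d + k) % 256 for d, k in zip(data, keystream(key, len(data))))


def toy_decrypt(key: bytes, data: bytes) -> bytes:
    return bytes((d - k) % 256 for d, k in zip(data, keystream(key, len(data))))


def through_outgoing_tunnel(payload: bytes, hop_keys: List[bytes]) -> bytes:
    """Creator pre-decrypts with every hop key; each hop then encrypts once."""
    packet = payload
    for key in hop_keys:
        packet = toy_decrypt(key, packet)
    for key in hop_keys:
        packet = toy_encrypt(key, packet)  # performed by each transit node in turn
    return packet  # what leaves the Endpoint


def through_incoming_tunnel(payload: bytes, hop_keys: List[bytes]) -> bytes:
    """Each hop adds a layer; the creator removes them all at the end."""
    packet = payload
    for key in hop_keys:
        packet = toy_encrypt(key, packet)  # performed by each transit node in turn
    for key in reversed(hop_keys):
        packet = toy_decrypt(key, packet)
    return packet  # what the tunnel owner recovers


keys = [b"hop-1", b"hop-2", b"hop-3"]
data = b"end-to-end encrypted user data"
assert through_outgoing_tunnel(data, keys) == data
assert through_incoming_tunnel(data, keys) == data
```

In both cases the transit nodes perform the same operation, encrypting with their own key, which is why they cannot tell whether a tunnel is incoming or outgoing.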

In general terms, the completed scheme for transmitting user data looks as shown in the illustration.

With an outgoing tunnel:

The I2P router accepts information from an external local application and packages it in I2NP Data (gzip + service headers);
End-to-end encryption is performed with the recipient's key (the key from the LeaseSet), producing an I2NP Garlic message;
Information is prepared for sending via a specific tunnel, and an I2NP Tunnel message is generated;
Traffic is wrapped in transport protocol cryptography;
Information is sent over the physical network to the first (nearest) node of the outgoing tunnel.

With an incoming tunnel:

Packets intended for the I2P router are sent over the network;
Transport protocols are processed and I2NP messages are decrypted;
The message type is I2NP Tunnel, that is, a tunnel message;
Using the tunnel ID, the router understands that this message came from its own tunnel. Onion encryption is removed;
The resulting message of type I2NP Garlic is the sender's original message, encrypted with the recipient's asymmetric key;
Inside it is I2NP Data - user information (gzip + service headers), which is unpacked and given to an external application.
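The order of the layers in the two lists above can be modeled as nested envelopes. In this sketch each encryption step is replaced by a plain text tag, so it shows only the nesting order and the gzip step; the `wrap`/`unwrap` helpers and the tag format are assumptions, not I2NP structures.

```python
import gzip

# Innermost to outermost, as in the "outgoing tunnel" list.
LAYERS = ["I2NP Data", "I2NP Garlic", "I2NP Tunnel", "transport"]


def wrap(layer: str, payload: bytes) -> bytes:
    """Stand-in for one layer: prefix the payload with a length-tagged label."""
    tag = layer.encode()
    return bytes([len(tag)]) + tag + payload


def unwrap(layer: str, packet: bytes) -> bytes:
    """Remove one layer, refusing to proceed if the label does not match."""
    tag_len = packet[0]
    tag = packet[1:1 + tag_len].decode()
    if tag != layer:
        raise ValueError(f"expected {layer!r}, found {tag!r}")
    return packet[1 + tag_len:]


def send(app_data: bytes) -> bytes:
    packet = gzip.compress(app_data)
    for layer in LAYERS:
        packet = wrap(layer, packet)
    return packet


def receive(packet: bytes) -> bytes:
    for layer in reversed(LAYERS):
        packet = unwrap(layer, packet)
    return gzip.decompress(packet)


assert receive(send(b"GET / HTTP/1.1")) == b"GET / HTTP/1.1"
```

The receiving side peels the layers in exactly the reverse order of the sending side, which is the point of the two mirrored lists.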

The description for the illustration focuses on user information received from the tunnel, because there are still many service messages and things are different there.

It should be noted that I2NP Garlic is a type that only denotes a message with single-layer end-to-end encryption, and not the garlic message that is used when building a tunnel. Perhaps it was once planned to use garlic encryption here, but the practical implementation turned out otherwise.

This is a simplified explanation of how I2P tunnels work, but despite all the efforts to simplify the material, it turned out to be very difficult to understand. In any case, the most important thing, I believe, is to have at least some understanding of the technology that you will decide to rely on to solve important tasks.
