I2P Tunnels: Garlic Encryption and One-Way Data Transfer

Teacher

54f2deaff6f56d9ab8281.png


Anonymity in the I2P network is achieved through tunnels. An important property of I2P is that only the router that created a tunnel knows its length, its beginning and its end.
Each node in the network has its own inbound and outbound tunnels, and by default it also acts as a transit node in the tunnels of other participants. I2P tunnels are unidirectional: each tunnel carries traffic in one direction only. In an anonymous network a user has no direct access to the party on the other side; only the first node of that party's inbound tunnel is known. The inbound tunnel of another participant is reached through an anonymizing chain on the sender's own side, that is, through the sender's outbound tunnel. Once contact is established, the addressee is given information about the inbound tunnel of the participant who contacted them, so that a reply can be sent. For how nodes find each other's inbound tunnels for the first contact, see the article on floodfills.
Chains of several nodes are, if not the most basic, then one of the most important logical parts of I2P. In this article we explain how they are built and argue, by reasoning rather than formal proof, that the level of anonymity in I2P can be trusted.

Transports
The most widely used low-level network protocols today are TCP and UDP. They deliver information between hosts but do nothing to keep it private: it can be intercepted along the way by anyone. Higher-level application protocols therefore have to encrypt the information themselves.
I2P uses the transport protocols NTCP2, an analogue of TCP, and SSU, an analogue of UDP. In essence, these protocols are cryptographic wrappers around their older brothers. Information inside the "invisible Internet" is transmitted over these protocols, which hides the content of everything you transmit from your Internet provider.
NTCP2 and SSU not only encrypt information but also mix in a random number of extra bytes. This filler makes life very hard for traffic analysis systems, since the packet size is random and in fact means nothing. After the packet is decrypted at the destination, the filler bytes are simply discarded.
NTCP2 and SSU are tools for secure peer-to-peer communication, that is, direct communication with other routers on the network. As stated at the beginning of this article, anonymity within I2P is achieved through tunnels consisting of several intermediate nodes. We will therefore not dwell on how nodes communicate with each other directly and will move on to the anonymizing tunnels. A detailed description of the NTCP2 transport protocol is available in Russian.
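The idea of random padding described above can be shown with a small sketch. It is not the NTCP2 or SSU frame format: the two-byte length prefix, the 64-byte padding limit and the function names are assumptions made for this illustration, and encryption is left out entirely.

```python
import os
import secrets
import struct


def add_padding(payload: bytes, max_padding: int = 64) -> bytes:
    """Prefix the payload with its length and append random filler bytes.

    Hypothetical framing for illustration only, not a real I2P format.
    """
    pad_len = secrets.randbelow(max_padding + 1)  # 0..max_padding filler bytes
    return struct.pack("!H", len(payload)) + payload + os.urandom(pad_len)


def strip_padding(frame: bytes) -> bytes:
    """Read the length prefix and discard everything after the payload."""
    (length,) = struct.unpack("!H", frame[:2])
    return frame[2:2 + length]


frame = add_padding(b"hello")
print(len(frame), strip_padding(frame))
```

The frame length changes from call to call, while the receiver always recovers the same payload. In the real transports the padding travels inside the encrypted stream, so an observer cannot tell filler from data.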

What is a tunnel
I2P routers, that is, all nodes on the network, communicate with each other using the I2NP protocol. The current list of I2NP message types can be found in the source code; for a detailed explanation, refer to the official documentation.

d24df1060b5140f69aac7c5384a7646f.png


I2NP contains the complete set of message types required for communication between I2P routers. Pay attention to the types whose names contain the word Tunnel: they are the ones that interest us in this article. All user information is transmitted inside tunnels as encrypted blocks passed from router to router. As a result, the original source of the information is lost and its real recipient remains unknown, since nobody knows at which transit router the tunnel ends.
Two main requirements are imposed on the logic of tunnels: the tunnel creator must stay anonymous to the tunnel's participants, and the transit nodes of the chain must be compatible with each other in terms of transports. Tunnel building is covered briefly in the large overview article on I2P; here we go deeper into this complex mechanism and once again make sure that the level of anonymity in I2P is unmatched.

Analogy of I2P architecture with a conventional network
So that what follows does not read like a dry specification that cannot be understood even after two readings, let us simplify the initial picture of the I2P network architecture by comparing it with a familiar network.
In the illustration, the fundamentally similar levels of a conventional TCP/IP network and of an I2P network are placed opposite each other, on the same line.
Your device connects to the Internet through a router. This can be a home router with a wired or Wi-Fi connection, or a cellular operator's router with a 3G/4G/5G connection; the essence is the same. Before establishing a logical IP connection with the router, the device exchanges service information at a lower level, designated in the illustration (in simplified form) as "Ethernet frames". Once the logical IP connection with the router is established, you can reach computers around the world through it. When you open a site whose server is in a nearby city, your traffic passes through many nodes of Internet providers, while you yourself have a direct connection only to the first router. All intermediate nodes work with you at the level of IP connections: they pass packets of information from point A to point B. Each device along the path of your traffic also builds its IP connection with its neighbors on top of a lower level, just like you and your home router.
It turns out that many computers and routers around the world establish a low-level connection with their physical neighbors through wires (or other media) and, based on this connection, unite into an IP network, where the usual routing with IPv4 or IPv6 addresses reigns. The highest level of TCP/IP communication is indicated in the illustration as "application". This is the top of the pyramid: user traffic, for example a browser request to a site over HTTPS. Lower levels know nothing about your site and browser.
I2P works on top of TCP/IP but has its own additional structure, which partly mirrors the usual network with a strong bias towards privacy and anonymity. An I2P router is a software client on your device that implements all the internal network logic. I2NP messages are the basic means of communication between I2P routers. Tunnels are the analogue of IP connections: they provide interaction with routers in direct contact and, through them, with nodes that are not in direct contact. The top level in the illustration, the "garlic message", is user information delivered over tunnels but meaningful only to the final recipient, who can decrypt it.
Messages of the hidden network travel over the normal Internet, but their content can be understood only at a higher level, which requires an I2P router. A network with this property is called an overlay network. Strictly speaking, the term "deep Internet" applied to I2P and other hidden networks ought to say the opposite, for example "high-level Internet".

Garlic message forming a tunnel
Each I2P router publishes information about itself to floodfills, the reference nodes of the network. A router's full address record is called "Router Info", or simply "RI". In addition to reachability information (IP addresses and introducer addresses) and some service information, the RI contains the router's public encryption key.
The local network database of an I2P router (netDb) consists of the RI files of other network participants. When a router needs to build a tunnel, its future participants are selected first. Candidates are searched for in the local netDb. The selection takes into account each router's declared bandwidth, transport compatibility with its neighbors, and profile data, if any. A profile is the router's local record of its interactions with a specific network participant together with an assessment of that participant's stability; an unstable router will not be used when building a tunnel.
Once candidates are selected, a garlic message is generated. If the tunnel being built is no longer than four hops, the garlic always contains four cloves, that is, four messages. Otherwise the garlic has eight cloves (the maximum possible tunnel length). By default, tunnels are three hops long.
The principle of the garlic message is that each participant in the tunnel receives the full set of messages (the garlic) but can read only the one clove intended for it. The whole garlic is then passed on according to the instructions found in that clove. Recipients recognize their cloves by the first 16 bytes, which are the beginning of the hash of their encryption key. After reading its clove, the participant replaces the clove's contents with its reply. To keep the reply secret, it is encrypted with a symmetric key.
The most interesting nuance is that a node can see only its own clove in the garlic. This prevents it from guessing the other participants of the tunnel from their key hashes. This is achieved by additional symmetric encryption of the entire garlic. Perhaps someday there will be a separate note about this, but for the most daring here is the source code:
libi2pd/Tunnel.cpp
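How a participant finds its own clove by a 16-byte hash prefix can be sketched as follows. This is a simplified model and not the i2pd implementation: the clove layout, the random filling of unused cloves and the names `clove_label`, `make_garlic` and `find_my_clove` are assumptions made for this illustration, and the encryption of clove contents is omitted.

```python
import hashlib
import os

PREFIX_LEN = 16  # bytes of the key hash that label a clove


def clove_label(router_key: bytes) -> bytes:
    """First 16 bytes of the SHA-256 hash of a router's key."""
    return hashlib.sha256(router_key).digest()[:PREFIX_LEN]


def make_garlic(hop_keys, total_cloves=4):
    """Build one labeled clove per hop; fill the remaining slots with random bytes."""
    cloves = [
        clove_label(key) + b"instructions for hop %d" % index
        for index, key in enumerate(hop_keys)
    ]
    while len(cloves) < total_cloves:
        # Assumption: unused cloves look like random data.
        cloves.append(os.urandom(PREFIX_LEN + 22))
    return cloves


def find_my_clove(garlic, my_key):
    """Return the index of the clove labeled with my key hash, or None."""
    label = clove_label(my_key)
    for index, clove in enumerate(garlic):
        if clove[:PREFIX_LEN] == label:
            return index
    return None


garlic = make_garlic([b"key-A", b"key-B", b"key-C"])
print(find_my_clove(garlic, b"key-B"))
```

A router whose key is not among the hops finds no matching label and therefore learns nothing from the plain comparison alone.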

Today there are three types of garlic messages:
  • Old. Used when all transit routers in the tunnel use ElGamal encryption. Each clove is 528 bytes and contains AES keys: a one-time key for encrypting the reply, an initialization vector (IV), an IV encryption key, and a main key used throughout the life of the tunnel for onion (multilayer symmetric) encryption.
  • Transitional. Used when the transit routers include both nodes with the old ElGamal encryption and nodes with the new ECIES. Each clove is 528 bytes, but for ECIES nodes the reply is encrypted differently: the symmetric encryption key is not transmitted directly but is computed using the Noise protocol (Noise_N).
  • New, the "short message garlic". Used when all transit routers use ECIES encryption and run at least version 2.39.0 of i2pd or 1.5.0 of the Java router. Each clove is 218 bytes. The cloves do not contain the keys listed above, because in this case all of them are computed. AES is used only for the main encryption of the tunnel; everywhere else the AEAD/ChaCha20/Poly1305 and ChaCha20 algorithms are used. Read more about short garlic messages in the spec.
The distinguishing feature of the new type of garlic message is its size: a standard four-clove garlic fits into one kilobyte. At the time of publication of this article, all I2P tunnel messages are split into one-kilobyte chunks. This standard size is itself a measure against traffic analysis.
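These sizes are easy to check with simple arithmetic. The sketch below assumes that one tunnel message carries about 1024 bytes and ignores headers and fragmentation overhead; `TUNNEL_MESSAGE_SIZE` and `tunnel_messages_needed` are names invented for this illustration.

```python
import math

TUNNEL_MESSAGE_SIZE = 1024  # simplification: about one kilobyte per tunnel message


def tunnel_messages_needed(clove_size: int, clove_count: int = 4) -> int:
    """Number of one-kilobyte tunnel messages needed to carry the whole garlic."""
    return math.ceil(clove_size * clove_count / TUNNEL_MESSAGE_SIZE)


print(tunnel_messages_needed(528))  # 4 * 528 = 2112 bytes -> 3
print(tunnel_messages_needed(218))  # 4 * 218 = 872 bytes  -> 1
```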

The minimum-size garlic with 528-byte cloves requires three tunnel messages. Despite traffic obfuscation, a pattern can hypothetically be identified: three one-kilobyte messages in a row are a standard garlic. Seeing this characteristic traffic would mean that the observed user is either building a tunnel or is a transit link in another user's tunnel. In practice this threat belongs to the realm of fantasy, because tunnels live for only ten minutes and active nodes constantly participate in more than two thousand transit tunnels. You can imagine how much garlic flies through such a router in just a minute...
With short messages the whole garlic fits into one tunnel message, so the new garlic blends completely into the rest of the traffic. The spy's last hope of spotting three one-kilobyte messages in a row is gone. But it is not all about paranoia, ladies and gentlemen! One tunnel message instead of three also speeds up tunnel building: the fewer packets are sent, the lower the chance that one of them is lost, and a smaller amount of data is transmitted faster.

Outbound and inbound tunnels are built in a similar way:
  • When building an inbound tunnel, the creator router stays incognito by sending the garlic to the first participant through its own outbound tunnel. The garlic is then passed from node to node and comes back to the tunnel creator. To the last member of the chain, the creator is indistinguishable from the next transit node.
  • When building an outbound tunnel, the picture is reversed: the creator router sends the garlic directly to the nearest router in the chain, which cannot know whether the sender is the owner of the garlic or just another transit node. The clove of the last transit node contains the information needed to send the garlic back to the creator. After passing through the entire chain, the garlic returns to the creator through one of its inbound tunnels.
At startup an I2P router has no real tunnels, only two zero-length tunnels. When the first full-fledged tunnels are created, the necessary requests go not through an anonymizing chain of nodes but directly, through a zero-length tunnel. Because this is a rare event and a request through a zero-length tunnel is indistinguishable from a normal one, this is not considered a weak point of the I2P architecture.

Tunnels in a practical network context
In addition to encryption keys, the cloves of the garlic contain tunnel numbers (4 bytes each). One is the number of the tunnel on which the router will receive messages; the other is the number of the tunnel at the next transit node, whose address is attached, to which the current transit router will forward incoming data.
Garlic messages contain only the short addresses of the routers to which the garlic must be passed on along the chain. A router's address is a SHA256 hash of its full Router Info. If the required router is not in the local database, the transit node has to ask a floodfill for its full address (RI). When composing the garlic, the tunnel creator takes into account the transport compatibility of neighboring nodes, which reduces the likelihood of a failed tunnel build. For example, it is silly to ask a router without an IPv6 address to contact a network participant that has only IPv6.
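The compatibility check from the IPv6 example above can be sketched like this. It is a deliberately reduced model: real routers compare published transport addresses, while here each hop has just two hypothetical flags, and `Hop`, `can_connect` and `chain_is_compatible` are names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """Hypothetical summary of what a candidate router can reach."""
    name: str
    ipv4: bool
    ipv6: bool


def can_connect(a: Hop, b: Hop) -> bool:
    """Two routers can talk directly if they share at least one address family."""
    return (a.ipv4 and b.ipv4) or (a.ipv6 and b.ipv6)


def chain_is_compatible(hops) -> bool:
    """Every pair of neighbors in the chain must be able to connect."""
    return all(can_connect(a, b) for a, b in zip(hops, hops[1:]))


v4_only = Hop("A", ipv4=True, ipv6=False)
v6_only = Hop("B", ipv4=False, ipv6=True)
dual = Hop("C", ipv4=True, ipv6=True)

print(chain_is_compatible([v4_only, v6_only, dual]))  # A cannot reach B
print(chain_is_compatible([v4_only, dual, v6_only]))  # C bridges both families
```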
A tunnel build is considered failed if no reply arrives for a long time, or if the garlic has returned but one of the routers put a refusal to participate in its reply (for example, because it has reached its limit on the number of transit tunnels).
Once a tunnel is created, its lifetime is limited to ten minutes. After that, all transit nodes stop accepting packets for the old tunnel, so by then its creator needs to have a new one; it is built shortly before the old one expires. When inbound tunnels are renewed on a server endpoint that waits for requests from other network participants, its LeaseSet is updated as well: this is the contact information published on floodfills, which includes information about the inbound tunnels.
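The ten-minute life cycle can be modelled with two checks. The ten-minute lifetime comes from the text above; the 60-second rebuild margin, the `Tunnel` class and its method names are assumptions made for this illustration, since the article only says that the replacement is built "shortly before" expiry.

```python
from dataclasses import dataclass

TUNNEL_LIFETIME = 600  # seconds, the ten minutes mentioned above
REBUILD_MARGIN = 60    # hypothetical: start rebuilding one minute before expiry


@dataclass
class Tunnel:
    created_at: float  # creation time in seconds

    def is_expired(self, now: float) -> bool:
        """Transit nodes no longer accept packets for this tunnel."""
        return now - self.created_at >= TUNNEL_LIFETIME

    def needs_replacement(self, now: float) -> bool:
        """The creator should already be building a successor."""
        return now - self.created_at >= TUNNEL_LIFETIME - REBUILD_MARGIN


tunnel = Tunnel(created_at=0.0)
for now in (500.0, 550.0, 600.0):
    print(now, tunnel.needs_replacement(now), tunnel.is_expired(now))
```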
Strictly speaking, garlic encryption is used in I2P only when creating tunnels. In operation a tunnel uses only onion encryption (plus end-to-end encryption from user to user).
Onion encryption is the term for multilayer encryption with symmetric keys. As a reminder: in symmetric encryption, encryption and decryption are performed with the same key, unlike asymmetric algorithms, where encryption is done with a public key and decryption with a private key. Asymmetric encryption is used for end-to-end encryption.
All tunnels are unidirectional, and transit nodes know nothing about them: neither whether the tunnel is inbound or outbound, nor how many participants it has, and so on. Their task is to encrypt the transmitted data with the symmetric key they received (or derived mathematically) and pass the resulting packet to the next node.
Only the end nodes of tunnels have special roles: the last node of an outbound tunnel is the Endpoint, and the first node of an inbound tunnel is the Gateway. Unlike ordinary "middle" nodes, these two know their place in the chain thanks to special flags they receive in the garlic. The Endpoint's task is to assemble one-kilobyte tunnel messages into a larger packet (up to 64 kilobytes) and pass it on according to the instructions (into an inbound tunnel, directly to another router, or to local processing). The Gateway's task is the opposite: to split received messages into standard one-kilobyte fragments and send these fragments on along the inbound tunnel.
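The division of labor between Gateway and Endpoint can be sketched as plain splitting and joining of bytes. This shows only the idea: real tunnel messages carry delivery instructions and are padded to a fixed size, while here the last fragment is simply shorter. The constants follow the sizes named in the text; the function names are invented for this illustration.

```python
from typing import List

FRAGMENT_SIZE = 1024          # one-kilobyte tunnel message (simplified, no headers)
MAX_MESSAGE_SIZE = 64 * 1024  # upper bound on the reassembled packet


def gateway_split(message: bytes) -> List[bytes]:
    """Gateway side: cut a message into one-kilobyte fragments."""
    if len(message) > MAX_MESSAGE_SIZE:
        raise ValueError("message exceeds 64 kilobytes")
    return [
        message[offset:offset + FRAGMENT_SIZE]
        for offset in range(0, len(message), FRAGMENT_SIZE)
    ]


def endpoint_join(fragments: List[bytes]) -> bytes:
    """Endpoint side: put the fragments back together in order."""
    return b"".join(fragments)


message = bytes(range(256)) * 10  # 2560 bytes of sample data
fragments = gateway_split(message)
print(len(fragments), endpoint_join(fragments) == message)
```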

0bea398a80e55eb1edc86bc373a26b89.gif


When sending a packet, the creator router decrypts it in turn with the keys of all transit nodes. This is done so that, after every transit node has encrypted the data, it ends up in its original form. This relies on a property of the symmetric encryption algorithms AES and ChaCha20. It sounds a little complicated, but the bottom line is that encryption and decryption are mirror operations: "decrypting" the original data actually scrambles it, and to recover it later you have to perform the operation in the opposite direction, that is, encrypt.
On the last router of the outbound tunnel, the Endpoint, the last onion encryption layer is removed and the data is passed to the inbound tunnel of the other side. Even though all onion encryption has been removed at this point, user data is not exposed, because end-to-end encryption was applied to it before anything else.
The inbound tunnel is easier to grasp: each transit node adds encryption with its own key (here it really is just a new encryption layer) and passes the resulting packet to the next node. The final recipient removes all onion encryption, decrypts the data with its asymmetric key and checks the signature, after which the decrypted data goes up to a higher level and is handed to the external application in its original form.
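The "decrypt in advance, let the hops encrypt" trick is easier to see in code. The sketch below deliberately uses a toy cipher (byte-wise addition of a SHA-256 based keystream), not AES or ChaCha20; it is insecure and serves only to make encryption and decryption visibly different mirror operations. Unlike the real algorithms, the toy cipher does not care about the order of the layers. All function names are assumptions made for this illustration.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus counter, repeated until long enough."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Add the keystream byte by byte (mod 256). Not real cryptography."""
    return bytes((d + k) % 256 for d, k in zip(data, keystream(key, len(data))))


def toy_decrypt(key: bytes, data: bytes) -> bytes:
    """Mirror operation: subtract the keystream byte by byte (mod 256)."""
    return bytes((d - k) % 256 for d, k in zip(data, keystream(key, len(data))))


def creator_wrap_outbound(message: bytes, hop_keys) -> bytes:
    """Outbound tunnel: the creator 'decrypts' with every hop key in advance."""
    packet = message
    for key in hop_keys:
        packet = toy_decrypt(key, packet)
    return packet


def hop_process(packet: bytes, key: bytes) -> bytes:
    """A transit node only ever encrypts with its own key."""
    return toy_encrypt(key, packet)


def creator_unwrap_inbound(packet: bytes, hop_keys) -> bytes:
    """Inbound tunnel: the creator removes every layer the hops added."""
    for key in reversed(hop_keys):
        packet = toy_decrypt(key, packet)
    return packet


keys = [b"hop-1", b"hop-2", b"hop-3"]
message = b"end-to-end encrypted payload"

# Outbound: after the last hop encrypts, the Endpoint holds the original bytes.
packet = creator_wrap_outbound(message, keys)
for key in keys:
    packet = hop_process(packet, key)
print(packet == message)

# Inbound: the hops add layers, the creator strips them all.
packet = message
for key in keys:
    packet = hop_process(packet, key)
print(creator_unwrap_inbound(packet, keys) == message)
```

In both directions the transit nodes run the same operation, which is why a node cannot tell from its own work whether the tunnel is inbound or outbound.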
In general terms, the complete user data transfer scheme looks as shown in the illustration.

816e23d9f0a4f8b77aff2ea7cfbf9dce.png


For an outbound tunnel:
  • The I2P router receives data from an external local application and packs it into I2NP Data (gzip + service headers);
  • End-to-end encryption is performed with the recipient's key (the key from the LeaseSet), producing I2NP Garlic;
  • The data is prepared for sending through a specific tunnel, producing an I2NP Tunnel message;
  • The traffic is wrapped in the cryptography of the transport protocol;
  • The data travels over the physical network to the first (nearest) node of the outbound tunnel.

For an inbound tunnel:
  • Packets destined for the I2P router arrive over the network;
  • The transport protocols are processed and the I2NP messages are decrypted;
  • The message is of type I2NP Tunnel, a tunnel message;
  • From the tunnel identifier the router understands that this message came from its own tunnel. Onion encryption is removed;
  • The resulting message of type I2NP Garlic is the original sender's message, encrypted with the recipient's asymmetric key. The router, holding the address that is the recipient of this message, decrypts it with the private key of the address (in reality one-time keys and more are involved, but let us spare our sanity!);
  • The resulting I2NP Data is the user data (gzip + service headers), which is unpacked and handed to the external application.
The description of the illustration focuses on user data received from the tunnel; there are also many service messages, and for them things work differently.
Note that I2NP Garlic is a type that denotes only a single-layer end-to-end encrypted message, not the garlic message used when building a tunnel. Perhaps garlic encryption was once planned here, but the practical implementation turned out otherwise.
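The first step of the outbound list and the last step of the inbound list, packing and unpacking I2NP Data, can be sketched with the standard gzip module. The four-byte length prefix below merely stands in for the "service headers" mentioned above and is an assumption, as are the function names; the actual header layout should be taken from the I2NP documentation.

```python
import gzip
import struct


def pack_data(payload: bytes) -> bytes:
    """Compress application data and prepend a hypothetical length header."""
    compressed = gzip.compress(payload)
    return struct.pack("!I", len(compressed)) + compressed


def unpack_data(blob: bytes) -> bytes:
    """Read the length header, then decompress exactly that many bytes."""
    (length,) = struct.unpack("!I", blob[:4])
    return gzip.decompress(blob[4:4 + length])


request = b"GET / HTTP/1.1\r\nHost: example.i2p\r\n\r\n"
print(unpack_data(pack_data(request)) == request)
```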

This is a simplified explanation of how I2P tunnels work, yet despite all efforts to simplify the material, it still turned out rather hard to digest. Be that as it may, the most important thing, I believe, is to have at least some understanding of a technology you dare to rely on for important tasks.
