Fundamentals of Cryptography (Encryption)

Father · Jul 2, 2021

When people talk about crypt in general, there are some fundamental principles. One of them is the Kerkhoffs principle, which says that open source is very important in cryptography. More specifically, it provides general knowledge about the protocol design. The point is very simple: the cryptographic algorithms that are used in a particular system should not be a secret that ensures its stability. Ideally, it is necessary to build systems so that their cryptographic side is fully known to the attacker and the only secret is the cryptographic key that is used in this system.

Modern and commercially available encryption systems - all or nearly all or the best of them - are built from components that are well known in design and operation. The only secret thing about them is the encryption key. There is only one significant exception known to me - a set of secret cryptographic protocols for all kinds of government organizations. In the USA it is called NSA suite B, and in Russia it is all sorts of strange secret encryption algorithms that are used to a certain extent by the military and government agencies.

I would not say that such algorithms are of great benefit to them, except that it is roughly like atomic physics. You can try to understand the design of the protocol to understand the direction of thought of the people who developed it, and somehow get ahead of the other side. I don't know how relevant this principle is by today's standards, but people who know about it more than me do just that.

The situation is different with every commercial protocol you come across. An open system is used everywhere, everyone adheres to this principle.
The first cryptographic primitive is symmetric ciphers.

They are very simple. We have some kind of algorithm, the input of which is a plain text and something called a key, some value. The output is an encrypted message. When we want to decrypt it, it is important that we take the same encryption key. And, applying it to another algorithm, the decryption algorithm, we get our plaintext back from the ciphertext.

What are the important nuances here? In most of the common symmetric encryption algorithms that you may encounter, the size of the ciphertext is always equal to the size of the plaintext. Modern encryption algorithms operate on key sizes. Key sizes are measured in bits. The current size is from 128 to 256 bits for symmetric encryption algorithms. We'll talk about the rest, including the block size, later.

Historically, in the conventional 4th century BC, there were two methods of cipher design: substitution and permutation ciphers. Substitution ciphers are an algorithm where in those days they replaced one letter of the message with another according to some principle. A simple substitution cipher - according to the table: we take the table where it is written that we change A to Z, B to Yu, etc. Further, we encrypt using this table, and decipher it using it.

How complex is the algorithm in your opinion in terms of key size? How many key options are there? The order of the factorial of the length of the alphabet. We take a table. How do we build it? Let's say there is a table with 26 characters. We can replace the letter A with any of them, the letter B - with any of the remaining 25, C - with any of the remaining 24 ... We get 26 * 25 * 24 * ... - that is, the factorial of 26. The factorial of the alphabet's dimension.

If you take log226 !, it will be a lot. I think you will definitely get in the region of 100 bits of the key length, or even more. It turned out that from the point of view of the formal representation of security, the specified encryption algorithm is quite good. 100 bits is acceptable. At the same time, everyone, probably in childhood or adolescence, when faced with encodings, saw that such algorithms are trivially decrypted. There are no problems with decoding.

For a long time there have been all sorts of substitution algorithms in different constructions. One of them, even more primitive, is the Caesar cipher, where the table is formed not by a random permutation of characters, but by a shift by three characters: A changes to D, B to E, etc. It is clear that the Caesar cipher, along with all its variants, to iterate very easy: unlike table substitution, there are only 25 variants in the Caesar key with 26 letters in the alphabet - not counting the trivial encryption into itself. And you can just sort it out with a complete search. There is some complexity here.

Why is the table lookup cipher so simple? Where does the problem arise in which we can easily, even without knowing anything about cryptography, decrypt table lookup? It's about frequency analysis. There are the most common letters - some I or E. Their prevalence is great, vowels are much more common than consonants, and there are negative pairs that are never found in natural languages - something like bb. I even gave my students the task of making an automatic substitution cipher decoder, and, in principle, many did it.

What is the problem? It is necessary to distort the statistics of the distribution of letters so that common letters do not shine so much in the cipher text. The obvious way: let's encrypt the most common letters not in one character, but in five different ones, for example. If a letter occurs on average five times more often, then let's take turns - first we will encrypt the first character, then the second, the third, etc. Then we get a mapping of letters not 1 to 1, but, conditionally, 26 k 50. Statistics will thus be violated. This is the first example of a polyalphabetic cipher that somehow worked. However, there are quite a few problems with it, and most importantly, it is very inconvenient to work with the table.

Then we came up with: let's not encrypt with such tables, but try to take the Caesar cipher and change the shift for each next letter. The result is the Vigenère cipher.

We take the word VASYA as a key. We take the message of MASHA. We use the Caesar cipher, but counting from these letters. For example, B is the third letter in the alphabet. We have to shift the corresponding letter in the plaintext by three letters. M shifts to P. A in A. Sh - by 16, we jump over the letter A, we get, conditionally, D. I will shift A to I. PADYA.

What is convenient in the resulting cipher? There were two identical letters, but as a result they were encrypted into different ones. This is cool because it blurs the statistics. The method worked well until sometime in the 19th century, just recently against the background of the history of cryptography, they figured out how to break it. If you look at a message of several dozen words, and the key is rather short, then the whole structure looks like several Caesar ciphers. We say: okay, let's consider every fourth letter - the first, fifth, ninth - as a Caesar cipher. And let's look for statistical patterns among them. We will definitely find them. Then we take the second, sixth, tenth, and so on. We'll find it again. This will restore the key. The only problem is figuring out how long it is. It's not very difficult, how long can it be? Well 4, well 10 characters. Going through 6 options from 4 to 10 is not very difficult. A simple attack - it was available without computers, just with a pen and a sheet of paper.

How to make an unbreakable cipher out of this thing? Take the key of the text size. A character named Claude Shannon in the twentieth century, in 1946, wrote the classic first work on cryptography as a branch of mathematics, where he formulated a theorem. The length of the key is equal to the length of the message - he used XOR instead of modulo addition equal to the length of the alphabet, but in this situation it is not very important. The key is randomly generated, is a sequence of random bits, and the output will also be a random sequence of bits. Theorem: if we have such a key, then such a construction is absolutely resistant. The proof is not very complicated, but I will not talk about it now.

It is important that you can create an unbreakable cipher, but it has drawbacks. First, the key must be completely random. Second, it should never be reused. Third, the length of the key must be equal to the length of the message. Why can't you use the same key to encrypt different messages? Because, by intercepting this key the next time, it will be possible to decrypt all messages? Not. Will the Caesar cipher be visible in the first characters? I didn't quite understand. It seems not.

Let's take two messages: MASHA, encrypted with the key VASYA, and another word, which also had the key VASYA, - VERA. We get something like the following: ZESHYA. Let's add the two received messages, and so that the two keys are mutually deleted. As a result, we get only the difference between a meaningful ciphertext and a meaningful ciphertext. On XOR it is more convenient than on addition along the length of the alphabet, but there is practically no difference.

If we get the difference between two meaningful ciphertexts, then further, as a rule, it becomes much easier, since the texts in natural language have a high redundancy. Often we can guess what is happening by making different assumptions, hypotheses. And the main thing is that each correct hypothesis will reveal to us a piece of the key, and therefore pieces of two ciphertexts. Something like this. Therefore, it is bad.

In addition to substitution ciphers, there were also transposition ciphers. With them, too, everything is quite simple. We take VASYAI's message, write it into a block of some length, for example, in DIDOM, and read the result in the same way.

Not God knows what a thing. How to break it is also clear - we will sort out all possible variants of permutations. There are not very many of them. We take the length of the block, select and restore.
As the next iteration, the following method was chosen: we take everything the same, and on top we write some key - SIMON. Let's rearrange the columns so that the letters are in alphabetical order. As a result, we get a new key permutation. It is already much better than the old one, since the number of permutations is much larger and it is not always easy to pick it up.

Every modern cipher in one way or another is based on these two principles - substitution and permutation. Now their use is much more complex, but the basic principles themselves have remained the same.

If we talk about modern ciphers, they are divided into two categories: stream and block. The stream cipher is designed in such a way that in fact it is a random number generator, the output of which we add modulo 2, "xorim", with our ciphertext, as you can see on my slide. Earlier I said: if the length of the resulting key stream - it is the key - is absolutely random, it is never reused and its length is equal to the length of the message, then we got an absolutely strong cipher, unbreakable.

The question arises: how to generate a random, long and eternal Key for such a cipher? How do stream ciphers generally work? In essence, they are a random number generator based on some kind of seed. The initial value is the cipher key, the answer.

There is one interesting exception to this story - cipher pads. This is a real spy story about real espionage. Some people who need absolutely stable communication generate random numbers - for example, literally throwing a dice or literally taking balls out of a drum, like in lotto. Create two sheets where these random numbers are printed. One sheet is given to the recipient, and the second is left with the sender. When they want to chat, they use this stream of random numbers as the key stream. No, the story is not taken from the very distant past. I have a real radio intercept from October 15, 2014: 7 2 6, 7 2 6, 7 2 6. This is the callsign. 4 8 3, 4 8 3, 4 8 3. This is the number of the cipher pad. 5 0, 5 0, 5 0. This is the number of words. 8 4 4 7 9 8 4 4 7 9 2 0 5 1 4 2 0 5 1 4 etc. 50 such numerical groups. I do not know where, somewhere not in Russia, a man with a pen and a pencil was sitting by an ordinary radio receiver and was writing down these numbers. Having written them down, he took out a similar thing, folded them modulo 10 and received his message. In other words, it really works, and such a message cannot be cracked. If really good random numbers were generated and he later burned a piece of paper with a key, then hacking cannot be done in any way, at all.

But there are quite a few problems here. The first is how to generate really good random numbers. The world around us is deterministic, and if we are talking about computers, they are completely deterministic.

Secondly, delivering keys of this size ... if we are talking about transferring messages from 55 digital groups, then doing this is not very difficult, but transferring several gigabytes of text is already a serious problem. Therefore, some algorithms are needed that, in fact, generate pseudo-random numbers based on some small initial value, and which could be used as such streaming algorithms.

The most commonly used algorithm of this kind is called RC4. It was developed by Ron Rivest about 25 years ago and has been actively used for a very long time, it was the most common algorithm for TLS, all of its different variants, including HTTPS. But lately RC4 has started to show its age. There are a number of attacks for him. It is actively used in WEP. There was one good lecture by Anton , a story that shows: poor use of a decent encryption algorithm, even by today's standards, leads to the fact that the entire system is compromised.

RC4 is simple. The slide describes his work in full. There is an internal byte state of 256 bytes. At each step of this state, there are two numbers, two pointers to different bytes in the state. And at each step, there is an addition between these numbers - they are placed in some place in the state. The byte received from there is the next byte in the numerical sequence. By rotating this knob in this way, performing a similar action at each step, we get each next byte. We can receive the next byte of the numerical sequence forever, in a stream.

A great advantage of RC4 is that it is entirely intrabyte, which means that its software implementation works quite quickly - much faster, several times, if not tens of times faster than the DES cipher comparable and existing at about the same time as it. This is why RC4 is so widespread. It was a trade secret of RSA for a long time, but then, somewhere in the 90s, some people anonymously published the source code of his device on the cypherpunks mailing list. As a result, a lot of drama arose, there were shouts, they say, how could it be that some indecent people stole the intellectual property of RSA and published it. RSA began to threaten all patents, all kinds of legal prosecution. To avoid them, all implementations of the algorithm that are in open source are called not RC4, but ARC4 or ARCFOUR. A - alleged. It's about a cipher,

If you are configuring some SSH or OpenSSL, you will not find RC4 in it, but you will find ARC4 or something similar. Simple construction, it is already old, there are attacks on it now, and it is not very recommended for use.

There have been several attempts to replace it. Probably, in my preconceived opinion, the most successful was the Salsa20 cipher and several of its followers from the character Dan Berstein, widely known in narrow circles. To Linux users, he is commonly known as the author of qmail.

Salsa20 is more complex than DES. Its block diagram is complex, but it has several interesting and cool properties. To begin with, it is always executed in a finite time, each of its rounds, which is important to protect against timing attacks. These are attacks where the attacker observes the behavior of the encryption system by feeding it different ciphertexts or different keys behind this black box. And, understanding the changes in the response time or in the power consumption of the system, he can draw conclusions about exactly what processes have occurred inside. If you think the attack is highly contrived, it is not. Attacks of this kind on smart cards are very widespread - very convenient, since the attacker has full access to the box. The only thing he, as a rule, cannot do in it is to read the key itself. It's complicated,

Salsa20 is designed so that it always executes in a constant, equal amount of time. Internally, it consists of only three primitives: a constant time shift, and addition modulo 2 and modulo 32, 32-bit words. The Salsa20 is even faster than the RC4. It has not yet gained such widespread acceptance in mainstream cryptography - we do not have a cipher suite for TLS using Salsa20 - but it is still slowly becoming mainstream. This cipher became one of the winners of the eSTREAM competition for choosing the best stream cipher. There were four of them, and Salsa is one of them. It is slowly starting to appear in all sorts of open source products. Perhaps soon - maybe in a couple of years - there will even be a cipher suite in TLS with Salsa20. I really like him.

There is a certain amount of cryptanalysis on it, there are even attacks. From the outside, it looks like a streamed one, generating a sequence of almost arbitrary length based on the key, 264. But inside it works like a block one. The algorithm has a place where you can substitute the block number, and it will return the specified block.

What's the problem with stream ciphers? If you have a stream of data traveling over a network, a stream cipher is handy for it. A packet flew in to you, you encrypted and transmitted it. The next one flew in - they applied this scale and passed it on. The first byte, the second, the third go through the network. Conveniently.

If the data, for example a gigabyte file in its entirety, is encrypted on the disk with a stream cipher, then in order to read the last 10 bytes, you will first need to generate the gamma of the cipher stream by 1 gigabyte, and then take the last 10 bytes from it. Very uncomfortable.

In Salsa, this problem is solved, since it also receives the number of the block that needs to be generated at the input. Then the algorithm is applied to the block number 20 times. 20 rounds - and we get 512 bits of the output stream.

The most successful attack is 8 rounds. It itself is 256-bit, and the complexity of an attack in 8 rounds is 250 or 251 bits. It is considered to be very stable, good. There is a public cryptanalysis for it. Despite all the odiousness of Berstein's personality in this aspect, it seems to me that the thing is good and it has a greater future.

Historically, there have been many stream ciphers. They are pioneers not only in commercial encryption, but also in military encryption. They used what were called linear shift registers.

What are the problems here? First, in classic stream ciphers, not in Salsa, in order to decrypt the last value of a gigabyte file, the last byte, you need to generate a sequence per gigabyte first. From it you only use the last byte. Very uncomfortable.

Stream ciphers are poorly suited for systems with inconsistent access, the most common example of which is the hard disk.
There is one more problem, we will talk about it further. It manifests itself very clearly in stream ciphers. Together, the two problems have made it great to use some other mechanism.

Another mechanism for symmetric encryption is called a block cipher. It is arranged a little differently. It does not generate this key stream, which must be correlated with our ciphertext, but works similarly - like a lookup table. It takes a block of text of a fixed length, outputs the same block of text, and that's it.

The block size in modern ciphers is usually 128 bits. There are different variations, but as a rule, we are talking about 128 or 256 bits, no more and no less. The key size is exactly the same as for streaming algorithms: 128 or 256 bits in modern implementations, inside and out.

Of all the widespread block ciphers, two can now be named - DES and AES. DES is a very old cipher, as old as RC4. DES now has a block size of 64 bits and a key size of 56 bits. It was created at IBM under the name Lucifer. When it was designed by Horst Feistel at IBM, they suggested 128 bits for the block size. And the key size was variable, from 124 to 192 bits.

When DES began to pass standardization, it was submitted for review, including the NSA. From there it came back with a block size reduced to 64 bits and a key size reduced to 56 bits.
20 years ago, this whole story made a lot of noise. Everyone said - they must have embedded a bookmark there, awful, picked up such a block size to be able to attack. However, the great advantage of DES is that it is the first cipher that was standardized and then became the basis for commercial cryptography.

He was attacked a lot and researched a lot. There are a large number of all kinds of attacks. But there is still not a single practically feasible attack, despite its rather venerable age. The only thing is that the key size of 56 bits is now simply unacceptable and you can brute force attack.

How does DES work? Feistel did a cool thing called the Feistel net. She operates with blocks. Each block entering the entrance is divided into two parts: left and right. The left side becomes the right side unchanged. The right side is corrected with the result of calculating a certain function, the input of which is the left side and the key. After this transformation, the right side becomes left.

It has several interesting advantages. The first important advantage: the function F can be anything. It does not have to be reversible, and it may not be linear or nonlinear. All the same, the cipher remains symmetric.

The second very convenient property: decryption works in the same way as encryption. If you need to decrypt a given network, you put the ciphertext into the old mechanism instead of the plaintext and at the output you get the plaintext again.

Why is it convenient? 30 years ago, convenience was a consequence of the fact that encryptors were hardware and it was time consuming to design a separate chipset for encryption and decryption. And in this design, everything is very cool, in fact, we can use one block for different tasks.

In a real situation, such a construction is one round of a block cipher, that is, in a real cipher it is executed 16 times with different keys. Each 16 round generates a separate key and 16 round subkeys, each of which is applied on each round for function F.

The round also looks pretty simple - it consists of only two or three operations. The first operation: the size of the half-block that comes across becomes equal to 32 bits, the half-block passes the expansion function, 32 bits are received at the input. Then, using a special unclassified table, we add a little to 32 bits, turning them into 48: some bits are duplicated and rearranged, such a comb.

Then we xorize it with a round key, the size of which is also 48 bits, and we get a 48-bit value.
It then goes into a set of functions called S-boxes that convert each bit of the input to four bits of the output. Therefore, at the output, we get 32 bits out of 48 bits again.

And finally, the final permutation P. It shuffles the 32 bits together again. Everything is very simple, the round function is as simple as possible.

Its most interesting property lies in the indicated S-boxes: a very complex transformation of 6 bits into 4. If you look at the whole structure, you can see that it consists of XOR and a pair of permutations. If S-boxes were simple, the whole DES would actually be some kind of linear transformations. It could be thought of as a matrix by which we multiply our plaintext to get the ciphertext. And then the attack on DES would be trivial: it would just be required to find the matrix.

All nonlinearity is concentrated in specially selected S-boxes. There are various anecdotes about how exactly they were selected. In particular, about 10 years after DES was published and standardized, cryptographers found a new type of attack - differential cryptanalysis. The essence of the attack is very simple: we make small changes in the plaintext - changing, for example, the value of one bit from 0 to 1 - and see what happens to the ciphertext. It turned out that in an ideal cipher, a change in one bit from 0 to 1 should lead to a change in exactly half of the ciphertext bits. It turned out that DES, although it was made before differential cryptanalysis was discovered, proved to be resistant to this type of attack. As a result, at one time another wave of paranoia arose: they say,

More than one hundred articles are devoted to the analysis of the device of S-boxes. There are cool articles that are titled something like this: features of the statistical distribution of the output bits in the fourth S-box. Because the cipher is many years old, it has been thoroughly researched in different places and remains quite stable even by today's standards.

56 bits now can already be simply enumerated on a cluster of general-purpose machines - maybe even on one. And this is bad. What can be done?

It is impossible to simply shift the size of the key: the whole structure is tied to its length. Triple DES. The obvious answer was this: let's encrypt our block several times, arrange several sequential encryptions. And here everything is not too trivial.

Let's say we take and encrypt twice. First, you need to prove that for encryptions k1 and k2 on two different keys, there is no such encryption on the key k3 that the performance of the two specified functions will be the same. This is where the property that DES is not a group comes into play. There is a proof of this, albeit not a very trivial one.

Okay 56 bit. Let's take two - k1 and k2. 56 + 56 = 112 bits. 112 bits, even by today's standards, is a perfectly acceptable key length. Anything over 100 bits can be considered normal. So why can't you use two encryptions, 112 bits?

One DES encryption consists of 16 rounds. The net is applied 16 times. Changes from left to right occur 16 times. And he is not a group. There is evidence that there is no such key k3 with which we could decrypt the text, sequentially encrypted by the keys we have chosen k1 and k2.

There is an attack. Let's encrypt all possible texts using some key, take a ciphertext and try to decrypt it using all arbitrary keys. Both here and here we get 256 options. And somewhere they will converge. That is, for two times 256 variants each - plus memory for storing all decryptions - we will find such a combination of k1 and k2, at which the attack will be feasible.

The effective strength of the algorithm is not 112 bits, but 57 if we have enough memory. It takes quite a lot of memory, but nonetheless. Therefore, we decided that it is impossible to work this way, let's encrypt three times: k1, k2, k3. The design is called Triple DES. Technically, it can be arranged in different ways. Since encryption and decryption are the same in DES, real algorithms sometimes look like this: encrypt, decrypt and decrypt again - to make operations easier in hardware implementations.

Our reverse implementation of Triple DES will turn into a hardware DES implementation. This can be very handy in different situations for backward compatibility.

Where was DES used? Generally everywhere. It can still be seen occasionally for TLS, there are cipher suites for TLS using Triple DES and DES. But there he is actively dying off, since we are talking about software. The software is easy to update.

But in ATMs, he died for a very long time, and I'm not sure that he finally died. I don’t know if a separate lecture is needed on how this design works in ATMs. In short, the keyboard where you enter the PIN is a self-sufficient thing in itself. Keys are loaded into it, and it gives out not a PIN, but a PIN-block design. The construction is encrypted - for example, via DES. Since there are a huge number of ATMs, there are many old ones among them, and you can still find an ATM where not even Triple DES is implemented inside the box, but ordinary DES.

Once DES began to show its age, it became difficult with it, and people decided to come up with something newer. The American Bureau of Standards, which is called NIST, said: let's run a competition and pick a cool new cipher. It was AES.

DES stands for digital encrypted standard. AES - advanced encrypted standard. The block size in AES is 128 bits, not 64. This is important from a cryptographic point of view. The key size for AES is 128, 192 or 256 bits. AES does not use the Feistel network, but it is also multi-round, it also repeats relatively primitive operations several times. For 128 bits, 10 rounds are used, for 256 bits - 14.

Now I'll show you how each round works. The first and last rounds are slightly different from the standard pattern - for good reason.
As with DES, each round of AES has its own round keys. They are all generated from the encryption key for the algorithm. At this point, AES works the same as DES. A 128-bit key is taken, from which 10 subkeys are generated for 10 rounds. Each subkey, as in DES, is applied on each particular round.

Each round consists of four fairly simple operations. The first round is a substitution according to a special table.
In AES, we build a 4-by-4 byte matrix. Each element of the matrix is a byte. The total is 16 bytes or 128 bits. They make up the entire AES block.

The second operation is a byte shift.
It is arranged in a simple, primitive way. We take a 4-by-4 matrix. The first row remains unchanged, the second row is shifted 1 byte to the left, the third by 2 bytes, the fourth by 3, cyclically.

Next, we mix inside the columns. This is also a very simple operation. It actually rearranges the bits within each column, nothing else happens. You can think of it as multiplying it by a special function.

The fourth, again very simple operation, is XOR each byte in each column with the corresponding key byte. The result is obtained.
In the first round, only the keys are added, and the other three operations are not used. In the last round, there is no such mixing of columns:

The point is that it would not add any cryptographic strength and we can always reverse the last round. We decided not to slow down the structure with an unnecessary operation.
We repeat the 4 described steps 10 times, and at the exit from the 128-bit block, we again get a 128-bit block.

What are the advantages of AES? It operates on bytes, not bits like DES. AES is much faster in software implementations. If we compare the speed of execution of AES and DES on a modern machine, AES will be many times faster, even if we talk about implementation exclusively in program code.

The manufacturers of modern processors, Intel and AMD, have already developed assembly instructions for implementing AES inside a chip, because the standard is quite simple. As a result, AES is even faster. If through DES on a modern machine we can encrypt, for example, 1-2 gigabits, then a 10-gigabit AES encryptor is nearby and is commercially available to ordinary companies.

The block algorithm encrypts block into block. It takes a 128 or 64 bit block and turns it into a 128 or 64 bit block.

What do we do if we need more than 16 bytes?
The first thing that comes to mind is to try to split the original message into blocks, and to supplement the block that remains incomplete with a standard, known and fixed data sequence.

Yes, obviously, we will break everything into blocks of 16 bytes and encrypt it. This encryption is called ECB - electronic code boot, when each of the blocks of 16 bytes in the case of AES or 8 bytes in the case of DES is encrypted independently.

We encrypt each block, get the ciphertext, add the ciphertexts and get the full result.

Something like this looks like a picture encrypted in ECB mode. Even if we imagine that the cipher is completely secure, it seems that the result is less than satisfactory. What is the problem? In that it is a bijective mapping. For the same input, you will always get the same output, and vice versa - for the same ciphertext, you will always get the same plain text.

It would be necessary to somehow contrive and make sure that the output at the output always turns out to be different, depending on the location of the block - despite the fact that the same blocks of ciphertext are fed to the input. The first solution was the CBC mode.

We not only take the key and the plaintext, but also generate a random number that is not secret. It's about the size of a block. It is called an initialization vector.

When encrypting the first block, we take an initialization vector, add it modulo 2 with clear text, and encrypt it. The output is ciphertext. Then we add the received ciphertext mod 2 with the second block and encrypt. The output is the second block of ciphertext. We add it modulo 2 with the third block of plaintext and encrypt it. At the output, we get the third block of ciphertext. Here you can see the adhesion: we concatenate each next block with the previous one.

The result will be a picture where everything, starting from the second block, is evenly smeared, and the first block each time depends on the initialization vector. And it will be absolutely mixed. Everything is fine here.

However, CBC has several problems.

About the block size. Imagine: we started to encrypt and, let's say, we have DES. If DES were the ideal encryption algorithm, the DES output would look like 64-bit, evenly distributed random numbers. What is the probability that, in a sample of 64-bit uniformly distributed random numbers, two numbers will coincide for one operation? 1 / (264). What if we compare three numbers?

Tomcat · Oct 28, 2021

Encryption at a glance

Encryption is the mathematical science of codes, ciphers and classified messages. Throughout history, people have used encryption to exchange messages, the contents of which, they hoped, could not be read by anyone but the addressees. Today there are computers capable of performing encryption, and digital encryption technologies have gone beyond simple plots. Encryption is now used for more complex tasks, such as verifying the identity of the sender of a message or browsing anonymously using the Tog network. Under certain conditions, encryption can be fully automatic and easy to use. But if something goes wrong, it is useful to understand the essence of what is happening - then you will be better protected from problems.

Encryption: Three Important Concepts.

1) Private and public keys
One of the most important concepts in encryption is a key. Common encryption schemes use a private key that is kept secret on your computer so that messages addressed to you can be read. You can also use the private key to digitally sign messages you send to them against counterfeiting. A public key is a file that you can share with other people. It allows them to exchange encrypted messages with you and verify your signatures. Private and public keys are paired, dependent on each other.

2) Safety certifications
The second very important concept is a safety certificate. This is a kind of public key to prevent middleman attacks. A site that has access to the certificate can demonstrate to remote systems that the certificate exists and that no other system (without the certificate) is trying to change the data being transmitted. The web browser on your computer can establish encrypted connections to websites using the HTTPS protocol. In such cases, the browser verifies the certificates by checking the public keys of the domain names (for example, www.google.com or www.amazon.com). Using certificates is one way to authenticate the public key of a user or website you have so that you can safely exchange information with it.
Security certificate error messages appear from time to time. Typically, this is due to an attempt by the public hotspot where you are located to "crack" your encrypted data exchange with the website. In addition, this error often appears due to bureaucratic problems in the certificate system. It also occurs when an attacker tries to break into your encrypted connection.
Unfortunately, it is extremely difficult to determine the true cause of the security certificate error. Therefore, when such an error occurs, you cannot accept exceptions for sites on which you have an account or from where you receive particularly important information.

3) Key fingerprints
Keys in public key encryption systems are very large numbers, sometimes over a thousand digits. The key fingerprint is much shorter. This is a number (a set of numbers and letters), which is unique for a particular key and allows not to analyze all the characters when verifying the authenticity of the key. Suppose you and your interlocutor exchanged copies of the keys, and then decided to make sure that the copies match the originals. You would have to spend a lot of time checking all the symbols of each key. Instead, the key prints can be verified. In modern encryption tools, prints usually consist of 40 letters and numbers, for example: 5d44 4rt8 9167 7401 40dl 5ws4 200z q561 23sd yl91. And if you carefully check the fingerprint of the imported key with the fingerprint,which the real owner will tell you, you can be sure of the authenticity of the key (some programs offer more convenient ways to verify keys).
If the fingerprint check is passed, there is a better chance that the person you are talking to is really who they say they are. But this method is not perfect - an attacker can use the same fingerprint if they copy or steal the key.

Fundamentals of PGP Encryption
PGP (Pretty Good Privacy) encryption is one of the first popular implementations of public key encryption, created by programmer Phil Zimmermann in 1991 to help users secure their communications. Used correctly, PGP can protect the content of your messages and even files from the most serious attackers. When Edward Snowden spoke about encryption, he was referring to PGP and its associated programs.

Public key encryption.
Traditional encryption tools use the same key to encrypt and decrypt messages. Asymmetric encryption (public key encryption) uses two paired keys: one for encryption (public), the other for decryption (private). There are many advantages to this. In particular, you can share your public key with everyone. As long as you have access to your private key, anyone who has your public key can communicate with you securely. Such systems are used to encrypt emails and files according to the PGP, OTR (for instant messaging) and SSL / TLS (for web browsing) standards.

Unfortunately, PGP is not the easiest tool to learn and use. The strong encryption implemented in PGP (Public Key Encryption) is a powerful but tricky security feature. PGP itself has been around for a quarter of a century and is the same age as the earliest versions of Microsoft Windows. Since then, the look and feel of PGP hasn't changed much. However, many programs have been developed that hide the "ancient" design of PGP and significantly simplify its use, especially in terms of encryption and authentication of e-mail (the main functions of PGP). Next, you will learn how to work with these programs. But first, let's take a few minutes to get started with the basics of public key encryption.

A game with two keys
Let's take a simple text - for example, “Hello, friend!”. Let's encrypt it - turn it into a code that is incomprehensible to prying eyes (say, ad & dsDE76vx + fdgQl). Let's send this code over the Internet. Our message can be seen by many people, but who among them will understand the content? In this form, the letter will reach the recipient. He and only he can decrypt and read the source text.
How does the recipient know how to decrypt the message if no one else can? The recipient has additional information that is not available to others. Let's call it the decryption key. This key decodes the text contained in the encrypted message. But the sender must first inform the recipient of the key. This is the flaw in this strategy - if you think your mail can be intercepted, how will you forward the key? After all, the attacker will intercept him. Then there is no point in sending encrypted messages. On the other hand, if you have a secret way to transmit the key, why not use the same way to send all secret messages?
Public key encryption is a great solution to the problem. Each person participating in the correspondence can create two keys. One key (private) must be kept secret and never passed on to other people. Another key (public) can be transferred to anyone who wishes to correspond. It doesn't matter who gets access to the public key. You can publish it on the World Wide Web, from where everyone will download it.
The "keys" themselves are, in fact, very large numbers with certain mathematical properties. At the same time, the public and private keys are interconnected - if you encrypt something with a public key, it can only be decrypted with a pair of private keys.
Let's say you want to send your friend a secret message. He has a private key, and he uploaded his paired public key to his web page. You download its public key, encrypt the message with it and send it to the addressee. And only he can decrypt the message, because only he has a paired private key.

Electronic signature
Public key encryption saves you the trouble of giving the recipient the decryption key (the recipient already has the key). You just need to get the corresponding public key for encryption - it is available to everyone, even hackers. But the public key can only be encrypted, not decrypted.
So, what is encrypted with a specific public key can only be decrypted with its paired private key. But that is not all. Conversely, if you apply a private key to a message, the result can only be processed using the paired public key.
What for? There doesn't seem to be any benefit in protecting a secret message with a private key. Anyone who has your public key (and it is available to anyone in this world) can remove this protection. Suppose you wrote a message "Hello Andrey!" and applied their private key to this text. Anyone can then read this message using the paired public key, but only one person (and this is the main thing!) Could write the message - the owner of the mentioned private key. Of course, if he keeps his private key neatly. This way you can confirm your authorship. We do the same when we sign papers in the real world.
The signature also protects the message from editing. If someone tries to change the text "Hello Andrey!" to "Hello Vlad!", the attacker will not be able to re-sign the message (he does not have your private key). Thus, the digital signature ensures that the message was actually written by its author and has not been altered in transit.
So, public key encryption allows you to encrypt and securely forward messages to anyone whose public keys you know. If, in turn, other people know your public key, they can send you messages that only you can decrypt. You can also sign messages. Then anyone who has your public key can verify the authenticity of your emails. If you receive a message with someone's digital signature, you can use the sender's public key and make sure that he wrote the message.
As you probably already guessed, the benefits of public key encryption are greater the more people know your key. It is also obvious that you need to ensure that your private key is kept securely, because if someone gets a copy of your private key, they can impersonate you and sign messages on your behalf. PGP has a private key revocation feature to alert people that the key can no longer be trusted, but this is not the best solution. A general rule of thumb for public key encryption is to keep the private key in a safe place.

How PGP Works
The use of PGP is mainly the work of creating and enforcing public and private keys. With PGP, you can create a public / private key pair, password protect the private key, and use the keys to sign and encrypt messages. PGP-based programs also allow you to download the public keys of other users and upload your public key to storage servers where other users can find it.
So, you keep your private key in a safe place and protect it with a long password. The public key can be provided to everyone with whom you want to correspond, and to those who want to be sure of the authenticity of your letters.

The web of trust
There is a potential problem with public key encryption. Let's say you distribute Edward Snowden's public key (or at least that's what you say), if people believe you, they'll start sending Snowden letters encrypted with that key. They will also assume that all messages signed with this key are created by Snowden. This rarely happens, but in real life there were precedents.
An attack scenario is also possible when an attacker is between two network interlocutors, reads their letters and periodically inserts his own (confusing) messages into the correspondence. The Internet is designed so that information passes through many different computers. Therefore, such an attack ("middleman attack") is quite possible. Because of it, exchanging keys without prior agreement is a risky business. “Here's my key,” says Edward Snowden and sends you the public key. What if the broker interrupted the transmission of Snowden's key and replaced his key with his own? How can you make sure that the key really belongs to a specific person?
You can, of course, get a key from a person directly, but this is not much easier than transferring a single key to each other and protecting it from interception. However, be that as it may, but the most reliable way to protect against interception is the exchange of public keys in person.

Middleman attack.
Imagine that you are communicating with your friend (let's say Dmitry) using encrypted instant messages. To make sure that this is really Dmitry, you ask the interlocutor to name the city where you first met. “Magadan,” he replies. Right! Alas, secretly from you and Dmitry, someone else intercepts your messages. Your messages for Dmitry go to the attacker, and he, in turn, contacts Dmitry, and vice versa. You think you have established a secure communication channel, but in reality you are communicating through a spy! Such an attack is called a middleman attack. An attacker can intercept information, alter and tamper with messages. Therefore, programs for communication on the Internet must protect against this type of attack, from intruders,

However, PGP offers a better solution - Web of Trust. If you believe the key belongs to a specific person, you can sign that key and upload it (along with the signature) to a public key server. Interested people can download the signed key from there. In general, the more people you trust sign a key, the more trust you have in that key. PGP allows you to sign someone else's keys and trust other users - if they sign a key, your program will automatically consider it trustworthy. Of course, the web of trust is not without its drawbacks. But today, if you are not ready to hand over keys solely in person, using the web of trust and public key servers is the most suitable alternative.

Source: "The Art of Legal, Anonymous and Safe Access to Internet Resources" M. Raitman

Fundamentals of Cryptography (Encryption)

Father

Professional

Tomcat

Professional

Encryption at a glance

Similar threads

Fundamentals of Cryptography (Encryption)

Father

Professional

Tomcat

Professional

Encryption at a glance​

Similar threads

Encryption at a glance