Steganography 2021: a thing in itself. We hide the needle in the haystack.

Jollier · Apr 2, 2021

Nowadays, encryption is the cornerstone of digital security, and using strong encryption is a great way to protect your data from prying eyes. But encryption is not the only means of protection, and it is not always the appropriate one.

Sometimes you need your secret to remain a secret, and the very existence of encrypted information already suggests that you have something to hide. I doubt that Major Cop will be impressed by your assurances that the container found on your laptop holds photographs of your cats rather than drugs; most likely tongs, irons and soldering irons will come into play. A court order can have approximately the same negative effect.

Are you afraid?

Then it's time to remember good old steganography. The word comes from the Greek steganos (covered, concealed) and graphein (to write), so it literally means "secret writing". Its task is to conceal the very fact that a secret message exists.

This art dates back to antiquity, when it was used to convey important government information. In the best-known account, told by Herodotus, a slave's head was shaved and the message was tattooed onto his scalp. When the hair grew back, the messenger was sent on his journey.

Today a message can be hidden inside an image, a picture can be hidden inside an audio clip, and data can be encoded in a blog post or even a broadcast. The trick is to make the "envelope" look innocent and not attract attention.

This is not a trivial task. Whatever type of "envelope" and payload you choose, embedding one file in another inevitably produces anomalies that an analyst can detect. As with cryptography, this is a cat-and-mouse game between steganographers and steganalysts. The most effective methods often have low bandwidth, meaning the payload must be small relative to the envelope. This may raise questions. How many pictures of your cats can you have without arousing suspicion? Or why do all your Bach cantatas sound as if they were played on an old gramophone?

Another similarity with cryptography is that if the analyst figures out how to extract the “message” from the “envelope”, the secret becomes obvious and it will no longer be possible to deny the existence of hidden information.

Steganography is most efficiently used not instead of cryptography, but together with it. This combination allows you to hide both the information itself and the fact of its storage or transmission.

So what can be done?

We must literally “hide a needle in a haystack”. That is, we hide the encrypted "message" in a very large amount of random data in such a way that it is impossible to determine what data (if any!) was embedded. Naturally, there must be an algorithm for extracting individual fragments of the "hidden" data without revealing the other parts.

Since random data is, by definition, random, the analyst has no point of reference from which to begin solving the riddle. The analyst will see a lot of random data, but will not be able to tell which of it is embedded.

Thus, Major Cop, using a soldering iron, can get some insignificant part of the information out of you, but you will keep the bulk of your data safe, giving up only a part of the encrypted information.

Steganographic software
First, let's take a look at existing software. Next, we will study the principles of steganography in detail and write a small program ourselves.

DeepSound
DeepSound embeds files of any type and automatically calculates the available space for them based on the container size and the audio quality settings. When using MP3, the available space for the stego message is shown as larger than the container itself, but this is an illusion. Regardless of the original file format, a new container is created only in one of the uncompressed or lossless formats: WAV, APE, or FLAC. Therefore, the size of the original container is irrelevant. As a result, the message will take up a percentage of the volume of the uncompressed audio file. Unfortunately, the latest version of this program was released in November 2015.

The program can simply place any file inside a music file, or pre-encrypt it using the AES algorithm with a 256-bit key. It was experimentally found that the maximum password length is only 32 characters. My regular passwords were longer and resulted in an unhandled exception.

You can put any number of files in one container until the free space runs out. The amount of free space depends on the quality setting (that is, on how much distortion is introduced into the audio file). There are three settings in total: high, normal, and low quality. Each step down in quality doubles the usable volume of the container. However, I recommend not being greedy and always using the maximum quality: this makes the hidden file harder to find. The stego message is retrieved after manually selecting the appropriate container. If encryption was used, the program will not even show the name of the hidden file until the password is entered. Cyrillic characters in file names are not supported. When extracted, they are replaced with XXXX, but this does not affect the contents of the file. DeepSound can convert MP3 and CDA,

A pleasant surprise awaits us when we compare the empty and the filled container: the file sizes are identical, but their contents differ immediately after the header. The bytes differ almost everywhere, by one or by other small values. Most likely, this is an implementation of the LSB (Least Significant Bit) algorithm. Its essence is that the hidden file is encoded as changes in the least significant bits of individual bytes of the container. This results in small distortions (a change in pixel hue in BMP, or in sample amplitude in WAV) that humans would not normally perceive. The larger the container is in relation to the hidden file, the less likely the latter is to be found. This algorithm leaves no explicit indication of the presence of an embedded file. Only a statistical analysis of noise (acoustic, brightness, color and others) can suggest its presence, but this is a completely different level of steganalysis.

DeepSound is already quite suitable for hiding important information (except for state secrets, of course). Built-in encryption can also be used, but no one knows how well it is implemented, because the program has not had an open audit. Therefore, it is safer to first place secret files in a reliable crypto-container (for example, VeraCrypt) and then hide that inside the audio file. If you use unique audio files as containers, there will be nothing to compare them with byte by byte, and hardly anyone will be able to find your "matryoshka". Just put a few gigabytes of warm, uncompressed sound into the same directory for better masking.
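To make the LSB idea concrete, here is a minimal sketch in Python. It is not DeepSound's actual code: the function names embed_lsb and extract_lsb, the 4-byte length prefix and the use of a raw byte string as the "container" are all assumptions made for illustration. A real tool would skip the file header and work only on the sample or pixel data.

```python
def embed_lsb(container: bytes, payload: bytes) -> bytes:
    """Hide payload in the least significant bit of each container byte."""
    # Assumption of this sketch: a 4-byte big-endian length prefix
    # tells the extractor how many payload bytes to read back.
    data = len(payload).to_bytes(4, "big") + payload
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(container):
        raise ValueError("container is too small for this payload")
    out = bytearray(container)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract_lsb(container: bytes) -> bytes:
    """Read back a payload written by embed_lsb."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        result = bytearray()
        for n in range(count):
            value = 0
            for i in range(8):
                value = (value << 1) | (container[start_bit + n * 8 + i] & 1)
            result.append(value)
        return bytes(result)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


# Stand-in for audio samples or pixel values (hypothetical data)
samples = bytes(range(256)) * 2
stego = embed_lsb(samples, b"cats")
recovered = extract_lsb(stego)
```

The filled container has exactly the same length as the empty one, and no byte changes by more than one, which matches the picture seen in the byte comparison.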

HALLUCINATE
The compact (34 KB) utility is written in Java and does not require installation. It supports BMP and PNG as containers, which makes it much more convenient than Anubis. PNG images are used more often today than BMP images. There are plenty of them even in the browser's temporary directories, so such a container will certainly not be a lonely and conspicuous file on disk.

Hallucinate's interface is simple and functional. You select a container, specify the file to be hidden in it and choose the desired quality of the final image. Eight options are available. The more the original image is degraded, the more you can hide in it, but the more noticeable the artifacts become. Let's choose the best quality in the settings and illustrate this difference by repeating the operation with the BMP file.

Visually, the pictures on the left and right do not differ. However, Beyond Compare shows the difference between them in the center frame. The text file is encoded as changes in the brightness of individual pixels, evenly distributed throughout the frame. Only in the darkest and lightest areas do they clump tightly together. In a byte-by-byte comparison, the hexadecimal difference looks familiar: it is the same LSB algorithm as in DeepSound. Whether the container is a graphic file or a sound file does not matter here. In both formats the distortion is minimal and indistinguishable without special comparison methods. Finding it without the source file (with only the container in hand) is quite difficult, because the container holds no explicit pointers to an embedded stego message. Only frequency analysis can give the hidden file away, and this method works well only for large "matryoshkas". A small file in a large picture remains almost invisible.

The hidden file is extracted in just two clicks. It is enough to select a container (a HAL file, in the terminology of the program's author), press Decode and specify the location to save the file.
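The byte-by-byte comparison used throughout this review can be reproduced without Beyond Compare. The sketch below is only an illustration: the function name diff_stats and the sample bytes are made up, and it assumes both files have already been read into memory and have equal length.

```python
def diff_stats(original: bytes, stego: bytes) -> dict:
    """Compare two equal-length byte strings and summarise the differences."""
    if len(original) != len(stego):
        raise ValueError("files differ in size, compare them another way")
    diffs = [abs(a - b) for a, b in zip(original, stego)]
    changed = [i for i, d in enumerate(diffs) if d]
    return {
        "changed": len(changed),                        # number of differing bytes
        "first_offset": changed[0] if changed else None,
        "max_delta": max(diffs, default=0),             # 1 is typical for plain LSB
    }


# Hypothetical data: two bytes differ, each by exactly one
empty_container = bytes([10, 20, 30, 40])
full_container = bytes([10, 21, 30, 41])
stats = diff_stats(empty_container, full_container)
```

Many changed bytes with a maximum delta of one is the typical fingerprint of LSB embedding. Of course, this check only works if you have the empty container to compare against.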

JHIDE
JHide is another similar Java program. You cannot call it compact: it takes up almost three megabytes. However, unlike Hallucinate, in addition to BMP and PNG, it supports TIFF and also allows password protection.

Comparison with the Beyond Compare utility shows subtle differences. In the first second they are not visible at all. It is necessary to add brightness and look closely to see evenly scattered dark blue dots on a black background.

Comparison in hex codes shows the same LSB algorithm, but its implementation is more successful here. The changed pixels are not grouped in large blocks from the beginning of the file, but are evenly scattered throughout the container. This makes it much more difficult to detect the hidden message in the picture. With a small stego message, this is almost impossible to do without comparing against the original (empty container).
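One common way to get such even scattering is to pick the byte positions with a pseudorandom generator seeded from a password. The sketch below shows that general idea only; it is an assumption for illustration, not jHide's actual algorithm, and all function names are hypothetical. Python's random module is not cryptographically secure, so treat this as a demonstration of the layout, not of the security.

```python
import hashlib
import random


def scattered_positions(password: str, container_len: int, count: int) -> list:
    """Derive `count` distinct byte offsets from the password."""
    seed = int.from_bytes(hashlib.sha256(password.encode("utf-8")).digest(), "big")
    rng = random.Random(seed)  # same password gives the same offsets
    return rng.sample(range(container_len), count)


def embed_scattered(container: bytes, payload: bytes, password: str) -> bytes:
    """Write payload bits into the LSBs of password-selected bytes."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    positions = scattered_positions(password, len(container), len(bits))
    out = bytearray(container)
    for pos, bit in zip(positions, bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)


def extract_scattered(container: bytes, length: int, password: str) -> bytes:
    """Read `length` payload bytes back; the length must be known in advance."""
    positions = scattered_positions(password, len(container), length * 8)
    result = bytearray()
    for i in range(0, len(positions), 8):
        value = 0
        for pos in positions[i:i + 8]:
            value = (value << 1) | (container[pos] & 1)
        result.append(value)
    return bytes(result)


# Hypothetical container: 4096 zero bytes standing in for pixel data
pixels = bytes(4096)
stego_pixels = embed_scattered(pixels, b"hi", "To Heloise")
message = extract_scattered(stego_pixels, 2, "To Heloise")
```

Without the password the analyst does not know which bytes to look at, and the changes no longer sit in one block at the start of the file.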

The program itself tries to compress the hidden file as much as possible before placing it in the container. Therefore, it is always extracted in ZIP format, and the hidden file is inside this archive. You have to remember for yourself whether a password was set before extracting: jHide will not tell you whether one needs to be entered. This is also a plus, since it rules out using the utility to check images for hidden files.

The utility sometimes ignores the file name you entered and saves the result under the template name stego_%name%.bmp, but this flaw can be forgiven. The contents of the file are extracted without distortion.

OPENPUFF
The most complex utility in this roundup is OpenPuff. Its latest version (4.00) supports not only hiding some files inside others, but also working with stego labels of arbitrary format. It can even be allotted several processor cores if there is a lot of work ahead.

Unlike other utilities that support password protection of hidden messages, OpenPuff can use a cryptographically secure pseudorandom number generator (CSPRNG) for encryption. If a simple password is not enough, then check the boxes in front of fields B and C, and then enter three different passwords in them, from 8 to 32 characters long. Based on them, the CSPRNG will generate a unique key, which will be used to encrypt the message.

Small files can be stored in pictures and audio recordings, while large files (for example, crypto containers) are more convenient to hide in video recordings - OpenPuff supports MP4, MPG, VOB and many other formats. The maximum size of the hidden file is 256 MB.

Using the CSPRNG on small files greatly increases the resulting size of the stego message. Therefore, the difference between an empty and a full container becomes too obvious. Again, we can see that the changed pixels are mostly evenly distributed; however, they form large blocks in the lightest and darkest areas. If there were no such blocks, the result would look more like the artifacts of JPEG compression. A byte comparison also gives a very characteristic picture. Despite the small size of the hidden file, the values of most of the pixels in the container have been changed. Where jHide needed only 330 bytes to write the stego message, OpenPuff used more than 170 KB for the same task.

On the one hand, this is a plus: there is no direct correlation between the message size and the number of changed pixels. The analysis of such a container becomes much more complicated. On the other hand, it takes extra effort to create a container, which can put off an inexperienced user.

Another mode of operation of the program is writing and reading stego labels. These are hidden strings up to 32 characters long that can be used for copyright protection. For example, you can hide a copyright notice in a photo, music file or document. This function works extremely simply. You write an arbitrary stego label at the top of the window and indicate below it the files to which the label should be added. The original files will remain intact, and their copies with the label will be saved in the directory you specified.

If you have any legal disputes, you simply launch OpenPuff and show the previously injected tag to an astonished opponent.

Difficulties arise if the file has been modified. Even a simple conversion to another format erases the stego label. It cannot be read even if the file has been converted back to its original format. Persistent stego labels exist, but only certain programs are able to implement them. As a rule, they are tied to some specific equipment (for example, a camera model).

OPENSTEGO
The program works on Windows and Linux. It supports BMP, PNG, JPG, GIF and WBMP. The filled container is always saved in PNG format. OpenStego is only 203KB in size, but after getting to know Hallucinate, this is no longer impressive. Formally, the utility requires installation, although it can be easily turned into a portable version. OpenStego is attractive because it supports password protection and also knows how to implement stegolabels (although this function is still in beta status).

After adding a small text file to the selected image, there is practically no visual difference between an empty and a full container. However, the file size increased by one megabyte, and because of the conversion to PNG with a different compression ratio, it became just a different file. In a byte-by-byte comparison with the original, the differences will be in all values immediately after the header.

Interestingly, the program gives no indication of whether the entered password is correct when extracting the stego message from the container. It honestly tries to assemble the extracted file anyway and always reports that the operation was successful. In reality, the hidden file is retrieved only after the correct password is entered. Otherwise, an error occurs and the file is not written. This approach slightly complicates classic brute-force attacks, in which the next combination is tried after the previous one fails a check. However, there is still a marker of successful extraction. It is enough to specify an empty directory as the output directory and try passwords until a file appears in it. It would be better to write any extraction result to a file, which would increase the level of protection.

Stego labels are implemented differently in this program than in the others. First, a signature is generated and saved in a separate file with the SIG extension. No meaningful information can be written into it: it is just a unique set of bits, like a private key.

After the stego label is embedded, a new, visually identical image file is created, in which the label "dissolves". Verification comes down to checking for the presence of the specified signature inside the file. If it is fully preserved, the match will be one hundred percent. If the file has been modified, the stego label may be partially lost. The method was conceived as an attempt at introducing persistent watermarks, but in the current implementation it is practically useless. The program shows a zero percent match after just a small cropping of the image and resaving to PNG with high compression.

RARJPEG
You can hide some files inside others without any steganographic utilities. Of course, this is not a neat “dissolution” by the LSB algorithm but a simple merge; still, this method, well known in narrow circles, has its own advantages. First, it is available without additional tools. Second, it lets you easily transfer any file by uploading it as a picture to some site (for example, an image host or an imageboard). The idea is that graphic files (in particular, JPEG) are interpreted from the header onward, while archives are read only from the archive start marker. The marker itself can be located anywhere inside the file, since, in addition to ordinary archives, there are multivolume and self-extracting ones. As an experiment, we will pack all the programs from today's review into a ZIP archive and append this archive to the Wallpaper.jpg file, creating a new picture: wallpaper-x.jpg. Let's just start the Windows console and write:

Code:

type Wallpaper.jpg Steg.zip > wallpaper-x.jpg

The output will be a merged wallpaper-x.jpg file. It can be viewed as a picture or opened with any archiver that supports the ZIP format. If you change the file extension to ZIP, the file manager will open it as a directory. You can even do without renaming, and immediately use the archive plug-in via the quick unpack command (for example, {ALT} + {F9} in Total Commander). All files from such a "picture" will be extracted without problems. The described trick has been known for a long time and also works with some other file formats (both graphic and archives), but the most popular combination is RAR + JPEG.
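The same merge can be tried in Python, which also lets us check that an archiver still finds the archive behind the picture. This is a small sketch with made-up data: the "JPEG" below is just a few placeholder bytes with JPEG start and end markers, not a real image, and the archive is built in memory.

```python
import io
import zipfile


def merge(image_bytes: bytes, archive_bytes: bytes) -> bytes:
    """Append the archive to the end of the picture, byte for byte."""
    return image_bytes + archive_bytes


# Build a tiny ZIP archive in memory (stand-in for Steg.zip)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("note.txt", "hidden")

# Placeholder bytes that merely start and end like a JPEG file
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 64 + b"\xff\xd9"

merged = merge(fake_jpeg, buf.getvalue())

# The zipfile module locates the archive even with data in front of it
with zipfile.ZipFile(io.BytesIO(merged)) as zf:
    names = zf.namelist()
    content = zf.read("note.txt")
```

With real files you would read Wallpaper.jpg and Steg.zip in binary mode ("rb") and write the merged bytes to wallpaper-x.jpg in binary mode ("wb").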

Do it yourself
OK. But what about in practice? It's easier than you think. We need some knowledge of Python to create our own steganographic utility.

The first thing we need is a very large amount of random data. In the case of full-scale encryption, this can be an entire disk partition filled with random data, and named, for example, swap. For the purposes of this article, just a large file is fine.

In Linux it is very easy to create any amount of random data using the dd command. For tests, we will use a USB flash drive mounted at /media/x/y.

Code:

dd if=/dev/urandom of=/media/x/y/bigfile bs=1M

After executing the command, the file will take up the entire USB flash drive.
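If dd is not at hand (on Windows, for example), a file of random data can be produced from Python as well. This is a minimal sketch: the function name make_random_file and the small size used in the demonstration are assumptions chosen for illustration, and the demo writes into a temporary directory instead of a flash drive.

```python
import os
import tempfile


def make_random_file(path: str, size_bytes: int, chunk: int = 1 << 20) -> None:
    """Fill a new file with size_bytes of random data from the OS generator."""
    with open(path, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))  # same kind of source as /dev/urandom
            remaining -= n


# Demonstration with a small file in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "bigfile")
    make_random_file(demo_path, 3 * 1024 + 5, chunk=1024)
    demo_size = os.path.getsize(demo_path)
```

Unlike the dd command above, this version stops at the size you ask for instead of filling the whole drive.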

Code:

import os
from hashlib import sha256


# calculate the block ID and the hash that seeds the next block
def calcblock(bts, numblocks):
    hsh = sha256(bts)
    block = int.from_bytes(hsh.digest(), byteorder="little", signed=False) % numblocks
    return block, bytes(hsh.hexdigest(), "utf-8")


# encrypt one block (plain XOR with the password hash)
def encryptdata(value, pwd, blocksize):
    # the SHA256 hash of the password is always 32 bytes,
    # so blocksize must not exceed 32 or zip() will truncate the block
    pwsh = sha256(bytes(pwd, "utf-8")).digest()
    if len(value) < blocksize:
        # pad short blocks with spaces
        value = value.ljust(blocksize, b" ")
    return bytes(a ^ b for a, b in zip(value, pwsh))


# decrypt one block
def decryptdata(value, pwd, blocksize):
    pwsh = sha256(bytes(pwd, "utf-8")).digest()
    if len(value) < blocksize:
        value = value.ljust(blocksize, b" ")
    try:
        val = bytes(a ^ b for a, b in zip(value, pwsh)).decode("utf-8")
    except UnicodeDecodeError:
        # data that does not decode is treated as the end of the message
        val = "End of message"
    return val


def writefsys(fsys, fname, pwd, blocksize, numblocks):
    # compute the first block from the hash of password + file name
    block, hshval = calcblock(bytes(pwd + fname, "utf-8"), numblocks)
    with open(fsys, "r+b") as outf, open(fname, "rb") as inf:
        while True:
            value = inf.read(blocksize)
            if value == b"":
                break  # end of file
            # print("Writing to block " + str(block))
            byts = encryptdata(value, pwd, blocksize)
            outf.seek(block * blocksize)
            outf.write(byts)
            # the next block is derived from the hash of this one
            block, hshval = calcblock(hshval, numblocks)
        # end-of-message marker, written to the next block in the chain
        value = bytes("End of message".ljust(blocksize), "utf-8")
        byts = encryptdata(value, pwd, blocksize)
        outf.seek(block * blocksize)
        outf.write(byts)


def readfsys(fsys, fname, pwd, blocksize, numblocks):
    rc = ""
    # compute the first block from the hash of password + file name
    block, hshval = calcblock(bytes(pwd + fname, "utf-8"), numblocks)
    with open(fsys, "rb") as inf:
        while True:
            # print("Reading from block " + str(block))
            inf.seek(block * blocksize)
            binarydata = inf.read(blocksize)
            value = decryptdata(binarydata, pwd, blocksize)
            if value.startswith("End of message"):
                break
            rc += value
            # the next block is derived from the hash of this one
            block, hshval = calcblock(hshval, numblocks)
    return rc


# run
if __name__ == "__main__":
    fsys = "/media/x/y/bigfile"    # file with our file system
    fname = "./secretmessage.txt"  # file with the message to hide
    pwd = "To Heloise"             # encryption password
    bsz = 32                       # block size
    fsz = os.path.getsize(fsys)    # size of the fsys file
    blknum = fsz // bsz - 1        # number of blocks in the file system
    writefsys(fsys, fname, pwd, bsz, blknum)       # write the file into the file system
    msg = readfsys(fsys, fname, pwd, bsz, blknum)  # read it back
    print(msg)                     # show the decrypted message

Understanding.

Code execution starts in the if __name__ == '__main__' block at the end of the listing.

First of all, we need to calculate the number of available blocks, taking into account the file size and the fixed block size, which we also specified.

We also need to point to the file in which we want to hide our little file system. The writefsys() function is responsible for writing into it. First, we compute the initial block ID: the SHA256 hash of the secret key and file name, taken modulo the number of available blocks. After that, we encrypt the first block of data and write it. Our algorithm is the simplest XOR encryption with the key. If you want something more reliable, I recommend using AES. After that, we find the ID of the next block by hashing the previous hash value with SHA256 again, and write the next piece of data. And so on.
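To see how the chain of block IDs unfolds, the same calculation can be run on its own, without touching any file. The helper name block_chain and the block count of one million are assumptions for this illustration; the hashing itself mirrors calcblock() from the listing.

```python
from hashlib import sha256


def block_chain(pwd: str, fname: str, numblocks: int, count: int) -> list:
    """Return the first `count` block IDs for a given password and file name."""
    seed = bytes(pwd + fname, "utf-8")
    ids = []
    for _ in range(count):
        hsh = sha256(seed)
        ids.append(int.from_bytes(hsh.digest(), byteorder="little") % numblocks)
        # the hex digest of this hash seeds the next step, as in calcblock()
        seed = bytes(hsh.hexdigest(), "utf-8")
    return ids


chain_a = block_chain("To Heloise", "./secretmessage.txt", 1_000_000, 5)
chain_b = block_chain("To Heloise", "./other.txt", 1_000_000, 5)
```

The same password and file name always produce the same chain, while changing either one sends the data to a completely different set of blocks.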

As a result, information is pseudo-randomly distributed over the file system, and if the steganalyst does not know the file name and secret key, there is no way to recover it.

We can hide multiple files at once by calling writefsys() with a different file name and key.

When we need to extract our file, we repeat the same walk using the readfsys() function, this time reading instead of writing. As before, we compute the starting block ID from the key and file name. After that, we read this block and decrypt it. As before, the SHA256 hash gives us the next block, and so on.

The unique key and file name ensure that extracting one file does not entail extracting all the others. Moreover, this approach will not even give a hint that the rest of the files exist. Do you feel the difference?

However, this code has one problem related to the nature of hashing. Several records can hash to the same block ID, so later records will overwrite previously saved data. The solution to this problem would be to store each encrypted data item in multiple blocks, along with its checksum, using several different hash values. When extracting files, we will only consider those blocks that have a valid checksum. This arrangement will reduce the number of files that can be saved, but will preserve the integrity of the data. Our code does not have this function and you have to write it yourself.
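As a starting point, here is one possible sketch of that idea. Everything in it is an assumption made for illustration: the names, the three copies per block, the 4-byte keyed checksum and the in-memory bytearray that stands in for the big file. Encryption is left out to keep the example short; in a real system the sealed block should be encrypted before writing, otherwise the data would stand out from the random noise.

```python
import os
from hashlib import sha256

BLOCK = 32               # block size, as in the listing above
CHECK = 4                # checksum bytes per block (assumed)
PAYLOAD = BLOCK - CHECK  # data bytes that fit into one block


def replica_positions(key: bytes, index: int, numblocks: int, copies: int) -> list:
    """Block IDs for every copy of chunk number `index`, one per salt value."""
    tag = index.to_bytes(4, "big")
    return [int.from_bytes(sha256(key + bytes([salt]) + tag).digest(), "big") % numblocks
            for salt in range(copies)]


def seal(key: bytes, chunk: bytes) -> bytes:
    """Pad the chunk and append a keyed checksum."""
    chunk = chunk.ljust(PAYLOAD, b" ")
    return chunk + sha256(key + chunk).digest()[:CHECK]


def unseal(key: bytes, block: bytes):
    """Return the chunk if its checksum is valid, otherwise None."""
    chunk, check = block[:PAYLOAD], block[PAYLOAD:]
    return chunk if sha256(key + chunk).digest()[:CHECK] == check else None


def write_chunk(store: bytearray, key: bytes, index: int, chunk: bytes,
                numblocks: int, copies: int = 3) -> None:
    block = seal(key, chunk)
    for pos in replica_positions(key, index, numblocks, copies):
        store[pos * BLOCK:(pos + 1) * BLOCK] = block


def read_chunk(store: bytearray, key: bytes, index: int,
               numblocks: int, copies: int = 3):
    for pos in replica_positions(key, index, numblocks, copies):
        chunk = unseal(key, bytes(store[pos * BLOCK:(pos + 1) * BLOCK]))
        if chunk is not None:
            return chunk  # first copy that survived intact
    return None           # every copy was overwritten


NUMBLOCKS = 4096
store = bytearray(os.urandom(BLOCK * NUMBLOCKS))  # stand-in for the big file
key = b"To Heloise" + b"./secretmessage.txt"
write_chunk(store, key, 0, b"needle", NUMBLOCKS)

# Simulate another file overwriting the first copy
first = replica_positions(key, 0, NUMBLOCKS, 3)[0]
store[first * BLOCK:(first + 1) * BLOCK] = os.urandom(BLOCK)
recovered_chunk = read_chunk(store, key, 0, NUMBLOCKS)
```

The price is visible right away: each block now carries only 28 bytes of data and occupies three places in the haystack.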

Or maybe someone has already done this?
Yes, there are several fairly advanced steganographic file systems, the most famous of which is StegFS. It is a Linux filesystem based on FUSE and implemented in C.

There is also an idea called Mnemosyne. It is a peer-to-peer steganographic system. I am not aware of its implementation.

Steganography is a powerful tool for keeping data confidential. This is not encryption. But along with encryption, steganography can save you from many problems.

Take care of yourself and your information.
