How to tame a rootkit. Preventing malicious modules from loading on Linux.

Hacker · Jul 28, 2021

What are Linux rootkits?
Rootkits help an attacker keep access to a compromised system, with an emphasis on making the malware as invisible as possible. To do this, they hide network connections, processes, folders and files, and fake file contents. A rootkit usually carries utilities for controlling the infected machine, with which the attacker can install and hide a DDoS bot or a miner in the system (one such rootkit, Skidmap, was discovered relatively recently). Most often these utilities include backdoors, and not only ones that an external port scanner can easily detect, but also ones that use port knocking, where a port is opened only after a correct, predetermined sequence of requests to closed ports.
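
To make the port knocking idea concrete, here is a minimal user-space sketch of the state machine behind it. The names `knock_state` and `knock()` and the port numbers in the usage note are hypothetical, chosen only for illustration; no real backdoor or knocking daemon is claimed to work exactly this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracker for one client's progress through the secret sequence. */
struct knock_state {
    const unsigned short *seq;  /* expected ports, in order */
    size_t len;                 /* number of ports in the sequence */
    size_t pos;                 /* how many correct knocks have been seen */
};

/* Feed one knocked port. Returns 1 when the whole sequence has just been
 * completed (the hidden port would be opened), 0 otherwise. */
static int knock(struct knock_state *s, unsigned short port)
{
    assert(s != NULL && s->seq != NULL && s->len > 0);

    if (port == s->seq[s->pos]) {
        if (++s->pos == s->len) {
            s->pos = 0;         /* start over after a successful sequence */
            return 1;
        }
    } else {
        /* A wrong knock resets progress; it may itself start a new attempt. */
        s->pos = (port == s->seq[0]) ? 1 : 0;
    }
    return 0;
}
```

With a sequence such as 7000, 8000, 9000, only that exact order makes `knock()` return 1, which is why a plain port scan that probes ports in its own order is unlikely to reveal the backdoor.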

Traditionally, rootkits are divided into user-space and kernel-space ones. Several utilities can already detect many of the former: chkrootkit, rkhunter, Antidoto, and a project called linux-malware-detect. Kernel rootkits, which are harder to detect from within the infected system, are therefore of greater interest, although these utilities can detect (but not remove) some of them too.

Kernel-level rootkits in Linux are, as a rule, implemented as loadable kernel modules (LKM), but there are more exotic ways: malicious code is written directly into kernel memory via the device file /dev/kmem, or injected at an early stage of boot by modifying the initramfs (if you are familiar with such rootkits, let me know: I have only come across theoretical descriptions and could not find samples). Infections with kernel rootkits are rarely written about these days, but the recently revealed Skidmap suggests that this threat should not be forgotten.

Vital tasks of every rootkit
There are many structures in the kernel that describe the current state of the system. One example is the list of running processes, which consists of pointers to process descriptors and is used by the scheduler. Another important object is the list of loaded kernel modules, where each item points to the descriptor of a loaded module. It is used by the commands that operate on LKMs: lsmod, rmmod, modprobe and the like. These lists are internal kernel objects.

Any malicious module first of all removes itself from the list of loaded modules, because if an LKM is not described there, the kernel considers it not loaded. This means that it will not appear in the output of lsmod and cannot be unloaded with rmmod. This technique is called direct kernel object manipulation (DKOM).
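
The effect is easy to reproduce in user space with an ordinary doubly linked list. The sketch below is only an analogy: `struct mod`, `list_unlink()` and `list_contains()` are hypothetical names, and the kernel's real module list has its own list type and helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a module descriptor linked into a circular list. */
struct mod {
    const char *name;
    struct mod *prev, *next;
};

/* Initialise an empty circular list whose head is a sentinel node. */
static void list_init(struct mod *head)
{
    head->name = NULL;
    head->prev = head->next = head;
}

/* Insert m right after the head. */
static void list_add(struct mod *head, struct mod *m)
{
    m->next = head->next;
    m->prev = head;
    head->next->prev = m;
    head->next = m;
}

/* Unlink m: its neighbours now point past it, but m itself stays in memory. */
static void list_unlink(struct mod *m)
{
    assert(m != NULL && m->prev != NULL && m->next != NULL);
    m->prev->next = m->next;
    m->next->prev = m->prev;
    m->prev = m->next = m;
}

/* Walk the list the way an lsmod-like tool would; return 1 if name is found. */
static int list_contains(const struct mod *head, const char *name)
{
    for (const struct mod *p = head->next; p != head; p = p->next)
        if (strcmp(p->name, name) == 0)
            return 1;
    return 0;
}
```

After `list_unlink()` the node is still intact in memory, yet no traversal that starts from the head can reach it, which is exactly why list-based tools stop reporting the module.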

Any good rootkit also takes care to survive a system restart. For example, Snakso, discovered in 2012, writes the command that loads its module to /etc/rc.local; rkduck prefers the file /etc/rc.modules; and Reptile, depending on the target system, can use either /etc/rc.modules or /etc/modules. Skidmap adds variety to this list by hooking into the scripts of the cron job scheduler. In general, other files that affect Linux boot, including boot scripts, may also suit this purpose. In what follows, we will refer to such files as startup files.

Everything seems simple: check the contents of these files for suspicious lines, and your system will be fine. But a rootkit is not a rootkit unless it also hides the fact that these files have been modified. It can intercept system calls and even the kernel functions behind them: for example, Snakso and Reptile intercept functions of the kernel's VFS (Virtual File System) layer. The rootkit checks whether the data just read contains something that must be hidden from the user or administrator and, if necessary, modifies the buffer holding that data.

So when an unsuspecting (or suspecting) user tries to view the contents of a file modified by a rootkit, they will see only a list of completely legitimate modules or commands. This is what makes it hard to detect a rootkit while working on the infected machine itself. Generally speaking, once a machine is infected with a kernel-level rootkit, you should not trust anything the kernel tells you.

Background: how LKM rootkits cover their tracks
Before starting the fight against malicious modules, we need to take a closer look at the mechanisms by which they hide themselves and their payloads in the system, in order to know exactly what we are dealing with.

Interception of system calls and functions
The rootkit needs this step for everything that follows. Hooking kernel functions and hooking system calls differ little: both let the rootkit run its own code in place of the intercepted function and perform the checks needed to mask data. There are not many ways to intercept a function: either change the address through which it is called, or place an unconditional jump instruction (jmp) in its code.

Information about where the code of kernel functions and system calls is located comes from two tables: the kernel symbol table, recorded in System.map (the symbols are also exposed through the /proc/kallsyms pseudo-file on a running system), and the system call table sys_call_table, whose address is itself listed in System.map.
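
As a rough illustration of how a symbol address can be looked up from user space, the sketch below parses lines of the shape "address type name", which is the layout used by /proc/kallsyms and System.map. The function names `parse_kallsyms_line()` and `find_symbol()` are hypothetical. Note also that many systems show zeroed addresses to unprivileged readers, and that whether sys_call_table is listed at all depends on the kernel configuration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SYM_NAME_MAX 128

/* Parse one line of the form "<hex address> <type> <name>".
 * Returns 1 on success, 0 if the line does not match that shape. */
static int parse_kallsyms_line(const char *line, unsigned long long *addr,
                               char *type, char name[SYM_NAME_MAX])
{
    assert(line != NULL && addr != NULL && type != NULL && name != NULL);
    return sscanf(line, "%llx %c %127s", addr, type, name) == 3;
}

/* Scan an already opened symbol listing for `wanted`. Returns 1 and stores
 * the address if the symbol is found, 0 otherwise. */
static int find_symbol(FILE *f, const char *wanted, unsigned long long *addr)
{
    char line[256], type, name[SYM_NAME_MAX];
    unsigned long long a;

    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_kallsyms_line(line, &a, &type, name) &&
            strcmp(name, wanted) == 0) {
            *addr = a;
            return 1;
        }
    }
    return 0;
}
```

A caller would open /proc/kallsyms (or a saved System.map) with `fopen()` and pass the stream to `find_symbol()`; on an infected machine the answer deserves no more trust than any other data served by the kernel.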

Once the address of the required function is found, the rootkit replaces it with the address of its own function. The system call table sits in a memory area that is read-only while write protection is enabled in the CR0 register (x86), but this restriction means little to a kernel-level rootkit, because write_cr0() lets it bypass the protection easily. Replacing the address of the table itself is also possible, but I have not come across it yet.

Another option, called splicing, consists in placing a jmp instruction in the function's code, and it works the same way for kernel functions and syscalls. As a rule, the first five bytes of the function prologue are overwritten, which is enough to hold the opcode 0xe9 and the jump offset. The original bytes of the prologue are saved so that the intercepted function can still be called correctly. A clever attacker can make life much harder for white hats by placing the hook somewhere in the middle of the intercepted function rather than at its beginning.

Regardless of the method chosen, when the program calls the intercepted function, control passes to the malicious one, which calls the original one, checks the result of its execution and modifies it if necessary. The calling program will receive the already modified data without knowing it at all.

Masking files and their contents
This covers folders as well, because Linux has followed the "everything is a file" philosophy since the days of Unix. For this task, malicious LKMs intercept functions of the kernel's VFS layer, first of all vfs_read(). After it runs, the rootkit scans the data that was read, looking for anything that must be hidden from the user.

For example, Reptile searches the vfs_read() buffer for its tags (by default <reptile> and the matching closing tag) and removes them together with everything in between. To hide files and folders, rootkits check directory listings for predefined names and remove the entries that match. They do not always intercept access to the hidden folders themselves, however, so if you know the names you can enter such a folder and work with the files in it.
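
The filtering step can be imitated in user space to see what the reader ends up with. This is only a sketch of the idea on a NUL-terminated string: the helper name `strip_tagged()` is hypothetical, the closing tag `</reptile>` is assumed from the description above, and the real rootkit works on the kernel read buffer.

```c
#include <assert.h>
#include <string.h>

/* Remove every "<open>...<close>" region (tags included) from the
 * NUL-terminated string s, in place. Returns the new length. */
static size_t strip_tagged(char *s, const char *open, const char *close)
{
    char *start;

    assert(s != NULL && open != NULL && close != NULL);

    while ((start = strstr(s, open)) != NULL) {
        char *end = strstr(start + strlen(open), close);
        if (end == NULL)
            break;                              /* unmatched tag: leave as is */
        end += strlen(close);
        memmove(start, end, strlen(end) + 1);   /* shift the tail, with NUL */
    }
    return strlen(s);
}
```

The returned length is shorter than the original by exactly the size of the hidden region, while nothing on disk has changed.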

Masking processes
This is how rootkits hide miners and other nasty things. One way is similar to hiding files: the kernel exposes each process through the /proc/<PID>/ interface. These files are used by programs such as ps and top, so if the rootkit hides the corresponding folder, they will not show the process.

A more complex option, described in the Black Hat presentation, involves modifying the internal process list and unlinking the task_struct descriptor of the desired process from it. But here a problem arises: the scheduler uses this list, and if a process descriptor is missing from it, the process will hang, since the scheduler does not know it exists. The scheduler's logic would also have to be changed, which is possible in principle but complicates the attacker's task considerably.

Masking network connections and modifying traffic
To hide a backdoor, rootkits either use port knocking or falsify information about open sockets. User programs such as netstat obtain information about network connections from the pseudo-files /proc/net/tcp and /proc/net/tcp6, which expose data from kernel memory. Hooking vfs_read() lets the rootkit filter which connections the user sees in these files. It can also hook tcp4_seq_show() and tcp6_seq_show(), which implement these interfaces in /proc. The ss utility, however, works a little differently and in some cases can show a connection that is hidden from /proc/net/tcp and /proc/net/tcp6.
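
For reference, this is roughly how a user program can decode one entry of /proc/net/tcp. The sketch assumes the usual line layout (slot number, local and remote address:port in hexadecimal, then the state); `struct tcp_entry` and `parse_proc_net_tcp_line()` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

#define TCP_STATE_LISTEN 0x0A   /* listening sockets are reported with state 0A */

struct tcp_entry {
    unsigned int local_addr, local_port;
    unsigned int remote_addr, remote_port;
    unsigned int state;
};

/* Parse one data line of /proc/net/tcp. Returns 1 on success and 0
 * otherwise (for example on the header line). */
static int parse_proc_net_tcp_line(const char *line, struct tcp_entry *e)
{
    assert(line != NULL && e != NULL);
    return sscanf(line, " %*d: %8X:%4X %8X:%4X %2X",
                  &e->local_addr, &e->local_port,
                  &e->remote_addr, &e->remote_port, &e->state) == 5;
}
```

A checker could collect the listening ports found this way and compare them with what an external scan of the same host reports; a port that answers from outside but is missing here is a strong hint that the listing is being filtered.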

The Netfilter firewall built into the Linux kernel provides packet filtering, address translation and other packet transformations. This subsystem is a set of hooks on top of the Linux network protocol stack. They can be used to register a kernel function that works with packets at one of five processing stages: PREROUTING, INPUT (LOCAL IN), FORWARD, POSTROUTING and OUTPUT (LOCAL OUT). A kernel rootkit can easily register its own function to modify network traffic at any of these stages.

Ways to deal with LKM rootkits
Despite the sophisticated masking methods rootkits use, it is often not that difficult to determine that a system is infected. Rootkits intercept functions, and here, as shown earlier, the options are few.
In the case of address spoofing, it is best to have an original copy of the table so that its integrity can be checked later, but remember that the addresses can change after a kernel update. For splicing, you can check whether the byte at the function's address has the value 0xe9, which corresponds to jmp, and so identify the hook (which is not so easy if the hook is not installed at the beginning).
In the simplest situation, hidden connections can be detected with an external port scanner, but with port knocking this is unlikely to give a result, unless you monitor the traffic passing through the network gateway.

Another interesting task is to search kernel memory for the descriptor of a module that has been unlinked from the list of loaded modules. Once found, it can be put back on the list so that rmmod can be used on it later (unless, of course, the rootkit has tampered with that as well, just in case). There is a solution (see Section 6) that finds objects resembling LKM descriptors in memory, but it works only on very old 32-bit kernels. Scanning the memory of modern systems is difficult because of the huge address space, but that only makes the task more interesting.

However, establishing that a rootkit is present in the system is only half the battle. It would be nice to remove it, and also to find the executable file itself so that you have something to study over the weekend. The easiest and most reliable way is to take the storage medium with the infected operating system and look for suspicious loadable module files on it from another machine, although this will not help against one top-secret method of storing files (if you are familiar with such malware, write to me as well). But that's not interesting at all, is it? Besides, there are certainly systems that cannot be taken out of service while the search is in progress. So let's try to examine the infected machine from the inside. What can give away the presence of a rootkit and at the same time point to the file of the malicious module?

We start our own investigation
So what do we know? That an LKM rootkit must write a line somewhere that causes it to be loaded when the OS starts. We also know that it tries to hide this line.

Let's say Reptile is installed on our system. We open /etc/modules in any text editor and see that it looks clean, although we know there is hidden content in it. What happens if we save the file without making any changes? Exactly what we see will be saved, which means that after the OS is rebooted the malicious LKM will no longer be active. Isn't that great? The author of the EnyeLKM rootkit also suggested this way of preventing it from loading. The effect occurs because the editor saves exactly what it has open, that is, the part of the file that is free of rootkit data.

Well, that's not bad already. But what if you don't know which startup file the rootkit has registered itself in? We are not going to re-save every file involved (it is possible, but too crude). In addition, this method destroys all clues in the file that would help identify the malware faster. What we need is to find the modified startup file and put a copy free of rootkit data in its place, while keeping the hidden data somewhere to make the malicious LKM easier to find. It would also help to determine which file the rootkit modified, so as not to re-save and analyze them all.

Thus, our task of preventing an LKM rootkit from loading boils down to two questions:

how to find a file with hidden contents, if it is not known how the rootkit got fixed in the system;
how to save this file so that this content is not deleted.

How can we tell that we are dealing with a file that knows more than it shows? It turns out to be simple: extra content means extra data on the disk. My point is that rootkits that hide file contents have not yet learned to falsify the file size. As I will show later, doing so is not very easy.

The second question is also easy to deal with. Let's rename the file found in the previous step so that the system does not read from it the next time it starts, and then save a copy (which will no longer contain rootkit data) under the original name. Voila!
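
The rename-and-rewrite step can be sketched in a few lines of C. This is a simplified illustration, not the code of the program discussed in this article: the function name `replace_with_visible_copy()` is hypothetical, the `.old` suffix is just one possible naming choice, and the sketch does not preserve the ownership or permissions of the original file.

```c
#include <assert.h>
#include <stdio.h>

/* Move `path` aside to "<path>.old" and write the bytes that an ordinary
 * read returned (`visible`, `len`) under the original name.
 * Returns 0 on success, -1 on failure. */
static int replace_with_visible_copy(const char *path,
                                     const char *visible, size_t len)
{
    char old_path[4096];
    FILE *out;
    int n;

    assert(path != NULL && visible != NULL);

    n = snprintf(old_path, sizeof old_path, "%s.old", path);
    if (n < 0 || (size_t)n >= sizeof old_path)
        return -1;                       /* path too long */

    if (rename(path, old_path) != 0)     /* the original data stays on disk */
        return -1;

    out = fopen(path, "wb");
    if (out == NULL)
        return -1;
    if (fwrite(visible, 1, len, out) != len) {
        fclose(out);
        return -1;
    }
    return fclose(out) == 0 ? 0 : -1;
}
```

Because the original file is renamed rather than overwritten, its hidden content remains available for later analysis, while the boot process finds only the clean copy under the expected name.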

We develop an arsenal
Now we are ready to automate the search for and recovery of the modified startup file. All we need is a list of these files so that the program knows where to look for the catch. For each such file, the program compares the number of bytes read with fread() (behind which is the read() system call and, behind that in turn, the intercepted kernel function vfs_read()) with the file size taken from the structure that describes the file in the file system (the inode). Since this structure is inaccessible from user space, the fstat() system call has to be used.

Of course, an LKM rootkit could easily intercept it too, but I wanted to show that current kernel rootkits can be successfully countered from user space, so I propose a user-level program. It can be ported to the kernel if needed. There, the rootkit can fake the file size only by changing the value of f->f_inode->i_size (for kernels older than 3.9.0, f->f_mapping->host->i_size), but this is at least fraught with problems when reading the file, since it is a serious low-level structure. In any case, the currently known rootkits touch neither this data nor fstat() (I dare to assume that now they will start).

The code of the program I developed can be found on GitHub. It offers a minimal set of features, enough to demonstrate that it does its job; it can and should be extended if desired. The program contains an array of strings with the names of the files most likely to be modified by LKM rootkits (the list is still far from complete) and checks each of them for masked content.

The main part of this check is in the function cmp_size(). If a rootkit is suspected, it prints a message stating how many bytes were actually read and how many the recorded file size promised. get_fsize() obtains the size of the file being checked from its descriptor:

Code:

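#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* start_files[], the index i and lets_talk() are defined elsewhere in the program. */

/* Return the size recorded in the file's inode, or 0 (with errno set) on error. */
off_t get_fsize(FILE *f)
{
    struct stat fst;

    errno = 0;
    if (fstat(fileno(f), &fst)) {
        perror("In get_fsize(): couldn't get fstat");
        return 0;
    }
    return fst.st_size;
}

/* Compare the inode size with the number of bytes an ordinary read returns.
 * Returns 1 if they differ (or on error), 0 if the file looks fine. */
short cmp_size(FILE *f)
{
    unsigned int i_size, n_read;
    char *fbuf;

    i_size = (unsigned int)get_fsize(f);
    if (errno) {
        printf("\x1b[1;31m***WARN***\x1b[0m Some problems with %s.\n",
               start_files[i]);
        return 1;
    }

    /* Zero-filled buffer, one byte larger than the expected content. */
    fbuf = calloc(i_size + 1, sizeof(char));
    if (fbuf == NULL) {
        perror("In cmp_size(): couldn't allocate buffer");
        return 1;
    }

    n_read = (unsigned int)fread(fbuf, 1, i_size, f);
    free(fbuf);     /* only the byte count matters for the comparison */

    if (i_size != n_read) {
        printf("\x1b[1;31m***WARN***\x1b[0m Something performs file tampering of %s : "
               "read %u bytes instead of %u.\n", start_files[i], n_read, i_size);
        lets_talk(f, i_size, n_read);
        return 1;
    }

    printf("\x1b[32m%s\x1b[0m looks fine to the userland\n", start_files[i]);
    return 0;
}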

If a mismatch is found, the program suggests possible actions: try to read the actual contents of the file (byte by byte, so that the rootkit cannot find its tags in the read buffer; this gives no guarantees, but why not try?) and replace the file with a safe copy. In that case the old file containing the rootkit data gets the suffix .old.
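
The byte-by-byte read can be sketched as follows. The function name `read_bytewise()` is hypothetical, and whether this actually defeats a particular rootkit depends on how that rootkit filters the buffer, so treat it as an experiment rather than a guarantee.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Read up to `cap` bytes from `path`, one byte per read() call, so that no
 * single read buffer ever holds a complete marker.
 * Returns the number of bytes stored in `out`, or -1 on error. */
static ssize_t read_bytewise(const char *path, char *out, size_t cap)
{
    int fd;
    size_t n = 0;
    ssize_t r = 0;

    assert(path != NULL && out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while (n < cap && (r = read(fd, out + n, 1)) == 1)
        n++;

    close(fd);
    return r < 0 ? -1 : (ssize_t)n;
}
```

If the number of bytes obtained this way matches the size reported by fstat() while an ordinary buffered read returns fewer, the difference is the hidden content.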

Now it remains to reboot, because the rootkit is still sitting in memory and doing its dirty business. Well, then we look for what was hidden in the now clean startup file, find the malicious binary, check it against VirusTotal or somewhere else and rejoice at how great we are.

In fairness, I should say that against rootkits that track access to their startup file and check its contents on every write, the methods described above may not help, but that, as they say, is a completely different story. We have considered a very specific special case here, but who knows, perhaps the idea we arrived at will prove useful in the future.

Finally
It is clear that the confrontation between antiviruses and malicious programs is an eternal struggle between a ~~beaver and a donkey~~ between good and evil, and the longer it lasts, the more sophisticated the tactics cybercriminals use for their misdeeds.
As you can imagine, the most reliable way to detect a rootkit is to analyze the disk, at least with the help of a Live CD. Only after that, with the malware in hand and all its mechanisms studied, can you start developing protection directly on the compromised OS, which is what we tried to do with one specific example. Against future, still unknown LKM rootkits our program will probably be useless, but that is one more reason to feel like a superhero again and improve the countermeasures.
If you, dear reader, are interested in studying all sorts of malicious programs for Linux, especially rootkits, keep a few useful links that helped me a lot in this area.

Good luck, remember to behave yourself, make system backups in time and brush your teeth in the morning.
