On the xz-utils backdoor
Trust
The Linux ecosystem (the kernel and distros like Ubuntu, Debian, and Fedora, which together form the backbone of modern digital infrastructure) operates on multiple distinct networks of trust. This includes the explicit network of trust among distro maintainers: each maintainer trusts every other maintainer to act in the best interest of the project. However, there are other, equally important networks of trust that are less obvious.
Maintainers implicitly trust mailing list users to act in good faith and without duplicity. Users and system administrators implicitly trust every package they install with root access to the entire filesystem. Software users and authors implicitly trust common runtime dependencies like glibc, libsystemd, and many more (including liblzma), expecting that these libraries will not compromise the stability or security of their own code.
These networks of trust were designed before Linux took over the world (and arguably before computers took over the world). They sprang up in the early 1990s and remain largely unchanged. Relationships of trust that were appropriate for a hobbyist project in 1994 have become entirely inadequate as the foundation of all digital infrastructure in 2024.
What Happened?
A malicious actor (possibly a group) simultaneously exploited many of these relationships of trust over the course of at least three years. They used "sock puppet" accounts on mailing lists to pressure the sole maintainer of the xz-utils package into appointing a second maintainer. They exploited that maintainer's trust in skilled contributors to pressure him into selecting their persona, "Jia Tan", as the trusted second maintainer. They exploited the trust distro maintainers place in upstream package maintainers to submit a signed-but-backdoored source tarball for packaging in those distros, even though the tarball could not be reproduced from the git repo. They further exploited the trust those same distro maintainers have in both mailing list users and upstream maintainers to pressure those distros into adopting the "upgraded" version more quickly.
The attacker exploited the technological trust that is placed in RPM and deb packages to place a shared library on the system that would be loaded by unrelated programs, including those running as root like sshd. They exploited the trust that the glibc ifunc mechanism places in loaded code to override functions from unrelated libraries that they knew would be used by sshd when accepting remote connections. They exploited the trust that many system administrators have in sshd to ensure their exploit would be exposed to the internet.
The exploit was smuggled into the xz-utils project encrypted and scrambled, masquerading as binary test data, thus defeating the code review process. The shell script to decrypt and activate the exploit was hidden in the release tarballs, taking advantage of the common practice of releasing non-reproducible artifacts and avoiding both code review and transparent git history. False positives in third-party security checks (fuzzers and memory checkers) are common enough that the trusted second maintainer was able to sweep genuine red flags under the rug. Finally, the ifunc mechanism defeated the OS- and hardware-level protections against runtime code modification. Normally the dynamic linker resolves function addresses before a process starts and then marks the pages holding them read-only, preventing runtime modification of function pointers into dynamically loaded libraries. However, because ifunc resolver functions run arbitrary code while the dynamic linker is still performing relocation, before those read-only protections are applied, the function pointers were effectively unprotected and easily overwritten by the exploit.
Update 2024-04-13: Recent analysis from Kaspersky shows that the ifunc mechanism is only used to achieve early code execution. The actual hook used to intercept calls to libcrypto functions is the rtld-audit linker auditing feature. However, the overall argument stands: the complex, tightly-coupled nature of runtime linking in Linux left a function pointer in writable memory, which the exploit was able to overwrite and use to gain control over function calls into unrelated libraries.
What To Do?
People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.
—Pedro Domingos
This is a true system failure. The attacker found many distinct weaknesses, each of which was unanticipated by the responsible gatekeepers and watchdogs, and combined them into a novel exploit that appears to have been detected only by luck. Patching any one weakness in the chain will not "solve the problem" nor prevent future attacks from succeeding. For example:
- Requiring code review won't prevent obfuscated malicious code from being merged, especially in oft-overlooked places like test data.
- Requiring reproducible artifacts or builds won't prevent unexpected (and possibly malicious) interactions between those individual reproducible components.
- Requiring the use of memory-safe languages won't prevent logic errors or social engineering attacks.
This isn't to say that these tools aren't valuable for many reasons; rather, it is to say that they have already proven inadequate to correct the structural deficiencies in the process by which security-critical software is written and run today. Recurring system-level failures can only be prevented by re-examining the system as a whole.