The big news this week was that OpenSSH has an unauthenticated Remote Code Execution exploit. Or rather, it had one that was fixed in 2006 and then inadvertently reintroduced in version 8.5p1 from 2021. The flaw is a signal handler race condition, where async-signal-unsafe code is called from within the SIGALRM handler. What does that mean?
To understand, we need to dive into the world of Linux signal handling. Signals are sent by the operating system to individual processes, to notify the process of a state change. For example, SIGHUP, or SIGnal HangUP, originally indicated the disconnection of the serial line of the terminal where a program was running. SIGALRM is SIGnal ALaRM, which indicates that a timer has expired.
What’s interesting about signal handling in Unix is how it interrupts program execution. The operating system has full control over execution scheduling, so in response to a signal, the scheduler stops execution and immediately handles the signal. If no signal handler function is defined, that means a default handler provided by the OS. But if a handler is set, that function is executed immediately. And here is the dangerous part: program execution can be anywhere in the program when it is interrupted. The signal handler runs, and then execution continues from where it left off. From Andries Brouwer in The Linux Kernel:
It is difficult to do interesting things in a signal handler, because the process can be interrupted in an arbitrary place, data structures can be in arbitrary state, etc. The three most common things to do in a signal handler are (i) set a flag variable and return immediately, and (ii) (messy) throw away all the program was doing, and restart at some convenient point, perhaps the main command loop or so, and (iii) clean up and exit.
The term async-signal-safe describes functions that have predictable behavior even when called from a signal handler, with execution interrupted in an arbitrary state. How can a function be unsafe? Let’s consider the async-signal-unsafe free(). Here, sections of memory are marked as free, and then pointers to that memory are added to the table of free memory. If program execution is interrupted between these points, we have an undefined state where memory is both free and still allocated. A second call to free() while execution is paused will corrupt the free memory data structure, as the code is not written to be called in this reentrant manner.
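A toy model makes the reentrancy problem concrete. The sketch below is not glibc's allocator and does not use its data layout; `ToyFreeList` is a hypothetical singly linked free list whose free() takes two steps, and the `interrupt` callback stands in for a signal handler that fires between them.

```python
class ToyFreeList:
    """Singly linked free list whose free() is deliberately non-reentrant."""

    def __init__(self):
        self.head = None   # most recently freed block
        self.next = {}     # block -> next free block

    def free(self, block, interrupt=None):
        old_head = self.head           # step 1: read the current list head
        if interrupt is not None:
            interrupt()                # simulated signal handler runs here
        self.next[block] = old_head    # step 2: link the block in front...
        self.head = block              # ...and publish it as the new head

    def blocks(self):
        """Walk the list from the head and return every reachable block."""
        out, cur = [], self.head
        while cur is not None:
            out.append(cur)
            cur = self.next[cur]
        return out


def demo():
    # Two ordinary, sequential calls: both blocks end up on the list.
    safe = ToyFreeList()
    safe.free("A")
    safe.free("B")

    # Same two calls, but the second one runs in the middle of the first.
    raced = ToyFreeList()
    raced.free("A", interrupt=lambda: raced.free("B"))
    return safe.blocks(), raced.blocks()
```

In the raced case the outer call resumes with a stale copy of the head and overwrites the inner call's update, so block "B" silently disappears from the list. A real allocator has far more state to get wrong, which is what makes this kind of corruption exploitable.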
So back to the OpenSSH bug. The SSH daemon sets a timer when a new connection arrives, and if authentication has not completed when the timer expires, the SIGALRM signal is generated. The problem is that this signal handler calls syslog(), which is not async-signal-safe because it can call malloc() and free() internally. The trick is to start an SSH connection, wait for the timeout, and send the last bytes of a public-key packet just before the timeout signal fires. If the public key handling function just happens to be at the right point in a malloc() call when the SIGALRM handler reenters malloc(), the heap is corrupted. This corruption overwrites a function pointer. Replace the pointer with an address where the incoming key material is stored, and suddenly we have shellcode execution.
There are some obstacles to turning this into a working exploit. The first is that it is a race condition, requiring very tight timing to interrupt program execution at exactly the right place. The randomness of network timing makes this a high hurdle. Then, all major distributions use Address Space Layout Randomization (ASLR), which should make overwriting that pointer very difficult. It turns out that, also on all major distributions, ASLR is somewhat broken. OK, on 32-bit installations, it’s completely broken. On the Debian system tested, there is literally a single bit of ASLR in play for the glibc library. It can be placed in one of two possible memory locations.
Assuming the default settings for maximum SSH connections and LoginGraceTime, it takes an average of 3-4 hours to win the race condition and trigger the bug, and then there is a 50% chance of guessing the correct address on the first try. This seems to put the average time at five and a quarter hours to crack a 32-bit Debian machine. A 64-bit machine has ASLR that works a little better. A working exploit had not been demonstrated there when the vulnerability was published, but the authors suggest that one could succeed with roughly a week of attacking.
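The average depends on how the retries are modeled, so it is worth spelling out the arithmetic. The sketch below assumes 3.5 hours per race win (the midpoint of the 3-4 hour range, an assumption for illustration) and compares two simple models; the function names are made up for this example, and neither model is claimed to be the one the researchers used.

```python
def hours_if_second_guess_is_certain(race_hours: float, p_guess: float) -> float:
    """Model 1: a wrong first guess is always followed by a correct second one.

    That holds if the two candidate addresses stay fixed between attempts:
    one race win always, plus a second one with probability (1 - p_guess).
    """
    return race_hours * (1 + (1 - p_guess))


def hours_if_guesses_are_independent(race_hours: float, p_guess: float) -> float:
    """Model 2: every race win is an independent guess (layout re-randomized).

    The number of wins needed is geometric, with mean 1 / p_guess.
    """
    return race_hours / p_guess


RACE_HOURS = 3.5   # assumed midpoint of the 3-4 hour estimate
P_GUESS = 0.5      # one bit of ASLR: two candidate locations

print(hours_if_second_guess_is_certain(RACE_HOURS, P_GUESS))   # 5.25
print(hours_if_guesses_are_independent(RACE_HOURS, P_GUESS))   # 7.0
```

The first model reproduces the five and a quarter hours quoted above. If every attempt is instead an independent coin flip, the average stretches to seven hours.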
So which systems should we really be concerned about? The regression was introduced in 8.5p1 and fixed in 9.8p1. This means that Debian 11, RHEL 8, and their derivatives are in the clear, as they ship older OpenSSH versions. Debian 12 and RHEL 9 are affected, although both of these distributions now have updates available that fix the problem. If you’re on one of those distributions, especially a 32-bit version, it’s time to update OpenSSH and restart the service. You can check the OpenSSH version by running nc -w1 localhost 22 -i 1 to see if you might be vulnerable.
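The banner returned by that command can also be checked programmatically. The following is a rough sketch that compares the upstream version in an SSH banner against the 8.5p1 to 9.8p1 window described above. The helper names and the sample banners are invented for illustration. Distributions backport fixes without changing the upstream version number, so a match only means the package changelog deserves a closer look.

```python
import re
import socket

REGRESSION_START = (8, 5)   # regression introduced in 8.5p1
FIXED_IN = (9, 8)           # fixed upstream in 9.8p1


def read_banner(host: str = "localhost", port: int = 22, timeout: float = 2.0) -> str:
    """Connect and return the first line the server sends (the SSH banner)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return sock.recv(256).decode("ascii", errors="replace").strip()


def parse_openssh_version(banner: str):
    """Extract (major, minor) from a banner, or None if it is not OpenSSH."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def in_regression_range(banner: str):
    """True if the upstream version falls in [8.5, 9.8), None if unknown."""
    version = parse_openssh_version(banner)
    if version is None:
        return None
    return REGRESSION_START <= version < FIXED_IN
```

For example, `in_regression_range("SSH-2.0-OpenSSH_9.2p1")` returns True, while a banner reporting 8.4p1 or 9.8 returns False. Passing the result of `read_banner()` checks a live server.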
Polyfill
The Polyfill service used to be a useful tool for pulling in JavaScript functions to emulate newer browser features in browsers that weren’t quite up to the task. This worked by including the polyfill JS script from polyfill.io. The problem is that the company Funnull bought the polyfill domain and GitHub account, and started serving malicious scripts instead of the legitimate polyfill function.
The list of domains and companies caught in this supply chain attack is quite extensive, with nearly 400,000 hosts still trying to link to the domain as of July 3. We say “trying”, as providers have taken note of the Sansec report that broke the story. Google has blocked the associated domains from advertising, Cloudflare is rewriting calls to polyfill to point at a clean cache, and Namecheap has blackholed the domain, ending the attack. It’s a reminder that just because a domain is trustworthy now, it may not be in the future. Be careful what you link to.
Pack it up
We are no strangers to the drama around CVE severity. There can be a desire to make a discovered vulnerability appear serious, and occasionally this results in a wild exaggeration of the impact of an issue. In this case, the node-ip project has an issue, CVE-2023-42282, that initially scored a CVSS of 9.8. The author of node-ip has maintained that it is not a vulnerability at all, as it requires untrusted input to be passed to node-ip and then used for an authorization check. It seems to be a reasonable objection: if an attacker can manipulate the source IP address in this way, the source IP is untrusted, regardless of this issue in node-ip.
Regardless, [Fedor] made the call to simply archive the node-ip project in response to the apparently bogus CVE and the endless stream of unintentional harassment over the issue. Audit tools began to warn developers about the problem, and those developers began pinging the project. With apparently no way to fight the report, archiving the project seemed like the best solution. However, the bug has been fixed, and GitHub has lowered the severity to “low” in its advisory. As a result, [Fedor] announced that the project is coming back, and it is indeed an active project on GitHub again.
Bits and Bytes
[sam4k] found a remote Use After Free (UAF) in the Linux Transparent Inter Process Communication (TIPC) service, which can be exploited to achieve RCE. This is a kind of toy vulnerability, found while preparing a talk on Linux kernel bug hunting. It’s also not a protocol that’s even built into the kernel by default, so the potential ramifications here are pretty low. The problem is in fragmentation handling, where the error handling misses a check for the last fragment buffer and tries to free it twice. It was fixed this May, in kernel version 6.8.
CocoaPods is a dependency manager for Swift/Objective-C projects, and it had a trio of serious problems. The most interesting was the result of a migration, where many packages lost their association with the correct maintainer account. Using the CocoaPods API and a maintainer email address, it was possible for arbitrary users to claim those packages and make changes. This and several other issues were fixed late last year.