Secure Boot is hard! Lessons Learned from the BlackTech Firmware Compromise

At first glance, the latest BlackTech firmware attack isn’t all that interesting; it’s a fairly run-of-the-mill firmware / system-level attack, and that assessment is mostly accurate. There are, however, a couple of aspects worth digging into and discussing in more depth, specifically as they relate to larger system security and how these concerns extrapolate to Linux-based systems.

How did the BlackTech actors gain access? 

The BlackTech actors were able to perform their attacks only after acquiring administrative credentials. That really shouldn’t surprise anyone, as credential compromise is routine and to be expected at this point. Using compromised administrative credentials in and of itself isn’t that interesting, and it’s certainly not novel. What is interesting is that most existing cybersecurity guidelines are largely focused on how we keep an attacker out, not how we ensure systems continue to operate once an attacker is inside. Time and time again this approach is thwarted by attackers, as most CVEs and attacks ultimately require elevated privileges and are often chained together in order to gain root-level access. This is contrary to how security solutions such as Kevlar Embedded Security are designed. Kevlar Embedded Security assumes an attacker has root-level access and continues to enforce concepts such as application / library allowlisting, read-only filesystems, system call filtering, and secure service configuration. One key takeaway: the ability to operate through an attack even when an attacker has root-level access is an inherent design principle of Kevlar Embedded Security, not something that can just be bolted on. Principles such as this are key to secure system design and show why security must be integrated into a design from Day 1. 
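As a rough illustration of one of those concepts, system call filtering, here is a minimal sketch using libseccomp on Linux. This is our own simplified example (the allowlisted syscall set is hypothetical), not how Kevlar Embedded Security implements the feature:

```c
/* Minimal system call filtering sketch using libseccomp.
 * Build with: gcc filter.c -lseccomp */
#include <seccomp.h>
#include <unistd.h>

int main(void)
{
    /* Default action: kill the process for any syscall not allowlisted. */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
    if (!ctx)
        return 1;

    /* Allow only the handful of syscalls this service legitimately needs. */
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);

    if (seccomp_load(ctx) < 0) {
        seccomp_release(ctx);
        return 1;
    }
    seccomp_release(ctx);

    /* Anything beyond read/write/exit_group now terminates the process,
     * even if an attacker gains code execution within it. */
    write(STDOUT_FILENO, "filter active\n", 14);
    return 0;
}
```

Even if an attacker compromises this process with root-level access, the kernel refuses any syscall outside the allowlist, which is the same basic posture of operating through an attack described above.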

Multi-Factor Authentication 

It's not explicitly stated, but it can be assumed the administrator credentials compromised by BlackTech did not make use of Multi-Factor Authentication (MFA). As a basic principle of security, MFA should always be used, in addition to limiting where accounts, especially privileged accounts, can log in from. If you read our [SIEM BLOG], you’ll know that you should also log and monitor system access. Note that we said log and monitor: it’s not as easy as just logging; we also need to monitor metrics, activity, and system events, which brings us to the next interesting bit of the BlackTech firmware compromise. 

Logging 

As is often the case, malicious actors modify various logging facilities to hide their presence. In the latest BlackTech firmware compromise, it would appear they handled this through several means, including modifications of the firmware itself as well as the use of output filters within the IOS runtime. This suggests additional steps that should be added to our monitoring activities. First, we need some form of watchdog or periodic check-in from our logging / monitoring components, so we know they are still alive and performing as expected. An astute reader will recognize that the check-in should use some form of cryptographic hash, maybe even with mutual authentication, so that it can’t be spoofed. They will also notice that we probably want to include some form of timestamp, counter, or proof of work to help prevent possible replay attacks (a sketch of such a check-in follows below).

We also want to record, monitor, correlate, and investigate events such as logins (and where those logins happen), system uptime / load / reboot activities, firmware updates, filters, access control rules, etc. A fundamental principle of both defense-in-depth and Zero Trust architectures is not assuming an administrator (or really any login) originated from where we expected it to. As part of overall system monitoring, we want to investigate system reboots and determine whether they were planned or unexpected. An unexpected reboot could be an early indicator of failing hardware, but could also be an indicator of compromise. 
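To make the authenticated check-in idea concrete, here is a minimal sketch using OpenSSL’s HMAC. The shared key, message format, and counter handling are placeholders for illustration, not a prescribed protocol:

```c
/* Sketch of an authenticated check-in from a logging/monitoring agent.
 * The heartbeat carries an HMAC plus a monotonic counter so it can't be
 * trivially spoofed or replayed. Build with: gcc checkin.c -lcrypto */
#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* In a real system this key would be provisioned securely, not hardcoded. */
    const unsigned char key[] = "example-shared-secret";
    static uint64_t counter;   /* persisted, monotonic counter in practice */
    char msg[128];

    counter++;  /* replay protection: receiver rejects non-increasing counters */
    snprintf(msg, sizeof(msg), "logger-alive counter=%llu",
             (unsigned long long)counter);

    unsigned char mac[EVP_MAX_MD_SIZE];
    unsigned int mac_len = 0;
    HMAC(EVP_sha256(), key, sizeof(key) - 1,
         (const unsigned char *)msg, strlen(msg), mac, &mac_len);

    printf("%s hmac=", msg);
    for (unsigned int i = 0; i < mac_len; i++)
        printf("%02x", mac[i]);
    printf("\n");
    return 0;
}
```

The receiving side would recompute the HMAC with the same key, reject any message that fails the check, and reject any counter that does not strictly increase.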

We can assume some of BlackTech’s firmware modifications removed logging services, altered the way commands from the actors were processed, and made changes to other services. If we use a Linux system as a basis, there are a couple of ways to hide commands from a CLI. In most cases, hiding or masking starts with disabling the shell history, maybe by using `unset HISTFILE` or `export HISTFILESIZE=0`. There is, of course, a slew of other options for this depending on the shell and the CLI environment itself. There are even some explicit filtering options built into bash; just look at the bash man page and you’ll see all kinds of mischief that can be had by modifying environment variables. An attacker who wanted to make these changes permanent, or maybe only filter commands from their Command & Control (C2) infrastructure, might choose to modify bash itself, such that any command received over the network from the C2 infrastructure is never logged. Being able to prevent (permanent) modifications of bash is a function of integrity verification, read-only filesystems, and application allowlisting (which itself needs to provide some form of integrity verification, not just check a list of allowed executables or libraries). All of this needs to be tied into the secure boot mechanism and the establishment of a trusted computing base. All of these properties are also feature offerings within Kevlar Embedded Security and/or Star Lab services. 
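As a simple illustration of what integrity verification of an executable such as bash could look like, here is a sketch that hashes `/bin/bash` and compares it against a known-good value. The reference hash is a placeholder; a real allowlisting solution also has to protect the reference values themselves and perform the check at load time, not just on demand:

```c
/* Sketch of verifying an executable (e.g., /bin/bash) against a known-good
 * SHA-256 value before trusting it. The expected hash below is a placeholder.
 * Build with: gcc verify_bash.c -lcrypto */
#include <openssl/sha.h>
#include <stdio.h>
#include <string.h>

static int sha256_file(const char *path, unsigned char out[SHA256_DIGEST_LENGTH])
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    SHA256_CTX ctx;
    SHA256_Init(&ctx);

    unsigned char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
        SHA256_Update(&ctx, buf, n);
    fclose(f);
    SHA256_Final(out, &ctx);
    return 0;
}

int main(void)
{
    /* Placeholder reference hash -- in practice this comes from a signed,
     * read-only manifest established at build/provisioning time. */
    static const unsigned char expected[SHA256_DIGEST_LENGTH] = { 0 };

    unsigned char actual[SHA256_DIGEST_LENGTH];
    if (sha256_file("/bin/bash", actual) != 0) {
        fprintf(stderr, "unable to read /bin/bash\n");
        return 1;
    }
    if (memcmp(actual, expected, sizeof(actual)) != 0) {
        fprintf(stderr, "integrity check failed: /bin/bash has been modified\n");
        return 1;
    }
    puts("/bin/bash matches the allowlisted hash");
    return 0;
}
```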

Looking at the above, we can see several other properties we may want to include in our secure systems. We probably want a way to check / verify the environment (especially for trusted applications), and potentially even monitor environment variables out of band. This suggests we may want kernel-based solutions for monitoring environment variables, where it makes sense to do so. 
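As a user-space approximation of that idea (a kernel-based monitor would hook deeper, and the heuristic here is purely illustrative), a monitoring process could inspect another process’s environment out of band via `/proc/<pid>/environ`:

```c
/* Sketch of out-of-band inspection of another process's environment via
 * /proc/<pid>/environ. Usage: ./envmon <pid> */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    char path[64];
    snprintf(path, sizeof(path), "/proc/%s/environ", argv[1]);

    FILE *f = fopen(path, "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }

    /* Entries are NUL-separated; flag anything that tampers with history
     * (an illustrative heuristic, not an exhaustive detection rule). */
    char entry[4096];
    size_t len = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c != '\0' && len < sizeof(entry) - 1) {
            entry[len++] = (char)c;
            continue;
        }
        entry[len] = '\0';
        if (len && (strstr(entry, "HISTFILE") || strstr(entry, "HISTFILESIZE")))
            printf("suspicious: %s\n", entry);
        len = 0;
    }
    fclose(f);
    return 0;
}
```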

What about Secure Boot? 

Now that we have tied BlackTech’s actions to secure boot, we can look at some of the failings of secure boot on these routers and the properties of secure boot we want to ensure are in place for all systems. The public reporting on the latest BlackTech firmware compromise is a bit sparse on details, so we’re going to have to make some assumptions regarding what the devices’ secure boot looked like and how the actors were able to achieve their goals. 

We know BlackTech was able to modify firmware in memory and bypass some form of initial secure boot. This leads us to conclude there was likely a time-of-check, time-of-use (TOCTOU) flaw enabling the firmware to be modified after it had been verified. It also suggests there was only a cursory inspection of the firmware by the ROM monitor. It seems likely the attackers were able to start the firmware update process using a legitimate firmware image, and after the authenticity checks were performed, but before the firmware was written to flash, they were able to modify it in memory. This enabled the attackers to add their backdoor, disable logging, and add various other capabilities as required. This class of attack falls squarely into the TOCTOU category: for whatever reason, a full verification wasn’t done after the image was written to flash or during system boot. Mitigating TOCTOU-style attacks requires careful consideration and attention to design. Whenever possible, critical operations such as verifying a firmware image and writing it to flash should be handled as atomic operations, and other system activities, including DMA, should be paused to prevent interference and narrow the window for compromise. 
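A minimal sketch of that verify-then-program pattern follows. The expected digest and the use of a plain file as the “flash” target are placeholders; the point is that verification and programming operate on the same locked buffer, followed by a read-back check:

```c
/* Sketch of a TOCTOU-resistant update flow: the image is read once into a
 * locked buffer, verified there, written out from that same buffer, and then
 * read back and re-verified. The expected digest below is a placeholder, and
 * real flash programming also involves erase cycles and should quiesce
 * DMA-capable peripherals. Build with: gcc update.c -lcrypto */
#include <openssl/sha.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Placeholder for the digest carried in a signed update manifest. */
static const unsigned char expected[SHA256_DIGEST_LENGTH] = { 0 };

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <image> <target>\n", argv[0]);
        return 1;
    }

    /* 1. Read the candidate image exactly once, into memory we then lock. */
    FILE *in = fopen(argv[1], "rb");
    if (!in) { perror("image"); return 1; }
    fseek(in, 0, SEEK_END);
    long size = ftell(in);
    rewind(in);
    unsigned char *img = malloc(size);
    if (!img || fread(img, 1, size, in) != (size_t)size) {
        fprintf(stderr, "failed to read image\n");
        return 1;
    }
    fclose(in);
    mlock(img, size);

    /* 2. Verify this exact buffer (a real update would check a signature
     *    chained to a key rooted in hardware, not just a hash). */
    unsigned char digest[SHA256_DIGEST_LENGTH];
    SHA256(img, size, digest);
    if (memcmp(digest, expected, sizeof(digest)) != 0) {
        fprintf(stderr, "image failed verification, refusing to program\n");
        return 1;
    }

    /* 3. Program the target from the same verified buffer, never from a
     *    source an attacker could modify between check and use. */
    FILE *out = fopen(argv[2], "wb");
    if (!out || fwrite(img, 1, size, out) != (size_t)size) {
        fprintf(stderr, "failed to write target\n");
        return 1;
    }
    fclose(out);

    /* 4. Read back what actually landed on the target and re-verify it. */
    unsigned char *check = malloc(size);
    FILE *back = fopen(argv[2], "rb");
    if (!back || !check || fread(check, 1, size, back) != (size_t)size ||
        memcmp(check, img, size) != 0) {
        fprintf(stderr, "read-back verification failed\n");
        return 1;
    }
    fclose(back);
    puts("update verified and programmed");
    return 0;
}
```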

BlackTech was also able to force a firmware rollback, most likely so they could exploit vulnerabilities present in the older, unpatched firmware and better hide their activities. While there are occasionally reasons to permit rollback, generally speaking it should be inhibited. From a design perspective, we need to ensure the software update and secure boot mechanisms work together to prevent rollback or flashing of previous firmware versions. This introduces support and field service concerns that need to be addressed as part of the device lifecycle. 
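An anti-rollback check might look like the following minimal sketch, assuming a hypothetical monotonic counter (here a file stands in for eFUSEs, RPMB, or other secure NVRAM):

```c
/* Anti-rollback sketch: the update path refuses any image whose security
 * version is lower than a stored monotonic counter, and only advances the
 * counter on success. The counter storage here is a placeholder. */
#include <stdio.h>

static const char *COUNTER_PATH = "/var/lib/fw/rollback_counter"; /* placeholder */

static unsigned int read_counter(void)
{
    unsigned int v = 0;
    FILE *f = fopen(COUNTER_PATH, "r");
    if (f) {
        if (fscanf(f, "%u", &v) != 1)
            v = 0;
        fclose(f);
    }
    return v;
}

static int write_counter(unsigned int v)
{
    FILE *f = fopen(COUNTER_PATH, "w");
    if (!f)
        return -1;
    fprintf(f, "%u\n", v);
    return fclose(f);
}

/* Called only after the image's signature has already been verified. */
int check_and_commit_version(unsigned int image_security_version)
{
    unsigned int current = read_counter();
    if (image_security_version < current) {
        fprintf(stderr, "rollback rejected: image version %u < stored %u\n",
                image_security_version, current);
        return -1;
    }
    return write_counter(image_security_version);
}

int main(void)
{
    /* Example: an image carrying security version 7. */
    return check_and_commit_version(7) == 0 ? 0 : 1;
}
```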

The latest targets of BlackTech were older, mostly unsupported routers, which can be assumed to have various weaknesses in secure boot. The lack of support and general End of Life status of the platforms is its own can of worms that really should be addressed separately. From the public discussions of the bootloader replacement and the way the ROM monitor worked on these devices, it can be assumed that a key component of secure boot was missing: there was no (or at best a very limited) hardware root of trust, and the public key hashes were not stored in hardware. 

The key to any secure boot system is a hardware root of trust. We need to ensure the very first code to execute is either authenticated or inherently trusted. In most systems, this early code or processor boot block takes the form of ROM. Because this early boot code is generally ROM, it can’t be changed in the field (so we had better get it right the first time). The boot ROM is then responsible for validating the hashes of the next stage (generally the first software component) and starting its execution. This is how we begin to build both a trusted and secure boot environment. Reading into this more, we can see there is a wide variety of early boot attacks that can be conducted, including glitching (to bypass various checks or possibly influence the loading of keys / hashes), TOCTOU, and maybe even firmware rollback, depending on the implementation. Looking at the early boot attacks, we can see how all of our security decisions come together and why we really need to implement defense-in-depth. 

To verify the first software stage in the most secure fashion, we want to ensure we check a hash rooted in hardware. Given the cost of hardware and either ROM or eFUSEs, this is generally a hash of the authorized public key (which itself can then be included with the software or firmware), as it’s too expensive and takes too much space to put the entire public key in hardware. It’s also common practice to provide facilities for storing the hashes of two (or more) public keys and for revoking one or both keys after a compromise (or simply as they age out). Enabling secure boot in this fashion limits the device lifetime and introduces additional support and logistics concerns, but it provides the most secure base from which to boot the rest of the system. The devices targeted by BlackTech in the latest round of firmware attacks didn’t make use of hashes rooted in hardware, and the early ROM monitor didn’t use hashes or keys rooted in hardware to verify the integrity and authenticity of the firmware images. Even systems such as Intel’s TXT use a variation of this technique for their secure boot implementations: TXT uses Authenticated Code Modules (ACMs) that are signed by Intel, the public key is included in the ACM itself, and a hash of the public key is stored in mask ROM within the processor so the CPU can verify the ACM before executing it. 
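To illustrate, here is a sketch of that first-stage verification flow under the assumptions above: only a SHA-256 of the authorized public key fits in hardware, the full key ships alongside the firmware, and `verify_image_signature()` is a stub standing in for a real signature check:

```c
/* Sketch of first-stage verification when only a hash of the authorized
 * public key fits in hardware. The fused hash, key, and image below are
 * placeholders. Build with: gcc rot.c -lcrypto */
#include <openssl/sha.h>
#include <stdio.h>
#include <string.h>

/* Value burned into eFUSEs / mask ROM at manufacturing time (placeholder). */
static const unsigned char fused_key_hash[SHA256_DIGEST_LENGTH] = { 0 };

/* Stub: in a real boot ROM this verifies the image's signature with the
 * supplied public key and returns nonzero only on success. */
static int verify_image_signature(const unsigned char *pubkey, size_t keylen,
                                  const unsigned char *image, size_t imglen)
{
    (void)pubkey; (void)keylen; (void)image; (void)imglen;
    return 0; /* placeholder */
}

int verify_first_stage(const unsigned char *pubkey, size_t keylen,
                       const unsigned char *image, size_t imglen)
{
    /* Step 1: the key shipped with the firmware must hash to the fused value. */
    unsigned char key_hash[SHA256_DIGEST_LENGTH];
    SHA256(pubkey, keylen, key_hash);
    if (memcmp(key_hash, fused_key_hash, sizeof(key_hash)) != 0) {
        fprintf(stderr, "public key does not match hardware root of trust\n");
        return -1;
    }

    /* Step 2: only then is the key trusted to verify the image itself. */
    if (!verify_image_signature(pubkey, keylen, image, imglen)) {
        fprintf(stderr, "image signature verification failed\n");
        return -1;
    }
    return 0;
}

int main(void)
{
    unsigned char dummy_key[32] = { 0 }, dummy_image[64] = { 0 };
    return verify_first_stage(dummy_key, sizeof(dummy_key),
                              dummy_image, sizeof(dummy_image)) == 0 ? 0 : 1;
}
```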

Since we are proponents of defense-in-depth, it’s worth noting that using hardware to store either public keys or their hashes introduces several other concerns for the software / firmware development lifecycle, namely, how do we protect the private keys during manufacturing? Generally speaking, this takes the form of a hardware security module (HSM), limited / controlled access, and logging / monitoring of its use. 

At the core of the latest BlackTech firmware attack were a variety of secure boot and software update failures, showing just how important secure system design really is. The techniques used by BlackTech show why systems on the edge need defense-in-depth, mandatory access controls, auditing, and monitoring, and further, why we need to understand how all of these pieces work together. 

