Firmware Fuzzing 101

Adam Van Prooyen

December 17, 2020

(with additional contribution from Richard Bae)

Introduction

Embedded applications are some of the most widespread software in the world. Whether they run on routers, IoT devices, or SCADA systems, they vary widely in architecture, use case, and purpose. Very few of these devices were built with security in mind, and even fewer have ever been fuzzed. This makes them prime targets for fuzzing campaigns.

Prerequisites

This is a blog post for advanced users with binary analysis experience. For this post, you will need:

Mayhem and the Mayhem CLI
Docker
Netgear N300 MIPS firmware image
Binary Ninja (or other disassembler) and a strong knowledge of reverse engineering. We will be looking at MIPS assembly code using Binary Ninja's high level intermediate language (HLIL)


What's Special about Firmware?

Fuzzing firmware presents a set of challenges that rarely appear together in other targets. Because source code is usually not available, harnessing must be performed at the binary level, which demands more expertise. The main challenges are:

Dependency on specific hardware features present on the physical device
Non-x86 processor architecture
Non-glibc C standard library
Lack of available source code or documentation

In this post, we will cover how to deal with each one of these challenges in the firmware fuzzing context.

Example: Netgear N300 a.k.a. DGN2200v4

For this blog post we will be looking at the Netgear N300 (henceforth referred to as DGN2200v4) router firmware image. This is a good target to look at because while it is a Linux firmware binary, it presents all of the challenges listed above. Specifically, this firmware:

Relies on specific hardware features for synchronization used by its programs
Is a MIPS Linux firmware
Uses uClibc instead of glibc C standard library
Has no source code and very few debug symbols available in binaries of interest

This presents quite a challenge for fuzzing but we will cover how to extract, harness, package, and fuzz this firmware.

Environment Setup

For this post, we have set up a docker image containing all the tools and files necessary. Run the following to start the docker container:

docker run -ti -v $PWD/share:/share forallsecure/fuzzing-firmware

Alternatively, you can install the following tools manually:

QEMU user static (ex: apt-get install -y qemu-user-static)
Binwalk (ex: python -m pip install git+https://github.com/ReFirmLabs/binwalk)
Jefferson (ex: python -m pip install git+https://github.com/sviehb/jefferson)
MIPS cross compiler - https://uclibc.org/downloads/binaries/0.9.30.1/cross-compiler-mips.tar.bz2
Mayhem CLI
GDB multiarch (ex: apt-get install -y gdb-multiarch)

In addition to the Docker image or the manual setup above, make sure you have access to Binary Ninja or a similar disassembler.

Extracting Firmware

Extracting firmware can sometimes be difficult due to custom firmware layouts and encryption. Luckily many firmwares, including this one, are just compressed file systems. This means that off-the-shelf tools such as binwalk can easily extract them. Run the following to extract:

binwalk -Me DGN2200v4-V1.chk

Note: The -e option tells binwalk to extract the files it identifies, and the -M option makes it recurse into the extracted files in case there are nested containers within the firmware.

After running binwalk, the extracted firmware will be in _DGN2200v4-V1.chk.extracted.

Before doing anything else, let's clean up the extracted firmware. binwalk creates a lot of output, but the only thing we need from it is the filesystem. Let's move the filesystem out of the extraction folder and rename it to root:

$ mv _DGN2200v4-V1.chk.extracted/jffs2-root/fs_1 root
$ rm -rf _DGN2200v4-V1.chk.extracted

Picking a target

Now that we have extracted the firmware, we need to identify a binary for harnessing. Two good options for router targets are:

User facing web servers
Custom internal binaries such as database managers, etc.

We can also find interesting binaries by getting another similar firmware (such as a similar model by another manufacturer) and using a script to compare which binaries are unique to each system. While this can generate some noise (such as two routers using sqlite vs postgres), it greatly narrows down the number of binaries to look through and is a good first step.

For this post, we will be looking at DGN2200v4's httpd web server. Web servers on embedded systems (especially routers) are particularly interesting because they tend to control many functions beyond serving web pages, including device bring-up, authentication, and process management. For this same reason, they can also be tricky to get running.

First look at httpd

First, let's figure out what kind of binary httpd is:

$ file root/usr/sbin/httpd
root/usr/sbin/httpd: ELF 32-bit MSB executable, MIPS, MIPS32 version 1 (SYSV), dynamically linked, interpreter /lib/ld-uClibc.so.0, stripped

From this, we learn that this is a 32-bit big-endian MIPS binary that uses a non-standard libc (uClibc). Let's work past these roadblocks one at a time.
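To make the file output less magical, here is a minimal C sketch of how the same three facts (32-bit, big-endian, MIPS) can be read from the first 20 bytes of an ELF header. The struct and function names are made up for illustration; the offsets and constants come from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative summary of the ELF header fields that `file` reported. */
struct elf_summary {
    int is_elf;        /* 1 if the magic bytes match */
    int bits;          /* 32 or 64, 0 if unknown */
    int big_endian;    /* 1 for MSB, 0 for LSB */
    uint16_t machine;  /* e_machine, e.g. 8 for EM_MIPS */
};

/* Parse the start of an ELF file; at least 20 bytes are needed. */
struct elf_summary summarize_elf(const unsigned char *hdr, size_t len)
{
    struct elf_summary s = {0, 0, 0, 0};

    /* The literal is split so that 'E' is not read as part of the hex escape. */
    if (len < 20 || memcmp(hdr, "\x7f" "ELF", 4) != 0)
        return s;
    s.is_elf = 1;

    /* e_ident[EI_CLASS] at offset 4: 1 = 32-bit, 2 = 64-bit */
    if (hdr[4] == 1)
        s.bits = 32;
    else if (hdr[4] == 2)
        s.bits = 64;

    /* e_ident[EI_DATA] at offset 5: 1 = little-endian, 2 = big-endian */
    s.big_endian = (hdr[5] == 2);

    /* e_machine is a 16-bit field at offset 18, stored in the file's byte order */
    if (s.big_endian)
        s.machine = (uint16_t)((hdr[18] << 8) | hdr[19]);
    else
        s.machine = (uint16_t)(hdr[18] | (hdr[19] << 8));

    return s;
}
```

For httpd, we would expect bits to be 32, big_endian to be 1, and machine to be 8 (EM_MIPS), matching the file output above.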

Since this is a MIPS binary and you're probably not following along with this post on a MIPS box, we will need to use QEMU to test out httpd or any other binaries we want to look at.

Additionally, to run these binaries in the proper environment, we will need to use chroot to "remount" the root directory to that of the extracted filesystem. The chroot command is simple:

$ chroot <directory> <cmd...>

Note: The chroot command above "remounts" directory as the root directory for the duration of cmd.

Because cmd only has access to files within directory, we will need to use a statically linked QEMU in the chroot environment (a normal QEMU build needs access to the host's shared libraries to run).

$ cp `which qemu-mips-static` root

Now that all the parts are in place, let's try actually running httpd:

$ chroot root /qemu-mips-static /usr/sbin/httpd
/usr/sbin/httpd: can't load library 'libssl.so.0.9.7'

Note: We use absolute paths for qemu-mips-static and httpd since our root directory has been changed to the root folder.

Uh oh. It looks like this target will need a little TLC before it will run happily.

Getting httpd running

Let's find the missing library and add it to the LD_LIBRARY_PATH environment variable.

Note: We need to remove root/var/tmp/shm_id first. It was created by the previous run and will change the program's behavior. You can verify this for yourself by running with -strace without removing the file.

$ find root -name libssl.so.0.9.7
root/lib/public/libssl.so.0.9.7
$ rm root/var/tmp/shm_id
$ chroot root /qemu-mips-static -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd
shm ID: 262152
Semaphore Create Failed.

Let's investigate with the -strace option on QEMU:

$ rm root/var/tmp/shm_id
$ chroot root /qemu-mips-static -strace -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd
...
271 open("/tmp/shm_id",O_WRONLY|O_CREAT|O_TRUNC,0666) = 3
271 ioctl(3,21517,2147481080,0,0,0) = -1 errno=89 (Function not implemented)
271 brk(0x0069b000) = 0x0069b000
271 write(1,0x7f52d2a8,15)shm ID: 229383
= 15
271 write(3,0x699070,6) = 6
271 close(3) = 0
271 ipc(21,229383,0,2147481280) = 0
271 ipc(2,123456,1,1974) = -1 errno=17 (File exists)
271 write(1,0x7f52d2a8,25)Semaphore Create Failed.
= 25
271 exit(1)

Based on the output from -strace, we can tell that the reason for the failure might be something to do with the IPC and IOCTL calls related to shared memory operations (shm = shared memory) – a poorly supported feature in QEMU.
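QEMU's -strace prints the multiplexed ipc syscall with raw numbers, which makes the trace hard to read. The sketch below decodes the first argument using the call numbers defined in Linux's linux/ipc.h, and splits a flags value such as 1974 into its IPC_CREAT, IPC_EXCL, and permission parts. The helper names are ours and exist only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Call numbers for the multiplexed ipc() syscall, as defined in linux/ipc.h. */
const char *ipc_call_name(int call)
{
    switch (call) {
    case 1:  return "SEMOP";
    case 2:  return "SEMGET";
    case 3:  return "SEMCTL";
    case 11: return "MSGSND";
    case 12: return "MSGRCV";
    case 13: return "MSGGET";
    case 14: return "MSGCTL";
    case 21: return "SHMAT";
    case 22: return "SHMDT";
    case 23: return "SHMGET";
    case 24: return "SHMCTL";
    default: return "UNKNOWN";
    }
}

/* Render a semget/shmget flags value as text, e.g. "IPC_CREAT|IPC_EXCL|0666". */
void ipc_flags_to_string(int flags, char *out, size_t outlen)
{
    snprintf(out, outlen, "%s%s%#o",
             (flags & 01000) ? "IPC_CREAT|" : "",  /* IPC_CREAT = 01000 */
             (flags & 02000) ? "IPC_EXCL|" : "",   /* IPC_EXCL  = 02000 */
             (unsigned)(flags & 0777));            /* permission bits   */
}
```

Read this way, the failing line ipc(2,123456,1,1974) is a SEMGET for key 123456 with IPC_CREAT|IPC_EXCL|0666 that returns errno 17 (File exists), which lines up with the "Semaphore Create Failed." message. The preceding ipc(21,...) call is a SHMAT.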

The only way we can find out for sure is in a disassembler. Let's open root/usr/sbin/httpd and its dependency root/lib/libnvram.so in Binary Ninja and have a look.

Using high level IL view (HLIL) in main at 0x4130d4, we can see that it does indeed look like we're failing after a failed call to sub_408f78 which in turn calls semget.

[Figure: Subroutine at 0x408f78]

[Figure: semget]

semget likely makes the failed ipc call (since these semaphores are used for interprocess communication) and we should therefore avoid or fix it. One may be tempted to immediately resort to binary patching but since this binary was actually dynamically linked, we can use LD_PRELOAD instead to influence the behavior of the program without modifying the binary at all.

Another thing we might notice from our disassembler is that this main function does a lot more than just serve http requests. It looks like it brings a lot of the router up as well!

[Figure: Router bring-up code]

This means that we might want to enter the program at a different place besides main. Thankfully, if we can find the HTTP parse code, LD_PRELOAD can also be used to directly harness an internal function.

Harnessing functions with LD_PRELOAD

First, let's focus on directly targeting the function that parses HTTP requests. Through reverse engineering, we can find that this function is located at 0x408f90. With more reverse engineering, we can find that this function's signature looks something like this:

void parse_http_req(char *http_req, void *unk, int32_t inet_addr, int32_t out_fd)

Harnessing an individual function with LD_PRELOAD is easy once you know the trick. We simply override the libc main (in our case __uClibc_main) by defining a function of the same name in our LD_PRELOAD harness and calling the function of interest inside it.

// hook.c
#include <fcntl.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

// the real signature is longer but turns out it doesn't matter
void __uClibc_main(void *main, int argc, char** argv)
{
    char req[4096 + 1];
    int32_t in_addr;

    if (argc != 2) {
        printf("Usage: %s <fuzz-file>\n", argv[0]);
        exit(1);
    }

    char *fuzzfile = argv[1];
    int fd = open(fuzzfile, O_RDONLY);
    if (fd < 0) exit(1);

    // first 4 bytes of the fuzz file: the connecting address
    int n_read = read(fd, &in_addr, sizeof(in_addr));
    if (n_read != sizeof(in_addr)) exit(1);

    // rest of the fuzz file: the raw HTTP request
    n_read = read(fd, req, sizeof(req) - 1);
    if (n_read < 0) exit(1);
    req[n_read] = 0;
    close(fd);

    fprintf(stderr, "Request: %s\n\n", req);

    // declare a function pointer pointing to the real parse_http_req
    void (*parse_http_req)(char *, void *, int32_t, int) = (void *)0x408f90;

    // Skip a lot of device init and get right to the server setup and http handler
    parse_http_req(req, NULL, in_addr, STDERR_FILENO);

    // need to exit here b/c this function is expected to not return
    exit(0);
}

As you can see, this harness looks like an ordinary file-based harness in almost every way. There are a few differences:

Instead of using main, start at the libc main (the function that calls the real main).
Because of this we need to exit at the end instead of returning (libc main calls exit with what the real main returns).
Instead of being able to call the function directly, we need to make a function pointer to it and call that.

A couple of notes about this harness:

We are fuzzing both the request and the connecting address. This will check if there are any special cases or mishandled addresses.
We print the fuzzed request to stderr and pass stderr to parse_http_req as the output FD. This will allow us to view results visually on the commandline when testing and in Mayhem.
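Since the harness reads a 4-byte address followed by the raw request, seed files for the fuzzer need the same layout. A minimal sketch of a seed writer follows. The name write_seed and the example values are hypothetical, and the byte order that parse_http_req expects for the address is an assumption we have not verified.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Write one seed in the layout hook.c expects:
 *   bytes 0..3  connecting address (written here as a.b.c.d, an assumption)
 *   bytes 4..   raw HTTP request (the harness reads at most 4096 bytes of it)
 * Returns 0 on success, -1 on a short write.
 */
int write_seed(FILE *out, const uint8_t addr[4], const char *request)
{
    size_t req_len = strlen(request);

    if (fwrite(addr, 1, 4, out) != 4)
        return -1;
    if (fwrite(request, 1, req_len, out) != req_len)
        return -1;
    return 0;
}
```

For example, calling write_seed with the address bytes {192, 168, 0, 1} and the request "GET / HTTP/1.0\r\n\r\n" produces a 22-byte file that the harness splits into an address and a well-formed request.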

Since LD_PRELOAD works by overriding shared library loads with a provided shared object, we need to compile hook.c to a shared object as well:

$ PATH=<path to cross-compiler-mips>/bin mips-gcc hook.c -o hook.so -shared -fPIC

Now that we have our LD_PRELOAD harness compiled, let's run it and see what happens! But first we need to make hook.so available in the chroot environment by moving it into the root folder. Additionally, since our harness takes in a file now, let's create a test file. Now we can run httpd as before except we add the LD_PRELOAD environment variable and the test.txt argument.

$ mv hook.so root
$ echo AAAABBBBCCCC > root/test.txt
$ chroot root /qemu-mips-static -E LD_PRELOAD=/hook.so -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd
Usage: usr/sbin/httpd <fuzz-file>
$ chroot root /qemu-mips-static -E LD_PRELOAD=/hook.so -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd test.txt
Request: BBBBCCCC
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

Darn. Still crashing. Let's see why using -strace again:

$ chroot root /qemu-mips-static -strace -E LD_PRELOAD=/hook.so -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd test.txt &
...
97 ipc(23,1074866065,131072,438) = -1 errno=2 (No such file or directory)
97 ipc(21,-1,0,2147275888) = -1 errno=22 (Invalid argument)
97 ipc(2,1074866065,1,1974) = -1 errno=17 (File exists)
97 ipc(2,1074866065,1,512) = 0
97 rt_sigprocmask(SIG_BLOCK,0x7ffcd400,NULL) = 0
97 ipc(1,0,1,0) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=1, si_addr=0x0000001f} ---
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

Looks like it's still crashing in the IPC code. But where? The majority of calls out of parse_http_req appear to be related to acosNvramConfig_. Let's try using GDB to break on one and check if the crash is inside. We can use QEMU with gdb-multiarch using the -g <port> argument to QEMU.

$ chroot root /qemu-mips-static -g 1234 -E LD_PRELOAD=/hook.so -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd test.txt
$ gdb-multiarch -q root/usr/sbin/httpd
(gdb) target remote :1234
Remote debugging using :1234
...
(gdb) break acosNvramConfig_match
Breakpoint 1 at 0x4ade90
(gdb) continue
...
Breakpoint 1, 0x004ade90 in acosNvramConfig_match ()
(gdb) finish
...
Program received signal SIGSEGV, Segmentation fault.
0x7f6f5f90 in ?? ()

Since the crash happened somewhere in acosNvramConfig_match, overriding this function should fix the crash. Let's give it a shot.

Overriding functions with LD_PRELOAD

With LD_PRELOAD we can override any dynamically linked function. In this case, we found that our program crashes in acosNvramConfig_match. So if we override this function, either by stubbing it out or by reimplementing it ourselves, we should be able to avoid the crash altogether. Let's add an override for the match function to our harness source, hook.c:

int acosNvramConfig_match(char *key, char *value) {
    printf("acosNvramConfig_match(%s, %s)\n", key, value);
    return 0;
}

Let's try running the new harness version:

$ PATH=<path to cross-compiler-mips>/bin mips-gcc hook.c -o hook.so -shared -fPIC
$ mv hook.so root
$ chroot root /qemu-mips-static -E LD_PRELOAD=/hook.so -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd test.txt
...
acosNvramConfig_match(passwordrecovered_debug2, 1)
acosNvramConfig_match(passwordrecovered_debug, 1)
acosNvramConfig_match(http_rmenable, 1)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

Unfortunately it looks like we’re still segfaulting. However, if we look at the strace output, it looks like we have the same call pattern as before. Let's add overrides for the rest of the acosNvramConfig_* family (left as an exercise to the reader).
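As a starting point for that exercise, a stub for acosNvramConfig_get might look like the sketch below. The prototype is our guess from how the function is called, not something taken from a header. The remaining members of the family have to be enumerated from httpd's import table, and their prototypes recovered the same way.

```c
#include <stdio.h>

/*
 * Stub for libnvram's acosNvramConfig_get. The prototype is an assumption
 * recovered from call sites. Logging each key shows which settings the
 * parser asks for.
 */
char *acosNvramConfig_get(char *key)
{
    printf("acosNvramConfig_get(%s)\n", key);
    return "";  /* default: every key looks unset */
}
```

Returning the empty string rather than NULL is a deliberate choice here: callers that pass the result straight to string functions will not crash on it.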

$ chroot root /qemu-mips-static -E LD_PRELOAD=/hook.so -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd test.txt
Request: BBBBCCCC
...
acosNvramConfig_get(local_ip9)
acosNvramConfig_get(local_ip10)
acosNvramConfig_match(rm_access, ip_single)
acosNvramConfig_match(rm_access, ip_range)
acosNvramConfig_match(rm_access, ip_list)
acosNvramConfig_match(rm_access, all)

Great! Now we aren’t crashing when we run httpd with our harness. However, even though we passed stderr as our output file descriptor, we see no HTTP response here. This suggests that parse_http_req relies on the values returned by acosNvramConfig_*, and that we need to increase the fidelity of our overrides. Right now we are just returning default values (0 or the empty string) for these functions. Let's modify our harness to get these values out of /etc/nvram in our firmware image (again left as an exercise to the reader).
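One way to approach this is a small lookup helper that the overrides can call. The sketch below assumes the defaults are stored as plain key=value lines, so check the actual format of /etc/nvram in your image first. The name nvram_lookup is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up `key` in a stream of "key=value" lines (assumed format).
 * Copies the value into `out` and returns 1 if found; otherwise returns 0
 * and leaves `out` as the empty string. Lines longer than the local buffer
 * are not handled specially in this sketch.
 */
int nvram_lookup(FILE *fp, const char *key, char *out, size_t outlen)
{
    char line[1024];
    size_t key_len = strlen(key);

    if (outlen == 0)
        return 0;
    out[0] = '\0';
    rewind(fp);

    while (fgets(line, sizeof(line), fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* strip the line ending */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == '=') {
            snprintf(out, outlen, "%s", line + key_len + 1);
            return 1;
        }
    }
    return 0;
}
```

An override such as acosNvramConfig_match could then compare the looked-up value against its second argument. Whether the real function returns 1 on a match is something to confirm in the disassembler before relying on it.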

Finally, running our httpd with our harness produces output that we want:

$ chroot root /qemu-mips-static -E LD_PRELOAD=/hook.so -E LD_LIBRARY_PATH=/lib/public/ /usr/sbin/httpd test.txt
...
HTTP/1.0 401 Unauthorized
WWW-Authenticate: Basic realm="NETGEAR DGN2200v4"
x-frame-options: SAMEORIGIN
Set-Cookie: XSRF_TOKEN=1222440606; Path=/
Content-type: text/html

<html>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
<title></title></head>
<body><h1></h1>
<p></p></body>
</html>

It's time to get it into Mayhem!

Getting It Into Mayhem

Surprisingly, once we have the firmware running locally, getting it into Mayhem is easy!

Filesystem as a package

If you recall from the Mayhem tutorials, Mayhem expects a package to look like:

my-package/
    Mayhemfile
    root/
        usr/
        bin/
        etc/
        ...
    [corpus/]
    …

Luckily, our unpacked firmware already looks like the root folder (indeed the root folder in a Mayhem package is a mini-filesystem). So, to make our firmware + harness into a Mayhem package, we just need to add a Mayhemfile. Of course, we will need to specify the environment variables so that Mayhem knows to LD_PRELOAD the harness and where to look for the libraries.

# Mayhemfile
version: '1.8'
project: netgear-n300
target: parse_http_request

cmds:
  - cmd: /usr/sbin/httpd @@
    env:
      LD_LIBRARY_PATH: /lib/public
      LD_PRELOAD: /hook.so

Now we can mayhem run from the directory containing our firmware root directory and Mayhemfile:

$ ls
Mayhemfile hook.c root
$ mayhem run .
…


Conclusion

In this blog, we covered a general approach for analyzing firmware images, handling their eccentricities, and enabling them to be fuzzed. While we did not post results here, this is a widely applicable methodology for tackling these types of targets that were traditionally deemed ‘hard’. Given the incredible number of new embedded devices being produced regularly, this opens up a wide range of targets to start analyzing and looking for bugs in places people have not started to look yet. Also if you want more on embedded security, check out this project. And for a nice list of LD_PRELOAD tricks, this project. We wish you well on your bug hunting adventures.

Happy Fuzzing.
