*36c3 preroll music*

Herald: So, the next talk for this afternoon is about high-speed binary fuzzing. We have two researchers that will be presenting the product of their latest work, which is a framework for static binary rewriting. Our speakers are—the first one is a computer science master's student at EPFL and the second one is a security researcher and assistant professor at EPFL. Please give a big round of applause to Nspace and gannimo.

*Applause*

gannimo (Mathias Payer): Thanks for the introduction. It's a pleasure to be here, as always. We're going to talk about different ways to speed up your fuzzing and to find different kinds of vulnerabilities or to tweak your binaries in somewhat unintended ways. I'm Mathias Payer, or I go by gannimo on Twitter, and I am an assistant professor at EPFL working on different forms of software security: fuzzing, sanitization, but also different kinds of mitigations. And Matteo over there is working on his master's thesis on different forms of binary rewriting for the kernel. And today we're going to take you on a journey on how to actually develop very fast and very efficient binary rewriting mechanisms that allow you to do unintended modifications to the binaries and allow you to explore different kinds of unintended features in binaries.

So, about this talk. What we discovered, or the reason why we set out on this journey, was that fuzzing binaries is really, really hard. There are very few tools in user space. It's extremely hard to set it up, and it's extremely hard to set it up in a performant way. The setup is complex. You have to compile different tools. You have to modify them. And the results are not really that satisfactory. As soon as you move to the kernel, fuzzing binaries in a kernel is even harder. There's no tooling whatsoever, there are very few users actually working with binary code in the kernel or modifying binary code, and it's just a nightmare to work with. So what we are presenting today is a new approach that allows you to instrument any form of modern binary code based on static rewriting, which gives you full native performance. You only pay for the instrumentation that you add, and you can do very heavyweight transformations on top of it.

The picture, if you look at a modern system, is this: let's say we are looking at a modern setup.
Let's say you're looking at cat pictures in your browser: Chrome plus the kernel plus the libc plus the graphical user interface together clock in at about 100 million lines of code. Instrumenting all of this for some form of security analysis is a nightmare, especially along this large stack of software. There are quite a few different compilers involved. There are different linkers. It may be compiled on a different system, with different settings, and so on. Getting your instrumentation across all of this is pretty much impossible and extremely hard to work with. And we want to enable you to select those different parts that you're actually interested in, modify those, and then focus your fuzzing or analysis approaches on those small subsets of the code, giving you a much better and stronger capability to test the systems, or those parts of the system, that you're really, really interested in.

Who's worked on fuzzing before? Quick show of hands. Wow, that's a bunch of you. Do you use AFL? Yeah, most of you, AFL. Libfuzzer? Cool, about 10 to 15 percent libfuzzer, 30 percent fuzzing and AFL. There's quite good knowledge of fuzzing, so I'm not going to spend too much time on fuzzing, but for those that haven't really run their fuzzing campaigns yet, it's a very simple software testing technique. You're effectively taking a binary, let's say Chrome, as a target and you're running this in some form of execution environment. Fuzzing then consists of some form of input generation that creates new test cases, throws them at your program, and checks what is happening with your program. Either everything is OK, your code is being executed, the program terminates, everything is fine, or you have a bug report. If you have a bug report, you can use this: find the vulnerability, maybe develop a PoC, and then come up with some form of either exploit or patch or anything else. Right. So this is pretty much fuzzing in a nutshell.

How do you get fuzzing to be effective? How can you cover large source bases, complex code, and complex environments? Well, there are a couple of simple steps that you can take, so let's walk quickly through effective fuzzing 101. First, you want to be able to create test cases that actually trigger bugs. And this is a very, very complicated part. We need to have some notion of the inputs that a program accepts.
And we need to have some notion of how we can explore different parts of the program, right? Different parts of functionality. On one hand, we could have a developer write all the test cases by hand, but this would be kind of boring. It would also require a lot of human effort in creating these different inputs and so on. So coverage-guided fuzzing has evolved as a very simple way to guide the fuzzing process, leveraging the information on which parts of the code have been executed by simply tracing the individual path through the program based on the execution flow. The fuzzer can use this feedback to then modify the inputs that are being thrown at the fuzzing process.

The second step is that the fuzzer must be able to detect bugs. If you've ever looked at a memory corruption: if you're just writing one byte after the end of a buffer, it's highly likely that your software is not going to crash. But it's still a bug, and it may still be exploitable based on the underlying conditions. So we want to be able to detect violations as soon as they happen, for example based on some form of sanitization, some form of instrumentation that we add to the binary, that then tells us, hey, there's a violation of the memory safety property, and we terminate the application right away as feedback to the fuzzer.

Third, and last but not least: speed is key, right? If you're running a fuzzing campaign, you have a fixed resource budget. You have a couple of cores, and you want to run for 24 hours, 48 hours, a couple of days. But in any case, whatever your constraints are, you have a fixed amount of instructions that you can actually execute. And you have to decide: am I spending my instructions on generating new inputs, tracking constraints, finding bugs, running sanitization, or executing the program? You need to find a balance between all of them, as it is a zero-sum game. You have a fixed amount of resources and you're trying to make the best of these resources. So any overhead is slowing you down. And again, this becomes an optimization problem: how can you most effectively use the resources that you have available?

As long as we are fuzzing with source code, it's quite easy to actually leverage existing mechanisms, and we add all that instrumentation at compile time.
We take source code, we pipe it through the compiler, and modern compiler platforms allow you to instrument and add little code snippets during the compilation process that then carry out all these tasks that are useful for fuzzing. For example, modern compilers can add short snippets of code for coverage tracking that record which parts of the code you have executed, or for sanitization, which record and check for every single memory access whether it is safe or not. And then when you're running the instrumented binary, everything is fine and you can detect the policy violations as you go along.

Now, if you had source code for everything, this would be amazing. But it's often not the case, right? We may be able, on Linux, to cover a large part of the protocol stack by focusing only on source-code-based approaches. But there may be applications where no source code is available. If we move to Android or other mobile systems, there are many drivers that are not available as open source or are just available as binary blobs, or the full software stack may be closed-source and we only get the binaries. And we still want to find vulnerabilities in these complex software stacks that span hundreds of millions of lines of code in a very efficient way. The only solution to cover this part of the massive code base is to actually rewrite and focus on binaries.

A very simple approach could be black-box fuzzing, but this doesn't really get you anywhere, because you don't get any feedback; you don't get any information on whether you're triggering bugs. So one simple approach, and this is the approach that is most dominantly used today, is to rewrite the program or the binary dynamically. You're taking the binary and, during execution, you use some form of dynamic binary instrumentation based on Pin, angr, or some other binary rewriting tool and translate the target at runtime, adding this binary instrumentation on top of it as you're executing it. It's simple, it's straightforward, but it comes at a terrible performance cost of a 10x to 100x slowdown, which is not really effective. You're spending all your cores and your cycles on just executing the binary instrumentation. So we don't really want to do this, and we want to have something that's more effective than that. So what we are focusing on is to do static rewriting.
Static rewriting involves a much more complex analysis, as we are rewriting the binary before it is being executed and we have to recover all of the control flow and all of the different mechanisms, but it results in much better performance, and we get more bang for our buck.

So why is static rewriting so challenging? Well, first, simply adding code will break the target. Take this piece of code here, which is a simple loop that loads data, decrements a register, and then jumps if you're not at the end of the array, iterating through the array. Now, if you look at the jump-not-zero instruction, the last instruction of the snippet, it uses a relative offset: it jumps backward seven bytes. Which is nice if you just execute the code as is. But as soon as you want to insert new code, you change the offsets in the program, and you're modifying all these different offsets. Simply adding new code somewhere in between will break the target. So a core property that we need to enforce is that we must find all the references and properly adjust them, both relative offsets and absolute offsets. Getting a single one wrong will break everything.

What makes this problem really, really hard is that if you're looking at the binary, a byte is a byte, right? There's no way for us to distinguish between scalars and references; in fact, they are indistinguishable. Getting a single reference wrong breaks the target and would introduce arbitrary crashes. So we have to come up with ways that allow us to distinguish between the two. For example, if you have this code here, it takes a value and stores it somewhere on the stack. This could come from two different kinds of high-level constructs. On one hand, it could be taking the address of a function and storing this function address in a stack variable. Or it could be just storing a scalar in a stack variable. And these two are indistinguishable, and as soon as we add new code, the offsets will change. If it is a function, we would have to modify the value; if it is a scalar, we have to keep the same value. So how can we come up with a way that allows us to distinguish between the two and rewrite binaries by recovering this missing information?

So let us take you on a journey towards instrumenting binaries in the kernel. This is what we aim for.
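As a small illustration of the ambiguity just described, consider the following C sketch (the function name, the helper sink(), and the constant are made up for illustration). In a binary built without position independence, both assignments can end up as a store of the same 8-byte immediate to the stack, and nothing in those bytes tells a rewriter which one is a reference that must be patched when code moves and which one is a number that must stay untouched.

    extern void process(void);                       /* hypothetical function      */
    extern void sink(void (*)(void), unsigned long); /* hypothetical, keeps stores */

    void ambiguous(void)
    {
        void (*handler)(void) = &process;   /* stores the address of a function   */
        unsigned long cookie = 0x4005d0;    /* stores a scalar that may happen to
                                               have the same bit pattern          */
        sink(handler, cookie);
    }

Only the first store has to be adjusted when the function moves; the second must be left alone.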
We'll start with the simple case of instrumenting binaries in userland, talk about different kinds of coverage-guided fuzzing and what kind of instrumentation and sanitization we can add, and then focus on taking it all together and applying it to kernel binaries to see what will fall out of it.

Let's start with instrumenting binaries first. I will now talk a little bit about RetroWrite, our mechanism and our tool that enables static binary instrumentation by symbolizing existing binaries. We recover the information and we translate relative offsets and absolute offsets into actual labels that are added to the assembly file. The instrumentation can then work on the recovered assembly file, which can then be reassembled into a binary that can be executed for fuzzing. We implement coverage tracking and a binary Address Sanitizer on top of this, leveraging this abstraction as we go forward.

The key to enabling this kind of binary rewriting is position-independent code. Position-independent code has become the de-facto standard for any code that is being executed on a modern system. It effectively means that the code can be loaded at any arbitrary address in your address space as you are executing binaries. It is essential and a requirement if you want to have address space layout randomization or if you want to use shared libraries, which de facto you want to use on all these different systems. So for a couple of years now, all the code that you're executing on your phones, on your desktops, on your laptops has been position-independent code. The idea behind position-independent code is that you can load it anywhere in your address space; you can therefore not use any hard-coded static addresses, and you have to inform the system through relocations, or use PC-relative addresses, so that the system can relocate these different mechanisms. On x86_64, position-independent code leverages addressing that is relative to the instruction pointer. For example, it uses the current instruction pointer and then a relative offset to that instruction pointer to reference global variables, other functions, and so on. And this is a very easy way for us to distinguish references from constants, especially in PIE binaries: if it is RIP-relative, it is a reference; everything else is a constant.
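A minimal sketch of what this looks like from the C side (the symbol names and the assembly in the comments are illustrative; exact compiler output varies). In a position-independent build, anything that is really a reference is emitted through RIP-relative addressing, while plain numbers stay immediates, which is exactly the property the symbolizer relies on.

    static long counter;                 /* some global the code references         */

    long *counter_ref(void)              /* compiled roughly as:                    */
    {                                    /*   lea counter(%rip), %rax               */
        return &counter;                 /* RIP-relative, so clearly a reference    */
    }

    long magic_value(void)               /* compiled roughly as:                    */
    {                                    /*   mov $0x4005d0, %eax                   */
        return 0x4005d0;                 /* plain immediate, so clearly a constant  */
    }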
We can build our translation algorithm and our translation mechanism on this fundamental finding, removing any form of heuristic by focusing especially on position-independent code. So we support position-independent code; we don't support non-position-independent code, but we give you the guarantee that we can rewrite all the different code that is out there. Symbolization works as follows: if you have the little bit of code on the lower right, symbolization first replaces all the references with assembler labels. Look at the call instruction and the jump-not-zero instruction: the call instruction references an absolute address, and the jump-not-zero instruction jumps backward 15 bytes relative. By focusing on these relative jumps and calls, we can replace them with actual labels and rewrite the binary as follows: for the call, we replace the target with the actual label, and for the jump-not-zero we insert an actual label in the assembly code and add a backward reference. For PC-relative addresses, for example the data load, we can then replace the address with the name of the actual data that we have recovered, and we can add all the different relocations and use those as auxiliary information on top of it. After these three steps, we can insert any new code in between, and can therefore add different forms of instrumentation or run some higher-level analysis on top of it, and then reassemble the file for fuzzing, coverage-guided tracking, address sanitization, or whatever else you want to do. I will now hand over to Matteo, who will cover coverage-guided fuzzing and sanitization and then instrumenting binaries in the kernel. Go ahead.

Nspace (Matteo Rizzo): So, now that we have this really nice framework to rewrite binaries, one of the things that we want to add to actually get to fuzzing is this coverage-tracking instrumentation. Coverage-guided fuzzing is a method to let the fuzzer discover interesting inputs and interesting paths through the target by itself. The basic idea is that the fuzzer will track coverage, that is, the parts of the program that are covered by different inputs, by inserting some kind of instrumentation. So, for example, here we have this target program that checks if the input contains the string "PNG" at the beginning, and if it does, then it does something interesting; otherwise it just bails out and fails.
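A minimal C sketch of the kind of target described here (parse() and do_interesting_things() are made-up names). Note that the byte comparisons typically compile to separate branches, which is what gives a coverage-guided fuzzer something to make progress on one byte at a time.

    #include <stddef.h>

    int do_interesting_things(const unsigned char *data, size_t len);  /* hypothetical */

    int parse(const unsigned char *data, size_t len)
    {
        if (len >= 3 && data[0] == 'P' && data[1] == 'N' && data[2] == 'G') {
            return do_interesting_things(data, len);   /* the interesting path */
        }
        return -1;                                     /* bail out and fail    */
    }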
So if we track the parts of the program that each input executes, the fuzzer can figure out that an input that contains "P" will have discovered a different path through the program than an input that doesn't contain it. And so on: it can, one byte at a time, discover that this program expects the magic sequence "PNG" at the start of the input. The way the fuzzer does this is that every time a new input discovers a new path through the target, it is considered interesting and added to a corpus of interesting inputs. And every time the fuzzer needs to generate a new input, it will select something from the corpus, mutate it randomly, and then use it as the new input. This is conceptually pretty simple, but in practice it works really well and it really lets the fuzzer discover the format that the target expects in an unsupervised way.

As an example, this is an experiment that was run by the author of AFL—AFL is the fuzzer that sort of popularized this technique—where he was fuzzing a JPEG-parsing library, starting from a corpus that only contained the string "hello". Now, clearly "hello" is not a valid JPEG image, but the fuzzer was still able to discover the correct format. After a while it started generating some grayscale images, on the top left, and as it generated more and more inputs, it started generating more interesting images, such as some grayscale gradients, and later on even some color images. So as you can see, this really works, and it allows us to fuzz a program without really teaching the fuzzer what the input should look like. That's it for coverage-guided fuzzing.

Now we'll talk a bit about sanitization. As a reminder, the core idea behind sanitization is that just looking for crashes is likely to miss some of the bugs. For example, if you have an out-of-bounds one-byte read, that will probably not crash the target, but you would still like to catch it because it could be used for an info leak, for example. One of the most popular sanitizers is Address Sanitizer. Address Sanitizer will instrument all the memory accesses in your program and check for memory corruption. Memory corruption is a pretty dangerous class of bugs that unfortunately still plagues C and C++ programs and unsafe languages in general. And ASan tries to catch it by instrumenting the target.
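A small made-up example of the kind of bug this is about: the one-byte out-of-bounds read below will usually not crash when run natively, but with ASan the access lands in the red zone placed right after the allocation and is reported immediately.

    #include <stdlib.h>
    #include <string.h>

    char last_byte(const char *src)        /* src is assumed to hold 16 bytes */
    {
        char *buf = malloc(16);
        memcpy(buf, src, 16);
        char c = buf[16];                  /* one byte past the end: rarely crashes
                                              natively, but hits ASan's red zone  */
        free(buf);
        return c;
    }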
ASan is very popular; it has been used to find thousands of bugs in complex software like Chrome and Linux, and even though it has a bit of a slowdown—like about 2x—it is still really popular because it lets you find many, many more bugs. So how does it work? The basic idea is that ASan will insert special regions of memory called 'red zones' around every object in memory. We have a small example here where we declare a 4-byte array on the stack. ASan will allocate the array "buf" and then add a red zone before it and a red zone after it. Whenever the program accesses the red zones, it is terminated with a security violation: the instrumentation just prints a bug report and then crashes the target. This is very useful for detecting, for example, buffer overflows or underflows and many other kinds of bugs such as use-after-free and so on. As an example here, we are trying to copy 5 bytes into a 4-byte buffer, and ASan will check each of the accesses one by one. When it sees that the last byte writes to a red zone, it detects the violation and crashes the program. This is good for us, because this bug might not have been found by simply looking for crashes, but it's definitely found if we use ASan. So this is something we want for fuzzing.

Now that we've briefly covered ASan, we can talk about instrumenting binaries in the kernel. Mathias left us with RetroWrite, and with RetroWrite we can add both coverage tracking and ASan to binaries. It's a really simple idea: now that we can rewrite this binary and add instructions wherever we want, we can implement both coverage tracking and ASan. In order to implement coverage tracking, we simply have to identify the start of every basic block and add a little piece of instrumentation at the start of the basic block that tells the fuzzer 'hey, we've reached this part of the program', 'hey, we've reached this other part of the program'. Then the fuzzer can figure out whether that's a new part or not. ASan can also be implemented in this way, by finding all memory accesses and then linking with libASan. libASan is a sort of runtime for ASan that takes care of inserting the red zones, keeping around all the metadata that ASan needs to know where the red zones are, and detecting whether a memory access is invalid. So, how can we apply all of this in the kernel?
Well, first of all, fuzzing the kernel is not as easy as fuzzing some userspace program. There are some issues here. First of all, there's crash handling. Whenever you're fuzzing a userspace program, you expect crashes, well, because that's what we're after. And if a userspace program crashes, the OS simply handles the crash gracefully, so the fuzzer can detect this, save the input as a crashing input, and so on. This is all fine. But when you're fuzzing the kernel—if you were fuzzing the kernel of the machine that you were using for fuzzing, after a while the machine would just go down. Because, after all, the kernel runs the machine, and if it starts misbehaving, then all of it can go wrong. And more importantly, you can lose your crashes, because if the machine crashes, then the state of the fuzzer is lost and you have no idea what your crashing input was. So what most kernel fuzzers do is resort to some kind of VM to keep the system stable: they fuzz the kernel in a VM and then run the fuzzing agent outside the VM.

On top of that, there's tooling. If you want to fuzz a userspace program, you can just download AFL or use libfuzzer; there are plenty of tutorials online, it's really easy to set up: you compile your program, you start fuzzing, and you're good to go. If you want to fuzz the kernel, it's already much more complicated. For example, if you want to fuzz Linux with, say, syzkaller, which is a popular kernel fuzzer, you have to compile the kernel, you have to use a special config that supports syzkaller, you have way fewer guides available than for userspace fuzzing, and in general it's just much more complex and less intuitive than fuzzing userspace.

And lastly, we have the issue of determinism. In general, if you have a single-threaded userspace program, unless it uses some kind of random number generator, it is more or less deterministic. There's nothing external that affects the execution of the program. This is really nice if you want to try to reproduce a test case, because if you have a non-deterministic test case, then it's really hard to know whether this is really a crash or if it's just something that you should ignore. In the kernel this is even harder, because you don't only have concurrency, like multi-processing, you also have interrupts.
So interrupts can happen at any time, and if one time you got an interrupt while executing your test case and the second time you didn't, then maybe it only crashes one time; you don't really know, it's not pretty.

So again, we have several approaches to fuzzing binaries in the kernel. The first one is to do black-box fuzzing. We don't really like this because it doesn't find much, especially in something complex like a kernel. Approach 1 is to use dynamic translation, so, use something like QEMU or—you name it. This works, and people have used it successfully; the problem is that it is really, really, really slow. We're talking about 10x-plus overhead. And as we said before, the more iterations, the more test cases you can execute in the same amount of time, the better, because you find more bugs. On top of that, there's currently no available sanitizer for kernel binaries that is based on this approach. In userspace you have something like valgrind; in the kernel, you don't have anything, at least that we know of. Approach 2 is to use Intel Processor Trace. There have been some research papers on this recently, and it is nice because it allows you to collect coverage at nearly zero overhead. It's really fast, but the problem is that it requires hardware support, so it requires a fairly new x86 CPU, and if you want to fuzz something on ARM, say your Android driver, or if you want to use an older CPU, then you're out of luck. And what's worse, you cannot really use it for sanitization, or at least not the kind of sanitization that ASan does, because it just traces the execution; it doesn't allow you to do checks on memory accesses. So Approach 3, which is what we will use, is static rewriting.

We had this very nice framework for rewriting userspace binaries, and we asked ourselves: can we make this work in the kernel? So we took the system, the original RetroWrite, we modified it, we implemented support for Linux modules, and... it works! We have used it to fuzz some kernel modules, and it really shows that this approach doesn't only work for userspace; it can also be applied to the kernel. As for the implementation, the nice thing about kernel modules is that they're always position-independent. You cannot have fixed-position kernel modules, because Linux just doesn't allow it.
So we sort of get that for free, which is nice. Linux modules are also a special class of ELF files, which means that even though the format is not the same as for userspace binaries, it's still somewhat similar, so we didn't have to change the symbolizer that much, which is also nice. We implemented symbolization with this, and we used it to implement both code coverage and binary ASan for kernel binary modules.

So, for coverage: the idea behind the whole RetroWrite project was that we wanted to integrate with existing tools, existing fuzzing tools. We didn't want to force our users to write their own fuzzer that is compatible with RetroWrite. In userspace we had AFL-style coverage tracking, and binary ASan which is compatible with source-based ASan, and we wanted to follow the same principle in the kernel. It turns out that Linux has this built-in coverage-tracking framework called kCov that is used by several popular kernel fuzzers like syzkaller, and we wanted to use it ourselves. So we designed our coverage instrumentation so that it integrates with kCov. The downside is that you need to compile the kernel with kCov, but then again, Linux is open source, so you can sort of always do that; it's usually not the kernel that is the binary blob, it's usually only the modules. So that's still fine. And the way you implement kCov for binary modules is that you just have to find the start of every basic block and add a call to some function that then stores the collected coverage. Here's an example: we have a short snippet of code with three basic blocks, and all we have to do is add a call to "trace_pc" at the start of each basic block. "trace_pc" is a function that is part of the main kernel image that then collects this coverage and makes it available to a userspace fuzzing agent. So this is all really easy and it works.

Let's now see how we implemented binary ASan. As I mentioned before, when we instrument a program with binary ASan in userspace, we link with libASan, which takes care of setting up the metadata, takes care of putting the red zones around our allocations, and so on. We had to do something similar in the kernel; of course, you cannot link with libASan in the kernel, because that doesn't work, but what we can do instead is, again, compile the kernel with kASan support.
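Going back to the coverage side for a moment, here is a rough C-level sketch of how the rewritten module behaves once the per-basic-block calls are in place. The actual insertion happens at the assembly level; the hook the talk calls "trace_pc" corresponds to the kCov hook exported by the main kernel image (commonly named __sanitizer_cov_trace_pc when built with kCov), and demo_handler() is a made-up example function.

    void __sanitizer_cov_trace_pc(void);   /* kCov hook provided by the kernel image */

    static int demo_handler(int x)
    {
        __sanitizer_cov_trace_pc();        /* basic block 1 */
        if (x > 0) {
            __sanitizer_cov_trace_pc();    /* basic block 2 */
            x = -x;
        }
        __sanitizer_cov_trace_pc();        /* basic block 3 */
        return x;
    }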
Compiling the kernel with kASan support instruments the allocator, kmalloc, to add the red zones; it allocates space for the metadata, it keeps this metadata around, it does all of this for us, which is really nice. And again, the big advantage of this approach is that we can integrate seamlessly with a kASan-instrumented kernel and with fuzzers that rely on kASan, such as syzkaller. So we see this as more of a plus than a limitation. And how do you implement ASan? Well, you have to find every memory access and instrument it to check whether it is accessing a red zone. If it does, then you just call this bug report function that produces a stack trace and a bug report and crashes the kernel, so that the fuzzer can detect it. Again, this is compatible with source-based kASan, so we're happy. We can simply load the rewritten module with the added instrumentation into a kernel, as long as you have compiled the kernel with the right flags, and we can use a standard kernel fuzzer. For our evaluation, we used syzkaller, a popular kernel fuzzer by some folks at Google, and it worked really well.

So we've finally reached the end of our journey, and now we want to present some experiments we did to see if this really works. For userspace, we wanted to compare the performance of our binary ASan with source-based ASan and with existing solutions that also work on binaries. In userspace, you can use valgrind memcheck: it's a memory sanitizer that is based on dynamic binary translation and works on binaries. We compared it with source ASan and RetroWrite ASan on the SPEC CPU benchmark and saw how fast it was. And for the kernel, we decided to fuzz some file systems and some drivers with syzkaller, using both source-based kASan and kCov and kRetroWrite-based kASan and kCov.

These are our results for userspace. The red bar is valgrind. We can see that the execution time of valgrind is the highest. It is really, really slow—like 3, 10, 30x overhead, way too slow for fuzzing. Then in green we have our binary ASan, which is already a large improvement. In orange we have source-based ASan. And finally in blue we have the original code without any instrumentation whatsoever. We can see that source-based ASan has about 2x or 3x overhead, and binary ASan is a bit higher, a bit less efficient, but still somewhat close.
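For reference, the per-access check described above (compute the address, consult ASan's metadata, and report if it falls into a red zone) boils down to something like the following C sketch. The real check is emitted as x86-64 assembly by RetroWrite and kRetroWrite; the shadow offset and the report function name below are illustrative placeholders rather than the actual values, and the fast path shown covers an aligned 8-byte access.

    #include <stdint.h>

    #define SHADOW_OFFSET 0x7fff8000UL              /* placeholder, not the real offset */

    extern void report_redzone_access(uintptr_t a); /* prints a report, then crashes    */

    static inline void check_access_8(uintptr_t addr)
    {
        /* one shadow byte describes 8 bytes of memory */
        int8_t shadow = *(int8_t *)((addr >> 3) + SHADOW_OFFSET);
        if (shadow != 0)                    /* 0 means all 8 bytes are addressable   */
            report_redzone_access(addr);    /* red zone (0xfc) or otherwise poisoned */
    }

This is also roughly the sequence visible in the instrumented disassembly shown in the demo below.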
So that's for userspace, and for the kernel these are some preliminary results; I'm doing this work as part of my master's thesis, so I'm still running the evaluation. Here we can see that the overhead is already a bit lower. The reason for this is that SPEC is a pure CPU benchmark; it doesn't interact with the system that much, and so any instrumentation that you add is going to considerably slow down the execution. By contrast, when you fuzz a file system with syzkaller, not only does every test case have to go from the host to the guest and then do multiple syscalls and so on, but also every system call has to go through several layers of abstraction before it gets to the actual file system. All of this takes a lot of time, and so in practice the overhead of our instrumentation seems to be pretty reasonable.

So, since we know that you like demos, we've prepared a small demo of kRetroWrite. So. Let's see. Yep. Okay. All right, so we've prepared a small kernel module. This module is really simple; it contains a vulnerability, and what it does is create a character device. If you're not familiar with this, a character device is like a fake file that is exposed by a kernel driver and that you can read from and write to. Instead of going to a file, the data that you, in this case, write to the fake file goes to the driver and is handled by this demo write function. As we can see, this function allocates a buffer, a 16-byte buffer on the heap, and then copies some data into it, and then it checks if the data contains the string "1337". If it does, then it accesses the buffer out of bounds; you can see "alloc[16]", and the buffer is sixteen bytes, so this is an out-of-bounds read by one byte. And if it doesn't, then it just accesses the buffer in bounds, which is fine and is not a vulnerability. So we can compile this driver. OK, um... OK, and then we have our module, and we will instrument it using kRetroWrite. So, instrument... Yes, please. OK. Right. So kRetroWrite did some processing, and it produced an instrumented module with ASan or kASan and a symbolized assembly file. We can actually have a look at the symbolized assembly file to see what it looks like. Yes. Yes. OK. So, is this big enough? Yeah... As you can see, we can actually see here the ASan instrumentation. Ah, shouldn't—yeah.
So, this is the ASan instrumentation. The original code loads some data from this address. And as you can see, the ASan instrumentation first computes the actual address and then does some checking—basically, this is checking some metadata that ASan stores to see if the address is in a red zone or not, and then, if the check fails, it calls this ASan report function, which produces a stack trace and crashes the kernel. So this is fine. We can actually even look at the disassembly of both modules, so... objdump and then demo... Ah, nope. OK, so on the left we have the original module without any instrumentation; on the right we have the module instrumented with ASan. As you can see, the original module has "push r13" and then has this memory load here; on the right, in the instrumented module, kRetroWrite inserted the ASan instrumentation. The original load is still down here, but between the first instruction and this instruction we now have the kASan instrumentation that does our check. So this is all fine.

Now we can actually test it and see what it does. We will boot a very simple, very minimal Linux system and try to trigger the vulnerability, first with the non-instrumented module and then with the instrumented module. We will see that with the non-instrumented module the kernel will not crash, but with the instrumented module it will crash and produce a bug report. So. Let's see. Yeah, this is a QEMU VM; I have no idea why it's taking so long to boot. I'll blame the demo gods not being kind to us. Yeah, I guess we just have to wait. OK. So. All right, so we loaded the module. We can see that it has created a fake file, a character device, in /dev/demo. Yep. We can write to this file. Yep. So this accesses the array in bounds, and so this is fine. Then what we can also do is write "1337" to it, so it will access the array out of bounds. This is the non-instrumented module, so this will not crash. It will just print some garbage value. Okay, that's it. Now we can load the instrumented module instead... and do the same experiment again. All right. We can see that /dev/demo is still here, so the module still works. Let's try to write "1234" into it. This, again, doesn't crash. But when we try to write "1337", this will produce a bug report.

*applause*

So this has quite a lot of information.
We 458 00:38:51,129 --> 00:38:55,700 can see, like, the—where the memory was allocated, there's a stack trace for that; 459 00:38:55,700 --> 00:39:02,150 it wasn't freed, so there's no stack trace for the free. And we see that the cache 460 00:39:02,150 --> 00:39:06,760 size of the memory, like, it was a 16-byte allocation. We can see the shape of the 461 00:39:06,760 --> 00:39:10,900 memory. We see that these two zeros means that there's two 8-byte chunks of valid 462 00:39:10,900 --> 00:39:15,550 memory. And then these "fc fc fc" is the—are the red zones that I was talking 463 00:39:15,550 --> 00:39:19,980 about before. All right, so that's it for the demo. We will switch back to our 464 00:39:19,980 --> 00:39:24,630 presentation now. So... hope you enjoyed it. 465 00:39:24,630 --> 00:39:30,530 gannimo: Cool. So after applying this to a demo module, we also wanted to see what 466 00:39:30,530 --> 00:39:35,365 happens if we apply this to a real file system. After a couple of hours we 467 00:39:35,365 --> 00:39:41,390 were—when we came back and checked on the results, we saw a couple of issues popping 468 00:39:41,390 --> 00:39:48,720 up, including a nice set of use-after-free reads, a set of use-after-free writes, and 469 00:39:48,720 --> 00:39:56,220 we checked the bug reports and we saw a whole bunch of Linux kernel issues popping 470 00:39:56,220 --> 00:40:02,640 up one after the other in this nondescript module that we fuzzed. We're in the 471 00:40:02,640 --> 00:40:06,930 process of reporting it. This will take some time until it is fixed; that's why 472 00:40:06,930 --> 00:40:13,470 you see the blurry lines. But as you see, there's still quite a bit of opportunity 473 00:40:13,470 --> 00:40:19,190 in the Linux kernel where you can apply different forms of targeted fuzzing into 474 00:40:19,190 --> 00:40:26,349 different modules, leverage these modules on top of a kASan instrumented kernel and 475 00:40:26,349 --> 00:40:31,720 then leveraging this as part of your fuzzing toolchain to find interesting 476 00:40:31,720 --> 00:40:39,080 kernel 0days that... yeah. You can then develop further, or report, or do whatever 477 00:40:39,080 --> 00:40:44,766 you want with them. Now, we've shown you how you can take existing binary-only 478 00:40:44,766 --> 00:40:51,250 modules, think different binary-only drivers, or even existing modules where 479 00:40:51,250 --> 00:40:55,800 you don't want to instrument a full set of the Linux kernel, but only focus fuzzing 480 00:40:55,800 --> 00:41:02,130 and exploration on a small different—small limited piece of code and then do security 481 00:41:02,130 --> 00:41:09,247 tests on those. We've shown you how we can do coverage-based tracking and address 482 00:41:09,247 --> 00:41:13,500 sanitization. But this is also up to you on what kind of other instrumentation you 483 00:41:13,500 --> 00:41:17,890 want. Like this is just a tool, a framework that allows you to do arbitrary 484 00:41:17,890 --> 00:41:23,780 forms of instrumentation. So we've taken you on a journey from instrumenting 485 00:41:23,780 --> 00:41:29,380 binaries over coverage-guided fuzzing and sanitization to instrumenting modules in 486 00:41:29,380 --> 00:41:36,692 the kernel and then finding crashes in the kernel. Let me wrap up the talk. So, this 487 00:41:36,692 --> 00:41:41,581 is one of the the fun pieces of work that we do in the hexhive lab at EPFL. 
So if you're looking for postdoc opportunities or if you're thinking about a PhD, come talk to us. We're always hiring. The tools will be released as open source; a large chunk of the userspace work is already open source. We're working on a set of additional demos and so on, so that you can get started faster, leveraging the different existing instrumentation that is already out there. The userspace work is already available; the kernel work will be available in a couple of weeks. This allows you to instrument real-world binaries for fuzzing, leveraging existing transformations for coverage tracking to enable fast and effective fuzzing, and memory checking to detect the actual bugs that exist there. The key takeaway from this talk is that RetroWrite and kRetroWrite enable static binary rewriting at zero cost: you only pay for the instrumentation you add. We take the limitation of focusing only on position-independent code, which is not a real limitation, but we get the advantage of being able to symbolize without actually relying on heuristics, so we can symbolize even large, complex applications, effectively rewrite them, and then you can focus fuzzing on these parts. Another point I want to mention is that this enables you to reuse existing tooling: you can take a binary blob, instrument it, and then reuse, for example, Address Sanitizer or existing fuzzing tools, as it integrates really, really nicely. As I said, all the code is open source. Check it out. Try it. Let us know if it breaks. We're happy to fix it. We are committed to open source. And let us know if there are any questions. Thank you.

*applause*

Herald: So, thanks, guys, for an interesting talk. We have some time for questions, so we have microphones along the aisles. We'll start with a question from microphone number two.

Q: Hi. Thanks for your talk and for the demo. I'm not sure about the use case you showed for the kernel RetroWrite, 'cause you're usually interested in fuzzing binaries in kernel space when you don't have source code for the kernel, for example for IoT or Android and so on. But you just reuse kCov and kASan in the kernel, and you never have the kernel in IoT or Android compiled with that. So do you have any plans to binary-instrument the kernel itself, not just the modules?

Nspace: So we thought about that.
I think that there are some additional problems that 519 00:44:39,390 --> 00:44:43,910 we would have to solve in order to be able to instrument the full kernel. So other 520 00:44:43,910 --> 00:44:47,819 than the fact that it gives us compatibility with, like, existing tools, 521 00:44:47,819 --> 00:44:51,720 the reason why we decided to go with compiling the kernel with kASan and kCov 522 00:44:51,720 --> 00:44:56,757 is that if you were to build that yourself, you would have to, like, think about it. You 523 00:44:56,757 --> 00:45:01,540 have to instrument the memory allocator to add red zones, which is, like, already 524 00:45:01,540 --> 00:45:07,069 somewhat complex. You have to instrument the exception handlers to catch, like, any 525 00:45:07,069 --> 00:45:12,240 faults that the instrumentation detects. You would have to, like, set up some 526 00:45:12,240 --> 00:45:17,480 memory for the ASan shadow. So this is, like—I think you should be able to do it, 527 00:45:17,480 --> 00:45:21,690 but it would require a lot of additional work. And this was, like, four 528 00:45:21,690 --> 00:45:25,510 months of thesis work. So we decided to start small and prove that it works in 529 00:45:25,510 --> 00:45:30,470 the kernel for modules, and then leave it to future work to actually extend it to 530 00:45:30,470 --> 00:45:37,558 the full kernel. Also, like, I think for Android—so in the case of Linux, the 531 00:45:37,558 --> 00:45:42,072 kernel is GPL, right, so if the manufacturer ships a custom kernel, they 532 00:45:42,072 --> 00:45:44,614 have to release the source code, right? Q: They never do. 533 00:45:44,614 --> 00:45:47,220 Nspace: They never—well, that's a different issue. Right? 534 00:45:47,220 --> 00:45:49,009 gannimo: Right. Q: So that's why I ask, because I don't 535 00:45:49,009 --> 00:45:51,839 see how it just can be used in the real world. 536 00:45:51,839 --> 00:45:57,122 gannimo: Well, let me try to put this into perspective a little bit as well. Right. 537 00:45:57,122 --> 00:46:02,030 So there's the—what we did so far is we leveraged existing tools, like kASan or 538 00:46:02,030 --> 00:46:09,440 kCov, and integrated into these existing tools. Now, handling heap-based allocations is 539 00:46:09,440 --> 00:46:13,572 fairly simple, and replacing those allocations with ones that add red zones—that instrumentation 540 00:46:13,572 --> 00:46:20,203 you can carry out fairly well by focusing on the different allocators. Second to 541 00:46:20,203 --> 00:46:24,972 that, simply oopsing the kernel and printing the stack trace is also fairly 542 00:46:24,972 --> 00:46:29,250 straightforward. So it's not a lot of additional effort. So it is—it involves 543 00:46:29,250 --> 00:46:38,471 some engineering effort to port this to non-kASan-compiled kernels. But we think 544 00:46:38,471 --> 00:46:44,740 it is very feasible. In the interest of time, we focused on kASan-enabled kernels, 545 00:46:44,740 --> 00:46:50,960 so that some form of ASan is already enabled. But yeah, this is additional 546 00:46:50,960 --> 00:46:55,660 engineering effort. But there is also a community out there that can help us with 547 00:46:55,660 --> 00:47:00,960 these kinds of changes. So kRetroWrite and RetroWrite themselves are the binary 548 00:47:00,960 --> 00:47:07,060 rewriting platform that allows you to turn a binary into an assembly file that you 549 00:47:07,060 --> 00:47:11,619 can then instrument and run different passes on top of it.
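As a rough illustration of that workflow, and not the actual RetroWrite API, here is a minimal Python sketch of such a pass: once the binary has been turned into a symbolized assembly file, a pass is just a transformation over that text. The basic-block label format and the name of the tracing stub are assumptions made for the example.

import re
import sys

# Assumed label convention for basic blocks in the symbolized assembly file.
BB_LABEL = re.compile(r"^\.LBB_[0-9a-fA-F]+:\s*$")

def coverage_pass(lines):
    """Insert a call to a (hypothetical) tracing stub after every basic-block label."""
    out = []
    for line in lines:
        out.append(line)
        if BB_LABEL.match(line):
            # A real pass would also spill/restore any registers the stub clobbers;
            # omitted here to keep the sketch short.
            out.append("\tcall __trace_bb")
    return out

if __name__ == "__main__":
    # Usage: python coverage_pass.py module_symbolized.s > module_instrumented.s
    with open(sys.argv[1]) as f:
        print("\n".join(coverage_pass(f.read().splitlines())))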
So another pass would 550 00:47:11,619 --> 00:47:16,399 be a full ASan pass or kASan pass that somebody could add and then contribute 551 00:47:16,399 --> 00:47:19,100 back to the community. Q: Yeah, it would be really useful. 552 00:47:19,100 --> 00:47:20,186 Thanks. gannimo: Cool. 553 00:47:20,186 --> 00:47:24,260 Angel: Next question from the Internet. Q: Yes, there is a question regarding the 554 00:47:24,260 --> 00:47:30,890 slide on the SPEC CPU benchmark. The second or third graph from the right had 555 00:47:30,890 --> 00:47:36,700 an instrumented version that was faster than the original program. Why is that? 556 00:47:36,700 --> 00:47:42,299 gannimo: Cache effect. Thank you. Angel: Microphone number one. 557 00:47:42,299 --> 00:47:47,032 Q: Thank you. Thank you for the presentation. I have a question: how many architectures do 558 00:47:47,032 --> 00:47:51,210 you support, and are there plans to support more? 559 00:47:51,210 --> 00:47:56,400 gannimo: x86_64. Q: Okay. So no plans for ARM or MIPS, 560 00:47:56,400 --> 00:47:58,130 or...? gannimo: Oh, there are plans. 561 00:47:58,130 --> 00:48:01,390 Q: Okay. Nspace: Right, so— 562 00:48:01,390 --> 00:48:05,980 gannimo: Right. Again, there's a finite amount of time. We focused on the 563 00:48:05,980 --> 00:48:11,778 technology. ARM is high up on the list. If somebody is interested in working on it 564 00:48:11,778 --> 00:48:17,670 and contributing, we're happy to hear from them. Our list of targets is ARM first and 565 00:48:17,670 --> 00:48:22,915 then maybe something else. But I think with x86_64 and ARM we've covered a 566 00:48:22,915 --> 00:48:33,420 majority of the interesting platforms. Q: And second question, did you try to 567 00:48:33,420 --> 00:48:37,970 fuzz any real closed-source program? Because as I understand from the presentation, 568 00:48:37,970 --> 00:48:44,710 you fuzz, like, just file systems that we can compile and fuzz with syzkaller, like 569 00:48:44,710 --> 00:48:48,570 in the past. Nspace: So for the evaluation, we wanted 570 00:48:48,570 --> 00:48:52,130 to be able to compare between the source-based instrumentation and the binary-based 571 00:48:52,130 --> 00:48:57,460 instrumentation, so we focused mostly on open-source filesystems and drivers, because 572 00:48:57,460 --> 00:49:02,058 then we could instrument them with a compiler. We haven't yet tried, but this 573 00:49:02,058 --> 00:49:05,740 is, like, also pretty high up on the list. We wanted to try to find some closed-source 574 00:49:05,740 --> 00:49:10,609 drivers—there's lots of them, like for GPUs or anything—and we'll give it a 575 00:49:10,609 --> 00:49:15,460 try and find some 0days, perhaps. Q: Yes, but with syzkaller, you still have 576 00:49:15,460 --> 00:49:22,582 a problem. You have to write rules, like, dictionaries. I mean, you have to 577 00:49:22,582 --> 00:49:24,599 understand the format, have to communicate with the driver. 578 00:49:24,599 --> 00:49:28,550 Nspace: Yeah, right. But there's, for example, closed-source file systems that 579 00:49:28,550 --> 00:49:33,270 we are looking at. Q: Okay. Thinking. 580 00:49:33,270 --> 00:49:38,657 Herald: Number two. Q: Hi. Thank you for your talk. So I don't 581 00:49:38,657 --> 00:49:45,070 know if there is any kCov- or kASan-equivalent solution for Windows, but I was 582 00:49:45,070 --> 00:49:49,933 wondering if you tried, or are you planning to do it on Windows, the 583 00:49:49,933 --> 00:49:52,540 framework?
Because I know it might be challenging because of driver 584 00:49:52,540 --> 00:49:56,849 signature enforcement and PatchGuard, but I wondered if you tried or thought about 585 00:49:56,849 --> 00:49:59,290 it. gannimo: Yes, we thought about it and we 586 00:49:59,290 --> 00:50:06,383 decided against it. Windows is incredibly hard and we are academics. The research I 587 00:50:06,383 --> 00:50:11,800 do in my lab, or we do in my research lab, focuses predominantly on open-source 588 00:50:11,800 --> 00:50:17,060 software and empowers open-source software. Doing full support for Microsoft 589 00:50:17,060 --> 00:50:20,780 Windows is somewhat out of scope. If somebody wants to port these tools, we are 590 00:50:20,780 --> 00:50:24,190 happy to hear about it and work with these people. But it's a lot of additional 591 00:50:24,190 --> 00:50:28,530 engineering effort versus very low additional research 592 00:50:28,530 --> 00:50:33,060 value, so we'll have to find some form of compromise. And, like, if you would be 593 00:50:33,060 --> 00:50:38,650 willing to fund us, we would go ahead. But it's—yeah, it's a cost question. 594 00:50:38,650 --> 00:50:42,089 Q: And you're referring both to kernel and user space, right? 595 00:50:42,089 --> 00:50:45,089 gannimo: Yeah. Q: Okay. Thank you. 596 00:50:45,089 --> 00:50:48,105 Herald: Number five. Q: Hi, thanks for the talk. This seems 597 00:50:48,105 --> 00:50:52,400 most interesting if you're looking for vulnerabilities in closed-source kernel 598 00:50:52,400 --> 00:50:58,359 modules, but without giving it too much thought, it seems it's really trivial to 599 00:50:58,359 --> 00:51:01,920 prevent this if you're writing a closed-source module. 600 00:51:01,920 --> 00:51:07,130 gannimo: Well, how would you prevent this? Q: Well, for starters, you would just take 601 00:51:07,130 --> 00:51:11,492 the difference between the addresses of two functions. That's not gonna be IP 602 00:51:11,492 --> 00:51:15,860 relative, so... Nspace: Right. So we explicitly—like, even 603 00:51:15,860 --> 00:51:21,589 in the original RetroWrite paper—we explicitly decided to not try to deal with 604 00:51:21,589 --> 00:51:25,777 obfuscated code, or code that is purposefully trying to defeat this kind of 605 00:51:25,777 --> 00:51:30,510 rewriting. Because, like, the assumption is that, first of all, there are techniques 606 00:51:30,510 --> 00:51:34,099 to, like, deobfuscate code or remove these, like, checks in some way, but this 607 00:51:34,099 --> 00:51:39,510 is, like, sort of orthogonal work. And at the same time, I guess most drivers are 608 00:51:39,510 --> 00:51:43,980 not really compiled with that sort of obfuscation; they're just, like, you know, 609 00:51:43,980 --> 00:51:47,657 they're compiled with a regular compiler. But yeah, of course, this is, like, a 610 00:51:47,657 --> 00:51:50,070 limitation. gannimo: They're likely stripped, but not 611 00:51:50,070 --> 00:51:54,281 necessarily obfuscated. At least from what we've seen when we looked at binary-only 612 00:51:54,281 --> 00:51:58,980 drivers. Herald: Microphone number two. 613 00:51:58,980 --> 00:52:04,350 Q: How do you decide where to place the red zones? From what I heard, you talked 614 00:52:04,350 --> 00:52:10,030 about instrumenting the allocators, but, well, there are a lot of variables on the 615 00:52:10,030 --> 00:52:13,270 stack, so how do you deal with those? gannimo: Oh, yeah, that's actually super 616 00:52:13,270 --> 00:52:20,159 cool.
I refer to some extent to the paper that is on the GitHub repo as well. If you 617 00:52:20,159 --> 00:52:26,778 think about it, modern compilers use canaries for buffers. Are you aware of 618 00:52:26,778 --> 00:52:31,150 stack canaries—how stack canaries work? So, stack canaries—like, if the compiler 619 00:52:31,150 --> 00:52:34,440 sees there's a buffer that may be overflown, it places a stack canary 620 00:52:34,440 --> 00:52:39,740 between the buffer and any other data. What we do is, as part of our analysis 621 00:52:39,740 --> 00:52:44,750 tool, we find these stack canaries, remove the code that handles the stack canary, and 622 00:52:44,750 --> 00:52:49,420 use this space to place our red zones. So we actually hack the stack canaries, 623 00:52:49,420 --> 00:52:54,569 remove that code, and add ASan red zones into the empty stack canary slots that are now 624 00:52:54,569 --> 00:52:58,599 there. It's actually a super cool optimization, because we piggyback on the 625 00:52:58,599 --> 00:53:02,630 work the compiler already did for us before, and we can then leverage that 626 00:53:02,630 --> 00:53:06,780 to gain additional benefits and protect the stack as well. 627 00:53:06,780 --> 00:53:11,120 Q: Thanks. Angel: Another question from the Internet. 628 00:53:16,039 --> 00:53:20,920 Q: Yes. Did you consider lifting the binary code to LLVM IR instead of 629 00:53:20,920 --> 00:53:28,370 generating assembler source? gannimo: Yes. *laughter* But, so—a little 630 00:53:28,370 --> 00:53:32,060 bit longer answer. Yes, we did consider that. Yes, it would be super nice to lift 631 00:53:32,060 --> 00:53:38,710 to LLVM IR. We've actually looked into this. It's incredibly hard. It's 632 00:53:38,710 --> 00:53:42,270 incredibly complex. There's no direct mapping between the machine code 633 00:53:42,270 --> 00:53:48,490 and the LLVM IR. You would still need to recover all the types. So 634 00:53:48,490 --> 00:53:51,800 it's like this magic dream that you recover full LLVM IR, then do heavyweight 635 00:53:51,800 --> 00:53:57,470 transformations on top of it. But this is incredibly hard, because if you compile 636 00:53:57,470 --> 00:54:03,570 down from LLVM IR to machine code, you lose a massive amount of information. You 637 00:54:03,570 --> 00:54:07,150 would have to find a way to recover all of that information, which is pretty much 638 00:54:07,150 --> 00:54:14,990 impossible and undecidable in many cases. So for example, just as a note, we only 639 00:54:14,990 --> 00:54:19,420 recover control flow and we only symbolize control flow. For data 640 00:54:19,420 --> 00:54:23,030 references—we don't support instrumentation of data references yet, 641 00:54:23,030 --> 00:54:28,839 because there's still an undecidable problem that we are faced with. I can 642 00:54:28,839 --> 00:54:32,859 talk more about this offline, or there is a note in the paper as well. So if you're lifting 643 00:54:32,859 --> 00:54:37,270 to assembly files, this is just a small problem. If you're 644 00:54:37,270 --> 00:54:41,700 lifting to LLVM IR, you would have to do full end-to-end type recovery, which is 645 00:54:41,700 --> 00:54:46,400 massively more complicated. Yes, it would be super nice. Unfortunately, it is 646 00:54:46,400 --> 00:54:50,530 undecidable and really, really hard. So you can come up with some heuristics, but 647 00:54:50,530 --> 00:54:55,270 there is no solution that will be correct 100 percent of the 648 00:54:55,270 --> 00:54:57,490 time.
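To make the control-flow versus data-reference distinction concrete, here is a small Python sketch using Capstone (the disassembler the speakers mention in the next answer). It is not RetroWrite's symbolization code; the byte sequence and the label names are invented for illustration. The point it shows: a PC-relative call and a RIP-relative memory operand pin down an exact target that can be turned into an assembler label, while a raw immediate constant might be an address or just a number, so it is left untouched.

from capstone import Cs, CS_ARCH_X86, CS_MODE_64
from capstone.x86 import X86_OP_IMM, X86_OP_MEM, X86_REG_RIP

md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True  # needed to inspect operands

# call +0x20 ; lea rdi, [rip+0x100] ; mov eax, 0x401000
code = bytes.fromhex("e820000000" "488d3d00010000" "b800104000")

for insn in md.disasm(code, 0x1000):
    ops = insn.operands
    rip_op = next((op for op in ops
                   if op.type == X86_OP_MEM and op.mem.base == X86_REG_RIP), None)
    if insn.mnemonic == "call" and ops and ops[0].type == X86_OP_IMM:
        # Capstone already resolves the PC-relative target to an absolute address.
        print(f"{insn.mnemonic} {insn.op_str:22} -> call .Lfunc_{ops[0].imm:x}  (control flow: symbolizable)")
    elif rip_op is not None:
        target = insn.address + insn.size + rip_op.mem.disp
        print(f"{insn.mnemonic} {insn.op_str:22} -> use label .Lref_{target:x}  (RIP-relative: symbolizable)")
    else:
        # 0x401000 here could be an address or a plain constant: undecidable
        # in general, so data references like this are not rewritten.
        print(f"{insn.mnemonic} {insn.op_str:22} -> left as-is  (ambiguous data reference)")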
Angel: We'll take one more question from 649 00:54:57,490 --> 00:55:02,609 microphone number six. Q: Thank you for your talk. What kind of 650 00:55:02,609 --> 00:55:07,299 disassembler did you use for RetroWrite, and did you have problems with wrong 651 00:55:07,299 --> 00:55:12,880 disassembly? And if so, how did you handle it? 652 00:55:12,880 --> 00:55:18,790 Nspace: So, RetroWrite—so we used Capstone for the disassembly. 653 00:55:18,790 --> 00:55:24,150 gannimo: An amazing tool, by the way. Nspace: Yeah. So the idea is that, like, 654 00:55:24,150 --> 00:55:30,240 we need some kind of—some information about where the functions are. So for the 655 00:55:30,240 --> 00:55:33,549 kernel modules, this is actually fine, because kernel modules come with this sort 656 00:55:33,549 --> 00:55:37,730 of information because the kernel needs it, to build stack traces, for example. 657 00:55:37,730 --> 00:55:41,869 For userspace binaries, this is somewhat less common, but you can use another tool 658 00:55:41,869 --> 00:55:46,170 to try to do function identification. And we do, like, sort of disassemble the 659 00:55:46,170 --> 00:55:54,500 entire function. So we have run into some issues with, like, AT&T syntax, because 660 00:55:54,500 --> 00:55:59,650 we wanted to use gas, GNU's assembler, for, for... 661 00:55:59,650 --> 00:56:04,240 gannimo: Reassembling. Nspace: Reassembly, yeah. And you can 662 00:56:04,240 --> 00:56:09,819 express two different instructions, 663 00:56:09,819 --> 00:56:15,670 like a five-byte NOP and a six-byte NOP, using the same text: the same mnemonic 664 00:56:15,670 --> 00:56:19,970 and operand string. But the problem is that, like, the kernel doesn't like it and 665 00:56:19,970 --> 00:56:21,970 crashes. This took me like two days to debug. 666 00:56:21,970 --> 00:56:27,640 gannimo: So the kernel uses dynamic binary patching when it runs, at runtime, and it 667 00:56:27,640 --> 00:56:32,980 uses fixed offsets, so if you replace a five-byte NOP with a six-byte NOP or vice 668 00:56:32,980 --> 00:56:37,830 versa, your offsets change and your kernel just blows up in your face. 669 00:56:37,830 --> 00:56:43,099 Q: So it was kind of a case-by-case basis where you saw the errors coming out of the 670 00:56:43,099 --> 00:56:47,920 disassembly and you had to fix it? Nspace: So, sorry, can you repeat the 671 00:56:47,920 --> 00:56:51,030 question? Q: Like, for example, if some 672 00:56:51,030 --> 00:56:54,910 instruction is not supported by the disassembler, so you saw that it crashed, 673 00:56:54,910 --> 00:56:58,000 that there's something wrong, and then you fixed it by hand? 674 00:56:58,000 --> 00:57:02,940 Nspace: Yeah, well, if we saw that there was a problem with it—like, I don't 675 00:57:02,940 --> 00:57:06,960 recall having any unknown instructions in the disassembler. I don't think I've ever 676 00:57:06,960 --> 00:57:11,290 had a problem with that. But yeah, this was a lot of, like, you know, engineering 677 00:57:11,290 --> 00:57:14,290 work. gannimo: So let me repeat. The problem was 678 00:57:14,290 --> 00:57:19,220 not a bug in the disassembler, but an issue with the instruction format—that the 679 00:57:19,220 --> 00:57:24,530 same mnemonic can be translated into two different instructions, one of which was 680 00:57:24,530 --> 00:57:29,089 five bytes long, the other one was six bytes long. Both used the exact same 681 00:57:29,089 --> 00:57:32,880 mnemonic.
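This length pitfall can be checked mechanically when rewriting. Below is a small, hedged Python sketch, not the RetroWrite code, that round-trips an instruction through Capstone and the Keystone assembler (standing in here for GNU as) and flags any size change; the two NOP encodings are illustrative examples of byte sequences whose textual form may not pin down a unique length.

# Round-trip length check for a static rewriter: disassemble an instruction,
# re-assemble its textual form, and verify that the size did not change.
# Multi-byte NOPs are a classic case where one mnemonic/operand string can
# correspond to several encodings; a length change breaks kernel code that is
# patched at runtime using fixed offsets. Keystone stands in for GNU as.

from capstone import Cs, CS_ARCH_X86, CS_MODE_64
from keystone import Ks, KS_ARCH_X86, KS_MODE_64

md = Cs(CS_ARCH_X86, CS_MODE_64)
ks = Ks(KS_ARCH_X86, KS_MODE_64)

samples = [
    bytes.fromhex("0f1f440000"),        # 5-byte NOP (disp8 encoding)
    bytes.fromhex("0f1f840000000000"),  # 8-byte NOP (disp32 encoding, same text)
]

for raw in samples:
    insn = next(md.disasm(raw, 0))
    text = f"{insn.mnemonic} {insn.op_str}".strip()
    encoding, _ = ks.asm(text)
    if not encoding:
        print(f"{raw.hex():18} -> '{text}' -> could not re-assemble, keep original bytes")
        continue
    verdict = "ok" if len(encoding) == len(raw) else "LENGTH CHANGED, keep original bytes"
    print(f"{raw.hex():18} -> '{text}' -> {bytes(encoding).hex()}  [{verdict}]")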
Right, so this was an issue with the assembly encoding. 682 00:57:32,880 --> 00:57:38,290 Q: But you had no problems with unsupported instructions which couldn't be 683 00:57:38,290 --> 00:57:41,339 disassembled? Nspace: No, no. Not as far as I know, at 684 00:57:41,339 --> 00:57:43,339 least. Angel: We have one more minute, so a very 685 00:57:43,339 --> 00:57:52,069 short question from microphone number two. Q: Does it work? Ah. Is your binary 686 00:57:52,069 --> 00:58:02,020 instrumentation as powerful as kernel address space... I mean, kASan? So, does 687 00:58:02,020 --> 00:58:06,349 it detect all the memory corruptions on the stack, the heap, and globals? 688 00:58:06,349 --> 00:58:13,050 gannimo: No globals. But heap—it does all of them on the heap. There's some slight 689 00:58:13,050 --> 00:58:20,150 variation on the stack, because we have to piggyback on the canary stuff. As I 690 00:58:20,150 --> 00:58:23,880 mentioned quickly before, there is no reflowing and full recovery of data 691 00:58:23,880 --> 00:58:28,990 layouts. So to get anything on the stack, we have to piggyback on existing compiler 692 00:58:28,990 --> 00:58:36,650 extensions like stack canaries. So we don't support intra-object overflows on 693 00:58:36,650 --> 00:58:40,631 the stack. But we do leverage the stack canaries to get some stack benefits, which 694 00:58:40,631 --> 00:58:45,490 is, I don't know, 90, 95 percent there, because the stack canaries are pretty 695 00:58:45,490 --> 00:58:51,319 good. For the heap, we get the same precision. For globals, we have very limited support. 696 00:58:51,319 --> 00:58:54,290 Q: Thanks. Angel: So that's all the time we have for 697 00:58:54,290 --> 00:58:57,600 this talk. You can find the speakers, I think, afterwards offline. Please give 698 00:58:57,600 --> 00:58:59,820 them a big round of applause for an interesting talk. 699 00:58:59,820 --> 00:59:03,050 *applause* 700 00:59:03,050 --> 00:59:07,360 *36c3 postrol music* 701 00:59:07,360 --> 00:59:29,000 Subtitles created by c3subtitles.de in the year 2021. Join, and help us!