0 00:00:00,000 --> 00:00:30,000 Dear viewer, these subtitles were generated by a machine via the service Trint and therefore are (very) buggy. If you are capable, please help us to create good quality subtitles: https://c3subtitles.de/talk/339 Thanks! 1 00:00:11,770 --> 00:00:13,359 Thanks for the introduction. 2 00:00:13,360 --> 00:00:14,949 Hey, guys, it's a pleasure to be here. 3 00:00:17,140 --> 00:00:18,140 Awesome. Yeah, 4 00:00:19,450 --> 00:00:21,039 let me tell you a story from the 5 00:00:21,040 --> 00:00:23,799 trenches 6 00:00:23,800 --> 00:00:25,089 where we are fighting every day. 7 00:00:25,090 --> 00:00:27,129 This is joint work with Vova Kuznetsov, 8 00:00:27,130 --> 00:00:29,469 László Szekeres, George 9 00:00:29,470 --> 00:00:31,599 Candea, R. Sekar and 10 00:00:31,600 --> 00:00:33,639 Dawn Song. Most of the work has been done 11 00:00:33,640 --> 00:00:34,640 at UC Berkeley. 12 00:00:35,830 --> 00:00:38,349 Vova and László did most of the 13 00:00:38,350 --> 00:00:40,659 heavy programming work and heavy lifting, 14 00:00:40,660 --> 00:00:42,219 while I could only add some of the 15 00:00:42,220 --> 00:00:44,779 support libraries and a 16 00:00:44,780 --> 00:00:46,719 little bit of the other stuff, 17 00:00:46,720 --> 00:00:47,720 but most of the 18 00:00:48,820 --> 00:00:50,889 programming praise goes to the 19 00:00:50,890 --> 00:00:52,090 two students. 20 00:00:54,010 --> 00:00:57,069 Also, blame me for all the 21 00:00:57,070 --> 00:00:58,759 bad jokes in the presentation. 22 00:00:58,760 --> 00:01:01,029 It's not their fault. 23 00:01:01,030 --> 00:01:02,979 So let's start right away. 24 00:01:02,980 --> 00:01:05,409 We live on an ugly planet, 25 00:01:05,410 --> 00:01:08,319 a bug planet, and 26 00:01:08,320 --> 00:01:09,729 it's just terrible out there. 27 00:01:09,730 --> 00:01:11,079 There are so many bugs. 28 00:01:11,080 --> 00:01:12,189 It's just crazy.
29 00:01:12,190 --> 00:01:14,029 And we are overrun by them. 30 00:01:14,030 --> 00:01:16,329 If you just look at what happened 31 00:01:16,330 --> 00:01:18,759 in the last couple of years, 32 00:01:18,760 --> 00:01:20,530 for example, pointers not working, 33 00:01:23,620 --> 00:01:24,519 memory corruption 34 00:01:24,520 --> 00:01:25,629 bugs are abundant. 35 00:01:25,630 --> 00:01:26,889 They are everywhere. 36 00:01:26,890 --> 00:01:28,449 They are just literally everywhere. 37 00:01:28,450 --> 00:01:31,179 And there are so many CVEs 38 00:01:31,180 --> 00:01:33,789 out there 39 00:01:33,790 --> 00:01:36,129 where an attacker 40 00:01:36,130 --> 00:01:37,869 can gain some form of control-flow 41 00:01:37,870 --> 00:01:38,870 hijack 42 00:01:40,090 --> 00:01:42,399 permissions or capabilities. 43 00:01:42,400 --> 00:01:45,309 And this is just, 44 00:01:45,310 --> 00:01:47,379 we just picked a couple of programs 45 00:01:47,380 --> 00:01:49,749 out there, like Acrobat, Firefox, 46 00:01:49,750 --> 00:01:50,979 IE, 47 00:01:50,980 --> 00:01:53,349 general OS X and Linux bugs 48 00:01:53,350 --> 00:01:56,199 that allow an outside attacker to gain 49 00:01:56,200 --> 00:01:58,389 control-flow or code execution 50 00:01:58,390 --> 00:02:00,609 capabilities on a system. 51 00:02:00,610 --> 00:02:02,889 And the attacks are on the rise. 52 00:02:02,890 --> 00:02:04,959 And all of these attacks 53 00:02:04,960 --> 00:02:06,999 rely on some fundamental memory 54 00:02:07,000 --> 00:02:08,469 corruption vulnerability somewhere in 55 00:02:08,470 --> 00:02:09,459 the background.
56 00:02:09,460 --> 00:02:11,919 And we are being overrun 57 00:02:11,920 --> 00:02:14,169 by all these bugs. Even though we are 58 00:02:14,170 --> 00:02:16,809 investing more and more time into fuzzing 59 00:02:16,810 --> 00:02:19,299 and tools that find bugs, 60 00:02:19,300 --> 00:02:21,279 there are just way too many out there 61 00:02:21,280 --> 00:02:22,779 for us to fix 62 00:02:22,780 --> 00:02:23,499 manually. 63 00:02:23,500 --> 00:02:24,939 So we have to come up with some 64 00:02:24,940 --> 00:02:28,029 techniques that allow us to 65 00:02:28,030 --> 00:02:30,279 take some proactive steps 66 00:02:30,280 --> 00:02:32,199 against these bugs and vulnerabilities 67 00:02:32,200 --> 00:02:34,239 that are in our software, to protect it 68 00:02:34,240 --> 00:02:36,489 against different forms of 69 00:02:36,490 --> 00:02:38,469 control-flow hijack attacks. 70 00:02:38,470 --> 00:02:40,329 And just to name two prominent examples, 71 00:02:40,330 --> 00:02:42,009 this year, there was the Heartbleed 72 00:02:42,010 --> 00:02:45,129 vulnerability, which was a memory safety 73 00:02:45,130 --> 00:02:46,929 or a memory corruption vulnerability, and 74 00:02:46,930 --> 00:02:48,009 Shellshock as well. 75 00:02:48,010 --> 00:02:49,839 And some of these vulnerabilities, 76 00:02:49,840 --> 00:02:51,779 they sleep and slumber in the software for 77 00:02:51,780 --> 00:02:53,859 many, many years. 78 00:02:53,860 --> 00:02:55,929 And suddenly they are found and they can 79 00:02:55,930 --> 00:02:58,179 be exploited on a very, 80 00:02:58,180 --> 00:02:59,259 very wide scale. 81 00:02:59,260 --> 00:03:01,029 And I argue that we have to come up with 82 00:03:01,030 --> 00:03:03,639 some proactive defense mechanism 83 00:03:03,640 --> 00:03:05,679 that protects us against these kinds of 84 00:03:05,680 --> 00:03:07,149 vulnerabilities.
85 00:03:07,150 --> 00:03:09,549 So let me take a step back 86 00:03:09,550 --> 00:03:11,619 and explain to you what memory safety 87 00:03:11,620 --> 00:03:13,269 is all about and why we don't have it 88 00:03:13,270 --> 00:03:14,270 yet. 89 00:03:14,650 --> 00:03:16,989 So at the most 90 00:03:16,990 --> 00:03:19,149 basic level, a memory 91 00:03:19,150 --> 00:03:21,189 safety violation, or memory 92 00:03:21,190 --> 00:03:23,169 corruption, relies 93 00:03:23,170 --> 00:03:25,509 on some form of invalid reference. 94 00:03:25,510 --> 00:03:27,099 So, for example, there can be dangling 95 00:03:27,100 --> 00:03:29,439 pointers, which would be a temporal 96 00:03:29,440 --> 00:03:31,539 issue. So, for example, at one point in 97 00:03:31,540 --> 00:03:33,699 time, in a language like C 98 00:03:33,700 --> 00:03:36,339 or C++, we have a memory object 99 00:03:36,340 --> 00:03:37,749 and we have a pointer to that memory 100 00:03:37,750 --> 00:03:40,299 object. So we have a valid reference. 101 00:03:40,300 --> 00:03:42,459 But we free that object at one 102 00:03:42,460 --> 00:03:45,579 point in time and it becomes an invalid 103 00:03:45,580 --> 00:03:47,319 object. And there's a dangling 104 00:03:47,320 --> 00:03:49,419 pointer to that object, which can lead to 105 00:03:49,420 --> 00:03:51,339 some form of memory corruption 106 00:03:51,340 --> 00:03:53,499 when we dereference it. Or there 107 00:03:53,500 --> 00:03:55,239 are out-of-bounds pointers. Imagine that 108 00:03:55,240 --> 00:03:57,399 we are working and iterating 109 00:03:57,400 --> 00:03:58,400 through an array, 110 00:04:00,100 --> 00:04:02,469 and as soon as we step outside 111 00:04:02,470 --> 00:04:04,359 of the array, we have an out-of-bounds 112 00:04:04,360 --> 00:04:06,429 pointer, which is some form of spatial 113 00:04:06,430 --> 00:04:08,619 memory safety violation.
114 00:04:08,620 --> 00:04:10,899 Both of these pointers are still fine 115 00:04:10,900 --> 00:04:12,939 as long as we don't dereference them. 116 00:04:12,940 --> 00:04:15,429 But something bad will happen if 117 00:04:15,430 --> 00:04:17,499 the pointer is read, written or freed 118 00:04:17,500 --> 00:04:19,479 again and we end up with some form of 119 00:04:19,480 --> 00:04:20,480 corruption. 120 00:04:22,010 --> 00:04:25,579 So otherwise, there's no violation. 121 00:04:25,580 --> 00:04:27,289 The threat model that we are going to 122 00:04:27,290 --> 00:04:29,749 have for this talk looks as follows. 123 00:04:29,750 --> 00:04:31,969 We assume that our attacker can 124 00:04:31,970 --> 00:04:34,129 read and write arbitrary data in 125 00:04:34,130 --> 00:04:35,449 the whole process image. 126 00:04:36,470 --> 00:04:38,389 The attacker can read code as well. 127 00:04:38,390 --> 00:04:40,999 But the attacker cannot modify 128 00:04:41,000 --> 00:04:43,159 the code as it is being executed. 129 00:04:43,160 --> 00:04:44,689 And we'll discuss some of these 130 00:04:44,690 --> 00:04:45,919 limitations later on. 131 00:04:45,920 --> 00:04:47,779 Nor can the attacker influence program 132 00:04:47,780 --> 00:04:49,429 loading. So if the attacker has control 133 00:04:49,430 --> 00:04:51,529 of the whole process image before 134 00:04:51,530 --> 00:04:52,939 we can actually set up our defense 135 00:04:52,940 --> 00:04:55,009 mechanism, there's nothing we can do. 136 00:04:55,010 --> 00:04:57,379 So to 137 00:04:57,380 --> 00:04:57,769 speak 138 00:04:57,770 --> 00:04:59,239 a little bit more permission-wise, 139 00:04:59,240 --> 00:05:01,039 there are read permissions for the 140 00:05:01,040 --> 00:05:03,259 attacker on the code, 141 00:05:03,260 --> 00:05:05,869 but the attacker can change the 142 00:05:05,870 --> 00:05:07,279 code pointers.
143 00:05:07,280 --> 00:05:09,409 There is read and write for the 144 00:05:09,410 --> 00:05:10,879 heap and read and write for the stack. 145 00:05:11,930 --> 00:05:13,999 So just 146 00:05:14,000 --> 00:05:15,829 so that we are all on the same page, 147 00:05:15,830 --> 00:05:17,929 I'll quickly walk you through some 148 00:05:17,930 --> 00:05:20,179 form of control-flow hijack attack as 149 00:05:20,180 --> 00:05:22,729 it is being used nowadays to 150 00:05:22,730 --> 00:05:24,919 sidestep all the defense mechanisms that 151 00:05:24,920 --> 00:05:27,589 are active on current systems, 152 00:05:27,590 --> 00:05:29,809 so that we can understand how 153 00:05:29,810 --> 00:05:32,029 the defense mechanisms that I'll present 154 00:05:32,030 --> 00:05:33,859 later on will work. 155 00:05:33,860 --> 00:05:36,109 So we have this simple 156 00:05:36,110 --> 00:05:37,909 C program on the left-hand side. 157 00:05:37,910 --> 00:05:40,039 We define a function pointer and 158 00:05:40,040 --> 00:05:42,139 there's some weird pointer magic 159 00:05:42,140 --> 00:05:44,419 going on, where we assign 160 00:05:44,420 --> 00:05:46,819 q to a buffer plus some 161 00:05:46,820 --> 00:05:49,009 attacker-controlled input, so the attacker 162 00:05:49,010 --> 00:05:50,179 can kind of 163 00:05:51,340 --> 00:05:53,439 point somewhere into 164 00:05:53,440 --> 00:05:55,509 memory by carefully crafting 165 00:05:55,510 --> 00:05:57,009 the inputs to that pointer. 166 00:05:58,840 --> 00:06:01,059 After that happened, the function 167 00:06:01,060 --> 00:06:03,099 pointer is assigned to the function foo, 168 00:06:03,100 --> 00:06:05,769 but later on, the attacker-controlled 169 00:06:05,770 --> 00:06:08,499 pointer q is assigned a value.
170 00:06:08,500 --> 00:06:10,659 So initially the 171 00:06:10,660 --> 00:06:12,789 programmer intended q to point into 172 00:06:12,790 --> 00:06:15,309 the buffer, but the attacker 173 00:06:15,310 --> 00:06:17,679 can override or can control 174 00:06:17,680 --> 00:06:19,809 q to point to some function pointer 175 00:06:19,810 --> 00:06:22,509 later on somewhere in memory. 176 00:06:22,510 --> 00:06:24,609 And as a second step of the 177 00:06:24,610 --> 00:06:27,309 attack, the attacker can write 178 00:06:27,310 --> 00:06:29,199 to the function pointer and instead of 179 00:06:29,200 --> 00:06:31,179 pointing to the valid function that we 180 00:06:31,180 --> 00:06:33,849 want to execute, the 181 00:06:33,850 --> 00:06:35,469 function pointer now points to an 182 00:06:35,470 --> 00:06:37,449 attacker-controlled gadget, and this allows a 183 00:06:37,450 --> 00:06:39,549 control-flow hijack attack that the 184 00:06:39,550 --> 00:06:40,550 attacker can then 185 00:06:42,070 --> 00:06:44,139 exploit as soon as the pointer 186 00:06:44,140 --> 00:06:45,219 is dereferenced. 187 00:06:45,220 --> 00:06:46,929 And as soon as that happens, all the bets 188 00:06:46,930 --> 00:06:49,449 are off. The attacker runs code 189 00:06:49,450 --> 00:06:52,209 and can get control over your system. 190 00:06:52,210 --> 00:06:53,799 But you might say we've got all these 191 00:06:53,800 --> 00:06:56,619 fancy defense mechanisms out there. 192 00:06:56,620 --> 00:06:59,199 What about these existing defenses? 193 00:06:59,200 --> 00:07:00,469 So, yeah, that's actually true. 194 00:07:00,470 --> 00:07:02,529 We've got a bunch of defense mechanisms 195 00:07:02,530 --> 00:07:04,869 like data execution prevention, which 196 00:07:04,870 --> 00:07:07,719 prohibits an attacker from injecting code.
197 00:07:07,720 --> 00:07:10,179 That's why I assumed in the beginning 198 00:07:10,180 --> 00:07:12,249 that the attacker cannot 199 00:07:12,250 --> 00:07:14,289 modify the running code. 200 00:07:14,290 --> 00:07:16,329 So this is all fair and nice. 201 00:07:16,330 --> 00:07:18,459 But in addition to that, a problem 202 00:07:18,460 --> 00:07:19,899 on current systems is that the attacker 203 00:07:19,900 --> 00:07:22,179 can stitch individual 204 00:07:22,180 --> 00:07:24,159 gadgets right next to each other and 205 00:07:24,160 --> 00:07:26,679 thereby execute arbitrary 206 00:07:26,680 --> 00:07:28,419 code, so-called return-oriented 207 00:07:28,420 --> 00:07:30,309 programming or jump-oriented programming, 208 00:07:30,310 --> 00:07:31,310 if you want to look it up. 209 00:07:32,700 --> 00:07:34,769 There's ASLR, which is a nice 210 00:07:34,770 --> 00:07:37,019 probabilistic defense mechanism that 211 00:07:37,020 --> 00:07:38,639 shuffles memory around, but it's only a 212 00:07:38,640 --> 00:07:39,659 probabilistic defense. 213 00:07:39,660 --> 00:07:41,639 And if we assume that the attacker can 214 00:07:41,640 --> 00:07:43,709 read arbitrary memory, the attacker can 215 00:07:43,710 --> 00:07:45,989 easily sidestep this defense 216 00:07:45,990 --> 00:07:47,939 mechanism by reading out some pointers 217 00:07:47,940 --> 00:07:50,129 and thereby reconstructing the 218 00:07:50,130 --> 00:07:52,679 randomization that was used, 219 00:07:52,680 --> 00:07:54,899 and thereby leak the correct addresses 220 00:07:54,900 --> 00:07:57,389 and craft an attack carefully 221 00:07:57,390 --> 00:07:59,129 that sidesteps this 222 00:07:59,130 --> 00:08:00,329 defense mechanism.
223 00:08:00,330 --> 00:08:02,099 And the same goes for stack 224 00:08:02,100 --> 00:08:04,979 canaries that protect against 225 00:08:04,980 --> 00:08:07,319 overwriting the return instruction 226 00:08:07,320 --> 00:08:08,219 pointer on the stack. 227 00:08:08,220 --> 00:08:09,689 If the attacker can read out the stack 228 00:08:09,690 --> 00:08:11,309 canary before that, the attacker can 229 00:08:11,310 --> 00:08:13,679 carefully craft an attack that sidesteps 230 00:08:13,680 --> 00:08:14,759 the defense mechanism. 231 00:08:16,440 --> 00:08:18,779 So it looks like they're almost right. 232 00:08:18,780 --> 00:08:20,309 If you want to know more, 233 00:08:20,310 --> 00:08:22,229 there are a bunch of nice papers out 234 00:08:22,230 --> 00:08:24,539 there or you can just watch the 235 00:08:24,540 --> 00:08:26,729 talk from last year 236 00:08:26,730 --> 00:08:28,409 where we evaluated all the different 237 00:08:28,410 --> 00:08:30,569 kinds of war games that are 238 00:08:30,570 --> 00:08:32,879 in memory and how 239 00:08:32,880 --> 00:08:35,099 these war games all play together and how 240 00:08:35,100 --> 00:08:36,289 an attacker can exploit them. 241 00:08:37,590 --> 00:08:40,199 So now that we know that 242 00:08:40,200 --> 00:08:42,329 we cannot just use C code 243 00:08:42,330 --> 00:08:44,609 or C++ code because the code 244 00:08:44,610 --> 00:08:46,679 is full of bugs, you 245 00:08:46,680 --> 00:08:47,909 might say, ha ha, 246 00:08:49,180 --> 00:08:51,099 memory safety to the rescue, we'll just 247 00:08:51,100 --> 00:08:53,379 move to a safe language, right? Instead 248 00:08:53,380 --> 00:08:55,389 of coding in C or C++, 249 00:08:57,550 --> 00:08:59,679 programming language research 250 00:08:59,680 --> 00:09:01,149 has come up with a whole bunch of 251 00:09:01,150 --> 00:09:03,129 languages. There's Python, 252 00:09:04,210 --> 00:09:06,339 there's Java, there's C 253 00:09:06,340 --> 00:09:08,349 sharp, or there's Swift.
254 00:09:08,350 --> 00:09:10,209 These are all memory-safe 255 00:09:10,210 --> 00:09:11,859 languages. And these problems that I just 256 00:09:11,860 --> 00:09:13,929 talked about would completely go away. 257 00:09:14,980 --> 00:09:15,980 Sounds good, right? 258 00:09:17,040 --> 00:09:19,439 So where do we stand? 259 00:09:19,440 --> 00:09:21,549 Let's look at 260 00:09:21,550 --> 00:09:23,819 the Dropbox uploader for 261 00:09:23,820 --> 00:09:24,939 your files. 262 00:09:24,940 --> 00:09:26,819 It's written in three thousand 263 00:09:26,820 --> 00:09:28,889 lines of code, completely 264 00:09:28,890 --> 00:09:30,169 memory safe. 265 00:09:30,170 --> 00:09:32,249 Everything is good, right? 266 00:09:32,250 --> 00:09:35,129 There's no way you can exploit that code. 267 00:09:35,130 --> 00:09:37,199 But imagine, to run these three 268 00:09:37,200 --> 00:09:39,659 thousand lines of Python code, 269 00:09:39,660 --> 00:09:41,909 what do you need on top of that? 270 00:09:41,910 --> 00:09:44,129 Well, for one, there's the Python 271 00:09:44,130 --> 00:09:46,199 runtime, half a 272 00:09:46,200 --> 00:09:47,549 million lines of code. 273 00:09:47,550 --> 00:09:48,600 It's written in C. 274 00:09:49,970 --> 00:09:51,919 And again, we have all the memory 275 00:09:51,920 --> 00:09:54,019 corruption vulnerabilities. In addition 276 00:09:54,020 --> 00:09:57,019 to that, we use libc 277 00:09:57,020 --> 00:09:59,089 for all the calls, to do all the 278 00:09:59,090 --> 00:10:01,129 system calls, two and a half million 279 00:10:01,130 --> 00:10:03,349 lines of code. 280 00:10:03,350 --> 00:10:05,089 And on top of that, there's a thing 281 00:10:05,090 --> 00:10:07,219 called the Linux kernel or 282 00:10:07,220 --> 00:10:09,289 the Windows kernel, another 16 283 00:10:09,290 --> 00:10:11,659 million lines of C code.
284 00:10:11,660 --> 00:10:14,269 So C code or C++ code, 285 00:10:14,270 --> 00:10:16,189 low-level languages are not going to go 286 00:10:16,190 --> 00:10:17,179 away. 287 00:10:17,180 --> 00:10:19,459 There's no way we can fix all these bugs. 288 00:10:19,460 --> 00:10:21,739 And it's just way too much of an attack 289 00:10:21,740 --> 00:10:23,749 surface. And we need to come up with 290 00:10:23,750 --> 00:10:25,460 strong defense mechanisms for those. 291 00:10:26,750 --> 00:10:29,119 So if you look at the current state of 292 00:10:29,120 --> 00:10:31,189 defense mechanisms, a 293 00:10:31,190 --> 00:10:33,259 lot of the code that runs on 294 00:10:33,260 --> 00:10:35,179 the system, even if you program in a safe 295 00:10:35,180 --> 00:10:36,769 language, is actually unsafe. 296 00:10:36,770 --> 00:10:38,899 And only a small, tiny little 297 00:10:38,900 --> 00:10:41,179 subset is written in safe 298 00:10:41,180 --> 00:10:42,180 code, if at all. 299 00:10:43,280 --> 00:10:44,390 So if you compare 300 00:10:45,700 --> 00:10:47,709 how close we are to safe 301 00:10:47,710 --> 00:10:48,789 languages, 302 00:10:48,790 --> 00:10:51,099 we are way off and there's 303 00:10:51,100 --> 00:10:53,469 a long way to go 304 00:10:53,470 --> 00:10:54,730 and we need to get much better. 305 00:10:56,120 --> 00:10:58,249 So you might say, OK, fair 306 00:10:58,250 --> 00:10:59,250 enough. 307 00:10:59,720 --> 00:11:01,729 We cannot rewrite everything in a 308 00:11:01,730 --> 00:11:04,729 safe language, so let's just retrofit 309 00:11:04,730 --> 00:11:06,949 a safe language on top of the 310 00:11:06,950 --> 00:11:08,989 old languages. Right?
311 00:11:08,990 --> 00:11:10,939 Sounds like a good idea, because with memory 312 00:11:10,940 --> 00:11:13,099 safety, first, all these bugs 313 00:11:13,100 --> 00:11:16,159 would go away. And 314 00:11:16,160 --> 00:11:18,409 academia and industry came 315 00:11:18,410 --> 00:11:20,179 up with a bunch of defense mechanisms on 316 00:11:20,180 --> 00:11:20,989 top of that, 317 00:11:20,990 --> 00:11:23,089 code that you can just plug into your 318 00:11:23,090 --> 00:11:25,369 compiler and more 319 00:11:25,370 --> 00:11:27,379 or less easily just recompile your 320 00:11:27,380 --> 00:11:29,539 software with a couple of hundred 321 00:11:29,540 --> 00:11:31,459 hours of code changes and you're good 322 00:11:31,460 --> 00:11:32,460 to go. 323 00:11:33,260 --> 00:11:36,109 So there's SoftBound plus CETS, 324 00:11:36,110 --> 00:11:38,269 which is a great defense mechanism that 325 00:11:38,270 --> 00:11:40,369 retrofits memory safety on top of C 326 00:11:40,370 --> 00:11:42,589 and C++, but it comes with 327 00:11:42,590 --> 00:11:44,389 an almost one hundred and twenty 328 00:11:44,390 --> 00:11:46,099 percent price tag. 329 00:11:46,100 --> 00:11:48,019 So it's fairly expensive. 330 00:11:48,020 --> 00:11:50,539 Also, it's not exactly 331 00:11:50,540 --> 00:11:51,769 compatible with all the code, 332 00:11:53,200 --> 00:11:55,179 but it runs most of the code or a lot of 333 00:11:55,180 --> 00:11:57,399 the code. There's CCured, which tries 334 00:11:57,400 --> 00:11:59,529 to retrofit memory safety on top 335 00:11:59,530 --> 00:12:01,989 of the C language and restricts 336 00:12:01,990 --> 00:12:04,269 or enforces a stronger 337 00:12:04,270 --> 00:12:06,519 type system that then comes with some 338 00:12:06,520 --> 00:12:08,319 form of memory safety on top of it.
339 00:12:08,320 --> 00:12:10,509 But again, it comes with a 60 340 00:12:10,510 --> 00:12:12,429 percent price tag, which is way too 341 00:12:12,430 --> 00:12:14,559 expensive in practice. 342 00:12:14,560 --> 00:12:16,719 Or AddressSanitizer, which is 343 00:12:16,720 --> 00:12:19,059 great for debugging and a great tool. 344 00:12:19,060 --> 00:12:20,229 But on one hand it's only 345 00:12:20,230 --> 00:12:21,169 probabilistic. 346 00:12:21,170 --> 00:12:23,559 And again, it adds 73 percent 347 00:12:23,560 --> 00:12:24,560 overhead. 348 00:12:26,570 --> 00:12:28,819 So even though we know that 349 00:12:28,820 --> 00:12:30,919 as the tools are right now, they have way 350 00:12:30,920 --> 00:12:32,479 too much overhead, 351 00:12:32,480 --> 00:12:34,609 that's what we want to do. 352 00:12:34,610 --> 00:12:36,859 We want to retrofit memory safety 353 00:12:36,860 --> 00:12:39,049 on top of these existing 354 00:12:39,050 --> 00:12:41,209 compilers and then enforce it at 355 00:12:41,210 --> 00:12:43,219 runtime to ensure that we are protected 356 00:12:43,220 --> 00:12:44,629 at all times. 357 00:12:44,630 --> 00:12:46,699 And just to show you where this 358 00:12:46,700 --> 00:12:47,899 overhead comes from, 359 00:12:47,900 --> 00:12:50,119 I'm going to give you an example of 360 00:12:50,120 --> 00:12:52,219 how SoftBound would 361 00:12:52,220 --> 00:12:54,409 enforce memory safety on a small 362 00:12:54,410 --> 00:12:55,309 program. 363 00:12:55,310 --> 00:12:58,099 So we've got a couple of lines of code. 364 00:12:58,100 --> 00:12:59,809 We have a buffer that is 365 00:12:59,810 --> 00:13:01,069 allocated.
366 00:13:01,070 --> 00:13:03,139 We have our pointer q again, which 367 00:13:03,140 --> 00:13:05,359 is from the motivating example, 368 00:13:05,360 --> 00:13:07,729 which is assigned the initial 369 00:13:07,730 --> 00:13:09,979 base address of the buffer, plus some 370 00:13:09,980 --> 00:13:12,049 user-controlled input. 371 00:13:12,050 --> 00:13:13,909 SoftBound is a compiler-based 372 00:13:13,910 --> 00:13:16,069 transformation that adds additional 373 00:13:16,070 --> 00:13:18,619 checks in the background 374 00:13:18,620 --> 00:13:21,199 that are then executed at runtime. 375 00:13:21,200 --> 00:13:23,329 So first of all, it 376 00:13:23,330 --> 00:13:26,299 assigns metadata to all the pointers. 377 00:13:26,300 --> 00:13:28,789 So as soon as we declare 378 00:13:28,790 --> 00:13:31,009 buffer here, there are two additional 379 00:13:31,010 --> 00:13:34,149 variables declared that contain 380 00:13:34,150 --> 00:13:36,319 the lower 381 00:13:36,320 --> 00:13:39,169 and upper bounds of the 382 00:13:39,170 --> 00:13:40,579 buffer itself. 383 00:13:40,580 --> 00:13:43,279 So we have a lower pointer and an upper 384 00:13:43,280 --> 00:13:46,129 pointer, and those are carried along 385 00:13:46,130 --> 00:13:49,099 all the accesses and so on. 386 00:13:49,100 --> 00:13:50,809 If we do have assignments 387 00:13:52,400 --> 00:13:54,499 from other types or 388 00:13:54,500 --> 00:13:56,929 across pointers, we propagate 389 00:13:56,930 --> 00:13:59,569 metadata. So we see that 390 00:13:59,570 --> 00:14:02,149 the two variables, q lower and q upper, 391 00:14:02,150 --> 00:14:04,849 are assigned to the lower bounds of 392 00:14:04,850 --> 00:14:06,499 the buffer and the upper bounds of the 393 00:14:06,500 --> 00:14:08,809 buffer. So this metadata 394 00:14:08,810 --> 00:14:10,999 is carried along and can be used 395 00:14:11,000 --> 00:14:12,589 for further protection.
396 00:14:12,590 --> 00:14:14,659 In addition to that, we 397 00:14:14,660 --> 00:14:15,660 have 398 00:14:17,530 --> 00:14:19,749 a dereference check whenever 399 00:14:19,750 --> 00:14:21,099 that pointer is used. 400 00:14:21,100 --> 00:14:24,249 So before star q 401 00:14:24,250 --> 00:14:26,439 equals input two is 402 00:14:26,440 --> 00:14:28,659 actually executed, we do a check 403 00:14:28,660 --> 00:14:30,909 if the current value of q is 404 00:14:30,910 --> 00:14:33,129 inside the bounds and 405 00:14:33,130 --> 00:14:35,559 abort otherwise, which protects 406 00:14:35,560 --> 00:14:38,739 us from any possible attacks 407 00:14:38,740 --> 00:14:41,079 against us 408 00:14:41,080 --> 00:14:42,669 that would dereference it out of 409 00:14:42,670 --> 00:14:43,209 bounds. 410 00:14:43,210 --> 00:14:44,829 So the function pointer cannot be 411 00:14:44,830 --> 00:14:46,049 overwritten in this example. 412 00:14:47,590 --> 00:14:48,590 So 413 00:14:49,470 --> 00:14:51,719 what we have is, or what we get, 414 00:14:51,720 --> 00:14:53,249 is this one hundred and sixteen percent 415 00:14:53,250 --> 00:14:54,899 performance overhead because there are 416 00:14:54,900 --> 00:14:57,179 just way too many pointer assignments 417 00:14:57,180 --> 00:14:59,609 in low-level languages like C or C++. 418 00:14:59,610 --> 00:15:01,739 And the compiler has a very hard time 419 00:15:01,740 --> 00:15:03,989 optimizing these and getting rid of 420 00:15:03,990 --> 00:15:06,659 all the surplus assignments. 421 00:15:06,660 --> 00:15:08,819 So in reality, or with 422 00:15:08,820 --> 00:15:10,919 a perfect compiler, we should be able to 423 00:15:10,920 --> 00:15:12,989 prove for many more accesses that they 424 00:15:12,990 --> 00:15:14,429 are actually safe. 425 00:15:14,430 --> 00:15:16,739 But it's very hard to reason about 426 00:15:16,740 --> 00:15:18,329 these things on the compiler 427 00:15:18,330 --> 00:15:20,009 level.
And therefore, we have very 428 00:15:20,010 --> 00:15:22,079 high or fairly high overhead, even with 429 00:15:22,080 --> 00:15:24,389 a very sophisticated compiler analysis 430 00:15:24,390 --> 00:15:26,249 framework like LLVM offers. 431 00:15:27,430 --> 00:15:29,379 So it looks like this, right? We are 432 00:15:29,380 --> 00:15:31,839 walking towards that safe, 433 00:15:31,840 --> 00:15:33,969 safe haven and we do have that 434 00:15:33,970 --> 00:15:36,039 safe haven, but we are facing 435 00:15:36,040 --> 00:15:38,679 a problem that we either have safety 436 00:15:38,680 --> 00:15:41,499 or flexibility and performance. 437 00:15:41,500 --> 00:15:43,749 So it's an either-or, which is 438 00:15:43,750 --> 00:15:45,129 really, really bad. 439 00:15:45,130 --> 00:15:47,259 Ideally, we would 440 00:15:47,260 --> 00:15:50,469 want to have all of safety, flexibility 441 00:15:50,470 --> 00:15:51,470 and performance. 442 00:15:53,460 --> 00:15:55,259 You might want to know more about 443 00:15:56,430 --> 00:15:57,430 memory safety. 444 00:15:59,100 --> 00:16:01,409 Feel free to read the paper 445 00:16:01,410 --> 00:16:03,569 or watch last year's talk 446 00:16:03,570 --> 00:16:06,329 by Andreas, who presented 447 00:16:06,330 --> 00:16:08,640 SoftBound for FreeBSD. 448 00:16:09,910 --> 00:16:11,499 So now that we know how memory safety 449 00:16:11,500 --> 00:16:13,599 works, can it be adapted 450 00:16:13,600 --> 00:16:16,869 somehow so that we can protect only 451 00:16:16,870 --> 00:16:18,789 a small set of data? 452 00:16:20,380 --> 00:16:22,419 We no longer want to protect all the 453 00:16:22,420 --> 00:16:23,420 data, 454 00:16:24,140 --> 00:16:25,969 because that's way too expensive, 455 00:16:25,970 --> 00:16:28,199 right?
Otherwise, we would run again 456 00:16:28,200 --> 00:16:30,379 into this high overhead burden. So 457 00:16:30,380 --> 00:16:32,029 instead of protecting all the data that 458 00:16:32,030 --> 00:16:34,669 is out there, let us just focus 459 00:16:34,670 --> 00:16:36,889 on a small subset of data and 460 00:16:36,890 --> 00:16:39,379 protect that small subset of data. 461 00:16:39,380 --> 00:16:41,869 So just a couple of code pointers 462 00:16:41,870 --> 00:16:44,449 on the heap and 463 00:16:44,450 --> 00:16:46,549 a couple of pointers 464 00:16:46,550 --> 00:16:48,859 and variables on the stack that we deem 465 00:16:48,860 --> 00:16:50,390 to be worth protecting. 466 00:16:52,540 --> 00:16:55,089 So instead of just enforcing 467 00:16:55,090 --> 00:16:57,309 probabilistic defenses 468 00:16:57,310 --> 00:16:59,409 for all the data or 469 00:16:59,410 --> 00:17:01,179 a strong defense mechanism for all the 470 00:17:01,180 --> 00:17:02,889 data with high overhead, 471 00:17:02,890 --> 00:17:05,019 we offer strong protection for 472 00:17:05,020 --> 00:17:07,959 a select subset of data. 473 00:17:07,960 --> 00:17:10,059 And in addition to that, we 474 00:17:10,060 --> 00:17:12,219 have a very different attacker 475 00:17:12,220 --> 00:17:14,439 model from other defense mechanisms. 476 00:17:14,440 --> 00:17:16,659 We assume that the attacker may modify 477 00:17:16,660 --> 00:17:18,848 any unprotected data that is out there 478 00:17:18,849 --> 00:17:21,009 and the attacker can freely write to any 479 00:17:21,010 --> 00:17:22,358 of the data that we don't really care 480 00:17:22,359 --> 00:17:23,359 about. 481 00:17:25,230 --> 00:17:27,179 So instead of protecting everything a 482 00:17:27,180 --> 00:17:29,579 little, we protect 483 00:17:29,580 --> 00:17:31,680 a small set of data completely. 484 00:17:33,590 --> 00:17:34,850 And just to give you a
485 00:17:36,980 --> 00:17:38,029 sneak preview: 486 00:17:39,500 --> 00:17:41,359 we change 487 00:17:42,680 --> 00:17:45,079 the overhead numbers from complete 488 00:17:45,080 --> 00:17:47,089 memory safety, which faces one hundred 489 00:17:47,090 --> 00:17:48,410 and twenty percent overhead, 490 00:17:49,460 --> 00:17:52,409 to as low as 491 00:17:52,410 --> 00:17:54,599 only two to 492 00:17:54,600 --> 00:17:56,669 eight percent overhead if we 493 00:17:56,670 --> 00:17:58,829 protect only code pointers. 494 00:17:58,830 --> 00:18:00,929 So we focus our protection on 495 00:18:00,930 --> 00:18:03,269 code pointers and enforce strong memory 496 00:18:03,270 --> 00:18:04,679 safety for code pointers. 497 00:18:04,680 --> 00:18:06,959 So anything that is used 498 00:18:06,960 --> 00:18:09,059 in a control-flow decision 499 00:18:09,060 --> 00:18:10,140 at one point in time, 500 00:18:11,180 --> 00:18:12,969 through an indirect jump, through an indirect 501 00:18:12,970 --> 00:18:14,659 call or anything like that, will be 502 00:18:14,660 --> 00:18:17,389 protected by our defense mechanism. 503 00:18:17,390 --> 00:18:19,669 But we don't care about any of 504 00:18:19,670 --> 00:18:21,739 the data that is on the heap or 505 00:18:21,740 --> 00:18:23,779 on the stack. And we focus only on 506 00:18:23,780 --> 00:18:25,879 a small subset of 507 00:18:25,880 --> 00:18:27,949 code pointers that actually are 508 00:18:27,950 --> 00:18:29,899 used for control-flow decisions. 509 00:18:29,900 --> 00:18:32,689 And therefore, we can protect 510 00:18:32,690 --> 00:18:35,689 against control-flow hijack attacks. 511 00:18:35,690 --> 00:18:37,959 We don't protect against any 512 00:18:37,960 --> 00:18:38,589 data attack.
513 00:18:38,590 --> 00:18:40,999 So to 514 00:18:41,000 --> 00:18:43,099 actually enforce 515 00:18:43,100 --> 00:18:45,559 this, we had to come up 516 00:18:45,560 --> 00:18:47,869 with a set of special techniques 517 00:18:47,870 --> 00:18:50,179 that transform your program 518 00:18:50,180 --> 00:18:51,469 into a protected program. 519 00:18:53,180 --> 00:18:55,249 One of the core ideas that we have 520 00:18:55,250 --> 00:18:57,319 is something that's been out there, 521 00:18:57,320 --> 00:18:59,509 that has been used 522 00:18:59,510 --> 00:19:01,639 in networking for decades but hasn't been 523 00:19:01,640 --> 00:19:03,449 used in software engineering: we 524 00:19:03,450 --> 00:19:05,959 separate 525 00:19:05,960 --> 00:19:09,019 the control data, 526 00:19:09,020 --> 00:19:11,479 the control plane, from the data plane. 527 00:19:11,480 --> 00:19:13,759 So instead of having just one single view 528 00:19:13,760 --> 00:19:14,760 of memory, 529 00:19:15,860 --> 00:19:17,839 we separate the program memory into two 530 00:19:17,840 --> 00:19:20,119 different views. On one 531 00:19:20,120 --> 00:19:22,189 hand we have the regular memory with 532 00:19:22,190 --> 00:19:24,499 all the buffers, all the pointers 533 00:19:24,500 --> 00:19:26,329 and so on. And on the other hand, 534 00:19:27,470 --> 00:19:28,490 we have safe memory, 535 00:19:29,630 --> 00:19:32,749 and our safe memory contains 536 00:19:32,750 --> 00:19:34,940 code pointers and code pointers only. 537 00:19:36,220 --> 00:19:37,749 The regular memory, on the other hand, 538 00:19:37,750 --> 00:19:39,160 contains all other data. 539 00:19:43,870 --> 00:19:44,870 So. 540 00:19:48,000 --> 00:19:49,339 It looks quite bad, right? 541 00:19:52,730 --> 00:19:55,069 But I guess you get the gist. 542 00:19:55,070 --> 00:19:56,070 So,
543 00:19:58,590 --> 00:20:00,299 safe memory contains all the safe code 544 00:20:00,300 --> 00:20:02,069 pointers, regular memory contains 545 00:20:02,070 --> 00:20:04,229 everything else; 546 00:20:04,230 --> 00:20:06,569 just the memory locations 547 00:20:06,570 --> 00:20:08,460 for the function pointers are not used. 548 00:20:10,930 --> 00:20:13,149 The memory layout itself is unchanged. 549 00:20:13,150 --> 00:20:15,219 So in the place where 550 00:20:15,220 --> 00:20:17,409 a code pointer was before, there will 551 00:20:17,410 --> 00:20:19,300 just be an unused block. 552 00:20:21,390 --> 00:20:24,479 In the control plane, any 553 00:20:24,480 --> 00:20:26,639 memory location is either a code 554 00:20:26,640 --> 00:20:28,889 pointer or null, and 555 00:20:28,890 --> 00:20:31,259 we can impose this memory view using 556 00:20:31,260 --> 00:20:32,819 some compiler-based technique 557 00:20:35,750 --> 00:20:38,089 and enforce that the safe 558 00:20:38,090 --> 00:20:40,339 memory region only contains code 559 00:20:40,340 --> 00:20:42,519 pointers and nothing else using 560 00:20:42,520 --> 00:20:44,749 our transformation, but more on 561 00:20:44,750 --> 00:20:46,579 that later. So for now, just remember 562 00:20:46,580 --> 00:20:48,109 that we split the memory view. 563 00:20:48,110 --> 00:20:49,819 We have a safe memory view that contains 564 00:20:49,820 --> 00:20:52,069 code pointers, nothing else, 565 00:20:52,070 --> 00:20:53,989 or null values. 566 00:20:53,990 --> 00:20:56,269 And the regular memory is the rest of the 567 00:20:56,270 --> 00:20:57,270 data. 568 00:21:00,570 --> 00:21:02,699 So on the stack, we have 569 00:21:02,700 --> 00:21:05,489 a different kind of technique. 570 00:21:05,490 --> 00:21:07,859 We split the stack, similar to 571 00:21:07,860 --> 00:21:10,079 the heap memory, 572 00:21:10,080 --> 00:21:12,959 into a safe stack and 573 00:21:12,960 --> 00:21:13,960 a regular stack.
574 00:21:18,870 --> 00:21:20,939 Has it been like this since the 575 00:21:20,940 --> 00:21:21,940 beginning? 576 00:21:22,480 --> 00:21:24,519 It's like a third of the slide cut off. 577 00:21:27,780 --> 00:21:30,030 Let me try to do something. 578 00:21:32,210 --> 00:21:33,210 That's annoying. 579 00:21:43,790 --> 00:21:44,790 Uh. 580 00:21:45,810 --> 00:21:46,810 So much for 581 00:21:48,170 --> 00:21:49,170 TVI out. 582 00:21:50,620 --> 00:21:52,589 OK, let's try to continue. 583 00:21:54,570 --> 00:21:56,279 So we've got the safe stack and the 584 00:21:56,280 --> 00:21:58,979 regular stack. For the safe stack, 585 00:21:58,980 --> 00:22:01,799 we add an additional 586 00:22:01,800 --> 00:22:04,169 compiler instrumentation pass that looks 587 00:22:04,170 --> 00:22:06,269 at all the local variables on 588 00:22:06,270 --> 00:22:08,849 a stack frame, 589 00:22:08,850 --> 00:22:11,489 and everything that we can prove 590 00:22:11,490 --> 00:22:13,619 in our instrumentation pass 591 00:22:13,620 --> 00:22:14,789 is safely accessed, 592 00:22:14,790 --> 00:22:16,619 so any local variable that is 593 00:22:16,620 --> 00:22:19,049 safe, is pushed to the safe stack, 594 00:22:19,050 --> 00:22:21,119 while all the other variables that 595 00:22:21,120 --> 00:22:23,549 we cannot prove are safe are 596 00:22:23,550 --> 00:22:25,049 pushed to the regular stack. 597 00:22:25,050 --> 00:22:27,179 So stuff that could cause something to be 598 00:22:27,180 --> 00:22:29,549 unsafe is either some pointer 599 00:22:29,550 --> 00:22:31,709 arithmetic or if it escapes 600 00:22:31,710 --> 00:22:32,710 the local. 601 00:22:34,780 --> 00:22:35,859 What the heck is this?
602 00:22:40,510 --> 00:22:42,730 So if it escapes the local stack frame or, 603 00:22:44,200 --> 00:22:46,269 yeah, whatever, we push it or we 604 00:22:46,270 --> 00:22:48,129 keep it on the regular stack, and we 605 00:22:48,130 --> 00:22:50,379 assume that the attacker can 606 00:22:50,380 --> 00:22:52,839 corrupt anything on the regular stack. 607 00:22:52,840 --> 00:22:53,840 So we 608 00:22:55,360 --> 00:22:57,459 ensure complete safety on the safe 609 00:22:57,460 --> 00:22:58,659 stack. 610 00:22:58,660 --> 00:23:00,549 We don't give any guarantees on the 611 00:23:00,550 --> 00:23:01,550 regular stack. 612 00:23:02,690 --> 00:23:05,509 So if you look at our small code snippet, 613 00:23:05,510 --> 00:23:07,759 the variable r and 614 00:23:07,760 --> 00:23:09,949 the return address would be pushed onto 615 00:23:09,950 --> 00:23:12,079 the safe stack and 616 00:23:12,080 --> 00:23:14,179 the buffer would be pushed 617 00:23:14,180 --> 00:23:15,829 onto the unsafe stack, and the attacker 618 00:23:15,830 --> 00:23:17,899 could corrupt some of the other stuff on 619 00:23:17,900 --> 00:23:18,900 the unsafe stack. 620 00:23:20,310 --> 00:23:22,649 So using this principle, 621 00:23:22,650 --> 00:23:24,779 we can ensure that the attacker 622 00:23:24,780 --> 00:23:26,879 can only corrupt data that we 623 00:23:26,880 --> 00:23:29,009 are not interested in protecting, and 624 00:23:29,010 --> 00:23:31,199 we can decide how much of the data 625 00:23:31,200 --> 00:23:32,279 we want to protect. 626 00:23:32,280 --> 00:23:34,199 And in our case, we protect anything 627 00:23:34,200 --> 00:23:36,369 that's a code pointer, that 628 00:23:36,370 --> 00:23:38,189 contains code pointers or is used in 629 00:23:38,190 --> 00:23:40,109 control-flow decisions, or 630 00:23:41,490 --> 00:23:44,039 that is proven to be safe 631 00:23:44,040 --> 00:23:45,180 on a stack frame.
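The safe stack split described here can be illustrated with a minimal C sketch. It is not the snippet from the slides and not the compiler pass itself: the names `unsafe_stack`, `unsafe_alloca`, `unsafe_release` and `copy_and_count` are hypothetical, and the separate region is emulated by hand to show which locals would move. The scalar `r` is only accessed by name, so it would stay on the safe stack next to the return address, while the buffer has its address taken and is placed in the separate unsafe region.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the separate unsafe stack region. */
static unsigned char unsafe_stack[4096];
static size_t unsafe_top = 0;

/* Reserve n bytes on the emulated unsafe stack; NULL when it is full. */
static void *unsafe_alloca(size_t n) {
    if (n > sizeof unsafe_stack - unsafe_top) return NULL;
    void *p = &unsafe_stack[unsafe_top];
    unsafe_top += n;
    return p;
}

/* Release the n most recently reserved bytes. */
static void unsafe_release(size_t n) { unsafe_top -= n; }

/* Counts the letter 'a' in at most 15 bytes of input.
   r: only accessed by name, provably safe, stays on the native stack.
   buf: its address is passed to memcpy, so it lives in the unsafe region. */
int copy_and_count(const char *input, size_t len) {
    int r = 0;
    char *buf = unsafe_alloca(16);
    if (buf == NULL) return -1;
    size_t n = len < 15 ? len : 15;
    memcpy(buf, input, n);
    buf[n] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == 'a') r++;
    }
    unsafe_release(16);
    return r;
}
```

In this sketch an overflow of `buf` could only reach other bytes of `unsafe_stack`, never `r` or the return address, which is the property the talk attributes to the safe stack.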
632 00:23:46,720 --> 00:23:48,789 So if you look from above 633 00:23:48,790 --> 00:23:50,979 at the memory 634 00:23:50,980 --> 00:23:53,079 layout, we have 635 00:23:53,080 --> 00:23:55,629 two areas of memory: we have safe memory 636 00:23:55,630 --> 00:23:57,999 for code pointers and regular memory 637 00:23:58,000 --> 00:24:00,269 for all the other pointers. For 638 00:24:00,270 --> 00:24:02,529 the safe memory, we ensure 639 00:24:02,530 --> 00:24:04,929 and guarantee that all the accesses 640 00:24:04,930 --> 00:24:05,930 are safe. 641 00:24:07,000 --> 00:24:09,189 But for the regular memory, they 642 00:24:09,190 --> 00:24:11,379 are fast, but we don't give 643 00:24:11,380 --> 00:24:12,460 any other guarantees. 644 00:24:14,460 --> 00:24:16,409 In between safe memory and regular 645 00:24:16,410 --> 00:24:18,719 memory, we use hardware-based 646 00:24:18,720 --> 00:24:20,999 instruction-level isolation, using 647 00:24:21,000 --> 00:24:23,219 some technique like segmentation 648 00:24:23,220 --> 00:24:25,499 or some other form 649 00:24:25,500 --> 00:24:26,519 of blinding. 650 00:24:27,920 --> 00:24:30,409 The regular memory contains the regular heap, 651 00:24:30,410 --> 00:24:32,659 stack frames and the read-only 652 00:24:32,660 --> 00:24:35,269 code regions, and the safe memory 653 00:24:35,270 --> 00:24:37,549 contains likewise the safe heap, 654 00:24:38,750 --> 00:24:41,609 the safe stack, or 655 00:24:41,610 --> 00:24:43,459 the safe stacks of the individual 656 00:24:43,460 --> 00:24:44,460 threads. 657 00:24:46,170 --> 00:24:47,170 So.
658 00:24:48,360 --> 00:24:50,459 Now that I've shown the 659 00:24:50,460 --> 00:24:52,739 basic overview of how we separate 660 00:24:52,740 --> 00:24:54,839 code pointers, and it's actually 661 00:24:54,840 --> 00:24:56,459 just like in the name, code-pointer 662 00:24:56,460 --> 00:24:58,799 separation, that all the code pointers 663 00:24:58,800 --> 00:25:00,389 are in a completely different memory 664 00:25:00,390 --> 00:25:02,849 space, how much protection 665 00:25:02,850 --> 00:25:04,649 does that give us? 666 00:25:04,650 --> 00:25:06,779 Let's look at how we can 667 00:25:06,780 --> 00:25:09,449 attack code-pointer separation. 668 00:25:09,450 --> 00:25:11,219 We have a small C program here. 669 00:25:12,300 --> 00:25:14,009 It looks very similar to the 670 00:25:14,010 --> 00:25:15,689 motivating example I used in the 671 00:25:15,690 --> 00:25:18,179 beginning, with a slight difference. 672 00:25:18,180 --> 00:25:20,399 So instead of just assigning a function 673 00:25:20,400 --> 00:25:22,709 pointer, we assign 674 00:25:22,710 --> 00:25:24,539 a function pointer through a struct. 675 00:25:24,540 --> 00:25:26,009 So let's assume that the function pointer is 676 00:25:26,010 --> 00:25:28,919 somewhere in the struct and we have a 677 00:25:28,920 --> 00:25:31,139 doubly indirect reference 678 00:25:31,140 --> 00:25:32,140 to the function pointer. 679 00:25:37,820 --> 00:25:40,279 Again, we have 680 00:25:40,280 --> 00:25:42,319 the opportunity for the attacker to 681 00:25:42,320 --> 00:25:45,409 overflow the buffer, and 682 00:25:45,410 --> 00:25:47,479 we know that the attacker cannot corrupt 683 00:25:47,480 --> 00:25:49,699 the function pointer itself because 684 00:25:49,700 --> 00:25:51,199 this function pointer is in a completely 685 00:25:51,200 --> 00:25:53,029 different memory space and therefore 686 00:25:53,030 --> 00:25:55,339 protected.
So the attacker, using this 687 00:25:55,340 --> 00:25:58,069 simple write, which has a different type, 688 00:25:58,070 --> 00:25:59,869 and our type-based analysis ensures that 689 00:25:59,870 --> 00:26:01,039 the function pointer is in the other 690 00:26:01,040 --> 00:26:03,169 memory space, the attacker cannot overwrite 691 00:26:03,170 --> 00:26:04,699 the function pointer itself. 692 00:26:04,700 --> 00:26:06,019 But what the attacker can do: the 693 00:26:06,020 --> 00:26:08,569 attacker can overwrite 694 00:26:08,570 --> 00:26:11,119 the struct pointer and 695 00:26:11,120 --> 00:26:13,579 let it point to somewhere else in memory 696 00:26:13,580 --> 00:26:15,739 where possibly there might 697 00:26:15,740 --> 00:26:17,749 be some other function pointer lying 698 00:26:17,750 --> 00:26:19,009 around. 699 00:26:19,010 --> 00:26:21,349 So 700 00:26:21,350 --> 00:26:22,759 if you think back to that different 701 00:26:22,760 --> 00:26:24,699 memory view, the 702 00:26:24,700 --> 00:26:26,799 code-pointer memory view contains only code 703 00:26:26,800 --> 00:26:28,359 pointers or null, 704 00:26:29,470 --> 00:26:31,270 and whenever a 705 00:26:32,560 --> 00:26:34,739 region is freed, we clear 706 00:26:34,740 --> 00:26:36,549 the code pointers in addition to that. 707 00:26:36,550 --> 00:26:38,709 So this memory view only contains 708 00:26:38,710 --> 00:26:40,119 the code pointers that are currently 709 00:26:40,120 --> 00:26:42,219 active and in use. 710 00:26:42,220 --> 00:26:44,679 But using this kind of technique, 711 00:26:44,680 --> 00:26:46,899 the attacker can redirect it to some other 712 00:26:46,900 --> 00:26:49,359 code pointer, which is 713 00:26:49,360 --> 00:26:51,669 null or a pointer to another function. 714 00:26:51,670 --> 00:26:53,020 But we'll see what 715 00:26:54,400 --> 00:26:56,529 the defense mechanism in the 716 00:26:56,530 --> 00:26:57,789 end is like.
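The doubly indirect case described here can be sketched in a few lines of C. This is an illustrative assumption, not the program from the slides: `struct obj`, `foo`, `bar`, `dispatch` and `simulate_redirect` are hypothetical names, and the attacker's write is simulated by a plain assignment to the struct pointer. The point is that the function pointers themselves are never modified; only the regular data pointer that leads to them changes, so the call lands on another valid code pointer.

```c
typedef int (*handler_fn)(int);

/* A struct that holds a code pointer, as in the talk's example. */
struct obj { handler_fn fn; };

static int foo(int x) { return x + 1; }
static int bar(int x) { return x * 2; }

static struct obj a = { foo };
static struct obj b = { bar };

/* Doubly indirect call: load the struct pointer, then the code pointer. */
int dispatch(struct obj **slot, int x) {
    return (*slot)->fn(x);
}

/* Calls through the struct pointer before and after it is redirected.
   The assignment p = &b stands in for a memory corruption bug that
   overwrites the struct pointer, which is regular, unprotected data. */
int simulate_redirect(int x) {
    struct obj *p = &a;
    int before = dispatch(&p, x);  /* reaches foo */
    p = &b;                        /* simulated attacker write */
    int after = dispatch(&p, x);   /* now reaches bar, still a valid function */
    return before * 100 + after;
}
```

The redirected call can only reach a function whose address already sits in a live code-pointer slot, which matches the limitation the speaker goes on to describe.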
717 00:26:57,790 --> 00:27:00,759 So using the compiler-based technique, 718 00:27:00,760 --> 00:27:02,919 in our analysis phase we 719 00:27:02,920 --> 00:27:05,049 identify all code pointer 720 00:27:05,050 --> 00:27:07,419 accesses through some static 721 00:27:07,420 --> 00:27:09,549 type analysis and redirect 722 00:27:09,550 --> 00:27:11,589 them to the different memory view. 723 00:27:14,150 --> 00:27:16,309 The two memory views, again, are separated 724 00:27:16,310 --> 00:27:18,229 using instruction-level isolation, for 725 00:27:18,230 --> 00:27:20,419 example, segmentation on x86 726 00:27:20,420 --> 00:27:22,969 or blinding on x64, 727 00:27:22,970 --> 00:27:24,469 and we give a bunch of security 728 00:27:24,470 --> 00:27:25,939 guarantees. 729 00:27:25,940 --> 00:27:28,009 So using the separation, 730 00:27:28,010 --> 00:27:29,899 and you have to think about it for a 731 00:27:29,900 --> 00:27:32,419 while until you understand the security 732 00:27:32,420 --> 00:27:33,420 guarantees, 733 00:27:34,020 --> 00:27:36,239 this separation ensures 734 00:27:36,240 --> 00:27:38,879 that the attacker cannot forge 735 00:27:38,880 --> 00:27:42,239 any new code pointers. 736 00:27:42,240 --> 00:27:43,559 So using a 737 00:27:44,700 --> 00:27:46,979 memory write using a 738 00:27:46,980 --> 00:27:49,229 non-pointer type, 739 00:27:49,230 --> 00:27:51,479 the attacker can never write to that safe 740 00:27:51,480 --> 00:27:52,480 memory view. 741 00:27:54,630 --> 00:27:56,699 We guarantee that any pointer 742 00:27:56,700 --> 00:27:59,219 that is written into that safe 743 00:27:59,220 --> 00:28:02,279 memory view is either an immediate, 744 00:28:02,280 --> 00:28:04,419 like a fixed value, 745 00:28:04,420 --> 00:28:06,489 or assigned from 746 00:28:06,490 --> 00:28:07,490 another code pointer.
747 00:28:08,590 --> 00:28:10,299 Therefore, an attacker can never 748 00:28:10,300 --> 00:28:12,249 construct a code pointer to some other 749 00:28:12,250 --> 00:28:13,250 location in memory. 750 00:28:14,470 --> 00:28:16,689 What the attacker can do, using 751 00:28:16,690 --> 00:28:19,539 that doubly indirect access or 752 00:28:19,540 --> 00:28:21,309 going through multiple indirections, is 753 00:28:21,310 --> 00:28:23,609 replace existing functions through these 754 00:28:23,610 --> 00:28:26,559 indirections. So, for example, a foobar 755 00:28:27,620 --> 00:28:29,689 function can be turned 756 00:28:29,690 --> 00:28:31,939 into a foobaz function, but it must 757 00:28:31,940 --> 00:28:32,649 be a code pointer. 758 00:28:32,650 --> 00:28:33,829 It must exist. 759 00:28:33,830 --> 00:28:36,019 The current memory region must be valid 760 00:28:36,020 --> 00:28:37,459 at that point in time. 761 00:28:37,460 --> 00:28:39,709 And a successful attack is very 762 00:28:39,710 --> 00:28:40,729 unlikely for this. 763 00:28:41,830 --> 00:28:44,049 So what we basically 764 00:28:44,050 --> 00:28:46,839 did is we took all the code pointers 765 00:28:46,840 --> 00:28:49,239 that are used in a program, grouped 766 00:28:49,240 --> 00:28:50,240 them together, 767 00:28:51,350 --> 00:28:53,419 and made them look out for 768 00:28:53,420 --> 00:28:55,699 each other, and we protect them against 769 00:28:55,700 --> 00:28:57,979 modifications from non-code pointers 770 00:28:57,980 --> 00:28:59,809 that are out there and thereby protect 771 00:28:59,810 --> 00:29:01,969 them against other 772 00:29:01,970 --> 00:29:03,329 modifications. 773 00:29:03,330 --> 00:29:04,330 Also, 774 00:29:05,720 --> 00:29:07,939 if you think about it, all 775 00:29:07,940 --> 00:29:08,940 the
776 00:29:10,130 --> 00:29:12,619 memory safety violations are usually 777 00:29:12,620 --> 00:29:14,899 not in pointer arithmetic 778 00:29:14,900 --> 00:29:17,179 on code pointers, but 779 00:29:17,180 --> 00:29:18,889 in pointer arithmetic on some other 780 00:29:18,890 --> 00:29:21,379 buffer or some other input. 781 00:29:21,380 --> 00:29:23,329 So it is very unlikely that this will be 782 00:29:23,330 --> 00:29:24,330 exploitable. 783 00:29:25,620 --> 00:29:26,620 So. 784 00:29:28,350 --> 00:29:30,459 This is the simple code-pointer 785 00:29:30,460 --> 00:29:32,669 separation, but can we do better 786 00:29:32,670 --> 00:29:33,989 than this? 787 00:29:33,990 --> 00:29:35,519 Remember, we talked about memory safety 788 00:29:35,520 --> 00:29:37,499 before, and here I didn't talk about 789 00:29:37,500 --> 00:29:38,500 memory safety yet. 790 00:29:39,890 --> 00:29:42,019 So CPI, 791 00:29:42,020 --> 00:29:44,419 code-pointer integrity, takes 792 00:29:44,420 --> 00:29:46,999 code-pointer separation as a baseline 793 00:29:47,000 --> 00:29:49,369 and, in addition to that, enforces 794 00:29:49,370 --> 00:29:51,529 memory safety for the code pointers. 795 00:29:51,530 --> 00:29:53,599 So the sensitive pointers 796 00:29:53,600 --> 00:29:55,669 that we protect are the code pointers 797 00:29:55,670 --> 00:29:58,129 and, in addition to that, any pointers 798 00:29:58,130 --> 00:30:00,259 used to access sensitive 799 00:30:00,260 --> 00:30:02,329 pointers. So everything that's used in 800 00:30:02,330 --> 00:30:04,609 the dereference chain automatically 801 00:30:04,610 --> 00:30:06,439 becomes protected as well. 802 00:30:06,440 --> 00:30:08,569 In addition to that, we enforce 803 00:30:08,570 --> 00:30:11,059 bounds checks, as I've discussed before, 804 00:30:11,060 --> 00:30:12,060 for all the 805 00:30:13,160 --> 00:30:16,009 sensitive 806 00:30:16,010 --> 00:30:17,329 pointers that we identified.
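A bounds check on a sensitive pointer, as mentioned here, can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the instrumentation the prototype emits: `struct checked_ptr`, `checked_load` and `demo_checked_call` are hypothetical names, the bounds metadata is stored next to the pointer for simplicity, and a violation is reported with a return code where a real implementation would raise an exception.

```c
#include <stddef.h>

typedef int (*code_ptr)(int);

/* Hypothetical metadata attached to a sensitive pointer. */
struct checked_ptr {
    code_ptr *value;  /* the pointer the program dereferences */
    code_ptr *lower;  /* first valid element */
    code_ptr *upper;  /* one past the last valid element */
};

/* Loads the code pointer behind p into *out when value lies in
   [lower, upper). Returns 0 on success and -1 on a bounds violation. */
int checked_load(const struct checked_ptr *p, code_ptr *out) {
    if (p->value < p->lower || p->value >= p->upper) return -1;
    *out = *p->value;
    return 0;
}

static int inc(int x) { return x + 1; }
static int dbl(int x) { return x * 2; }

/* Calls entry `index` of a two-entry table through a checked pointer.
   index must be 0, 1 or 2; 2 is the one-past-the-end position and is
   rejected by the check. */
int demo_checked_call(size_t index, int arg, int *result) {
    static code_ptr table[2] = { inc, dbl };
    struct checked_ptr p = { table + index, table, table + 2 };
    code_ptr fn;
    if (checked_load(&p, &fn) != 0) return -1;
    *result = fn(arg);
    return 0;
}
```

The check runs before the dereference, so a pointer that has been moved outside its object never yields a code pointer to call.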
807 00:30:19,880 --> 00:30:22,789 We could identify individual instances 808 00:30:22,790 --> 00:30:25,069 of sensitive 809 00:30:25,070 --> 00:30:26,809 pointers, but we do an over- 810 00:30:26,810 --> 00:30:29,869 approximation, which 811 00:30:29,870 --> 00:30:32,209 does not lower security but 812 00:30:32,210 --> 00:30:34,219 might increase overhead: instead of 813 00:30:34,220 --> 00:30:36,079 protecting individual instances, we 814 00:30:36,080 --> 00:30:38,659 protect types that we identify 815 00:30:38,660 --> 00:30:40,879 and deem to be sensitive. 816 00:30:40,880 --> 00:30:42,890 So it's an approximation which is safe. 817 00:30:45,150 --> 00:30:47,429 This approximation only affects 818 00:30:47,430 --> 00:30:50,009 performance. We measured on SPEC 819 00:30:50,010 --> 00:30:51,629 CPU2006, which is a standard 820 00:30:51,630 --> 00:30:53,250 benchmark, that using 821 00:30:54,450 --> 00:30:56,759 this full transformation, roughly six 822 00:30:56,760 --> 00:30:59,249 point five percent of all 823 00:30:59,250 --> 00:31:01,889 the memory accesses 824 00:31:01,890 --> 00:31:03,569 are accesses to sensitive data. 825 00:31:05,180 --> 00:31:07,429 So let's see what our example looks 826 00:31:07,430 --> 00:31:08,869 like if we have 827 00:31:10,100 --> 00:31:12,289 bounds checks in addition to the 828 00:31:12,290 --> 00:31:13,290 separation.
829 00:31:15,960 --> 00:31:17,849 Just like in the example of memory 830 00:31:17,850 --> 00:31:20,069 safety before, we add 831 00:31:20,070 --> 00:31:22,529 the lower and upper bounds 832 00:31:22,530 --> 00:31:24,599 and we execute an additional 833 00:31:24,600 --> 00:31:26,789 check whether the 834 00:31:26,790 --> 00:31:29,219 bounds are still valid, and 835 00:31:29,220 --> 00:31:30,719 before the attacker could 836 00:31:32,670 --> 00:31:34,739 overwrite the function pointer or 837 00:31:34,740 --> 00:31:37,169 redirect a pointer, 838 00:31:37,170 --> 00:31:39,269 an exception would be triggered whenever 839 00:31:39,270 --> 00:31:40,619 that is dereferenced. 840 00:31:43,600 --> 00:31:45,969 So if we compare code-pointer 841 00:31:45,970 --> 00:31:48,039 integrity and code-pointer 842 00:31:48,040 --> 00:31:50,049 separation, what additional kind of 843 00:31:50,050 --> 00:31:51,640 security guarantees do we get? 844 00:31:52,780 --> 00:31:54,909 First, our defense mechanisms 845 00:31:54,910 --> 00:31:57,009 separate sensitive pointers from 846 00:31:57,010 --> 00:31:59,529 regular data. Both of them 847 00:31:59,530 --> 00:32:01,659 use a type-based 848 00:32:01,660 --> 00:32:02,829 static analysis. 849 00:32:03,850 --> 00:32:06,219 But where they differ is what 850 00:32:06,220 --> 00:32:08,449 sensitive pointers are. For 851 00:32:08,450 --> 00:32:09,819 code-pointer separation, 852 00:32:09,820 --> 00:32:12,399 sensitive pointers are code pointers only. 853 00:32:12,400 --> 00:32:14,019 For code-pointer integrity,
854 00:32:14,020 --> 00:32:16,389 we add, in addition to that, pointers 855 00:32:16,390 --> 00:32:18,039 to sensitive pointers, which is a 856 00:32:18,040 --> 00:32:20,259 recursive definition and kind of follows 857 00:32:20,260 --> 00:32:22,599 the dereference chain, and 858 00:32:22,600 --> 00:32:24,969 therefore protects anything that 859 00:32:24,970 --> 00:32:27,039 points to 860 00:32:27,040 --> 00:32:29,739 code pointers directly or indirectly 861 00:32:29,740 --> 00:32:30,740 through anything. 862 00:32:33,670 --> 00:32:36,039 We guarantee that accessing 863 00:32:36,040 --> 00:32:37,720 sensitive pointers is safe. 864 00:32:38,770 --> 00:32:40,999 So we use instruction-level 865 00:32:41,000 --> 00:32:43,089 granularity separation, and 866 00:32:43,090 --> 00:32:44,679 for code-pointer integrity, in 867 00:32:44,680 --> 00:32:47,019 addition, we use runtime bounds checks 868 00:32:47,020 --> 00:32:48,020 on top of that. 869 00:32:50,100 --> 00:32:52,469 Also, on the other hand, accessing 870 00:32:52,470 --> 00:32:54,539 regular data 871 00:32:54,540 --> 00:32:56,189 is fast. 872 00:32:56,190 --> 00:32:58,799 So we don't impose any additional 873 00:32:58,800 --> 00:33:01,319 instructions when accessing 874 00:33:01,320 --> 00:33:03,209 other regular data that we don't want to 875 00:33:03,210 --> 00:33:05,399 protect, and this allows us to get 876 00:33:05,400 --> 00:33:06,599 very low overhead. 877 00:33:08,190 --> 00:33:10,379 So what kind of security 878 00:33:10,380 --> 00:33:11,670 guarantees do we have? 879 00:33:14,620 --> 00:33:17,319 For code-pointer integrity, 880 00:33:17,320 --> 00:33:19,719 we offer formally guaranteed 881 00:33:19,720 --> 00:33:21,909 protection, and if you are interested in 882 00:33:21,910 --> 00:33:24,129 the more theoretical aspects, we do have 883 00:33:24,130 --> 00:33:25,569 a formal proof in our paper.
884 00:33:26,980 --> 00:33:29,169 All in all, if we enforce 885 00:33:29,170 --> 00:33:31,179 memory safety for the code pointers and 886 00:33:31,180 --> 00:33:32,799 the sensitive pointers that we identify 887 00:33:32,800 --> 00:33:34,989 on top of it, we run 888 00:33:34,990 --> 00:33:37,269 into eight point four to 889 00:33:37,270 --> 00:33:39,249 ten point five percent overhead for 890 00:33:39,250 --> 00:33:41,469 roughly six point five percent of memory 891 00:33:41,470 --> 00:33:43,959 accesses, which is 892 00:33:43,960 --> 00:33:46,089 almost deployable and definitely 893 00:33:46,090 --> 00:33:47,059 deployable 894 00:33:47,060 --> 00:33:49,419 if you are interested in 895 00:33:49,420 --> 00:33:51,519 protecting against specific attacks or 896 00:33:51,520 --> 00:33:53,530 in different security contexts. 897 00:33:54,820 --> 00:33:56,319 For code-pointer separation, 898 00:33:59,380 --> 00:34:02,139 we offer strong protection in practice, 899 00:34:02,140 --> 00:34:04,989 so it will be very hard 900 00:34:04,990 --> 00:34:07,389 for an attacker to find 901 00:34:07,390 --> 00:34:08,390 uh, 902 00:34:09,170 --> 00:34:11,569 an exploitable 903 00:34:11,570 --> 00:34:14,209 condition. We don't say it's impossible, 904 00:34:14,210 --> 00:34:16,189 but it will be very hard. 905 00:34:16,190 --> 00:34:18,709 It's definitely much stronger than 906 00:34:18,710 --> 00:34:20,809 any of the defense mechanisms that exist 907 00:34:20,810 --> 00:34:23,089 currently. It offers complete protection 908 00:34:23,090 --> 00:34:25,279 against anything, that is, 909 00:34:25,280 --> 00:34:27,379 against any return-oriented 910 00:34:27,380 --> 00:34:29,809 programming attacks, and strong 911 00:34:29,810 --> 00:34:32,238 separation for 912 00:34:32,239 --> 00:34:34,609 any code pointer that is on the heap, and adds 913 00:34:34,610 --> 00:34:36,349 almost negligible overhead.
914 00:34:36,350 --> 00:34:38,539 We have zero point five to one point nine 915 00:34:38,540 --> 00:34:40,939 percent overhead: zero point five percent for SPEC 916 00:34:40,940 --> 00:34:42,919 CPU and one point nine percent average 917 00:34:42,920 --> 00:34:46,519 overhead for the Phoronix benchmarks. 918 00:34:46,520 --> 00:34:48,709 And we protect roughly two point five 919 00:34:48,710 --> 00:34:50,090 percent of memory accesses. 920 00:34:53,929 --> 00:34:56,149 What we can also do is 921 00:34:56,150 --> 00:34:57,150 we can only 922 00:34:58,370 --> 00:35:00,649 use the safe stack, which 923 00:35:00,650 --> 00:35:02,300 protects the 924 00:35:04,130 --> 00:35:06,229 return instruction pointers on the stack, 925 00:35:06,230 --> 00:35:08,629 and we offer full protection 926 00:35:08,630 --> 00:35:10,909 against return-oriented programming attacks at 927 00:35:10,910 --> 00:35:12,829 negligible overhead. 928 00:35:12,830 --> 00:35:14,779 So we've got different levels of defense 929 00:35:14,780 --> 00:35:17,779 mechanisms that you can use 930 00:35:17,780 --> 00:35:19,849 however you feel like and 931 00:35:19,850 --> 00:35:21,949 depending on how 932 00:35:21,950 --> 00:35:24,079 big your budget is. 933 00:35:24,080 --> 00:35:26,239 So if you want strong 934 00:35:26,240 --> 00:35:28,429 deterministic guarantees, use code-pointer 935 00:35:28,430 --> 00:35:31,100 integrity. If you want to protect, 936 00:35:33,230 --> 00:35:34,639 if you want strong protection, use 937 00:35:34,640 --> 00:35:35,869 code-pointer separation. 938 00:35:35,870 --> 00:35:37,909 If you just want to be sure that 939 00:35:37,910 --> 00:35:40,009 return-oriented programming is not possible, 940 00:35:40,010 --> 00:35:41,060 use the safe stack. 941 00:35:43,880 --> 00:35:46,339 So enough of the 942 00:35:46,340 --> 00:35:47,869 design of the system.
943 00:35:47,870 --> 00:35:49,430 Let's talk about the implementation. 944 00:35:51,290 --> 00:35:53,719 We implemented it on top of Clang, 945 00:35:53,720 --> 00:35:56,359 where we collect type information, 946 00:35:56,360 --> 00:35:58,549 and all the transformations 947 00:35:58,550 --> 00:36:00,979 are then done in an LLVM 948 00:36:00,980 --> 00:36:03,319 instrumentation pass for either 949 00:36:03,320 --> 00:36:05,599 CPI, CPS, or the safe 950 00:36:05,600 --> 00:36:06,600 stack. 951 00:36:08,030 --> 00:36:10,279 We have additional runtime support 952 00:36:10,280 --> 00:36:12,619 for the safe stack 953 00:36:12,620 --> 00:36:14,839 and the safe heap and all the management 954 00:36:14,840 --> 00:36:16,459 functions on top of that. 955 00:36:16,460 --> 00:36:19,579 And currently we support x64 956 00:36:19,580 --> 00:36:21,769 and x86, although 957 00:36:21,770 --> 00:36:23,419 x86 is a bit shaky. 958 00:36:24,620 --> 00:36:26,509 For the supported operating systems, 959 00:36:26,510 --> 00:36:28,939 we support Mac OS X, FreeBSD 960 00:36:28,940 --> 00:36:29,940 and Linux. 961 00:36:31,330 --> 00:36:34,179 The current status of the implementation: 962 00:36:34,180 --> 00:36:36,069 this is a research prototype after all, 963 00:36:36,070 --> 00:36:37,839 right? Nothing is perfect. 964 00:36:37,840 --> 00:36:39,699 We have great support for code-pointer 965 00:36:39,700 --> 00:36:42,099 integrity, separation and 966 00:36:42,100 --> 00:36:43,839 safe stack on Mac OS X and 967 00:36:43,840 --> 00:36:46,179 FreeBSD, for example, 968 00:36:46,180 --> 00:36:48,759 and 969 00:36:48,760 --> 00:36:50,899 fairly good support 970 00:36:50,900 --> 00:36:52,269 for the other architectures. 971 00:36:52,270 --> 00:36:54,129 But we are working on improving the 972 00:36:54,130 --> 00:36:55,130 code quality. 973 00:36:56,230 --> 00:36:58,689 We are currently in the process 974 00:36:58,690 --> 00:37:00,969 of upstreaming the patches.
975 00:37:00,970 --> 00:37:03,039 And as you might imagine, there's a 976 00:37:03,040 --> 00:37:04,749 whole bunch of changes out there. 977 00:37:04,750 --> 00:37:06,819 So 978 00:37:06,820 --> 00:37:09,369 we built a couple of chunks 979 00:37:09,370 --> 00:37:11,529 that we are upstreaming one after 980 00:37:11,530 --> 00:37:13,659 the other. And currently we are working 981 00:37:13,660 --> 00:37:15,789 on the safe stack because 982 00:37:15,790 --> 00:37:17,379 that's the logical 983 00:37:17,380 --> 00:37:18,460 first choice. 984 00:37:19,570 --> 00:37:21,609 It has lower overhead than the stack 985 00:37:21,610 --> 00:37:23,589 canaries that are 986 00:37:23,590 --> 00:37:25,929 currently used. It offers full 987 00:37:25,930 --> 00:37:28,329 deterministic protection and it's fully 988 00:37:28,330 --> 00:37:30,129 compatible with anything that's out 989 00:37:30,130 --> 00:37:32,199 there. It's easy to add and gives you 990 00:37:32,200 --> 00:37:33,579 very strong protection. 991 00:37:33,580 --> 00:37:36,579 So we are hoping to finish 992 00:37:36,580 --> 00:37:38,649 integrating the safe stack soon, and 993 00:37:38,650 --> 00:37:39,849 we will then continue 994 00:37:39,850 --> 00:37:41,889 with the CPS and CPI patches. 995 00:37:43,150 --> 00:37:45,339 You can fork the current version 996 00:37:45,340 --> 00:37:47,379 on GitHub. It's all out there. 997 00:37:47,380 --> 00:37:50,110 Feel free to give it a try, and 998 00:37:51,460 --> 00:37:53,739 we are happy to hear back from you 999 00:37:53,740 --> 00:37:55,809 if you find bugs, if you find 1000 00:37:55,810 --> 00:37:56,810 other things. 1001 00:37:57,610 --> 00:37:59,769 Also, we are currently working on 1002 00:37:59,770 --> 00:38:01,989 a code review for code-pointer separation and 1003 00:38:01,990 --> 00:38:03,789 code-pointer integrity. 1004 00:38:03,790 --> 00:38:05,919 There's still a couple 1005 00:38:05,920 --> 00:38:06,999 of bugs to be fixed.
1006 00:38:08,320 --> 00:38:10,119 You can still play with the 1007 00:38:10,120 --> 00:38:11,120 prototype. 1008 00:38:11,980 --> 00:38:14,319 It's out there, download 1009 00:38:14,320 --> 00:38:16,149 it, try it out. 1010 00:38:16,150 --> 00:38:18,759 We will definitely release more packages 1011 00:38:18,760 --> 00:38:20,919 and 1012 00:38:20,920 --> 00:38:22,059 updated sources soon. 1013 00:38:23,710 --> 00:38:24,850 And also, 1014 00:38:26,170 --> 00:38:27,170 we worked, 1015 00:38:28,180 --> 00:38:30,039 or we obviously 1016 00:38:31,530 --> 00:38:33,449 ran into some problems this fall, and 1017 00:38:33,450 --> 00:38:35,849 there are some changes to super complex 1018 00:38:35,850 --> 00:38:37,949 build systems. For example, when we 1019 00:38:37,950 --> 00:38:40,109 worked on FreeBSD, we had 1020 00:38:40,110 --> 00:38:42,329 to adapt some of the larger 1021 00:38:42,330 --> 00:38:45,089 makefiles to actually get it to run. 1022 00:38:45,090 --> 00:38:47,460 So you might ask, is it practical? 1023 00:38:48,940 --> 00:38:51,179 So we compiled 1024 00:38:51,180 --> 00:38:53,609 the complete FreeBSD userspace 1025 00:38:53,610 --> 00:38:55,239 with our protection. 1026 00:38:55,240 --> 00:38:57,209 And in addition to that, more than one 1027 00:38:57,210 --> 00:38:59,489 hundred, so more 1028 00:38:59,490 --> 00:39:01,799 than 100 packages, with strong protection 1029 00:39:01,800 --> 00:39:02,800 guarantees, 1030 00:39:04,280 --> 00:39:06,619 where we can now guarantee that 1031 00:39:06,620 --> 00:39:08,839 control-flow hijack attacks are no longer 1032 00:39:08,840 --> 00:39:11,029 possible, at the fairly 1033 00:39:11,030 --> 00:39:13,309 low overhead that 1034 00:39:13,310 --> 00:39:14,310 you will run into.
1035 00:39:16,370 --> 00:39:18,499 So now it's up to you to do your 1036 00:39:18,500 --> 00:39:20,779 part. Look at the code, 1037 00:39:20,780 --> 00:39:22,909 try to port some of the more complex 1038 00:39:22,910 --> 00:39:24,979 makefiles, try to find bugs 1039 00:39:24,980 --> 00:39:27,469 in our implementation, 1040 00:39:27,470 --> 00:39:29,659 help us get 1041 00:39:29,660 --> 00:39:31,879 the word out there, get 1042 00:39:31,880 --> 00:39:33,709 distributions to actually use these 1043 00:39:33,710 --> 00:39:35,849 patches, let the 1044 00:39:35,850 --> 00:39:38,059 attacker-accessible software be compiled 1045 00:39:38,060 --> 00:39:39,060 using a very 1046 00:39:40,300 --> 00:39:41,859 deterministic and strong defense 1047 00:39:41,860 --> 00:39:42,849 mechanism. 1048 00:39:42,850 --> 00:39:45,129 And if we compile our 1049 00:39:45,130 --> 00:39:47,199 software that is 1050 00:39:47,200 --> 00:39:49,239 reachable for attackers, reachable from the 1051 00:39:49,240 --> 00:39:50,559 Internet, using these strong defense 1052 00:39:50,560 --> 00:39:52,749 mechanisms, we can stop control-flow 1053 00:39:52,750 --> 00:39:53,750 hijack attacks.
1054 00:39:55,240 --> 00:39:56,240 So let me conclude. 1055 00:39:57,400 --> 00:39:59,169 Code pointer integrity and code pointer 1056 00:39:59,170 --> 00:40:01,209 separation offer strong protection 1057 00:40:01,210 --> 00:40:03,429 against control-flow hijack attacks, 1058 00:40:03,430 --> 00:40:05,529 and the key insight that we had here 1059 00:40:05,530 --> 00:40:07,839 is that we offer memory safety 1060 00:40:07,840 --> 00:40:10,179 for code pointers only, which allows 1061 00:40:10,180 --> 00:40:12,369 us to limit the data that 1062 00:40:12,370 --> 00:40:14,349 we actually secure, bringing down the 1063 00:40:14,350 --> 00:40:15,999 overhead from almost one hundred and 1064 00:40:16,000 --> 00:40:18,249 twenty percent to 1065 00:40:18,250 --> 00:40:20,379 less than two to eight 1066 00:40:20,380 --> 00:40:22,509 percent on average, 1067 00:40:22,510 --> 00:40:24,639 depending on whether you 1068 00:40:24,640 --> 00:40:26,349 use code pointer separation or code pointer 1069 00:40:26,350 --> 00:40:28,929 integrity, which is easily deployable 1070 00:40:28,930 --> 00:40:31,089 in practice and can be used on 1071 00:40:31,090 --> 00:40:33,369 a very wide scale without impacting 1072 00:40:33,370 --> 00:40:35,169 the runtime performance of current 1073 00:40:35,170 --> 00:40:36,170 software. 1074 00:40:36,640 --> 00:40:38,529 We do have a working prototype which 1075 00:40:38,530 --> 00:40:40,689 supports unmodified C 1076 00:40:40,690 --> 00:40:42,879 and C++ at very, very 1077 00:40:42,880 --> 00:40:45,009 low overhead in practice. 1078 00:40:45,010 --> 00:40:46,629 Upstreaming of our patches is in 1079 00:40:47,800 --> 00:40:48,909 progress. 1080 00:40:48,910 --> 00:40:51,219 The safe stack should be available soon.
1081 00:40:51,220 --> 00:40:52,929 You can fork the code on 1082 00:40:52,930 --> 00:40:55,179 GitHub, you can read 1083 00:40:55,180 --> 00:40:57,549 the paper on our homepage, 1084 00:40:57,550 --> 00:40:59,769 and we'll be happy to hear 1085 00:40:59,770 --> 00:41:00,770 from you 1086 00:41:01,360 --> 00:41:02,889 if you have questions, if you find 1087 00:41:02,890 --> 00:41:05,169 bugs, if you want to audit the code or 1088 00:41:05,170 --> 00:41:07,389 help in any way. 1089 00:41:07,390 --> 00:41:09,699 And if we continue like that, 1090 00:41:09,700 --> 00:41:11,829 we'll go on and 1091 00:41:11,830 --> 00:41:14,289 we'll be able to understand the bugs. 1092 00:41:14,290 --> 00:41:16,419 And then in the end, we can get rid 1093 00:41:16,420 --> 00:41:18,489 of them and protect against them. 1094 00:41:18,490 --> 00:41:20,679 And with that, I would like to close my talk 1095 00:41:20,680 --> 00:41:22,779 and I'm happy to answer any questions. 1096 00:41:22,780 --> 00:41:23,780 Thank you. 1097 00:41:32,910 --> 00:41:34,820 Thank you very much, Mathias Payer. 1098 00:41:36,030 --> 00:41:37,679 As you heard, he is open to questions 1099 00:41:37,680 --> 00:41:39,809 now. We have four microphones in the 1100 00:41:39,810 --> 00:41:41,999 room. There's microphone one, 1101 00:41:42,000 --> 00:41:44,309 microphone two, three, 1102 00:41:44,310 --> 00:41:45,899 and number four. 1103 00:41:45,900 --> 00:41:47,459 You can go up to any of them. 1104 00:41:47,460 --> 00:41:49,079 I'll just pick them, 1105 00:41:49,080 --> 00:41:50,939 well, one after another. 1106 00:41:50,940 --> 00:41:52,769 And I will start with microphone two, 1107 00:41:52,770 --> 00:41:53,770 please. 1108 00:41:54,230 --> 00:41:55,920 Thank you for a fascinating 1109 00:41:57,780 --> 00:41:58,780 talk on the 1110 00:42:00,180 --> 00:42:01,299 approach here.
1111 00:42:01,300 --> 00:42:03,539 I just got one question. You discussed 1112 00:42:03,540 --> 00:42:06,329 extensively the overhead 1113 00:42:06,330 --> 00:42:08,339 in run time. 1114 00:42:08,340 --> 00:42:10,440 What's the overhead in memory footprint? 1115 00:42:11,460 --> 00:42:13,649 Because normally the 1116 00:42:13,650 --> 00:42:15,719 page size will be the limit for your 1117 00:42:15,720 --> 00:42:18,179 hardware protection. 1118 00:42:18,180 --> 00:42:20,399 So the overhead is not 1119 00:42:20,400 --> 00:42:21,509 too bad. 1120 00:42:21,510 --> 00:42:23,849 Um, naively, you 1121 00:42:23,850 --> 00:42:25,409 would just shadow the memory space, 1122 00:42:25,410 --> 00:42:26,410 right? 1123 00:42:27,230 --> 00:42:29,579 What we do, we've 1124 00:42:29,580 --> 00:42:31,739 got several implementations where 1125 00:42:31,740 --> 00:42:33,879 we store 1126 00:42:33,880 --> 00:42:35,760 the code pointers and the metadata. 1127 00:42:37,320 --> 00:42:39,569 You can use a hash map, a compacted hash 1128 00:42:39,570 --> 00:42:41,639 map, some form of array or 1129 00:42:41,640 --> 00:42:42,679 some other data structure. 1130 00:42:42,680 --> 00:42:44,939 It is just protected and moved 1131 00:42:44,940 --> 00:42:45,940 to the side. 1132 00:42:48,600 --> 00:42:50,729 I'm not 100 percent sure if we put 1133 00:42:50,730 --> 00:42:53,069 numbers into the paper, 1134 00:42:53,070 --> 00:42:55,439 but they are 1135 00:42:55,440 --> 00:42:58,469 in the low digits. 1136 00:42:58,470 --> 00:43:00,569 So there's not too much memory overhead 1137 00:43:00,570 --> 00:43:02,699 because, for one, there are only 1138 00:43:02,700 --> 00:43:04,049 very few code pointers. 1139 00:43:04,050 --> 00:43:05,699 So we don't need to store an excessive 1140 00:43:05,700 --> 00:43:06,929 amount of data.
1141 00:43:06,930 --> 00:43:09,119 Also, we can 1142 00:43:09,120 --> 00:43:11,069 store them in very compact 1143 00:43:11,070 --> 00:43:13,289 representations using some form 1144 00:43:13,290 --> 00:43:15,089 of hash map or something like that, 1145 00:43:15,090 --> 00:43:16,949 which gives us the illusion of the full 1146 00:43:16,950 --> 00:43:17,950 memory space. 1147 00:43:19,620 --> 00:43:20,539 OK. 1148 00:43:20,540 --> 00:43:21,959 All right. Thank you. 1149 00:43:21,960 --> 00:43:23,529 Microphone one, please. 1150 00:43:23,530 --> 00:43:24,779 Yes. 1151 00:43:24,780 --> 00:43:27,209 In your motivating example, 1152 00:43:27,210 --> 00:43:28,799 the biggest chunk of potentially 1153 00:43:28,800 --> 00:43:30,329 vulnerable C code 1154 00:43:31,470 --> 00:43:32,759 was the Linux kernel. 1155 00:43:32,760 --> 00:43:35,069 And I was wondering 1156 00:43:35,070 --> 00:43:36,779 whether this kind of protection is 1157 00:43:36,780 --> 00:43:39,179 practical in kernel space 1158 00:43:39,180 --> 00:43:40,650 and with full 1159 00:43:43,200 --> 00:43:44,999 permissions and hardware. 1160 00:43:45,000 --> 00:43:47,099 That's a very good question. 1161 00:43:47,100 --> 00:43:49,439 It's also a very hard question to answer. 1162 00:43:49,440 --> 00:43:50,969 So for the C code, 1163 00:43:52,200 --> 00:43:53,819 we'd be perfectly fine. 1164 00:43:53,820 --> 00:43:55,949 Unfortunately, the Linux kernel 1165 00:43:55,950 --> 00:43:58,409 contains a whole bunch of 1166 00:43:58,410 --> 00:44:00,809 assembly code and inline assembly code, 1167 00:44:00,810 --> 00:44:02,969 which is very, very hard to protect. 1168 00:44:02,970 --> 00:44:05,279 So we cannot give any guarantees for 1169 00:44:05,280 --> 00:44:07,019 inline assembly that is in the 1170 00:44:07,020 --> 00:44:08,020 source code.
1171 00:44:09,270 --> 00:44:11,339 We could imagine some form 1172 00:44:11,340 --> 00:44:13,949 of annotation-based 1173 00:44:13,950 --> 00:44:16,139 system where the programmer 1174 00:44:16,140 --> 00:44:18,629 has to identify assembly 1175 00:44:18,630 --> 00:44:20,699 or inline assembly sequences that modify 1176 00:44:20,700 --> 00:44:21,809 code pointers. 1177 00:44:21,810 --> 00:44:23,519 And if so, we could give the same 1178 00:44:23,520 --> 00:44:26,489 guarantees, but we would need that. 1179 00:44:26,490 --> 00:44:29,129 So our instrumentation pass 1180 00:44:29,130 --> 00:44:30,130 runs on top of 1181 00:44:31,320 --> 00:44:33,389 LLVM and does not have any type information 1182 00:44:33,390 --> 00:44:34,949 for the inline assembly code that you put 1183 00:44:34,950 --> 00:44:37,319 into the, uh, into the code. 1184 00:44:37,320 --> 00:44:39,779 So if you as a programmer would supply 1185 00:44:39,780 --> 00:44:41,909 additional annotations so that 1186 00:44:41,910 --> 00:44:43,829 we could reason about the inline assembly 1187 00:44:43,830 --> 00:44:46,079 code, we would be fine, but that would be 1188 00:44:46,080 --> 00:44:47,170 definitely ongoing work. 1189 00:44:48,600 --> 00:44:49,600 Thank you. 1190 00:44:50,340 --> 00:44:51,840 We have a question from the Internet. 1191 00:44:55,220 --> 00:44:57,319 Yes, thanks. There's still an 1192 00:44:57,320 --> 00:44:58,789 open question from the chat room. 1193 00:44:58,790 --> 00:45:00,769 The question is, how are the address 1194 00:45:00,770 --> 00:45:03,199 spaces separated, the pointers 1195 00:45:03,200 --> 00:45:04,489 from the code? 1196 00:45:04,490 --> 00:45:06,649 Yeah, that's a good question.
1197 00:45:06,650 --> 00:45:09,499 On x86, we use segmentation, 1198 00:45:09,500 --> 00:45:11,510 a segmentation register that we set up, 1199 00:45:12,560 --> 00:45:14,179 so we can easily enforce, 1200 00:45:15,230 --> 00:45:17,299 we can easily use hardware-enforced 1201 00:45:17,300 --> 00:45:19,429 separation. On 1202 00:45:19,430 --> 00:45:20,430 x64, 1203 00:45:22,400 --> 00:45:24,319 we can use a set of different 1204 00:45:24,320 --> 00:45:26,599 techniques depending on how much overhead 1205 00:45:26,600 --> 00:45:28,159 you're willing to pay. 1206 00:45:28,160 --> 00:45:29,269 You can use ASLR, 1207 00:45:30,650 --> 00:45:32,629 so you can use a randomization-based 1208 00:45:32,630 --> 00:45:34,819 approach and just 1209 00:45:34,820 --> 00:45:36,919 allocate your 1210 00:45:36,920 --> 00:45:39,109 safe region somewhere in memory that 1211 00:45:39,110 --> 00:45:41,299 is safe from the attacker, because in 1212 00:45:41,300 --> 00:45:43,399 a 64-bit address space, the address 1213 00:45:43,400 --> 00:45:45,319 space is big enough so that you 1214 00:45:45,320 --> 00:45:46,849 can hide it. And in addition to that, we 1215 00:45:46,850 --> 00:45:49,039 guarantee that in unsafe memory 1216 00:45:49,040 --> 00:45:50,959 or in attacker-accessible memory, there 1217 00:45:50,960 --> 00:45:53,029 will never be a pointer to 1218 00:45:53,030 --> 00:45:55,099 our, uh, our safe 1219 00:45:55,100 --> 00:45:55,609 memory. 1220 00:45:55,610 --> 00:45:58,699 Therefore, we are safe against, um, 1221 00:45:58,700 --> 00:46:00,259 information leaks. 1222 00:46:00,260 --> 00:46:02,359 Or if you are willing to pay two or 1223 00:46:02,360 --> 00:46:04,850 three percent overhead, you can 1224 00:46:06,350 --> 00:46:08,059 blind out a memory region.
1225 00:46:08,060 --> 00:46:10,309 So imagine that for every 1226 00:46:10,310 --> 00:46:12,679 memory access that you execute, 1227 00:46:12,680 --> 00:46:14,989 every memory read or every 1228 00:46:14,990 --> 00:46:15,769 memory write 1229 00:46:15,770 --> 00:46:17,839 that goes to unsafe memory, you 1230 00:46:17,840 --> 00:46:20,509 do an AND on the, 1231 00:46:20,510 --> 00:46:22,669 uh, on the actual address that 1232 00:46:22,670 --> 00:46:24,979 is used and thereby, 1233 00:46:24,980 --> 00:46:27,589 with a mask, blind out the bits 1234 00:46:27,590 --> 00:46:28,969 that point to the protected memory, 1235 00:46:30,170 --> 00:46:31,639 which costs you like two or three percent 1236 00:46:31,640 --> 00:46:32,640 overhead. 1237 00:46:33,760 --> 00:46:35,119 All right. Microphone two, please. 1238 00:46:36,410 --> 00:46:38,179 At the beginning of the talk, you 1239 00:46:38,180 --> 00:46:39,799 mentioned the Heartbleed bug. 1240 00:46:39,800 --> 00:46:43,069 And as far as I could tell, 1241 00:46:43,070 --> 00:46:45,169 none of your security measures would help 1242 00:46:45,170 --> 00:46:46,249 against it. 1243 00:46:46,250 --> 00:46:47,449 Yeah. 1244 00:46:47,450 --> 00:46:48,450 Um, 1245 00:46:49,850 --> 00:46:53,029 so the Heartbleed bug is 1246 00:46:53,030 --> 00:46:55,639 an information leak, 1247 00:46:55,640 --> 00:46:57,589 it's basically a data leak. 1248 00:46:57,590 --> 00:46:59,839 Right. So you could 1249 00:46:59,840 --> 00:47:01,759 protect against the Heartbleed bug by 1250 00:47:01,760 --> 00:47:03,439 extending the protection that we 1251 00:47:03,440 --> 00:47:05,659 currently have to other data types 1252 00:47:05,660 --> 00:47:06,679 as well. 1253 00:47:06,680 --> 00:47:08,779 So you've seen that we basically 1254 00:47:08,780 --> 00:47:11,279 run the type-based analysis and 1255 00:47:11,280 --> 00:47:12,769 we currently
identify 1256 00:47:12,770 --> 00:47:14,689 everything that's like a code pointer or 1257 00:47:14,690 --> 00:47:16,759 used like a code pointer or 1258 00:47:16,760 --> 00:47:17,839 anywhere in the chain 1259 00:47:17,840 --> 00:47:19,819 when a code pointer is dereferenced. 1260 00:47:19,820 --> 00:47:22,159 But there's nothing that stops us from 1261 00:47:22,160 --> 00:47:24,349 adding additional data and increasing 1262 00:47:24,350 --> 00:47:26,569 the protection for that additional data. 1263 00:47:26,570 --> 00:47:28,609 So instead of just protecting code 1264 00:47:28,610 --> 00:47:30,739 pointers, we can select other 1265 00:47:30,740 --> 00:47:31,759 data as well. 1266 00:47:31,760 --> 00:47:33,919 And sensitive data 1267 00:47:33,920 --> 00:47:36,349 types like private keys 1268 00:47:36,350 --> 00:47:38,509 would be very good candidates for 1269 00:47:38,510 --> 00:47:40,849 additional scrutiny 1270 00:47:40,850 --> 00:47:43,009 and inclusion into the set 1271 00:47:43,010 --> 00:47:44,689 of sensitive pointers. 1272 00:47:44,690 --> 00:47:46,639 So that's definitely ongoing future work. 1273 00:47:48,640 --> 00:47:49,640 All right. 1274 00:47:50,810 --> 00:47:52,909 All right, you've mentioned that 1275 00:47:52,910 --> 00:47:55,789 you've evaluated the performance 1276 00:47:55,790 --> 00:47:58,069 with the benchmark. Have you also 1277 00:47:58,070 --> 00:48:00,109 evaluated the functionality? 1278 00:48:00,110 --> 00:48:01,369 You've mentioned that you've compiled 1279 00:48:01,370 --> 00:48:03,649 FreeBSD, the whole FreeBSD set, 1280 00:48:03,650 --> 00:48:04,699 and a couple of hundred packages. 1281 00:48:04,700 --> 00:48:06,409 But have you also run them successfully 1282 00:48:06,410 --> 00:48:06,949 at all? 1283 00:48:06,950 --> 00:48:09,529 Yeah.
So SPEC CPU is 1284 00:48:09,530 --> 00:48:11,869 a self-validating benchmark that checks 1285 00:48:11,870 --> 00:48:13,999 if the, uh, if 1286 00:48:14,000 --> 00:48:16,069 the code runs correctly and verifies 1287 00:48:16,070 --> 00:48:18,679 that it runs correctly, and so does Phoronix. 1288 00:48:18,680 --> 00:48:20,659 So we did run the Phoronix benchmarks on 1289 00:48:20,660 --> 00:48:22,819 top of FreeBSD, which is a big 1290 00:48:22,820 --> 00:48:25,129 package of benchmarks 1291 00:48:25,130 --> 00:48:26,929 that self-verify results and ensure 1292 00:48:26,930 --> 00:48:27,930 that they are correct. 1293 00:48:29,120 --> 00:48:31,459 And then you've mentioned 1294 00:48:31,460 --> 00:48:33,439 that you have runtime support. 1295 00:48:33,440 --> 00:48:35,549 What does that entail and how portable 1296 00:48:35,550 --> 00:48:36,319 is it? 1297 00:48:36,320 --> 00:48:38,539 So we do need runtime support to set 1298 00:48:38,540 --> 00:48:40,639 up all the data structures. We 1299 00:48:40,640 --> 00:48:42,079 need the runtime support to set up the 1300 00:48:42,080 --> 00:48:44,329 safe memory and so on, which is basically 1301 00:48:44,330 --> 00:48:46,489 like, do you know, compiler-rt 1302 00:48:46,490 --> 00:48:47,449 in LLVM? 1303 00:48:47,450 --> 00:48:49,879 It's like a library that is linked into 1304 00:48:49,880 --> 00:48:52,129 or included into 1305 00:48:52,130 --> 00:48:54,919 any executable that is compiled. 1306 00:48:54,920 --> 00:48:57,049 It's like GCC has 1307 00:48:57,050 --> 00:48:58,079 its own library and so on. 1308 00:48:58,080 --> 00:49:00,439 It just contains a set of start functions 1309 00:49:00,440 --> 00:49:03,079 and so on that set up the process image. 1310 00:49:03,080 --> 00:49:05,179 When you execute a program, 1311 00:49:05,180 --> 00:49:06,349 it doesn't start at main.
1312 00:49:06,350 --> 00:49:07,969 There's a whole bunch of other code that 1313 00:49:07,970 --> 00:49:10,039 is executed beforehand, and this adds 1314 00:49:10,040 --> 00:49:11,269 to that stuff that is executed 1315 00:49:11,270 --> 00:49:13,219 beforehand. It's a standard compiler 1316 00:49:13,220 --> 00:49:15,409 technique to include 1317 00:49:15,410 --> 00:49:17,719 some initialization functions and stuff 1318 00:49:17,720 --> 00:49:18,979 like that. 1319 00:49:18,980 --> 00:49:19,980 All right, thanks. Thanks. 1320 00:49:22,430 --> 00:49:24,139 All right, there's another question from 1321 00:49:24,140 --> 00:49:25,140 the Internet. 1322 00:49:25,850 --> 00:49:27,619 Yes, thank you. 1323 00:49:27,620 --> 00:49:30,349 Another question is, what about 1324 00:49:30,350 --> 00:49:32,539 applications with runtime-generated code, 1325 00:49:32,540 --> 00:49:33,679 e.g. 1326 00:49:33,680 --> 00:49:35,449 browsers? Can you protect these? 1327 00:49:38,120 --> 00:49:40,219 Yeah, so that's a good question. 1328 00:49:41,630 --> 00:49:43,999 At the beginning, in the assumptions, 1329 00:49:44,000 --> 00:49:45,019 I put in that 1330 00:49:46,790 --> 00:49:48,889 there's no self- 1331 00:49:48,890 --> 00:49:51,109 modifying code, basically, and 1332 00:49:51,110 --> 00:49:53,179 browsers use 1333 00:49:53,180 --> 00:49:55,129 a lot of, basically, JIT compilers and 1334 00:49:55,130 --> 00:49:58,009 therefore recompile code all the time. 1335 00:49:58,010 --> 00:50:01,279 There are two answers to this question. 1336 00:50:01,280 --> 00:50:03,349 One of them is, if you lift 1337 00:50:03,350 --> 00:50:05,509 the compiler, the just-in-time compiler 1338 00:50:05,510 --> 00:50:07,489 itself, into the trusted computing 1339 00:50:07,490 --> 00:50:09,739 base and you instrument and give 1340 00:50:09,740 --> 00:50:11,959 the compiler access to 1341 00:50:13,340 --> 00:50:15,499 all the support functions, the compiler 1342 00:50:15,500 --> 00:50:17,839 can produce safe code as well.
1343 00:50:17,840 --> 00:50:20,179 But the drawback, obviously, is 1344 00:50:20,180 --> 00:50:22,439 that the compiler itself then is inside 1345 00:50:22,440 --> 00:50:25,219 the trusted computing base 1346 00:50:25,220 --> 00:50:27,169 during the time the process or the 1347 00:50:27,170 --> 00:50:29,300 program is executed and might be 1348 00:50:30,710 --> 00:50:32,629 attackable if there are bugs in the 1349 00:50:32,630 --> 00:50:34,999 compiler, obviously. The other option 1350 00:50:35,000 --> 00:50:37,039 or the other answer is that most of the 1351 00:50:37,040 --> 00:50:39,109 time, even 1352 00:50:39,110 --> 00:50:40,819 if we don't give any strong guarantees, 1353 00:50:40,820 --> 00:50:43,009 the compilers, the just-in- 1354 00:50:43,010 --> 00:50:45,739 time compilers that we use in browsers 1355 00:50:45,740 --> 00:50:48,499 compile a memory-safe language. 1356 00:50:48,500 --> 00:50:50,719 And inside this memory-safe language 1357 00:50:50,720 --> 00:50:53,929 the attacker doesn't have access 1358 00:50:53,930 --> 00:50:56,029 to 1359 00:50:56,030 --> 00:50:58,749 code pointers directly, so 1360 00:50:58,750 --> 00:51:00,949 it should 1361 00:51:00,950 --> 00:51:02,689 be safe in most cases. 1362 00:51:02,690 --> 00:51:04,819 But obviously, it's not a perfect answer. 1363 00:51:04,820 --> 00:51:05,929 That's why we excluded it. 1364 00:51:05,930 --> 00:51:08,089 That's why we excluded 1365 00:51:08,090 --> 00:51:10,789 JIT compilers or just-in-time 1366 00:51:10,790 --> 00:51:12,639 generated code from the attack model. 1367 00:51:13,730 --> 00:51:14,899 You can come up with some defense 1368 00:51:14,900 --> 00:51:16,399 mechanism, but you'll have to verify the 1369 00:51:16,400 --> 00:51:17,900 compiler. That's the short answer. 1370 00:51:18,920 --> 00:51:20,419 All right. Thank you.
1371 00:51:20,420 --> 00:51:21,949 There are still three questions left at 1372 00:51:21,950 --> 00:51:23,150 microphone two, I guess. 1373 00:51:24,520 --> 00:51:26,989 So you said that 1374 00:51:26,990 --> 00:51:28,649 this doesn't work in all cases. 1375 00:51:28,650 --> 00:51:30,709 So it works in almost every case, 1376 00:51:30,710 --> 00:51:32,859 but it doesn't work for some cases. 1377 00:51:32,860 --> 00:51:35,389 You already explained that this doesn't 1378 00:51:35,390 --> 00:51:37,189 work for inline assembly, 1379 00:51:37,190 --> 00:51:38,269 stuff like that. 1380 00:51:38,270 --> 00:51:39,919 What are the other cases where 1381 00:51:39,920 --> 00:51:40,920 this doesn't work? 1382 00:51:42,080 --> 00:51:44,149 Um, well, 1383 00:51:44,150 --> 00:51:46,879 not necessarily not work, but, 1384 00:51:46,880 --> 00:51:49,309 um, for code pointer 1385 00:51:49,310 --> 00:51:51,439 integrity, there are 1386 00:51:51,440 --> 00:51:53,749 sometimes very bizarre casts. Like 1387 00:51:53,750 --> 00:51:55,939 C code, if you look 1388 00:51:55,940 --> 00:51:57,679 into what is actually compiled and 1389 00:51:57,680 --> 00:51:59,609 written out there, then C code becomes 1390 00:51:59,610 --> 00:52:00,919 very, very ugly. 1391 00:52:00,920 --> 00:52:03,199 And according 1392 00:52:03,200 --> 00:52:05,629 to the C language 1393 00:52:05,630 --> 00:52:07,639 specification, you're allowed to cast 1394 00:52:07,640 --> 00:52:10,129 everything into void* or char*. 1395 00:52:10,130 --> 00:52:12,649 And if you have a bunch of these casts 1396 00:52:12,650 --> 00:52:14,629 from code pointers to void* and back and 1397 00:52:14,630 --> 00:52:17,239 forth and all that, you end up protecting 1398 00:52:17,240 --> 00:52:19,429 all the pointers and then 1399 00:52:19,430 --> 00:52:21,499 you end up with a fairly high amount of 1400 00:52:21,500 --> 00:52:22,579 overhead.
1401 00:52:22,580 --> 00:52:24,889 So it actually happens a lot 1402 00:52:24,890 --> 00:52:27,439 that some pointers 1403 00:52:27,440 --> 00:52:29,599 are cast into a char pointer and 1404 00:52:29,600 --> 00:52:31,729 then back into a struct pointer that then 1405 00:52:31,730 --> 00:52:32,989 contains a code pointer. 1406 00:52:32,990 --> 00:52:34,639 And if that happens, you might end up 1407 00:52:34,640 --> 00:52:36,859 protecting all the pointers, 1408 00:52:36,860 --> 00:52:38,479 which then results in fairly high 1409 00:52:38,480 --> 00:52:39,480 overhead. 1410 00:52:40,420 --> 00:52:42,819 And so you do have high overhead 1411 00:52:42,820 --> 00:52:45,009 as a drawback. But what we don't 1412 00:52:45,010 --> 00:52:47,319 support is if something bad happens from 1413 00:52:47,320 --> 00:52:48,249 the hardware. 1414 00:52:48,250 --> 00:52:50,709 So if you would protect the VMS or 1415 00:52:50,710 --> 00:52:52,989 if you have access to tables, 1416 00:52:52,990 --> 00:52:55,419 let's say in the kernel, 1417 00:52:55,420 --> 00:52:58,479 we don't protect data memory. 1418 00:52:58,480 --> 00:53:00,669 So if the attacker 1419 00:53:00,670 --> 00:53:02,589 writes into the table or something like 1420 00:53:02,590 --> 00:53:03,939 that, you could come up with very weird 1421 00:53:03,940 --> 00:53:06,729 stuff. But in userspace 1422 00:53:06,730 --> 00:53:07,730 you should be safe. 1423 00:53:12,370 --> 00:53:14,799 So it looks like, from the talk, that 1424 00:53:14,800 --> 00:53:16,509 all the work here is being done in 1425 00:53:16,510 --> 00:53:17,439 software. 1426 00:53:17,440 --> 00:53:19,239 I was wondering, do you see any 1427 00:53:19,240 --> 00:53:21,699 opportunities for hardware acceleration 1428 00:53:21,700 --> 00:53:23,859 in, say, the CPU to reduce some 1429 00:53:23,860 --> 00:53:24,759 of the overhead?
1430 00:53:24,760 --> 00:53:25,760 You're talking about 1431 00:53:27,220 --> 00:53:29,199 MPX, or anything else you can come up 1432 00:53:29,200 --> 00:53:30,249 with? 1433 00:53:30,250 --> 00:53:31,570 Yeah, we actually looked at it. 1434 00:53:33,310 --> 00:53:35,619 It's a bit of a bummer that it's not 1435 00:53:35,620 --> 00:53:37,059 available in real hardware yet. 1436 00:53:38,380 --> 00:53:40,539 But then again, Intel kind 1437 00:53:40,540 --> 00:53:42,759 of advertises it as a 1438 00:53:42,760 --> 00:53:44,709 debugging feature. 1439 00:53:44,710 --> 00:53:46,989 There are some hints that it'll 1440 00:53:46,990 --> 00:53:48,550 have up to 40 percent overhead. 1441 00:53:49,570 --> 00:53:51,639 So we'll definitely look into it as 1442 00:53:51,640 --> 00:53:53,320 soon as it is out there in real hardware. 1443 00:53:54,400 --> 00:53:57,369 What we could really profit from is 1444 00:53:57,370 --> 00:53:59,889 some faster implementations for 1445 00:53:59,890 --> 00:54:01,480 the, uh, 1446 00:54:02,530 --> 00:54:04,479 for the additional tables that we 1447 00:54:04,480 --> 00:54:06,219 have, like for the bounds 1448 00:54:06,220 --> 00:54:07,299 information and so on. 1449 00:54:07,300 --> 00:54:09,609 And Intel tried to address that using 1450 00:54:09,610 --> 00:54:11,859 MPX. Something 1451 00:54:11,860 --> 00:54:14,199 following this line of work 1452 00:54:14,200 --> 00:54:16,089 could be interesting, or 1453 00:54:16,090 --> 00:54:18,129 some other kind of lookup that you can 1454 00:54:18,130 --> 00:54:19,719 speed up using some additional 1455 00:54:19,720 --> 00:54:21,159 instructions might be interesting as 1456 00:54:21,160 --> 00:54:22,160 well. 1457 00:54:22,930 --> 00:54:24,459 Thank you. Next question. 1458 00:54:24,460 --> 00:54:25,419 Hi.
1459 00:54:25,420 --> 00:54:27,670 I wanted to ask, what is the 1460 00:54:28,840 --> 00:54:31,149 interaction between this and 1461 00:54:31,150 --> 00:54:33,039 dynamic linking? For example, what 1462 00:54:33,040 --> 00:54:35,709 happens, or can you link 1463 00:54:35,710 --> 00:54:38,109 unsafe plugins 1464 00:54:38,110 --> 00:54:40,269 that are incompatible with this, blobs, 1465 00:54:40,270 --> 00:54:42,819 into your code and have 1466 00:54:42,820 --> 00:54:45,009 guarantees of safety? Or the other way 1467 00:54:45,010 --> 00:54:46,010 around, have a, 1468 00:54:47,440 --> 00:54:49,599 yes, user program that's not 1469 00:54:49,600 --> 00:54:51,699 compiled with this loading a 1470 00:54:51,700 --> 00:54:53,199 safe library? 1471 00:54:54,310 --> 00:54:56,409 So let's start with 1472 00:54:56,410 --> 00:54:57,429 the safe stack. 1473 00:54:57,430 --> 00:54:59,589 First, let's assume 1474 00:54:59,590 --> 00:55:00,880 we only use the safe stack. 1475 00:55:02,110 --> 00:55:05,559 If you branch into unprotected 1476 00:55:05,560 --> 00:55:07,719 code, we just don't give any 1477 00:55:07,720 --> 00:55:09,519 guarantees while you're executing 1478 00:55:09,520 --> 00:55:11,019 unprotected code. 1479 00:55:11,020 --> 00:55:12,969 As soon as you return to protected 1480 00:55:12,970 --> 00:55:14,919 code, it continues. 1481 00:55:14,920 --> 00:55:17,259 So unprotected code is perfectly 1482 00:55:17,260 --> 00:55:19,359 supported for the safe stack. 1483 00:55:19,360 --> 00:55:21,249 We don't give any guarantees while you're 1484 00:55:21,250 --> 00:55:23,529 executing the unprotected 1485 00:55:23,530 --> 00:55:25,699 code. For code pointer 1486 00:55:25,700 --> 00:55:28,570 separation or code pointer integrity, 1487 00:55:29,920 --> 00:55:31,149 it looks a bit different. 1488 00:55:31,150 --> 00:55:33,249 As long as you don't modify any 1489 00:55:33,250 --> 00:55:34,510 of the sensitive pointers, 1490 00:55:35,650 --> 00:55:37,179 you're fine.
1491 00:55:37,180 --> 00:55:40,269 If you modify the sensitive pointers, 1492 00:55:40,270 --> 00:55:41,270 then 1493 00:55:42,910 --> 00:55:44,619 some of the pointers could be out 1494 00:55:44,620 --> 00:55:44,979 of line. 1495 00:55:44,980 --> 00:55:47,109 So, out of line, like 1496 00:55:47,110 --> 00:55:48,759 you would miss some of the updates 1497 00:55:48,760 --> 00:55:50,979 because the shadow location 1498 00:55:50,980 --> 00:55:52,509 would be written that is not actually 1499 00:55:52,510 --> 00:55:54,159 used in regular memory. 1500 00:55:54,160 --> 00:55:55,959 And you would miss the update. 1501 00:55:55,960 --> 00:55:57,139 So, again, 1502 00:55:57,140 --> 00:55:59,319 there are no safety guarantees and 1503 00:55:59,320 --> 00:56:01,539 you could miss some of the updates. 1504 00:56:01,540 --> 00:56:03,519 But the unsafe code, 1505 00:56:05,650 --> 00:56:07,479 if you use segmentation, for example, 1506 00:56:07,480 --> 00:56:09,009 wouldn't be able to modify the code 1507 00:56:09,010 --> 00:56:10,959 pointers even when executing unsafe 1508 00:56:10,960 --> 00:56:11,960 code. 1509 00:56:12,610 --> 00:56:14,679 So which way around 1510 00:56:14,680 --> 00:56:17,260 is, like, safer, like 1511 00:56:18,280 --> 00:56:20,859 loading, 1512 00:56:20,860 --> 00:56:23,259 unsafe code loading your 1513 00:56:23,260 --> 00:56:25,719 safe library, or the other way around? 1514 00:56:25,720 --> 00:56:27,039 Well, it just depends, right? 1515 00:56:27,040 --> 00:56:28,959 While you're executing unsafe code, there 1516 00:56:28,960 --> 00:56:30,019 are no guarantees.
1517 00:56:30,020 --> 00:56:31,330 That's basically how it looks. 1518 00:56:32,440 --> 00:56:34,659 I think you could run 1519 00:56:34,660 --> 00:56:36,969 some form of binary instrumentation 1520 00:56:36,970 --> 00:56:39,099 on top of it and, kind of, 1521 00:56:39,100 --> 00:56:41,739 pay a very high overhead 1522 00:56:41,740 --> 00:56:43,029 for the unsafe code. 1523 00:56:43,030 --> 00:56:44,439 But it's not what you would 1524 00:56:44,440 --> 00:56:45,440 want. 1525 00:56:46,130 --> 00:56:48,519 All right. We have three questions and 1526 00:56:48,520 --> 00:56:50,499 five minutes left, or four minutes. 1527 00:56:50,500 --> 00:56:51,009 Go ahead. 1528 00:56:51,010 --> 00:56:53,079 OK, not exactly a question, 1529 00:56:53,080 --> 00:56:54,189 just a comment. 1530 00:56:54,190 --> 00:56:56,139 Casting function pointers to void 1531 00:56:56,140 --> 00:56:57,520 pointers is not allowed. 1532 00:56:59,320 --> 00:57:01,269 I don't have the C standard memorized, 1533 00:57:01,270 --> 00:57:03,009 but I'm pretty sure of that. 1534 00:57:03,010 --> 00:57:03,969 And ICC 1535 00:57:03,970 --> 00:57:06,129 and XL C both warn for that. 1536 00:57:06,130 --> 00:57:08,679 But GCC does not warn, for whatever reason. 1537 00:57:08,680 --> 00:57:10,969 OK, I don't know about that, but 1538 00:57:10,970 --> 00:57:13,239 casting to char and back is allowed. 1539 00:57:16,540 --> 00:57:18,819 Yeah, POSIX requires it, like 1540 00:57:18,820 --> 00:57:20,949 dlsym, where you get a 1541 00:57:20,950 --> 00:57:21,950 void pointer. 1542 00:57:24,270 --> 00:57:25,809 All right, thanks. Microphone two. 1543 00:57:25,810 --> 00:57:27,340 And then after that the Internet. 1544 00:57:28,930 --> 00:57:31,209 C was developed a few years 1545 00:57:31,210 --> 00:57:31,569 ago.
1546 00:57:31,570 --> 00:57:33,789 So why, again, if 1547 00:57:33,790 --> 00:57:36,969 C was developed 1548 00:57:36,970 --> 00:57:39,549 a few cent... no, 1549 00:57:39,550 --> 00:57:41,949 a few decades ago. 1550 00:57:41,950 --> 00:57:44,229 So why has it taken 1551 00:57:44,230 --> 00:57:46,719 so long to find a 1552 00:57:46,720 --> 00:57:47,979 simpler solution? 1553 00:57:49,450 --> 00:57:51,639 Well, the solution is not simple, right? 1554 00:57:51,640 --> 00:57:52,749 It's fairly complex. 1555 00:57:52,750 --> 00:57:54,909 If you run a whole bunch of 1556 00:57:54,910 --> 00:57:57,069 type-based analyses, and 1557 00:57:57,070 --> 00:57:59,289 all these type-based analyses have only 1558 00:57:59,290 --> 00:58:01,119 come up to speed in the last couple of 1559 00:58:01,120 --> 00:58:01,929 years. 1560 00:58:01,930 --> 00:58:04,209 And you've seen, like I talked about, 1561 00:58:04,210 --> 00:58:06,339 CCured, which was proposed 1562 00:58:06,340 --> 00:58:08,319 in the early 2000s, I think like 1563 00:58:08,320 --> 00:58:11,229 2002 or 2003 or so, 1564 00:58:11,230 --> 00:58:13,319 and there's been a lot of research going 1565 00:58:13,320 --> 00:58:15,779 on, and only now we have these frameworks 1566 00:58:15,780 --> 00:58:17,159 available that we can actually do these 1567 00:58:17,160 --> 00:58:19,529 heavy transformations 1568 00:58:19,530 --> 00:58:21,419 and then run additional optimization 1569 00:58:21,420 --> 00:58:23,309 passes on top of it to get the overhead 1570 00:58:23,310 --> 00:58:25,439 low enough so that it could actually be 1571 00:58:25,440 --> 00:58:26,579 usable in practice. 1572 00:58:27,750 --> 00:58:29,879 Also, people assumed that C programmers 1573 00:58:29,880 --> 00:58:31,739 would write safe code, but apparently 1574 00:58:31,740 --> 00:58:32,740 it's not like that. 1575 00:58:34,890 --> 00:58:36,779 All right. One last question from the 1576 00:58:36,780 --> 00:58:38,069 Internet.
1577 00:58:38,070 --> 00:58:39,899 I think you have to read it out. 1578 00:58:39,900 --> 00:58:41,129 Actually, I have no clue what it's 1579 00:58:41,130 --> 00:58:43,979 about. In the context of C++, 1580 00:58:43,980 --> 00:58:46,079 wouldn't all the pointers to instances of 1581 00:58:46,080 --> 00:58:48,749 classes containing virtual methods 1582 00:58:48,750 --> 00:58:51,059 be protected and, by extension, 1583 00:58:51,060 --> 00:58:53,309 all classes containing pointers 1584 00:58:53,310 --> 00:58:55,679 to those classes as members? 1585 00:58:57,690 --> 00:58:58,690 Should I read it again? 1586 00:59:00,820 --> 00:59:03,539 In the context of C++, 1587 00:59:03,540 --> 00:59:05,789 wouldn't all the pointers to instances 1588 00:59:05,790 --> 00:59:07,919 of classes containing 1589 00:59:07,920 --> 00:59:10,169 virtual methods be protected and, by 1590 00:59:10,170 --> 00:59:12,359 extension, all classes 1591 00:59:12,360 --> 00:59:14,459 containing pointers to those classes 1592 00:59:14,460 --> 00:59:15,389 as members? 1593 00:59:15,390 --> 00:59:16,390 Yeah. 1594 00:59:17,100 --> 00:59:19,179 So the answer is yes, the answer is yes. 1595 00:59:21,600 --> 00:59:22,879 All right, thank you very much. 1596 00:59:22,880 --> 00:59:23,880 My pleasure. 1597 00:59:26,490 --> 00:59:29,489 If there are no further questions... 1598 00:59:29,490 --> 00:59:31,589 All right. One last one, really quick, 1599 00:59:31,590 --> 00:59:32,519 microphone three. 1600 00:59:32,520 --> 00:59:34,649 I don't know, is it 1601 00:59:34,650 --> 00:59:34,829 on? 1602 00:59:34,830 --> 00:59:37,229 OK, expanding the last question. 1603 00:59:37,230 --> 00:59:39,329 What's the perf hit on that? 1604 00:59:39,330 --> 00:59:42,599 What's the what? The perf hit, performance. 1605 00:59:42,600 --> 00:59:43,599 On which one?
1606 00:59:43,600 --> 00:59:44,940 The last question, when you get... 1607 00:59:47,830 --> 00:59:50,369 because that sounds expensive to me. 1608 00:59:50,370 --> 00:59:51,629 It actually depends. 1609 00:59:51,630 --> 00:59:53,939 We only look at the pointers and 1610 00:59:53,940 --> 00:59:56,089 luckily a lot of the 1611 00:59:56,090 --> 00:59:58,259 stuff is pushed onto the stack. 1612 00:59:58,260 --> 00:59:59,969 And so we don't pay anything on the 1613 00:59:59,970 --> 01:00:01,559 stack. We do protect. 1614 01:00:01,560 --> 01:00:03,929 We do get a performance hit on some of 1615 01:00:03,930 --> 01:00:05,400 the stuff on the heap, 1616 01:00:06,990 --> 01:00:09,009 and especially for C++. 1617 01:00:09,010 --> 01:00:11,099 As I said, we might end up 1618 01:00:11,100 --> 01:00:13,619 protecting a whole bunch 1619 01:00:13,620 --> 01:00:14,620 of the 1620 01:00:16,110 --> 01:00:18,179 pointers to 1621 01:00:18,180 --> 01:00:19,469 objects with virtual functions. 1622 01:00:22,470 --> 01:00:24,569 Depends on the program, 1623 01:00:24,570 --> 01:00:27,489 how frequent these operations are. 1624 01:00:27,490 --> 01:00:30,719 There's a full list of 1625 01:00:30,720 --> 01:00:32,819 individual benchmarks in the paper. 1626 01:00:32,820 --> 01:00:34,889 Um, 1627 01:00:34,890 --> 01:00:36,780 most of the benchmarks are 1628 01:00:37,860 --> 01:00:39,189 below five percent. 1629 01:00:39,190 --> 01:00:41,639 Some of them are around 10, very few 1630 01:00:41,640 --> 01:00:42,929 around 20. 1631 01:00:42,930 --> 01:00:44,849 And the highest overhead we've seen was 1632 01:00:44,850 --> 01:00:45,850 roughly 80 percent. 1633 01:00:46,890 --> 01:00:47,789 Thank you. 1634 01:00:47,790 --> 01:00:48,799 So there... 1635 01:00:48,800 --> 01:00:50,249 And for the one with 80 percent.
1636 01:00:50,250 --> 01:00:52,079 There's definitely room for future 1637 01:00:52,080 --> 01:00:54,299 optimization where you can kind 1638 01:00:54,300 --> 01:00:56,939 of, ah, 1639 01:00:56,940 --> 01:00:59,339 reduce the number of 1640 01:00:59,340 --> 01:01:01,529 checks by streamlining 1641 01:01:01,530 --> 01:01:04,049 and grouping and reducing the total 1642 01:01:04,050 --> 01:01:05,009 amount of it. 1643 01:01:05,010 --> 01:01:05,909 Thank you. 1644 01:01:05,910 --> 01:01:07,079 All right. 1645 01:01:07,080 --> 01:01:09,299 Thank you very much. Again, 1646 01:01:09,300 --> 01:01:11,099 that was Code Pointer Integrity.