0 00:00:00,000 --> 00:00:30,000 Dear viewer, these subtitles were generated by a machine via the service Trint and therefore are (very) buggy. If you are capable, please help us to create good quality subtitles: https://c3subtitles.de/talk/914 Thanks! 1 00:00:14,960 --> 00:00:17,179 So the next speaker of today Will 2 00:00:17,180 --> 00:00:18,949 Tulka will tell us more about the 3 00:00:18,950 --> 00:00:21,229 glorious future, where mushroom 4 00:00:21,230 --> 00:00:23,359 check the proof will be used in day. 5 00:00:23,360 --> 00:00:25,579 After all, the way 6 00:00:25,580 --> 00:00:27,799 down to our development, we have 7 00:00:27,800 --> 00:00:29,089 Adam Chapala. 8 00:00:29,090 --> 00:00:31,579 He's a faculty at MIT and 9 00:00:31,580 --> 00:00:33,499 he builds a computer system and has fun 10 00:00:33,500 --> 00:00:35,119 proving them using COK. 11 00:00:35,120 --> 00:00:36,470 Not written as you think it is. 12 00:00:38,780 --> 00:00:39,780 Thank you. 13 00:00:47,420 --> 00:00:50,419 So from the earliest days of programing, 14 00:00:50,420 --> 00:00:52,159 people started to realize that the actual 15 00:00:52,160 --> 00:00:53,989 act of writing the code was going to be a 16 00:00:53,990 --> 00:00:55,609 relatively small fraction of the time 17 00:00:55,610 --> 00:00:57,589 that they spent building useful things. 18 00:00:57,590 --> 00:00:58,819 And here's a quote from one of the 19 00:00:58,820 --> 00:01:00,559 pioneers of computer science telling us 20 00:01:00,560 --> 00:01:02,269 about how at some point he realized a 21 00:01:02,270 --> 00:01:03,529 good part of his life was going to be 22 00:01:03,530 --> 00:01:05,629 spent finding errors in his own code. 23 00:01:05,630 --> 00:01:07,729 So we know that it would make a 24 00:01:07,730 --> 00:01:09,469 huge difference for developers that get a 25 00:01:09,470 --> 00:01:11,179 lot more cool stuff built if they could 26 00:01:11,180 --> 00:01:13,429 reduce the time that they spent 27 00:01:13,430 --> 00:01:15,559 trying to get their code to be 28 00:01:15,560 --> 00:01:16,969 correct, trying to understand what their 29 00:01:16,970 --> 00:01:19,279 code does. All these activities besides 30 00:01:19,280 --> 00:01:21,499 just writing it. And I want to tell 31 00:01:21,500 --> 00:01:22,909 you about some opportunities with 32 00:01:22,910 --> 00:01:24,229 technologies that are getting mature 33 00:01:24,230 --> 00:01:26,749 enough to potentially make a big impact 34 00:01:26,750 --> 00:01:29,179 in saving a lot of people's time building 35 00:01:29,180 --> 00:01:30,739 correct systems. 36 00:01:30,740 --> 00:01:32,659 And I think these are just ready enough 37 00:01:32,660 --> 00:01:33,739 that if you want to be on the bleeding 38 00:01:33,740 --> 00:01:35,299 edge, you could start looking into these 39 00:01:35,300 --> 00:01:36,229 systems today. 40 00:01:36,230 --> 00:01:38,599 So first, let me summarize 41 00:01:38,600 --> 00:01:40,699 a perspective on what programmers 42 00:01:40,700 --> 00:01:43,129 are doing today to gain 43 00:01:43,130 --> 00:01:45,289 confidence that their code is correct. 44 00:01:45,290 --> 00:01:47,659 So there's debugging the classic way to 45 00:01:47,660 --> 00:01:49,729 find a flaw in a program where 46 00:01:49,730 --> 00:01:52,009 you sort of do some detective work, 47 00:01:52,010 --> 00:01:54,059 make successive runs of a program, change 48 00:01:54,060 --> 00:01:55,879 a small thing between runs, see what 49 00:01:55,880 --> 00:01:58,849 happens differently on each execution. 50 00:01:58,850 --> 00:02:00,799 I'll graph these different approaches and 51 00:02:00,800 --> 00:02:03,709 a very wishy washy two axis way. 52 00:02:03,710 --> 00:02:06,079 Riggo I'm going to use to refer to 53 00:02:06,080 --> 00:02:08,119 the when you use some sort of method to 54 00:02:08,120 --> 00:02:10,159 understand your program, how closely tied 55 00:02:10,160 --> 00:02:12,079 it is to the actual code that runs. 56 00:02:12,080 --> 00:02:13,549 So, you know, it's accurate. 57 00:02:13,550 --> 00:02:15,649 And then completeness is covering how 58 00:02:15,650 --> 00:02:17,869 many different execution scenarios you 59 00:02:17,870 --> 00:02:19,159 get some guarantees about. 60 00:02:19,160 --> 00:02:21,019 So with with debugging, we only get 61 00:02:21,020 --> 00:02:23,009 guarantees about the scenarios we run. 62 00:02:23,010 --> 00:02:24,259 So that's low completeness. 63 00:02:24,260 --> 00:02:26,149 But I'm calling it high rigor because 64 00:02:26,150 --> 00:02:29,239 it's literally running your code. 65 00:02:29,240 --> 00:02:31,099 You can also when you're writing your 66 00:02:31,100 --> 00:02:32,569 code in the first place at a whiteboard 67 00:02:32,570 --> 00:02:34,069 or just in your head, think through 68 00:02:34,070 --> 00:02:36,019 carefully all the things that can happen. 69 00:02:36,020 --> 00:02:37,639 Consider all sorts of what if scenarios. 70 00:02:37,640 --> 00:02:38,779 Convince yourself, don't worry, my 71 00:02:38,780 --> 00:02:40,129 program is ready for those or I'll add 72 00:02:40,130 --> 00:02:41,329 some new code to be ready for that. 73 00:02:41,330 --> 00:02:42,589 And there can be all sorts of branching 74 00:02:42,590 --> 00:02:43,909 and we can think very carefully. 75 00:02:43,910 --> 00:02:46,189 And if we're very principled, then we get 76 00:02:46,190 --> 00:02:47,959 high levels of completeness but low rigor 77 00:02:47,960 --> 00:02:49,489 because this is just an abstract 78 00:02:49,490 --> 00:02:51,289 whiteboard level kind of exercise. 79 00:02:53,440 --> 00:02:55,359 Testing has a very similar profile to 80 00:02:55,360 --> 00:02:57,519 debugging, we create a bunch of scenarios 81 00:02:57,520 --> 00:02:59,469 that are saved in a list of test cases 82 00:02:59,470 --> 00:03:01,269 that we can run over and over again in an 83 00:03:01,270 --> 00:03:02,499 automatic way. 84 00:03:02,500 --> 00:03:04,629 And an alternative is to put lots of 85 00:03:04,630 --> 00:03:06,789 comments in your program that capture 86 00:03:06,790 --> 00:03:08,079 exactly your expectations. 87 00:03:08,080 --> 00:03:09,489 What should be true whenever you reach 88 00:03:09,490 --> 00:03:11,499 that comment and then read the code and 89 00:03:11,500 --> 00:03:13,599 think through the interactions 90 00:03:13,600 --> 00:03:15,729 between all these expectations so we 91 00:03:15,730 --> 00:03:17,229 could we could call this the comments 92 00:03:17,230 --> 00:03:18,639 approach, which might have a high level 93 00:03:18,640 --> 00:03:19,929 of completeness. There could be comments 94 00:03:19,930 --> 00:03:21,339 everywhere saying a lot of things about a 95 00:03:21,340 --> 00:03:23,229 lot of code, but it's not so rigorous 96 00:03:23,230 --> 00:03:25,179 because there's there's no typically no 97 00:03:25,180 --> 00:03:26,949 tool that connects the comments to the 98 00:03:26,950 --> 00:03:28,539 actual running behavior of the code. 99 00:03:30,560 --> 00:03:32,749 Another approach is auditing, and there's 100 00:03:32,750 --> 00:03:34,669 a whole spectrum of different levels of 101 00:03:34,670 --> 00:03:36,619 seriousness we can use in auditing, we 102 00:03:36,620 --> 00:03:38,419 can do relatively cheap auditing where we 103 00:03:38,420 --> 00:03:40,069 maybe read a lot of code but don't think 104 00:03:40,070 --> 00:03:41,299 too hard about it. 105 00:03:41,300 --> 00:03:43,429 Or we can have some more 106 00:03:43,430 --> 00:03:45,889 in-depth auditing where we probably 107 00:03:45,890 --> 00:03:47,479 need to read a smaller amount of code, 108 00:03:47,480 --> 00:03:50,359 but we can be much more 109 00:03:50,360 --> 00:03:52,399 thorough in checking the properties of 110 00:03:52,400 --> 00:03:53,869 that code. 111 00:03:53,870 --> 00:03:56,029 So the techniques I want 112 00:03:56,030 --> 00:03:58,129 to tell you about our way to sort of get 113 00:03:58,130 --> 00:04:00,259 the best of both worlds to get high 114 00:04:00,260 --> 00:04:02,629 ratings on both of these axes 115 00:04:02,630 --> 00:04:04,699 in contrast to the traditional 116 00:04:04,700 --> 00:04:06,079 methods. 117 00:04:06,080 --> 00:04:08,719 So I'm a professor at university. 118 00:04:08,720 --> 00:04:10,819 You're probably used to people from that 119 00:04:10,820 --> 00:04:12,859 sort of place shooting out all these 120 00:04:12,860 --> 00:04:14,779 little Greek letters and fancy 121 00:04:14,780 --> 00:04:16,398 mathematical symbols and saying this is 122 00:04:16,399 --> 00:04:17,689 formal methods, this is great. 123 00:04:17,690 --> 00:04:18,799 We're going to prove things about 124 00:04:18,800 --> 00:04:20,599 programs. I think this is an even more 125 00:04:20,600 --> 00:04:22,669 popular research area here in Germany 126 00:04:22,670 --> 00:04:24,439 than in the US where I'm from. 127 00:04:24,440 --> 00:04:26,029 It's been around for a long time, still 128 00:04:26,030 --> 00:04:28,279 not too widely used in 129 00:04:28,280 --> 00:04:31,399 industry and the open source world, but 130 00:04:31,400 --> 00:04:32,329 maybe there's hope. 131 00:04:32,330 --> 00:04:34,369 I think most people in those worlds are 132 00:04:34,370 --> 00:04:35,569 used to this kind of scenario where 133 00:04:35,570 --> 00:04:37,159 people doing phone methods, research are 134 00:04:37,160 --> 00:04:39,019 saying it's really the right thing to do 135 00:04:39,020 --> 00:04:40,159 to prove your program. 136 00:04:40,160 --> 00:04:41,659 Don't you want to do the right thing? 137 00:04:41,660 --> 00:04:44,059 And then the developers just sort of nod 138 00:04:44,060 --> 00:04:46,129 and smile politely and then they need to 139 00:04:46,130 --> 00:04:47,659 get back to work because they need to 140 00:04:47,660 --> 00:04:49,219 implement a few new features for the 141 00:04:49,220 --> 00:04:51,049 version of the system that ships next 142 00:04:51,050 --> 00:04:51,979 week. 143 00:04:51,980 --> 00:04:53,479 I want to try to argue that formal 144 00:04:53,480 --> 00:04:55,549 methods, if not already there today, 145 00:04:55,550 --> 00:04:56,839 are very close to the point where they 146 00:04:56,840 --> 00:04:58,069 will actually save you time. 147 00:04:58,070 --> 00:05:00,019 They will reduce the cost of developing 148 00:05:00,020 --> 00:05:02,149 systems, especially those where 149 00:05:02,150 --> 00:05:04,699 bugs are especially costly. 150 00:05:04,700 --> 00:05:06,829 So the the pitch I'm trying to make is 151 00:05:06,830 --> 00:05:08,689 that in the history of computing, we've 152 00:05:08,690 --> 00:05:10,849 gone through layers upon 153 00:05:10,850 --> 00:05:12,469 layers of building new, higher level 154 00:05:12,470 --> 00:05:14,629 abstractions that lower the cost of 155 00:05:14,630 --> 00:05:16,639 creating particular functionality. 156 00:05:16,640 --> 00:05:18,739 And we've built 157 00:05:18,740 --> 00:05:20,539 each of these levels upon earlier ones. 158 00:05:20,540 --> 00:05:22,039 They provide a new possibilities for 159 00:05:22,040 --> 00:05:24,259 modularity and abstraction and formal 160 00:05:24,260 --> 00:05:26,419 methods is actually an enabler 161 00:05:26,420 --> 00:05:28,519 for some of these 162 00:05:28,520 --> 00:05:30,949 next level of possibilities for dividing 163 00:05:30,950 --> 00:05:32,989 big programs into small pieces that we 164 00:05:32,990 --> 00:05:34,370 can understand effectively. 165 00:05:35,750 --> 00:05:37,849 So what I'm claiming 166 00:05:37,850 --> 00:05:39,949 should be feasible by, let's say, 167 00:05:39,950 --> 00:05:41,809 10 years from now with the right kind of 168 00:05:41,810 --> 00:05:43,609 community forming around it and doing the 169 00:05:43,610 --> 00:05:45,919 work is complete. 170 00:05:45,920 --> 00:05:47,749 Deployed computer systems with top to 171 00:05:47,750 --> 00:05:50,179 bottom correctness proofs, meaning 172 00:05:50,180 --> 00:05:53,359 from hardware up to applications 173 00:05:53,360 --> 00:05:54,739 where these troops will be checked 174 00:05:54,740 --> 00:05:56,419 algorithmically. 175 00:05:56,420 --> 00:05:58,519 So we don't have to worry about 176 00:05:58,520 --> 00:06:00,739 low reliability people reading long 177 00:06:00,740 --> 00:06:02,149 mathematical documents. 178 00:06:02,150 --> 00:06:04,219 And we will get to the point 179 00:06:04,220 --> 00:06:06,109 where none of the code that you run needs 180 00:06:06,110 --> 00:06:08,449 to be trusted to guarantee the properties 181 00:06:08,450 --> 00:06:09,619 you care about, whether they're in 182 00:06:09,620 --> 00:06:11,959 security or other kinds of correctness. 183 00:06:11,960 --> 00:06:13,579 And as if that weren't good enough, 184 00:06:14,630 --> 00:06:15,829 by the way, that this will apply to 185 00:06:15,830 --> 00:06:18,439 hardware designs, not just software code, 186 00:06:18,440 --> 00:06:20,539 and this 187 00:06:20,540 --> 00:06:22,219 kind of approach should be able to lower 188 00:06:22,220 --> 00:06:23,779 development costs in the same way that a 189 00:06:23,780 --> 00:06:26,239 good systematic testing framework can 190 00:06:26,240 --> 00:06:27,979 and to a large extent should be able to 191 00:06:27,980 --> 00:06:30,709 replace very time intensive activities 192 00:06:30,710 --> 00:06:33,559 like debugging, testing and code review. 193 00:06:33,560 --> 00:06:35,959 So I better go into a little more detail 194 00:06:35,960 --> 00:06:37,699 on what kind of techniques I'm talking 195 00:06:37,700 --> 00:06:39,359 about here and why we should believe the 196 00:06:39,360 --> 00:06:40,549 stuff is ready. 197 00:06:40,550 --> 00:06:43,249 But before I get into the 198 00:06:43,250 --> 00:06:44,539 the next level of detail, I'll just 199 00:06:44,540 --> 00:06:46,609 summarize how this kind of 200 00:06:46,610 --> 00:06:47,929 approach can substitute for the 201 00:06:47,930 --> 00:06:49,789 traditional development activities. 202 00:06:49,790 --> 00:06:51,049 Instead of debugging where we're 203 00:06:51,050 --> 00:06:52,909 exploring concrete executions of a 204 00:06:52,910 --> 00:06:54,949 system, we can do proving where we're 205 00:06:54,950 --> 00:06:56,989 exploring symbolic arguments that cover 206 00:06:56,990 --> 00:06:59,239 all possible executions 207 00:06:59,240 --> 00:07:00,859 instead of testing, which applies to 208 00:07:00,860 --> 00:07:02,389 concrete scenarios, we can do 209 00:07:02,390 --> 00:07:04,669 specification writing where we describe 210 00:07:04,670 --> 00:07:07,009 general requirements that can cover 211 00:07:07,010 --> 00:07:08,399 all scenarios. 212 00:07:08,400 --> 00:07:10,789 Instead of auditing our code in detail, 213 00:07:10,790 --> 00:07:12,379 we can audit the specifications in 214 00:07:12,380 --> 00:07:14,389 detail. Specifications should be much 215 00:07:14,390 --> 00:07:16,399 shorter and typically don't include 216 00:07:16,400 --> 00:07:18,439 optimizations, which can be some of the 217 00:07:18,440 --> 00:07:20,149 hardest parts of code to understand. 218 00:07:21,430 --> 00:07:23,589 So a summary of what I'm 219 00:07:23,590 --> 00:07:25,179 going to cover in the rest of this talk. 220 00:07:25,180 --> 00:07:27,099 I'm going to introduce the key technology 221 00:07:27,100 --> 00:07:28,419 that I'm selling here, which is called 222 00:07:28,420 --> 00:07:29,829 proof assistance. 223 00:07:29,830 --> 00:07:31,959 I'll cover some common objections 224 00:07:31,960 --> 00:07:34,089 and the standard answers to those 225 00:07:34,090 --> 00:07:36,249 questions about why do 226 00:07:36,250 --> 00:07:38,319 we believe this can actually 227 00:07:38,320 --> 00:07:40,509 be applied to real systems, why 228 00:07:40,510 --> 00:07:42,519 it can be applied cost effectively. 229 00:07:42,520 --> 00:07:44,049 I'll explain a few different ways that 230 00:07:44,050 --> 00:07:45,369 this technology can be used in the 231 00:07:45,370 --> 00:07:46,929 development of computer systems. 232 00:07:46,930 --> 00:07:48,549 I'll talk about two case studies, one in 233 00:07:48,550 --> 00:07:50,109 hardware and one in cryptographic 234 00:07:50,110 --> 00:07:51,849 software, and I'll finish up with some 235 00:07:51,850 --> 00:07:54,999 pointers to further reading for 236 00:07:55,000 --> 00:07:56,619 continuing the journey along learning 237 00:07:56,620 --> 00:07:57,620 this subject. 238 00:07:58,600 --> 00:08:00,879 So one of the first questions people 239 00:08:00,880 --> 00:08:03,249 have about formal methods is essentially 240 00:08:03,250 --> 00:08:05,559 based on an argument embodied 241 00:08:05,560 --> 00:08:07,329 in a paper from nineteen seventy nine by 242 00:08:07,330 --> 00:08:09,519 DeMello, Lipton and Perlis called 243 00:08:09,520 --> 00:08:11,259 Social Processes and Proofs of Theorems 244 00:08:11,260 --> 00:08:12,369 and Programs. 245 00:08:12,370 --> 00:08:14,139 They had a lot to say in that paper, but 246 00:08:14,140 --> 00:08:16,629 I would kind of boil it down to 247 00:08:16,630 --> 00:08:19,389 they say real systems are so complex 248 00:08:19,390 --> 00:08:21,549 and they change so frequently 249 00:08:21,550 --> 00:08:23,949 that it is beyond the powers 250 00:08:23,950 --> 00:08:26,109 of human concentration to read 251 00:08:26,110 --> 00:08:28,779 careful proofs about those systems. 252 00:08:28,780 --> 00:08:30,459 And it's not just one proof its version 253 00:08:30,460 --> 00:08:31,749 after version of the system, you'll need 254 00:08:31,750 --> 00:08:33,969 a new proof to cover the correctness 255 00:08:33,970 --> 00:08:35,918 of the new system. 256 00:08:35,919 --> 00:08:38,079 And I would say today 257 00:08:38,080 --> 00:08:40,178 we we learned that these authors are both 258 00:08:40,179 --> 00:08:41,949 right and wrong about this point. 259 00:08:41,950 --> 00:08:44,048 They're probably right that people can't 260 00:08:44,049 --> 00:08:46,209 be trusted to read proofs about real 261 00:08:46,210 --> 00:08:48,699 systems, but they were implicitly 262 00:08:48,700 --> 00:08:50,289 assuming you would be people who read the 263 00:08:50,290 --> 00:08:51,909 proofs today. 264 00:08:51,910 --> 00:08:52,899 We have a better way. 265 00:08:52,900 --> 00:08:55,869 Let's get algorithms to read the proofs. 266 00:08:55,870 --> 00:08:57,939 So here's how this kind 267 00:08:57,940 --> 00:09:00,339 of proving we can scale 268 00:09:00,340 --> 00:09:01,749 up to real systems. 269 00:09:01,750 --> 00:09:03,909 So we have here's a proof engineer trying 270 00:09:03,910 --> 00:09:04,899 to prove a theorem. 271 00:09:04,900 --> 00:09:06,009 He wrote a proof about it. 272 00:09:07,030 --> 00:09:08,529 And don't worry, he didn't have to write 273 00:09:08,530 --> 00:09:10,119 it all himself. There's an ecosystem of 274 00:09:10,120 --> 00:09:12,669 libraries, of existing proof facts, 275 00:09:12,670 --> 00:09:14,859 as well as proof techniques 276 00:09:14,860 --> 00:09:15,909 that can be reused. 277 00:09:16,960 --> 00:09:18,969 And the proof is actually aski source 278 00:09:18,970 --> 00:09:21,129 code. It's checked into version control 279 00:09:21,130 --> 00:09:23,589 like on GitHub, and that allows 280 00:09:23,590 --> 00:09:26,259 a number of developers to collaborate 281 00:09:26,260 --> 00:09:27,549 writing different theorems within a 282 00:09:27,550 --> 00:09:29,739 larger development, get them to 283 00:09:29,740 --> 00:09:31,149 all wind up in the same place and have 284 00:09:31,150 --> 00:09:32,559 their versions tracked. 285 00:09:32,560 --> 00:09:34,359 We can plug our version control into 286 00:09:34,360 --> 00:09:35,709 continuous integration, which is 287 00:09:35,710 --> 00:09:37,239 effectively running over and over again 288 00:09:37,240 --> 00:09:39,159 checking. Is this proof still convincing? 289 00:09:39,160 --> 00:09:40,689 Is it still convincing every time there's 290 00:09:40,690 --> 00:09:41,739 a change? 291 00:09:41,740 --> 00:09:42,939 And how do we know it's convincing? 292 00:09:42,940 --> 00:09:45,249 We have an automatic proof checker 293 00:09:45,250 --> 00:09:47,319 that reads the source code of our 294 00:09:47,320 --> 00:09:49,539 proofs and applies some 295 00:09:49,540 --> 00:09:51,639 straightforward rules to validate 296 00:09:51,640 --> 00:09:53,259 that they really prove the facts that 297 00:09:53,260 --> 00:09:55,149 they're claiming to prove. 298 00:09:55,150 --> 00:09:57,249 By the way, one risk here is 299 00:09:57,250 --> 00:09:58,869 that some of these programmers or 300 00:09:58,870 --> 00:10:01,209 developers prove the wrong theorems. 301 00:10:01,210 --> 00:10:03,939 So we do some code review where 302 00:10:03,940 --> 00:10:06,309 each proof engineer might read 303 00:10:06,310 --> 00:10:07,809 the theorem statements written by the 304 00:10:07,810 --> 00:10:09,579 others and make sure that we're really 305 00:10:09,580 --> 00:10:12,579 proving what we set out to prove. 306 00:10:12,580 --> 00:10:14,709 So one, probably some of you are thinking 307 00:10:14,710 --> 00:10:16,899 this picture looks just like the way that 308 00:10:16,900 --> 00:10:18,579 software development works today in the 309 00:10:18,580 --> 00:10:19,659 real world. 310 00:10:19,660 --> 00:10:20,709 And that is very true. 311 00:10:20,710 --> 00:10:23,289 It's also true that we also have 312 00:10:23,290 --> 00:10:25,509 all these tools already for doing this 313 00:10:25,510 --> 00:10:27,879 kind of development for systems with 314 00:10:27,880 --> 00:10:29,259 correctness proofs. 315 00:10:29,260 --> 00:10:31,509 And I will try to demonstrate that 316 00:10:31,510 --> 00:10:33,489 in the rest of this talk. So this future 317 00:10:33,490 --> 00:10:35,619 is more or less here today. 318 00:10:35,620 --> 00:10:37,599 There are a lot more rough edges on the 319 00:10:37,600 --> 00:10:39,369 proof versions of some of these tools 320 00:10:39,370 --> 00:10:40,659 than in the traditional programing 321 00:10:40,660 --> 00:10:41,979 versions. And there's a lot of 322 00:10:41,980 --> 00:10:43,539 opportunity to get involved with 323 00:10:43,540 --> 00:10:44,540 improving the tools. 324 00:10:45,490 --> 00:10:47,469 So the key system we're going to use here 325 00:10:47,470 --> 00:10:49,119 is called a proof assistant. 326 00:10:49,120 --> 00:10:51,369 It's a software package that is basically 327 00:10:51,370 --> 00:10:53,109 an idea for stating improving 328 00:10:53,110 --> 00:10:55,209 mathematical theorems where you have to 329 00:10:55,210 --> 00:10:57,189 do manual work to write the proofs of the 330 00:10:57,190 --> 00:10:59,679 theorems as ASCII source code. 331 00:10:59,680 --> 00:11:02,559 But then they are checked automatically, 332 00:11:02,560 --> 00:11:04,449 just like we're used to with, say, type 333 00:11:04,450 --> 00:11:05,559 Checker's in programing. 334 00:11:07,330 --> 00:11:08,739 There are a number of proof assistants 335 00:11:08,740 --> 00:11:10,809 that have been developed for the 336 00:11:10,810 --> 00:11:12,369 last few decades and that are pretty 337 00:11:12,370 --> 00:11:13,389 usable today. 338 00:11:13,390 --> 00:11:15,019 One of them is called Isabelle Hall. 339 00:11:15,020 --> 00:11:16,839 A large part of that development was done 340 00:11:16,840 --> 00:11:19,209 here in Germany at TI Munich, and 341 00:11:19,210 --> 00:11:20,859 probably its best known application is 342 00:11:20,860 --> 00:11:22,959 the L for Verified Project, which has 343 00:11:22,960 --> 00:11:24,549 done a correctness proof of an operating 344 00:11:24,550 --> 00:11:25,719 system. 345 00:11:25,720 --> 00:11:27,189 Another proof of system is called Calk 346 00:11:27,190 --> 00:11:29,649 Developed at India and France. 347 00:11:29,650 --> 00:11:31,599 One of its best known applications is 348 00:11:31,600 --> 00:11:33,799 CERT, which is a 349 00:11:33,800 --> 00:11:36,399 C compiler with a proof of correctness 350 00:11:36,400 --> 00:11:38,259 bridging the gap from the C-code down to 351 00:11:38,260 --> 00:11:40,630 the assembly code that is produced. 352 00:11:42,040 --> 00:11:43,899 I'm going to focus on the court system in 353 00:11:43,900 --> 00:11:45,639 this presentation because that's what I 354 00:11:45,640 --> 00:11:46,720 use in most of my work. 355 00:11:48,580 --> 00:11:50,679 So how does this process 356 00:11:50,680 --> 00:11:52,119 work where we have a sort of 357 00:11:52,120 --> 00:11:54,129 collaboration between the human and the 358 00:11:54,130 --> 00:11:56,289 machine to come to a 359 00:11:56,290 --> 00:11:57,309 convincing proof? 360 00:11:57,310 --> 00:11:59,559 Well, the human user is repeatedly 361 00:11:59,560 --> 00:12:01,839 making suggestions to the automatic 362 00:12:01,840 --> 00:12:03,999 engine, basically saying, I think 363 00:12:04,000 --> 00:12:05,709 the following strategy ought to work for 364 00:12:05,710 --> 00:12:06,809 the next step of this proof. 365 00:12:06,810 --> 00:12:09,159 And the engine spins for a while 366 00:12:09,160 --> 00:12:10,659 and then says, OK, if we do that, then 367 00:12:10,660 --> 00:12:11,739 here's what's left to prove. 368 00:12:11,740 --> 00:12:13,299 Please tell me what to do here. 369 00:12:13,300 --> 00:12:16,209 And the library of steps or tactics 370 00:12:16,210 --> 00:12:18,159 is not fixed. We can plug in new 371 00:12:18,160 --> 00:12:20,469 libraries to the tactic engine 372 00:12:20,470 --> 00:12:21,789 introducing higher and higher level 373 00:12:21,790 --> 00:12:24,729 abstractions of what constitutes a valid 374 00:12:24,730 --> 00:12:25,730 piece of an argument 375 00:12:26,800 --> 00:12:29,319 in the end, after we finish this cycle 376 00:12:29,320 --> 00:12:31,869 and there is nothing left to prove, then 377 00:12:31,870 --> 00:12:33,399 the tactic engine outputs what's called a 378 00:12:33,400 --> 00:12:36,369 proof term, which is a 379 00:12:36,370 --> 00:12:38,289 essentially a program in a small language 380 00:12:38,290 --> 00:12:40,179 with just a few reasoning steps built 381 00:12:40,180 --> 00:12:42,249 into it. And in fact, everything I've 382 00:12:42,250 --> 00:12:44,709 shown you up to this point is untrusted 383 00:12:44,710 --> 00:12:46,689 in the sense of trusted computing bases 384 00:12:46,690 --> 00:12:48,159 and security. 385 00:12:48,160 --> 00:12:49,749 The only thing we need to trust, the only 386 00:12:49,750 --> 00:12:51,459 place where a bug in the infrastructure 387 00:12:51,460 --> 00:12:52,989 could lead us to accept an incorrect 388 00:12:52,990 --> 00:12:55,389 proof is a separate, relatively 389 00:12:55,390 --> 00:12:57,399 small proof checker that only looks at 390 00:12:57,400 --> 00:12:59,019 the proof term, ignores the way it was 391 00:12:59,020 --> 00:13:00,909 produced and checked whether it actually 392 00:13:00,910 --> 00:13:02,589 establishes the theorem that we were 393 00:13:02,590 --> 00:13:03,590 interested in. 394 00:13:04,950 --> 00:13:06,419 So what are those primitive deduction 395 00:13:06,420 --> 00:13:08,489 steps that the the small proof 396 00:13:08,490 --> 00:13:09,899 checker is implementing? 397 00:13:09,900 --> 00:13:12,259 I'll just give you one classic example, 398 00:13:12,260 --> 00:13:13,709 a kind of step which is more or less 399 00:13:13,710 --> 00:13:15,899 built into systems like this is the mode 400 00:13:15,900 --> 00:13:18,149 Poland's rule. It says if you know 401 00:13:18,150 --> 00:13:20,309 P implies Q and you know 402 00:13:20,310 --> 00:13:22,289 P, then you are allowed to deduce. 403 00:13:22,290 --> 00:13:24,359 Q So imagine 404 00:13:24,360 --> 00:13:26,639 a set of roughly 10 rules 405 00:13:26,640 --> 00:13:28,829 like this that we write out 406 00:13:28,830 --> 00:13:31,109 in ASCII syntax for the core 407 00:13:31,110 --> 00:13:32,819 proof terms of the system. 408 00:13:32,820 --> 00:13:35,039 And this is actually 409 00:13:35,040 --> 00:13:36,479 a language inspired by functional 410 00:13:36,480 --> 00:13:37,409 programing. 411 00:13:37,410 --> 00:13:38,759 It's a lot like a camel. 412 00:13:38,760 --> 00:13:40,799 In fact, Camel was invented to use to 413 00:13:40,800 --> 00:13:43,559 implement COK in the 80s. 414 00:13:43,560 --> 00:13:45,359 And so we have lines like this that are 415 00:13:45,360 --> 00:13:46,739 essentially changing the other proof 416 00:13:46,740 --> 00:13:48,599 steps and we have little applications 417 00:13:48,600 --> 00:13:50,039 like that. Monas Poland's at the end 418 00:13:50,040 --> 00:13:52,139 saying, and here's the deduction rule 419 00:13:52,140 --> 00:13:54,689 we use here. And believe it or not, 420 00:13:54,690 --> 00:13:56,249 these chains of steps are enough to 421 00:13:56,250 --> 00:13:58,439 reproduce all of the results from math 422 00:13:58,440 --> 00:13:59,729 and computer science and all sorts of 423 00:13:59,730 --> 00:14:02,129 other logic based fields. 424 00:14:02,130 --> 00:14:04,139 And it's easy to write a relatively 425 00:14:04,140 --> 00:14:05,909 small, trustworthy checker for 426 00:14:05,910 --> 00:14:07,739 applications of these rules that will 427 00:14:07,740 --> 00:14:09,299 apply to all those different domains. 428 00:14:10,890 --> 00:14:13,079 So let me make this a bit more concrete 429 00:14:13,080 --> 00:14:15,269 and show you an example of 430 00:14:15,270 --> 00:14:16,649 a proof or two in Cock's. 431 00:14:16,650 --> 00:14:18,819 So I'm going to be working with this code 432 00:14:18,820 --> 00:14:21,179 in in Emacs, in emacs 433 00:14:21,180 --> 00:14:23,309 mode for cok authoring, 434 00:14:23,310 --> 00:14:24,269 of course. 435 00:14:24,270 --> 00:14:25,889 So Corke is based on a functional 436 00:14:25,890 --> 00:14:27,269 programing language. It's a lot like 437 00:14:27,270 --> 00:14:28,259 Haskell or Okemos. 438 00:14:28,260 --> 00:14:30,029 For those of you who know those, we can 439 00:14:30,030 --> 00:14:31,589 do things like. Right. This expression, 440 00:14:31,590 --> 00:14:33,509 which is a list of natural numbers, five 441 00:14:33,510 --> 00:14:34,859 thirteen eighty nine. 442 00:14:34,860 --> 00:14:37,079 I run a check command and I press 443 00:14:37,080 --> 00:14:38,459 some keys. And then I got this 444 00:14:38,460 --> 00:14:39,749 highlighting in this region, which 445 00:14:39,750 --> 00:14:41,099 indicates this is now where the 446 00:14:41,100 --> 00:14:42,539 processing of the proof Persistent has 447 00:14:42,540 --> 00:14:43,949 moved up to. 448 00:14:43,950 --> 00:14:45,929 And it tells me this is a list of natural 449 00:14:45,930 --> 00:14:48,389 numbers. I can make a list booleans 450 00:14:48,390 --> 00:14:50,039 and I can use this alternative notation 451 00:14:50,040 --> 00:14:51,749 for list with an index operator, colon 452 00:14:51,750 --> 00:14:53,819 colon for adding a new 453 00:14:53,820 --> 00:14:54,959 element to the front of a list. 454 00:14:54,960 --> 00:14:56,459 So this is the list consisting of five 455 00:14:56,460 --> 00:14:58,499 thirteen eighty nine with an empty list 456 00:14:58,500 --> 00:15:00,509 here. It's another way of writing this. 457 00:15:02,220 --> 00:15:04,889 And I've also written a recursive 458 00:15:04,890 --> 00:15:06,839 function here called length that for any 459 00:15:06,840 --> 00:15:08,699 type A takes and a list of type and 460 00:15:08,700 --> 00:15:09,989 returns a natural number. 461 00:15:09,990 --> 00:15:12,239 And we do pattern matching and say an 462 00:15:12,240 --> 00:15:14,369 empty list, match the number zero and 463 00:15:14,370 --> 00:15:16,199 a non-empty list match to one plus the 464 00:15:16,200 --> 00:15:17,069 length of the table. 465 00:15:17,070 --> 00:15:18,989 The list where I'm signing the name ls 466 00:15:18,990 --> 00:15:19,949 prime to the tail. 467 00:15:19,950 --> 00:15:21,329 I don't have time to go over all the 468 00:15:21,330 --> 00:15:23,039 classic ideas from functional programing 469 00:15:23,040 --> 00:15:24,899 like in Haskell and Emelle. 470 00:15:24,900 --> 00:15:27,209 Hopefully this will make 471 00:15:27,210 --> 00:15:28,649 just enough sense for people who aren't 472 00:15:28,650 --> 00:15:30,569 used to seeing those, but it's very 473 00:15:30,570 --> 00:15:32,579 similar notation to in those languages. 474 00:15:32,580 --> 00:15:33,479 So let's prove a theorem. 475 00:15:33,480 --> 00:15:34,789 Well, first I'll check what's the length? 476 00:15:34,790 --> 00:15:36,689 This list, it's three good. 477 00:15:36,690 --> 00:15:38,519 And I also define another classic 478 00:15:38,520 --> 00:15:39,809 function that takes two lists and 479 00:15:39,810 --> 00:15:41,879 essentially concatenate them makes a 480 00:15:41,880 --> 00:15:43,259 new list. That's first the elements of 481 00:15:43,260 --> 00:15:44,789 the first list, then the elements of the 482 00:15:44,790 --> 00:15:46,649 second list. You just write this as plus 483 00:15:46,650 --> 00:15:49,259 four lists in Python, for instance, 484 00:15:49,260 --> 00:15:51,149 here I'm calling it append and append. 485 00:15:51,150 --> 00:15:53,159 One, two with three, four, five and get 486 00:15:53,160 --> 00:15:54,990 one, two, three, four, five down here. 487 00:15:56,670 --> 00:15:58,349 So you can already see some of these 488 00:15:58,350 --> 00:15:59,959 elements that led me to call this an 489 00:15:59,960 --> 00:16:02,039 idea. We're getting constant 490 00:16:02,040 --> 00:16:04,439 feedback as we go, much like the 491 00:16:04,440 --> 00:16:07,019 ominous red squiggly is in eclipse 492 00:16:07,020 --> 00:16:09,149 when you make a mistake in your code. 493 00:16:09,150 --> 00:16:10,619 We'll have those coming up shortly. 494 00:16:10,620 --> 00:16:12,029 When we try to do some proofs and we get 495 00:16:12,030 --> 00:16:14,099 this feedback on the 496 00:16:14,100 --> 00:16:15,359 computational behavior of what we're 497 00:16:15,360 --> 00:16:17,429 defining. So here's a theorem 498 00:16:17,430 --> 00:16:19,529 for any type and for two lists of that 499 00:16:19,530 --> 00:16:21,629 type. If I append them and 500 00:16:21,630 --> 00:16:23,459 then take the length, that's the same as 501 00:16:23,460 --> 00:16:25,109 taking the sum of their individual 502 00:16:25,110 --> 00:16:26,110 lengths. 503 00:16:26,860 --> 00:16:29,259 So there's this red 504 00:16:29,260 --> 00:16:31,269 backgrounded admitted telling us this is 505 00:16:31,270 --> 00:16:33,429 a fact we didn't prove yet, so let's 506 00:16:33,430 --> 00:16:34,869 prove it so we don't have to look at that 507 00:16:34,870 --> 00:16:35,769 red anymore. 508 00:16:35,770 --> 00:16:37,959 And so I'm going to say my first step is 509 00:16:37,960 --> 00:16:39,759 I'm going to prove this by induction on 510 00:16:39,760 --> 00:16:41,319 the variable one. 511 00:16:41,320 --> 00:16:42,669 And then what this is going to do is 512 00:16:42,670 --> 00:16:44,169 create two new subgoals. 513 00:16:44,170 --> 00:16:45,099 Here's our current goal. 514 00:16:45,100 --> 00:16:47,259 Just repeat it from up here when I run 515 00:16:47,260 --> 00:16:48,260 this. 516 00:16:48,990 --> 00:16:50,019 Let's give us more space. 517 00:16:50,020 --> 00:16:52,569 There there are now two subgoals, 518 00:16:52,570 --> 00:16:55,059 a base case and an inductive 519 00:16:55,060 --> 00:16:57,189 case in the base 520 00:16:57,190 --> 00:16:59,259 case. Plus one was replaced with an 521 00:16:59,260 --> 00:17:01,299 empty list and an inductive case, plus 522 00:17:01,300 --> 00:17:03,879 one was replaced with an arbitrary 523 00:17:03,880 --> 00:17:05,949 non-empty list with some head element 524 00:17:05,950 --> 00:17:07,999 and some tail elements. 525 00:17:08,000 --> 00:17:10,088 So let's start proving 526 00:17:10,089 --> 00:17:11,679 the base case. 527 00:17:11,680 --> 00:17:14,318 One thing I'll do I noticed the 528 00:17:14,319 --> 00:17:16,209 goal here starts with a four all so I'll 529 00:17:16,210 --> 00:17:18,309 run a step called intro that's going to 530 00:17:18,310 --> 00:17:19,989 take out the overall and instead give us 531 00:17:19,990 --> 00:17:22,729 a local variable called Elst to 532 00:17:22,730 --> 00:17:23,979 here it is. 533 00:17:23,980 --> 00:17:26,108 And then I notice, oh, this 534 00:17:26,109 --> 00:17:27,729 looks pretty obvious just from the 535 00:17:27,730 --> 00:17:28,839 definition of a pen. 536 00:17:28,840 --> 00:17:30,729 So let's ask Corke to do some algebraic 537 00:17:30,730 --> 00:17:33,249 simplification with that definition, 538 00:17:33,250 --> 00:17:35,139 essentially running parts of the program 539 00:17:35,140 --> 00:17:36,759 at compile time, if you will. 540 00:17:36,760 --> 00:17:38,459 And then that was looks pretty clear. 541 00:17:38,460 --> 00:17:39,919 The same thing is on both sides of the 542 00:17:39,920 --> 00:17:42,459 equality. So I can say reflexivity 543 00:17:42,460 --> 00:17:44,419 and that case is proved. 544 00:17:44,420 --> 00:17:46,629 Let's go into the inductive case. 545 00:17:46,630 --> 00:17:48,789 I'll start at the same way I 546 00:17:48,790 --> 00:17:51,009 get rid of that for all down below here 547 00:17:51,010 --> 00:17:53,949 and simplify. 548 00:17:53,950 --> 00:17:56,109 Now I notice there's this term here, 549 00:17:56,110 --> 00:17:58,239 length of append, blah, blah, blah, 550 00:17:58,240 --> 00:18:00,609 which is actually also present up here, 551 00:18:00,610 --> 00:18:02,529 what appears here. So in general, above 552 00:18:02,530 --> 00:18:04,569 the double line are our hypotheses are 553 00:18:04,570 --> 00:18:06,189 known facts that we're allowed to appeal 554 00:18:06,190 --> 00:18:07,299 to. And here we have an induction 555 00:18:07,300 --> 00:18:09,339 hypothesis which has the theorem 556 00:18:09,340 --> 00:18:11,079 restated, essentially, but applying 557 00:18:11,080 --> 00:18:12,729 specifically to the tail of the current 558 00:18:12,730 --> 00:18:15,039 list. So I can say, hey, let's take that 559 00:18:15,040 --> 00:18:17,319 n'goni quality, use it to rewrite, 560 00:18:17,320 --> 00:18:20,169 will match this with this 561 00:18:20,170 --> 00:18:22,059 and therefore we'll be able to replace it 562 00:18:22,060 --> 00:18:23,169 with this. 563 00:18:23,170 --> 00:18:25,779 So we went like that and now reflexivity 564 00:18:25,780 --> 00:18:26,780 applies. 565 00:18:27,390 --> 00:18:29,499 Very good. And the most exciting 566 00:18:29,500 --> 00:18:32,469 moment of any proof in copyright QAD 567 00:18:32,470 --> 00:18:34,299 process that and if the system doesn't 568 00:18:34,300 --> 00:18:35,979 complain, this is the equivalent of 569 00:18:35,980 --> 00:18:37,719 taking your code into eclipse and not 570 00:18:37,720 --> 00:18:39,429 getting any red squiggles under it. 571 00:18:39,430 --> 00:18:40,929 This is a correct proof. 572 00:18:40,930 --> 00:18:42,579 Cork is convinced of the theorem is true. 573 00:18:46,300 --> 00:18:47,300 Thank you. 574 00:18:51,060 --> 00:18:53,579 One more example to show 575 00:18:53,580 --> 00:18:55,439 how we reuse Limas improves. 576 00:18:55,440 --> 00:18:57,569 This is a reverse function for 577 00:18:57,570 --> 00:18:59,549 lists that takes in a list and then steps 578 00:18:59,550 --> 00:19:01,559 down and building a new list using the 579 00:19:01,560 --> 00:19:04,229 append function to repeatedly add 580 00:19:04,230 --> 00:19:06,059 elements that were on the front onto the 581 00:19:06,060 --> 00:19:07,299 end. 582 00:19:07,300 --> 00:19:09,149 So reverse one, two, three into three to 583 00:19:09,150 --> 00:19:10,139 one. 584 00:19:10,140 --> 00:19:12,239 And so then we can quickly prove length. 585 00:19:12,240 --> 00:19:13,240 Rev. 586 00:19:14,900 --> 00:19:16,639 I'll do this by induction on the list I'm 587 00:19:16,640 --> 00:19:18,079 reversing, I want to show the length does 588 00:19:18,080 --> 00:19:19,459 not change when you reverse the list 589 00:19:20,540 --> 00:19:22,259 to the base case is pretty simple here. 590 00:19:22,260 --> 00:19:24,409 I use algebraic simplification and 591 00:19:24,410 --> 00:19:25,549 then that's obviously true. 592 00:19:28,200 --> 00:19:30,359 And see what happens 593 00:19:30,360 --> 00:19:31,360 in this case. 594 00:19:32,580 --> 00:19:34,679 Well, here I see I have 595 00:19:34,680 --> 00:19:36,449 a length of an append that doesn't 596 00:19:36,450 --> 00:19:38,399 directly match my induction hypothesis, 597 00:19:38,400 --> 00:19:40,619 but luckily I approved 598 00:19:40,620 --> 00:19:42,689 something called length append or a 599 00:19:42,690 --> 00:19:43,619 pen length. I don't remember. 600 00:19:43,620 --> 00:19:46,019 We'll find out length append, which 601 00:19:46,020 --> 00:19:48,659 replaced Pend 602 00:19:48,660 --> 00:19:51,029 length of append with some of 603 00:19:51,030 --> 00:19:52,739 the lengths I would play that again. 604 00:19:52,740 --> 00:19:54,029 We have the length of the Rev plus the 605 00:19:54,030 --> 00:19:55,949 length of the Singleton list. 606 00:19:55,950 --> 00:19:59,069 Then I can simplify this 607 00:19:59,070 --> 00:20:00,899 and now we're very close so I can rewrite 608 00:20:00,900 --> 00:20:02,099 with IHSS. 609 00:20:02,100 --> 00:20:04,199 By the way, this capitalist's is the plus 610 00:20:04,200 --> 00:20:06,149 one function. It stands for successor. 611 00:20:06,150 --> 00:20:09,149 So this is pretty obviously true, 612 00:20:09,150 --> 00:20:11,399 though capital s is not defined as plus 613 00:20:11,400 --> 00:20:13,259 one is a little different, but I can call 614 00:20:13,260 --> 00:20:14,669 a procedure called Omega, which is a 615 00:20:14,670 --> 00:20:16,049 solver for what's called linear 616 00:20:16,050 --> 00:20:17,849 arithmetic, which knows how to prove all 617 00:20:17,850 --> 00:20:19,919 sorts of numerical facts like 618 00:20:19,920 --> 00:20:20,669 that. 619 00:20:20,670 --> 00:20:22,859 So that was an example primarily of using 620 00:20:22,860 --> 00:20:24,389 a fact we already proved within a larger 621 00:20:24,390 --> 00:20:26,459 proof. And this allows us 622 00:20:26,460 --> 00:20:28,799 to build up to proofs that cover pretty 623 00:20:28,800 --> 00:20:30,659 significant systems and their correctness 624 00:20:30,660 --> 00:20:31,660 properties. 625 00:20:33,910 --> 00:20:35,439 One last example, and I will not go 626 00:20:35,440 --> 00:20:36,609 through as much detail, I just want to 627 00:20:36,610 --> 00:20:38,949 show that even for programs 628 00:20:38,950 --> 00:20:40,749 I can write in one page of code, it's not 629 00:20:40,750 --> 00:20:43,029 very straightforward, naively 630 00:20:43,030 --> 00:20:45,039 to establish their correctness or define 631 00:20:45,040 --> 00:20:46,659 the language of abstract syntax, trees 632 00:20:46,660 --> 00:20:48,609 for expressions with a plus. 633 00:20:48,610 --> 00:20:50,709 Sometimes, as well as constants and 634 00:20:50,710 --> 00:20:52,809 variables, I 635 00:20:52,810 --> 00:20:54,249 have defined a recursive function to 636 00:20:54,250 --> 00:20:56,409 evaluate those expressions to 637 00:20:56,410 --> 00:20:58,719 numbers given values for all variables. 638 00:20:58,720 --> 00:21:01,419 And I've defined 639 00:21:01,420 --> 00:21:03,879 medium complexity optimizer that finds 640 00:21:03,880 --> 00:21:05,139 parts of the arithmetic that we can 641 00:21:05,140 --> 00:21:06,969 actually simplify at compile time. 642 00:21:06,970 --> 00:21:09,339 This is a classic optimization called 643 00:21:09,340 --> 00:21:10,340 constant folding, 644 00:21:11,800 --> 00:21:13,149 but I won't dwell on the code there. 645 00:21:13,150 --> 00:21:15,219 I just want to show here I started trying 646 00:21:15,220 --> 00:21:17,559 to prove the correctness, namely when you 647 00:21:17,560 --> 00:21:19,479 constant fold and then interpret. 648 00:21:19,480 --> 00:21:20,709 It's the same answer as if you 649 00:21:20,710 --> 00:21:22,269 interpreted the original program in the 650 00:21:22,270 --> 00:21:24,219 same variable environment excess. 651 00:21:24,220 --> 00:21:27,169 And then I started approved by induction. 652 00:21:27,170 --> 00:21:28,989 The first case was easy, second case was 653 00:21:28,990 --> 00:21:30,549 easy. But then I started needing to do 654 00:21:30,550 --> 00:21:33,249 case analysis on which constructor 655 00:21:33,250 --> 00:21:35,199 was used to build particular terms and 656 00:21:35,200 --> 00:21:37,029 the abstract syntax tree type. 657 00:21:37,030 --> 00:21:38,769 And then I need to do a subclass analysis 658 00:21:38,770 --> 00:21:40,209 inside that one. Then I need to do a 659 00:21:40,210 --> 00:21:41,829 subclass analysis inside that one. 660 00:21:41,830 --> 00:21:43,179 I'm typing the same thing over and over 661 00:21:43,180 --> 00:21:45,069 again. A lot of stuff is going on here, 662 00:21:45,070 --> 00:21:47,139 but it's very repetitive. 663 00:21:47,140 --> 00:21:49,239 And at some point I just gave up 664 00:21:49,240 --> 00:21:51,429 and wrote, that's 665 00:21:51,430 --> 00:21:52,959 not going to work too hard to write that 666 00:21:52,960 --> 00:21:54,489 manually. So keep that in mind. 667 00:21:54,490 --> 00:21:56,109 But for now, this is some reason to be 668 00:21:56,110 --> 00:21:58,299 skeptical that this approach will scale 669 00:21:58,300 --> 00:22:00,639 to full scale, realistic deployed systems 670 00:22:00,640 --> 00:22:02,439 if we have to work this hard on a program 671 00:22:02,440 --> 00:22:04,239 that fits in one page. 672 00:22:04,240 --> 00:22:05,709 But I will come back to that. 673 00:22:10,050 --> 00:22:12,659 All right, so having concretized 674 00:22:12,660 --> 00:22:14,159 things about the use of assistance, you 675 00:22:14,160 --> 00:22:15,449 might have a few questions and I'm going 676 00:22:15,450 --> 00:22:16,739 to try to answer the standard ones. 677 00:22:16,740 --> 00:22:18,899 Now, one is 678 00:22:18,900 --> 00:22:20,849 OK, we know it's pretty hard for 679 00:22:20,850 --> 00:22:22,319 programmers to get the code right. 680 00:22:22,320 --> 00:22:23,789 Despite all these techniques that we 681 00:22:23,790 --> 00:22:25,799 have, there are still frequently high 682 00:22:25,800 --> 00:22:27,030 cost bugs in the wild. 683 00:22:28,080 --> 00:22:30,029 Is it really that much better to make 684 00:22:30,030 --> 00:22:32,009 programmers get the specs right? 685 00:22:32,010 --> 00:22:33,929 For many systems, we have trouble 686 00:22:33,930 --> 00:22:36,509 imagining how a an unambiguous 687 00:22:36,510 --> 00:22:38,549 specification would be much shorter or 688 00:22:38,550 --> 00:22:40,410 simpler than the original code. 689 00:22:41,610 --> 00:22:44,729 One answer is I would claim 690 00:22:44,730 --> 00:22:46,619 verification of this kind is most 691 00:22:46,620 --> 00:22:48,929 worthwhile when we apply it to 692 00:22:48,930 --> 00:22:51,059 the most commonly used system 693 00:22:51,060 --> 00:22:52,829 infrastructure that sits underneath 694 00:22:52,830 --> 00:22:54,179 everything else that we run. 695 00:22:54,180 --> 00:22:56,339 So processors, operating systems, 696 00:22:56,340 --> 00:22:59,009 compilers, databases and so on 697 00:22:59,010 --> 00:23:01,109 our first foundation for 698 00:23:01,110 --> 00:23:02,009 everything else we do. 699 00:23:02,010 --> 00:23:04,079 And second, they do turn out 700 00:23:04,080 --> 00:23:06,149 to have much shorter specifications 701 00:23:06,150 --> 00:23:07,150 than implementations. 702 00:23:08,790 --> 00:23:10,259 For instance, in a compiler, you can have 703 00:23:10,260 --> 00:23:11,759 a specification that doesn't say anything 704 00:23:11,760 --> 00:23:13,499 about the particular optimizations the 705 00:23:13,500 --> 00:23:15,569 compiler uses, even though those are the 706 00:23:15,570 --> 00:23:17,789 most complex parts of it and the easiest 707 00:23:17,790 --> 00:23:19,230 ones to introduce bugs in. 708 00:23:23,140 --> 00:23:25,449 So in my experience, 709 00:23:25,450 --> 00:23:27,369 when we think about what makes a speck a 710 00:23:27,370 --> 00:23:29,529 lot shorter than the original program, 711 00:23:29,530 --> 00:23:31,089 if we take a speck which covers the 712 00:23:31,090 --> 00:23:33,309 functional behavior of a system, and 713 00:23:33,310 --> 00:23:35,529 then we add in optimizations that make 714 00:23:35,530 --> 00:23:37,719 the system fast enough, little 715 00:23:37,720 --> 00:23:39,789 enough memory and so on, then 716 00:23:39,790 --> 00:23:41,829 we recovered the original implementation. 717 00:23:41,830 --> 00:23:43,149 So to the extent there are a lot of 718 00:23:43,150 --> 00:23:45,219 optimizations in the system, then 719 00:23:45,220 --> 00:23:47,439 there's a real opportunity to read 720 00:23:47,440 --> 00:23:49,509 aspect that is much clearer than the 721 00:23:49,510 --> 00:23:50,510 original system. 722 00:23:52,740 --> 00:23:55,319 But still, this is essentially programing 723 00:23:55,320 --> 00:23:56,669 just in a different language, and it's 724 00:23:56,670 --> 00:23:58,169 still hard to get right. 725 00:23:58,170 --> 00:24:00,239 So let's take an example 726 00:24:00,240 --> 00:24:02,189 to think through what might be the 727 00:24:02,190 --> 00:24:03,539 biggest challenges. Imagine we have a 728 00:24:03,540 --> 00:24:05,849 compiler who practice theorem essentially 729 00:24:05,850 --> 00:24:07,709 says the compiler bridges the gap between 730 00:24:07,710 --> 00:24:09,839 the semantics of the source language and 731 00:24:09,840 --> 00:24:11,519 the target language for this purpose. 732 00:24:11,520 --> 00:24:13,349 Let's say the target language is machine 733 00:24:13,350 --> 00:24:14,429 language. 734 00:24:14,430 --> 00:24:16,439 And imagine each of these semantics. 735 00:24:16,440 --> 00:24:17,639 When I write the word semantics, it 736 00:24:17,640 --> 00:24:19,739 really means reference interpreter or 737 00:24:19,740 --> 00:24:20,999 something like that, though usually it's 738 00:24:21,000 --> 00:24:23,190 written in a logical notation. 739 00:24:24,900 --> 00:24:26,399 I only wrote the compiler because I 740 00:24:26,400 --> 00:24:28,199 wanted to use it to compile some 741 00:24:28,200 --> 00:24:31,019 applications and imagine each application 742 00:24:31,020 --> 00:24:33,089 has its own specification and then the 743 00:24:33,090 --> 00:24:34,859 application theorem kind of snaps 744 00:24:34,860 --> 00:24:36,509 together with the compiler theorem 745 00:24:36,510 --> 00:24:37,799 through the semantics of the source 746 00:24:37,800 --> 00:24:39,959 language of the compiler and then down 747 00:24:39,960 --> 00:24:42,149 below AB, my processor that implements 748 00:24:42,150 --> 00:24:43,769 the machine language semantics on top of, 749 00:24:43,770 --> 00:24:46,169 say, VDL semantics or semantics 750 00:24:46,170 --> 00:24:47,969 or whatever language the processor is 751 00:24:47,970 --> 00:24:49,079 written in. 752 00:24:49,080 --> 00:24:51,449 So now think of all this together as one 753 00:24:51,450 --> 00:24:53,849 composite verified unit. 754 00:24:53,850 --> 00:24:55,049 The theorems are approved for the 755 00:24:55,050 --> 00:24:56,669 different parts all compose with each 756 00:24:56,670 --> 00:24:58,829 other to create the abstraction 757 00:24:58,830 --> 00:25:01,319 of a thing that sits 758 00:25:01,320 --> 00:25:03,509 on a particular circuit board and behaves 759 00:25:03,510 --> 00:25:05,579 in this application 760 00:25:05,580 --> 00:25:06,580 specific way. 761 00:25:07,920 --> 00:25:10,049 The really neat thing about this style of 762 00:25:10,050 --> 00:25:12,329 development is that first, 763 00:25:12,330 --> 00:25:14,279 none of the code we wrote is trusted 764 00:25:14,280 --> 00:25:16,499 anymore. It's all internal details 765 00:25:16,500 --> 00:25:18,839 inside this big block and 766 00:25:18,840 --> 00:25:20,339 it's all covered by the proofs that we've 767 00:25:20,340 --> 00:25:22,769 done. Second, even 768 00:25:22,770 --> 00:25:25,049 the internal specifications are now 769 00:25:25,050 --> 00:25:26,069 no longer trusted. 770 00:25:26,070 --> 00:25:28,199 They've become encapsulated details 771 00:25:28,200 --> 00:25:30,269 of the system and it's proof. 772 00:25:30,270 --> 00:25:32,759 So a mistake and say the semantics 773 00:25:32,760 --> 00:25:34,199 of the C programing language, which you 774 00:25:34,200 --> 00:25:35,729 can imagine is pretty easy to make if 775 00:25:35,730 --> 00:25:37,349 you're trying to write out the meaning of 776 00:25:37,350 --> 00:25:39,599 C and not miss any cases does 777 00:25:39,600 --> 00:25:41,939 not allow you to accidentally accept 778 00:25:41,940 --> 00:25:44,009 an incorrect system in this style as 779 00:25:44,010 --> 00:25:45,899 long as you keep that specification 780 00:25:45,900 --> 00:25:48,629 internal to the overall functionality 781 00:25:48,630 --> 00:25:49,630 that you're building. 782 00:25:51,420 --> 00:25:53,519 So let's contrast this with 783 00:25:53,520 --> 00:25:55,019 what we have to do to convince ourselves 784 00:25:55,020 --> 00:25:56,669 in the current standard software 785 00:25:56,670 --> 00:25:58,709 development approach, we have to run not 786 00:25:58,710 --> 00:26:00,419 just unit tests, but our individual 787 00:26:00,420 --> 00:26:02,339 libraries, but also for system 788 00:26:02,340 --> 00:26:04,619 integration tests, because as we compose 789 00:26:04,620 --> 00:26:06,779 libraries and modules together, 790 00:26:06,780 --> 00:26:09,029 the state space of the system is growing 791 00:26:09,030 --> 00:26:11,279 exponentially and therefore it's 792 00:26:11,280 --> 00:26:13,559 much easier to have a complete feeling 793 00:26:13,560 --> 00:26:15,689 coverage for individual functions 794 00:26:15,690 --> 00:26:17,309 than for full systems. 795 00:26:17,310 --> 00:26:19,499 In contrast with the 796 00:26:19,500 --> 00:26:21,629 computer proof assistant approach, 797 00:26:21,630 --> 00:26:23,519 the system integration theorems that we 798 00:26:23,520 --> 00:26:26,309 prove actually imply all the components 799 00:26:26,310 --> 00:26:28,559 inside the box are doing what they need 800 00:26:28,560 --> 00:26:30,689 to do to make the full system work. 801 00:26:30,690 --> 00:26:32,879 And we really fundamentally don't need 802 00:26:32,880 --> 00:26:35,009 the equivalent of unit tests in such a 803 00:26:35,010 --> 00:26:36,479 central role as it has today. 804 00:26:37,950 --> 00:26:40,019 In the old way, we have to do 805 00:26:40,020 --> 00:26:42,209 a careful review of all of our components 806 00:26:42,210 --> 00:26:44,369 because a bug in a corner case in any one 807 00:26:44,370 --> 00:26:46,319 of them could completely destroy our 808 00:26:46,320 --> 00:26:48,359 guarantees of the whole system. 809 00:26:48,360 --> 00:26:51,119 In contrast with the proof base style, 810 00:26:51,120 --> 00:26:53,189 we only need to do careful code review 811 00:26:53,190 --> 00:26:55,649 on the externally facing specifications. 812 00:26:55,650 --> 00:26:57,809 In fact, you could even accept, 813 00:26:57,810 --> 00:26:59,879 in theory, a component from 814 00:26:59,880 --> 00:27:01,019 your worst enemy. 815 00:27:01,020 --> 00:27:03,149 As long as it was proved to meet 816 00:27:03,150 --> 00:27:04,859 the appropriate specification, you can 817 00:27:04,860 --> 00:27:07,619 run it without even inspecting the code. 818 00:27:07,620 --> 00:27:09,749 And this is potentially a way 819 00:27:09,750 --> 00:27:11,849 to support, say, 820 00:27:11,850 --> 00:27:13,679 downloading applications from untrusted 821 00:27:13,680 --> 00:27:15,329 parties and doing proof, checking to 822 00:27:15,330 --> 00:27:17,009 guarantee that they meet particular 823 00:27:17,010 --> 00:27:18,010 security policies. 824 00:27:19,800 --> 00:27:22,049 So another concern I showed 825 00:27:22,050 --> 00:27:24,029 that as I tried to write out that proof 826 00:27:24,030 --> 00:27:26,129 for a medium sized program modeled after 827 00:27:26,130 --> 00:27:28,079 a compiler optimization, it got out of 828 00:27:28,080 --> 00:27:30,569 hand quickly and I ran out of patience. 829 00:27:30,570 --> 00:27:33,299 Well, we're familiar with the scenario, 830 00:27:33,300 --> 00:27:35,399 if you imagine it's nineteen 831 00:27:35,400 --> 00:27:37,379 fifty or so and someone says writing 832 00:27:37,380 --> 00:27:38,939 machine code program sure is a lot of 833 00:27:38,940 --> 00:27:39,359 work. 834 00:27:39,360 --> 00:27:40,979 It's never going to be very popular 835 00:27:40,980 --> 00:27:42,479 today. We know we have compilers that 836 00:27:42,480 --> 00:27:44,339 will go from successively higher levels 837 00:27:44,340 --> 00:27:45,779 of abstraction and automatically write 838 00:27:45,780 --> 00:27:46,889 that code for us. 839 00:27:46,890 --> 00:27:48,689 And by the way, we also have libraries at 840 00:27:48,690 --> 00:27:50,759 each of these language levels that we 841 00:27:50,760 --> 00:27:52,859 can typically reuse for most of the work. 842 00:27:52,860 --> 00:27:54,629 Say, if you're putting together a Web 843 00:27:54,630 --> 00:27:56,999 application with a popular framework, 844 00:27:57,000 --> 00:27:58,739 we don't expect to rewrite everything 845 00:27:58,740 --> 00:27:59,849 each time. 846 00:27:59,850 --> 00:28:02,009 In fact, the same story applies 847 00:28:02,010 --> 00:28:03,629 in the world of machine check. 848 00:28:03,630 --> 00:28:05,429 They're improving where we can build up 849 00:28:05,430 --> 00:28:07,859 layers of automatic proof generators. 850 00:28:07,860 --> 00:28:09,959 We can build up libraries of Limas that 851 00:28:09,960 --> 00:28:12,269 we appeal to for commonly 852 00:28:12,270 --> 00:28:13,829 useful facts. 853 00:28:13,830 --> 00:28:15,839 And this is the secret to building up an 854 00:28:15,840 --> 00:28:18,299 ecosystem of tools to make it 855 00:28:18,300 --> 00:28:20,399 relatively low cost to create a new 856 00:28:20,400 --> 00:28:22,499 development with a proof that 857 00:28:22,500 --> 00:28:24,659 is roughly similar to one you've 858 00:28:24,660 --> 00:28:25,660 done previously. 859 00:28:26,520 --> 00:28:28,289 So let me now go back in and make these 860 00:28:28,290 --> 00:28:30,599 proofs that I wrote before a little more 861 00:28:30,600 --> 00:28:31,769 appetizing. 862 00:28:31,770 --> 00:28:34,349 And so I'll 863 00:28:34,350 --> 00:28:35,549 restart this one. 864 00:28:37,650 --> 00:28:40,109 So I wrote this proof out here 865 00:28:40,110 --> 00:28:42,059 with some very manual steps, but it turns 866 00:28:42,060 --> 00:28:44,159 out I can just do this. 867 00:28:44,160 --> 00:28:46,229 I can. First, I'm going to use 868 00:28:46,230 --> 00:28:47,640 a semicolon to say 869 00:28:49,080 --> 00:28:51,179 take this, run this step a bunch of 870 00:28:51,180 --> 00:28:53,539 new stuff. Goals will exist afterward, 871 00:28:53,540 --> 00:28:55,599 then we're on this one, on all of them, 872 00:28:55,600 --> 00:28:57,729 and then run this one and all 873 00:28:57,730 --> 00:28:59,259 them, the next one has a kind of funny 874 00:28:59,260 --> 00:29:01,059 name called intuition that actually 875 00:29:01,060 --> 00:29:03,339 refers to intuitionist logic, 876 00:29:03,340 --> 00:29:05,889 but people 877 00:29:05,890 --> 00:29:07,569 typically experience it in a different 878 00:29:07,570 --> 00:29:09,789 way at first. That's not what it means. 879 00:29:09,790 --> 00:29:11,650 And then that just proves everything. 880 00:29:16,370 --> 00:29:18,469 And let's see if we can get lucky with 881 00:29:18,470 --> 00:29:19,730 the same thing, the next proof. 882 00:29:21,780 --> 00:29:22,780 Now it's called less 883 00:29:23,940 --> 00:29:26,489 and almost it almost worked. 884 00:29:26,490 --> 00:29:28,109 What we're missing is 885 00:29:30,090 --> 00:29:31,919 recall down here, I mentioned Alema. 886 00:29:31,920 --> 00:29:33,809 We already proved Caulked hadn't been 887 00:29:33,810 --> 00:29:35,279 told that it should consider that Lamba. 888 00:29:35,280 --> 00:29:36,479 So it didn't try. 889 00:29:36,480 --> 00:29:37,680 So what if I say 890 00:29:39,060 --> 00:29:40,319 two steps, I'm going to take dilema. 891 00:29:40,320 --> 00:29:42,719 We already proved it's called length 892 00:29:42,720 --> 00:29:43,919 Pend. 893 00:29:43,920 --> 00:29:46,059 And I'm going to say, hey, remember that 894 00:29:46,060 --> 00:29:47,879 Lemhi approved? That'll be useful. 895 00:29:47,880 --> 00:29:49,769 Try to use it as a rewriting rule 896 00:29:51,300 --> 00:29:53,730 and I can say on a rewrite with core. 897 00:29:56,320 --> 00:29:58,389 And now it's applied here 898 00:29:58,390 --> 00:29:59,709 and we're almost done. 899 00:29:59,710 --> 00:30:01,539 I just need to say these three steps 900 00:30:01,540 --> 00:30:03,759 should be done over and over again until 901 00:30:03,760 --> 00:30:05,139 theorem is proved. 902 00:30:05,140 --> 00:30:07,599 So this is an example of a loop here. 903 00:30:07,600 --> 00:30:08,979 This is actually a Turing complete 904 00:30:08,980 --> 00:30:10,119 scripting language. 905 00:30:10,120 --> 00:30:11,500 It's very different from 906 00:30:12,790 --> 00:30:15,039 the the primitive term 907 00:30:15,040 --> 00:30:17,589 language with just 10 or so steps in it. 908 00:30:17,590 --> 00:30:19,599 We can build up whatever abstractions we 909 00:30:19,600 --> 00:30:21,699 need here with loops and recursion and 910 00:30:21,700 --> 00:30:22,809 data structures. 911 00:30:22,810 --> 00:30:24,129 So what we're really doing is we're 912 00:30:24,130 --> 00:30:26,109 writing scripts that find the proofs for 913 00:30:26,110 --> 00:30:27,279 us. 914 00:30:27,280 --> 00:30:28,839 A particular script is designed to be 915 00:30:28,840 --> 00:30:30,639 readable by humans. 916 00:30:30,640 --> 00:30:32,079 But because you can read the script 917 00:30:32,080 --> 00:30:34,419 doesn't mean you understand every precise 918 00:30:34,420 --> 00:30:36,579 detail behind the low level proof 919 00:30:36,580 --> 00:30:38,259 it's generating. That would generally be 920 00:30:38,260 --> 00:30:39,569 completely overwhelming to have to 921 00:30:39,570 --> 00:30:41,109 understand those details. 922 00:30:41,110 --> 00:30:42,459 And a nice thing about writing scripts 923 00:30:42,460 --> 00:30:44,109 like this is that often if you change the 924 00:30:44,110 --> 00:30:45,969 system, you're verifying if you wrote the 925 00:30:45,970 --> 00:30:47,889 script well, it will keep working. 926 00:30:47,890 --> 00:30:49,539 And that's a partial answer to the 927 00:30:49,540 --> 00:30:51,189 concerns from that nineteen seventy nine 928 00:30:51,190 --> 00:30:53,529 article about Proops keeping 929 00:30:53,530 --> 00:30:55,119 up with changes to systems. 930 00:30:55,120 --> 00:30:57,219 So it's an engineering problem. 931 00:30:57,220 --> 00:31:00,309 Just like good library design. 932 00:31:00,310 --> 00:31:02,379 We want to design good proofs that 933 00:31:02,380 --> 00:31:04,569 are adaptable to many similar 934 00:31:04,570 --> 00:31:05,739 situations. 935 00:31:05,740 --> 00:31:07,389 We don't have to give up any formal 936 00:31:07,390 --> 00:31:09,549 assurance if we do this properly. 937 00:31:09,550 --> 00:31:11,019 OK, so here's this example, which I 938 00:31:11,020 --> 00:31:12,789 didn't have time to explain in detail. 939 00:31:12,790 --> 00:31:14,529 I just want to show how short this proof 940 00:31:14,530 --> 00:31:16,029 can be approved to us before we won't 941 00:31:16,030 --> 00:31:17,030 even need those. 942 00:31:18,910 --> 00:31:20,980 What we can do here, Adduction E. 943 00:31:22,980 --> 00:31:24,089 Let's see. 944 00:31:24,090 --> 00:31:26,549 Simple mention, 945 00:31:26,550 --> 00:31:27,780 a few cases are left 946 00:31:28,920 --> 00:31:31,079 when I see you, I look at this is this 947 00:31:31,080 --> 00:31:33,179 program contains a program 948 00:31:33,180 --> 00:31:35,249 level pattern match over a term 949 00:31:35,250 --> 00:31:37,709 this we should that that's essentially 950 00:31:37,710 --> 00:31:39,299 a case split in the program. 951 00:31:39,300 --> 00:31:41,129 We should mirror it with a case split in 952 00:31:41,130 --> 00:31:43,559 the proof. So let's try that 953 00:31:43,560 --> 00:31:45,479 match goal is construction for examining 954 00:31:45,480 --> 00:31:46,559 what we're trying to prove and doing 955 00:31:46,560 --> 00:31:47,759 something differently based on what we 956 00:31:47,760 --> 00:31:48,779 see. 957 00:31:48,780 --> 00:31:50,939 And what I'll do is 958 00:31:50,940 --> 00:31:52,769 I'll look for any pattern match. 959 00:31:53,880 --> 00:31:55,799 This is a syntax for matching any pattern 960 00:31:55,800 --> 00:31:57,629 match inside the goal. 961 00:31:57,630 --> 00:31:59,729 And I will do a case analysis 962 00:31:59,730 --> 00:32:01,949 on that and then simplify 963 00:32:01,950 --> 00:32:03,599 things afterward. 964 00:32:03,600 --> 00:32:05,699 And I'll say do 965 00:32:05,700 --> 00:32:07,589 this over and over again on all the 966 00:32:07,590 --> 00:32:09,539 subgoals that we generated. 967 00:32:09,540 --> 00:32:11,219 Now we're very close to done. 968 00:32:11,220 --> 00:32:14,069 We have just a bunch of arithmetic facts 969 00:32:14,070 --> 00:32:16,139 and here we have an eval on 970 00:32:16,140 --> 00:32:18,239 a particular term which given 971 00:32:18,240 --> 00:32:19,709 the definition of eval, take that word 972 00:32:19,710 --> 00:32:20,789 for it can be simplified. 973 00:32:20,790 --> 00:32:22,799 So if I did say simplify, not just below 974 00:32:22,800 --> 00:32:24,929 the line, but everywhere, then those will 975 00:32:24,930 --> 00:32:26,989 go into simpler forms as well. 976 00:32:26,990 --> 00:32:28,529 And now I notice I have this form with a 977 00:32:28,530 --> 00:32:30,839 bunch of things are equal to evaluation 978 00:32:30,840 --> 00:32:32,969 results. I want to use those equality's 979 00:32:32,970 --> 00:32:35,429 to rewrite down below here. 980 00:32:35,430 --> 00:32:38,069 So let me do that as well. 981 00:32:38,070 --> 00:32:40,170 Repeatedly look for. 982 00:32:42,420 --> 00:32:44,909 Cases of hypothesis that 983 00:32:44,910 --> 00:32:46,499 says something is equal to evaluate 984 00:32:46,500 --> 00:32:47,500 something. 985 00:32:48,930 --> 00:32:51,119 Rewrite that and then 986 00:32:51,120 --> 00:32:52,120 get rid of it. 987 00:32:52,830 --> 00:32:54,929 So now all those are gone, we get simple 988 00:32:54,930 --> 00:32:56,879 looking things like an equal zero percent 989 00:32:56,880 --> 00:32:58,469 and it turns out I have how many of 990 00:32:58,470 --> 00:33:01,089 these? It says I have 53 of these facts, 991 00:33:01,090 --> 00:33:02,759 a whole bunch of not two interesting 992 00:33:02,760 --> 00:33:04,169 things with zeroes and pluses and 993 00:33:04,170 --> 00:33:04,859 everything. 994 00:33:04,860 --> 00:33:06,569 It turns out all of these follow from the 995 00:33:06,570 --> 00:33:08,369 axioms of the algebraic structure called 996 00:33:08,370 --> 00:33:10,469 Siemering, which the natural numbers 997 00:33:10,470 --> 00:33:11,669 that I'm using here belong to. 998 00:33:11,670 --> 00:33:14,369 So I can just say prove that by ring 999 00:33:14,370 --> 00:33:15,869 and then 1000 00:33:17,220 --> 00:33:19,379 that is correct for 1001 00:33:19,380 --> 00:33:20,729 that proof. So 1002 00:33:21,810 --> 00:33:23,429 cool thing about this kind of proof. 1003 00:33:23,430 --> 00:33:25,499 Imagine I added a completely 1004 00:33:25,500 --> 00:33:27,659 new kind of note to my syntax trees 1005 00:33:27,660 --> 00:33:29,220 like I got really, 1006 00:33:31,710 --> 00:33:34,349 really ambitious and added subtraction. 1007 00:33:34,350 --> 00:33:36,629 And then this same proof 1008 00:33:36,630 --> 00:33:38,369 would pretty much work without any 1009 00:33:38,370 --> 00:33:40,859 changes and even more ambitious 1010 00:33:40,860 --> 00:33:41,999 changes to the language or the 1011 00:33:42,000 --> 00:33:44,339 optimization could lead to similar 1012 00:33:44,340 --> 00:33:46,799 outcomes. So this essentially is how 1013 00:33:46,800 --> 00:33:49,079 proof authoring scales to large systems 1014 00:33:49,080 --> 00:33:50,999 and to rapidly changing systems. 1015 00:33:53,550 --> 00:33:54,900 So back to these slides. 1016 00:33:57,300 --> 00:33:59,369 So this is the underlying technology 1017 00:33:59,370 --> 00:34:00,959 for a bunch of different cool things that 1018 00:34:00,960 --> 00:34:02,779 we can do with proof assistance. 1019 00:34:02,780 --> 00:34:05,069 So the traditional mode math proof 1020 00:34:05,070 --> 00:34:07,199 says the user has some sort of theorem in 1021 00:34:07,200 --> 00:34:08,189 mind. 1022 00:34:08,190 --> 00:34:09,539 It's a proof of it passes it off to the 1023 00:34:09,540 --> 00:34:11,039 proof checker and we make sure that it's 1024 00:34:11,040 --> 00:34:12,089 convincing. 1025 00:34:12,090 --> 00:34:13,499 We do the same thing for programs where 1026 00:34:13,500 --> 00:34:16,259 we start with an implementation and 1027 00:34:16,260 --> 00:34:18,479 prove a theorem about it and 1028 00:34:18,480 --> 00:34:19,799 check that that proof is true. 1029 00:34:19,800 --> 00:34:21,299 That's called verification. 1030 00:34:21,300 --> 00:34:22,678 There's also kind of the other direction 1031 00:34:22,679 --> 00:34:23,999 called synthesis, where we don't even 1032 00:34:24,000 --> 00:34:25,948 have the artifact to begin with. 1033 00:34:25,949 --> 00:34:28,138 We just have a wish list, a specification 1034 00:34:28,139 --> 00:34:30,059 that says, I sure wish I had a foobar 1035 00:34:30,060 --> 00:34:31,079 that did best. 1036 00:34:31,080 --> 00:34:33,059 And then we actually passed that 1037 00:34:33,060 --> 00:34:35,009 specification into some automated system 1038 00:34:35,010 --> 00:34:37,049 that builds the implementation for us and 1039 00:34:37,050 --> 00:34:38,519 actually builds the proof of the 1040 00:34:38,520 --> 00:34:39,959 implementation at the same time. 1041 00:34:39,960 --> 00:34:41,459 And we can check that proof without 1042 00:34:41,460 --> 00:34:43,198 having to trust the synthesis engine that 1043 00:34:43,199 --> 00:34:44,369 did this work for us. 1044 00:34:44,370 --> 00:34:45,749 Both of these can often be useful and 1045 00:34:45,750 --> 00:34:47,340 they build on the same technology. 1046 00:34:48,810 --> 00:34:50,669 I also showed examples so far that only 1047 00:34:50,670 --> 00:34:52,229 write programs in the functional language 1048 00:34:52,230 --> 00:34:54,419 that's built into the caucus system. 1049 00:34:54,420 --> 00:34:56,519 But it's such a rich logic that we 1050 00:34:56,520 --> 00:34:58,019 could encode pretty much any other 1051 00:34:58,020 --> 00:34:58,979 language you can think of. 1052 00:34:58,980 --> 00:35:00,269 We can write down its syntax and 1053 00:35:00,270 --> 00:35:01,270 semantics. 1054 00:35:02,700 --> 00:35:04,469 So here are some examples of languages 1055 00:35:04,470 --> 00:35:06,689 that developments in Cork have worked 1056 00:35:06,690 --> 00:35:08,699 with by formalizing what exactly these 1057 00:35:08,700 --> 00:35:10,649 languages are going all the way from high 1058 00:35:10,650 --> 00:35:12,659 level database, query languages down to 1059 00:35:12,660 --> 00:35:14,129 hardware, description languages, and we 1060 00:35:14,130 --> 00:35:16,139 can find these, compose them, do all 1061 00:35:16,140 --> 00:35:17,140 sorts of cool things. 1062 00:35:18,510 --> 00:35:20,669 We can also use a wide variety of tools 1063 00:35:20,670 --> 00:35:22,500 to construct our programs and our proofs. 1064 00:35:23,580 --> 00:35:25,499 We can use compilers between languages. 1065 00:35:25,500 --> 00:35:27,359 We can build automated proofers that live 1066 00:35:27,360 --> 00:35:29,309 within the ecosystem and generate calk 1067 00:35:29,310 --> 00:35:31,979 proofs. We can build synthesis tools. 1068 00:35:31,980 --> 00:35:33,749 The key thing is every tool we introduce 1069 00:35:33,750 --> 00:35:35,849 in this ecosystem, just like, 1070 00:35:35,850 --> 00:35:38,339 say, in a regular software ecosystem, 1071 00:35:38,340 --> 00:35:39,959 everything needs to somehow connect to a 1072 00:35:39,960 --> 00:35:41,909 common binary object format. 1073 00:35:41,910 --> 00:35:43,349 Here, everything needs to connect to a 1074 00:35:43,350 --> 00:35:45,209 common proof format. 1075 00:35:45,210 --> 00:35:46,499 The most straightforward way of doing 1076 00:35:46,500 --> 00:35:48,359 that is to take each of these pieces, say 1077 00:35:48,360 --> 00:35:50,939 some arrow at this translation, and prove 1078 00:35:50,940 --> 00:35:53,249 that the translator itself is correct. 1079 00:35:53,250 --> 00:35:55,049 It always maps inputs to appropriate 1080 00:35:55,050 --> 00:35:57,149 outputs. But another approach is to 1081 00:35:57,150 --> 00:35:59,309 create a proof generating piece 1082 00:35:59,310 --> 00:36:01,169 that, for instance, takes in an input, 1083 00:36:01,170 --> 00:36:02,969 produces not just the output, but also a 1084 00:36:02,970 --> 00:36:04,859 proof that that particular output is 1085 00:36:04,860 --> 00:36:07,049 correct. And for different circumstances, 1086 00:36:07,050 --> 00:36:08,879 it's more appropriate to use one of these 1087 00:36:08,880 --> 00:36:10,679 two styles. Luckily, the same proof 1088 00:36:10,680 --> 00:36:12,090 checker can be used with both of them. 1089 00:36:13,620 --> 00:36:15,299 So to get more concrete, let me tell you 1090 00:36:15,300 --> 00:36:16,739 about two projects that I've been working 1091 00:36:16,740 --> 00:36:17,759 on recently. 1092 00:36:17,760 --> 00:36:19,049 First one is called Commy. 1093 00:36:19,050 --> 00:36:21,389 It's all about proof, 1094 00:36:21,390 --> 00:36:23,199 support for digital hardware. 1095 00:36:23,200 --> 00:36:24,719 It's joint work with a number of other 1096 00:36:24,720 --> 00:36:26,219 people at MIT listed here. 1097 00:36:27,360 --> 00:36:29,969 And basically the story is, 1098 00:36:29,970 --> 00:36:32,219 let's imagine that we want to support 1099 00:36:32,220 --> 00:36:34,109 rapid opensource development of new 1100 00:36:34,110 --> 00:36:36,299 digital hardware systems by mashing up 1101 00:36:36,300 --> 00:36:38,339 components from libraries. 1102 00:36:38,340 --> 00:36:39,869 So, for instance, we might pull some 1103 00:36:39,870 --> 00:36:42,089 processors out of one library, a cache 1104 00:36:42,090 --> 00:36:43,859 coherent memory system out of another. 1105 00:36:43,860 --> 00:36:45,779 Maybe we write our own new accelerators 1106 00:36:45,780 --> 00:36:47,579 for a particular purpose. 1107 00:36:47,580 --> 00:36:49,649 And by the way, all these boxes have 1108 00:36:49,650 --> 00:36:52,319 proofs. So when you assemble your system, 1109 00:36:52,320 --> 00:36:54,059 you don't just run it. 1110 00:36:54,060 --> 00:36:56,189 You also combine the proofs of the pieces 1111 00:36:56,190 --> 00:36:58,049 into a proof that the full system is 1112 00:36:58,050 --> 00:36:59,249 correct. 1113 00:36:59,250 --> 00:37:01,499 And you can potentially know correctness 1114 00:37:01,500 --> 00:37:03,899 without needing to do any debugging. 1115 00:37:03,900 --> 00:37:04,900 If you've 1116 00:37:06,480 --> 00:37:08,369 proved your top level, it's written your 1117 00:37:08,370 --> 00:37:09,869 top level theorem correctly and check 1118 00:37:09,870 --> 00:37:11,759 that it follows from the constituent 1119 00:37:11,760 --> 00:37:12,760 proofs. 1120 00:37:13,680 --> 00:37:15,069 What do I mean by correctness? 1121 00:37:15,070 --> 00:37:17,279 Well, imagine we have a highly optimized 1122 00:37:17,280 --> 00:37:19,259 implementation of some hardware component 1123 00:37:19,260 --> 00:37:21,209 and we've also written a specification 1124 00:37:21,210 --> 00:37:22,709 which is essentially the simplest 1125 00:37:22,710 --> 00:37:25,199 possible way to express functionally 1126 00:37:25,200 --> 00:37:26,909 what that box is supposed to do. 1127 00:37:26,910 --> 00:37:29,399 We ignore optimization, performance, etc. 1128 00:37:29,400 --> 00:37:30,479 We might call this a reference 1129 00:37:30,480 --> 00:37:32,699 implementation, not just a spec, 1130 00:37:32,700 --> 00:37:34,769 and we want to prove 1131 00:37:34,770 --> 00:37:36,899 it in the semantics literature 1132 00:37:36,900 --> 00:37:38,399 we would call a contextual refinement 1133 00:37:38,400 --> 00:37:39,749 property. 1134 00:37:39,750 --> 00:37:41,279 What that really means is let's invite 1135 00:37:41,280 --> 00:37:43,409 into the picture and omniscient judge is 1136 00:37:43,410 --> 00:37:45,299 going to very carefully inspect both 1137 00:37:45,300 --> 00:37:46,289 boxes. 1138 00:37:46,290 --> 00:37:47,759 He's going to send them inputs, look at 1139 00:37:47,760 --> 00:37:49,889 the outputs. And what he's trying to do 1140 00:37:49,890 --> 00:37:51,599 is find any difference in the way they 1141 00:37:51,600 --> 00:37:52,499 behave. 1142 00:37:52,500 --> 00:37:54,359 So if this judge can ever get the 1143 00:37:54,360 --> 00:37:56,429 implementation to have some 1144 00:37:56,430 --> 00:37:58,559 externally visible EYOB behavior that 1145 00:37:58,560 --> 00:38:00,989 the spec couldn't possibly generate, 1146 00:38:00,990 --> 00:38:03,389 then our system loses. 1147 00:38:03,390 --> 00:38:05,489 But if an every possible interaction he 1148 00:38:05,490 --> 00:38:07,529 could have, anything the left side can 1149 00:38:07,530 --> 00:38:09,419 do, the right side can also do, then we 1150 00:38:09,420 --> 00:38:11,489 say it is a correct implementation. 1151 00:38:13,140 --> 00:38:14,879 And in our system, we write these 1152 00:38:14,880 --> 00:38:16,769 hardware components in the blue spec, 1153 00:38:16,770 --> 00:38:18,839 high level hardware language, which 1154 00:38:18,840 --> 00:38:20,519 makes it particularly straightforward to 1155 00:38:20,520 --> 00:38:22,319 define what exactly are the legal ways to 1156 00:38:22,320 --> 00:38:25,169 interact with the the system. 1157 00:38:25,170 --> 00:38:27,449 And importantly, many internal 1158 00:38:27,450 --> 00:38:29,009 implementation details may not be 1159 00:38:29,010 --> 00:38:30,719 observed directly, just like you can't 1160 00:38:30,720 --> 00:38:32,609 peek into the private fields of a Java 1161 00:38:32,610 --> 00:38:34,199 class. And that's important for 1162 00:38:34,200 --> 00:38:36,059 facilitating optimization without 1163 00:38:36,060 --> 00:38:37,739 foreseeing the details in the spec. 1164 00:38:39,630 --> 00:38:41,249 So a few words about the brushback 1165 00:38:41,250 --> 00:38:43,169 language. It's kind of it in an object 1166 00:38:43,170 --> 00:38:45,089 oriented style where program modules are 1167 00:38:45,090 --> 00:38:47,399 objects that have private state and 1168 00:38:47,400 --> 00:38:48,839 public methods that are the only ones 1169 00:38:48,840 --> 00:38:50,729 allowed to access the private state. 1170 00:38:50,730 --> 00:38:52,739 And methods can call other methods and 1171 00:38:52,740 --> 00:38:53,760 other objects 1172 00:38:55,290 --> 00:38:56,879 every method. Karl appears to execute 1173 00:38:56,880 --> 00:38:58,799 atomically without any interleaving 1174 00:38:58,800 --> 00:38:59,800 concurrency. 1175 00:39:01,080 --> 00:39:03,209 And we can summarize any 1176 00:39:03,210 --> 00:39:05,519 atomic method call with a trace of all 1177 00:39:05,520 --> 00:39:07,709 the sub method calls that happened, 1178 00:39:07,710 --> 00:39:09,149 some coming into this module or some 1179 00:39:09,150 --> 00:39:10,829 going out to others with their arguments 1180 00:39:10,830 --> 00:39:12,209 and return values. 1181 00:39:12,210 --> 00:39:14,099 And then this property that connects the 1182 00:39:14,100 --> 00:39:15,569 implementation to the back is basically 1183 00:39:15,570 --> 00:39:17,729 saying any trace that my implementation 1184 00:39:17,730 --> 00:39:19,559 could ever generate could also be 1185 00:39:19,560 --> 00:39:22,529 generated by the specification. 1186 00:39:22,530 --> 00:39:24,449 This is a nice modular way of setting 1187 00:39:24,450 --> 00:39:25,949 things up because it allows you to then 1188 00:39:25,950 --> 00:39:28,469 link modules or objects together 1189 00:39:28,470 --> 00:39:30,089 as you do that certain methods become 1190 00:39:30,090 --> 00:39:32,159 private and they stop appearing in the 1191 00:39:32,160 --> 00:39:33,899 traces and you can hide the details of 1192 00:39:33,900 --> 00:39:35,549 how your spec is implemented with a 1193 00:39:35,550 --> 00:39:37,259 particular hierarchy of optimized 1194 00:39:37,260 --> 00:39:38,260 components. 1195 00:39:40,170 --> 00:39:42,479 So let me show you a quick 1196 00:39:42,480 --> 00:39:44,279 example of some code in the county 1197 00:39:44,280 --> 00:39:45,280 framework. 1198 00:39:46,230 --> 00:39:48,480 This is called Tutorial. 1199 00:39:52,810 --> 00:39:54,969 Here is some code for 1200 00:39:54,970 --> 00:39:57,189 this is a producer consumer example with 1201 00:39:57,190 --> 00:39:59,559 with two hardware nodes, with a 1202 00:39:59,560 --> 00:40:00,609 Q in between them. 1203 00:40:00,610 --> 00:40:02,229 This is the producer side. 1204 00:40:02,230 --> 00:40:04,359 It's a module with a particular register 1205 00:40:04,360 --> 00:40:06,279 in it and a rule that might run on a 1206 00:40:06,280 --> 00:40:08,409 particular clock cycle, reads a register, 1207 00:40:08,410 --> 00:40:10,659 calls a method of another module, 1208 00:40:10,660 --> 00:40:11,679 etc.. 1209 00:40:11,680 --> 00:40:13,389 One thing I want to point out here, Corke 1210 00:40:13,390 --> 00:40:15,009 has an extensible parser. 1211 00:40:15,010 --> 00:40:16,599 So in this framework, we've talked 1212 00:40:16,600 --> 00:40:18,459 caulked the syntax of this hardware 1213 00:40:18,460 --> 00:40:19,599 description, language. There's no 1214 00:40:19,600 --> 00:40:21,729 built-In keyword module register rule any 1215 00:40:21,730 --> 00:40:23,589 of that. We've defined the syntax and 1216 00:40:23,590 --> 00:40:26,349 semantics in logic of this 1217 00:40:26,350 --> 00:40:28,689 hardware language, and that allows 1218 00:40:28,690 --> 00:40:30,879 us to use it to define 1219 00:40:30,880 --> 00:40:31,989 systems and do proofs. 1220 00:40:33,310 --> 00:40:34,929 And here's the consumer definition. 1221 00:40:34,930 --> 00:40:37,419 It's a lot shorter than we define a spec 1222 00:40:37,420 --> 00:40:39,039 that actually combines the producer and 1223 00:40:39,040 --> 00:40:40,749 the consumer into one module and avoids 1224 00:40:40,750 --> 00:40:42,489 using an intermediate queue. 1225 00:40:42,490 --> 00:40:44,619 And then with less than one hundred lines 1226 00:40:44,620 --> 00:40:46,179 in the file, we get down and prove a 1227 00:40:46,180 --> 00:40:48,309 theorem that the implementation 1228 00:40:48,310 --> 00:40:50,289 is a correct refinement of the 1229 00:40:50,290 --> 00:40:52,209 specification and take my word for it 1230 00:40:52,210 --> 00:40:54,339 all. This can be scaled up to pipeline 1231 00:40:54,340 --> 00:40:55,989 processors, cash, coherent memory 1232 00:40:55,990 --> 00:40:58,389 systems. We've done all those examples, 1233 00:40:58,390 --> 00:41:00,009 so we prove those individually. 1234 00:41:00,010 --> 00:41:02,139 We compose them and also compose 1235 00:41:02,140 --> 00:41:03,759 their proofs at the same time. 1236 00:41:03,760 --> 00:41:06,099 And it is quite 1237 00:41:06,100 --> 00:41:07,100 general. 1238 00:41:10,180 --> 00:41:12,249 I also want to mention one 1239 00:41:12,250 --> 00:41:13,599 application area that we've been working 1240 00:41:13,600 --> 00:41:15,339 on for this technology, some of you might 1241 00:41:15,340 --> 00:41:17,199 have heard of the risk five instruction 1242 00:41:17,200 --> 00:41:18,549 set. I think there was another 1243 00:41:18,550 --> 00:41:20,739 verification talk about it on the first 1244 00:41:20,740 --> 00:41:22,959 day of the Congress and I'm wearing 1245 00:41:22,960 --> 00:41:24,729 it five t shirt here in the back. 1246 00:41:24,730 --> 00:41:26,859 It says Instruction sets want to be 1247 00:41:26,860 --> 00:41:29,379 free. Risk five is is basically 1248 00:41:29,380 --> 00:41:31,329 like the Linux of hardware instruction 1249 00:41:31,330 --> 00:41:33,429 sets. If you think of Linux in nineteen 1250 00:41:33,430 --> 00:41:35,649 ninety five or so, it is an 1251 00:41:35,650 --> 00:41:37,179 open instruction set controlled by a 1252 00:41:37,180 --> 00:41:38,319 nonprofit foundation. 1253 00:41:38,320 --> 00:41:40,059 In terms of the definition of what are 1254 00:41:40,060 --> 00:41:41,979 the instructions, what do they do? 1255 00:41:41,980 --> 00:41:44,409 And has a lot of opportunities 1256 00:41:44,410 --> 00:41:46,269 for getting involved, including informal 1257 00:41:46,270 --> 00:41:48,399 methods. So we've been working within 1258 00:41:48,400 --> 00:41:50,349 the risk by foundation to formally define 1259 00:41:50,350 --> 00:41:52,119 the meaning of the instruction set. 1260 00:41:52,120 --> 00:41:53,919 We've been verifying components that you 1261 00:41:53,920 --> 00:41:55,689 can use to assemble your own responsive 1262 00:41:55,690 --> 00:41:57,049 systems with correctness. 1263 00:41:57,050 --> 00:41:59,439 Proops We've been working towards some 1264 00:41:59,440 --> 00:42:01,539 support for verifying the correctness 1265 00:42:01,540 --> 00:42:03,249 of risk by a machine code as well as 1266 00:42:03,250 --> 00:42:05,259 infrastructure for composing those proofs 1267 00:42:05,260 --> 00:42:06,260 together. 1268 00:42:06,940 --> 00:42:09,309 And there are some nice consequences 1269 00:42:09,310 --> 00:42:11,229 of this open model that fits really well 1270 00:42:11,230 --> 00:42:12,399 with formal methods. So in the 1271 00:42:12,400 --> 00:42:14,109 traditional world, the processor 1272 00:42:14,110 --> 00:42:15,969 manufacturers control the intellectual 1273 00:42:15,970 --> 00:42:17,979 property that defines the instruction set 1274 00:42:17,980 --> 00:42:19,569 and the processors, and they actually 1275 00:42:19,570 --> 00:42:21,789 don't want most developers to know 1276 00:42:21,790 --> 00:42:23,469 how things really work because then they 1277 00:42:23,470 --> 00:42:24,969 could potentially build competing 1278 00:42:24,970 --> 00:42:27,399 processors and and 1279 00:42:27,400 --> 00:42:28,809 hurt the bottom lines of the companies 1280 00:42:28,810 --> 00:42:29,859 that own the IP. 1281 00:42:29,860 --> 00:42:32,259 But in the risk five open model, 1282 00:42:32,260 --> 00:42:34,209 then the we can have a formal semantics 1283 00:42:34,210 --> 00:42:35,469 at the center of everything that says 1284 00:42:35,470 --> 00:42:37,149 exactly what the instruction set is, 1285 00:42:37,150 --> 00:42:39,279 allows everyone to build a cloned version 1286 00:42:39,280 --> 00:42:40,869 of the processor. 1287 00:42:40,870 --> 00:42:42,279 All those can be made open source. 1288 00:42:42,280 --> 00:42:43,569 They can all have theorems associated 1289 00:42:43,570 --> 00:42:45,549 with them connected to this formal spec. 1290 00:42:45,550 --> 00:42:47,019 Others can come along and mash up 1291 00:42:47,020 --> 00:42:50,679 existing components and 1292 00:42:50,680 --> 00:42:52,389 it can dramatically 1293 00:42:53,620 --> 00:42:56,349 lower the cost of getting into 1294 00:42:56,350 --> 00:42:58,239 hardware development in this kind of 1295 00:42:58,240 --> 00:42:59,709 space. So we're trying to build that 1296 00:42:59,710 --> 00:43:01,480 formal semantics to enable this story. 1297 00:43:04,460 --> 00:43:05,959 A second case study I want to talk about 1298 00:43:05,960 --> 00:43:07,969 to finish up here is what we've done in 1299 00:43:07,970 --> 00:43:10,069 cryptographic code, namely elliptic 1300 00:43:10,070 --> 00:43:12,199 curve cryptography, which is used in 1301 00:43:12,200 --> 00:43:14,759 class and many other settings, 1302 00:43:14,760 --> 00:43:17,329 also joint work with a few folks at MIT. 1303 00:43:17,330 --> 00:43:19,399 And so here's 1304 00:43:19,400 --> 00:43:20,569 the story here. 1305 00:43:20,570 --> 00:43:22,009 Cryptography, as we know, is really 1306 00:43:22,010 --> 00:43:24,379 important. And most applications 1307 00:43:24,380 --> 00:43:26,269 using cryptography just take libraries 1308 00:43:26,270 --> 00:43:27,529 off the shelf, which are written by a 1309 00:43:27,530 --> 00:43:29,569 small handful of elite cryptographic 1310 00:43:29,570 --> 00:43:30,570 implementers. 1311 00:43:31,490 --> 00:43:32,659 Let's think about how hard they have to 1312 00:43:32,660 --> 00:43:34,639 work to get everything together here. 1313 00:43:34,640 --> 00:43:36,379 There's a medium sized set of 1314 00:43:36,380 --> 00:43:38,059 cryptographic algorithms for all the 1315 00:43:38,060 --> 00:43:39,060 primitives we need. 1316 00:43:39,890 --> 00:43:41,239 It turns out many of these are 1317 00:43:41,240 --> 00:43:43,579 parametrized by large prime numbers 1318 00:43:43,580 --> 00:43:46,459 that are the modules for arithmetic, 1319 00:43:46,460 --> 00:43:48,259 and there's a medium sized set of 1320 00:43:48,260 --> 00:43:49,999 different hardware architectures that 1321 00:43:50,000 --> 00:43:51,169 we'd like to target. 1322 00:43:51,170 --> 00:43:53,719 It turns out that in practice, 1323 00:43:53,720 --> 00:43:55,909 for every choice of one 1324 00:43:55,910 --> 00:43:58,249 circle from each of these bubbles, 1325 00:43:58,250 --> 00:44:00,409 an expert spends at least a few days 1326 00:44:00,410 --> 00:44:02,449 rewriting everything from scratch, 1327 00:44:02,450 --> 00:44:04,579 possibly in assembly language. 1328 00:44:04,580 --> 00:44:07,429 And this is not a very scalable process. 1329 00:44:07,430 --> 00:44:09,229 And of course, when doing that sort of 1330 00:44:09,230 --> 00:44:11,059 thing, we can believe that bugs can be 1331 00:44:11,060 --> 00:44:12,889 introduced that have serious security 1332 00:44:12,890 --> 00:44:14,179 consequences. 1333 00:44:14,180 --> 00:44:15,949 So what if we could have a compiler that 1334 00:44:15,950 --> 00:44:17,149 automatically did that work? 1335 00:44:17,150 --> 00:44:18,529 When you choose something from each of 1336 00:44:18,530 --> 00:44:20,749 these bubbles and produces not just the 1337 00:44:20,750 --> 00:44:22,819 vast low level code, but also 1338 00:44:22,820 --> 00:44:24,919 a machine checkable proof that it 1339 00:44:24,920 --> 00:44:26,449 truly implements the whiteboard level 1340 00:44:26,450 --> 00:44:27,450 math. 1341 00:44:27,860 --> 00:44:29,029 So that's what we're after here. 1342 00:44:29,030 --> 00:44:31,579 So we're doing crypto verification. 1343 00:44:31,580 --> 00:44:33,229 One kind of property people often prove 1344 00:44:33,230 --> 00:44:34,610 about cryptography is that 1345 00:44:35,650 --> 00:44:37,189 these high level security properties, 1346 00:44:37,190 --> 00:44:38,959 like if you don't know the secret key, 1347 00:44:38,960 --> 00:44:39,919 then they'll take you more than 1348 00:44:39,920 --> 00:44:41,599 polynomial time to compute something 1349 00:44:42,680 --> 00:44:43,999 through protocol verification. 1350 00:44:44,000 --> 00:44:45,229 We can show that a particular 1351 00:44:45,230 --> 00:44:46,849 mathematical algorithm meets such a 1352 00:44:46,850 --> 00:44:49,369 property and through 1353 00:44:49,370 --> 00:44:51,289 an implementation synthesis process, we 1354 00:44:51,290 --> 00:44:53,389 can then generate low level 1355 00:44:53,390 --> 00:44:54,889 efficient code that follows the 1356 00:44:54,890 --> 00:44:56,269 mathematical algorithm. 1357 00:44:56,270 --> 00:44:57,769 Our project is only in that second 1358 00:44:57,770 --> 00:44:59,059 category. There's been a lot of 1359 00:44:59,060 --> 00:45:00,469 interesting work in the first category 1360 00:45:00,470 --> 00:45:02,539 too, and 1361 00:45:02,540 --> 00:45:04,609 our system has been adopted by Chrome 1362 00:45:04,610 --> 00:45:06,679 through the boring SSL library to 1363 00:45:06,680 --> 00:45:08,749 replace the manually written elliptic 1364 00:45:08,750 --> 00:45:11,479 curve finite field arithmetic 1365 00:45:11,480 --> 00:45:13,429 that they had had previously. 1366 00:45:13,430 --> 00:45:15,019 You can find our implementation of curve 1367 00:45:15,020 --> 00:45:16,699 two five five one nine and Chrome version 1368 00:45:16,700 --> 00:45:18,829 sixty four, which is in beta now, and 1369 00:45:18,830 --> 00:45:19,669 the next version of Chrome. 1370 00:45:19,670 --> 00:45:21,229 After that we'll have our implementation 1371 00:45:21,230 --> 00:45:22,769 of P two fifty six together. 1372 00:45:22,770 --> 00:45:24,349 These are the curves that are used in the 1373 00:45:24,350 --> 00:45:26,539 vast majority of TLS connections. 1374 00:45:29,100 --> 00:45:31,169 So really quick demo of 1375 00:45:31,170 --> 00:45:33,389 running this generation 1376 00:45:33,390 --> 00:45:34,390 process. 1377 00:45:36,640 --> 00:45:38,589 What we've done is we've built a library 1378 00:45:38,590 --> 00:45:40,239 of functional programs that capture the 1379 00:45:40,240 --> 00:45:42,530 generality of arithmetic in this domain. 1380 00:45:43,630 --> 00:45:45,789 So it turns out these numbers are way 1381 00:45:45,790 --> 00:45:47,469 too big to fit in hardware registers. 1382 00:45:47,470 --> 00:45:48,909 We have to be clever about splitting one 1383 00:45:48,910 --> 00:45:50,919 number among many registers and then 1384 00:45:50,920 --> 00:45:52,269 essentially implement grade school 1385 00:45:52,270 --> 00:45:54,129 arithmetic ourselves with some catches 1386 00:45:54,130 --> 00:45:56,559 for performance on popular processors. 1387 00:45:56,560 --> 00:45:57,999 And we do that using standard functional 1388 00:45:58,000 --> 00:46:00,099 programing stuff like folds 1389 00:46:00,100 --> 00:46:02,169 and maps and so on. 1390 00:46:02,170 --> 00:46:03,729 For instance, here's a definition of 1391 00:46:03,730 --> 00:46:06,079 multiplication, flat map and a map. 1392 00:46:06,080 --> 00:46:07,599 I won't go into details on exactly our 1393 00:46:07,600 --> 00:46:08,919 representation. This is kind of a 1394 00:46:08,920 --> 00:46:11,139 simplified version for demo purposes, 1395 00:46:12,250 --> 00:46:15,609 but we get to process all this and 1396 00:46:15,610 --> 00:46:16,779 make our way down to the bottom of this 1397 00:46:16,780 --> 00:46:18,909 file where I'm going to essentially 1398 00:46:18,910 --> 00:46:20,259 state the following goal. 1399 00:46:20,260 --> 00:46:22,719 Assume we have two 10 1400 00:46:22,720 --> 00:46:24,549 tuples of integers. 1401 00:46:24,550 --> 00:46:26,649 Turns out that on 32 bit processors, 1402 00:46:26,650 --> 00:46:28,149 four curved two five five one nine, you 1403 00:46:28,150 --> 00:46:30,609 want to represent each number 1404 00:46:30,610 --> 00:46:32,749 with ten different word 1405 00:46:32,750 --> 00:46:34,509 size digits. So that's why these are ten 1406 00:46:34,510 --> 00:46:36,249 tuples. And we say, please find an 1407 00:46:36,250 --> 00:46:38,229 element of the set tuples that are the 1408 00:46:38,230 --> 00:46:40,299 correct answers to the problem of 1409 00:46:40,300 --> 00:46:42,409 multiplying the two inputs mod 1410 00:46:42,410 --> 00:46:44,320 two to the two fifty five minus 19. 1411 00:46:45,670 --> 00:46:48,009 And I run a few 1412 00:46:48,010 --> 00:46:49,479 steps here that I want to explain in 1413 00:46:49,480 --> 00:46:52,539 detail to start symbolic manipulation. 1414 00:46:52,540 --> 00:46:54,039 And then key step is here. 1415 00:46:54,040 --> 00:46:55,119 This is going to run for about ten 1416 00:46:55,120 --> 00:46:57,669 seconds. I think we are somewhere 1417 00:46:57,670 --> 00:47:00,129 asking Corke to do some correct 1418 00:47:00,130 --> 00:47:02,709 by construction derivation of concrete 1419 00:47:02,710 --> 00:47:04,719 code for computing this multiplication 1420 00:47:04,720 --> 00:47:07,149 result and get a bunch of outputs 1421 00:47:07,150 --> 00:47:07,569 like this. 1422 00:47:07,570 --> 00:47:09,579 I think this is the last digit of the 1423 00:47:09,580 --> 00:47:11,049 answer and some of the arithmetic for 1424 00:47:11,050 --> 00:47:11,979 computing it. 1425 00:47:11,980 --> 00:47:13,839 That's a lot of redundant parts to it. 1426 00:47:13,840 --> 00:47:15,909 So let's use the the 1427 00:47:15,910 --> 00:47:18,009 ring algebraic laws to simplify 1428 00:47:18,010 --> 00:47:20,169 this a bit more and this one takes 1429 00:47:20,170 --> 00:47:21,759 a few tenths of seconds to run. 1430 00:47:21,760 --> 00:47:23,859 But what's happening is the system is 1431 00:47:23,860 --> 00:47:25,579 automatically computing the code that 1432 00:47:25,580 --> 00:47:27,339 traditionally has been written by hand 1433 00:47:27,340 --> 00:47:29,919 for this cryptographic 1434 00:47:29,920 --> 00:47:31,989 primitive. And it's 1435 00:47:31,990 --> 00:47:33,549 doing it in a way that doesn't just build 1436 00:47:33,550 --> 00:47:35,619 the code. It also builds the proof that 1437 00:47:35,620 --> 00:47:37,539 it's related back to the original 1438 00:47:37,540 --> 00:47:39,519 whiteboard level modular arithmetic 1439 00:47:39,520 --> 00:47:40,520 definition. 1440 00:47:41,710 --> 00:47:43,909 And this will finish 1441 00:47:43,910 --> 00:47:46,149 any moment now and it 1442 00:47:46,150 --> 00:47:48,249 will no longer have all these 1443 00:47:48,250 --> 00:47:50,019 constants in it. These will be combined 1444 00:47:50,020 --> 00:47:52,209 into a nicer form and the 1445 00:47:52,210 --> 00:47:53,889 output that'll pop up here is not the end 1446 00:47:53,890 --> 00:47:55,449 of the story. We also have later phases 1447 00:47:55,450 --> 00:47:57,789 that lower it into C like code 1448 00:47:57,790 --> 00:47:59,439 do bounds analysis to make sure we've 1449 00:47:59,440 --> 00:48:01,569 assigned enough bits to each register. 1450 00:48:01,570 --> 00:48:03,399 That all happens automatically given the 1451 00:48:03,400 --> 00:48:04,400 other parameters. 1452 00:48:06,550 --> 00:48:08,679 And here 1453 00:48:08,680 --> 00:48:11,199 we have an example of the output. 1454 00:48:11,200 --> 00:48:13,359 So this might look familiar to people 1455 00:48:13,360 --> 00:48:14,679 who have been implementing curve two five 1456 00:48:14,680 --> 00:48:15,579 five one nine. 1457 00:48:15,580 --> 00:48:17,739 It's still in a relatively high 1458 00:48:17,740 --> 00:48:19,959 level form without breaking things 1459 00:48:19,960 --> 00:48:22,089 down into individual temporary variables. 1460 00:48:22,090 --> 00:48:23,679 But we didn't have to manually figure 1461 00:48:23,680 --> 00:48:24,669 out. For instance, there should be a 1462 00:48:24,670 --> 00:48:26,799 thirty eight here and nineteen 1463 00:48:26,800 --> 00:48:28,149 here and all those things. 1464 00:48:28,150 --> 00:48:30,129 It was done for us automatically and you 1465 00:48:30,130 --> 00:48:31,269 can put in different parameters and 1466 00:48:31,270 --> 00:48:33,459 generate the standard code for any 1467 00:48:33,460 --> 00:48:35,139 of the other curves in any of the 1468 00:48:35,140 --> 00:48:37,089 hardware architectures that folks like to 1469 00:48:37,090 --> 00:48:38,090 use them on. 1470 00:48:39,460 --> 00:48:42,219 So to finish this up, we 1471 00:48:42,220 --> 00:48:44,049 are implementation automatically builds 1472 00:48:44,050 --> 00:48:46,329 that code from a way of writing down the 1473 00:48:46,330 --> 00:48:48,399 modules in a suggestive way like that 1474 00:48:48,400 --> 00:48:48,939 here. 1475 00:48:48,940 --> 00:48:50,289 This is typically how these are written 1476 00:48:50,290 --> 00:48:52,539 as sums and differences of powers of two 1477 00:48:52,540 --> 00:48:54,069 or small multiples of them. 1478 00:48:54,070 --> 00:48:55,539 So we wrote a Python script that uses 1479 00:48:55,540 --> 00:48:57,219 that to generate a few lines of 1480 00:48:57,220 --> 00:48:59,409 configuration to our system. 1481 00:48:59,410 --> 00:49:01,899 And then we scraped 1482 00:49:01,900 --> 00:49:04,209 the elliptic curve mailing 1483 00:49:04,210 --> 00:49:06,039 list that modern crypto cryptography to 1484 00:49:06,040 --> 00:49:07,599 find every prime number that appears in 1485 00:49:07,600 --> 00:49:09,789 the archives. So we use them all as 1486 00:49:09,790 --> 00:49:11,949 inputs for automatically generating code. 1487 00:49:11,950 --> 00:49:14,139 There are only a few weird prime numbers 1488 00:49:14,140 --> 00:49:15,699 posted there that are actually very 1489 00:49:15,700 --> 00:49:16,719 inefficient to implement. 1490 00:49:16,720 --> 00:49:18,699 That was probably a newbie to the list 1491 00:49:18,700 --> 00:49:20,799 who was made a suggestion was quickly 1492 00:49:20,800 --> 00:49:22,569 rebuffed. But we generated those two or 1493 00:49:22,570 --> 00:49:24,009 tried to. Sometimes we ran out of memory 1494 00:49:24,010 --> 00:49:26,169 with those and we 1495 00:49:26,170 --> 00:49:27,849 automatically built code for all these 1496 00:49:27,850 --> 00:49:29,859 cases compared against the Guanyu Multi 1497 00:49:29,860 --> 00:49:31,239 Position Library, which is a very 1498 00:49:31,240 --> 00:49:33,339 standard off the shelf library for 1499 00:49:33,340 --> 00:49:34,899 big integer arithmetic. 1500 00:49:34,900 --> 00:49:36,609 Here are results generated for sixty four 1501 00:49:36,610 --> 00:49:38,769 point eighty six year for 32 bit arm 1502 00:49:38,770 --> 00:49:40,329 running on an Android phone. 1503 00:49:40,330 --> 00:49:42,519 And generally the new multi 1504 00:49:42,520 --> 00:49:45,129 position version of things will take 1505 00:49:45,130 --> 00:49:47,289 two to ten times longer to run one 1506 00:49:47,290 --> 00:49:49,059 of the elliptic curve operations than 1507 00:49:49,060 --> 00:49:50,319 with our code, which was created 1508 00:49:50,320 --> 00:49:52,479 completely, automatically and also has 1509 00:49:52,480 --> 00:49:54,969 strong guarantees of correctness. 1510 00:49:54,970 --> 00:49:57,159 So thanks. 1511 00:50:03,610 --> 00:50:05,169 I'll just leave you with a few pointers 1512 00:50:05,170 --> 00:50:06,849 to places to follow up if you'd like to 1513 00:50:06,850 --> 00:50:09,129 learn more about this subject, No. 1514 00:50:09,130 --> 00:50:11,079 One place to start an online book called 1515 00:50:11,080 --> 00:50:12,429 Software or Foundations, written by 1516 00:50:12,430 --> 00:50:14,319 Benjamin Pierce and others. 1517 00:50:14,320 --> 00:50:16,659 It introduces Calk just assuming 1518 00:50:16,660 --> 00:50:18,369 standard undergraduate computer science 1519 00:50:18,370 --> 00:50:20,109 material, no formal methods or even 1520 00:50:20,110 --> 00:50:22,269 functional programing experience. 1521 00:50:22,270 --> 00:50:23,949 I should also advertise our Multi 1522 00:50:23,950 --> 00:50:25,779 University Initiative in the US called 1523 00:50:25,780 --> 00:50:27,639 The Science of Deep Specification or deep 1524 00:50:27,640 --> 00:50:29,649 spec in particular. 1525 00:50:29,650 --> 00:50:31,149 In Summer Twenty eighteen, we'll be 1526 00:50:31,150 --> 00:50:32,769 running a summer school that introduces 1527 00:50:32,770 --> 00:50:34,029 all this stuff with some hands on 1528 00:50:34,030 --> 00:50:35,919 practice, with hardware verification and 1529 00:50:35,920 --> 00:50:37,929 C program verification and a few other 1530 00:50:37,930 --> 00:50:39,399 topics in Cork. 1531 00:50:39,400 --> 00:50:41,049 It'll be held in Princeton, New Jersey, 1532 00:50:41,050 --> 00:50:43,179 and you can subscribe 1533 00:50:43,180 --> 00:50:44,529 to our mailing list on the website if 1534 00:50:44,530 --> 00:50:46,599 you'd like to get updates. As we pin down 1535 00:50:46,600 --> 00:50:48,609 more of the details, here are the GitHub 1536 00:50:48,610 --> 00:50:50,739 projects for my own projects that I 1537 00:50:50,740 --> 00:50:52,959 mentioned here and two of the 1538 00:50:52,960 --> 00:50:55,269 others that we made code 1539 00:50:55,270 --> 00:50:56,439 contributions to. 1540 00:50:56,440 --> 00:50:57,440 Thank you. 1541 00:51:08,930 --> 00:51:11,239 We have about 10 minutes for a Q and a 1542 00:51:11,240 --> 00:51:12,240 microphone want 1543 00:51:13,460 --> 00:51:14,609 Sengstacke? 1544 00:51:14,610 --> 00:51:15,799 Great talk. 1545 00:51:15,800 --> 00:51:17,779 I enjoyed it really much. 1546 00:51:17,780 --> 00:51:18,889 I have a question. 1547 00:51:18,890 --> 00:51:21,289 Watch what strategy 1548 00:51:21,290 --> 00:51:22,460 you have for 1549 00:51:23,660 --> 00:51:26,029 finding the right series, because 1550 00:51:26,030 --> 00:51:28,309 I think this is the hardest part 1551 00:51:28,310 --> 00:51:29,479 here. 1552 00:51:29,480 --> 00:51:32,059 And you said you make 1553 00:51:32,060 --> 00:51:34,339 reviews for this, and 1554 00:51:34,340 --> 00:51:36,769 I think maybe that's not enough. 1555 00:51:36,770 --> 00:51:38,299 For example, when you have a reverse 1556 00:51:38,300 --> 00:51:40,459 function and the sorting function, 1557 00:51:40,460 --> 00:51:42,709 you can have the theorem that 1558 00:51:42,710 --> 00:51:44,869 for the invariant, that the length is 1559 00:51:44,870 --> 00:51:47,539 always the same when you execute it. 1560 00:51:47,540 --> 00:51:49,729 But maybe that's not enough 1561 00:51:49,730 --> 00:51:52,219 for, um, 1562 00:51:52,220 --> 00:51:53,479 for the series. 1563 00:51:53,480 --> 00:51:54,979 So what are your strategies? 1564 00:51:54,980 --> 00:51:57,319 Because maybe I think a combination 1565 00:51:57,320 --> 00:51:59,509 to have some concrete example 1566 00:51:59,510 --> 00:52:01,669 test that test the functions 1567 00:52:01,670 --> 00:52:04,009 also. Good. So theorems maybe 1568 00:52:04,010 --> 00:52:06,289 not enough what this 1569 00:52:06,290 --> 00:52:07,469 thinks. Yeah. 1570 00:52:07,470 --> 00:52:09,079 So this relates to a number of the 1571 00:52:09,080 --> 00:52:10,819 frequently asked questions, answers that 1572 00:52:10,820 --> 00:52:13,819 I had in the middle of the talk. 1573 00:52:13,820 --> 00:52:15,949 Generally we build 1574 00:52:15,950 --> 00:52:18,019 infrastructure to support applications 1575 00:52:18,020 --> 00:52:20,119 and applications tend to have more 1576 00:52:20,120 --> 00:52:22,249 straightforward specifications in 1577 00:52:22,250 --> 00:52:23,569 many cases than the systems 1578 00:52:23,570 --> 00:52:24,979 infrastructure below. 1579 00:52:24,980 --> 00:52:25,939 We try to set things up. 1580 00:52:25,940 --> 00:52:27,709 So if we got the infrastructure theorem 1581 00:52:27,710 --> 00:52:29,779 wrong, we will not be able to use it to 1582 00:52:29,780 --> 00:52:31,909 prove the application result. 1583 00:52:31,910 --> 00:52:32,910 So in general, 1584 00:52:34,160 --> 00:52:36,289 one technique is when 1585 00:52:36,290 --> 00:52:38,119 you build a verified component, use it in 1586 00:52:38,120 --> 00:52:39,919 as many different contexts as possible 1587 00:52:39,920 --> 00:52:41,149 and don't peek back inside the 1588 00:52:41,150 --> 00:52:42,829 implementation. Only look at the original 1589 00:52:42,830 --> 00:52:43,729 theorem you chose. 1590 00:52:43,730 --> 00:52:45,709 And by forcing yourself to prove all the 1591 00:52:45,710 --> 00:52:47,389 larger systems that that used your 1592 00:52:47,390 --> 00:52:49,489 component, you will get out the bugs 1593 00:52:49,490 --> 00:52:52,009 in the specification of the components 1594 00:52:52,010 --> 00:52:54,259 that you're reusing it. 1595 00:52:54,260 --> 00:52:56,089 And you mentioned it might be useful to 1596 00:52:56,090 --> 00:52:58,069 have a set of test cases as part of a 1597 00:52:58,070 --> 00:53:00,529 specification for a component 1598 00:53:00,530 --> 00:53:03,139 that that could be a 1599 00:53:03,140 --> 00:53:04,279 good strategy also. 1600 00:53:04,280 --> 00:53:06,379 But it's kind of covered by 1601 00:53:06,380 --> 00:53:08,029 reusing your component in many different 1602 00:53:08,030 --> 00:53:09,469 settings where you're naturally, say, 1603 00:53:09,470 --> 00:53:10,879 calling your function with with different 1604 00:53:10,880 --> 00:53:12,949 arguments. And you need that function to 1605 00:53:12,950 --> 00:53:14,539 be correct to get the larger system to be 1606 00:53:14,540 --> 00:53:15,540 correct. 1607 00:53:17,090 --> 00:53:18,669 I go from three. 1608 00:53:18,670 --> 00:53:20,179 Right, thank you very much for the 1609 00:53:20,180 --> 00:53:21,180 excellent talk. 1610 00:53:22,080 --> 00:53:24,199 Um, the 1611 00:53:24,200 --> 00:53:26,299 specifications of many, 1612 00:53:26,300 --> 00:53:28,609 many problems also involve 1613 00:53:28,610 --> 00:53:30,109 time. 1614 00:53:30,110 --> 00:53:32,209 So you have not only to 1615 00:53:32,210 --> 00:53:34,129 have the correct answer, you have to have 1616 00:53:34,130 --> 00:53:36,079 the correct answer before the Tsipi 1617 00:53:36,080 --> 00:53:37,049 session times. 1618 00:53:37,050 --> 00:53:39,199 But the user gets board, the machine 1619 00:53:39,200 --> 00:53:40,459 gets rebooted. 1620 00:53:40,460 --> 00:53:42,649 Your deadline is over. 1621 00:53:42,650 --> 00:53:45,349 How do you write down 1622 00:53:45,350 --> 00:53:47,509 time specifications 1623 00:53:47,510 --> 00:53:49,819 in clock? Like this thing will 1624 00:53:49,820 --> 00:53:51,949 finish after at most this 1625 00:53:51,950 --> 00:53:52,999 many milliseconds. 1626 00:53:53,000 --> 00:53:54,859 Yeah, I think we'll probably hold off and 1627 00:53:54,860 --> 00:53:56,329 trying to prove that users don't get 1628 00:53:56,330 --> 00:53:58,579 bored for now, but maybe someday, 1629 00:53:58,580 --> 00:54:00,649 but for the time out property, 1630 00:54:00,650 --> 00:54:02,779 I, I think so. 1631 00:54:02,780 --> 00:54:03,799 We're working in a very general 1632 00:54:03,800 --> 00:54:05,149 mathematical framework that can 1633 00:54:05,150 --> 00:54:06,709 essentially encode anything that you'd 1634 00:54:06,710 --> 00:54:07,789 find in a math textbook. 1635 00:54:07,790 --> 00:54:10,309 And it's it's pretty easy to define 1636 00:54:10,310 --> 00:54:12,559 what time is from 1637 00:54:12,560 --> 00:54:15,169 the program execution perspective. 1638 00:54:15,170 --> 00:54:16,759 And there have been projects that, for 1639 00:54:16,760 --> 00:54:18,979 instance, prove that a C 1640 00:54:18,980 --> 00:54:21,319 compiler preserves running time between 1641 00:54:21,320 --> 00:54:23,659 the source code and the assembly 1642 00:54:23,660 --> 00:54:26,029 code with a pretty probably a naive model 1643 00:54:26,030 --> 00:54:27,289 at the bottom. 1644 00:54:27,290 --> 00:54:29,779 And I 1645 00:54:29,780 --> 00:54:32,479 don't have any projects of my own in that 1646 00:54:32,480 --> 00:54:34,069 vein so far. But a number of people have 1647 00:54:34,070 --> 00:54:36,079 looked at that. So I think my main answer 1648 00:54:36,080 --> 00:54:37,459 is very flexible framework. 1649 00:54:37,460 --> 00:54:38,749 You can encode whatever you want, you can 1650 00:54:38,750 --> 00:54:39,859 prove whatever you want. It's just a 1651 00:54:39,860 --> 00:54:41,869 question of how much labor is going to be 1652 00:54:41,870 --> 00:54:42,529 required. 1653 00:54:42,530 --> 00:54:43,909 And we'll learn more about better 1654 00:54:43,910 --> 00:54:45,499 abstractions that will reduce that labor 1655 00:54:45,500 --> 00:54:46,639 over time. 1656 00:54:46,640 --> 00:54:47,779 Thank you. 1657 00:54:47,780 --> 00:54:48,989 We have a question from the signal 1658 00:54:48,990 --> 00:54:51,259 engine. Yeah, it's maybe somewhat related 1659 00:54:51,260 --> 00:54:53,359 to the last one so I can prove 1660 00:54:53,360 --> 00:54:55,249 self help. Again, such analytics, for 1661 00:54:55,250 --> 00:54:56,749 example, by proving a certain function 1662 00:54:56,750 --> 00:54:58,369 runs in constant time because the main 1663 00:54:58,370 --> 00:55:00,169 point of manuscript implementation is to 1664 00:55:00,170 --> 00:55:02,239 avoid such analytics 1665 00:55:02,240 --> 00:55:03,949 and do the automatically created 1666 00:55:03,950 --> 00:55:05,449 implementations that you presented to 1667 00:55:05,450 --> 00:55:07,009 address this issue. 1668 00:55:07,010 --> 00:55:08,719 So the automatically created 1669 00:55:08,720 --> 00:55:10,429 implementations I presented are actually 1670 00:55:10,430 --> 00:55:12,379 in a very restrictive language that 1671 00:55:12,380 --> 00:55:14,599 enforces Konstantine by construction. 1672 00:55:14,600 --> 00:55:15,919 It's all straight line code. 1673 00:55:15,920 --> 00:55:17,749 Think of it as the performance critical 1674 00:55:17,750 --> 00:55:20,179 inner loops of the cryptographic library. 1675 00:55:20,180 --> 00:55:22,849 And so yes, that one does 1676 00:55:22,850 --> 00:55:25,039 by construction provide 1677 00:55:25,040 --> 00:55:27,049 constant time. Other formal methods 1678 00:55:27,050 --> 00:55:29,089 projects have done automatic program 1679 00:55:29,090 --> 00:55:31,219 analysis to guarantee Konstantine 1680 00:55:31,220 --> 00:55:32,149 properties. 1681 00:55:32,150 --> 00:55:34,399 I think this is an important kind 1682 00:55:34,400 --> 00:55:35,539 of analysis to do. 1683 00:55:35,540 --> 00:55:37,879 It's among the easier ones for 1684 00:55:37,880 --> 00:55:39,619 for formal methods, which is great given 1685 00:55:39,620 --> 00:55:41,299 the big payoff in security. 1686 00:55:41,300 --> 00:55:43,519 So I'm optimistic about scaling 1687 00:55:43,520 --> 00:55:45,590 up to whatever settings it's needed in 1688 00:55:47,520 --> 00:55:48,520 microphone for. 1689 00:55:49,400 --> 00:55:52,069 Yeah, so about scaling up. 1690 00:55:52,070 --> 00:55:54,919 To what extent is the user 1691 00:55:54,920 --> 00:55:56,119 to what extent do you look at human 1692 00:55:56,120 --> 00:55:57,289 factors? 1693 00:55:57,290 --> 00:55:59,149 Because in many cases that's how they 1694 00:55:59,150 --> 00:56:00,889 break. Well, it depends. 1695 00:56:00,890 --> 00:56:02,149 If you're talking about the humans who 1696 00:56:02,150 --> 00:56:04,489 are writing the proofs, then that's us. 1697 00:56:04,490 --> 00:56:06,559 We're very concerned about us and we make 1698 00:56:06,560 --> 00:56:08,299 an effort to help ourselves by improving 1699 00:56:08,300 --> 00:56:09,259 tools. 1700 00:56:09,260 --> 00:56:11,419 Um, do you mean 1701 00:56:11,420 --> 00:56:13,279 the users of the software or hardware 1702 00:56:13,280 --> 00:56:14,899 that we're creating and some way that 1703 00:56:14,900 --> 00:56:16,309 that might be worse off because of this 1704 00:56:16,310 --> 00:56:17,689 methodology? 1705 00:56:17,690 --> 00:56:19,819 Yeah. So to what extent so 1706 00:56:19,820 --> 00:56:21,409 when you're talking about like full 1707 00:56:21,410 --> 00:56:23,839 system specifications, it's 1708 00:56:23,840 --> 00:56:25,759 really a question of what do you include 1709 00:56:25,760 --> 00:56:28,039 in your system, as I see. 1710 00:56:28,040 --> 00:56:29,569 Oh, you're asking, do we formally model 1711 00:56:29,570 --> 00:56:31,220 the user in the theorems? 1712 00:56:32,810 --> 00:56:34,999 So we'll do things like the 1713 00:56:35,000 --> 00:56:37,219 top level theorem of the hardware 1714 00:56:37,220 --> 00:56:38,899 models, that it has an input channel and 1715 00:56:38,900 --> 00:56:39,799 output channel. 1716 00:56:39,800 --> 00:56:41,629 We have no idea what's on the other side, 1717 00:56:41,630 --> 00:56:43,189 but our specification says if you get 1718 00:56:43,190 --> 00:56:45,049 this output, sorry, get this input, you 1719 00:56:45,050 --> 00:56:47,149 need to give that output, that sort of 1720 00:56:47,150 --> 00:56:49,189 thing. I don't know if anyone who's been 1721 00:56:49,190 --> 00:56:50,989 doing anything like cognitively modeling 1722 00:56:50,990 --> 00:56:52,639 users and what you should expect from 1723 00:56:52,640 --> 00:56:54,889 them in this kind of setting, that 1724 00:56:54,890 --> 00:56:57,019 that could be worthwhile, though 1725 00:56:57,020 --> 00:56:58,759 I expect there'd be a lot of disagreement 1726 00:56:58,760 --> 00:57:00,049 about exactly how to write the theorem 1727 00:57:00,050 --> 00:57:01,050 down. 1728 00:57:01,970 --> 00:57:03,769 Thanks, Michael, for one. 1729 00:57:05,030 --> 00:57:06,619 So I'd just like to start by saying that 1730 00:57:06,620 --> 00:57:07,879 I already agree with everything you said 1731 00:57:07,880 --> 00:57:09,109 here, so you don't need to accept your 1732 00:57:09,110 --> 00:57:10,099 soul mate on any of it. 1733 00:57:10,100 --> 00:57:12,239 But I basically believe 1734 00:57:12,240 --> 00:57:13,939 that the average industrial programmers, 1735 00:57:13,940 --> 00:57:15,230 the Dunning, Kruger Posterchild 1736 00:57:17,120 --> 00:57:19,369 and so anyway, Tony Hawk said 1737 00:57:19,370 --> 00:57:21,139 that no references were a billion dollar 1738 00:57:21,140 --> 00:57:23,359 mistake. And he has the following quote, 1739 00:57:23,360 --> 00:57:25,549 which is, Ten years ago, researchers 1740 00:57:25,550 --> 00:57:27,049 into formal methods predicted that the 1741 00:57:27,050 --> 00:57:28,489 programing world would embrace with 1742 00:57:28,490 --> 00:57:30,049 gratitude every assistance promised by 1743 00:57:30,050 --> 00:57:31,549 Formalization to solve the problems of 1744 00:57:31,550 --> 00:57:33,679 reliability that arise when programs get 1745 00:57:33,680 --> 00:57:36,859 larger and more safety critical, then 1746 00:57:36,860 --> 00:57:39,019 more stuff and concludes in, which 1747 00:57:39,020 --> 00:57:41,239 is like the most dismal, sad 1748 00:57:41,240 --> 00:57:42,469 thing I've ever heard. 1749 00:57:42,470 --> 00:57:44,029 It has turned out that the world just 1750 00:57:44,030 --> 00:57:45,589 does not suffer significantly from the 1751 00:57:45,590 --> 00:57:47,119 kind of problem that our research was 1752 00:57:47,120 --> 00:57:48,309 originally intended to solve. 1753 00:57:49,670 --> 00:57:51,769 OK, Enlight, why do 1754 00:57:51,770 --> 00:57:53,809 you believe that programmers will accept 1755 00:57:53,810 --> 00:57:55,969 this now when they have for thirty 1756 00:57:55,970 --> 00:57:58,279 years developed so many confused 1757 00:57:58,280 --> 00:58:01,279 epicycles on the ideas of research that 1758 00:58:01,280 --> 00:58:02,280 do work? 1759 00:58:02,930 --> 00:58:04,639 In my experience, at least for some 1760 00:58:04,640 --> 00:58:06,529 isolated communities of developers like 1761 00:58:06,530 --> 00:58:08,689 the authors of the main crypto libraries, 1762 00:58:08,690 --> 00:58:11,329 they do worry about correctness and 1763 00:58:11,330 --> 00:58:13,719 they're very interested in exploring 1764 00:58:13,720 --> 00:58:15,539 these kinds of techniques. 1765 00:58:15,540 --> 00:58:17,729 So that's one answer the let's say the 1766 00:58:17,730 --> 00:58:19,979 very high assurance domain where people 1767 00:58:19,980 --> 00:58:21,899 are especially interested, especially 1768 00:58:21,900 --> 00:58:23,999 willing to accept certain costs for 1769 00:58:24,000 --> 00:58:25,389 correctness. 1770 00:58:25,390 --> 00:58:27,149 The other answer I'd give is that I think 1771 00:58:27,150 --> 00:58:29,129 we're progressively improving the 1772 00:58:29,130 --> 00:58:30,989 usability and 1773 00:58:32,790 --> 00:58:35,009 creating many component 1774 00:58:35,010 --> 00:58:36,809 libraries that can do most of the work 1775 00:58:36,810 --> 00:58:38,159 for you for each project so you don't 1776 00:58:38,160 --> 00:58:39,419 have to start from scratch. 1777 00:58:39,420 --> 00:58:41,699 And I think we're heading towards passing 1778 00:58:41,700 --> 00:58:44,369 some critical point where the ecosystem 1779 00:58:44,370 --> 00:58:47,069 is big enough that there's a 1780 00:58:47,070 --> 00:58:48,389 fundamental change in the cost of 1781 00:58:48,390 --> 00:58:49,499 applying these tools. 1782 00:58:49,500 --> 00:58:51,209 And if we get there, then I think that'll 1783 00:58:51,210 --> 00:58:52,589 make the difference for the kind of 1784 00:58:52,590 --> 00:58:55,439 barrier that Paul was talking about. 1785 00:58:55,440 --> 00:58:56,339 I agree with that. 1786 00:58:56,340 --> 00:58:58,409 But I also think, like my example 1787 00:58:58,410 --> 00:59:00,119 was no references. 1788 00:59:00,120 --> 00:59:02,019 And all you need in order to eliminate 1789 00:59:02,020 --> 00:59:03,239 noise from a language are 1790 00:59:04,350 --> 00:59:07,019 parametric polymorphism and some types. 1791 00:59:07,020 --> 00:59:09,059 And people still say, oh, you know, Neal, 1792 00:59:09,060 --> 00:59:10,169 it's not a problem. 1793 00:59:10,170 --> 00:59:11,399 It's all good. 1794 00:59:11,400 --> 00:59:12,749 I don't make those kind of mistakes. 1795 00:59:14,430 --> 00:59:16,979 Well, I guess we can discuss this later. 1796 00:59:16,980 --> 00:59:19,529 There's a question from the Internet, um, 1797 00:59:19,530 --> 00:59:21,179 how does the halting problem limit what 1798 00:59:21,180 --> 00:59:23,099 you can improve? And is that a problem in 1799 00:59:23,100 --> 00:59:25,559 real life? Applications of 1800 00:59:25,560 --> 00:59:27,359 the holding problem applies to arbitrary 1801 00:59:27,360 --> 00:59:28,889 pieces of code that get thrown down in 1802 00:59:28,890 --> 00:59:30,149 front of you and you get asked questions 1803 00:59:30,150 --> 00:59:32,309 about them. Luckily, the pieces of code 1804 00:59:32,310 --> 00:59:34,259 we care about were written by people who 1805 00:59:34,260 --> 00:59:36,359 had understandings of them in mind. 1806 00:59:36,360 --> 00:59:38,609 And I personally believe 1807 00:59:38,610 --> 00:59:40,289 that the author of the code should be 1808 00:59:40,290 --> 00:59:42,449 responsible for distributing proof with 1809 00:59:42,450 --> 00:59:44,669 it, and it is then decided 1810 00:59:44,670 --> 00:59:46,379 to check if the proof is convincing, 1811 00:59:46,380 --> 00:59:48,089 which is the real problem that that 1812 00:59:48,090 --> 00:59:49,090 matters for this domain. 1813 00:59:50,160 --> 00:59:51,160 Microphone three, 1814 00:59:53,340 --> 00:59:55,829 the baby examples you get for us 1815 00:59:55,830 --> 00:59:57,899 with proofs. The proofs 1816 00:59:57,900 --> 01:00:01,109 to me looked still very verbose 1817 01:00:01,110 --> 01:00:03,389 because this exam was 1818 01:00:03,390 --> 01:00:05,489 very simple. So it's the 1819 01:00:05,490 --> 01:00:08,129 more optimization possible and 1820 01:00:08,130 --> 01:00:09,269 people are doing that. 1821 01:00:10,920 --> 01:00:13,229 There are a wide variety of approaches 1822 01:00:13,230 --> 01:00:14,669 to proof automation. 1823 01:00:14,670 --> 01:00:16,289 I think of those proofs is pretty short. 1824 01:00:16,290 --> 01:00:17,759 They're definitely significantly shorter 1825 01:00:17,760 --> 01:00:19,289 than the programs that they're applying 1826 01:00:19,290 --> 01:00:20,429 to. It was maybe 1827 01:00:21,600 --> 01:00:24,059 one fifth the size of the program, which 1828 01:00:24,060 --> 01:00:26,129 if you look at these kind 1829 01:00:26,130 --> 01:00:28,319 of proofs 20 years ago, that's a big 1830 01:00:28,320 --> 01:00:29,639 improvement already. And we can 1831 01:00:29,640 --> 01:00:31,709 extrapolate from that and hope for even 1832 01:00:31,710 --> 01:00:32,969 shorter proofs in the future. 1833 01:00:34,530 --> 01:00:35,849 Last question. Microphone three, 1834 01:00:37,320 --> 01:00:39,089 you said that in the semantics of Blue 1835 01:00:39,090 --> 01:00:41,219 Spack methods, 1836 01:00:41,220 --> 01:00:43,319 executed atomically. 1837 01:00:43,320 --> 01:00:44,999 So what actually happens if there's a 1838 01:00:45,000 --> 01:00:47,099 failure during 1839 01:00:47,100 --> 01:00:48,539 a method? 1840 01:00:48,540 --> 01:00:50,889 Well, so this is in hardware land 1841 01:00:50,890 --> 01:00:52,499 where we're within a method. 1842 01:00:52,500 --> 01:00:54,689 Everything's running just as propagation 1843 01:00:54,690 --> 01:00:56,729 of signals over wires. 1844 01:00:56,730 --> 01:00:58,889 So you have to plan ahead and 1845 01:00:58,890 --> 01:01:00,959 detect the failures with guard conditions 1846 01:01:00,960 --> 01:01:02,489 at the beginning. And then you just don't 1847 01:01:03,690 --> 01:01:05,429 look at what's flowing down that part of 1848 01:01:05,430 --> 01:01:06,430 the circuit 1849 01:01:07,560 --> 01:01:09,009 right now. 1850 01:01:09,010 --> 01:01:11,609 That's it. And sorry if it 1851 01:01:11,610 --> 01:01:12,599 could be in 15 minutes. 1852 01:01:12,600 --> 01:01:13,600 Thanks. Thanks.