0
00:00:00,000 --> 00:00:30,000
Dear viewer, these subtitles were generated
by a machine via the service Trint
and therefore are (very) buggy.
If you are capable, please help us to
create good quality subtitles:
https://c3subtitles.de/talk/914 Thanks!

1
00:00:14,960 --> 00:00:17,179
So the next speaker of today Will

2
00:00:17,180 --> 00:00:18,949
Tulka will tell us more about the

3
00:00:18,950 --> 00:00:21,229
glorious future, where mushroom

4
00:00:21,230 --> 00:00:23,359
check the proof will be used in day.

5
00:00:23,360 --> 00:00:25,579
After all, the way

6
00:00:25,580 --> 00:00:27,799
down to our development, we have

7
00:00:27,800 --> 00:00:29,089
Adam Chapala.

8
00:00:29,090 --> 00:00:31,579
He's a faculty at MIT and

9
00:00:31,580 --> 00:00:33,499
he builds a computer system and has fun

10
00:00:33,500 --> 00:00:35,119
proving them using COK.

11
00:00:35,120 --> 00:00:36,470
Not written as you think it is.

12
00:00:38,780 --> 00:00:39,780
Thank you.

13
00:00:47,420 --> 00:00:50,419
So from the earliest days of programing,

14
00:00:50,420 --> 00:00:52,159
people started to realize that the actual

15
00:00:52,160 --> 00:00:53,989
act of writing the code was going to be a

16
00:00:53,990 --> 00:00:55,609
relatively small fraction of the time

17
00:00:55,610 --> 00:00:57,589
that they spent building useful things.

18
00:00:57,590 --> 00:00:58,819
And here's a quote from one of the

19
00:00:58,820 --> 00:01:00,559
pioneers of computer science telling us

20
00:01:00,560 --> 00:01:02,269
about how at some point he realized a

21
00:01:02,270 --> 00:01:03,529
good part of his life was going to be

22
00:01:03,530 --> 00:01:05,629
spent finding errors in his own code.

23
00:01:05,630 --> 00:01:07,729
So we know that it would make a

24
00:01:07,730 --> 00:01:09,469
huge difference for developers that get a

25
00:01:09,470 --> 00:01:11,179
lot more cool stuff built if they could

26
00:01:11,180 --> 00:01:13,429
reduce the time that they spent

27
00:01:13,430 --> 00:01:15,559
trying to get their code to be

28
00:01:15,560 --> 00:01:16,969
correct, trying to understand what their

29
00:01:16,970 --> 00:01:19,279
code does. All these activities besides

30
00:01:19,280 --> 00:01:21,499
just writing it. And I want to tell

31
00:01:21,500 --> 00:01:22,909
you about some opportunities with

32
00:01:22,910 --> 00:01:24,229
technologies that are getting mature

33
00:01:24,230 --> 00:01:26,749
enough to potentially make a big impact

34
00:01:26,750 --> 00:01:29,179
in saving a lot of people's time building

35
00:01:29,180 --> 00:01:30,739
correct systems.

36
00:01:30,740 --> 00:01:32,659
And I think these are just ready enough

37
00:01:32,660 --> 00:01:33,739
that if you want to be on the bleeding

38
00:01:33,740 --> 00:01:35,299
edge, you could start looking into these

39
00:01:35,300 --> 00:01:36,229
systems today.

40
00:01:36,230 --> 00:01:38,599
So first, let me summarize

41
00:01:38,600 --> 00:01:40,699
a perspective on what programmers

42
00:01:40,700 --> 00:01:43,129
are doing today to gain

43
00:01:43,130 --> 00:01:45,289
confidence that their code is correct.

44
00:01:45,290 --> 00:01:47,659
So there's debugging the classic way to

45
00:01:47,660 --> 00:01:49,729
find a flaw in a program where

46
00:01:49,730 --> 00:01:52,009
you sort of do some detective work,

47
00:01:52,010 --> 00:01:54,059
make successive runs of a program, change

48
00:01:54,060 --> 00:01:55,879
a small thing between runs, see what

49
00:01:55,880 --> 00:01:58,849
happens differently on each execution.

50
00:01:58,850 --> 00:02:00,799
I'll graph these different approaches and

51
00:02:00,800 --> 00:02:03,709
a very wishy washy two axis way.

52
00:02:03,710 --> 00:02:06,079
Riggo I'm going to use to refer to

53
00:02:06,080 --> 00:02:08,119
the when you use some sort of method to

54
00:02:08,120 --> 00:02:10,159
understand your program, how closely tied

55
00:02:10,160 --> 00:02:12,079
it is to the actual code that runs.

56
00:02:12,080 --> 00:02:13,549
So, you know, it's accurate.

57
00:02:13,550 --> 00:02:15,649
And then completeness is covering how

58
00:02:15,650 --> 00:02:17,869
many different execution scenarios you

59
00:02:17,870 --> 00:02:19,159
get some guarantees about.

60
00:02:19,160 --> 00:02:21,019
So with with debugging, we only get

61
00:02:21,020 --> 00:02:23,009
guarantees about the scenarios we run.

62
00:02:23,010 --> 00:02:24,259
So that's low completeness.

63
00:02:24,260 --> 00:02:26,149
But I'm calling it high rigor because

64
00:02:26,150 --> 00:02:29,239
it's literally running your code.

65
00:02:29,240 --> 00:02:31,099
You can also when you're writing your

66
00:02:31,100 --> 00:02:32,569
code in the first place at a whiteboard

67
00:02:32,570 --> 00:02:34,069
or just in your head, think through

68
00:02:34,070 --> 00:02:36,019
carefully all the things that can happen.

69
00:02:36,020 --> 00:02:37,639
Consider all sorts of what if scenarios.

70
00:02:37,640 --> 00:02:38,779
Convince yourself, don't worry, my

71
00:02:38,780 --> 00:02:40,129
program is ready for those or I'll add

72
00:02:40,130 --> 00:02:41,329
some new code to be ready for that.

73
00:02:41,330 --> 00:02:42,589
And there can be all sorts of branching

74
00:02:42,590 --> 00:02:43,909
and we can think very carefully.

75
00:02:43,910 --> 00:02:46,189
And if we're very principled, then we get

76
00:02:46,190 --> 00:02:47,959
high levels of completeness but low rigor

77
00:02:47,960 --> 00:02:49,489
because this is just an abstract

78
00:02:49,490 --> 00:02:51,289
whiteboard level kind of exercise.

79
00:02:53,440 --> 00:02:55,359
Testing has a very similar profile to

80
00:02:55,360 --> 00:02:57,519
debugging, we create a bunch of scenarios

81
00:02:57,520 --> 00:02:59,469
that are saved in a list of test cases

82
00:02:59,470 --> 00:03:01,269
that we can run over and over again in an

83
00:03:01,270 --> 00:03:02,499
automatic way.

84
00:03:02,500 --> 00:03:04,629
And an alternative is to put lots of

85
00:03:04,630 --> 00:03:06,789
comments in your program that capture

86
00:03:06,790 --> 00:03:08,079
exactly your expectations.

87
00:03:08,080 --> 00:03:09,489
What should be true whenever you reach

88
00:03:09,490 --> 00:03:11,499
that comment and then read the code and

89
00:03:11,500 --> 00:03:13,599
think through the interactions

90
00:03:13,600 --> 00:03:15,729
between all these expectations so we

91
00:03:15,730 --> 00:03:17,229
could we could call this the comments

92
00:03:17,230 --> 00:03:18,639
approach, which might have a high level

93
00:03:18,640 --> 00:03:19,929
of completeness. There could be comments

94
00:03:19,930 --> 00:03:21,339
everywhere saying a lot of things about a

95
00:03:21,340 --> 00:03:23,229
lot of code, but it's not so rigorous

96
00:03:23,230 --> 00:03:25,179
because there's there's no typically no

97
00:03:25,180 --> 00:03:26,949
tool that connects the comments to the

98
00:03:26,950 --> 00:03:28,539
actual running behavior of the code.

99
00:03:30,560 --> 00:03:32,749
Another approach is auditing, and there's

100
00:03:32,750 --> 00:03:34,669
a whole spectrum of different levels of

101
00:03:34,670 --> 00:03:36,619
seriousness we can use in auditing, we

102
00:03:36,620 --> 00:03:38,419
can do relatively cheap auditing where we

103
00:03:38,420 --> 00:03:40,069
maybe read a lot of code but don't think

104
00:03:40,070 --> 00:03:41,299
too hard about it.

105
00:03:41,300 --> 00:03:43,429
Or we can have some more

106
00:03:43,430 --> 00:03:45,889
in-depth auditing where we probably

107
00:03:45,890 --> 00:03:47,479
need to read a smaller amount of code,

108
00:03:47,480 --> 00:03:50,359
but we can be much more

109
00:03:50,360 --> 00:03:52,399
thorough in checking the properties of

110
00:03:52,400 --> 00:03:53,869
that code.

111
00:03:53,870 --> 00:03:56,029
So the techniques I want

112
00:03:56,030 --> 00:03:58,129
to tell you about our way to sort of get

113
00:03:58,130 --> 00:04:00,259
the best of both worlds to get high

114
00:04:00,260 --> 00:04:02,629
ratings on both of these axes

115
00:04:02,630 --> 00:04:04,699
in contrast to the traditional

116
00:04:04,700 --> 00:04:06,079
methods.

117
00:04:06,080 --> 00:04:08,719
So I'm a professor at university.

118
00:04:08,720 --> 00:04:10,819
You're probably used to people from that

119
00:04:10,820 --> 00:04:12,859
sort of place shooting out all these

120
00:04:12,860 --> 00:04:14,779
little Greek letters and fancy

121
00:04:14,780 --> 00:04:16,398
mathematical symbols and saying this is

122
00:04:16,399 --> 00:04:17,689
formal methods, this is great.

123
00:04:17,690 --> 00:04:18,799
We're going to prove things about

124
00:04:18,800 --> 00:04:20,599
programs. I think this is an even more

125
00:04:20,600 --> 00:04:22,669
popular research area here in Germany

126
00:04:22,670 --> 00:04:24,439
than in the US where I'm from.

127
00:04:24,440 --> 00:04:26,029
It's been around for a long time, still

128
00:04:26,030 --> 00:04:28,279
not too widely used in

129
00:04:28,280 --> 00:04:31,399
industry and the open source world, but

130
00:04:31,400 --> 00:04:32,329
maybe there's hope.

131
00:04:32,330 --> 00:04:34,369
I think most people in those worlds are

132
00:04:34,370 --> 00:04:35,569
used to this kind of scenario where

133
00:04:35,570 --> 00:04:37,159
people doing phone methods, research are

134
00:04:37,160 --> 00:04:39,019
saying it's really the right thing to do

135
00:04:39,020 --> 00:04:40,159
to prove your program.

136
00:04:40,160 --> 00:04:41,659
Don't you want to do the right thing?

137
00:04:41,660 --> 00:04:44,059
And then the developers just sort of nod

138
00:04:44,060 --> 00:04:46,129
and smile politely and then they need to

139
00:04:46,130 --> 00:04:47,659
get back to work because they need to

140
00:04:47,660 --> 00:04:49,219
implement a few new features for the

141
00:04:49,220 --> 00:04:51,049
version of the system that ships next

142
00:04:51,050 --> 00:04:51,979
week.

143
00:04:51,980 --> 00:04:53,479
I want to try to argue that formal

144
00:04:53,480 --> 00:04:55,549
methods, if not already there today,

145
00:04:55,550 --> 00:04:56,839
are very close to the point where they

146
00:04:56,840 --> 00:04:58,069
will actually save you time.

147
00:04:58,070 --> 00:05:00,019
They will reduce the cost of developing

148
00:05:00,020 --> 00:05:02,149
systems, especially those where

149
00:05:02,150 --> 00:05:04,699
bugs are especially costly.

150
00:05:04,700 --> 00:05:06,829
So the the pitch I'm trying to make is

151
00:05:06,830 --> 00:05:08,689
that in the history of computing, we've

152
00:05:08,690 --> 00:05:10,849
gone through layers upon

153
00:05:10,850 --> 00:05:12,469
layers of building new, higher level

154
00:05:12,470 --> 00:05:14,629
abstractions that lower the cost of

155
00:05:14,630 --> 00:05:16,639
creating particular functionality.

156
00:05:16,640 --> 00:05:18,739
And we've built

157
00:05:18,740 --> 00:05:20,539
each of these levels upon earlier ones.

158
00:05:20,540 --> 00:05:22,039
They provide a new possibilities for

159
00:05:22,040 --> 00:05:24,259
modularity and abstraction and formal

160
00:05:24,260 --> 00:05:26,419
methods is actually an enabler

161
00:05:26,420 --> 00:05:28,519
for some of these

162
00:05:28,520 --> 00:05:30,949
next level of possibilities for dividing

163
00:05:30,950 --> 00:05:32,989
big programs into small pieces that we

164
00:05:32,990 --> 00:05:34,370
can understand effectively.

165
00:05:35,750 --> 00:05:37,849
So what I'm claiming

166
00:05:37,850 --> 00:05:39,949
should be feasible by, let's say,

167
00:05:39,950 --> 00:05:41,809
10 years from now with the right kind of

168
00:05:41,810 --> 00:05:43,609
community forming around it and doing the

169
00:05:43,610 --> 00:05:45,919
work is complete.

170
00:05:45,920 --> 00:05:47,749
Deployed computer systems with top to

171
00:05:47,750 --> 00:05:50,179
bottom correctness proofs, meaning

172
00:05:50,180 --> 00:05:53,359
from hardware up to applications

173
00:05:53,360 --> 00:05:54,739
where these troops will be checked

174
00:05:54,740 --> 00:05:56,419
algorithmically.

175
00:05:56,420 --> 00:05:58,519
So we don't have to worry about

176
00:05:58,520 --> 00:06:00,739
low reliability people reading long

177
00:06:00,740 --> 00:06:02,149
mathematical documents.

178
00:06:02,150 --> 00:06:04,219
And we will get to the point

179
00:06:04,220 --> 00:06:06,109
where none of the code that you run needs

180
00:06:06,110 --> 00:06:08,449
to be trusted to guarantee the properties

181
00:06:08,450 --> 00:06:09,619
you care about, whether they're in

182
00:06:09,620 --> 00:06:11,959
security or other kinds of correctness.

183
00:06:11,960 --> 00:06:13,579
And as if that weren't good enough,

184
00:06:14,630 --> 00:06:15,829
by the way, that this will apply to

185
00:06:15,830 --> 00:06:18,439
hardware designs, not just software code,

186
00:06:18,440 --> 00:06:20,539
and this

187
00:06:20,540 --> 00:06:22,219
kind of approach should be able to lower

188
00:06:22,220 --> 00:06:23,779
development costs in the same way that a

189
00:06:23,780 --> 00:06:26,239
good systematic testing framework can

190
00:06:26,240 --> 00:06:27,979
and to a large extent should be able to

191
00:06:27,980 --> 00:06:30,709
replace very time intensive activities

192
00:06:30,710 --> 00:06:33,559
like debugging, testing and code review.

193
00:06:33,560 --> 00:06:35,959
So I better go into a little more detail

194
00:06:35,960 --> 00:06:37,699
on what kind of techniques I'm talking

195
00:06:37,700 --> 00:06:39,359
about here and why we should believe the

196
00:06:39,360 --> 00:06:40,549
stuff is ready.

197
00:06:40,550 --> 00:06:43,249
But before I get into the

198
00:06:43,250 --> 00:06:44,539
the next level of detail, I'll just

199
00:06:44,540 --> 00:06:46,609
summarize how this kind of

200
00:06:46,610 --> 00:06:47,929
approach can substitute for the

201
00:06:47,930 --> 00:06:49,789
traditional development activities.

202
00:06:49,790 --> 00:06:51,049
Instead of debugging where we're

203
00:06:51,050 --> 00:06:52,909
exploring concrete executions of a

204
00:06:52,910 --> 00:06:54,949
system, we can do proving where we're

205
00:06:54,950 --> 00:06:56,989
exploring symbolic arguments that cover

206
00:06:56,990 --> 00:06:59,239
all possible executions

207
00:06:59,240 --> 00:07:00,859
instead of testing, which applies to

208
00:07:00,860 --> 00:07:02,389
concrete scenarios, we can do

209
00:07:02,390 --> 00:07:04,669
specification writing where we describe

210
00:07:04,670 --> 00:07:07,009
general requirements that can cover

211
00:07:07,010 --> 00:07:08,399
all scenarios.

212
00:07:08,400 --> 00:07:10,789
Instead of auditing our code in detail,

213
00:07:10,790 --> 00:07:12,379
we can audit the specifications in

214
00:07:12,380 --> 00:07:14,389
detail. Specifications should be much

215
00:07:14,390 --> 00:07:16,399
shorter and typically don't include

216
00:07:16,400 --> 00:07:18,439
optimizations, which can be some of the

217
00:07:18,440 --> 00:07:20,149
hardest parts of code to understand.

218
00:07:21,430 --> 00:07:23,589
So a summary of what I'm

219
00:07:23,590 --> 00:07:25,179
going to cover in the rest of this talk.

220
00:07:25,180 --> 00:07:27,099
I'm going to introduce the key technology

221
00:07:27,100 --> 00:07:28,419
that I'm selling here, which is called

222
00:07:28,420 --> 00:07:29,829
proof assistance.

223
00:07:29,830 --> 00:07:31,959
I'll cover some common objections

224
00:07:31,960 --> 00:07:34,089
and the standard answers to those

225
00:07:34,090 --> 00:07:36,249
questions about why do

226
00:07:36,250 --> 00:07:38,319
we believe this can actually

227
00:07:38,320 --> 00:07:40,509
be applied to real systems, why

228
00:07:40,510 --> 00:07:42,519
it can be applied cost effectively.

229
00:07:42,520 --> 00:07:44,049
I'll explain a few different ways that

230
00:07:44,050 --> 00:07:45,369
this technology can be used in the

231
00:07:45,370 --> 00:07:46,929
development of computer systems.

232
00:07:46,930 --> 00:07:48,549
I'll talk about two case studies, one in

233
00:07:48,550 --> 00:07:50,109
hardware and one in cryptographic

234
00:07:50,110 --> 00:07:51,849
software, and I'll finish up with some

235
00:07:51,850 --> 00:07:54,999
pointers to further reading for

236
00:07:55,000 --> 00:07:56,619
continuing the journey along learning

237
00:07:56,620 --> 00:07:57,620
this subject.

238
00:07:58,600 --> 00:08:00,879
So one of the first questions people

239
00:08:00,880 --> 00:08:03,249
have about formal methods is essentially

240
00:08:03,250 --> 00:08:05,559
based on an argument embodied

241
00:08:05,560 --> 00:08:07,329
in a paper from nineteen seventy nine by

242
00:08:07,330 --> 00:08:09,519
DeMello, Lipton and Perlis called

243
00:08:09,520 --> 00:08:11,259
Social Processes and Proofs of Theorems

244
00:08:11,260 --> 00:08:12,369
and Programs.

245
00:08:12,370 --> 00:08:14,139
They had a lot to say in that paper, but

246
00:08:14,140 --> 00:08:16,629
I would kind of boil it down to

247
00:08:16,630 --> 00:08:19,389
they say real systems are so complex

248
00:08:19,390 --> 00:08:21,549
and they change so frequently

249
00:08:21,550 --> 00:08:23,949
that it is beyond the powers

250
00:08:23,950 --> 00:08:26,109
of human concentration to read

251
00:08:26,110 --> 00:08:28,779
careful proofs about those systems.

252
00:08:28,780 --> 00:08:30,459
And it's not just one proof its version

253
00:08:30,460 --> 00:08:31,749
after version of the system, you'll need

254
00:08:31,750 --> 00:08:33,969
a new proof to cover the correctness

255
00:08:33,970 --> 00:08:35,918
of the new system.

256
00:08:35,919 --> 00:08:38,079
And I would say today

257
00:08:38,080 --> 00:08:40,178
we we learned that these authors are both

258
00:08:40,179 --> 00:08:41,949
right and wrong about this point.

259
00:08:41,950 --> 00:08:44,048
They're probably right that people can't

260
00:08:44,049 --> 00:08:46,209
be trusted to read proofs about real

261
00:08:46,210 --> 00:08:48,699
systems, but they were implicitly

262
00:08:48,700 --> 00:08:50,289
assuming you would be people who read the

263
00:08:50,290 --> 00:08:51,909
proofs today.

264
00:08:51,910 --> 00:08:52,899
We have a better way.

265
00:08:52,900 --> 00:08:55,869
Let's get algorithms to read the proofs.

266
00:08:55,870 --> 00:08:57,939
So here's how this kind

267
00:08:57,940 --> 00:09:00,339
of proving we can scale

268
00:09:00,340 --> 00:09:01,749
up to real systems.

269
00:09:01,750 --> 00:09:03,909
So we have here's a proof engineer trying

270
00:09:03,910 --> 00:09:04,899
to prove a theorem.

271
00:09:04,900 --> 00:09:06,009
He wrote a proof about it.

272
00:09:07,030 --> 00:09:08,529
And don't worry, he didn't have to write

273
00:09:08,530 --> 00:09:10,119
it all himself. There's an ecosystem of

274
00:09:10,120 --> 00:09:12,669
libraries, of existing proof facts,

275
00:09:12,670 --> 00:09:14,859
as well as proof techniques

276
00:09:14,860 --> 00:09:15,909
that can be reused.

277
00:09:16,960 --> 00:09:18,969
And the proof is actually aski source

278
00:09:18,970 --> 00:09:21,129
code. It's checked into version control

279
00:09:21,130 --> 00:09:23,589
like on GitHub, and that allows

280
00:09:23,590 --> 00:09:26,259
a number of developers to collaborate

281
00:09:26,260 --> 00:09:27,549
writing different theorems within a

282
00:09:27,550 --> 00:09:29,739
larger development, get them to

283
00:09:29,740 --> 00:09:31,149
all wind up in the same place and have

284
00:09:31,150 --> 00:09:32,559
their versions tracked.

285
00:09:32,560 --> 00:09:34,359
We can plug our version control into

286
00:09:34,360 --> 00:09:35,709
continuous integration, which is

287
00:09:35,710 --> 00:09:37,239
effectively running over and over again

288
00:09:37,240 --> 00:09:39,159
checking. Is this proof still convincing?

289
00:09:39,160 --> 00:09:40,689
Is it still convincing every time there's

290
00:09:40,690 --> 00:09:41,739
a change?

291
00:09:41,740 --> 00:09:42,939
And how do we know it's convincing?

292
00:09:42,940 --> 00:09:45,249
We have an automatic proof checker

293
00:09:45,250 --> 00:09:47,319
that reads the source code of our

294
00:09:47,320 --> 00:09:49,539
proofs and applies some

295
00:09:49,540 --> 00:09:51,639
straightforward rules to validate

296
00:09:51,640 --> 00:09:53,259
that they really prove the facts that

297
00:09:53,260 --> 00:09:55,149
they're claiming to prove.

298
00:09:55,150 --> 00:09:57,249
By the way, one risk here is

299
00:09:57,250 --> 00:09:58,869
that some of these programmers or

300
00:09:58,870 --> 00:10:01,209
developers prove the wrong theorems.

301
00:10:01,210 --> 00:10:03,939
So we do some code review where

302
00:10:03,940 --> 00:10:06,309
each proof engineer might read

303
00:10:06,310 --> 00:10:07,809
the theorem statements written by the

304
00:10:07,810 --> 00:10:09,579
others and make sure that we're really

305
00:10:09,580 --> 00:10:12,579
proving what we set out to prove.

306
00:10:12,580 --> 00:10:14,709
So one, probably some of you are thinking

307
00:10:14,710 --> 00:10:16,899
this picture looks just like the way that

308
00:10:16,900 --> 00:10:18,579
software development works today in the

309
00:10:18,580 --> 00:10:19,659
real world.

310
00:10:19,660 --> 00:10:20,709
And that is very true.

311
00:10:20,710 --> 00:10:23,289
It's also true that we also have

312
00:10:23,290 --> 00:10:25,509
all these tools already for doing this

313
00:10:25,510 --> 00:10:27,879
kind of development for systems with

314
00:10:27,880 --> 00:10:29,259
correctness proofs.

315
00:10:29,260 --> 00:10:31,509
And I will try to demonstrate that

316
00:10:31,510 --> 00:10:33,489
in the rest of this talk. So this future

317
00:10:33,490 --> 00:10:35,619
is more or less here today.

318
00:10:35,620 --> 00:10:37,599
There are a lot more rough edges on the

319
00:10:37,600 --> 00:10:39,369
proof versions of some of these tools

320
00:10:39,370 --> 00:10:40,659
than in the traditional programing

321
00:10:40,660 --> 00:10:41,979
versions. And there's a lot of

322
00:10:41,980 --> 00:10:43,539
opportunity to get involved with

323
00:10:43,540 --> 00:10:44,540
improving the tools.

324
00:10:45,490 --> 00:10:47,469
So the key system we're going to use here

325
00:10:47,470 --> 00:10:49,119
is called a proof assistant.

326
00:10:49,120 --> 00:10:51,369
It's a software package that is basically

327
00:10:51,370 --> 00:10:53,109
an idea for stating improving

328
00:10:53,110 --> 00:10:55,209
mathematical theorems where you have to

329
00:10:55,210 --> 00:10:57,189
do manual work to write the proofs of the

330
00:10:57,190 --> 00:10:59,679
theorems as ASCII source code.

331
00:10:59,680 --> 00:11:02,559
But then they are checked automatically,

332
00:11:02,560 --> 00:11:04,449
just like we're used to with, say, type

333
00:11:04,450 --> 00:11:05,559
Checker's in programing.

334
00:11:07,330 --> 00:11:08,739
There are a number of proof assistants

335
00:11:08,740 --> 00:11:10,809
that have been developed for the

336
00:11:10,810 --> 00:11:12,369
last few decades and that are pretty

337
00:11:12,370 --> 00:11:13,389
usable today.

338
00:11:13,390 --> 00:11:15,019
One of them is called Isabelle Hall.

339
00:11:15,020 --> 00:11:16,839
A large part of that development was done

340
00:11:16,840 --> 00:11:19,209
here in Germany at TI Munich, and

341
00:11:19,210 --> 00:11:20,859
probably its best known application is

342
00:11:20,860 --> 00:11:22,959
the L for Verified Project, which has

343
00:11:22,960 --> 00:11:24,549
done a correctness proof of an operating

344
00:11:24,550 --> 00:11:25,719
system.

345
00:11:25,720 --> 00:11:27,189
Another proof of system is called Calk

346
00:11:27,190 --> 00:11:29,649
Developed at India and France.

347
00:11:29,650 --> 00:11:31,599
One of its best known applications is

348
00:11:31,600 --> 00:11:33,799
CERT, which is a

349
00:11:33,800 --> 00:11:36,399
C compiler with a proof of correctness

350
00:11:36,400 --> 00:11:38,259
bridging the gap from the C-code down to

351
00:11:38,260 --> 00:11:40,630
the assembly code that is produced.

352
00:11:42,040 --> 00:11:43,899
I'm going to focus on the court system in

353
00:11:43,900 --> 00:11:45,639
this presentation because that's what I

354
00:11:45,640 --> 00:11:46,720
use in most of my work.

355
00:11:48,580 --> 00:11:50,679
So how does this process

356
00:11:50,680 --> 00:11:52,119
work where we have a sort of

357
00:11:52,120 --> 00:11:54,129
collaboration between the human and the

358
00:11:54,130 --> 00:11:56,289
machine to come to a

359
00:11:56,290 --> 00:11:57,309
convincing proof?

360
00:11:57,310 --> 00:11:59,559
Well, the human user is repeatedly

361
00:11:59,560 --> 00:12:01,839
making suggestions to the automatic

362
00:12:01,840 --> 00:12:03,999
engine, basically saying, I think

363
00:12:04,000 --> 00:12:05,709
the following strategy ought to work for

364
00:12:05,710 --> 00:12:06,809
the next step of this proof.

365
00:12:06,810 --> 00:12:09,159
And the engine spins for a while

366
00:12:09,160 --> 00:12:10,659
and then says, OK, if we do that, then

367
00:12:10,660 --> 00:12:11,739
here's what's left to prove.

368
00:12:11,740 --> 00:12:13,299
Please tell me what to do here.

369
00:12:13,300 --> 00:12:16,209
And the library of steps or tactics

370
00:12:16,210 --> 00:12:18,159
is not fixed. We can plug in new

371
00:12:18,160 --> 00:12:20,469
libraries to the tactic engine

372
00:12:20,470 --> 00:12:21,789
introducing higher and higher level

373
00:12:21,790 --> 00:12:24,729
abstractions of what constitutes a valid

374
00:12:24,730 --> 00:12:25,730
piece of an argument

375
00:12:26,800 --> 00:12:29,319
in the end, after we finish this cycle

376
00:12:29,320 --> 00:12:31,869
and there is nothing left to prove, then

377
00:12:31,870 --> 00:12:33,399
the tactic engine outputs what's called a

378
00:12:33,400 --> 00:12:36,369
proof term, which is a

379
00:12:36,370 --> 00:12:38,289
essentially a program in a small language

380
00:12:38,290 --> 00:12:40,179
with just a few reasoning steps built

381
00:12:40,180 --> 00:12:42,249
into it. And in fact, everything I've

382
00:12:42,250 --> 00:12:44,709
shown you up to this point is untrusted

383
00:12:44,710 --> 00:12:46,689
in the sense of trusted computing bases

384
00:12:46,690 --> 00:12:48,159
and security.

385
00:12:48,160 --> 00:12:49,749
The only thing we need to trust, the only

386
00:12:49,750 --> 00:12:51,459
place where a bug in the infrastructure

387
00:12:51,460 --> 00:12:52,989
could lead us to accept an incorrect

388
00:12:52,990 --> 00:12:55,389
proof is a separate, relatively

389
00:12:55,390 --> 00:12:57,399
small proof checker that only looks at

390
00:12:57,400 --> 00:12:59,019
the proof term, ignores the way it was

391
00:12:59,020 --> 00:13:00,909
produced and checked whether it actually

392
00:13:00,910 --> 00:13:02,589
establishes the theorem that we were

393
00:13:02,590 --> 00:13:03,590
interested in.

394
00:13:04,950 --> 00:13:06,419
So what are those primitive deduction

395
00:13:06,420 --> 00:13:08,489
steps that the the small proof

396
00:13:08,490 --> 00:13:09,899
checker is implementing?

397
00:13:09,900 --> 00:13:12,259
I'll just give you one classic example,

398
00:13:12,260 --> 00:13:13,709
a kind of step which is more or less

399
00:13:13,710 --> 00:13:15,899
built into systems like this is the mode

400
00:13:15,900 --> 00:13:18,149
Poland's rule. It says if you know

401
00:13:18,150 --> 00:13:20,309
P implies Q and you know

402
00:13:20,310 --> 00:13:22,289
P, then you are allowed to deduce.

403
00:13:22,290 --> 00:13:24,359
Q So imagine

404
00:13:24,360 --> 00:13:26,639
a set of roughly 10 rules

405
00:13:26,640 --> 00:13:28,829
like this that we write out

406
00:13:28,830 --> 00:13:31,109
in ASCII syntax for the core

407
00:13:31,110 --> 00:13:32,819
proof terms of the system.

408
00:13:32,820 --> 00:13:35,039
And this is actually

409
00:13:35,040 --> 00:13:36,479
a language inspired by functional

410
00:13:36,480 --> 00:13:37,409
programing.

411
00:13:37,410 --> 00:13:38,759
It's a lot like a camel.

412
00:13:38,760 --> 00:13:40,799
In fact, Camel was invented to use to

413
00:13:40,800 --> 00:13:43,559
implement COK in the 80s.

414
00:13:43,560 --> 00:13:45,359
And so we have lines like this that are

415
00:13:45,360 --> 00:13:46,739
essentially changing the other proof

416
00:13:46,740 --> 00:13:48,599
steps and we have little applications

417
00:13:48,600 --> 00:13:50,039
like that. Monas Poland's at the end

418
00:13:50,040 --> 00:13:52,139
saying, and here's the deduction rule

419
00:13:52,140 --> 00:13:54,689
we use here. And believe it or not,

420
00:13:54,690 --> 00:13:56,249
these chains of steps are enough to

421
00:13:56,250 --> 00:13:58,439
reproduce all of the results from math

422
00:13:58,440 --> 00:13:59,729
and computer science and all sorts of

423
00:13:59,730 --> 00:14:02,129
other logic based fields.

424
00:14:02,130 --> 00:14:04,139
And it's easy to write a relatively

425
00:14:04,140 --> 00:14:05,909
small, trustworthy checker for

426
00:14:05,910 --> 00:14:07,739
applications of these rules that will

427
00:14:07,740 --> 00:14:09,299
apply to all those different domains.

428
00:14:10,890 --> 00:14:13,079
So let me make this a bit more concrete

429
00:14:13,080 --> 00:14:15,269
and show you an example of

430
00:14:15,270 --> 00:14:16,649
a proof or two in Cock's.

431
00:14:16,650 --> 00:14:18,819
So I'm going to be working with this code

432
00:14:18,820 --> 00:14:21,179
in in Emacs, in emacs

433
00:14:21,180 --> 00:14:23,309
mode for cok authoring,

434
00:14:23,310 --> 00:14:24,269
of course.

435
00:14:24,270 --> 00:14:25,889
So Corke is based on a functional

436
00:14:25,890 --> 00:14:27,269
programing language. It's a lot like

437
00:14:27,270 --> 00:14:28,259
Haskell or Okemos.

438
00:14:28,260 --> 00:14:30,029
For those of you who know those, we can

439
00:14:30,030 --> 00:14:31,589
do things like. Right. This expression,

440
00:14:31,590 --> 00:14:33,509
which is a list of natural numbers, five

441
00:14:33,510 --> 00:14:34,859
thirteen eighty nine.

442
00:14:34,860 --> 00:14:37,079
I run a check command and I press

443
00:14:37,080 --> 00:14:38,459
some keys. And then I got this

444
00:14:38,460 --> 00:14:39,749
highlighting in this region, which

445
00:14:39,750 --> 00:14:41,099
indicates this is now where the

446
00:14:41,100 --> 00:14:42,539
processing of the proof Persistent has

447
00:14:42,540 --> 00:14:43,949
moved up to.

448
00:14:43,950 --> 00:14:45,929
And it tells me this is a list of natural

449
00:14:45,930 --> 00:14:48,389
numbers. I can make a list booleans

450
00:14:48,390 --> 00:14:50,039
and I can use this alternative notation

451
00:14:50,040 --> 00:14:51,749
for list with an index operator, colon

452
00:14:51,750 --> 00:14:53,819
colon for adding a new

453
00:14:53,820 --> 00:14:54,959
element to the front of a list.

454
00:14:54,960 --> 00:14:56,459
So this is the list consisting of five

455
00:14:56,460 --> 00:14:58,499
thirteen eighty nine with an empty list

456
00:14:58,500 --> 00:15:00,509
here. It's another way of writing this.

457
00:15:02,220 --> 00:15:04,889
And I've also written a recursive

458
00:15:04,890 --> 00:15:06,839
function here called length that for any

459
00:15:06,840 --> 00:15:08,699
type A takes and a list of type and

460
00:15:08,700 --> 00:15:09,989
returns a natural number.

461
00:15:09,990 --> 00:15:12,239
And we do pattern matching and say an

462
00:15:12,240 --> 00:15:14,369
empty list, match the number zero and

463
00:15:14,370 --> 00:15:16,199
a non-empty list match to one plus the

464
00:15:16,200 --> 00:15:17,069
length of the table.

465
00:15:17,070 --> 00:15:18,989
The list where I'm signing the name ls

466
00:15:18,990 --> 00:15:19,949
prime to the tail.

467
00:15:19,950 --> 00:15:21,329
I don't have time to go over all the

468
00:15:21,330 --> 00:15:23,039
classic ideas from functional programing

469
00:15:23,040 --> 00:15:24,899
like in Haskell and Emelle.

470
00:15:24,900 --> 00:15:27,209
Hopefully this will make

471
00:15:27,210 --> 00:15:28,649
just enough sense for people who aren't

472
00:15:28,650 --> 00:15:30,569
used to seeing those, but it's very

473
00:15:30,570 --> 00:15:32,579
similar notation to in those languages.

474
00:15:32,580 --> 00:15:33,479
So let's prove a theorem.

475
00:15:33,480 --> 00:15:34,789
Well, first I'll check what's the length?

476
00:15:34,790 --> 00:15:36,689
This list, it's three good.

477
00:15:36,690 --> 00:15:38,519
And I also define another classic

478
00:15:38,520 --> 00:15:39,809
function that takes two lists and

479
00:15:39,810 --> 00:15:41,879
essentially concatenate them makes a

480
00:15:41,880 --> 00:15:43,259
new list. That's first the elements of

481
00:15:43,260 --> 00:15:44,789
the first list, then the elements of the

482
00:15:44,790 --> 00:15:46,649
second list. You just write this as plus

483
00:15:46,650 --> 00:15:49,259
four lists in Python, for instance,

484
00:15:49,260 --> 00:15:51,149
here I'm calling it append and append.

485
00:15:51,150 --> 00:15:53,159
One, two with three, four, five and get

486
00:15:53,160 --> 00:15:54,990
one, two, three, four, five down here.

487
00:15:56,670 --> 00:15:58,349
So you can already see some of these

488
00:15:58,350 --> 00:15:59,959
elements that led me to call this an

489
00:15:59,960 --> 00:16:02,039
idea. We're getting constant

490
00:16:02,040 --> 00:16:04,439
feedback as we go, much like the

491
00:16:04,440 --> 00:16:07,019
ominous red squiggly is in eclipse

492
00:16:07,020 --> 00:16:09,149
when you make a mistake in your code.

493
00:16:09,150 --> 00:16:10,619
We'll have those coming up shortly.

494
00:16:10,620 --> 00:16:12,029
When we try to do some proofs and we get

495
00:16:12,030 --> 00:16:14,099
this feedback on the

496
00:16:14,100 --> 00:16:15,359
computational behavior of what we're

497
00:16:15,360 --> 00:16:17,429
defining. So here's a theorem

498
00:16:17,430 --> 00:16:19,529
for any type and for two lists of that

499
00:16:19,530 --> 00:16:21,629
type. If I append them and

500
00:16:21,630 --> 00:16:23,459
then take the length, that's the same as

501
00:16:23,460 --> 00:16:25,109
taking the sum of their individual

502
00:16:25,110 --> 00:16:26,110
lengths.

503
00:16:26,860 --> 00:16:29,259
So there's this red

504
00:16:29,260 --> 00:16:31,269
backgrounded admitted telling us this is

505
00:16:31,270 --> 00:16:33,429
a fact we didn't prove yet, so let's

506
00:16:33,430 --> 00:16:34,869
prove it so we don't have to look at that

507
00:16:34,870 --> 00:16:35,769
red anymore.

508
00:16:35,770 --> 00:16:37,959
And so I'm going to say my first step is

509
00:16:37,960 --> 00:16:39,759
I'm going to prove this by induction on

510
00:16:39,760 --> 00:16:41,319
the variable one.

511
00:16:41,320 --> 00:16:42,669
And then what this is going to do is

512
00:16:42,670 --> 00:16:44,169
create two new subgoals.

513
00:16:44,170 --> 00:16:45,099
Here's our current goal.

514
00:16:45,100 --> 00:16:47,259
Just repeat it from up here when I run

515
00:16:47,260 --> 00:16:48,260
this.

516
00:16:48,990 --> 00:16:50,019
Let's give us more space.

517
00:16:50,020 --> 00:16:52,569
There there are now two subgoals,

518
00:16:52,570 --> 00:16:55,059
a base case and an inductive

519
00:16:55,060 --> 00:16:57,189
case in the base

520
00:16:57,190 --> 00:16:59,259
case. Plus one was replaced with an

521
00:16:59,260 --> 00:17:01,299
empty list and an inductive case, plus

522
00:17:01,300 --> 00:17:03,879
one was replaced with an arbitrary

523
00:17:03,880 --> 00:17:05,949
non-empty list with some head element

524
00:17:05,950 --> 00:17:07,999
and some tail elements.

525
00:17:08,000 --> 00:17:10,088
So let's start proving

526
00:17:10,089 --> 00:17:11,679
the base case.

527
00:17:11,680 --> 00:17:14,318
One thing I'll do I noticed the

528
00:17:14,319 --> 00:17:16,209
goal here starts with a four all so I'll

529
00:17:16,210 --> 00:17:18,309
run a step called intro that's going to

530
00:17:18,310 --> 00:17:19,989
take out the overall and instead give us

531
00:17:19,990 --> 00:17:22,729
a local variable called Elst to

532
00:17:22,730 --> 00:17:23,979
here it is.

533
00:17:23,980 --> 00:17:26,108
And then I notice, oh, this

534
00:17:26,109 --> 00:17:27,729
looks pretty obvious just from the

535
00:17:27,730 --> 00:17:28,839
definition of a pen.

536
00:17:28,840 --> 00:17:30,729
So let's ask Corke to do some algebraic

537
00:17:30,730 --> 00:17:33,249
simplification with that definition,

538
00:17:33,250 --> 00:17:35,139
essentially running parts of the program

539
00:17:35,140 --> 00:17:36,759
at compile time, if you will.

540
00:17:36,760 --> 00:17:38,459
And then that was looks pretty clear.

541
00:17:38,460 --> 00:17:39,919
The same thing is on both sides of the

542
00:17:39,920 --> 00:17:42,459
equality. So I can say reflexivity

543
00:17:42,460 --> 00:17:44,419
and that case is proved.

544
00:17:44,420 --> 00:17:46,629
Let's go into the inductive case.

545
00:17:46,630 --> 00:17:48,789
I'll start at the same way I

546
00:17:48,790 --> 00:17:51,009
get rid of that for all down below here

547
00:17:51,010 --> 00:17:53,949
and simplify.

548
00:17:53,950 --> 00:17:56,109
Now I notice there's this term here,

549
00:17:56,110 --> 00:17:58,239
length of append, blah, blah, blah,

550
00:17:58,240 --> 00:18:00,609
which is actually also present up here,

551
00:18:00,610 --> 00:18:02,529
what appears here. So in general, above

552
00:18:02,530 --> 00:18:04,569
the double line are our hypotheses are

553
00:18:04,570 --> 00:18:06,189
known facts that we're allowed to appeal

554
00:18:06,190 --> 00:18:07,299
to. And here we have an induction

555
00:18:07,300 --> 00:18:09,339
hypothesis which has the theorem

556
00:18:09,340 --> 00:18:11,079
restated, essentially, but applying

557
00:18:11,080 --> 00:18:12,729
specifically to the tail of the current

558
00:18:12,730 --> 00:18:15,039
list. So I can say, hey, let's take that

559
00:18:15,040 --> 00:18:17,319
n'goni quality, use it to rewrite,

560
00:18:17,320 --> 00:18:20,169
will match this with this

561
00:18:20,170 --> 00:18:22,059
and therefore we'll be able to replace it

562
00:18:22,060 --> 00:18:23,169
with this.

563
00:18:23,170 --> 00:18:25,779
So we went like that and now reflexivity

564
00:18:25,780 --> 00:18:26,780
applies.

565
00:18:27,390 --> 00:18:29,499
Very good. And the most exciting

566
00:18:29,500 --> 00:18:32,469
moment of any proof in copyright QAD

567
00:18:32,470 --> 00:18:34,299
process that and if the system doesn't

568
00:18:34,300 --> 00:18:35,979
complain, this is the equivalent of

569
00:18:35,980 --> 00:18:37,719
taking your code into eclipse and not

570
00:18:37,720 --> 00:18:39,429
getting any red squiggles under it.

571
00:18:39,430 --> 00:18:40,929
This is a correct proof.

572
00:18:40,930 --> 00:18:42,579
Cork is convinced of the theorem is true.

573
00:18:46,300 --> 00:18:47,300
Thank you.

574
00:18:51,060 --> 00:18:53,579
One more example to show

575
00:18:53,580 --> 00:18:55,439
how we reuse Limas improves.

576
00:18:55,440 --> 00:18:57,569
This is a reverse function for

577
00:18:57,570 --> 00:18:59,549
lists that takes in a list and then steps

578
00:18:59,550 --> 00:19:01,559
down and building a new list using the

579
00:19:01,560 --> 00:19:04,229
append function to repeatedly add

580
00:19:04,230 --> 00:19:06,059
elements that were on the front onto the

581
00:19:06,060 --> 00:19:07,299
end.

582
00:19:07,300 --> 00:19:09,149
So reverse one, two, three into three to

583
00:19:09,150 --> 00:19:10,139
one.

584
00:19:10,140 --> 00:19:12,239
And so then we can quickly prove length.

585
00:19:12,240 --> 00:19:13,240
Rev.

586
00:19:14,900 --> 00:19:16,639
I'll do this by induction on the list I'm

587
00:19:16,640 --> 00:19:18,079
reversing, I want to show the length does

588
00:19:18,080 --> 00:19:19,459
not change when you reverse the list

589
00:19:20,540 --> 00:19:22,259
to the base case is pretty simple here.

590
00:19:22,260 --> 00:19:24,409
I use algebraic simplification and

591
00:19:24,410 --> 00:19:25,549
then that's obviously true.

592
00:19:28,200 --> 00:19:30,359
And see what happens

593
00:19:30,360 --> 00:19:31,360
in this case.

594
00:19:32,580 --> 00:19:34,679
Well, here I see I have

595
00:19:34,680 --> 00:19:36,449
a length of an append that doesn't

596
00:19:36,450 --> 00:19:38,399
directly match my induction hypothesis,

597
00:19:38,400 --> 00:19:40,619
but luckily I approved

598
00:19:40,620 --> 00:19:42,689
something called length append or a

599
00:19:42,690 --> 00:19:43,619
pen length. I don't remember.

600
00:19:43,620 --> 00:19:46,019
We'll find out length append, which

601
00:19:46,020 --> 00:19:48,659
replaced Pend

602
00:19:48,660 --> 00:19:51,029
length of append with some of

603
00:19:51,030 --> 00:19:52,739
the lengths I would play that again.

604
00:19:52,740 --> 00:19:54,029
We have the length of the Rev plus the

605
00:19:54,030 --> 00:19:55,949
length of the Singleton list.

606
00:19:55,950 --> 00:19:59,069
Then I can simplify this

607
00:19:59,070 --> 00:20:00,899
and now we're very close so I can rewrite

608
00:20:00,900 --> 00:20:02,099
with IHSS.

609
00:20:02,100 --> 00:20:04,199
By the way, this capitalist's is the plus

610
00:20:04,200 --> 00:20:06,149
one function. It stands for successor.

611
00:20:06,150 --> 00:20:09,149
So this is pretty obviously true,

612
00:20:09,150 --> 00:20:11,399
though capital s is not defined as plus

613
00:20:11,400 --> 00:20:13,259
one is a little different, but I can call

614
00:20:13,260 --> 00:20:14,669
a procedure called Omega, which is a

615
00:20:14,670 --> 00:20:16,049
solver for what's called linear

616
00:20:16,050 --> 00:20:17,849
arithmetic, which knows how to prove all

617
00:20:17,850 --> 00:20:19,919
sorts of numerical facts like

618
00:20:19,920 --> 00:20:20,669
that.

619
00:20:20,670 --> 00:20:22,859
So that was an example primarily of using

620
00:20:22,860 --> 00:20:24,389
a fact we already proved within a larger

621
00:20:24,390 --> 00:20:26,459
proof. And this allows us

622
00:20:26,460 --> 00:20:28,799
to build up to proofs that cover pretty

623
00:20:28,800 --> 00:20:30,659
significant systems and their correctness

624
00:20:30,660 --> 00:20:31,660
properties.

625
00:20:33,910 --> 00:20:35,439
One last example, and I will not go

626
00:20:35,440 --> 00:20:36,609
through as much detail, I just want to

627
00:20:36,610 --> 00:20:38,949
show that even for programs

628
00:20:38,950 --> 00:20:40,749
I can write in one page of code, it's not

629
00:20:40,750 --> 00:20:43,029
very straightforward, naively

630
00:20:43,030 --> 00:20:45,039
to establish their correctness or define

631
00:20:45,040 --> 00:20:46,659
the language of abstract syntax, trees

632
00:20:46,660 --> 00:20:48,609
for expressions with a plus.

633
00:20:48,610 --> 00:20:50,709
Sometimes, as well as constants and

634
00:20:50,710 --> 00:20:52,809
variables, I

635
00:20:52,810 --> 00:20:54,249
have defined a recursive function to

636
00:20:54,250 --> 00:20:56,409
evaluate those expressions to

637
00:20:56,410 --> 00:20:58,719
numbers given values for all variables.

638
00:20:58,720 --> 00:21:01,419
And I've defined

639
00:21:01,420 --> 00:21:03,879
medium complexity optimizer that finds

640
00:21:03,880 --> 00:21:05,139
parts of the arithmetic that we can

641
00:21:05,140 --> 00:21:06,969
actually simplify at compile time.

642
00:21:06,970 --> 00:21:09,339
This is a classic optimization called

643
00:21:09,340 --> 00:21:10,340
constant folding,

644
00:21:11,800 --> 00:21:13,149
but I won't dwell on the code there.

645
00:21:13,150 --> 00:21:15,219
I just want to show here I started trying

646
00:21:15,220 --> 00:21:17,559
to prove the correctness, namely when you

647
00:21:17,560 --> 00:21:19,479
constant fold and then interpret.

648
00:21:19,480 --> 00:21:20,709
It's the same answer as if you

649
00:21:20,710 --> 00:21:22,269
interpreted the original program in the

650
00:21:22,270 --> 00:21:24,219
same variable environment excess.

651
00:21:24,220 --> 00:21:27,169
And then I started approved by induction.

652
00:21:27,170 --> 00:21:28,989
The first case was easy, second case was

653
00:21:28,990 --> 00:21:30,549
easy. But then I started needing to do

654
00:21:30,550 --> 00:21:33,249
case analysis on which constructor

655
00:21:33,250 --> 00:21:35,199
was used to build particular terms and

656
00:21:35,200 --> 00:21:37,029
the abstract syntax tree type.

657
00:21:37,030 --> 00:21:38,769
And then I need to do a subclass analysis

658
00:21:38,770 --> 00:21:40,209
inside that one. Then I need to do a

659
00:21:40,210 --> 00:21:41,829
subclass analysis inside that one.

660
00:21:41,830 --> 00:21:43,179
I'm typing the same thing over and over

661
00:21:43,180 --> 00:21:45,069
again. A lot of stuff is going on here,

662
00:21:45,070 --> 00:21:47,139
but it's very repetitive.

663
00:21:47,140 --> 00:21:49,239
And at some point I just gave up

664
00:21:49,240 --> 00:21:51,429
and wrote, that's

665
00:21:51,430 --> 00:21:52,959
not going to work too hard to write that

666
00:21:52,960 --> 00:21:54,489
manually. So keep that in mind.

667
00:21:54,490 --> 00:21:56,109
But for now, this is some reason to be

668
00:21:56,110 --> 00:21:58,299
skeptical that this approach will scale

669
00:21:58,300 --> 00:22:00,639
to full scale, realistic deployed systems

670
00:22:00,640 --> 00:22:02,439
if we have to work this hard on a program

671
00:22:02,440 --> 00:22:04,239
that fits in one page.

672
00:22:04,240 --> 00:22:05,709
But I will come back to that.

673
00:22:10,050 --> 00:22:12,659
All right, so having concretized

674
00:22:12,660 --> 00:22:14,159
things about the use of assistance, you

675
00:22:14,160 --> 00:22:15,449
might have a few questions and I'm going

676
00:22:15,450 --> 00:22:16,739
to try to answer the standard ones.

677
00:22:16,740 --> 00:22:18,899
Now, one is

678
00:22:18,900 --> 00:22:20,849
OK, we know it's pretty hard for

679
00:22:20,850 --> 00:22:22,319
programmers to get the code right.

680
00:22:22,320 --> 00:22:23,789
Despite all these techniques that we

681
00:22:23,790 --> 00:22:25,799
have, there are still frequently high

682
00:22:25,800 --> 00:22:27,030
cost bugs in the wild.

683
00:22:28,080 --> 00:22:30,029
Is it really that much better to make

684
00:22:30,030 --> 00:22:32,009
programmers get the specs right?

685
00:22:32,010 --> 00:22:33,929
For many systems, we have trouble

686
00:22:33,930 --> 00:22:36,509
imagining how a an unambiguous

687
00:22:36,510 --> 00:22:38,549
specification would be much shorter or

688
00:22:38,550 --> 00:22:40,410
simpler than the original code.

689
00:22:41,610 --> 00:22:44,729
One answer is I would claim

690
00:22:44,730 --> 00:22:46,619
verification of this kind is most

691
00:22:46,620 --> 00:22:48,929
worthwhile when we apply it to

692
00:22:48,930 --> 00:22:51,059
the most commonly used system

693
00:22:51,060 --> 00:22:52,829
infrastructure that sits underneath

694
00:22:52,830 --> 00:22:54,179
everything else that we run.

695
00:22:54,180 --> 00:22:56,339
So processors, operating systems,

696
00:22:56,340 --> 00:22:59,009
compilers, databases and so on

697
00:22:59,010 --> 00:23:01,109
our first foundation for

698
00:23:01,110 --> 00:23:02,009
everything else we do.

699
00:23:02,010 --> 00:23:04,079
And second, they do turn out

700
00:23:04,080 --> 00:23:06,149
to have much shorter specifications

701
00:23:06,150 --> 00:23:07,150
than implementations.

702
00:23:08,790 --> 00:23:10,259
For instance, in a compiler, you can have

703
00:23:10,260 --> 00:23:11,759
a specification that doesn't say anything

704
00:23:11,760 --> 00:23:13,499
about the particular optimizations the

705
00:23:13,500 --> 00:23:15,569
compiler uses, even though those are the

706
00:23:15,570 --> 00:23:17,789
most complex parts of it and the easiest

707
00:23:17,790 --> 00:23:19,230
ones to introduce bugs in.

708
00:23:23,140 --> 00:23:25,449
So in my experience,

709
00:23:25,450 --> 00:23:27,369
when we think about what makes a speck a

710
00:23:27,370 --> 00:23:29,529
lot shorter than the original program,

711
00:23:29,530 --> 00:23:31,089
if we take a speck which covers the

712
00:23:31,090 --> 00:23:33,309
functional behavior of a system, and

713
00:23:33,310 --> 00:23:35,529
then we add in optimizations that make

714
00:23:35,530 --> 00:23:37,719
the system fast enough, little

715
00:23:37,720 --> 00:23:39,789
enough memory and so on, then

716
00:23:39,790 --> 00:23:41,829
we recovered the original implementation.

717
00:23:41,830 --> 00:23:43,149
So to the extent there are a lot of

718
00:23:43,150 --> 00:23:45,219
optimizations in the system, then

719
00:23:45,220 --> 00:23:47,439
there's a real opportunity to read

720
00:23:47,440 --> 00:23:49,509
aspect that is much clearer than the

721
00:23:49,510 --> 00:23:50,510
original system.

722
00:23:52,740 --> 00:23:55,319
But still, this is essentially programing

723
00:23:55,320 --> 00:23:56,669
just in a different language, and it's

724
00:23:56,670 --> 00:23:58,169
still hard to get right.

725
00:23:58,170 --> 00:24:00,239
So let's take an example

726
00:24:00,240 --> 00:24:02,189
to think through what might be the

727
00:24:02,190 --> 00:24:03,539
biggest challenges. Imagine we have a

728
00:24:03,540 --> 00:24:05,849
compiler who practice theorem essentially

729
00:24:05,850 --> 00:24:07,709
says the compiler bridges the gap between

730
00:24:07,710 --> 00:24:09,839
the semantics of the source language and

731
00:24:09,840 --> 00:24:11,519
the target language for this purpose.

732
00:24:11,520 --> 00:24:13,349
Let's say the target language is machine

733
00:24:13,350 --> 00:24:14,429
language.

734
00:24:14,430 --> 00:24:16,439
And imagine each of these semantics.

735
00:24:16,440 --> 00:24:17,639
When I write the word semantics, it

736
00:24:17,640 --> 00:24:19,739
really means reference interpreter or

737
00:24:19,740 --> 00:24:20,999
something like that, though usually it's

738
00:24:21,000 --> 00:24:23,190
written in a logical notation.

739
00:24:24,900 --> 00:24:26,399
I only wrote the compiler because I

740
00:24:26,400 --> 00:24:28,199
wanted to use it to compile some

741
00:24:28,200 --> 00:24:31,019
applications and imagine each application

742
00:24:31,020 --> 00:24:33,089
has its own specification and then the

743
00:24:33,090 --> 00:24:34,859
application theorem kind of snaps

744
00:24:34,860 --> 00:24:36,509
together with the compiler theorem

745
00:24:36,510 --> 00:24:37,799
through the semantics of the source

746
00:24:37,800 --> 00:24:39,959
language of the compiler and then down

747
00:24:39,960 --> 00:24:42,149
below AB, my processor that implements

748
00:24:42,150 --> 00:24:43,769
the machine language semantics on top of,

749
00:24:43,770 --> 00:24:46,169
say, VDL semantics or semantics

750
00:24:46,170 --> 00:24:47,969
or whatever language the processor is

751
00:24:47,970 --> 00:24:49,079
written in.

752
00:24:49,080 --> 00:24:51,449
So now think of all this together as one

753
00:24:51,450 --> 00:24:53,849
composite verified unit.

754
00:24:53,850 --> 00:24:55,049
The theorems are approved for the

755
00:24:55,050 --> 00:24:56,669
different parts all compose with each

756
00:24:56,670 --> 00:24:58,829
other to create the abstraction

757
00:24:58,830 --> 00:25:01,319
of a thing that sits

758
00:25:01,320 --> 00:25:03,509
on a particular circuit board and behaves

759
00:25:03,510 --> 00:25:05,579
in this application

760
00:25:05,580 --> 00:25:06,580
specific way.

761
00:25:07,920 --> 00:25:10,049
The really neat thing about this style of

762
00:25:10,050 --> 00:25:12,329
development is that first,

763
00:25:12,330 --> 00:25:14,279
none of the code we wrote is trusted

764
00:25:14,280 --> 00:25:16,499
anymore. It's all internal details

765
00:25:16,500 --> 00:25:18,839
inside this big block and

766
00:25:18,840 --> 00:25:20,339
it's all covered by the proofs that we've

767
00:25:20,340 --> 00:25:22,769
done. Second, even

768
00:25:22,770 --> 00:25:25,049
the internal specifications are now

769
00:25:25,050 --> 00:25:26,069
no longer trusted.

770
00:25:26,070 --> 00:25:28,199
They've become encapsulated details

771
00:25:28,200 --> 00:25:30,269
of the system and it's proof.

772
00:25:30,270 --> 00:25:32,759
So a mistake and say the semantics

773
00:25:32,760 --> 00:25:34,199
of the C programing language, which you

774
00:25:34,200 --> 00:25:35,729
can imagine is pretty easy to make if

775
00:25:35,730 --> 00:25:37,349
you're trying to write out the meaning of

776
00:25:37,350 --> 00:25:39,599
C and not miss any cases does

777
00:25:39,600 --> 00:25:41,939
not allow you to accidentally accept

778
00:25:41,940 --> 00:25:44,009
an incorrect system in this style as

779
00:25:44,010 --> 00:25:45,899
long as you keep that specification

780
00:25:45,900 --> 00:25:48,629
internal to the overall functionality

781
00:25:48,630 --> 00:25:49,630
that you're building.

782
00:25:51,420 --> 00:25:53,519
So let's contrast this with

783
00:25:53,520 --> 00:25:55,019
what we have to do to convince ourselves

784
00:25:55,020 --> 00:25:56,669
in the current standard software

785
00:25:56,670 --> 00:25:58,709
development approach, we have to run not

786
00:25:58,710 --> 00:26:00,419
just unit tests, but our individual

787
00:26:00,420 --> 00:26:02,339
libraries, but also for system

788
00:26:02,340 --> 00:26:04,619
integration tests, because as we compose

789
00:26:04,620 --> 00:26:06,779
libraries and modules together,

790
00:26:06,780 --> 00:26:09,029
the state space of the system is growing

791
00:26:09,030 --> 00:26:11,279
exponentially and therefore it's

792
00:26:11,280 --> 00:26:13,559
much easier to have a complete feeling

793
00:26:13,560 --> 00:26:15,689
coverage for individual functions

794
00:26:15,690 --> 00:26:17,309
than for full systems.

795
00:26:17,310 --> 00:26:19,499
In contrast with the

796
00:26:19,500 --> 00:26:21,629
computer proof assistant approach,

797
00:26:21,630 --> 00:26:23,519
the system integration theorems that we

798
00:26:23,520 --> 00:26:26,309
prove actually imply all the components

799
00:26:26,310 --> 00:26:28,559
inside the box are doing what they need

800
00:26:28,560 --> 00:26:30,689
to do to make the full system work.

801
00:26:30,690 --> 00:26:32,879
And we really fundamentally don't need

802
00:26:32,880 --> 00:26:35,009
the equivalent of unit tests in such a

803
00:26:35,010 --> 00:26:36,479
central role as it has today.

804
00:26:37,950 --> 00:26:40,019
In the old way, we have to do

805
00:26:40,020 --> 00:26:42,209
a careful review of all of our components

806
00:26:42,210 --> 00:26:44,369
because a bug in a corner case in any one

807
00:26:44,370 --> 00:26:46,319
of them could completely destroy our

808
00:26:46,320 --> 00:26:48,359
guarantees of the whole system.

809
00:26:48,360 --> 00:26:51,119
In contrast with the proof base style,

810
00:26:51,120 --> 00:26:53,189
we only need to do careful code review

811
00:26:53,190 --> 00:26:55,649
on the externally facing specifications.

812
00:26:55,650 --> 00:26:57,809
In fact, you could even accept,

813
00:26:57,810 --> 00:26:59,879
in theory, a component from

814
00:26:59,880 --> 00:27:01,019
your worst enemy.

815
00:27:01,020 --> 00:27:03,149
As long as it was proved to meet

816
00:27:03,150 --> 00:27:04,859
the appropriate specification, you can

817
00:27:04,860 --> 00:27:07,619
run it without even inspecting the code.

818
00:27:07,620 --> 00:27:09,749
And this is potentially a way

819
00:27:09,750 --> 00:27:11,849
to support, say,

820
00:27:11,850 --> 00:27:13,679
downloading applications from untrusted

821
00:27:13,680 --> 00:27:15,329
parties and doing proof, checking to

822
00:27:15,330 --> 00:27:17,009
guarantee that they meet particular

823
00:27:17,010 --> 00:27:18,010
security policies.

824
00:27:19,800 --> 00:27:22,049
So another concern I showed

825
00:27:22,050 --> 00:27:24,029
that as I tried to write out that proof

826
00:27:24,030 --> 00:27:26,129
for a medium sized program modeled after

827
00:27:26,130 --> 00:27:28,079
a compiler optimization, it got out of

828
00:27:28,080 --> 00:27:30,569
hand quickly and I ran out of patience.

829
00:27:30,570 --> 00:27:33,299
Well, we're familiar with the scenario,

830
00:27:33,300 --> 00:27:35,399
if you imagine it's nineteen

831
00:27:35,400 --> 00:27:37,379
fifty or so and someone says writing

832
00:27:37,380 --> 00:27:38,939
machine code program sure is a lot of

833
00:27:38,940 --> 00:27:39,359
work.

834
00:27:39,360 --> 00:27:40,979
It's never going to be very popular

835
00:27:40,980 --> 00:27:42,479
today. We know we have compilers that

836
00:27:42,480 --> 00:27:44,339
will go from successively higher levels

837
00:27:44,340 --> 00:27:45,779
of abstraction and automatically write

838
00:27:45,780 --> 00:27:46,889
that code for us.

839
00:27:46,890 --> 00:27:48,689
And by the way, we also have libraries at

840
00:27:48,690 --> 00:27:50,759
each of these language levels that we

841
00:27:50,760 --> 00:27:52,859
can typically reuse for most of the work.

842
00:27:52,860 --> 00:27:54,629
Say, if you're putting together a Web

843
00:27:54,630 --> 00:27:56,999
application with a popular framework,

844
00:27:57,000 --> 00:27:58,739
we don't expect to rewrite everything

845
00:27:58,740 --> 00:27:59,849
each time.

846
00:27:59,850 --> 00:28:02,009
In fact, the same story applies

847
00:28:02,010 --> 00:28:03,629
in the world of machine check.

848
00:28:03,630 --> 00:28:05,429
They're improving where we can build up

849
00:28:05,430 --> 00:28:07,859
layers of automatic proof generators.

850
00:28:07,860 --> 00:28:09,959
We can build up libraries of Limas that

851
00:28:09,960 --> 00:28:12,269
we appeal to for commonly

852
00:28:12,270 --> 00:28:13,829
useful facts.

853
00:28:13,830 --> 00:28:15,839
And this is the secret to building up an

854
00:28:15,840 --> 00:28:18,299
ecosystem of tools to make it

855
00:28:18,300 --> 00:28:20,399
relatively low cost to create a new

856
00:28:20,400 --> 00:28:22,499
development with a proof that

857
00:28:22,500 --> 00:28:24,659
is roughly similar to one you've

858
00:28:24,660 --> 00:28:25,660
done previously.

859
00:28:26,520 --> 00:28:28,289
So let me now go back in and make these

860
00:28:28,290 --> 00:28:30,599
proofs that I wrote before a little more

861
00:28:30,600 --> 00:28:31,769
appetizing.

862
00:28:31,770 --> 00:28:34,349
And so I'll

863
00:28:34,350 --> 00:28:35,549
restart this one.

864
00:28:37,650 --> 00:28:40,109
So I wrote this proof out here

865
00:28:40,110 --> 00:28:42,059
with some very manual steps, but it turns

866
00:28:42,060 --> 00:28:44,159
out I can just do this.

867
00:28:44,160 --> 00:28:46,229
I can. First, I'm going to use

868
00:28:46,230 --> 00:28:47,640
a semicolon to say

869
00:28:49,080 --> 00:28:51,179
take this, run this step a bunch of

870
00:28:51,180 --> 00:28:53,539
new stuff. Goals will exist afterward,

871
00:28:53,540 --> 00:28:55,599
then we're on this one, on all of them,

872
00:28:55,600 --> 00:28:57,729
and then run this one and all

873
00:28:57,730 --> 00:28:59,259
them, the next one has a kind of funny

874
00:28:59,260 --> 00:29:01,059
name called intuition that actually

875
00:29:01,060 --> 00:29:03,339
refers to intuitionist logic,

876
00:29:03,340 --> 00:29:05,889
but people

877
00:29:05,890 --> 00:29:07,569
typically experience it in a different

878
00:29:07,570 --> 00:29:09,789
way at first. That's not what it means.

879
00:29:09,790 --> 00:29:11,650
And then that just proves everything.

880
00:29:16,370 --> 00:29:18,469
And let's see if we can get lucky with

881
00:29:18,470 --> 00:29:19,730
the same thing, the next proof.

882
00:29:21,780 --> 00:29:22,780
Now it's called less

883
00:29:23,940 --> 00:29:26,489
and almost it almost worked.

884
00:29:26,490 --> 00:29:28,109
What we're missing is

885
00:29:30,090 --> 00:29:31,919
recall down here, I mentioned Alema.

886
00:29:31,920 --> 00:29:33,809
We already proved Caulked hadn't been

887
00:29:33,810 --> 00:29:35,279
told that it should consider that Lamba.

888
00:29:35,280 --> 00:29:36,479
So it didn't try.

889
00:29:36,480 --> 00:29:37,680
So what if I say

890
00:29:39,060 --> 00:29:40,319
two steps, I'm going to take dilema.

891
00:29:40,320 --> 00:29:42,719
We already proved it's called length

892
00:29:42,720 --> 00:29:43,919
Pend.

893
00:29:43,920 --> 00:29:46,059
And I'm going to say, hey, remember that

894
00:29:46,060 --> 00:29:47,879
Lemhi approved? That'll be useful.

895
00:29:47,880 --> 00:29:49,769
Try to use it as a rewriting rule

896
00:29:51,300 --> 00:29:53,730
and I can say on a rewrite with core.

897
00:29:56,320 --> 00:29:58,389
And now it's applied here

898
00:29:58,390 --> 00:29:59,709
and we're almost done.

899
00:29:59,710 --> 00:30:01,539
I just need to say these three steps

900
00:30:01,540 --> 00:30:03,759
should be done over and over again until

901
00:30:03,760 --> 00:30:05,139
theorem is proved.

902
00:30:05,140 --> 00:30:07,599
So this is an example of a loop here.

903
00:30:07,600 --> 00:30:08,979
This is actually a Turing complete

904
00:30:08,980 --> 00:30:10,119
scripting language.

905
00:30:10,120 --> 00:30:11,500
It's very different from

906
00:30:12,790 --> 00:30:15,039
the the primitive term

907
00:30:15,040 --> 00:30:17,589
language with just 10 or so steps in it.

908
00:30:17,590 --> 00:30:19,599
We can build up whatever abstractions we

909
00:30:19,600 --> 00:30:21,699
need here with loops and recursion and

910
00:30:21,700 --> 00:30:22,809
data structures.

911
00:30:22,810 --> 00:30:24,129
So what we're really doing is we're

912
00:30:24,130 --> 00:30:26,109
writing scripts that find the proofs for

913
00:30:26,110 --> 00:30:27,279
us.

914
00:30:27,280 --> 00:30:28,839
A particular script is designed to be

915
00:30:28,840 --> 00:30:30,639
readable by humans.

916
00:30:30,640 --> 00:30:32,079
But because you can read the script

917
00:30:32,080 --> 00:30:34,419
doesn't mean you understand every precise

918
00:30:34,420 --> 00:30:36,579
detail behind the low level proof

919
00:30:36,580 --> 00:30:38,259
it's generating. That would generally be

920
00:30:38,260 --> 00:30:39,569
completely overwhelming to have to

921
00:30:39,570 --> 00:30:41,109
understand those details.

922
00:30:41,110 --> 00:30:42,459
And a nice thing about writing scripts

923
00:30:42,460 --> 00:30:44,109
like this is that often if you change the

924
00:30:44,110 --> 00:30:45,969
system, you're verifying if you wrote the

925
00:30:45,970 --> 00:30:47,889
script well, it will keep working.

926
00:30:47,890 --> 00:30:49,539
And that's a partial answer to the

927
00:30:49,540 --> 00:30:51,189
concerns from that nineteen seventy nine

928
00:30:51,190 --> 00:30:53,529
article about Proops keeping

929
00:30:53,530 --> 00:30:55,119
up with changes to systems.

930
00:30:55,120 --> 00:30:57,219
So it's an engineering problem.

931
00:30:57,220 --> 00:31:00,309
Just like good library design.

932
00:31:00,310 --> 00:31:02,379
We want to design good proofs that

933
00:31:02,380 --> 00:31:04,569
are adaptable to many similar

934
00:31:04,570 --> 00:31:05,739
situations.

935
00:31:05,740 --> 00:31:07,389
We don't have to give up any formal

936
00:31:07,390 --> 00:31:09,549
assurance if we do this properly.

937
00:31:09,550 --> 00:31:11,019
OK, so here's this example, which I

938
00:31:11,020 --> 00:31:12,789
didn't have time to explain in detail.

939
00:31:12,790 --> 00:31:14,529
I just want to show how short this proof

940
00:31:14,530 --> 00:31:16,029
can be approved to us before we won't

941
00:31:16,030 --> 00:31:17,030
even need those.

942
00:31:18,910 --> 00:31:20,980
What we can do here, Adduction E.

943
00:31:22,980 --> 00:31:24,089
Let's see.

944
00:31:24,090 --> 00:31:26,549
Simple mention,

945
00:31:26,550 --> 00:31:27,780
a few cases are left

946
00:31:28,920 --> 00:31:31,079
when I see you, I look at this is this

947
00:31:31,080 --> 00:31:33,179
program contains a program

948
00:31:33,180 --> 00:31:35,249
level pattern match over a term

949
00:31:35,250 --> 00:31:37,709
this we should that that's essentially

950
00:31:37,710 --> 00:31:39,299
a case split in the program.

951
00:31:39,300 --> 00:31:41,129
We should mirror it with a case split in

952
00:31:41,130 --> 00:31:43,559
the proof. So let's try that

953
00:31:43,560 --> 00:31:45,479
match goal is construction for examining

954
00:31:45,480 --> 00:31:46,559
what we're trying to prove and doing

955
00:31:46,560 --> 00:31:47,759
something differently based on what we

956
00:31:47,760 --> 00:31:48,779
see.

957
00:31:48,780 --> 00:31:50,939
And what I'll do is

958
00:31:50,940 --> 00:31:52,769
I'll look for any pattern match.

959
00:31:53,880 --> 00:31:55,799
This is a syntax for matching any pattern

960
00:31:55,800 --> 00:31:57,629
match inside the goal.

961
00:31:57,630 --> 00:31:59,729
And I will do a case analysis

962
00:31:59,730 --> 00:32:01,949
on that and then simplify

963
00:32:01,950 --> 00:32:03,599
things afterward.

964
00:32:03,600 --> 00:32:05,699
And I'll say do

965
00:32:05,700 --> 00:32:07,589
this over and over again on all the

966
00:32:07,590 --> 00:32:09,539
subgoals that we generated.

967
00:32:09,540 --> 00:32:11,219
Now we're very close to done.

968
00:32:11,220 --> 00:32:14,069
We have just a bunch of arithmetic facts

969
00:32:14,070 --> 00:32:16,139
and here we have an eval on

970
00:32:16,140 --> 00:32:18,239
a particular term which given

971
00:32:18,240 --> 00:32:19,709
the definition of eval, take that word

972
00:32:19,710 --> 00:32:20,789
for it can be simplified.

973
00:32:20,790 --> 00:32:22,799
So if I did say simplify, not just below

974
00:32:22,800 --> 00:32:24,929
the line, but everywhere, then those will

975
00:32:24,930 --> 00:32:26,989
go into simpler forms as well.

976
00:32:26,990 --> 00:32:28,529
And now I notice I have this form with a

977
00:32:28,530 --> 00:32:30,839
bunch of things are equal to evaluation

978
00:32:30,840 --> 00:32:32,969
results. I want to use those equality's

979
00:32:32,970 --> 00:32:35,429
to rewrite down below here.

980
00:32:35,430 --> 00:32:38,069
So let me do that as well.

981
00:32:38,070 --> 00:32:40,170
Repeatedly look for.

982
00:32:42,420 --> 00:32:44,909
Cases of hypothesis that

983
00:32:44,910 --> 00:32:46,499
says something is equal to evaluate

984
00:32:46,500 --> 00:32:47,500
something.

985
00:32:48,930 --> 00:32:51,119
Rewrite that and then

986
00:32:51,120 --> 00:32:52,120
get rid of it.

987
00:32:52,830 --> 00:32:54,929
So now all those are gone, we get simple

988
00:32:54,930 --> 00:32:56,879
looking things like an equal zero percent

989
00:32:56,880 --> 00:32:58,469
and it turns out I have how many of

990
00:32:58,470 --> 00:33:01,089
these? It says I have 53 of these facts,

991
00:33:01,090 --> 00:33:02,759
a whole bunch of not two interesting

992
00:33:02,760 --> 00:33:04,169
things with zeroes and pluses and

993
00:33:04,170 --> 00:33:04,859
everything.

994
00:33:04,860 --> 00:33:06,569
It turns out all of these follow from the

995
00:33:06,570 --> 00:33:08,369
axioms of the algebraic structure called

996
00:33:08,370 --> 00:33:10,469
Siemering, which the natural numbers

997
00:33:10,470 --> 00:33:11,669
that I'm using here belong to.

998
00:33:11,670 --> 00:33:14,369
So I can just say prove that by ring

999
00:33:14,370 --> 00:33:15,869
and then

1000
00:33:17,220 --> 00:33:19,379
that is correct for

1001
00:33:19,380 --> 00:33:20,729
that proof. So

1002
00:33:21,810 --> 00:33:23,429
cool thing about this kind of proof.

1003
00:33:23,430 --> 00:33:25,499
Imagine I added a completely

1004
00:33:25,500 --> 00:33:27,659
new kind of note to my syntax trees

1005
00:33:27,660 --> 00:33:29,220
like I got really,

1006
00:33:31,710 --> 00:33:34,349
really ambitious and added subtraction.

1007
00:33:34,350 --> 00:33:36,629
And then this same proof

1008
00:33:36,630 --> 00:33:38,369
would pretty much work without any

1009
00:33:38,370 --> 00:33:40,859
changes and even more ambitious

1010
00:33:40,860 --> 00:33:41,999
changes to the language or the

1011
00:33:42,000 --> 00:33:44,339
optimization could lead to similar

1012
00:33:44,340 --> 00:33:46,799
outcomes. So this essentially is how

1013
00:33:46,800 --> 00:33:49,079
proof authoring scales to large systems

1014
00:33:49,080 --> 00:33:50,999
and to rapidly changing systems.

1015
00:33:53,550 --> 00:33:54,900
So back to these slides.

1016
00:33:57,300 --> 00:33:59,369
So this is the underlying technology

1017
00:33:59,370 --> 00:34:00,959
for a bunch of different cool things that

1018
00:34:00,960 --> 00:34:02,779
we can do with proof assistance.

1019
00:34:02,780 --> 00:34:05,069
So the traditional mode math proof

1020
00:34:05,070 --> 00:34:07,199
says the user has some sort of theorem in

1021
00:34:07,200 --> 00:34:08,189
mind.

1022
00:34:08,190 --> 00:34:09,539
It's a proof of it passes it off to the

1023
00:34:09,540 --> 00:34:11,039
proof checker and we make sure that it's

1024
00:34:11,040 --> 00:34:12,089
convincing.

1025
00:34:12,090 --> 00:34:13,499
We do the same thing for programs where

1026
00:34:13,500 --> 00:34:16,259
we start with an implementation and

1027
00:34:16,260 --> 00:34:18,479
prove a theorem about it and

1028
00:34:18,480 --> 00:34:19,799
check that that proof is true.

1029
00:34:19,800 --> 00:34:21,299
That's called verification.

1030
00:34:21,300 --> 00:34:22,678
There's also kind of the other direction

1031
00:34:22,679 --> 00:34:23,999
called synthesis, where we don't even

1032
00:34:24,000 --> 00:34:25,948
have the artifact to begin with.

1033
00:34:25,949 --> 00:34:28,138
We just have a wish list, a specification

1034
00:34:28,139 --> 00:34:30,059
that says, I sure wish I had a foobar

1035
00:34:30,060 --> 00:34:31,079
that did best.

1036
00:34:31,080 --> 00:34:33,059
And then we actually passed that

1037
00:34:33,060 --> 00:34:35,009
specification into some automated system

1038
00:34:35,010 --> 00:34:37,049
that builds the implementation for us and

1039
00:34:37,050 --> 00:34:38,519
actually builds the proof of the

1040
00:34:38,520 --> 00:34:39,959
implementation at the same time.

1041
00:34:39,960 --> 00:34:41,459
And we can check that proof without

1042
00:34:41,460 --> 00:34:43,198
having to trust the synthesis engine that

1043
00:34:43,199 --> 00:34:44,369
did this work for us.

1044
00:34:44,370 --> 00:34:45,749
Both of these can often be useful and

1045
00:34:45,750 --> 00:34:47,340
they build on the same technology.

1046
00:34:48,810 --> 00:34:50,669
I also showed examples so far that only

1047
00:34:50,670 --> 00:34:52,229
write programs in the functional language

1048
00:34:52,230 --> 00:34:54,419
that's built into the caucus system.

1049
00:34:54,420 --> 00:34:56,519
But it's such a rich logic that we

1050
00:34:56,520 --> 00:34:58,019
could encode pretty much any other

1051
00:34:58,020 --> 00:34:58,979
language you can think of.

1052
00:34:58,980 --> 00:35:00,269
We can write down its syntax and

1053
00:35:00,270 --> 00:35:01,270
semantics.

1054
00:35:02,700 --> 00:35:04,469
So here are some examples of languages

1055
00:35:04,470 --> 00:35:06,689
that developments in Cork have worked

1056
00:35:06,690 --> 00:35:08,699
with by formalizing what exactly these

1057
00:35:08,700 --> 00:35:10,649
languages are going all the way from high

1058
00:35:10,650 --> 00:35:12,659
level database, query languages down to

1059
00:35:12,660 --> 00:35:14,129
hardware, description languages, and we

1060
00:35:14,130 --> 00:35:16,139
can find these, compose them, do all

1061
00:35:16,140 --> 00:35:17,140
sorts of cool things.

1062
00:35:18,510 --> 00:35:20,669
We can also use a wide variety of tools

1063
00:35:20,670 --> 00:35:22,500
to construct our programs and our proofs.

1064
00:35:23,580 --> 00:35:25,499
We can use compilers between languages.

1065
00:35:25,500 --> 00:35:27,359
We can build automated proofers that live

1066
00:35:27,360 --> 00:35:29,309
within the ecosystem and generate calk

1067
00:35:29,310 --> 00:35:31,979
proofs. We can build synthesis tools.

1068
00:35:31,980 --> 00:35:33,749
The key thing is every tool we introduce

1069
00:35:33,750 --> 00:35:35,849
in this ecosystem, just like,

1070
00:35:35,850 --> 00:35:38,339
say, in a regular software ecosystem,

1071
00:35:38,340 --> 00:35:39,959
everything needs to somehow connect to a

1072
00:35:39,960 --> 00:35:41,909
common binary object format.

1073
00:35:41,910 --> 00:35:43,349
Here, everything needs to connect to a

1074
00:35:43,350 --> 00:35:45,209
common proof format.

1075
00:35:45,210 --> 00:35:46,499
The most straightforward way of doing

1076
00:35:46,500 --> 00:35:48,359
that is to take each of these pieces, say

1077
00:35:48,360 --> 00:35:50,939
some arrow at this translation, and prove

1078
00:35:50,940 --> 00:35:53,249
that the translator itself is correct.

1079
00:35:53,250 --> 00:35:55,049
It always maps inputs to appropriate

1080
00:35:55,050 --> 00:35:57,149
outputs. But another approach is to

1081
00:35:57,150 --> 00:35:59,309
create a proof generating piece

1082
00:35:59,310 --> 00:36:01,169
that, for instance, takes in an input,

1083
00:36:01,170 --> 00:36:02,969
produces not just the output, but also a

1084
00:36:02,970 --> 00:36:04,859
proof that that particular output is

1085
00:36:04,860 --> 00:36:07,049
correct. And for different circumstances,

1086
00:36:07,050 --> 00:36:08,879
it's more appropriate to use one of these

1087
00:36:08,880 --> 00:36:10,679
two styles. Luckily, the same proof

1088
00:36:10,680 --> 00:36:12,090
checker can be used with both of them.

1089
00:36:13,620 --> 00:36:15,299
So to get more concrete, let me tell you

1090
00:36:15,300 --> 00:36:16,739
about two projects that I've been working

1091
00:36:16,740 --> 00:36:17,759
on recently.

1092
00:36:17,760 --> 00:36:19,049
First one is called Commy.

1093
00:36:19,050 --> 00:36:21,389
It's all about proof,

1094
00:36:21,390 --> 00:36:23,199
support for digital hardware.

1095
00:36:23,200 --> 00:36:24,719
It's joint work with a number of other

1096
00:36:24,720 --> 00:36:26,219
people at MIT listed here.

1097
00:36:27,360 --> 00:36:29,969
And basically the story is,

1098
00:36:29,970 --> 00:36:32,219
let's imagine that we want to support

1099
00:36:32,220 --> 00:36:34,109
rapid opensource development of new

1100
00:36:34,110 --> 00:36:36,299
digital hardware systems by mashing up

1101
00:36:36,300 --> 00:36:38,339
components from libraries.

1102
00:36:38,340 --> 00:36:39,869
So, for instance, we might pull some

1103
00:36:39,870 --> 00:36:42,089
processors out of one library, a cache

1104
00:36:42,090 --> 00:36:43,859
coherent memory system out of another.

1105
00:36:43,860 --> 00:36:45,779
Maybe we write our own new accelerators

1106
00:36:45,780 --> 00:36:47,579
for a particular purpose.

1107
00:36:47,580 --> 00:36:49,649
And by the way, all these boxes have

1108
00:36:49,650 --> 00:36:52,319
proofs. So when you assemble your system,

1109
00:36:52,320 --> 00:36:54,059
you don't just run it.

1110
00:36:54,060 --> 00:36:56,189
You also combine the proofs of the pieces

1111
00:36:56,190 --> 00:36:58,049
into a proof that the full system is

1112
00:36:58,050 --> 00:36:59,249
correct.

1113
00:36:59,250 --> 00:37:01,499
And you can potentially know correctness

1114
00:37:01,500 --> 00:37:03,899
without needing to do any debugging.

1115
00:37:03,900 --> 00:37:04,900
If you've

1116
00:37:06,480 --> 00:37:08,369
proved your top level, it's written your

1117
00:37:08,370 --> 00:37:09,869
top level theorem correctly and check

1118
00:37:09,870 --> 00:37:11,759
that it follows from the constituent

1119
00:37:11,760 --> 00:37:12,760
proofs.

1120
00:37:13,680 --> 00:37:15,069
What do I mean by correctness?

1121
00:37:15,070 --> 00:37:17,279
Well, imagine we have a highly optimized

1122
00:37:17,280 --> 00:37:19,259
implementation of some hardware component

1123
00:37:19,260 --> 00:37:21,209
and we've also written a specification

1124
00:37:21,210 --> 00:37:22,709
which is essentially the simplest

1125
00:37:22,710 --> 00:37:25,199
possible way to express functionally

1126
00:37:25,200 --> 00:37:26,909
what that box is supposed to do.

1127
00:37:26,910 --> 00:37:29,399
We ignore optimization, performance, etc.

1128
00:37:29,400 --> 00:37:30,479
We might call this a reference

1129
00:37:30,480 --> 00:37:32,699
implementation, not just a spec,

1130
00:37:32,700 --> 00:37:34,769
and we want to prove

1131
00:37:34,770 --> 00:37:36,899
it in the semantics literature

1132
00:37:36,900 --> 00:37:38,399
we would call a contextual refinement

1133
00:37:38,400 --> 00:37:39,749
property.

1134
00:37:39,750 --> 00:37:41,279
What that really means is let's invite

1135
00:37:41,280 --> 00:37:43,409
into the picture and omniscient judge is

1136
00:37:43,410 --> 00:37:45,299
going to very carefully inspect both

1137
00:37:45,300 --> 00:37:46,289
boxes.

1138
00:37:46,290 --> 00:37:47,759
He's going to send them inputs, look at

1139
00:37:47,760 --> 00:37:49,889
the outputs. And what he's trying to do

1140
00:37:49,890 --> 00:37:51,599
is find any difference in the way they

1141
00:37:51,600 --> 00:37:52,499
behave.

1142
00:37:52,500 --> 00:37:54,359
So if this judge can ever get the

1143
00:37:54,360 --> 00:37:56,429
implementation to have some

1144
00:37:56,430 --> 00:37:58,559
externally visible EYOB behavior that

1145
00:37:58,560 --> 00:38:00,989
the spec couldn't possibly generate,

1146
00:38:00,990 --> 00:38:03,389
then our system loses.

1147
00:38:03,390 --> 00:38:05,489
But if an every possible interaction he

1148
00:38:05,490 --> 00:38:07,529
could have, anything the left side can

1149
00:38:07,530 --> 00:38:09,419
do, the right side can also do, then we

1150
00:38:09,420 --> 00:38:11,489
say it is a correct implementation.

1151
00:38:13,140 --> 00:38:14,879
And in our system, we write these

1152
00:38:14,880 --> 00:38:16,769
hardware components in the blue spec,

1153
00:38:16,770 --> 00:38:18,839
high level hardware language, which

1154
00:38:18,840 --> 00:38:20,519
makes it particularly straightforward to

1155
00:38:20,520 --> 00:38:22,319
define what exactly are the legal ways to

1156
00:38:22,320 --> 00:38:25,169
interact with the the system.

1157
00:38:25,170 --> 00:38:27,449
And importantly, many internal

1158
00:38:27,450 --> 00:38:29,009
implementation details may not be

1159
00:38:29,010 --> 00:38:30,719
observed directly, just like you can't

1160
00:38:30,720 --> 00:38:32,609
peek into the private fields of a Java

1161
00:38:32,610 --> 00:38:34,199
class. And that's important for

1162
00:38:34,200 --> 00:38:36,059
facilitating optimization without

1163
00:38:36,060 --> 00:38:37,739
foreseeing the details in the spec.

1164
00:38:39,630 --> 00:38:41,249
So a few words about the brushback

1165
00:38:41,250 --> 00:38:43,169
language. It's kind of it in an object

1166
00:38:43,170 --> 00:38:45,089
oriented style where program modules are

1167
00:38:45,090 --> 00:38:47,399
objects that have private state and

1168
00:38:47,400 --> 00:38:48,839
public methods that are the only ones

1169
00:38:48,840 --> 00:38:50,729
allowed to access the private state.

1170
00:38:50,730 --> 00:38:52,739
And methods can call other methods and

1171
00:38:52,740 --> 00:38:53,760
other objects

1172
00:38:55,290 --> 00:38:56,879
every method. Karl appears to execute

1173
00:38:56,880 --> 00:38:58,799
atomically without any interleaving

1174
00:38:58,800 --> 00:38:59,800
concurrency.

1175
00:39:01,080 --> 00:39:03,209
And we can summarize any

1176
00:39:03,210 --> 00:39:05,519
atomic method call with a trace of all

1177
00:39:05,520 --> 00:39:07,709
the sub method calls that happened,

1178
00:39:07,710 --> 00:39:09,149
some coming into this module or some

1179
00:39:09,150 --> 00:39:10,829
going out to others with their arguments

1180
00:39:10,830 --> 00:39:12,209
and return values.

1181
00:39:12,210 --> 00:39:14,099
And then this property that connects the

1182
00:39:14,100 --> 00:39:15,569
implementation to the back is basically

1183
00:39:15,570 --> 00:39:17,729
saying any trace that my implementation

1184
00:39:17,730 --> 00:39:19,559
could ever generate could also be

1185
00:39:19,560 --> 00:39:22,529
generated by the specification.

1186
00:39:22,530 --> 00:39:24,449
This is a nice modular way of setting

1187
00:39:24,450 --> 00:39:25,949
things up because it allows you to then

1188
00:39:25,950 --> 00:39:28,469
link modules or objects together

1189
00:39:28,470 --> 00:39:30,089
as you do that certain methods become

1190
00:39:30,090 --> 00:39:32,159
private and they stop appearing in the

1191
00:39:32,160 --> 00:39:33,899
traces and you can hide the details of

1192
00:39:33,900 --> 00:39:35,549
how your spec is implemented with a

1193
00:39:35,550 --> 00:39:37,259
particular hierarchy of optimized

1194
00:39:37,260 --> 00:39:38,260
components.

1195
00:39:40,170 --> 00:39:42,479
So let me show you a quick

1196
00:39:42,480 --> 00:39:44,279
example of some code in the county

1197
00:39:44,280 --> 00:39:45,280
framework.

1198
00:39:46,230 --> 00:39:48,480
This is called Tutorial.

1199
00:39:52,810 --> 00:39:54,969
Here is some code for

1200
00:39:54,970 --> 00:39:57,189
this is a producer consumer example with

1201
00:39:57,190 --> 00:39:59,559
with two hardware nodes, with a

1202
00:39:59,560 --> 00:40:00,609
Q in between them.

1203
00:40:00,610 --> 00:40:02,229
This is the producer side.

1204
00:40:02,230 --> 00:40:04,359
It's a module with a particular register

1205
00:40:04,360 --> 00:40:06,279
in it and a rule that might run on a

1206
00:40:06,280 --> 00:40:08,409
particular clock cycle, reads a register,

1207
00:40:08,410 --> 00:40:10,659
calls a method of another module,

1208
00:40:10,660 --> 00:40:11,679
etc..

1209
00:40:11,680 --> 00:40:13,389
One thing I want to point out here, Corke

1210
00:40:13,390 --> 00:40:15,009
has an extensible parser.

1211
00:40:15,010 --> 00:40:16,599
So in this framework, we've talked

1212
00:40:16,600 --> 00:40:18,459
caulked the syntax of this hardware

1213
00:40:18,460 --> 00:40:19,599
description, language. There's no

1214
00:40:19,600 --> 00:40:21,729
built-In keyword module register rule any

1215
00:40:21,730 --> 00:40:23,589
of that. We've defined the syntax and

1216
00:40:23,590 --> 00:40:26,349
semantics in logic of this

1217
00:40:26,350 --> 00:40:28,689
hardware language, and that allows

1218
00:40:28,690 --> 00:40:30,879
us to use it to define

1219
00:40:30,880 --> 00:40:31,989
systems and do proofs.

1220
00:40:33,310 --> 00:40:34,929
And here's the consumer definition.

1221
00:40:34,930 --> 00:40:37,419
It's a lot shorter than we define a spec

1222
00:40:37,420 --> 00:40:39,039
that actually combines the producer and

1223
00:40:39,040 --> 00:40:40,749
the consumer into one module and avoids

1224
00:40:40,750 --> 00:40:42,489
using an intermediate queue.

1225
00:40:42,490 --> 00:40:44,619
And then with less than one hundred lines

1226
00:40:44,620 --> 00:40:46,179
in the file, we get down and prove a

1227
00:40:46,180 --> 00:40:48,309
theorem that the implementation

1228
00:40:48,310 --> 00:40:50,289
is a correct refinement of the

1229
00:40:50,290 --> 00:40:52,209
specification and take my word for it

1230
00:40:52,210 --> 00:40:54,339
all. This can be scaled up to pipeline

1231
00:40:54,340 --> 00:40:55,989
processors, cash, coherent memory

1232
00:40:55,990 --> 00:40:58,389
systems. We've done all those examples,

1233
00:40:58,390 --> 00:41:00,009
so we prove those individually.

1234
00:41:00,010 --> 00:41:02,139
We compose them and also compose

1235
00:41:02,140 --> 00:41:03,759
their proofs at the same time.

1236
00:41:03,760 --> 00:41:06,099
And it is quite

1237
00:41:06,100 --> 00:41:07,100
general.

1238
00:41:10,180 --> 00:41:12,249
I also want to mention one

1239
00:41:12,250 --> 00:41:13,599
application area that we've been working

1240
00:41:13,600 --> 00:41:15,339
on for this technology, some of you might

1241
00:41:15,340 --> 00:41:17,199
have heard of the risk five instruction

1242
00:41:17,200 --> 00:41:18,549
set. I think there was another

1243
00:41:18,550 --> 00:41:20,739
verification talk about it on the first

1244
00:41:20,740 --> 00:41:22,959
day of the Congress and I'm wearing

1245
00:41:22,960 --> 00:41:24,729
it five t shirt here in the back.

1246
00:41:24,730 --> 00:41:26,859
It says Instruction sets want to be

1247
00:41:26,860 --> 00:41:29,379
free. Risk five is is basically

1248
00:41:29,380 --> 00:41:31,329
like the Linux of hardware instruction

1249
00:41:31,330 --> 00:41:33,429
sets. If you think of Linux in nineteen

1250
00:41:33,430 --> 00:41:35,649
ninety five or so, it is an

1251
00:41:35,650 --> 00:41:37,179
open instruction set controlled by a

1252
00:41:37,180 --> 00:41:38,319
nonprofit foundation.

1253
00:41:38,320 --> 00:41:40,059
In terms of the definition of what are

1254
00:41:40,060 --> 00:41:41,979
the instructions, what do they do?

1255
00:41:41,980 --> 00:41:44,409
And has a lot of opportunities

1256
00:41:44,410 --> 00:41:46,269
for getting involved, including informal

1257
00:41:46,270 --> 00:41:48,399
methods. So we've been working within

1258
00:41:48,400 --> 00:41:50,349
the risk by foundation to formally define

1259
00:41:50,350 --> 00:41:52,119
the meaning of the instruction set.

1260
00:41:52,120 --> 00:41:53,919
We've been verifying components that you

1261
00:41:53,920 --> 00:41:55,689
can use to assemble your own responsive

1262
00:41:55,690 --> 00:41:57,049
systems with correctness.

1263
00:41:57,050 --> 00:41:59,439
Proops We've been working towards some

1264
00:41:59,440 --> 00:42:01,539
support for verifying the correctness

1265
00:42:01,540 --> 00:42:03,249
of risk by a machine code as well as

1266
00:42:03,250 --> 00:42:05,259
infrastructure for composing those proofs

1267
00:42:05,260 --> 00:42:06,260
together.

1268
00:42:06,940 --> 00:42:09,309
And there are some nice consequences

1269
00:42:09,310 --> 00:42:11,229
of this open model that fits really well

1270
00:42:11,230 --> 00:42:12,399
with formal methods. So in the

1271
00:42:12,400 --> 00:42:14,109
traditional world, the processor

1272
00:42:14,110 --> 00:42:15,969
manufacturers control the intellectual

1273
00:42:15,970 --> 00:42:17,979
property that defines the instruction set

1274
00:42:17,980 --> 00:42:19,569
and the processors, and they actually

1275
00:42:19,570 --> 00:42:21,789
don't want most developers to know

1276
00:42:21,790 --> 00:42:23,469
how things really work because then they

1277
00:42:23,470 --> 00:42:24,969
could potentially build competing

1278
00:42:24,970 --> 00:42:27,399
processors and and

1279
00:42:27,400 --> 00:42:28,809
hurt the bottom lines of the companies

1280
00:42:28,810 --> 00:42:29,859
that own the IP.

1281
00:42:29,860 --> 00:42:32,259
But in the risk five open model,

1282
00:42:32,260 --> 00:42:34,209
then the we can have a formal semantics

1283
00:42:34,210 --> 00:42:35,469
at the center of everything that says

1284
00:42:35,470 --> 00:42:37,149
exactly what the instruction set is,

1285
00:42:37,150 --> 00:42:39,279
allows everyone to build a cloned version

1286
00:42:39,280 --> 00:42:40,869
of the processor.

1287
00:42:40,870 --> 00:42:42,279
All those can be made open source.

1288
00:42:42,280 --> 00:42:43,569
They can all have theorems associated

1289
00:42:43,570 --> 00:42:45,549
with them connected to this formal spec.

1290
00:42:45,550 --> 00:42:47,019
Others can come along and mash up

1291
00:42:47,020 --> 00:42:50,679
existing components and

1292
00:42:50,680 --> 00:42:52,389
it can dramatically

1293
00:42:53,620 --> 00:42:56,349
lower the cost of getting into

1294
00:42:56,350 --> 00:42:58,239
hardware development in this kind of

1295
00:42:58,240 --> 00:42:59,709
space. So we're trying to build that

1296
00:42:59,710 --> 00:43:01,480
formal semantics to enable this story.

1297
00:43:04,460 --> 00:43:05,959
A second case study I want to talk about

1298
00:43:05,960 --> 00:43:07,969
to finish up here is what we've done in

1299
00:43:07,970 --> 00:43:10,069
cryptographic code, namely elliptic

1300
00:43:10,070 --> 00:43:12,199
curve cryptography, which is used in

1301
00:43:12,200 --> 00:43:14,759
class and many other settings,

1302
00:43:14,760 --> 00:43:17,329
also joint work with a few folks at MIT.

1303
00:43:17,330 --> 00:43:19,399
And so here's

1304
00:43:19,400 --> 00:43:20,569
the story here.

1305
00:43:20,570 --> 00:43:22,009
Cryptography, as we know, is really

1306
00:43:22,010 --> 00:43:24,379
important. And most applications

1307
00:43:24,380 --> 00:43:26,269
using cryptography just take libraries

1308
00:43:26,270 --> 00:43:27,529
off the shelf, which are written by a

1309
00:43:27,530 --> 00:43:29,569
small handful of elite cryptographic

1310
00:43:29,570 --> 00:43:30,570
implementers.

1311
00:43:31,490 --> 00:43:32,659
Let's think about how hard they have to

1312
00:43:32,660 --> 00:43:34,639
work to get everything together here.

1313
00:43:34,640 --> 00:43:36,379
There's a medium sized set of

1314
00:43:36,380 --> 00:43:38,059
cryptographic algorithms for all the

1315
00:43:38,060 --> 00:43:39,060
primitives we need.

1316
00:43:39,890 --> 00:43:41,239
It turns out many of these are

1317
00:43:41,240 --> 00:43:43,579
parametrized by large prime numbers

1318
00:43:43,580 --> 00:43:46,459
that are the modules for arithmetic,

1319
00:43:46,460 --> 00:43:48,259
and there's a medium sized set of

1320
00:43:48,260 --> 00:43:49,999
different hardware architectures that

1321
00:43:50,000 --> 00:43:51,169
we'd like to target.

1322
00:43:51,170 --> 00:43:53,719
It turns out that in practice,

1323
00:43:53,720 --> 00:43:55,909
for every choice of one

1324
00:43:55,910 --> 00:43:58,249
circle from each of these bubbles,

1325
00:43:58,250 --> 00:44:00,409
an expert spends at least a few days

1326
00:44:00,410 --> 00:44:02,449
rewriting everything from scratch,

1327
00:44:02,450 --> 00:44:04,579
possibly in assembly language.

1328
00:44:04,580 --> 00:44:07,429
And this is not a very scalable process.

1329
00:44:07,430 --> 00:44:09,229
And of course, when doing that sort of

1330
00:44:09,230 --> 00:44:11,059
thing, we can believe that bugs can be

1331
00:44:11,060 --> 00:44:12,889
introduced that have serious security

1332
00:44:12,890 --> 00:44:14,179
consequences.

1333
00:44:14,180 --> 00:44:15,949
So what if we could have a compiler that

1334
00:44:15,950 --> 00:44:17,149
automatically did that work?

1335
00:44:17,150 --> 00:44:18,529
When you choose something from each of

1336
00:44:18,530 --> 00:44:20,749
these bubbles and produces not just the

1337
00:44:20,750 --> 00:44:22,819
vast low level code, but also

1338
00:44:22,820 --> 00:44:24,919
a machine checkable proof that it

1339
00:44:24,920 --> 00:44:26,449
truly implements the whiteboard level

1340
00:44:26,450 --> 00:44:27,450
math.

1341
00:44:27,860 --> 00:44:29,029
So that's what we're after here.

1342
00:44:29,030 --> 00:44:31,579
So we're doing crypto verification.

1343
00:44:31,580 --> 00:44:33,229
One kind of property people often prove

1344
00:44:33,230 --> 00:44:34,610
about cryptography is that

1345
00:44:35,650 --> 00:44:37,189
these high level security properties,

1346
00:44:37,190 --> 00:44:38,959
like if you don't know the secret key,

1347
00:44:38,960 --> 00:44:39,919
then they'll take you more than

1348
00:44:39,920 --> 00:44:41,599
polynomial time to compute something

1349
00:44:42,680 --> 00:44:43,999
through protocol verification.

1350
00:44:44,000 --> 00:44:45,229
We can show that a particular

1351
00:44:45,230 --> 00:44:46,849
mathematical algorithm meets such a

1352
00:44:46,850 --> 00:44:49,369
property and through

1353
00:44:49,370 --> 00:44:51,289
an implementation synthesis process, we

1354
00:44:51,290 --> 00:44:53,389
can then generate low level

1355
00:44:53,390 --> 00:44:54,889
efficient code that follows the

1356
00:44:54,890 --> 00:44:56,269
mathematical algorithm.

1357
00:44:56,270 --> 00:44:57,769
Our project is only in that second

1358
00:44:57,770 --> 00:44:59,059
category. There's been a lot of

1359
00:44:59,060 --> 00:45:00,469
interesting work in the first category

1360
00:45:00,470 --> 00:45:02,539
too, and

1361
00:45:02,540 --> 00:45:04,609
our system has been adopted by Chrome

1362
00:45:04,610 --> 00:45:06,679
through the boring SSL library to

1363
00:45:06,680 --> 00:45:08,749
replace the manually written elliptic

1364
00:45:08,750 --> 00:45:11,479
curve finite field arithmetic

1365
00:45:11,480 --> 00:45:13,429
that they had had previously.

1366
00:45:13,430 --> 00:45:15,019
You can find our implementation of curve

1367
00:45:15,020 --> 00:45:16,699
two five five one nine and Chrome version

1368
00:45:16,700 --> 00:45:18,829
sixty four, which is in beta now, and

1369
00:45:18,830 --> 00:45:19,669
the next version of Chrome.

1370
00:45:19,670 --> 00:45:21,229
After that we'll have our implementation

1371
00:45:21,230 --> 00:45:22,769
of P two fifty six together.

1372
00:45:22,770 --> 00:45:24,349
These are the curves that are used in the

1373
00:45:24,350 --> 00:45:26,539
vast majority of TLS connections.

1374
00:45:29,100 --> 00:45:31,169
So really quick demo of

1375
00:45:31,170 --> 00:45:33,389
running this generation

1376
00:45:33,390 --> 00:45:34,390
process.

1377
00:45:36,640 --> 00:45:38,589
What we've done is we've built a library

1378
00:45:38,590 --> 00:45:40,239
of functional programs that capture the

1379
00:45:40,240 --> 00:45:42,530
generality of arithmetic in this domain.

1380
00:45:43,630 --> 00:45:45,789
So it turns out these numbers are way

1381
00:45:45,790 --> 00:45:47,469
too big to fit in hardware registers.

1382
00:45:47,470 --> 00:45:48,909
We have to be clever about splitting one

1383
00:45:48,910 --> 00:45:50,919
number among many registers and then

1384
00:45:50,920 --> 00:45:52,269
essentially implement grade school

1385
00:45:52,270 --> 00:45:54,129
arithmetic ourselves with some catches

1386
00:45:54,130 --> 00:45:56,559
for performance on popular processors.

1387
00:45:56,560 --> 00:45:57,999
And we do that using standard functional

1388
00:45:58,000 --> 00:46:00,099
programing stuff like folds

1389
00:46:00,100 --> 00:46:02,169
and maps and so on.

1390
00:46:02,170 --> 00:46:03,729
For instance, here's a definition of

1391
00:46:03,730 --> 00:46:06,079
multiplication, flat map and a map.

1392
00:46:06,080 --> 00:46:07,599
I won't go into details on exactly our

1393
00:46:07,600 --> 00:46:08,919
representation. This is kind of a

1394
00:46:08,920 --> 00:46:11,139
simplified version for demo purposes,

1395
00:46:12,250 --> 00:46:15,609
but we get to process all this and

1396
00:46:15,610 --> 00:46:16,779
make our way down to the bottom of this

1397
00:46:16,780 --> 00:46:18,909
file where I'm going to essentially

1398
00:46:18,910 --> 00:46:20,259
state the following goal.

1399
00:46:20,260 --> 00:46:22,719
Assume we have two 10

1400
00:46:22,720 --> 00:46:24,549
tuples of integers.

1401
00:46:24,550 --> 00:46:26,649
Turns out that on 32 bit processors,

1402
00:46:26,650 --> 00:46:28,149
four curved two five five one nine, you

1403
00:46:28,150 --> 00:46:30,609
want to represent each number

1404
00:46:30,610 --> 00:46:32,749
with ten different word

1405
00:46:32,750 --> 00:46:34,509
size digits. So that's why these are ten

1406
00:46:34,510 --> 00:46:36,249
tuples. And we say, please find an

1407
00:46:36,250 --> 00:46:38,229
element of the set tuples that are the

1408
00:46:38,230 --> 00:46:40,299
correct answers to the problem of

1409
00:46:40,300 --> 00:46:42,409
multiplying the two inputs mod

1410
00:46:42,410 --> 00:46:44,320
two to the two fifty five minus 19.

1411
00:46:45,670 --> 00:46:48,009
And I run a few

1412
00:46:48,010 --> 00:46:49,479
steps here that I want to explain in

1413
00:46:49,480 --> 00:46:52,539
detail to start symbolic manipulation.

1414
00:46:52,540 --> 00:46:54,039
And then key step is here.

1415
00:46:54,040 --> 00:46:55,119
This is going to run for about ten

1416
00:46:55,120 --> 00:46:57,669
seconds. I think we are somewhere

1417
00:46:57,670 --> 00:47:00,129
asking Corke to do some correct

1418
00:47:00,130 --> 00:47:02,709
by construction derivation of concrete

1419
00:47:02,710 --> 00:47:04,719
code for computing this multiplication

1420
00:47:04,720 --> 00:47:07,149
result and get a bunch of outputs

1421
00:47:07,150 --> 00:47:07,569
like this.

1422
00:47:07,570 --> 00:47:09,579
I think this is the last digit of the

1423
00:47:09,580 --> 00:47:11,049
answer and some of the arithmetic for

1424
00:47:11,050 --> 00:47:11,979
computing it.

1425
00:47:11,980 --> 00:47:13,839
That's a lot of redundant parts to it.

1426
00:47:13,840 --> 00:47:15,909
So let's use the the

1427
00:47:15,910 --> 00:47:18,009
ring algebraic laws to simplify

1428
00:47:18,010 --> 00:47:20,169
this a bit more and this one takes

1429
00:47:20,170 --> 00:47:21,759
a few tenths of seconds to run.

1430
00:47:21,760 --> 00:47:23,859
But what's happening is the system is

1431
00:47:23,860 --> 00:47:25,579
automatically computing the code that

1432
00:47:25,580 --> 00:47:27,339
traditionally has been written by hand

1433
00:47:27,340 --> 00:47:29,919
for this cryptographic

1434
00:47:29,920 --> 00:47:31,989
primitive. And it's

1435
00:47:31,990 --> 00:47:33,549
doing it in a way that doesn't just build

1436
00:47:33,550 --> 00:47:35,619
the code. It also builds the proof that

1437
00:47:35,620 --> 00:47:37,539
it's related back to the original

1438
00:47:37,540 --> 00:47:39,519
whiteboard level modular arithmetic

1439
00:47:39,520 --> 00:47:40,520
definition.

1440
00:47:41,710 --> 00:47:43,909
And this will finish

1441
00:47:43,910 --> 00:47:46,149
any moment now and it

1442
00:47:46,150 --> 00:47:48,249
will no longer have all these

1443
00:47:48,250 --> 00:47:50,019
constants in it. These will be combined

1444
00:47:50,020 --> 00:47:52,209
into a nicer form and the

1445
00:47:52,210 --> 00:47:53,889
output that'll pop up here is not the end

1446
00:47:53,890 --> 00:47:55,449
of the story. We also have later phases

1447
00:47:55,450 --> 00:47:57,789
that lower it into C like code

1448
00:47:57,790 --> 00:47:59,439
do bounds analysis to make sure we've

1449
00:47:59,440 --> 00:48:01,569
assigned enough bits to each register.

1450
00:48:01,570 --> 00:48:03,399
That all happens automatically given the

1451
00:48:03,400 --> 00:48:04,400
other parameters.

1452
00:48:06,550 --> 00:48:08,679
And here

1453
00:48:08,680 --> 00:48:11,199
we have an example of the output.

1454
00:48:11,200 --> 00:48:13,359
So this might look familiar to people

1455
00:48:13,360 --> 00:48:14,679
who have been implementing curve two five

1456
00:48:14,680 --> 00:48:15,579
five one nine.

1457
00:48:15,580 --> 00:48:17,739
It's still in a relatively high

1458
00:48:17,740 --> 00:48:19,959
level form without breaking things

1459
00:48:19,960 --> 00:48:22,089
down into individual temporary variables.

1460
00:48:22,090 --> 00:48:23,679
But we didn't have to manually figure

1461
00:48:23,680 --> 00:48:24,669
out. For instance, there should be a

1462
00:48:24,670 --> 00:48:26,799
thirty eight here and nineteen

1463
00:48:26,800 --> 00:48:28,149
here and all those things.

1464
00:48:28,150 --> 00:48:30,129
It was done for us automatically and you

1465
00:48:30,130 --> 00:48:31,269
can put in different parameters and

1466
00:48:31,270 --> 00:48:33,459
generate the standard code for any

1467
00:48:33,460 --> 00:48:35,139
of the other curves in any of the

1468
00:48:35,140 --> 00:48:37,089
hardware architectures that folks like to

1469
00:48:37,090 --> 00:48:38,090
use them on.

1470
00:48:39,460 --> 00:48:42,219
So to finish this up, we

1471
00:48:42,220 --> 00:48:44,049
are implementation automatically builds

1472
00:48:44,050 --> 00:48:46,329
that code from a way of writing down the

1473
00:48:46,330 --> 00:48:48,399
modules in a suggestive way like that

1474
00:48:48,400 --> 00:48:48,939
here.

1475
00:48:48,940 --> 00:48:50,289
This is typically how these are written

1476
00:48:50,290 --> 00:48:52,539
as sums and differences of powers of two

1477
00:48:52,540 --> 00:48:54,069
or small multiples of them.

1478
00:48:54,070 --> 00:48:55,539
So we wrote a Python script that uses

1479
00:48:55,540 --> 00:48:57,219
that to generate a few lines of

1480
00:48:57,220 --> 00:48:59,409
configuration to our system.

1481
00:48:59,410 --> 00:49:01,899
And then we scraped

1482
00:49:01,900 --> 00:49:04,209
the elliptic curve mailing

1483
00:49:04,210 --> 00:49:06,039
list that modern crypto cryptography to

1484
00:49:06,040 --> 00:49:07,599
find every prime number that appears in

1485
00:49:07,600 --> 00:49:09,789
the archives. So we use them all as

1486
00:49:09,790 --> 00:49:11,949
inputs for automatically generating code.

1487
00:49:11,950 --> 00:49:14,139
There are only a few weird prime numbers

1488
00:49:14,140 --> 00:49:15,699
posted there that are actually very

1489
00:49:15,700 --> 00:49:16,719
inefficient to implement.

1490
00:49:16,720 --> 00:49:18,699
That was probably a newbie to the list

1491
00:49:18,700 --> 00:49:20,799
who was made a suggestion was quickly

1492
00:49:20,800 --> 00:49:22,569
rebuffed. But we generated those two or

1493
00:49:22,570 --> 00:49:24,009
tried to. Sometimes we ran out of memory

1494
00:49:24,010 --> 00:49:26,169
with those and we

1495
00:49:26,170 --> 00:49:27,849
automatically built code for all these

1496
00:49:27,850 --> 00:49:29,859
cases compared against the Guanyu Multi

1497
00:49:29,860 --> 00:49:31,239
Position Library, which is a very

1498
00:49:31,240 --> 00:49:33,339
standard off the shelf library for

1499
00:49:33,340 --> 00:49:34,899
big integer arithmetic.

1500
00:49:34,900 --> 00:49:36,609
Here are results generated for sixty four

1501
00:49:36,610 --> 00:49:38,769
point eighty six year for 32 bit arm

1502
00:49:38,770 --> 00:49:40,329
running on an Android phone.

1503
00:49:40,330 --> 00:49:42,519
And generally the new multi

1504
00:49:42,520 --> 00:49:45,129
position version of things will take

1505
00:49:45,130 --> 00:49:47,289
two to ten times longer to run one

1506
00:49:47,290 --> 00:49:49,059
of the elliptic curve operations than

1507
00:49:49,060 --> 00:49:50,319
with our code, which was created

1508
00:49:50,320 --> 00:49:52,479
completely, automatically and also has

1509
00:49:52,480 --> 00:49:54,969
strong guarantees of correctness.

1510
00:49:54,970 --> 00:49:57,159
So thanks.

1511
00:50:03,610 --> 00:50:05,169
I'll just leave you with a few pointers

1512
00:50:05,170 --> 00:50:06,849
to places to follow up if you'd like to

1513
00:50:06,850 --> 00:50:09,129
learn more about this subject, No.

1514
00:50:09,130 --> 00:50:11,079
One place to start an online book called

1515
00:50:11,080 --> 00:50:12,429
Software or Foundations, written by

1516
00:50:12,430 --> 00:50:14,319
Benjamin Pierce and others.

1517
00:50:14,320 --> 00:50:16,659
It introduces Calk just assuming

1518
00:50:16,660 --> 00:50:18,369
standard undergraduate computer science

1519
00:50:18,370 --> 00:50:20,109
material, no formal methods or even

1520
00:50:20,110 --> 00:50:22,269
functional programing experience.

1521
00:50:22,270 --> 00:50:23,949
I should also advertise our Multi

1522
00:50:23,950 --> 00:50:25,779
University Initiative in the US called

1523
00:50:25,780 --> 00:50:27,639
The Science of Deep Specification or deep

1524
00:50:27,640 --> 00:50:29,649
spec in particular.

1525
00:50:29,650 --> 00:50:31,149
In Summer Twenty eighteen, we'll be

1526
00:50:31,150 --> 00:50:32,769
running a summer school that introduces

1527
00:50:32,770 --> 00:50:34,029
all this stuff with some hands on

1528
00:50:34,030 --> 00:50:35,919
practice, with hardware verification and

1529
00:50:35,920 --> 00:50:37,929
C program verification and a few other

1530
00:50:37,930 --> 00:50:39,399
topics in Cork.

1531
00:50:39,400 --> 00:50:41,049
It'll be held in Princeton, New Jersey,

1532
00:50:41,050 --> 00:50:43,179
and you can subscribe

1533
00:50:43,180 --> 00:50:44,529
to our mailing list on the website if

1534
00:50:44,530 --> 00:50:46,599
you'd like to get updates. As we pin down

1535
00:50:46,600 --> 00:50:48,609
more of the details, here are the GitHub

1536
00:50:48,610 --> 00:50:50,739
projects for my own projects that I

1537
00:50:50,740 --> 00:50:52,959
mentioned here and two of the

1538
00:50:52,960 --> 00:50:55,269
others that we made code

1539
00:50:55,270 --> 00:50:56,439
contributions to.

1540
00:50:56,440 --> 00:50:57,440
Thank you.

1541
00:51:08,930 --> 00:51:11,239
We have about 10 minutes for a Q and a

1542
00:51:11,240 --> 00:51:12,240
microphone want

1543
00:51:13,460 --> 00:51:14,609
Sengstacke?

1544
00:51:14,610 --> 00:51:15,799
Great talk.

1545
00:51:15,800 --> 00:51:17,779
I enjoyed it really much.

1546
00:51:17,780 --> 00:51:18,889
I have a question.

1547
00:51:18,890 --> 00:51:21,289
Watch what strategy

1548
00:51:21,290 --> 00:51:22,460
you have for

1549
00:51:23,660 --> 00:51:26,029
finding the right series, because

1550
00:51:26,030 --> 00:51:28,309
I think this is the hardest part

1551
00:51:28,310 --> 00:51:29,479
here.

1552
00:51:29,480 --> 00:51:32,059
And you said you make

1553
00:51:32,060 --> 00:51:34,339
reviews for this, and

1554
00:51:34,340 --> 00:51:36,769
I think maybe that's not enough.

1555
00:51:36,770 --> 00:51:38,299
For example, when you have a reverse

1556
00:51:38,300 --> 00:51:40,459
function and the sorting function,

1557
00:51:40,460 --> 00:51:42,709
you can have the theorem that

1558
00:51:42,710 --> 00:51:44,869
for the invariant, that the length is

1559
00:51:44,870 --> 00:51:47,539
always the same when you execute it.

1560
00:51:47,540 --> 00:51:49,729
But maybe that's not enough

1561
00:51:49,730 --> 00:51:52,219
for, um,

1562
00:51:52,220 --> 00:51:53,479
for the series.

1563
00:51:53,480 --> 00:51:54,979
So what are your strategies?

1564
00:51:54,980 --> 00:51:57,319
Because maybe I think a combination

1565
00:51:57,320 --> 00:51:59,509
to have some concrete example

1566
00:51:59,510 --> 00:52:01,669
test that test the functions

1567
00:52:01,670 --> 00:52:04,009
also. Good. So theorems maybe

1568
00:52:04,010 --> 00:52:06,289
not enough what this

1569
00:52:06,290 --> 00:52:07,469
thinks. Yeah.

1570
00:52:07,470 --> 00:52:09,079
So this relates to a number of the

1571
00:52:09,080 --> 00:52:10,819
frequently asked questions, answers that

1572
00:52:10,820 --> 00:52:13,819
I had in the middle of the talk.

1573
00:52:13,820 --> 00:52:15,949
Generally we build

1574
00:52:15,950 --> 00:52:18,019
infrastructure to support applications

1575
00:52:18,020 --> 00:52:20,119
and applications tend to have more

1576
00:52:20,120 --> 00:52:22,249
straightforward specifications in

1577
00:52:22,250 --> 00:52:23,569
many cases than the systems

1578
00:52:23,570 --> 00:52:24,979
infrastructure below.

1579
00:52:24,980 --> 00:52:25,939
We try to set things up.

1580
00:52:25,940 --> 00:52:27,709
So if we got the infrastructure theorem

1581
00:52:27,710 --> 00:52:29,779
wrong, we will not be able to use it to

1582
00:52:29,780 --> 00:52:31,909
prove the application result.

1583
00:52:31,910 --> 00:52:32,910
So in general,

1584
00:52:34,160 --> 00:52:36,289
one technique is when

1585
00:52:36,290 --> 00:52:38,119
you build a verified component, use it in

1586
00:52:38,120 --> 00:52:39,919
as many different contexts as possible

1587
00:52:39,920 --> 00:52:41,149
and don't peek back inside the

1588
00:52:41,150 --> 00:52:42,829
implementation. Only look at the original

1589
00:52:42,830 --> 00:52:43,729
theorem you chose.

1590
00:52:43,730 --> 00:52:45,709
And by forcing yourself to prove all the

1591
00:52:45,710 --> 00:52:47,389
larger systems that that used your

1592
00:52:47,390 --> 00:52:49,489
component, you will get out the bugs

1593
00:52:49,490 --> 00:52:52,009
in the specification of the components

1594
00:52:52,010 --> 00:52:54,259
that you're reusing it.

1595
00:52:54,260 --> 00:52:56,089
And you mentioned it might be useful to

1596
00:52:56,090 --> 00:52:58,069
have a set of test cases as part of a

1597
00:52:58,070 --> 00:53:00,529
specification for a component

1598
00:53:00,530 --> 00:53:03,139
that that could be a

1599
00:53:03,140 --> 00:53:04,279
good strategy also.

1600
00:53:04,280 --> 00:53:06,379
But it's kind of covered by

1601
00:53:06,380 --> 00:53:08,029
reusing your component in many different

1602
00:53:08,030 --> 00:53:09,469
settings where you're naturally, say,

1603
00:53:09,470 --> 00:53:10,879
calling your function with with different

1604
00:53:10,880 --> 00:53:12,949
arguments. And you need that function to

1605
00:53:12,950 --> 00:53:14,539
be correct to get the larger system to be

1606
00:53:14,540 --> 00:53:15,540
correct.

1607
00:53:17,090 --> 00:53:18,669
I go from three.

1608
00:53:18,670 --> 00:53:20,179
Right, thank you very much for the

1609
00:53:20,180 --> 00:53:21,180
excellent talk.

1610
00:53:22,080 --> 00:53:24,199
Um, the

1611
00:53:24,200 --> 00:53:26,299
specifications of many,

1612
00:53:26,300 --> 00:53:28,609
many problems also involve

1613
00:53:28,610 --> 00:53:30,109
time.

1614
00:53:30,110 --> 00:53:32,209
So you have not only to

1615
00:53:32,210 --> 00:53:34,129
have the correct answer, you have to have

1616
00:53:34,130 --> 00:53:36,079
the correct answer before the Tsipi

1617
00:53:36,080 --> 00:53:37,049
session times.

1618
00:53:37,050 --> 00:53:39,199
But the user gets board, the machine

1619
00:53:39,200 --> 00:53:40,459
gets rebooted.

1620
00:53:40,460 --> 00:53:42,649
Your deadline is over.

1621
00:53:42,650 --> 00:53:45,349
How do you write down

1622
00:53:45,350 --> 00:53:47,509
time specifications

1623
00:53:47,510 --> 00:53:49,819
in clock? Like this thing will

1624
00:53:49,820 --> 00:53:51,949
finish after at most this

1625
00:53:51,950 --> 00:53:52,999
many milliseconds.

1626
00:53:53,000 --> 00:53:54,859
Yeah, I think we'll probably hold off and

1627
00:53:54,860 --> 00:53:56,329
trying to prove that users don't get

1628
00:53:56,330 --> 00:53:58,579
bored for now, but maybe someday,

1629
00:53:58,580 --> 00:54:00,649
but for the time out property,

1630
00:54:00,650 --> 00:54:02,779
I, I think so.

1631
00:54:02,780 --> 00:54:03,799
We're working in a very general

1632
00:54:03,800 --> 00:54:05,149
mathematical framework that can

1633
00:54:05,150 --> 00:54:06,709
essentially encode anything that you'd

1634
00:54:06,710 --> 00:54:07,789
find in a math textbook.

1635
00:54:07,790 --> 00:54:10,309
And it's it's pretty easy to define

1636
00:54:10,310 --> 00:54:12,559
what time is from

1637
00:54:12,560 --> 00:54:15,169
the program execution perspective.

1638
00:54:15,170 --> 00:54:16,759
And there have been projects that, for

1639
00:54:16,760 --> 00:54:18,979
instance, prove that a C

1640
00:54:18,980 --> 00:54:21,319
compiler preserves running time between

1641
00:54:21,320 --> 00:54:23,659
the source code and the assembly

1642
00:54:23,660 --> 00:54:26,029
code with a pretty probably a naive model

1643
00:54:26,030 --> 00:54:27,289
at the bottom.

1644
00:54:27,290 --> 00:54:29,779
And I

1645
00:54:29,780 --> 00:54:32,479
don't have any projects of my own in that

1646
00:54:32,480 --> 00:54:34,069
vein so far. But a number of people have

1647
00:54:34,070 --> 00:54:36,079
looked at that. So I think my main answer

1648
00:54:36,080 --> 00:54:37,459
is very flexible framework.

1649
00:54:37,460 --> 00:54:38,749
You can encode whatever you want, you can

1650
00:54:38,750 --> 00:54:39,859
prove whatever you want. It's just a

1651
00:54:39,860 --> 00:54:41,869
question of how much labor is going to be

1652
00:54:41,870 --> 00:54:42,529
required.

1653
00:54:42,530 --> 00:54:43,909
And we'll learn more about better

1654
00:54:43,910 --> 00:54:45,499
abstractions that will reduce that labor

1655
00:54:45,500 --> 00:54:46,639
over time.

1656
00:54:46,640 --> 00:54:47,779
Thank you.

1657
00:54:47,780 --> 00:54:48,989
We have a question from the signal

1658
00:54:48,990 --> 00:54:51,259
engine. Yeah, it's maybe somewhat related

1659
00:54:51,260 --> 00:54:53,359
to the last one so I can prove

1660
00:54:53,360 --> 00:54:55,249
self help. Again, such analytics, for

1661
00:54:55,250 --> 00:54:56,749
example, by proving a certain function

1662
00:54:56,750 --> 00:54:58,369
runs in constant time because the main

1663
00:54:58,370 --> 00:55:00,169
point of manuscript implementation is to

1664
00:55:00,170 --> 00:55:02,239
avoid such analytics

1665
00:55:02,240 --> 00:55:03,949
and do the automatically created

1666
00:55:03,950 --> 00:55:05,449
implementations that you presented to

1667
00:55:05,450 --> 00:55:07,009
address this issue.

1668
00:55:07,010 --> 00:55:08,719
So the automatically created

1669
00:55:08,720 --> 00:55:10,429
implementations I presented are actually

1670
00:55:10,430 --> 00:55:12,379
in a very restrictive language that

1671
00:55:12,380 --> 00:55:14,599
enforces Konstantine by construction.

1672
00:55:14,600 --> 00:55:15,919
It's all straight line code.

1673
00:55:15,920 --> 00:55:17,749
Think of it as the performance critical

1674
00:55:17,750 --> 00:55:20,179
inner loops of the cryptographic library.

1675
00:55:20,180 --> 00:55:22,849
And so yes, that one does

1676
00:55:22,850 --> 00:55:25,039
by construction provide

1677
00:55:25,040 --> 00:55:27,049
constant time. Other formal methods

1678
00:55:27,050 --> 00:55:29,089
projects have done automatic program

1679
00:55:29,090 --> 00:55:31,219
analysis to guarantee Konstantine

1680
00:55:31,220 --> 00:55:32,149
properties.

1681
00:55:32,150 --> 00:55:34,399
I think this is an important kind

1682
00:55:34,400 --> 00:55:35,539
of analysis to do.

1683
00:55:35,540 --> 00:55:37,879
It's among the easier ones for

1684
00:55:37,880 --> 00:55:39,619
for formal methods, which is great given

1685
00:55:39,620 --> 00:55:41,299
the big payoff in security.

1686
00:55:41,300 --> 00:55:43,519
So I'm optimistic about scaling

1687
00:55:43,520 --> 00:55:45,590
up to whatever settings it's needed in

1688
00:55:47,520 --> 00:55:48,520
microphone for.

1689
00:55:49,400 --> 00:55:52,069
Yeah, so about scaling up.

1690
00:55:52,070 --> 00:55:54,919
To what extent is the user

1691
00:55:54,920 --> 00:55:56,119
to what extent do you look at human

1692
00:55:56,120 --> 00:55:57,289
factors?

1693
00:55:57,290 --> 00:55:59,149
Because in many cases that's how they

1694
00:55:59,150 --> 00:56:00,889
break. Well, it depends.

1695
00:56:00,890 --> 00:56:02,149
If you're talking about the humans who

1696
00:56:02,150 --> 00:56:04,489
are writing the proofs, then that's us.

1697
00:56:04,490 --> 00:56:06,559
We're very concerned about us and we make

1698
00:56:06,560 --> 00:56:08,299
an effort to help ourselves by improving

1699
00:56:08,300 --> 00:56:09,259
tools.

1700
00:56:09,260 --> 00:56:11,419
Um, do you mean

1701
00:56:11,420 --> 00:56:13,279
the users of the software or hardware

1702
00:56:13,280 --> 00:56:14,899
that we're creating and some way that

1703
00:56:14,900 --> 00:56:16,309
that might be worse off because of this

1704
00:56:16,310 --> 00:56:17,689
methodology?

1705
00:56:17,690 --> 00:56:19,819
Yeah. So to what extent so

1706
00:56:19,820 --> 00:56:21,409
when you're talking about like full

1707
00:56:21,410 --> 00:56:23,839
system specifications, it's

1708
00:56:23,840 --> 00:56:25,759
really a question of what do you include

1709
00:56:25,760 --> 00:56:28,039
in your system, as I see.

1710
00:56:28,040 --> 00:56:29,569
Oh, you're asking, do we formally model

1711
00:56:29,570 --> 00:56:31,220
the user in the theorems?

1712
00:56:32,810 --> 00:56:34,999
So we'll do things like the

1713
00:56:35,000 --> 00:56:37,219
top level theorem of the hardware

1714
00:56:37,220 --> 00:56:38,899
models, that it has an input channel and

1715
00:56:38,900 --> 00:56:39,799
output channel.

1716
00:56:39,800 --> 00:56:41,629
We have no idea what's on the other side,

1717
00:56:41,630 --> 00:56:43,189
but our specification says if you get

1718
00:56:43,190 --> 00:56:45,049
this output, sorry, get this input, you

1719
00:56:45,050 --> 00:56:47,149
need to give that output, that sort of

1720
00:56:47,150 --> 00:56:49,189
thing. I don't know if anyone who's been

1721
00:56:49,190 --> 00:56:50,989
doing anything like cognitively modeling

1722
00:56:50,990 --> 00:56:52,639
users and what you should expect from

1723
00:56:52,640 --> 00:56:54,889
them in this kind of setting, that

1724
00:56:54,890 --> 00:56:57,019
that could be worthwhile, though

1725
00:56:57,020 --> 00:56:58,759
I expect there'd be a lot of disagreement

1726
00:56:58,760 --> 00:57:00,049
about exactly how to write the theorem

1727
00:57:00,050 --> 00:57:01,050
down.

1728
00:57:01,970 --> 00:57:03,769
Thanks, Michael, for one.

1729
00:57:05,030 --> 00:57:06,619
So I'd just like to start by saying that

1730
00:57:06,620 --> 00:57:07,879
I already agree with everything you said

1731
00:57:07,880 --> 00:57:09,109
here, so you don't need to accept your

1732
00:57:09,110 --> 00:57:10,099
soul mate on any of it.

1733
00:57:10,100 --> 00:57:12,239
But I basically believe

1734
00:57:12,240 --> 00:57:13,939
that the average industrial programmers,

1735
00:57:13,940 --> 00:57:15,230
the Dunning, Kruger Posterchild

1736
00:57:17,120 --> 00:57:19,369
and so anyway, Tony Hawk said

1737
00:57:19,370 --> 00:57:21,139
that no references were a billion dollar

1738
00:57:21,140 --> 00:57:23,359
mistake. And he has the following quote,

1739
00:57:23,360 --> 00:57:25,549
which is, Ten years ago, researchers

1740
00:57:25,550 --> 00:57:27,049
into formal methods predicted that the

1741
00:57:27,050 --> 00:57:28,489
programing world would embrace with

1742
00:57:28,490 --> 00:57:30,049
gratitude every assistance promised by

1743
00:57:30,050 --> 00:57:31,549
Formalization to solve the problems of

1744
00:57:31,550 --> 00:57:33,679
reliability that arise when programs get

1745
00:57:33,680 --> 00:57:36,859
larger and more safety critical, then

1746
00:57:36,860 --> 00:57:39,019
more stuff and concludes in, which

1747
00:57:39,020 --> 00:57:41,239
is like the most dismal, sad

1748
00:57:41,240 --> 00:57:42,469
thing I've ever heard.

1749
00:57:42,470 --> 00:57:44,029
It has turned out that the world just

1750
00:57:44,030 --> 00:57:45,589
does not suffer significantly from the

1751
00:57:45,590 --> 00:57:47,119
kind of problem that our research was

1752
00:57:47,120 --> 00:57:48,309
originally intended to solve.

1753
00:57:49,670 --> 00:57:51,769
OK, Enlight, why do

1754
00:57:51,770 --> 00:57:53,809
you believe that programmers will accept

1755
00:57:53,810 --> 00:57:55,969
this now when they have for thirty

1756
00:57:55,970 --> 00:57:58,279
years developed so many confused

1757
00:57:58,280 --> 00:58:01,279
epicycles on the ideas of research that

1758
00:58:01,280 --> 00:58:02,280
do work?

1759
00:58:02,930 --> 00:58:04,639
In my experience, at least for some

1760
00:58:04,640 --> 00:58:06,529
isolated communities of developers like

1761
00:58:06,530 --> 00:58:08,689
the authors of the main crypto libraries,

1762
00:58:08,690 --> 00:58:11,329
they do worry about correctness and

1763
00:58:11,330 --> 00:58:13,719
they're very interested in exploring

1764
00:58:13,720 --> 00:58:15,539
these kinds of techniques.

1765
00:58:15,540 --> 00:58:17,729
So that's one answer the let's say the

1766
00:58:17,730 --> 00:58:19,979
very high assurance domain where people

1767
00:58:19,980 --> 00:58:21,899
are especially interested, especially

1768
00:58:21,900 --> 00:58:23,999
willing to accept certain costs for

1769
00:58:24,000 --> 00:58:25,389
correctness.

1770
00:58:25,390 --> 00:58:27,149
The other answer I'd give is that I think

1771
00:58:27,150 --> 00:58:29,129
we're progressively improving the

1772
00:58:29,130 --> 00:58:30,989
usability and

1773
00:58:32,790 --> 00:58:35,009
creating many component

1774
00:58:35,010 --> 00:58:36,809
libraries that can do most of the work

1775
00:58:36,810 --> 00:58:38,159
for you for each project so you don't

1776
00:58:38,160 --> 00:58:39,419
have to start from scratch.

1777
00:58:39,420 --> 00:58:41,699
And I think we're heading towards passing

1778
00:58:41,700 --> 00:58:44,369
some critical point where the ecosystem

1779
00:58:44,370 --> 00:58:47,069
is big enough that there's a

1780
00:58:47,070 --> 00:58:48,389
fundamental change in the cost of

1781
00:58:48,390 --> 00:58:49,499
applying these tools.

1782
00:58:49,500 --> 00:58:51,209
And if we get there, then I think that'll

1783
00:58:51,210 --> 00:58:52,589
make the difference for the kind of

1784
00:58:52,590 --> 00:58:55,439
barrier that Paul was talking about.

1785
00:58:55,440 --> 00:58:56,339
I agree with that.

1786
00:58:56,340 --> 00:58:58,409
But I also think, like my example

1787
00:58:58,410 --> 00:59:00,119
was no references.

1788
00:59:00,120 --> 00:59:02,019
And all you need in order to eliminate

1789
00:59:02,020 --> 00:59:03,239
noise from a language are

1790
00:59:04,350 --> 00:59:07,019
parametric polymorphism and some types.

1791
00:59:07,020 --> 00:59:09,059
And people still say, oh, you know, Neal,

1792
00:59:09,060 --> 00:59:10,169
it's not a problem.

1793
00:59:10,170 --> 00:59:11,399
It's all good.

1794
00:59:11,400 --> 00:59:12,749
I don't make those kind of mistakes.

1795
00:59:14,430 --> 00:59:16,979
Well, I guess we can discuss this later.

1796
00:59:16,980 --> 00:59:19,529
There's a question from the Internet, um,

1797
00:59:19,530 --> 00:59:21,179
how does the halting problem limit what

1798
00:59:21,180 --> 00:59:23,099
you can improve? And is that a problem in

1799
00:59:23,100 --> 00:59:25,559
real life? Applications of

1800
00:59:25,560 --> 00:59:27,359
the holding problem applies to arbitrary

1801
00:59:27,360 --> 00:59:28,889
pieces of code that get thrown down in

1802
00:59:28,890 --> 00:59:30,149
front of you and you get asked questions

1803
00:59:30,150 --> 00:59:32,309
about them. Luckily, the pieces of code

1804
00:59:32,310 --> 00:59:34,259
we care about were written by people who

1805
00:59:34,260 --> 00:59:36,359
had understandings of them in mind.

1806
00:59:36,360 --> 00:59:38,609
And I personally believe

1807
00:59:38,610 --> 00:59:40,289
that the author of the code should be

1808
00:59:40,290 --> 00:59:42,449
responsible for distributing proof with

1809
00:59:42,450 --> 00:59:44,669
it, and it is then decided

1810
00:59:44,670 --> 00:59:46,379
to check if the proof is convincing,

1811
00:59:46,380 --> 00:59:48,089
which is the real problem that that

1812
00:59:48,090 --> 00:59:49,090
matters for this domain.

1813
00:59:50,160 --> 00:59:51,160
Microphone three,

1814
00:59:53,340 --> 00:59:55,829
the baby examples you get for us

1815
00:59:55,830 --> 00:59:57,899
with proofs. The proofs

1816
00:59:57,900 --> 01:00:01,109
to me looked still very verbose

1817
01:00:01,110 --> 01:00:03,389
because this exam was

1818
01:00:03,390 --> 01:00:05,489
very simple. So it's the

1819
01:00:05,490 --> 01:00:08,129
more optimization possible and

1820
01:00:08,130 --> 01:00:09,269
people are doing that.

1821
01:00:10,920 --> 01:00:13,229
There are a wide variety of approaches

1822
01:00:13,230 --> 01:00:14,669
to proof automation.

1823
01:00:14,670 --> 01:00:16,289
I think of those proofs is pretty short.

1824
01:00:16,290 --> 01:00:17,759
They're definitely significantly shorter

1825
01:00:17,760 --> 01:00:19,289
than the programs that they're applying

1826
01:00:19,290 --> 01:00:20,429
to. It was maybe

1827
01:00:21,600 --> 01:00:24,059
one fifth the size of the program, which

1828
01:00:24,060 --> 01:00:26,129
if you look at these kind

1829
01:00:26,130 --> 01:00:28,319
of proofs 20 years ago, that's a big

1830
01:00:28,320 --> 01:00:29,639
improvement already. And we can

1831
01:00:29,640 --> 01:00:31,709
extrapolate from that and hope for even

1832
01:00:31,710 --> 01:00:32,969
shorter proofs in the future.

1833
01:00:34,530 --> 01:00:35,849
Last question. Microphone three,

1834
01:00:37,320 --> 01:00:39,089
you said that in the semantics of Blue

1835
01:00:39,090 --> 01:00:41,219
Spack methods,

1836
01:00:41,220 --> 01:00:43,319
executed atomically.

1837
01:00:43,320 --> 01:00:44,999
So what actually happens if there's a

1838
01:00:45,000 --> 01:00:47,099
failure during

1839
01:00:47,100 --> 01:00:48,539
a method?

1840
01:00:48,540 --> 01:00:50,889
Well, so this is in hardware land

1841
01:00:50,890 --> 01:00:52,499
where we're within a method.

1842
01:00:52,500 --> 01:00:54,689
Everything's running just as propagation

1843
01:00:54,690 --> 01:00:56,729
of signals over wires.

1844
01:00:56,730 --> 01:00:58,889
So you have to plan ahead and

1845
01:00:58,890 --> 01:01:00,959
detect the failures with guard conditions

1846
01:01:00,960 --> 01:01:02,489
at the beginning. And then you just don't

1847
01:01:03,690 --> 01:01:05,429
look at what's flowing down that part of

1848
01:01:05,430 --> 01:01:06,430
the circuit

1849
01:01:07,560 --> 01:01:09,009
right now.

1850
01:01:09,010 --> 01:01:11,609
That's it. And sorry if it

1851
01:01:11,610 --> 01:01:12,599
could be in 15 minutes.

1852
01:01:12,600 --> 01:01:13,600
Thanks. Thanks.