1 00:00:00,160 --> 00:00:13,499 *33c3 opening theme music* 2 00:00:13,499 --> 00:00:20,460 Herald: I'm excited to be here, I guess you are too. We will get started with our 3 00:00:20,460 --> 00:00:26,670 first talker for the day. He is a security researcher at SBA Research, and he's also 4 00:00:26,670 --> 00:00:32,668 a member of CCC Vienna. The talk we'll be hearing today is "Everything you always 5 00:00:32,668 --> 00:00:37,250 wanted to know about Certificate Transparency" and with that, I will pass 6 00:00:37,250 --> 00:00:41,848 on the stage, please give a warm welcome to Martin Schmiedecker! 7 00:00:41,909 --> 00:00:46,361 *applause* 8 00:00:48,740 --> 00:00:53,720 Martin: Thank you very much for these kind words and this very nice introduction. 9 00:00:54,071 --> 00:00:58,820 As Ari said, I'm a member of CCC Vienna, I'm also on twitter, so if you have a 10 00:00:58,820 --> 00:01:02,730 comment afterwards, or want to ping me, if you find a typo in the slides, or 11 00:01:02,730 --> 00:01:05,220 whatever, just ping me on twitter. 12 00:01:05,220 --> 00:01:08,720 So, what is this talk about? What are we going 13 00:01:08,720 --> 00:01:13,010 to talk about? Certificate Transparency is kind of a new thing in the TLS 14 00:01:13,010 --> 00:01:19,680 ecosystem so not many people are familiar that it is here. So I will present the 15 00:01:19,680 --> 00:01:24,910 overview, what is CT and what it does and will also peek under the hood and see what 16 00:01:24,910 --> 00:01:32,060 it actually does, how it works, and how you can play with it. So one of the things 17 00:01:32,060 --> 00:01:38,150 I have to say about myself: I'm a keen fan of Internet memes. So even though these 18 00:01:38,150 --> 00:01:44,690 are hilarious pictures. Personally I find hilarious pictures that I put online. Keep 19 00:01:44,690 --> 00:01:48,700 in mind that HTTPS is a serious topic. 
Whether you do net banking, you're 20 00:01:48,700 --> 00:01:53,670 googling, or whatever you do online, HTTPS is there to protect your privacy and to 21 00:01:53,670 --> 00:01:59,690 protect your security. And in some states, this has been shown by history, this is 22 00:01:59,690 --> 00:02:05,350 not the case, so there are nation-wide introspecting devices which break open the 23 00:02:05,350 --> 00:02:11,400 TLS encryption and look at the content. And people will get a visit from secret 24 00:02:11,400 --> 00:02:16,010 police or anything and they will knock on their door and arrest them. Just like this 25 00:02:16,010 --> 00:02:21,650 week happened in Turkey, where people got arrested for posting things on Facebook. 26 00:02:21,650 --> 00:02:25,720 So even though there are some funny pictures in there keep in mind that this 27 00:02:25,720 --> 00:02:34,030 is just a means to an end for my presentation. I personally find HTTPS is a 28 00:02:34,030 --> 00:02:39,270 very important topic. I hope I can convince you, too. And CT in particular is 29 00:02:39,270 --> 00:02:46,900 fascinating. Why is there something like Certificate Transparency? The name says it 30 00:02:46,900 --> 00:02:52,650 all: if you are a certification authority, you want to make public the certificates 31 00:02:52,650 --> 00:02:59,860 you sell or you issue. As with many good stories and many good tools it all started 32 00:02:59,860 --> 00:03:06,150 with a hack. Back in 2011 there was this Dutch certification authority called 33 00:03:06,150 --> 00:03:10,850 DigiNotar, and they got pwned. They got really, really badly fisted. 34 00:03:10,850 --> 00:03:11,850 *laughter* 35 00:03:11,850 --> 00:03:17,680 They lost everything. They lost all their crown jewels. And as part of this hack, 36 00:03:17,680 --> 00:03:23,650 there were 500-something fraudulent certificates issued. 
And not just any 37 00:03:23,650 --> 00:03:27,370 certificates, not just like Let's Encrypt, where you can get a free certificate, and 38 00:03:27,370 --> 00:03:32,350 then use it for your internal systems, or for your web site, or whatever. No, 39 00:03:32,350 --> 00:03:38,870 really, really high value domains and high value certificates. Like google.com, very 40 00:03:38,870 --> 00:03:43,290 privacy-invasive, if you can read what people are googling, or what they are 41 00:03:43,290 --> 00:03:48,360 sending in their emails. windowsupdate.com, which is like the back 42 00:03:48,360 --> 00:03:56,069 door to some of the windows world. mozilla.com, the attacker could manipulate 43 00:03:56,069 --> 00:04:03,140 the Firefox download, sign it with the certificate and ship it over a 44 00:04:03,140 --> 00:04:11,050 secure-seeming website. torproject, and so forth. This was back in 2011 and this was 45 00:04:11,050 --> 00:04:19,180 not just a small incident: it wasn't a small CA, but a regular CA with regular 46 00:04:19,180 --> 00:04:24,960 business. What's more about this hack: these certificates have then been 47 00:04:24,960 --> 00:04:29,690 used to intercept communication of clients. People browsing the web, reading 48 00:04:29,690 --> 00:04:34,850 their email. The company which investigated the breach afterwards found 49 00:04:34,850 --> 00:04:42,240 out that at least 300,000 IP addresses were connecting to google.com and were 50 00:04:42,240 --> 00:04:50,400 seeing this fraudulent cert. 99% of which were from Iran. So it was kind of a 51 00:04:50,400 --> 00:04:56,570 nation state attack against clients of either ISP based or border gateway based 52 00:04:56,570 --> 00:05:04,070 where people were thinking they were browsing secured by HTTPS but they were 53 00:05:04,070 --> 00:05:12,220 actually not. This is a wonderful frame from the video. 
The guys from Fox IT which 54 00:05:12,220 --> 00:05:19,949 investigated this breach they used the OCSP requests. Every time you get a 55 00:05:19,949 --> 00:05:23,450 certificate your browser has to somehow figure out whether or not this certificate 56 00:05:23,450 --> 00:05:30,880 is still valid. If it has been revoked, it would be nice to not use it anymore. And 57 00:05:30,880 --> 00:05:38,060 one of the approaches which is used is so called OCSP, so the client asks the 58 00:05:38,060 --> 00:05:45,870 certificate authority: "hey is this still valid?" And this has been logged. Each of 59 00:05:45,870 --> 00:05:53,360 these requests is one of the clients seeing this fraudulent certificate and 60 00:05:53,360 --> 00:05:59,790 asking DigiNotar: "Hey, is this cert still valid?" And as you can see, most of the 61 00:05:59,790 --> 00:06:03,580 connections - it's actually a movie, so you can see the lights flickering and 62 00:06:03,580 --> 00:06:08,699 popping up and down as people go to sleep and wake up again. And most of the 63 00:06:08,699 --> 00:06:15,860 people were from Iran. So how did DigiNotar got hacked? They got really, 64 00:06:15,860 --> 00:06:21,229 really, badly hacked because they had vulnerabilities everywhere. They had a 65 00:06:21,229 --> 00:06:27,400 system running which was incomprehensibly insecure for a certification authority. 66 00:06:27,400 --> 00:06:31,900 People think that if you run a certification authority you build the 67 00:06:31,900 --> 00:06:37,449 foundation for secure communication online. You are the one securing Internet 68 00:06:37,449 --> 00:06:42,690 communication. And if you run such an entity, people think you know security. 69 00:06:42,690 --> 00:06:43,960 Actually, 70 00:06:43,960 --> 00:06:45,600 *laughter* 71 00:06:45,600 --> 00:06:52,100 actually, DigiNotar did not. They had unpatched software, which was facing the Internet. 72 00:06:52,100 --> 00:06:55,990 Might happen. 
They didn't have anti-virus on the machines that issued the 73 00:06:55,990 --> 00:07:01,860 certificates. They didn't have a strong password for their admin account. So like 74 00:07:01,860 --> 00:07:05,040 "password" or "admin". Actually, you can read the report online, and the 75 00:07:05,040 --> 00:07:11,600 recommendations from ENISA, the European security body, they listed all the things 76 00:07:11,600 --> 00:07:18,700 that have been found and identified. Also, all the certificate-issuing servers were 77 00:07:18,700 --> 00:07:27,040 in one Windows domain. Also kind of bad from DigiNotar: they kept the incident 78 00:07:27,040 --> 00:07:31,690 secret. Of course, they did not want to spread out onto the Internet "hey, we got 79 00:07:31,690 --> 00:07:37,760 hacked, and we have had bad security". They kept this incident hidden 80 00:07:37,760 --> 00:07:39,900 for more than 2 months. 81 00:07:39,900 --> 00:07:45,380 After 2 months, when it got public, and when the Internet found out, 82 00:07:45,380 --> 00:07:49,820 that actually something really, really bad had happened, they found out, and 83 00:07:49,820 --> 00:07:59,640 DigiNotar then went bankrupt. That's the sad ending of the story. But this is just one 84 00:07:59,640 --> 00:08:05,620 of the problems that certification authorities face. If you run a 85 00:08:05,620 --> 00:08:10,860 certification authority, you issue certificates based on the identity of your 86 00:08:10,860 --> 00:08:17,310 customers. You can create sub-root CAs, so you can say "Hey, Martin, he looks like a 87 00:08:17,310 --> 00:08:22,960 nice guy, he looks like he knows security, let's make him a CA and make him verify 88 00:08:22,960 --> 00:08:31,710 identities." Probably not a good idea, but this is what the business model of HTTPS 89 00:08:31,710 --> 00:08:36,599 and certification authorities is. They issue certificates and they grant the 90 00:08:36,599 --> 00:08:45,470 permission to issue certificates as well. 
And the entire goal of these companies is 91 00:08:45,470 --> 00:08:50,910 to get into the trust stores. Every browser, every operating system, everything 92 00:08:50,910 --> 00:08:56,879 that connects over TLS has something called a trust store, where it stores 93 00:08:56,879 --> 00:09:02,499 the entities that are entitled to issue certificates. And the problem is, those 94 00:09:02,499 --> 00:09:07,199 CAs are not strictly audited. They have their requirements that they have to 95 00:09:07,199 --> 00:09:13,369 fulfil. They have to show that they have some kind of security. But afterwards, 96 00:09:13,369 --> 00:09:17,709 once they're certified, and once they're in the trust stores, there is not such a 97 00:09:17,709 --> 00:09:23,130 strong incentive to audit them, because they are already in the trust stores, and 98 00:09:23,130 --> 00:09:31,269 they've had their audits, and so forth. This can lead to many problems. Another 99 00:09:31,269 --> 00:09:38,959 CA, Trustwave, in 2011, it issued sub-CA certificates. Anyone with a sub-CA 100 00:09:38,959 --> 00:09:46,199 certificate can issue a TLS certificate for any domain. They used it for traffic 101 00:09:46,199 --> 00:09:50,249 introspection. So they were selling, I don't know, to a company, which was 102 00:09:50,249 --> 00:09:55,670 building appliances which can break open the network connections for banks, 103 00:09:55,670 --> 00:10:05,170 companies, or entire ISPs. They can look into the traffic of its users. Also, 104 00:10:05,170 --> 00:10:11,749 there was Lenovo SuperFish, wonderful idea. SuperFish was a local 105 00:10:11,749 --> 00:10:17,070 man-in-the-middle CA, and the goal of the SuperFish CA was to break open HTTPS 106 00:10:17,070 --> 00:10:20,510 traffic, so that they can inject ads. 
107 00:10:20,510 --> 00:10:22,040 *laughter* 108 00:10:22,040 --> 00:10:27,239 Even though you're using gmail and you have this nice, slick interface without 109 00:10:27,239 --> 00:10:34,160 obvious ads, SuperFish would break open this connection, would be trusted by the 110 00:10:34,160 --> 00:10:44,199 browser, and would have huge overlay ads. Lenovo stopped cooperating with SuperFish. 111 00:10:44,199 --> 00:10:51,889 This was preinstalled on Lenovo notebooks. They had a local CA installed on the 112 00:10:51,889 --> 00:10:57,720 system so they could inspect the traffic and show ads to users. What's even more 113 00:10:57,720 --> 00:11:03,470 interesting is that all these CAs had the same key, and the private key was in RAM. 114 00:11:03,470 --> 00:11:12,889 So anybody could extract the private key of the CA, use it to sign certificates for 115 00:11:12,889 --> 00:11:19,660 anything, and have an additional layer of HTTPS injection, where you could not only 116 00:11:19,660 --> 00:11:27,160 show ads, but also read the emails or do whatever you want. Very bad. They're not doing it 117 00:11:27,160 --> 00:11:34,709 allegedly anymore. Then there was, in China, the CNNIC, they issued a sub-CA for 118 00:11:34,709 --> 00:11:38,649 an introspection company. Again the company wanted to sell appliances where 119 00:11:38,649 --> 00:11:46,209 they could break open HTTPS connections and look into the traffic of the users. 120 00:11:46,209 --> 00:11:51,220 And there was another incident just this year: Symantec was issuing "test" 121 00:11:51,220 --> 00:11:57,399 certificates to a company or whatever, among them google.com, opera.com, things 122 00:11:57,399 --> 00:12:04,230 that you probably not would like to test, and got caught. And the nice thing about 123 00:12:04,230 --> 00:12:08,709 this incident is: they already had Certificate Transparency installed. And we 124 00:12:08,709 --> 00:12:15,490 will come back to this incident in a minute. 
Traffic introspection is a valid 125 00:12:15,490 --> 00:12:21,839 thing. If you have a fleet of planes, and they are connected via expensive satellite 126 00:12:21,839 --> 00:12:26,739 connections and you really pay a lot for bandwidth you would like to block, for 127 00:12:26,739 --> 00:12:33,259 example, Netflix, or anything which causes a lot of traffic. One of the approaches 128 00:12:33,259 --> 00:12:40,309 which was taken by Gogo, they had traffic introspection devices in their planes and 129 00:12:40,309 --> 00:12:48,899 they issued not-trusted certificates to inspect the traffic. Bad for them: 130 00:12:48,899 --> 00:12:54,829 Adrienne Porter Felt who works for Google noticed this and Gogo is not doing this 131 00:12:54,829 --> 00:13:02,200 anymore. And even though traffic introspection sounds like a really bad 132 00:13:02,200 --> 00:13:07,910 thing, I can think of use cases where this is legit. If you run a company, if you run 133 00:13:07,910 --> 00:13:15,039 a bank, and you want to prevent people from leaking data, this can be OK. But it 134 00:13:15,039 --> 00:13:18,120 has to be transparent, people have to know that this is happening, that they're 135 00:13:18,120 --> 00:13:22,660 inspecting everything. And still won't prevent people from carrying out the USB 136 00:13:22,660 --> 00:13:29,929 thumb drive with all the data on it. So this is the big picture why we need 137 00:13:29,929 --> 00:13:34,899 Certificate Transparency. We would like to see which certificates have been issued by 138 00:13:34,899 --> 00:13:42,889 a specific CA. Some minor issues, not really minor, that additionally come to 139 00:13:42,889 --> 00:13:49,189 play are that TLS has it's issues nonetheless whether these certificates are 140 00:13:49,189 --> 00:13:54,790 issued or not. One of them is certificate revocation is tricky. It's not as easy as 141 00:13:54,790 --> 00:14:01,109 just saying "this certificate is not valid anymore". 
Once a certificate is issued, it 142 00:14:01,109 --> 00:14:08,040 is valid until the date shown in the certificate, which can be three years. 143 00:14:08,040 --> 00:14:12,230 Happens to be, if on the first day of using this certificate, people notice, 144 00:14:12,230 --> 00:14:17,999 "uh, we should revoke it", clients that don't get this update will be able to use 145 00:14:17,999 --> 00:14:28,019 this certificate for two and more years. Also, another limitation is that all CAs 146 00:14:28,019 --> 00:14:35,149 can issue certificates for all websites. Any of those 1,800 CAs and sub-CAs which 147 00:14:35,149 --> 00:14:41,750 were in trust stores in 2013 they can all issue a certificate for google.com or 148 00:14:41,750 --> 00:14:46,620 facebook.com. This is not prevented by any means but social means and contracts, 149 00:14:46,620 --> 00:14:54,640 which state that they have to check the legitimacy of the request. This was 150 00:14:54,640 --> 00:15:02,869 published in a paper in 2013. There are more than 1,800 CAs which can sign 151 00:15:02,869 --> 00:15:10,379 certificates for any domain in regular user devices. Another paper in 2014 found 152 00:15:10,379 --> 00:15:16,089 out that one third of them, one third of those 1,800 certification authorities, 153 00:15:16,089 --> 00:15:21,100 never issued a single HTTPS certificate. This makes you wonder: why are they then 154 00:15:21,100 --> 00:15:26,759 in the trust stores and so forth. You can claim a certain percentage of them they 155 00:15:26,759 --> 00:15:34,499 are used for issuing private certificates within networks. Still, one third of them 156 00:15:34,499 --> 00:15:44,220 never issued a publicly obtainable HTTPS certificate. Then of course there the 157 00:15:44,220 --> 00:15:49,109 implementation issues. TLS has a long history of implementation flaws. Not just 158 00:15:49,109 --> 00:15:54,109 cryptographic, there's logjam, freak, poodle, whatever. 
They are a completely 159 00:15:54,109 --> 00:16:01,799 separate issue. But the implementation issues are troubling the device security 160 00:16:01,799 --> 00:16:06,820 at a constant pace. Famous example is: "goto fail;" from iOS, where they had an 161 00:16:06,820 --> 00:16:12,660 additional "goto fail" missing bracket and the certificate validity wasn't checked. 162 00:16:12,660 --> 00:16:19,629 Also, we have a lot of embedded devices. Once they're powered up, they're used to 163 00:16:19,629 --> 00:16:25,369 generate their private key, and they have no access to good entropy. Entropy on 164 00:16:25,369 --> 00:16:33,010 embedded devices is surprisingly hard. So a lot of them generate the same keys. And 165 00:16:33,010 --> 00:16:37,399 as already mentioned, we have different trust stores per browser, per operating 166 00:16:37,399 --> 00:16:41,910 system. Everyone has a different trust base. Also of course, every CA tries to 167 00:16:41,910 --> 00:16:47,379 get access into all of the trust stores, get shipped with system updates to be 168 00:16:47,379 --> 00:16:54,670 trusted, and we have a diversity which is not natural. Could be much easier if 169 00:16:54,670 --> 00:17:01,490 people would have the same trust base on all their devices. And there are plenty of 170 00:17:01,490 --> 00:17:07,609 deployment issues. SSLv2: everybody thinks it dead, but apparently, it's not. 171 00:17:07,609 --> 00:17:12,099 Sebastian Schinzel will give a splendid presentation two hours from now about the 172 00:17:12,099 --> 00:17:19,129 DROWN attack. The DROWN attack uses SSLv2 weaknesses in email transport. Simply 173 00:17:19,129 --> 00:17:26,720 because it's activated, and it uses the same key, you can attack top-notch TLS 1.2 174 00:17:26,720 --> 00:17:32,850 encryption, because this is still here. There's the whole shmafoo of the SHA1 175 00:17:32,850 --> 00:17:37,780 certificates. 
Certification authorities are not supposed to issue any SHA1 176 00:17:37,780 --> 00:17:41,760 certificates anymore. Some do, some get caught, because they back-dated their 177 00:17:41,760 --> 00:17:47,380 certificates, and so forth. It's a mess. Then there's cipher suites. There are more 178 00:17:47,380 --> 00:17:54,610 than 500 cipher suites available for the different versions of TLS. Every admin 179 00:17:54,610 --> 00:18:00,060 would like to be [as] secure as possible but which should he choose? As soon as 180 00:18:00,060 --> 00:18:04,910 there is money involved, like Amazon, they need to be compatible with Internet 181 00:18:04,910 --> 00:18:16,140 Explorer 6 and so forth. It's really a mess. And of course, email STARTTLS: Email 182 00:18:16,140 --> 00:18:22,220 never had the design to incorporate security and authentication, so as always, 183 00:18:22,220 --> 00:18:27,750 they just popped it on top, and this is STARTTLS. The problem with STARTTLS is it 184 00:18:27,750 --> 00:18:33,080 can be suppressed and people will fall back to plaintext if they cannot reach the 185 00:18:33,080 --> 00:18:39,530 service with STARTTLS. Perfect forward secrecy and so forth, deployment is another 186 00:18:39,530 --> 00:18:46,770 topic which could be a talk of its own. And there is this troublesome development that the 187 00:18:46,770 --> 00:18:52,340 CAs, they get bought and they get sold constantly. Just this year, Symantec 188 00:18:52,340 --> 00:19:00,040 bought the company BlueCoat. Symantec is one of the larger CAs. They run the entire 189 00:19:00,040 --> 00:19:07,150 - not the entire, but they run large parts of the certifications that are observable. 190 00:19:07,150 --> 00:19:13,100 BlueCoat got popular in the Arab Spring, because they found BlueCoat proxies which 191 00:19:13,100 --> 00:19:18,700 are capable of using man-in-the-middle attacks to conduct traffic introspection, 192 00:19:18,700 --> 00:19:23,320 have been used at an ISP I think in Syria or Egypt. 
They found them, and they have 193 00:19:23,320 --> 00:19:28,820 been deployed nationwide. So if you think about it that Symantec, one of the largest 194 00:19:28,820 --> 00:19:34,690 CAs, is buying BlueCoat, one of the larger traffic introspection companies, things 195 00:19:34,690 --> 00:19:38,620 can look really fishy or scary. 196 00:19:39,580 --> 00:19:44,180 Of course they promised they would never use the Symantec 197 00:19:44,180 --> 00:19:46,600 *laughter* 198 00:19:46,600 --> 00:19:53,140 This is the state we're in. This is fine, but it's not. But people still think 199 00:19:53,140 --> 00:19:59,561 that HTTPS is safe. And actually it took a decade to teach people that they 200 00:19:59,561 --> 00:20:05,060 have to search for the lock icon. But if they do not understand - actually they do 201 00:20:05,060 --> 00:20:11,910 not know how the lock icon appears. But the entire lock icon is a farce if you dig 202 00:20:11,910 --> 00:20:20,860 into the details. We're all sitting in a room filled with flames, so to say. So, 203 00:20:20,860 --> 00:20:26,520 this is where certificate transparency comes in. Certificate transparency has the 204 00:20:26,520 --> 00:20:38,050 goal to identify fraudulent certification authorities. In a perfect world, any 205 00:20:38,050 --> 00:20:43,140 certification authority would publish all its logs, would publish all the 206 00:20:43,140 --> 00:20:48,700 certificates it issues. So as soon as I get a certificate for schmiedecker.net, 207 00:20:48,700 --> 00:20:54,160 the certification authority - this is part of the public/private key, it can be 208 00:20:54,160 --> 00:20:59,840 public - so wouldn't it be nice if the CA would publish that it just issued a 209 00:20:59,840 --> 00:21:05,740 certificate for schmiedecker.net? Basically: yes. 
Of course, certification 210 00:21:05,740 --> 00:21:11,300 authorities do not want this to happen, in particular if they're selling to funky 211 00:21:11,300 --> 00:21:18,440 states or funky businesses which earn their money with traffic introspection and 212 00:21:18,440 --> 00:21:23,920 so forth. So the perfect world would be the public key of each certificate would 213 00:21:23,920 --> 00:21:28,160 be published. The certification authority could say "Hey, I just issued this 214 00:21:28,160 --> 00:21:30,990 certificate" and everybody could see it, could verify it 215 00:21:30,990 --> 00:21:35,200 and it would be, well, a better world. 216 00:21:37,740 --> 00:21:43,200 This would help to detect problems very early. So if a small Dutch 217 00:21:43,200 --> 00:21:47,330 certification authority would issue a certificate for google.com or 218 00:21:47,330 --> 00:21:52,300 torproject.com, this would be noticeable. I mean, this is a small CA, they would be 219 00:21:52,300 --> 00:21:57,280 really - they should be really surprised if google.com decides to issue a 220 00:21:57,280 --> 00:22:04,540 certificate for their service. This would shorten the window of opportunity for an 221 00:22:04,540 --> 00:22:12,560 attacker. Also, the idea is to have some form of punishment for misbehaving CAs. So 222 00:22:12,560 --> 00:22:18,020 at the moment, right now, if a certification authority fucks up, and 223 00:22:18,020 --> 00:22:23,970 Google is affected, they mandate that they need to have additional steps to be 224 00:22:23,970 --> 00:22:32,800 reintroduced into the trust stores. This is what Google did. They did the Power 225 00:22:32,800 --> 00:22:41,650 Ranger move, and they decided they want to make the internet more secure. Why Google? 
226 00:22:41,650 --> 00:22:46,610 Well, Google is uniquely positioned in a way that they control the clients with 227 00:22:46,610 --> 00:22:53,820 their browsers with the Android system, and they also control a large portion of 228 00:22:53,820 --> 00:22:58,340 the servers. Everyone uses Google, except for those that use Bing. 229 00:22:58,340 --> 00:23:00,530 *laughter* 230 00:23:00,530 --> 00:23:08,140 Just kidding. What Google did is, once the DigiNotar hack got public, they pinned 231 00:23:08,140 --> 00:23:13,620 their certificates. Since Chrome has a decent update cycle they can ship the 232 00:23:13,620 --> 00:23:19,241 certificates which they expect to see with a browser update. So as soon as [the] 233 00:23:19,241 --> 00:23:27,510 browser updates in the background, it can enforce the specific certificate that it 234 00:23:27,510 --> 00:23:34,670 expects to see for google.com, youtube.com, and whatever. Also, it has a 235 00:23:34,670 --> 00:23:40,330 really huge market share. 50% and more, depending on how you count. Chrome and 236 00:23:40,330 --> 00:23:46,060 Chromium are rather popular. And lastly, they are a common target. So if some 237 00:23:46,060 --> 00:23:53,860 dictator decides to introspect client emails, user emails, usually they target 238 00:23:53,860 --> 00:23:59,640 gmail.com, because they have a decent security, they do not have any other 239 00:23:59,640 --> 00:24:10,180 vulnerabilities or backdoors to allow access to their content. Which makes the 240 00:24:10,180 --> 00:24:15,700 attack to Gmail a very drastic attack. With the changes that Google introduced 241 00:24:15,700 --> 00:24:21,190 into Chrome with the certificate pinning, they can now detect these attacks. 242 00:24:21,190 --> 00:24:29,940 But this was already back in 2011. 
Since then, for example, the Porter Felt tweet 243 00:24:29,940 --> 00:24:37,520 I showed you: if Chrome would go to a website google.com or youtube.com, and 244 00:24:37,520 --> 00:24:44,200 would see a fraudulent certificate, they would warn the user. And what Google then 245 00:24:44,200 --> 00:24:52,840 did, was to propose a standard, to make an RFC, how to transparently publish the logs 246 00:24:52,840 --> 00:25:01,350 for certificates that have been issued. The idea of the RFC is that every 247 00:25:01,350 --> 00:25:11,460 certificate issued is public. This is implemented in a public, append-only log. 248 00:25:11,460 --> 00:25:16,900 So they have a log, they have open APIs, and they accept every certificate. Then, 249 00:25:16,900 --> 00:25:22,180 cryptographically assured, the client like the browser can verify that this is a 250 00:25:22,180 --> 00:25:27,640 publicly logged certificate. And the entire system is open for all. So you can 251 00:25:27,640 --> 00:25:30,190 go to the website, you can get the source code, 252 00:25:30,190 --> 00:25:36,490 you can run your own log for RFC 6962. 253 00:25:36,490 --> 00:25:40,610 And everyone is happy. 254 00:25:40,870 --> 00:25:45,960 The goals were to detect misbehaving CAs. As I said, 255 00:25:45,960 --> 00:25:51,500 they have their audits, they have their compliance regulations, and so forth, but 256 00:25:51,500 --> 00:25:55,010 not on the certificate level. With certificate transparency, they become 257 00:25:55,010 --> 00:26:00,950 auditable by the public, by the browsers. Everyone can query the logs and see 258 00:26:00,950 --> 00:26:04,730 whether or not this particular certification authority has issued a 259 00:26:04,730 --> 00:26:07,290 certificate for google.com. 260 00:26:10,200 --> 00:26:15,390 Alright! Upon reading the RFC, there are three entities 261 00:26:15,390 --> 00:26:20,260 which are part of Certificate Transparency. 
There are, for one, 262 00:26:20,260 --> 00:26:27,680 the logs, which are like giant vacuum cleaners. They ingest all the certificates 263 00:26:27,680 --> 00:26:34,170 which are sent to them, and then cryptographically sign them and issue the 264 00:26:34,170 --> 00:26:40,620 assurance that this specific certificate has been logged. And this has been issued 265 00:26:40,620 --> 00:26:45,640 and has not been tampered with, and so forth. Then there are monitors. They 266 00:26:45,640 --> 00:26:49,860 identify suspicious certificates. Usually, these are the certification authorities 267 00:26:49,860 --> 00:26:55,930 themselves which run those monitors. And then there are the auditors. The auditors 268 00:26:55,930 --> 00:27:02,870 usually are implemented in the browser. And they verify that the issued 269 00:27:02,870 --> 00:27:10,190 certificates are really logged. Looking at them in detail: the role of the monitor 270 00:27:10,190 --> 00:27:14,080 and the auditor is kind of interchangeable, so a monitor can be an 271 00:27:14,080 --> 00:27:21,350 auditor, back and forth. What the monitor does, it fetches all the certificates. 272 00:27:21,350 --> 00:27:27,720 So you have this giant pool of certificates. They are cryptographically assured which 273 00:27:27,720 --> 00:27:33,220 we will see soon. And the monitor just fetches them all. And they have some form 274 00:27:33,220 --> 00:27:39,920 of semantic checking. They can see, has there been a certificate for my domain, 275 00:27:39,920 --> 00:27:47,059 has there been any sub-CA created, which is able to issue certificates for traffic 276 00:27:47,059 --> 00:27:53,590 introspection, and so forth. Also, what they can then, with this data, do, they 277 00:27:53,590 --> 00:28:00,160 can identify misbehaving log operators. I said, the logs, they are just gigantic 278 00:28:00,160 --> 00:28:05,150 hoovers, which collect all the certificates, and they need auditing, too, 279 00:28:05,150 --> 00:28:09,390 of course. 
They need - they have a position of power, because they are 280 00:28:09,390 --> 00:28:18,300 managing this huge pool of certificates. And one needs to challenge the log to 281 00:28:18,300 --> 00:28:24,400 identify misbehaviour. This can be done by the monitors, can also be done by the 282 00:28:24,400 --> 00:28:32,490 auditors. Every client - right now, it's implemented in Chrome. Chrome checks for 283 00:28:32,490 --> 00:28:43,110 these Certificate Transparency cryptographically signed blobs. And the 284 00:28:43,110 --> 00:28:47,460 browsers and everything, they can verify the log integrity as well. So in the 285 00:28:47,460 --> 00:28:56,860 backend, the log, it creates a hash tree. This hash tree is signed. We will come to 286 00:28:56,860 --> 00:29:05,650 that in a second. I got lost here. So both monitors and auditors, they query that the 287 00:29:05,650 --> 00:29:10,570 log entity is working correctly. It wouldn't be a good thing if China could go 288 00:29:10,570 --> 00:29:16,530 to Google and tell them "Hey, we would like to have this certificate removed." Google 289 00:29:16,530 --> 00:29:22,670 could then comply or could not comply but whether they remove the certificate this 290 00:29:22,670 --> 00:29:28,340 would be auditable and this would be observable to the public. So the good 291 00:29:28,340 --> 00:29:33,690 thing is anyone can run any software, any one of you in this room can run a log entity. 292 00:29:33,690 --> 00:29:38,430 You need some kind of access to some certificates, so whether or not you are a 293 00:29:38,430 --> 00:29:45,340 certification authority, you can just run a public log, and everybody can push their 294 00:29:45,340 --> 00:29:53,710 certificates to your service. Right now, this is not the case. Usually, the CAs run 295 00:29:53,710 --> 00:30:00,230 the monitors and they run the logs, but this is not by design, anybody can run 296 00:30:00,230 --> 00:30:06,470 anything. One of the problems is availability. So even though I can set up 
So even though I can set up 297 00:30:06,470 --> 00:30:15,140 a log for certificates, I have the problem that my log needs to be online 24/7. My 298 00:30:15,140 --> 00:30:22,870 ISP is not happy if I ask him to guarantee this for me, if I don't pay much much much 299 00:30:22,870 --> 00:30:31,350 more. So, how does it work? Currently, if you get a certificate, you go to the 300 00:30:31,350 --> 00:30:36,070 certification authority, you say, "hey, I'm this wonderful domain, please could I 301 00:30:36,070 --> 00:30:42,860 get a certificate?" And then you get the certificate. What's additionally happening 302 00:30:42,860 --> 00:30:50,350 with certification transparency is that the CA upon issuing the certificate - this can 303 00:30:50,350 --> 00:30:55,610 be any CA, this can be Let's Encrypt, this can be Thawte, Symantec, you name it - 304 00:30:55,610 --> 00:31:02,090 what they do is they send the certificate once they issued it, they send the 305 00:31:02,090 --> 00:31:13,500 certificate to one of the logs. The log then signs the successful reception of the 306 00:31:13,500 --> 00:31:18,000 certificate, and immediately sends something back. This blob is called the 307 00:31:18,000 --> 00:31:24,309 SCT, the signed certificate timestamp, and this can then be included in the 308 00:31:24,309 --> 00:31:32,990 certificate or in other ways. Key point here is that once the server installs the 309 00:31:32,990 --> 00:31:42,860 certificate, it also installs this SCT, so that browsers can see it and parse it. 310 00:31:42,860 --> 00:31:49,540 Some people I might have lost here. Nonetheless, everything is easier in 311 00:31:49,540 --> 00:31:53,771 pictures. Right now, currently - and these are the pictures from the certification 312 00:31:53,771 --> 00:31:58,570 transparency website, thanks for making them - my pic skills are really not that 313 00:31:58,570 --> 00:32:03,960 good, so I never would have been able to make such beautiful graphs. 
So currently, 314 00:32:03,960 --> 00:32:10,020 there is the certification authority. It issues a certificate, and the website then 315 00:32:10,020 --> 00:32:17,059 installs it in the correct directory. The clients check it, and encryption can 316 00:32:17,059 --> 00:32:23,240 happen. The additional step, and this is the nice thing, it can happen without any 317 00:32:23,240 --> 00:32:28,850 additional steps on the server side and the client side, it's just the 318 00:32:28,850 --> 00:32:33,650 certification authority needs to do an additional step. So instead of just 319 00:32:33,650 --> 00:32:39,920 issuing the certificate, they send the certificate to the logs, the log 320 00:32:39,920 --> 00:32:45,800 immediately sends back the so-called SCT, the signed certificate timestamp, and this 321 00:32:45,800 --> 00:32:51,830 is then included in the certificate, which is shipped to the client. And then the 322 00:32:51,830 --> 00:32:57,570 client, if it supports it, can ask the server whether or not this particular 323 00:32:57,570 --> 00:33:05,680 certificate is included or not. The things that come back from the log they are 324 00:33:05,680 --> 00:33:11,010 signed, they have an ID, and they have a timestamp. These are the important things. 325 00:33:11,010 --> 00:33:18,440 They need to be included in those SCT. Also, what will be interesting in the 326 00:33:18,440 --> 00:33:27,160 future, that the certificate can have multiple log entries. So the SCT is like a 327 00:33:27,160 --> 00:33:36,380 promise. The log operator promises to include this certificate in its logs. And 328 00:33:36,380 --> 00:33:40,140 everybody can check afterwards then if this log has really publicly logged, or if 329 00:33:40,140 --> 00:33:45,260 the authority has omitted to log it. In the future it will be the case that many 330 00:33:45,260 --> 00:33:52,800 SCTs can be within a certificate. 
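A minimal sketch of what an SCT carries, assuming only the three parts named in the talk (a log ID, a timestamp, and a signature). The class `SCT`, the helper `distinct_logs`, and all sample values are hypothetical and for illustration only; this is not the wire format, and no signature is verified here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SCT:
    """Illustrative stand-in for a signed certificate timestamp (not the wire format)."""
    log_id: bytes      # identifies the log; RFC 6962 uses the SHA-256 hash of the log's public key
    timestamp_ms: int  # time the log issued the SCT, in milliseconds since the Unix epoch
    signature: bytes   # the log's signature; treated as opaque here and NOT verified


def distinct_logs(scts: Iterable[SCT]) -> int:
    """Count how many different logs have promised to include the certificate."""
    return len({sct.log_id for sct in scts})


# Placeholder values, made up for illustration only.
scts = [
    SCT(log_id=b"A" * 32, timestamp_ms=1482883200000, signature=b"sig-a"),
    SCT(log_id=b"B" * 32, timestamp_ms=1482883200500, signature=b"sig-b"),
    SCT(log_id=b"A" * 32, timestamp_ms=1482883300000, signature=b"sig-a2"),
]
print(distinct_logs(scts))  # 2: two SCTs come from the same log
```

Counting distinct log IDs rather than SCTs reflects the point that several promises from the same log are worth less than promises from independent logs.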
If I'm a certification authority I can go to any 331 00:33:52,800 --> 00:34:00,000 log operator, send them every certificate I have and then include many, many SCTs. 332 00:34:00,000 --> 00:34:04,080 And the SCT is not private. This is just an ID, it's a timestamp, and it's a 333 00:34:04,080 --> 00:34:12,969 signature. This is probably too much. There's multiple ways for the client to 334 00:34:12,969 --> 00:34:21,289 verify that this certificate has an SCT. So one of the methods for example is OCSP 335 00:34:21,289 --> 00:34:26,389 stapling. Right now, if you have a certificate, instead of going to the CA, 336 00:34:26,389 --> 00:34:34,149 the server can staple the OCSP request signed by the CA. And within this OCSP 337 00:34:34,149 --> 00:34:44,109 stapling there can also be the SCT included. How does it work on the log 338 00:34:44,109 --> 00:34:48,489 side? Everything there is, is a Merkle hash tree. A Merkle hash tree is a 339 00:34:48,489 --> 00:34:52,940 wonderful data structure. It's nothing new, it's nothing fancy, and it's not the 340 00:34:52,940 --> 00:34:54,418 blockchain. 341 00:34:54,418 --> 00:34:55,899 *laughter* 342 00:34:55,899 --> 00:35:05,400 The Merkle hash tree, it looks, it's a binary tree. Every node has two children, 343 00:35:05,400 --> 00:35:10,570 and the hash value of an inner node depends on the two children. So usually 344 00:35:10,570 --> 00:35:14,600 it's the concatenation of the values of the two children. Gets hashed again, up 345 00:35:14,600 --> 00:35:19,859 to the root. Makes it very space efficient because if I want to verify the integrity 346 00:35:19,859 --> 00:35:27,799 of one entire tree, all I have to check is the hash value of the root. Then, of 347 00:35:27,799 --> 00:35:36,260 course, I can get all the relevant hash values, and then I can reconstruct it. 
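The tree described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code any log actually runs: it uses SHA-256 and the hashing conventions of RFC 6962 (a `0x00` prefix for leaves, a `0x01` prefix for inner nodes, and a split at the largest power of two below the number of entries). The function names and the sample entries are made up.

```python
import hashlib
from typing import Sequence


def leaf_hash(data: bytes) -> bytes:
    # Leaves are hashed with a 0x00 prefix so they cannot be confused with inner nodes.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # An inner node is the hash of its two children, concatenated, with a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries: Sequence[bytes]) -> bytes:
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


# Placeholder "certificates"; real entries would be certificate chains.
certs = [b"cert-0", b"cert-1", b"cert-2", b"cert-3", b"cert-4"]
root = merkle_root(certs)

# Changing, removing or appending an entry changes the root.
print(root != merkle_root(certs[:-1]))              # True
print(root != merkle_root(certs + [b"cert-5"]))     # True
print(root != merkle_root([b"forged"] + certs[1:])) # True
```

Because every inner hash depends on everything below it, comparing one 32-byte root is enough to compare two entire trees.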
CT 348 00:35:36,260 --> 00:35:45,460 uses SHA256 Merkle tree, and as I said, everything below a certain node is 349 00:35:45,460 --> 00:35:51,509 responsible for the hash value. So if you remove a node, if you add a node, or if 350 00:35:51,509 --> 00:36:02,490 you relocate a node, the hash values of all the upper nodes get changed. Each of 351 00:36:02,490 --> 00:36:06,920 the log operators, additionally to the promise that they will include every 352 00:36:06,920 --> 00:36:12,400 certificate that it receives, it also gives a promise on the maximum merge 353 00:36:12,400 --> 00:36:18,890 delay. The SCT, the promise to include this certificate chain into the log, it 354 00:36:18,890 --> 00:36:26,069 can only finish immediately because it's a promise to include this into the log. And 355 00:36:26,069 --> 00:36:32,400 the maximum merge delay is the time the log operator promises to include it in the 356 00:36:32,400 --> 00:36:41,150 big, big Merkle hash tree. The good thing about the Merkle hash tree is despite 357 00:36:41,150 --> 00:36:46,369 being very space efficient, calculation efficient, not that much data overhead, 358 00:36:46,369 --> 00:36:50,869 and so forth, it's not possible to backdate elements. This was interesting 359 00:36:50,869 --> 00:36:55,470 for one of the certification authorities which issued SHA1 signed certificates, 360 00:36:55,470 --> 00:36:59,670 even though the browsers and everyone agreed that this should not happen 361 00:36:59,670 --> 00:37:05,440 anymore. So it's also not possible to remove elements that have been once in there. So 362 00:37:05,440 --> 00:37:09,780 if Symantec decided to remove the google.com certificate, which was a "test" 363 00:37:09,780 --> 00:37:14,359 certificate, this would be noticeable as well, because if you remove one of the 364 00:37:14,359 --> 00:37:20,739 leaves, the hash values up to the root, they all change. And it's also not 365 00:37:20,739 --> 00:37:26,690 possible to add elements. 
If you would like to add an element unnoticeably, you 366 00:37:26,690 --> 00:37:34,160 cannot do this, because the hash values of all the upper nodes would change. So how 367 00:37:34,160 --> 00:37:39,989 do the logs operate? What they usually do is once every hour, they receive the 368 00:37:39,989 --> 00:37:48,319 certificates, and once every hour they include them into their Merkle hash tree. 369 00:37:48,319 --> 00:37:52,069 Probably already too much detail. They build a separate tree, and then include it 370 00:37:52,069 --> 00:38:01,480 and recalculate the root hash value, which is then signed and shipped. And the nice 371 00:38:01,480 --> 00:38:07,829 thing about the Merkle tree is that you have multiple ways of proving things. One 372 00:38:07,829 --> 00:38:18,359 of the things that can be proved is whether or not this log operator is honest. If a 373 00:38:18,359 --> 00:38:21,989 log operator removes one of the certificates, this becomes visible by 374 00:38:21,989 --> 00:38:32,099 changing all the relevant nodes. Also, it's very efficient. Also a figure from 375 00:38:32,099 --> 00:38:39,279 the project website. On the left side, you have a Merkle tree with some added 376 00:38:39,279 --> 00:38:47,039 certificates, appended certificates. And if a monitor or an auditor decides to 377 00:38:47,039 --> 00:38:53,699 challenge the log operator, at a later point in time, whether or not these 378 00:38:53,699 --> 00:39:00,509 certificates D6 and D7 have been correctly added, all the log operator has to send 379 00:39:00,509 --> 00:39:07,329 are those highlighted nodes. This is the root, this is the thing that is signed, 380 00:39:07,329 --> 00:39:13,079 for example, every hour. This is public. The certificates, they are public because 381 00:39:13,079 --> 00:39:20,539 like, they're certificates. 
If now someone wants to verify that not only these have 382 00:39:20,539 --> 00:39:25,599 been included, this is very easy, because you just have to calculate all the way up, 383 00:39:25,599 --> 00:39:30,279 but also verify that all the other certificates are still there, so none of 384 00:39:30,279 --> 00:39:36,510 the old certificates have been removed, there only needs to be three hash values 385 00:39:36,510 --> 00:39:42,190 transmitted. And then the challenger can re-calculate everything. So as soon as the 386 00:39:42,190 --> 00:39:46,950 challenger knows those hash values they can concatenate everything back together 387 00:39:46,950 --> 00:39:57,079 and in the end, it should have the same hash value as the root. Another proof that 388 00:39:57,079 --> 00:40:02,790 is possible is whether a specific certificate is still in the log. So it's 389 00:40:02,790 --> 00:40:07,359 not only possible to challenge the consistency of the entire log regarding 390 00:40:07,359 --> 00:40:14,369 old data, but it's also to verify that a specific certificate is still in the logs, 391 00:40:14,369 --> 00:40:21,109 or made it into the logs. Remember, the SCT, the thing that finished immediately, 392 00:40:21,109 --> 00:40:27,190 is just a promise to include it in the logs, and at a later point in time, 393 00:40:27,190 --> 00:40:35,619 anyone, any auditor can challenge the log operator if the certificate is really in 394 00:40:35,619 --> 00:40:45,569 the log. So again, if I want to verify that a specific certificate is in the log 395 00:40:45,569 --> 00:40:51,300 I have the certificate that I would like to challenge, then I just need, in this 396 00:40:51,300 --> 00:40:57,259 example, those three nodes, and everything else, the j node can be calculated because 397 00:40:57,259 --> 00:41:02,330 I have the certificate. Then I have the hash of the certificate. I need this hash, 398 00:41:02,330 --> 00:41:12,430 then I can calculate this value, and so forth, until I am at the root. 
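The inclusion check just described can be sketched as follows. This is an illustration under assumptions, not a reference implementation: hashing follows RFC 6962 (SHA-256, `0x00` leaf prefix, `0x01` node prefix), the verification loop follows the procedure given in RFC 9162, and the helper names and the seven placeholder entries are made up. The `audit_path` function plays the role of the log; `verify_inclusion` is what an auditor would run with only the certificate, its index, the tree size, the path, and the signed root.

```python
import hashlib
from typing import List, Sequence


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    # Largest power of two strictly smaller than n (n >= 2).
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(entries: Sequence[bytes]) -> bytes:
    if len(entries) == 0:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = split_point(len(entries))
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


def audit_path(index: int, entries: Sequence[bytes]) -> List[bytes]:
    """Log side: sibling hashes needed to get from one leaf up to the root."""
    if len(entries) <= 1:
        return []
    k = split_point(len(entries))
    if index < k:
        return audit_path(index, entries[:k]) + [merkle_root(entries[k:])]
    return audit_path(index - k, entries[k:]) + [merkle_root(entries[:k])]


def verify_inclusion(entry: bytes, index: int, tree_size: int,
                     path: Sequence[bytes], root: bytes) -> bool:
    """Auditor side: recompute the root from the entry and the sibling hashes."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(entry)
    for p in path:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            r = node_hash(p, r)          # sibling is on the left
            if not fn & 1:
                while fn and not fn & 1:  # skip levels where this node has no sibling
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)          # sibling is on the right
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


# Seven placeholder entries standing in for logged certificates.
certs = [b"cert-%d" % i for i in range(7)]
root = merkle_root(certs)
path = audit_path(2, certs)
print(len(path))                                          # 3
print(verify_inclusion(certs[2], 2, len(certs), path, root))   # True
print(verify_inclusion(b"forged", 2, len(certs), path, root))  # False
```

For seven entries, an entry in the left half needs three sibling hashes, which is why the proof stays small even when the log holds millions of certificates.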
So much for 399 00:41:12,430 --> 00:41:17,470 under the hood. Merkle hash trees are gone. One of the problems of those logs 400 00:41:17,470 --> 00:41:22,630 are they are ever growing. You might have noticed, there is not a single word about 401 00:41:22,630 --> 00:41:31,949 deleting certificates, for valid reasons, they are ever growing. Of course, nothing 402 00:41:31,949 --> 00:41:39,279 is forever, so what log operators do is that they rotate the logs. So at a 403 00:41:39,279 --> 00:41:46,119 specific point in time, the log gets frozen, the tree is then static, and there 404 00:41:46,119 --> 00:41:51,920 is another log entity, which is brought online and used for including the newer 405 00:41:51,920 --> 00:41:58,069 certificates. Quite recently, aviator from Google got frozen. 406 00:41:58,069 --> 00:42:00,719 It contains 46 million certificates. 407 00:42:00,719 --> 00:42:09,060 Small drawback of freezing a log: as long as one certificate in this 408 00:42:09,060 --> 00:42:16,279 log, in this tree is still valid, this log needs to be reachable. As soon as all 409 00:42:16,279 --> 00:42:22,680 the certificates have expired, it can be dumped. But until that it has to be 410 00:42:22,680 --> 00:42:25,680 available for the proofs. 411 00:42:28,099 --> 00:42:34,529 One of the issues is that right now there are just a few log operators. 412 00:42:34,529 --> 00:42:39,240 In the future, there should be many more. Not hundred-thousands of 413 00:42:39,240 --> 00:42:46,840 them, but maybe hundreds of them. And they need to exchange information. Some form of 414 00:42:46,840 --> 00:42:53,460 log chatter should appear. The log operators chatter with the clients to 415 00:42:53,460 --> 00:43:01,349 verify that they all see the same state of the Merkle trees. And this has been 416 00:43:01,349 --> 00:43:08,940 published in a paper last year. 
Right now, the idea is not yet at a level where they 417 00:43:08,940 --> 00:43:14,440 need to chatter, which we will soon see. This happens when you create memes on the 418 00:43:14,440 --> 00:43:19,790 train. Usually, they are very bad memes. This is apparently Gossip Girl, I've never 419 00:43:19,790 --> 00:43:24,579 seen it, but if you google gossip and meme, ta-da! 420 00:43:24,579 --> 00:43:27,190 *laughter* 421 00:43:28,650 --> 00:43:33,219 Who now runs the logs? Who are the entities who are actively running logs? Of 422 00:43:33,219 --> 00:43:37,650 course, Google is running the majority of them. They proposed the entire thing, they 423 00:43:37,650 --> 00:43:43,970 wrote the code to run these things, and they run the large, open-for-all 424 00:43:43,970 --> 00:43:50,369 certificate logs. Three of them are currently open-for-all. Another one is for 425 00:43:50,369 --> 00:43:54,559 Let's Encrypt certificates, and another one is for non Let's Encrypt certificates. 426 00:43:54,559 --> 00:44:00,470 Of course, Let's Encrypt issues a lot of certificates, thankfully. So they 427 00:44:00,470 --> 00:44:05,119 separated that, apparently. If you read the mailing list, they promise that these 428 00:44:05,119 --> 00:44:11,700 free open-for-all logs are separated geographically and administratively. They 429 00:44:11,700 --> 00:44:21,170 are run by different entities, but they all have the same boss, and it would be 430 00:44:21,170 --> 00:44:30,190 better if there were more open logs. Symantec has one, Wosign, CNNIC. Every time 431 00:44:30,190 --> 00:44:34,410 Google detects that a fraudulent certificate for google.com has been 432 00:44:34,410 --> 00:44:44,109 issued, those certification authorities are mandated to run CT. Which is a good 433 00:44:44,109 --> 00:44:50,050 thing, I mean, public and everything. Google has tens of millions of 434 00:44:50,050 --> 00:44:54,160 certificates. 
They really have an open-for-all log, so everyone can push 435 00:44:54,160 --> 00:45:00,640 certificates in there. DigiCert, Symantec is kind of big, but all the other nodes 436 00:45:00,640 --> 00:45:05,849 which are listed on the website, they have a hundred-thousand-ish certificates, which 437 00:45:05,849 --> 00:45:14,320 is not that much compared to 50 million or 60 million. Right now, Google already 438 00:45:14,320 --> 00:45:22,359 mandates certification transparency for extended validity certificates, so if you 439 00:45:22,359 --> 00:45:28,160 not only see the green text up in the left corner of your browser, but also some 440 00:45:28,160 --> 00:45:35,660 fancy name and big, big green whatever, this is an EV cert. And Google mandates 441 00:45:35,660 --> 00:45:44,190 for EV certs to have two SCTs. Firefox is in the process of including it, I think. 442 00:45:44,190 --> 00:45:53,450 Also, apparently, certificate transparency works. Because, when Symantec issued this 443 00:45:53,450 --> 00:45:59,950 certificate for google.com they released a report stating that they found 23 "test" 444 00:45:59,950 --> 00:46:06,910 certificates. Symantec said that it issued 23 test certificates. But the logs are 445 00:46:06,910 --> 00:46:12,970 public, anybody can query them. And within seconds, you can see that Symantec issued 446 00:46:12,970 --> 00:46:20,839 another 164 certificates for other domains, and also 2,500 certificates for 447 00:46:20,839 --> 00:46:29,260 non-existing domains. Just regarding this one issue. I need to hurry, time is 448 00:46:29,260 --> 00:46:34,960 running out. Some of the downsides of certificate transparency. Of course: 449 00:46:34,960 --> 00:46:40,799 privacy. 
People can learn your internal hosts, so if you have a NAS for example, and 450 00:46:40,799 --> 00:46:46,289 this NAS is only reachable within your LAN, and you want to get rid of the 451 00:46:46,289 --> 00:46:51,210 browser warning whenever you access the interface of your NAS, you can get a Let's 452 00:46:51,210 --> 00:46:56,779 Encrypt certificate but since not only the certificate is published, but also it's 453 00:46:56,779 --> 00:47:04,230 logged, people can see in the public log file that there is, for your domain, a 454 00:47:04,230 --> 00:47:10,210 NAS. Also, log entries must contain the entire chain up to a trusted root 455 00:47:10,210 --> 00:47:15,099 certificate, which excludes everything which is self-signed, and everything which 456 00:47:15,099 --> 00:47:23,660 is DANE. DANE is for verifying TLS certificates using DNSSEC. And since these 457 00:47:23,660 --> 00:47:30,150 two have no trusted root, they are currently not working for certificate transparency. 458 00:47:30,150 --> 00:47:35,970 Now, of course you want to see the data. You're gonna play around with this. 459 00:47:35,970 --> 00:47:42,849 Basically, what you can query, everything is JSON. So, if you know JSON, you can 460 00:47:42,849 --> 00:47:52,769 work with certificate transparency. The basic URL is like this. The URL is any log 461 00:47:52,769 --> 00:48:00,719 server, responds with the current root and its signature, using this URL. Most 462 00:48:00,719 --> 00:48:05,180 interestingly, it gives you also the number of certificates and the time stamp. 463 00:48:05,180 --> 00:48:11,740 It looks then like this. JSON, so you have, this is the aviator log from Google, 464 00:48:11,740 --> 00:48:18,759 which is now frozen. Has 46 something million certificates, the hash value of 465 00:48:18,759 --> 00:48:28,109 the Merkle tree, and the signature. 
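A hedged sketch of handling such a response. The URLs from the slides are not in the transcript; the sketch assumes the signed tree head format from RFC 6962, where a log answers `GET https://<log server>/ct/v1/get-sth` with the JSON fields `tree_size`, `timestamp` (milliseconds since the Unix epoch), `sha256_root_hash` and `tree_head_signature` (both base64). The response body below is made up, not taken from any real log, the function name `parse_sth` is hypothetical, and the signature is only decoded, not verified.

```python
import base64
import hashlib
import json
from datetime import datetime, timezone

# Made-up response body in the shape RFC 6962 describes for get-sth.
# None of these values come from a real log.
sample_body = json.dumps({
    "tree_size": 7,
    "timestamp": 1482883200000,
    "sha256_root_hash": base64.b64encode(
        hashlib.sha256(b"placeholder root").digest()).decode("ascii"),
    "tree_head_signature": base64.b64encode(
        b"placeholder signature").decode("ascii"),
})


def parse_sth(body: str) -> dict:
    """Decode a signed tree head; the signature is NOT checked here."""
    sth = json.loads(body)
    root_hash = base64.b64decode(sth["sha256_root_hash"])
    if len(root_hash) != 32:
        raise ValueError("a SHA-256 root hash must be 32 bytes")
    return {
        "tree_size": int(sth["tree_size"]),
        "time": datetime.fromtimestamp(sth["timestamp"] / 1000, tz=timezone.utc),
        "root_hash": root_hash,
        "signature": base64.b64decode(sth["tree_head_signature"]),
    }


parsed = parse_sth(sample_body)
print(parsed["tree_size"], parsed["time"].isoformat(), parsed["root_hash"].hex())
```

A monitor would fetch this periodically and compare successive tree sizes and root hashes; checking the signature against the log's public key is a separate step that this sketch leaves out.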
Also, you can challenge the certification logs 466 00:48:28,109 --> 00:48:35,339 with consistency proofs, where you have two states of their tree, and the log has 467 00:48:35,339 --> 00:48:41,280 to prove that it did not modify anything in between them. And of course, you can 468 00:48:41,280 --> 00:48:49,900 verify that a specific certificate is in the tree with the second URL. And you can just 469 00:48:49,900 --> 00:48:54,940 push certificates there with a POST request. So you push it, they send back 470 00:48:54,940 --> 00:49:00,859 the SCT, if you're the log operator, then you would include this. Any website which 471 00:49:00,859 --> 00:49:10,799 right now is not using SCT all it takes is a POST request. Nothing more. Some screens 472 00:49:10,799 --> 00:49:18,509 from the internals. This is for google.com in the net internals view. What you can 473 00:49:18,509 --> 00:49:28,130 see is that signed certificate timestamp, the SCT, is received. It is valid. And 474 00:49:28,130 --> 00:49:33,180 compliance is checked. So this was for google.com. And everything worked out. 475 00:49:33,180 --> 00:49:39,960 Last but not least, just to mention it, Comodo operates a large search engine, 476 00:49:39,960 --> 00:49:50,229 crt.sh. There you can query public logs. Also, Facebook recently added a monitor 477 00:49:50,229 --> 00:49:58,180 for certificates. So if you own a domain name, and you use an entity which - no if 478 00:49:58,180 --> 00:50:04,739 you own a domain, you can get updates if the certificate changes. They also monitor 479 00:50:04,739 --> 00:50:10,920 the public logs and as soon as, for example, facebook.com uses a new 480 00:50:10,920 --> 00:50:19,579 certificate that is logged in CT, you can get a notification for that. This is what 481 00:50:19,579 --> 00:50:23,619 it looks like. Remember, Facebook can also send PGP-encrypted mails, then nothing 482 00:50:23,619 --> 00:50:31,790 leaks to anyone. This screenshot was borrowed from Scott Helme. 
So, what's 483 00:50:31,790 --> 00:50:41,700 next? Just a few - One month ago, Google announced that it will mandate certificate 484 00:50:41,700 --> 00:50:49,650 transparency from October 2017 on. So if you run a website which is secured by TLS 485 00:50:49,650 --> 00:50:53,790 you might want to check before that date whether or not your certification 486 00:50:53,790 --> 00:50:58,680 authority is using certificate transparency. I would expect to have more 487 00:50:58,680 --> 00:51:07,049 logs and more certificates included in the logs. In the far future, basically, the 488 00:51:07,049 --> 00:51:12,869 idea of transparency and this Merkle tree is open for anything. You could put key 489 00:51:12,869 --> 00:51:17,759 management software releases, anything in there. The team at Google, they also 490 00:51:17,759 --> 00:51:24,779 built a prototype for that, called Trillian, and described in the paper 491 00:51:24,779 --> 00:51:26,879 "Verifiable Data Structures". 492 00:51:26,879 --> 00:51:29,279 Before we come to the end and questions, 493 00:51:30,569 --> 00:51:31,460 *laughter* 494 00:51:32,270 --> 00:51:33,140 *applause* 495 00:51:37,660 --> 00:51:41,579 There is a distinction. Of course, you could solve this problem with blockchain 496 00:51:41,579 --> 00:51:49,930 as well. But a Merkle hash tree is much more efficient, much more elegant. When I 497 00:51:49,930 --> 00:51:53,599 talked to a colleague on the train here, he said, of course, you can just push the 498 00:51:53,599 --> 00:51:57,539 log into the blockchain. Yeah, not the same thing. 499 00:51:58,309 --> 00:51:59,539 Thank you! 500 00:51:59,979 --> 00:52:00,979 *applause* 501 00:52:10,769 --> 00:52:13,899 Herald: Thank you Martin for a very interesting talk! We have a few more 502 00:52:13,899 --> 00:52:17,890 minutes left for Q&A, so if you have a question, please line up next to the 503 00:52:17,890 --> 00:52:24,390 microphones, and ask your question. 
Remember: a question has a question mark 504 00:52:24,390 --> 00:52:29,840 at the end. Also, if you're exiting, please do so silently and from the front 505 00:52:29,840 --> 00:52:34,650 door, thank you. I think we have a question over there: 506 00:52:43,150 --> 00:52:55,789 Q: Can you recommend some libs or software where I can accomplish the TLS handshake 507 00:52:55,789 --> 00:53:02,190 from the client side, so I can get the SCT, via TLS extension, via OCSP 508 00:53:02,190 --> 00:53:07,039 extension, via the inherited pre-certificate SCT. 509 00:53:07,039 --> 00:53:14,920 M: Not by heart. I mean, if it's part of TLS certificate anything will go, OpenSSL, 510 00:53:14,920 --> 00:53:21,589 whatever, it's just a field. Same as for OCSP, so anything that does OCSP will 511 00:53:21,589 --> 00:53:25,410 include it, it's just that clients that do not know the extension will just not - 512 00:53:25,410 --> 00:53:31,989 they will ignore it. But anything that does OCSP or SSL handshake will work. 513 00:53:35,229 --> 00:53:37,029 H: Thank you. Question from this microphone. 514 00:53:37,029 --> 00:53:42,210 Q: Hello, thank you very much for the nice talk. Do you know how much space is needed 515 00:53:42,210 --> 00:53:45,070 to store all the logs currently? 516 00:53:45,070 --> 00:53:54,009 M: I had the same question, but unfortunately not. What they store is the 517 00:53:54,009 --> 00:54:02,009 tree, and they store the entire chain, excluding the root certificates. So, 518 00:54:02,009 --> 00:54:09,700 probably two, three, four certificates per entry, which is like - I think you can buy 519 00:54:09,700 --> 00:54:17,969 at the regular electronic markets a hard drive which is able to fit a lot of those entries. 520 00:54:20,199 --> 00:54:21,739 H: Next question from that mic. 521 00:54:21,739 --> 00:54:27,650 Q: Yeah, thank you for the talk. Why do you need two SCTs for extended validation? 522 00:54:27,650 --> 00:54:36,170 M: Because a single entity might cheat. 
So it's like - even though you can detect it, 523 00:54:36,170 --> 00:54:40,940 it's still a timeframe left. And if you have two SCTs, which are operated 524 00:54:40,940 --> 00:54:45,919 independently, the idea is it's not that likely that the two will collaborate 525 00:54:45,919 --> 00:54:48,239 to make a certificate disappear. 526 00:54:48,239 --> 00:54:50,019 Q: Thanks! 527 00:54:50,019 --> 00:54:51,499 H: That microphone, yes. 528 00:54:51,499 --> 00:54:55,229 Q: I'm actually a bit surprised, because Google has been pushing for making the 529 00:54:55,229 --> 00:55:00,209 server HELLO as small as possible, and of course, this is increasing the server 530 00:55:00,209 --> 00:55:06,839 HELLO with, in this case, an SCT, and of course, they are also doing OCSP stapling, 531 00:55:06,839 --> 00:55:11,469 so that makes it even bigger. And this is like a SHA256, so we're talking 256 bits 532 00:55:11,469 --> 00:55:15,690 there, plus another one you said that, you know, one is not enough. Actually I've 533 00:55:15,690 --> 00:55:19,459 never seen that has more than one SCT. Have you? 534 00:55:22,749 --> 00:55:23,580 M: No. 535 00:55:23,580 --> 00:55:24,010 *laughter* 536 00:55:24,100 --> 00:55:25,390 Not yet. 537 00:55:25,390 --> 00:55:26,589 Q: I've looked around, but nothing. 538 00:55:26,589 --> 00:55:27,710 M: Yeah. 539 00:55:27,710 --> 00:55:31,580 Q: It's actually increasing the size. And I'm just wondering, where is this going. 540 00:55:31,580 --> 00:55:39,319 Are we just gonna eat the costs of having all these SCTs and OCSP stapling? Are we 541 00:55:39,319 --> 00:55:40,319 prepared to eat that cost? 542 00:55:40,319 --> 00:55:46,609 M: I think the cost is small compared to the gain you get by HTTP2. So if you pipe 543 00:55:46,609 --> 00:55:52,029 anything to one singular connection. I think it's not bad of a cost anymore. But 544 00:55:52,029 --> 00:55:57,319 of course, this is a policy thing. 
To require a certain amount of SCTs, to 545 00:55:57,319 --> 00:56:01,849 prevent fraudulent CAs. 546 00:56:01,849 --> 00:56:07,859 Q: Is the idea that this will replace something like the SSL observatory, where 547 00:56:07,859 --> 00:56:13,900 browsers send in certs they see, and then - you nodded, so I assume yes. And then 548 00:56:13,900 --> 00:56:18,589 also, how does this work for people who can't have their certs be public? 549 00:56:18,589 --> 00:56:21,359 For people who are like issuing things for internal networks? 550 00:56:21,359 --> 00:56:27,329 M: If you can't have the certificate public, probably the better way right now 551 00:56:27,329 --> 00:56:33,650 is to have a certification authority which is not using CT. In the future, it makes 552 00:56:33,650 --> 00:56:39,930 it much more expensive to operate your own CA, incorporate it in the trust stores. 553 00:56:39,930 --> 00:56:43,969 But of course, this is costly. You have to sign the certificate and everything. 554 00:56:43,969 --> 00:56:52,180 Q: But if like in October 2017, when Chrome rejects all certs that don't have 555 00:56:52,180 --> 00:56:54,470 signed timestamps like what do I do? 556 00:56:56,570 --> 00:56:57,579 M: Use Edge. 557 00:56:58,209 --> 00:57:00,369 *laughter* 558 00:57:01,949 --> 00:57:06,670 I'm sure you can disable it somehow, but it's *blerg*. 559 00:57:08,470 --> 00:57:15,949 Q: What about if someone tries SCT with DHT or other system. 560 00:57:15,949 --> 00:57:18,169 Not blockchain, of course! 561 00:57:18,169 --> 00:57:21,289 It's possible to do that without central authorities? 562 00:57:21,289 --> 00:57:24,440 M: Sorry, say again? 563 00:57:24,440 --> 00:57:31,670 Q: My English is very bad, I'm sorry. 
I said, it is possible to do that without 564 00:57:31,670 --> 00:57:36,799 some central authority, like Google or over SCT, but 565 00:57:36,799 --> 00:57:41,409 with a distributed hash table, like DHT technologies, 566 00:57:41,409 --> 00:57:42,739 M: Yes, yes, of course. 567 00:57:42,739 --> 00:57:47,290 Q: And are there existing implementations? 568 00:57:47,290 --> 00:57:53,079 M: For the centralized thing, yes. Not for the distributed thing. But I think it's 569 00:57:53,079 --> 00:58:00,269 just adding a layer of DHT on top of it. So I'm sure you can think of a browser 570 00:58:00,269 --> 00:58:06,039 extension which uses the DHT to obtain SCT. But right now it's just purely 571 00:58:06,039 --> 00:58:08,039 centralized. But the source is open. 572 00:58:08,039 --> 00:58:09,229 Q: OK, thank you. 573 00:58:10,669 --> 00:58:15,369 Q: I was just curious how it works if you have a certificate which gets revoked, in 574 00:58:15,369 --> 00:58:19,930 context of the tree. Especially if the tree is frozen. So how does this work? 575 00:58:19,930 --> 00:58:24,859 How do you revoke a certificate with a tree, and then how does it work if it's 576 00:58:24,859 --> 00:58:26,690 frozen already. 577 00:58:26,690 --> 00:58:37,339 M: Good question! The goal of CT is not - it's not about revocation. So whether 578 00:58:37,339 --> 00:58:43,900 revocation path is taken regularly. So you ask OCSP. It's independent of the 579 00:58:43,900 --> 00:58:48,019 revocation thing. It's just publicly saying that this certificate has been 580 00:58:48,019 --> 00:58:56,789 issued. So removing a certificate from the tree, which has been removed - revoked, is 581 00:58:56,789 --> 00:59:01,390 not part of the specification. This is not the use case. It's just logging the 582 00:59:01,390 --> 00:59:03,089 certificates which have been issued. 
583 00:59:03,089 --> 00:59:07,950 Q: But if you audit all the logs, and you want to know if something is, like going 584 00:59:07,950 --> 00:59:11,380 on that shouldn't be going on, wouldn't you want to know whether the certificate 585 00:59:11,380 --> 00:59:12,650 has been revoked at some point? 586 00:59:12,650 --> 00:59:20,279 M: Yes, but not in the logs. The logs are just to prove that the CA has issued this 587 00:59:20,279 --> 00:59:26,640 certificate, and to prove that the log has correctly logged it. Revocation is 588 00:59:26,640 --> 00:59:32,680 different. Usually, OCSP stapling with the CA, but that's a different channel. So 589 00:59:32,680 --> 00:59:34,760 this is not for certificate transparency. 590 00:59:34,760 --> 00:59:36,520 Q: Thank you! 591 00:59:36,520 --> 00:59:38,789 H: That's all the time we have for Q&A. 592 00:59:38,789 --> 00:59:41,479 Big round of applause again for Martin for a great talk! 593 00:59:41,479 --> 00:59:42,859 *applause* 594 00:59:43,339 --> 00:59:45,599 *postroll music*