*34c3 intro*

Herald: The next talk will be about embedded systems security, and Pascal, the speaker, will explain how you can hijack debug components for embedded security in ARM processors. Pascal is not only an embedded software security engineer but also a researcher in his spare time. Please give a very, very warm welcoming good-morning applause to Pascal.

*applause*

Pascal: OK, thanks for the introduction. As it was said, I'm an engineer by day in a French company, where I work as an embedded system security engineer. But this talk is mainly about my spare-time activity, which is researcher, hacker or whatever you call it. This is because I work with a PhD student called Muhammad Abdul Wahab. He's a third-year PhD student in a French lab. So, this talk will mainly be a presentation of his work on embedded systems security and especially the debug components available in ARM processors. Don't worry about the link: at the end, there will also be a link with all the slides, documentation and everything. So, before the congress, I didn't know what kind of background you would need for my talk, so I put there some links, I mean some references to talks where you will find all the vocabulary needed to understand at least some parts of my talk.
About computer architecture and embedded system security, I hope you attended the talk by Alastair about the formal verification of software and also the talk by Keegan about Trusted Execution Environments (TEEs such as TrustZone). In this talk, I will also talk about FPGA stuff. About FPGAs, there was a talk on day 2 about FPGA reverse engineering. And, if you don't know about FPGAs, I hope that you had some time to go to the OpenFPGA assembly, because these guys are doing a great job on open-source FPGA tools. When you see this slide, the first question is: why did I put "TrustZone is not enough"? Just a quick reminder about what TrustZone is. TrustZone is about separating a system between a non-secure world, in red, and a secure world, in green. When we want to use the TrustZone framework, we have lots of hardware components and lots of software components allowing us to, let's say, run a secure OS and a non-secure OS separately. In our case, what we wanted to do is use the debug components (you can see them on the left side of the picture) to see if we can do some security with them. Furthermore, we wanted to use something other than TrustZone because, if you attended the talk about the security of the Nintendo Switch, you saw that the TrustZone framework can be bypassed in specific cases.
Furthermore, this talk is quite complementary, because we will do something at a lower level, at the processor architecture level. I will talk in a later part of my talk about what we can do between TrustZone and the approach developed in this work. So, basically, the presentation will be a quick introduction; I will talk about some works aiming to use debug components for security; then I will talk about ARMHEx, which is the name of the system we developed to use the debug components of a hardcore processor; and, finally, some results and a conclusion. In the context of our project, we are working with System-on-Chips. System-on-Chips are a kind of device where we have, in the green part, a processor. It can be a single-core, dual-core or even quad-core processor. Another interesting part, in yellow in the image, is the programmable logic, which is also called an FPGA in this case. In this kind of System-on-Chip, you have the hardcore processor, the FPGA and some links between those two units. You can see here, in the red rectangle, one of the two processors. This picture is an image of a System-on-Chip called Zynq, provided by Xilinx, which is also an FPGA vendor. In this kind of chip, we usually have two Cortex-A9 processors and some FPGA logic to work with.
What we want to do with the debug components is to work on Dynamic Information Flow Tracking. Basically, what is information flow? Information flow is the transfer of information from an information container C1 to C2 given a process P. In other words, if we take this simple code over there: if you have 4 variables (for instance, a, b, w and x), the idea is that if you have some metadata in a, the metadata will be transmitted to w. In other words, what kind of information will we transmit in the code? Basically, the information I'm talking about in the first block is "OK, this data is private, this data is public", and we should not mix public and private data together. Basically, the information can be binary ("public or private"), but of course we will be able to have several levels of information. In the following parts, this information will be called taint or tags and, to keep things simple, we will use colors to say "OK, my tag is red or green", just to say whether it's private or public data. As I said, if the tag contained in a is red, the data contained in w will be red as well. Same thing for b and x. Let's take a quick example and look at a buffer overflow. In the upper part of the slide you have the assembly code and, in the lower part, the green columns are the colors of the tags.
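The tag-propagation idea described above can be sketched in a few lines of Python. This is a minimal model of my own, not the speaker's code: the variable names a, b, w and x come from the slide, while the `combine` helper and the red/green labels are illustrative assumptions.

```python
# Minimal model of DIFT tag propagation: every value carries a tag
# ("red" = private, "green" = public) alongside its data.
RED, GREEN = "red", "green"

def combine(*tags):
    # A result is private if any of its inputs is private.
    return RED if RED in tags else GREEN

# (value, tag) pairs for the variables from the slide.
a = (42, RED)     # a holds private data
b = (7,  GREEN)   # b holds public data

# w = a -> w inherits a's tag; x = b -> x inherits b's tag.
w = (a[0], combine(a[1]))
x = (b[0], combine(b[1]))

# Mixing a private and a public value yields a private result.
y = (w[0] + x[0], combine(w[1], x[1]))

print(w[1], x[1], y[1])  # red green red
```

The only design decision here is the join rule in `combine`: any red input makes the output red, which is the binary public/private policy the talk describes; a multi-level policy would replace it with a lattice join.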
On the right side of these columns you have the status of the different registers. This code is basically: OK, my input is tainted (red) at the beginning, and we use the tainted input as the index variable. Register r2, which contains the idx variable, will be red as well. Then, when we access buffer[idx], which is the second line in the C code at the beginning, the information we have there will be red as well. And, of course, the result of the operation, which is x, will be red as well. Basically, that means that if there is a tainted input at the beginning, we must be able to transmit this information up to the return address of this code, just to say "OK, if this tainted input is private, the return address at the end of the code should be private as well". What can we do with that? There is a simple code over there. It says: if you are a normal user, your code just opens the welcome file. Otherwise, if you are a root user, it opens the password file. That is to say: the welcome file is public information, you can do whatever you want with it. Otherwise, for a root user, the password file may contain, for instance, a cryptographic key, and we should not reach the printf function at the end of this code. The idea behind that is to check whether the fs variable containing the data of the file is private or public.
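The welcome/password example can be mocked up the same way. This is a sketch under stated assumptions: the file paths, the `read_file` mock and the `printf_sink` policy function are my own illustration — in the talk, the actual check on the fs variable happens at the processor architecture level, not in application code.

```python
RED, GREEN = "red", "green"

def read_file(path):
    # Model: the password file is tainted private, the welcome file public.
    if path == "/etc/passwd":
        return ("s3cret-key", RED)
    return ("welcome!", GREEN)

def printf_sink(value):
    # Security policy: private data must never reach a public output sink.
    data, tag = value
    if tag == RED:
        raise PermissionError("private data reached a public sink")
    return data

fs = read_file("welcome.txt")
print(printf_sink(fs))   # fine: public data flows to the public sink

fs = read_file("/etc/passwd")
# printf_sink(fs) would now raise PermissionError: the tag followed
# the data from the file read all the way to the sink.
```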
There are mainly three steps for that. First of all, the compilation will give us the assembly code. Then, we must modify system calls to send the tags. The tags will be, as I said before, the private or public information about my fs variable. I will talk a bit about that later: maybe, in future works, the idea is to make, or at least compile, an operating system with integrated support for DIFT. There were already some works on Dynamic Information Flow Tracking. We can do this kind of information flow tracking in two manners. The first one is at the application level, working at the Java or Android level. Some works also propose solutions at the OS level: for instance, KBlare. But what we wanted to do here is work at a lower level: not at the application or OS level but at the hardware level or, at least, at the processor architecture level. If you want some information about the OS-level implementations of information flow tracking, you can go to blare-ids.org, where you have an Android port and a Java port of intrusion detection systems. In the rest of my talk, I will just go through the existing works and see what we can do about that. When we talk about dynamic information flow tracking at a low level, there are mainly three approaches. The first one is on the left side of this slide.
The idea is that, in the upper part of this figure, we have the normal processor pipeline: basically, decode stage, register file and Arithmetic & Logic Unit. The basic idea is that when we want to process tags or taints, we just duplicate the processor pipeline (the grey pipeline under the normal one) to process the tag data. This implies two things. First of all, we must have the source code of the processor itself, just to duplicate the processor pipeline and build the DIFT pipeline. This is quite inconvenient, because getting the source code of the processor is not really easy sometimes. On the other hand, the main advantage of this approach is that we can do nearly anything we want, because we have access to all the code: we can pull out all the wires we need from the processor to get the information we need. The second approach (right side of the picture) is a bit different: instead of having a single processor doing the normal application flow plus the information flow tracking, we separate the normal execution and the information flow tracking. This approach is not satisfying either, because you will have one core running the normal application while core #2 will just be doing DIFT controls. Basically, it's a shame to use a whole processor just for DIFT controls.
The best compromise we can make is a dedicated coprocessor just for the information flow tracking processing. Basically, the most interesting works on this topic have a main core running the normal application and a dedicated coprocessor doing the IFT controls, with some communication between those two cores. Let's make a quick comparison between different works. If you run dynamic information flow control in pure software (I will talk about that in the next slide), it is really painful in terms of time overhead: you will see that the time to do information flow tracking in pure software is really unacceptable. Regarding hardware-assisted approaches, the main advantage in all cases is a low overhead in terms of silicon area: it means that, on this slide, the overhead between the main core alone and the main core plus the coprocessor is not so important. We will see that, in the case of my talk, the dedicated DIFT coprocessor also makes it easier to implement different security policies. As I said, the pure software solution (the first line of this table) basically relies on instrumentation. If you were there on day 2: instrumentation is the transformation of a program into its own measurement tool.
It means that we will put some sensors in all parts of my code, just to monitor its activity and gather some information from it. If we want to measure the impact of instrumentation on the execution time of an application, you can see in this diagram the normal application execution time, normalized to 1. When we use instrumentation, the minimal overhead we get is about 75%, and the time with instrumentation will most of the time be twice the normal execution time. This is completely unacceptable, because it just makes your application run slower. Basically, as I said, the main concern of my talk is reducing the overhead of software instrumentation. I will also talk a bit about the security of the DIFT coprocessor, because we can't include a DIFT coprocessor without taking care of its security. To my knowledge, this is the first work about DIFT in ARM-based system-on-chips. In the talk about the security of the Nintendo Switch, the speaker said that black-box testing is fun... except that it isn't. In our case, we only have a black box, because we can't modify the structure of the processor; we must do our job without, let's say, decapping the processor and so on. This is an overall schematic of our architecture. On the left side, in light green, you have the ARM processor. In this case, this is a simplified version with only one core.
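The "program as its own measurement tool" idea — and why it costs so much — can be sketched with Python's tracing hook. This is an illustrative analogy of my own, not the actual DIFT instrumentation from the talk: the monitoring callback runs on the same core as the application, so every executed line pays for it, which is the kind of overhead the slide's 75%+ figure refers to.

```python
import sys

def trace_lines(func):
    """Instrument a function: record every line number it executes.

    Software-only instrumentation, as criticized in the talk: the
    monitor shares the core with the application, so each executed
    line triggers an extra callback.
    """
    events = []
    def tracer(frame, event, arg):
        if event == "line":
            events.append(frame.f_lineno)
        return tracer  # keep tracing inside this frame
    def wrapper(*args, **kwargs):
        sys.settrace(tracer)
        try:
            return func(*args, **kwargs)
        finally:
            sys.settrace(None)
    wrapper.events = events
    return wrapper

@trace_lines
def loop(n):
    total = 0
    for i in range(n):
        total += i
    return total

result = loop(10)
print(result, len(loop.events))  # every executed line was observed
```

The hardware approach in the talk avoids exactly this: the PTM emits the trace as a side effect of execution, so the application itself runs (nearly) unmodified.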
On the right side, you have the structure of the coprocessor we implemented in the FPGA. For the moment, you can notice two things. The first is that you have some links between the FPGA and the CPU; these links already exist in the system-on-chip. The second thing concerns the memory: you have separate memories for the processor and for the FPGA. We will see later that we can use TrustZone to add a layer of security, just to be sure that we won't mix the memory between the CPU and the FPGA. Basically, when we want to work with ARM processors, we must use ARM datasheets — we must read ARM datasheets. First of all, don't be afraid of the length of ARM datasheets: in my case, I used to work with the ARMv7 technical manual, which is already 2000 pages; the ARMv8 manual is about 6000 pages. Anyway. What is also difficult is that the information is split between different documents. When we want to use debug components in the case of ARM, we have this register over there, which is called DBGOSLAR. We can see that writing the key value 0xC5A-blabla to this field locks the debug registers, and if you write any other value, it will just unlock those debug registers.
So that was basically the first step to enable the debug components: just writing any other value to this register to unlock my debug components. Here is again a schematic of the overall system-on-chip. As you see, you have the two processors and, in the top part, you have what are called CoreSight components. These are the famous debug components I will talk about in the second part of my talk. Here is a simplified view of the debug components we have in Zynq SoCs. On the left side, we have the two processors (CPU0 and CPU1), and the CoreSight components are: the PTM, the one in the red rectangle; the ECT, which is the Embedded Cross Trigger; and the ITM, which is the Instrumentation Trace Macrocell. Basically, when we want to extract some data from the CoreSight components, the basic path is to use the PTM, go through the Funnel and, at this step, we have two choices to store the information taken from the debug components. The first one is the Embedded Trace Buffer, which is a small memory embedded in the processor. Unfortunately, this memory is really small: it's only about 4 KBytes, as far as I remember. The other possibility is to export the data to the Trace Packet Output, and this is what we will use to export data to the coprocessor implemented in the FPGA. Basically, what is the PTM able to do?
The 189 00:22:26,309 --> 00:22:34,149 first thing that PTM can do is to trace whatever in your memory. For instance, you 190 00:22:34,149 --> 00:22:41,880 can trace all your code. Basically, all the blue sections. But, you can also let's 191 00:22:41,880 --> 00:22:47,890 say trace specific regions of the code: You can say OK I just want to trace the 192 00:22:47,890 --> 00:22:55,519 code in my section 1 or section 2 or section N. Then the PTM is also able to 193 00:22:55,519 --> 00:23:00,100 make some Branch Broadcasting. That is something that was not present in the 194 00:23:00,100 --> 00:23:06,919 Linux kernel. So, we already submitted a patch that was accepted to manage the 195 00:23:06,919 --> 00:23:14,309 Branch Broadcasting into the PTM. And we can do some timestamping and other things 196 00:23:14,309 --> 00:23:22,250 just to be able to store the information in the traces. Basically, what a trace 197 00:23:22,250 --> 00:23:27,340 looks like? Here is the most simple code we could had: it's just a for loop 198 00:23:27,340 --> 00:23:35,570 doing nothing. The assembly code over there. And the trace will look like this. 199 00:23:35,570 --> 00:23:45,070 In the first 5 bytes, some kind of start packet which is called the A-sync packet 200 00:23:45,070 --> 00:23:50,390 just to say "OK, this is the beginning of the trace". In the green part, we'll have 201 00:23:50,390 --> 00:23:56,460 the address which corresponds to the beginning of the loop. And, in the orange 202 00:23:56,460 --> 00:24:02,700 part, we will have the Branch Address Packet. You can see that you have 10 203 00:24:02,700 --> 00:24:08,299 iterations of this Branch Address Packet because we have 10 iterations of the for 204 00:24:08,299 --> 00:24:18,679 loop. This is just to show what is the general structure of a trace. This is just 205 00:24:18,679 --> 00:24:22,720 a control flow graph just to say what we could have about this. 
Of course, if we have another loop at the end of this control flow graph, the trace just gets a bit longer, to include the information about the second loop, and so on. Once we have all these traces, the next step is: I have my tags, but how do I define the rules to propagate them? This is where we use static analysis. Basically, in this example, if we have the instruction "add register1 + register2 and put the result in register0", static analysis allows us to say that the tag associated with register0 will be the tag of register1 OR the tag of register2. Static analysis is done before running my code, so that I have the rules for all the lines of my code. Now that we have the trace and we know how to propagate the tags all over my code, we do the static analysis in the LLVM backend. The final step is about instrumentation. As I said before, we can recover all the memory addresses we need through instrumentation. Alternatively, we can get only the register-relative memory addresses through instrumentation. In the first case, on this simple code, we can instrument all the code, but the main drawback of this solution is that it drastically increases the execution time.
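The per-instruction rule generation can be sketched as below. This is a simplification of mine of what a static-analysis pass (in the talk, an LLVM backend pass) would emit; the textual assembly syntax, the opcode list and the rule format are illustrative assumptions, not the project's actual output.

```python
# Sketch of the static-analysis step: derive a tag-propagation rule for
# each instruction *before* the program runs. At runtime, the DIFT
# coprocessor only has to apply these precomputed rules as the trace
# tells it which instructions executed.
def rule_for(instruction):
    op, dst, *srcs = instruction.replace(",", "").split()
    if op in ("add", "sub", "orr", "eor", "mul"):
        # tag(dst) = tag(src1) | tag(src2): the result is tainted
        # if any source operand is tainted.
        return f"tag({dst}) = " + " | ".join(f"tag({s})" for s in srcs)
    if op == "mov":
        return f"tag({dst}) = tag({srcs[0]})"
    raise NotImplementedError(op)

program = ["add r0, r1, r2", "mov r3, r0"]
for insn in program:
    print(rule_for(insn))
# tag(r0) = tag(r1) | tag(r2)
# tag(r3) = tag(r0)
```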
Otherwise, what we can do, with the store instruction over there, is get data from the trace: basically, we will use the Program Counter from the trace. Then, for the Stack Pointer, we will use static analysis to get its information. And, finally, we need only one instrumented instruction at the end. If I go back to this system, the communication overhead will be the main drawback, as I said before: with the processor and the FPGA running as different parts, the main problem is how we can transmit data in real time or, at least, at the highest speed we can, between the processor and the FPGA. This is the time overhead with CoreSight components enabled or not. In blue, we have the baseline execution time with traces disabled, and we can see that, when we enable traces, the time overhead is nearly negligible. Regarding instrumentation time, we can see that with strategy 2, which uses the CoreSight components, static analysis and instrumentation, we can lower the instrumentation overhead from 53% down to 5%. We still have some overhead due to instrumentation, but it's really low compared to the related works where all the code was instrumented.
This is an overview showing, in the grey lines, the overhead of related works with full instrumentation; we can see that, with our approach (the green lines over there), the time overhead of our code is much, much smaller. Basically, how can we use TrustZone with this? This is just an overview of our system, and we can use TrustZone to separate the CPU from the FPGA coprocessor. If we make a comparison with related works, we can see that, compared to the first works, we are able to do information flow control with a hardcore processor, which was not the case with the first two works in this table. It means you can use a standard ARM processor to do the information flow tracking, instead of needing a specific processor. And, of course, the area overhead, which is another important topic, is much, much smaller compared to the existing works. It's time for the conclusion. As I presented in this talk, we are able to use the PTM component to obtain runtime information about my application. This is non-intrusive tracing, because we have negligible performance overhead. And we also improve software security, because we were able to add some security to the coprocessor.
The future perspective of this work is mainly to work with multicore processors, and to see if we can use the same approach for Intel and maybe ST microcontrollers, to see if we can also do information flow tracking in those cases. That was my talk. Thanks for listening.

*applause*

Herald: Thank you very much for this talk. Unfortunately, we don't have time for Q&A, so please, if you leave the room, take your trash with you; that makes the angels happy.

Pascal: I was a bit long, sorry.

Herald: Another round of applause for Pascal.

*applause*

*34c3 outro*

subtitles created by c3subtitles.de in the year 2020. Join, and help us!