Mill Computing, Inc. Forums The Mill Architecture ASLR (security) Reply To: ASLR (security)

Ivan Godard
Post count: 689

We seem all agreed that The Mill makes ROP impossible. Back when we first decided to put the call metadata in the spiller directly, we also looked at whether the equivalent exploit using function pointers (including VTABs) was possible, and if so what we could do about it. Your advocacy of AS:R has prompted me to rethink the issues.

To an attacker, ROP offers an easy way to create programs of arbitrary length: identify code fragments with interesting operation sequences ending with return, construct an arbitrarily long sequence of return addresses on the stack where the return address points at the next desired code fragment in sequence, and let the next program return enter the constructed sequence. Straightforward. Returns are ubiquitous, so it’s easy to find desired fragments.

Considering conventional architectures (say x86) first, doing the same idea with function or label pointers is not so easy. Rather than returning, the fragment must end with a branch through a label pointer or a call through a function pointer. Label pointers are quite rare in code, dispatch tables for switch statements mostly, and those are often in read-only memory. However function pointers are common, especially in OO languages. An exploit could overwrite one in dataspace and when the function was called could send the program to an arbitrary exploit-chosen location.

But is that sufficient for penetration? No: it gets only one fragment executed, whereas the exploit needs an arbitrary sequence so as to be able to construct a program. On ROP, sequences fall out of of the stack structure and the way return works; if you can get the first fragment executed, it’s easy to get the second/ But how do you get the second fragment using function pointers?

To chain a call exploit, the fragment must 1) end with another function-pointer call; 2) it must find and access the FP to call through; 3) the new FP must be changeable by the attacker to redirect the call. As FP calls are common in some languages I’ll assume that #1 can be satisfied by scanning enough code; the scan may be made more difficult by ASLR or other defenses, but won’t be impossible.

#2 depends on how hardware supports indirect call. On some architectures (x86?) you can call through a memory address as well as calling toone. On other machines (most RISC) you have to load the FP to a register before executing an indirect call. To get that second fragment, there must be a FP that can be made to point to it (#1, assumed) that the first one indirects through. Consequently, the number of fragments in the exploit program is limited by the number of different FPs used for indirection in the host program. It does the exploit no good if the end call of a candidate fragment goes through a FP that is already used by a different fragment.

This is a rather strong limitation; a fragment must not only end with a FP call, it must end with a FP call to a function not called by any other fragment. As FPs are used by the host program for function entry, this limitation means that the number of fragments must be no more than the number of functions (functions, not calls) in the host that are the target of dispatch. That number is small enough in C and C++ to make an FP exploit infeasible, but a naive Java implementation that dispatches every function would remain exploitable. So let’s say that #2 is open to an attacker, so long as the exploit program is short, in the order of a thousand or so equivalent instructions.

#3 requires that the FP be in writable memory. This seems the greatest constraint to me: most indirect calls are vtables, and those are in read-only memory (or the defender can put them there). FPs as data are certainly used, but the number of useful fragments that end with a call through a data FP is very few, and it seems unlikely that an exploit could find enough to be able to chain them into a program.

So our conclusion then, and my conclusion now, is that FP oriented programming is possible on a conventional in the abstract, but it seems unlikely that it coiukd be used in practice to construct a usable exploit. Still, unlikely is not never.

Now how does the Mill impact this? The Mill has a unique encoding because it executes two instruction streams in opposite directions from an entry point in the code. Unlike a conventional, both streams must make sense to the decoder or the hardware will fault. Sensibility is easy if the exploit fragment is a whole EBB, because the entry address has real code on both sides. However, it makes finding a fragment at a point other than at the EBB entry near impossible: on one side or the others, bits intended for a different engine must still make sense and do something useful when bit-reversed. The situation is more extreme even than jumping into an array of floating-point numbers and hoping to execute them as code.

Moreover, not only must the code make sense, but its belt behavior must also make sense, on both sides. And note that the Mill does not have a memory-indirection call operation, so the chain call to the next fragment must itself modify the belt. As fragments can only pass data along to the next fragment within the belt, a useful fragment depends not only on valid bit encoding, but also on a jigsaw-puzzle belt fit with adjacent fragments. I may be unimaginative, but I consider finding and composing such fragments to be impossible.

This leaves using existing EBBs as fragments. The nature of Mill code, as a wide-issue machine, means that there are many short (often one instruction) EBBs. However, to form a chain the EBB must end with a FP call, and the FP must be in the belt. At the right place. So in practice, the EBB must contain a load of the FP too, or a fill from scratch. Scratch cannot be overwritten by a buffer overflow, so getting the modified FP to scratch seems impossible. Mill code hoists loads, including those of FPs, so the fragment must have at least enough instructions to load the FP to the belt and call it, without screwing up the belt position order for the FP data.

Bottom line: ain’t gonna happen on a Mill.

Of course, if the code contains a function that unlocks the front door unconditionally, then an exploit can arrange for that function to be called. Bang, you’re in, on a Mill or any architecture. But this gets back to “well structured”: no program desiring security should have the access control in userland; access control should be a simple generic library run as a service.