Reply To: Creating an LLVM backend for the Mill

Ivan Godard

The current LLVM port effort does not use the LLVM back end as such, nor does it accept the LLVM middle-to-backend IR, which discards information that the specializer needs. The replacement for that IR is a serialization of the middle-end dataflow graph structure, or at least we think so; the work is still too early to be sure.

The intent is that the input to the specializer is in a form that permits a very fast and simple specialize step. Operation selection will already have been done in the compiler, targeting an abstract Mill on which all operations are supported.

We will also be adding a few passes to the middle-end, primarily to do operation selection. It’s good news if you have done some of these passes already. Type regularizing is certainly applicable. It’s not clear to me whether VARARGS can be handled in the compiler for us, because until we know what the belt length is (at specialize time) we don’t know how many arguments can be passed on the belt. Of course, we might ignore that and pass all VARARGS on the stack; it’s not like performance matters much for that.
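
To make the VARARGS point concrete, here is a rough Python sketch of the two options. It is only an illustration: the names split_args and all_on_stack are made up for this post, the belt lengths in the usage note are arbitrary example values, and it assumes (for simplicity) that every belt position could hold an argument, which a real convention may not allow.

```python
def split_args(args, belt_length):
    """Hypothetical split of call arguments between belt and stack.

    belt_length is only known at specialize time, so this decision
    cannot be made earlier, in the target-independent compiler.
    Assumption: every belt position may carry one argument.
    """
    if belt_length < 0:
        raise ValueError("belt_length must be non-negative")
    belt_args = list(args[:belt_length])
    stack_args = list(args[belt_length:])
    return belt_args, stack_args


def all_on_stack(args):
    """The simpler alternative: pass every variadic argument on the stack.

    This needs no knowledge of the belt length, so the compiler can
    decide it before specialization.
    """
    return [], list(args)
```

With ten arguments, split_args(list(range(10)), 8) puts eight on the belt and two on the stack, while split_args(list(range(10)), 16) puts all ten on the belt. The same call lowers differently per target, which is why the all-on-stack option is attractive for VARARGS.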

Passing large arguments by value is an interesting problem for us because a Mill call may cross a protection boundary. Such arguments must be addressable by both the caller (to pass them) and the callee (to use them), and there are semantic and security issues involved. For example, can the caller pass a struct, put a pointer to the passed struct in a global variable, and make the call, so that another thread of the same process can modify the passed struct while the callee is working on it? Such things are hard to get right; it is much easier to ignore such problems and say that exploits are an application problem, but that’s not Mill.
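
The hazard described above can be shown in a few lines of Python. This is a deterministic stand-in, not a description of how the Mill handles it: the interfere callback plays the role of the second thread, and callee, attacker_for, and demo are hypothetical names used only for illustration.

```python
import copy


def callee(arg, interfere):
    """Read the argument twice, with outside interference in between."""
    checked = arg["size"]   # callee validates the argument...
    interfere()             # ...another thread runs here (simulated)...
    used = arg["size"]      # ...then the callee uses the argument
    return checked, used


def attacker_for(shared):
    """Build a callback that modifies the caller's original struct."""
    def attack():
        shared["size"] = 4096
    return attack


def demo():
    # Argument still aliased by the caller: the callee sees the change.
    shared = {"size": 16}
    aliased = callee(shared, attacker_for(shared))

    # Argument copied at the call: the callee's view stays stable.
    shared = {"size": 16}
    snapshot = callee(copy.deepcopy(shared), attacker_for(shared))
    return aliased, snapshot
```

Here demo() returns ((16, 4096), (16, 16)): in the aliased case the value the callee checked is not the value it used, while the copied argument is unaffected. The copy is what by-value semantics promises, and the open question is how to provide it cheaply and safely across a protection boundary.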

It sounds like your expandStruct converts structs into tuples. That too is something we want to do, in part because we have to use tuples to be able to support Mill multi-result call operations. That said, we are very tempted to add pass-by-copy to C/C++ as an extension as part of our work; IN, OUT, and INOUT parameters are a much more natural way to express multiple results than tuples, IMO.
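
For readers who have not seen this kind of pass, the struct-to-tuple idea might look roughly like the Python sketch below. It is a guess at the general technique, not your expandStruct and not our implementation; Point, Rect, expand_struct, and div_mod are all hypothetical names.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Rect:
    origin: Point
    width: int
    height: int


def expand_struct(value):
    """Flatten a (possibly nested) struct into a flat tuple of scalars."""
    if is_dataclass(value) and not isinstance(value, type):
        flat = ()
        for field in fields(value):
            flat += expand_struct(getattr(value, field.name))
        return flat
    return (value,)  # a scalar becomes a one-element tuple


def div_mod(a, b):
    """A call with two results, returned as a tuple rather than a struct."""
    return a // b, a % b
```

For example, expand_struct(Rect(Point(1, 2), 3, 4)) gives (1, 2, 3, 4), and div_mod(17, 5) gives (3, 2). Once a struct is a flat tuple, each element can be treated as a separate result of a multi-result call.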

If you and your team are in the Bay Area, I’d be happy to buy you a coffee and gain what we can from your experience with LLVM. The same goes for any other Forum participant with LLVM internals experience who would like to help. You can reach me at ivan@millComputing.com.