Mill Computing, Inc. › Forums › The Mill › Architecture › Memory › Reply To: Memory
I read through the Peyton Jones paper; not sure how long ago I looked at if ever, though the Miranda paper I did read.
Some general comments:
1) The Mill design has generally been concerned with commercially important languages, and we haven’t put much thought into support for esoterica, which Haskell still is 🙂 However, we want to provide such support, but don’t really know enough about the issues and details at this point.
2) The scheme of using global state over function calls as a way to do tail calls doesn’t work on a Mill; call has much more complex semantics than just changing the code pointer. However, the Mill supports label variables and indirect branches, which should provide a satisfactory substitute.
3) The data structure in which there is a pointer to a closure which contains a pointer to an info record which contains a pointer to the code would be very painful on any modern machine, Mill included. This implies that a closure should contain both the code and the arguments so that it can simply be entered with a single indirection, which is free if the closure pointer is already on the belt. However, there are issues with mixed code and data, because the Mill uses a Harvard architecture with separate i$ and d$. There are also issues with protection; the PLB has byte granularity, but you *really* don’t want a PLB entry for every closure! As a naive alternative, the closure could contain the bindings and an info pointer and also a code pointer, eliminating one but not both indirections. I have to think about what the predictor consequences of that would be.
4) One possibility is to use the function pointer representation of lexically nested functions as the representation of a closure pointer. That is, a closure pointer is the pair {code pointer, binding pointer}, as is used e.g. for Pascal and other languages with lexical nesting. We expect this form to be used for lambdas in C++. The drawback is that function pointers become 128 bits, which is OK on members with quad but not so good on smaller members.
5) An alternative is to generate a thunk for each lambda or statically nested function, where the thunk loads a static pointer to the binding (or outer frame) and then jumps to the actual code. This approach is used by gcc for lambdas I think.
6) both of the above have issues with where to put the data when there’s no GC; not relevant for Haskell and friends, but very relevant for C++.
7) The natural place for the Node register is inpReg. There are interesting issues when the closure is in a different protection domain (“turf”) than the caller, which inpReg helps with (sorry, NYF). In particular, the caller must not be able to muck with the closure data before or during the call. This again suggests a thunk-like approach, but again the PLB can’t tolerate one thunk per closure-instance and even one per bind-clause may have performance issues.
I’m probably showing my ignorance here 🙁 We will want to revisit this when the full machine is public.