Forum Replies Created

Viewing 2 posts - 1 through 2 (of 2 total)
  • Author
    Posts
  • lpsmith
    Participant
    Post count: 2
    in reply to: The Belt #1552

    I just watched this video, and it was very interesting. Thank you.

    As mentioned in other threads, a big issue is proper tail calls. It’s not correct to think of proper tail calls as an “optimization” that can be performed by compilers: in general, removing all tail calls from a program is a global transformation, breaking separate compilation. “Tail call optimization” really is a misnomer, it should be called “tail recursion optimization”, because proper tail calls are not necessarily recursive, and tail recursion optimization and proper tail calls can be orthogonal concerns.

    Proper tail calls can be compiled onto most instruction sets just fine; the real issue is that popular compilers do not do this. So, compiling languages such as Scheme, ML, Haskell, F#, or .NET bytecode down to C (in order to avoid implementing a native code generator) or the JVM or JavaScript (because you have no choice given your deployment targets) involves some pretty monstrous kludges and/or some amount of interpretation remaining in the generated code. But compiling them down to machine code really isn’t an issue.

    (As an aside, unlike some remarks in other threads, I wouldn’t worry one bit about closures. We know how to deal with closures just fine, thank you. I haven’t been excited by various efforts to implement them in hardware, unlike the Mill.)

    Ideally, the Mill CPU would offer a “callret” instruction, that would be just like the current “call” instruction except that the current belt and scratch pad would be discarded before the call. When (if) the called function returns, it’s results would be dumped onto whatever belt the calling function would have returned to.

    As I understand it, the Mill can almost do tail calls: simply shuffle the belt and then jump. The two problems with this, though, that the called function would not necessarily have enough scratchpad space allocated to perform its work, and that the called function would be able to read non-parameter values left in the caller’s belt and scratchpad.

    The second issue certainly does raise the specter of security issues via information leakage; I don’t know enough about the Mill CPU’s security model and potential applications to offer an educated guess about the practical relevance of this issue.

    The first issue, however, would certainly be a pretty fundamental stumbling block for proper tail calls if scratchpad space cannot be re-allocated, and a practical stumbling block if the initial scratchpad allocation instruction doesn’t also serve as a scratchpad reallocation instruction. After all, the called function should have no idea if it was called normally or called via a tail call.

    Practical applications of tail calls include loops that span compilation boundaries, big Gordian knots of mutual recursion (as often found in lexers, parsers, and state machines), preserving object oriented abstractions, and supporting the commercially-relevant .NET bytecode instruction set.

    • This reply was modified 9 years, 12 months ago by  lpsmith.
    • This reply was modified 9 years, 12 months ago by  lpsmith.
    • This reply was modified 9 years, 12 months ago by  lpsmith.
    • This reply was modified 9 years, 12 months ago by  lpsmith.
    • This reply was modified 9 years, 12 months ago by  lpsmith.
  • lpsmith
    Participant
    Post count: 2
    in reply to: The Belt #1573

    2) My understanding is that .NET has a similar limitation. This limitation may not be strictly necessary, the paper A tail-recursive machine with stack inspection (non-paywalled link) may be relevant here.

    3) It’s not strictly necessary to get rid of the execution context immediately. The important thing is that the retained context for all tail calls is bounded by a (hopefully small) constant. Perhaps it would be simpler to mark a sequence number as a tail call, and then have the spiller throw away the belt when it would have been otherwise spilled? This might not be an ideal implementation, as it would increase the pressure on the on-chip storage, but it should still be quite usable, and would probably be a more than adequate first implementation.

    5a) Scheme supports varargs and mandates proper tail calls. Chez Scheme, Ikarus, Racket, Stalin, and/or Larceny might do what you describe, but I don’t know enough about any of these implementations on that particular count. In any case, it’s probably not important.

    5b) Yup, even with a callret instruction, there would still likely be some fairly obvious reasons to perform tail recursion optimization on the Mill. Even some Scheme compilers have TRO, though they all also implement proper tail calls. If TRO was all there really was to proper tail calls, they wouldn’t be that interesting or significantly increase the expressive power of the language. But there’s a lot more to proper tail calls than TRO.

Viewing 2 posts - 1 through 2 (of 2 total)