Mill Computing, Inc. Forums The Mill Architecture coroutines & greenlets in Mill

  • cheery
    Participant
    Post count: 7
    #843 |

    I suspect the staff already knows what coroutines or greenlets (green threads) are, but I won’t leave ambiguity here.

    Coroutines are like functions, except that you can return from them and continue with the evaluation later. They’re used to simplify program flow; for example, Python’s generators are coroutines.
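As an illustration of "return and continue later", here is a minimal sketch using a Python generator. The names `running_total`, `acc`, `first`, and `second` are made up for this example and do not come from the thread.

```python
def running_total():
    """Generator-based coroutine: pauses at each yield and resumes later."""
    total = 0
    while True:
        # Execution stops here; send() resumes it with a new value.
        value = yield total
        total += value


acc = running_total()
next(acc)             # run up to the first yield (primes the coroutine)
first = acc.send(5)   # resume, add 5, pause again
second = acc.send(7)  # resume again with the earlier state intact
```

The local variable `total` survives between the two `send()` calls, which is the property that distinguishes a coroutine from an ordinary function call.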

    Greenlets each represent a single task, and you can switch between these tasks with simple commands. Python’s greenlets are a good example. On the PC they have been implemented by copying the stack out to the heap whenever a greenlet switch is called, up to the point where a replacement stack can be copied into the correct position.
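To make voluntary task switching concrete, here is a rough sketch built on standard-library generators. It is not the greenlet library and does no stack copying; it only shows tasks handing control back at points they choose. The names `task`, `run_round_robin`, and `log` are hypothetical.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: does one step, then yields control voluntarily."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary switch point


def run_round_robin(tasks):
    """Resume each task in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # switch into the task
        except StopIteration:
            continue               # task finished, drop it
        ready.append(current)      # task yielded, queue it again


log = []
run_round_robin([task("a", 2, log), task("b", 3, log)])
```

The two tasks interleave their steps without any preemption: each runs only until its next `yield`.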

    I wonder how efficient or usable these concepts are on the Mill architecture?

    If you wonder why I am asking, you can look at https://github.com/cheery/midi-proxy-js/blob/master/midi-proxy.py to see some code I’ve written before. Greenlets allow representing asynchronous behavior with easily understandable code that is written just like its synchronous equivalent.

    If they were efficient on hardware, there would be more motivation to use them in dynamic languages.

    • This topic was modified 10 years ago by  cheery.
  • Ivan Godard
    Keymaster
    Post count: 689

    We know about co-routines 🙂 And light-weight processes for that matter, which are greenlets by another name.

    The good news is that the Mill supports co-routines and microthreads in hardware. The bad news is that it’s all NYF. Expect a talk on the subject at some point.

    Even without the NYF stuff, the hardware threads described in the Security talk can be used as greenlets. It’s a policy decision whether a given thread spawn produces an OS-known thread (and hence subject to preemptive multitasking), or does not and produces a greenlet that does voluntary multitasking. An OS swap-out saves the relevant specRegs in the current TCB. If the TCB is bound to a particular thread id then you have heavy-weight tasking. If the TCB saves (and eventually restores) the current thread id, whatever it is, then you have a greenlet group. The OS task switch code needs to be aware of the possibility of greenlets, but it makes no difference to the hardware.

    At that level at least. There’s more, deeper, but NYF. Sorry.

  • cheery
    Participant
    Post count: 7

    It would be interesting to hear how other things associated with higher-order programming translate over. For instance JIT (code generation), garbage collection, or calls to native functions with an arbitrary number of arguments (FFI).

    One characteristic claimed for greenlets is that they are lightweight enough to allow thousands of them to exist simultaneously. Even with hardware support, is that possible?

    And what does the abbreviation NYF stand for? Is it an abbreviation of something in the first place? I didn’t find a description of it.

    • This reply was modified 10 years ago by  cheery.
    • Ivan Godard
      Keymaster
      Post count: 689

      Sorry – you’ll see NYF here a lot. It stands for Not Yet Filed, as in patents, and implies that we can’t answer the question without an NDA due to USPTO rules.

      JIT code generation is conventional. To save the JIT from needing to do scheduling and bit packing, the JIT will create Mill load module abstract code and then call the same library that the specializer uses to get executable bits for the host machine. Thus the same JIT runs on all Mill members, even though the binary encoding varies by member.

      The Mill has special support for GC, in the form of the event bits in the pointer format. See the Memory talk IIRC.

      VARARGS is supported.

      As the language RTS will need to keep state, including a possibly large stack, for each of those thousands of greenlet threads, the limiting factor for any CPU, Mill included, is probably thrashing in the caches. Even if each greenlet thread uses only the 4KB initial stacklet, a thousand would completely saturate the L2 of a modern CPU, leaving no room for code, OS, or any other process. So they would get evicted to DRAM, and switching to a new greenlet would require reloading from DRAM. The result would be hopelessly slow on any CPU architecture. You would run out of cache long before you ran out of thread ids.
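The arithmetic behind the cache argument can be checked quickly. The 4KB stacklet and the count of a thousand come from the post; the 1 MiB L2 size is only an assumed figure for comparison, not a number from the thread, and real L2 sizes vary by CPU.

```python
STACKLET_BYTES = 4 * 1024            # 4KB initial stacklet, from the post
GREENLETS = 1000                     # "a thousand", from the post
ASSUMED_L2_BYTES = 1 * 1024 * 1024   # assumption: a 1 MiB L2 cache

# Total stack footprint if every greenlet keeps only its initial stacklet.
footprint = STACKLET_BYTES * GREENLETS
footprint_mib = footprint / (1024 * 1024)

# How many times over the assumed L2 that footprint would fill.
l2_multiples = footprint / ASSUMED_L2_BYTES
```

Under that assumption the stacklets alone come to roughly 3.9 MiB, several times the cache, before counting any code or other data.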

      If the greenlets are not really threads but are closures (small closures) that transiently use a stack when they get invoked, then the state requirements become much less and the cache issues go away. However there is then no reason to treat the greenlet as a thread requiring an id in the Mill sense; they are just a collection of cross-calling closures that can be identified by the address of the state object, and there is only one real stack and hence only one real thread and only one real thread id.
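A loose sketch of the closure variant, in ordinary Python rather than anything Mill-specific: each "greenlet" is just a small state object, and all of them are invoked from one thread, so they share one stack. The names `Counter`, `step`, and `run_all` are invented for this example.

```python
class Counter:
    """Small state object; stands in for one closure-style greenlet."""

    def __init__(self, limit):
        self.count = 0
        self.limit = limit

    def step(self):
        """Run briefly on the shared stack; return True while work remains."""
        self.count += 1
        return self.count < self.limit


def run_all(closures):
    """Invoke each closure in turn from one ordinary thread."""
    pending = list(closures)
    while pending:
        # Each call borrows the single real stack only for its duration.
        pending = [c for c in pending if c.step()]


closures = [Counter(limit=n) for n in (1, 2, 3)]
run_all(closures)
counts = [c.count for c in closures]
```

Between invocations each closure occupies only the few bytes of its state object, which is why the per-greenlet memory cost drops so sharply compared with one stack per greenlet.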

      Even if the greenlet/closures admit callbacks (and hence cannot use a single stack without GC) you can multiplex them across a small set of true threads (Mill sense), where the thread pool size is determined by the number of concurrent cross-activations, which I suspect is orders of magnitude smaller than the number of greenlets in the use-case.
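The multiplexing idea can be sketched with Python's standard thread pool. This assumes Python OS threads as a stand-in for true threads in the Mill sense, which they are not; the pool size of 4 and the `activate` function are likewise placeholders chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4         # assumption: a handful of true threads
NUM_GREENLETS = 1000  # many more logical tasks than threads


def activate(state):
    """One activation of a closure-style greenlet (placeholder work)."""
    return state * 2


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    # map() hands each activation to whichever pool thread is free
    # and returns the results in submission order.
    results = list(pool.map(activate, range(NUM_GREENLETS)))
```

Only `POOL_SIZE` stacks exist at any moment, however many logical greenlets are queued behind them.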

      In all of these cases the Mill hardware should make the implementation of greenlets easier than on a conventional architecture. But the details depend on the use-case and what the designers of the software have in mind.
