Reply To: coroutines & greenlets in Mill

Ivan Godard
Keymaster
Post count: 689

Sorry – you’ll see NYF here a lot. It stands for Not Yet Filed, as in patents, and implies that we can’t answer the question without an NDA due to USPTO rules.

JIT code generation is conventional. To save the JIT from needing to do scheduling and bit packing, the JIT will create Mill load module abstract code and then call the same library that the specializer uses to get executable bits for the host machine. Thus the same JIT runs on all Mill members, even though the binary encoding varies by member.

The Mill has special support for GC, in the form of the event bits in the pointer format. See the Memory talk IIRC.

VARARGS is supported.

As the language RTS will need to keep state, including a possibly large stack, for each of those thousands of greenlet threads, the limiting factor for any CPU, Mill included, is probably thrashing in the caches. Even if each greenlet thread uses only the 4KB initial stacklet, a thousand of them come to about 4MB, which would completely saturate the L2 of a modern CPU, leaving no room for code, OS, or any other process. So the stacks would get evicted to DRAM, and switching to a new greenlet would require reloading its stack from DRAM. The result would be hopelessly slow on any CPU architecture. You would run out of cache long before you ran out of thread ids.
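To put rough numbers on the cache argument, here is a back-of-the-envelope sketch. The 4KB stacklet figure is from the paragraph above; the 4 MiB L2 size and the function names are my assumptions for illustration only, and real cache sizes vary by CPU.

```python
# Rough working-set estimate for greenlet stacks versus L2 capacity.
STACKLET_BYTES = 4 * 1024        # 4KB initial stacklet (figure from the post)
L2_BYTES = 4 * 1024 * 1024       # ASSUMPTION: a 4 MiB L2; varies by CPU


def stack_footprint(greenlets: int, stacklet_bytes: int = STACKLET_BYTES) -> int:
    """Total bytes of stack if every greenlet holds one initial stacklet."""
    return greenlets * stacklet_bytes


def l2_fraction(greenlets: int, l2_bytes: int = L2_BYTES) -> float:
    """Stack footprint as a fraction of the assumed L2 capacity."""
    return stack_footprint(greenlets) / l2_bytes


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} greenlets: {stack_footprint(n):>10} bytes, "
              f"{l2_fraction(n):.2f}x of assumed L2")
```

Under these assumptions the stacks alone roughly fill the L2 at around a thousand greenlets, before counting code, the OS, or other processes.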

If the greenlets are not really threads but small closures that transiently use a stack when they get invoked, then the state requirements become much smaller and the cache issues go away. However, there is then no reason to treat a greenlet as a thread requiring an id in the Mill sense; the greenlets are just a collection of cross-calling closures, each identified by the address of its state object, and there is only one real stack and hence only one real thread and only one real thread id.
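A minimal sketch of the closure model, in Python rather than anything Mill-specific: each greenlet is only a small state object, and it borrows the one real stack for the duration of a call. The `Task` class, its `step` method, and the round-robin `run_all` loop are hypothetical names chosen for illustration, not part of any Mill or greenlet API.

```python
from collections import deque


class Task:
    """Hypothetical greenlet-as-closure: a small state object, no stack of its own."""

    def __init__(self, name: str, count: int):
        self.name = name
        self.remaining = count

    def step(self, log: list) -> bool:
        # Runs on the caller's stack and returns before anyone else runs,
        # so nothing but this object's fields survives between activations.
        log.append((self.name, self.remaining))
        self.remaining -= 1
        return self.remaining > 0   # True means "activate me again later"


def run_all(tasks) -> list:
    """Round-robin every task on the single real stack of the calling thread."""
    log = []
    ready = deque(tasks)
    while ready:
        task = ready.popleft()      # the task is identified by its state object
        if task.step(log):
            ready.append(task)
    return log


if __name__ == "__main__":
    print(run_all([Task("a", 2), Task("b", 1)]))
```

The per-greenlet cost here is the size of the state object, not a stacklet, which is why the cache pressure largely disappears in this model.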

Even if the greenlet/closures admit callbacks (and hence cannot use a single stack without GC), you can multiplex them across a small set of true threads (Mill sense), where the thread pool size is determined by the number of concurrent cross-activations, which I suspect is orders of magnitude smaller than the number of greenlets in the use-case.
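The multiplexing idea can be sketched with ordinary OS threads standing in for Mill threads, which is an assumption made purely for illustration. The `run_multiplexed` function and its parameters are hypothetical; the point is only that many activations can share a handful of real thread ids.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_multiplexed(num_activations: int, pool_size: int):
    """Run many short activations on a small pool of real threads.

    Returns the results in submission order and the number of distinct
    thread ids that actually executed an activation.
    """
    used_threads = set()
    lock = threading.Lock()

    def activation(i: int) -> int:
        # Record which real thread this activation landed on.
        with lock:
            used_threads.add(threading.get_ident())
        return i * i            # stand-in for the greenlet's real work

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(activation, range(num_activations)))
    return results, len(used_threads)


if __name__ == "__main__":
    results, thread_count = run_multiplexed(1000, 4)
    print(len(results), "activations ran on", thread_count, "threads")
```

The pool size is chosen by how many activations need to be live at once, not by how many greenlets exist in total.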

In all of these cases the Mill hardware should make the implementation of greenlets easier than on a conventional architecture. But the details depend on the use-case and what the designers of the software have in mind.