ivan
Hardware div units are expensive and don't pipeline well. On a microcoded OOO box that doesn't matter much, but the Mill doesn't microcode and is in-order. Other machines also do division in macrocode starting from a seed value, so we knew it was possible and would fit with the rest of the machine regardless of the algorithm, and we left the details to arm-waving until later. The emulation codes are like that in general: obviously software emulation of e.g. floating-point is possible, but it's a rat's nest that needs a specialist, so the details were deferred until we had one.
We now have those specialists (Terje and Allen, and now Larry), so the arm-waving is being turned into code. Right now the bottleneck is genAsm. Speaking of which, I better get back to work :-)
PeterH
In the Compiler talk it was mentioned that launching coroutines is a user level operation. How is a process prevented from DoSing the chip by endlessly launching coroutines?
ivan
You can only DoS yourself, just like you can by coding an infinite loop. A coroutine is not an OS process; it's part of your program, and all of your coroutines collectively run under your timeslice. If you want a true OS process you have to ask the OS for it as usual. Or you can get five processes from the OS and run 20 coroutines in them; mix and match.
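As a toy illustration of the point above (not Mill code), the sketch below models one process's timeslice as a fixed step budget shared round-robin among that process's coroutines. The names `run_timeslice`, `spin`, and `budget` are hypothetical; the only thing it is meant to show is that launching more coroutines divides the budget rather than enlarging it.

```python
from collections import deque


def spin():
    """A coroutine that never finishes, like an infinite loop."""
    while True:
        yield


def run_timeslice(coroutines, budget):
    """Run one process's coroutines round-robin until its budget is spent.

    Returns the number of steps each coroutine received.
    """
    steps = [0] * len(coroutines)
    ready = deque(enumerate(coroutines))
    while budget > 0 and ready:
        idx, co = ready.popleft()
        try:
            next(co)
        except StopIteration:
            continue              # finished coroutines drop out of the queue
        steps[idx] += 1
        budget -= 1
        ready.append((idx, co))
    return steps


# One spinning coroutine or twenty: the process still gets the same 100 steps.
one = run_timeslice([spin()], 100)
many = run_timeslice([spin() for _ in range(20)], 100)
```

In this model the process that launches twenty coroutines only starves its own coroutines of steps; the total it consumes is unchanged.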
ralphbecket
I've seen mention in these fora that the Mill provides some support for garbage collectors. I'd like to ask a few questions on the topic if I may.
(1) Garbage collection typically involves a "stop the world" event where threads in the mutator are temporarily suspended, at least for the purposes of identifying roots. Stopping the world can be quite costly. Does the Mill offer any special support in this regard?
(2) Each concurrently executing thread may have pointers sitting on its belt. How can the GC get access to these during root marking?
(3) Many collectors compact memory. Since pointers can live on the belt and/or may have been squirreled away by the spiller, how should the GC take this into account?
Apologies for asking more than one question in a single post, but they are closely related.
ivan
I'll give you our most recent word on GC, but I must caution that we have not yet implemented a GC, so this may not be as final as we'd like.
1) Pointers reserve three bits for use by the garbage collector; hardware knows this and, for example, does not examine these bits when testing pointer equality. There is a set of hardware masks that are checked when a pointer is used to access memory: the three bits index into the mask, or, when storing a pointer, the concatenated bits of the pointer being stored and the address being stored to are used as the index. If the indexed bit is set then the operation traps to software. With this, one can write a generational GC that does not "stop the world": by setting the masks and allocators right, one can get a trap whenever there is an up-generation store of a pointer and add the stored pointer to its roots.
2) There are several situations in which software must be able to access program state held in the spiller. GC is one such situation; debuggers and various statistics and diagnostics tools are others. System software presents a trusted API for such inspection; there is no direct access because spiller state is not in application address space. The API interacts with the portal call mechanism and permission granting. For example, if the app makes a portal call passing a callback function which the service then calls, and then takes a segv in the callback, the debugger will be able to inspect the state of the callback and of the app, but not the state of the service that is between the two (unless it requests and is granted permission for that address space too, of course).
The API has bitwise access only to the contents of the spilled data stack, not to the spilled control stack. It provides enough information to construct a backtrace, but the debugger does not have direct access to return addresses, downstack links, and the rest of the call machinery; the Mill is very attentive to matters of security.
3) GC and other software such as debuggers can get permissions to mutate saved state such as spilled pointers.
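The mask check described in point 1 can be modelled in software roughly as follows. This is only a sketch under stated assumptions, not the Mill's actual encoding: the position of the three GC bits (top of a 64-bit pointer), the order in which the two 3-bit fields are concatenated, the use of the bits as a generation number with 0 as the youngest, the reading of "up-generation store" as a younger pointer stored into an older location, and the handler completing the store after recording the root are all assumptions, and every name here is hypothetical.

```python
GC_SHIFT = 61                      # assumption: GC bits are the top 3 of 64
ADDR_MASK = (1 << GC_SHIFT) - 1


class GcTrap(Exception):
    """Stands in for the hardware trap to software."""


def make_ptr(addr, gen):
    """Build a pointer whose three GC bits hold a generation number (0-7)."""
    return ((gen & 0b111) << GC_SHIFT) | (addr & ADDR_MASK)


def gc_bits(ptr):
    return (ptr >> GC_SHIFT) & 0b111


def ptr_equal(a, b):
    """Pointer equality ignores the GC bits."""
    return (a & ADDR_MASK) == (b & ADDR_MASK)


def check_access(ptr, access_mask):
    """Ordinary access: the three GC bits index an 8-bit mask."""
    if (access_mask >> gc_bits(ptr)) & 1:
        raise GcTrap("access")


def check_pointer_store(value, target, store_mask):
    """Pointer store: the two 3-bit fields, concatenated, index a 64-bit mask."""
    index = (gc_bits(value) << 3) | gc_bits(target)   # bit order is an assumption
    if (store_mask >> index) & 1:
        raise GcTrap("pointer store")


def generational_store_mask():
    """Set a bit for every (value, target) pair where value is younger than target."""
    mask = 0
    for v in range(8):
        for t in range(8):
            if v < t:
                mask |= 1 << ((v << 3) | t)
    return mask


def store_pointer(heap, roots, target, value, store_mask):
    """Store `value` at `target`; on a trap, record it as a root, then store."""
    try:
        check_pointer_store(value, target, store_mask)
    except GcTrap:
        roots.append(value)        # software handler: remember the stored pointer
    heap[target & ADDR_MASK] = value
```

With a mask built by `generational_store_mask()`, storing an old-generation pointer into a young object goes through untouched, while storing a young-generation pointer into an old object traps and lands in `roots`, which is the write-barrier behaviour a generational collector needs without stopping the mutator.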
ralphbecket
Apologies for the late response: thank you for your answer. This sounds as though it will make GC substantially cheaper in practice and easier to implement. I look forward to someday seeing an implementation.