A random smattering of questions
I’m new to the Mill and have just sat through the various videos that have been posted. I still have some questions; if any of these have already been answered, I apologize.
1) What is NaR + NaR? Whose metadata is used?
I would like the NaR that was generated first to be used, but given the format of your metadata, a one-side-wins approach is all I can hope for.

2) A process (thread 8 in turf 5) spawns and dispatches a new thread (thread 22475). That thread (22475) kills itself. What is the OS to do about the original thread (8)? It can’t dispatch it: the process has no threads parked in the kernel turf. Can the OS only kill that process? What happens to the resources used by the thread? Stated alternatively: can I create a convoluted process by which a poorly written OS can leak machine resources?
3) In units of ALUs, how much hardware are you throwing at making a call a one-cycle operation?
4) Will all operations have the same latency across all Mill chips? Your examples have always shown multiply as a three-cycle operation. Would a Mill ever be delivered with, for instance, a five-cycle multiplier?
You haven’t done the talk on virtualization yet, so this may be answered later….
Popek and Goldberg’s virtualization requirements ( https://en.wikipedia.org/wiki/Popek_and_Goldberg_virtualization_requirements ) assert that all sensitive instructions must be privileged for efficient virtualization. You have repeatedly asserted that the Mill does not have privileged instructions. That would imply that either you have no sensitive instructions or the Mill is not virtualizable in the sense of Popek and Goldberg.

5) The idea behind virtualization, for many people, is the ability to run an unmodified OS (as from Microsoft or Apple) as an application within another OS. This is so that one can sell a single computer to multiple dupes under a “cloud” computing scam, or allow one to use Microsoft Office natively running on Windows which is magically running on Linux. Will the Mill be virtualizable in this sense?
4) Will all operations have the same latency across all Mill chips? Your examples have always shown multiply as a three-cycle operation. Would a Mill ever be delivered with, for instance, a five-cycle multiplier?
It’s been mentioned in a few talks that the latencies are dealt with in the specializer, so they’re allowed to vary across members. I don’t remember off-hand whether any hardware instructions vary in latency, but binary and decimal floating-point were given as examples of code where the hardware implementation might not exist at all and the specializer would implement it in terms of a library call or inline instructions, effectively changing the latency of the operation.
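To make that concrete, here is a minimal sketch in C of the kind of decision described above; the type, field, and function names are invented for illustration and are not the actual specializer interface:

/* Illustrative only: invented names, not the real Mill specializer API. */
#include <stdbool.h>

typedef struct {
    bool has_hw_fmul;     /* member implements binary FP multiply in hardware */
    int  hw_fmul_latency; /* result latency in cycles from the member specification */
} member_spec;

typedef enum { EMIT_HW_OP, EMIT_CALL } lowering;

/* Decide how a member-independent "fmul" gets lowered for one target member. */
static lowering lower_fmul(const member_spec *m, int *latency)
{
    if (m->has_hw_fmul) {
        *latency = m->hw_fmul_latency; /* consumers are scheduled this many cycles later */
        return EMIT_HW_OP;
    }
    *latency = -1;                     /* effective latency is that of the emulation routine */
    return EMIT_CALL;                  /* expand to inline code or a library call instead */
}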
-
1) What is NaR + NaR? Whose metadata is used?
It will be one of the inputs; which one is implementation dependent.
-
2) A process (thread 8 in turf 5) spawns and dispatches a new thread (thread 22475). That thread (22475) kills itself. What is the OS to do about the original thread (8)? It can’t dispatch it: the process has no threads parked in the kernel turf. Can the OS only kill that process? What happens to the resources used by the thread?
OS (and RTS) policy. For example, an exception handler within the thread being unwound might dispatch out. There are some bottom turtle issues in thread death, just as there are for permission revoke, and we are not sure we have everything covered; there may be additional helper operations added to the ISA in the future as we (and the OS implementors) gain more experience.
-
Stated alternatively: can I create a convoluted process by which a poorly written OS can leak machine resources?
Of course, FSVO poor.
-
3) In units of ALUs, how much hardware are you throwing at making a call a one-cycle operation?
As compared to, say, a 5-cycle call operation, very little. As compared to a call without the spiller, quite a bit, but hard to measure in those units. The spiller has internal SRAM for skid buffering, probably about as much as a top-level cache – how many ALUs is a cache? The rest of the spiller and the belt are roughly the same as the bypasses on a conventional machine. Essentially all the call cost is in the spiller.
-
4) Will all operations have the same latency across all Mill chips? Your examples have always shown multiply as a three-cycle operation. Would a Mill ever be delivered with, for instance, a five-cycle multiplier?
Hardware latency is specified individually on a per-FU basis in our configuration tools. Not only can latencies vary across family members, they can also vary per slot within a single member, so a chip could have a fast (expensive) multiplier and also a slow (cheap) one.
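As an illustration (the names below are invented and not the actual configuration-tool schema), a member specification could carry one latency entry per functional unit per slot, so the specializer could schedule a three-cycle multiply in one slot and a five-cycle multiply in another:

/* Hypothetical per-FU latency table for one member; illustrative names only. */
typedef struct {
    const char *op;      /* operation this functional unit implements */
    int         slot;    /* which slot the FU sits in */
    int         latency; /* result latency in cycles on this member */
} fu_entry;

static const fu_entry example_member_fus[] = {
    { "mul", 0, 3 },  /* fast (expensive) multiplier */
    { "mul", 3, 5 },  /* slow (cheap) multiplier */
    { "add", 0, 1 },
    { "add", 1, 1 },
};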
-
You haven’t done the talk on virtualization yet, so this may be answered later….
Popek and Goldberg’s virtualization requirements ( https://en.wikipedia.org/wiki/Popek_and_Goldberg_virtualization_requirements ) assert that all sensitive instructions must be privileged for efficient virtualization. You have repeatedly asserted that the Mill does not have privileged instructions. That would imply that either you have no sensitive instructions or the Mill is not virtualizable in the sense of Popek and Goldberg.

There are no sensitive operations. All control is via the address space and the protection mechanism.
-
5) The idea behind virtualization, for many people, is the ability to run an unmodified OS (as from Microsoft or Apple) as an application within another OS. This is so that one can sell a single computer to multiple dupes under a “cloud” computing scam, or allow one to use Microsoft Office natively running on Windows which is magically running on Linux. Will the Mill be virtualizable in this sense?
Not yet announced because NYF.
1) What is NaR + NaR? Whose metadata is used?
It will be one of the inputs; which one is implementation dependent.

So if it were a case of Null + some NaR that triggers an exception when written to memory, the behavior would depend on the particular Mill core? I was expecting some priority between the various NaRs.
There is no “Null”. Perhaps you are thinking of “None”, which is a distinct meta-state and is not a NaR in the machine semantics, albeit encoded in a similar way. “None” is preserving (None+NaR->None), but also None+None is implementation dependent.
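Putting those rules in one place, here is a minimal kind-level sketch in C; the enum and function are illustrative only, not how the hardware actually represents or merges metadata:

/* Illustrative only: kind-level summary of the propagation rules above. */
typedef enum { KIND_VALUE, KIND_NAR, KIND_NONE } meta_kind;

static meta_kind merge_kinds(meta_kind a, meta_kind b)
{
    if (a == KIND_NONE || b == KIND_NONE)
        return KIND_NONE;  /* None is preserving: None + NaR -> None */
    if (a == KIND_NAR || b == KIND_NAR)
        return KIND_NAR;   /* NaR + NaR -> NaR; which input's payload survives
                              is implementation dependent */
    return KIND_VALUE;     /* two ordinary operands: do the arithmetic */
}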
My first two questions were originally about NaR + None, but then I found a post that linked to this: https://millcomputing.com/topic/metadata/#post-558. To me, it makes the most sense given the use None is supposed to have, but I could be wrong.
Thinking more deeply, it really doesn’t matter whose metadata for a NaR wins. While I would want the location of the first generated NaR, it actually isn’t helpful. A pseudocode example:
A = 5 / 0
B = 6 / 0
C = A + B
If this is the code order, there is nothing that stops modern programming languages from reordering the divides as they see fit. Ideally, an unoptimized binary wouldn’t. Ideally.
As for the fast/slow multiplier question: my mistake was in what form genForm takes. I was imagining it as closer to the form the hardware takes, which would make some specialization more difficult.