Forum Replies Created

Viewing 15 posts - 16 through 30 (of 74 total)
  • Author
    Posts
  • LarryP
    Participant
    Post count: 78
    in reply to: Execution #1718

    Changes in the divide instructions — now native on Silver?

    Greetings all,

    According to the wiki: http://millcomputing.com/wiki/Instruction_Set/divu

    Divide operations are no longer all emulated via loops around helper operations (which are less expensive in hardware). This is an interesting departure from the earlier strategy.
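
    For readers unfamiliar with the emulation strategy, here is a minimal sketch of an unsigned divide built by looping around a cheap one-bit step (classic restoring shift-and-subtract division). This is a generic illustration only: the name divu_emulated and the 32-bit default width are assumptions, and the step shown is not the Mill's actual helper operation.

```python
def divu_emulated(dividend: int, divisor: int, width: int = 32) -> tuple[int, int]:
    """Unsigned divide emulated by looping over a one-bit helper step.

    Hypothetical illustration; not the Mill's helper op or its divu semantics.
    """
    mask = (1 << width) - 1
    dividend &= mask
    divisor &= mask
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")

    quotient = 0
    remainder = 0
    # One iteration per result bit, most significant bit first.
    for bit in range(width - 1, -1, -1):
        # Helper step: shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        # Subtract the divisor when it fits and record a 1 bit in the quotient.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(divu_emulated(100, 7))
```

    The loop always runs once per bit of the operand width, which is the kind of fixed per-divide cost that makes divide-heavy code slow under emulation.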

    It seemed to me that this might happen at some point, for some Mill core(s), but I expected that it wasn’t on the critical path, so I’m interested in why it was changed now.

    In some ways, this reminds me of what seemed to happen with the ARM7 series of chips. The base ARM7 lacked hardware divide, and although there were software emulations, I think ARM licensed far more ARM7s with hardware divide than without. I think system designers were worried about worst-case soft-divide performance, should the software guys insist on writing divide-heavy code. (Shame on them!)

    Thoughts?

  • LarryP
    Participant
    Post count: 78

    Will,

    Thanks for the info about a talk just a few weeks away. Such a talk, especially a keynote address at the LLVM conference, strongly implies some good news coming! 😉

    I hope you folks can get materials from that talk online ASAP! Personally, I’d be happy to see the slides sooner by themselves, rather than having to wait for an edited video. Will Millcomputing be getting audio/video recording(s) of Ivan’s keynote? (Hope so!)

  • LarryP
    Participant
    Post count: 78

    Will, Ivan, et al:

    Could you (if the IP issues permit) give us any more detail on the LLVM-for-Mill port and its status? In at least one of Ivan’s presentations, Ivan indicated that “major surgery” was needed to make LLVM compile Mill code, because LLVM assumed a register-machine target, which the Mill is not. Will’s statement seems to say the port is well in hand, but says nothing about how or what you had to change to accomplish that. If that “major surgery” succeeded (or was handled in another way), telling the community of that significant success would IMHO help community confidence and probably ditto for prospective investors.

    Of course, a talk on the toolchain — preferably including a demo of the compiler, specializer and simulator — would be fantastic! — though I know you guys are very busy. IMHO, the only thing better would be an alpha release of the toolchain and simulator! Please, tell us what you safely can, as soon as you can.

    Thanks,

    Larry

  • LarryP
    Participant
    Post count: 78
    in reply to: Execution #1657

    A subtle difference between hardware-native operations and emulation sub-graphs? (Re Mill Boolean predicate gangs)

    According to the presentation on execution, exu-side/op-phase operations can pass condition codes to the next higher exu slot — which can then turn those codes into a Boolean result if so programmed.

    If some Mill operation (let’s call it foo) is not native on a member, then foo will have to be emulated on that member by a sub-graph (which may be an emulating function call or an inlined sequence of operations). However, since this emulated foo will not have a slot of its own, I don’t see an obvious way for it to mimic a hardware-native foo’s ability to pass a CC set to another functional unit “horizontally” as a predicate gang.

    How does the Mill architecture and tool-chain handle this apparent difference between native and emulated operations?

    Thanks in advance!

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1634

    Ivan,

    Are there any instructions that ask for/demand a pre-fetch of code from a *different* ebb?

    Since the Forth data stack will almost certainly be in memory, the interpreter will likely often spend time loading operands from the data stack onto its own belt, before calling functions that implement Forth primitives. So if there’s a way to get code (e.g. for an upcoming primitive) fetched ahead of time, that would help speed up the interpreter loop considerably.

    For a first-ever Mill interpreter, I’m more concerned with getting something to work correctly (with understandable code — a challenge in assembler!) than I am with optimal efficiency. That said, fast execution is certainly a plus.

    Thanks,

    Larry

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1623

    David,

    Good to hear of another person interested in Forth on the Mill.

    I started three wiki pages about “Mill Forth.” See:

    http://millcomputing.com/wiki/Forth_for_the_Mill_and/or_Forth-like_variants,_better_suited_to_the_Mill

    http://millcomputing.com/wiki/Preliminary_Design_for_Mill-Forth

    http://millcomputing.com/wiki/How_best_to_map_the_core_components_of_Forth_onto_the_Mill%3F

    Please take a look and feel free to add your ideas, perspectives and useful links. Same for anybody else reading. If you’re logged into the forums, you should also be logged into the wiki.

    The idea of using the belt and scratchpad to serve as Forth’s data stack is attractive for speed reasons. However, I think that’s not the right approach for an initial version, since the belt doesn’t model a LIFO stack. While I’d like to let the Mill’s call/return machinery be Forth’s return stack, I don’t think that’ll work well, because the inner interpreter needs read/write access to the return stack for more than just call/return matters, including for the common DO … LOOP constructs. Breaking the interpreter’s ability to manipulate the return stack would IMHO make the result too far from Forth.

    Overall, I agree that the only people likely to be porting much existing Forth code to the Mill are those of us interested in Mill Forth. Still, my instinct is to design something that’s recognizably Forth, for the first iteration. A belt-oriented interpreted language and a type-aware Forth are both interesting notions, but I’d like to separate those from getting a Forth interpreter running on the Mill (initially under sim.)

    So, my instinct is to go for a fairly vanilla port of Forth for the first iteration, with both stacks in memory. Doing so does not preclude using the Mill’s stack pointer, though the automatic stack cut-back upon true function return might make that problematic. If the inner interpreter never returns, though, we might be able to use it that way.
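
    To make the "both stacks in memory" idea concrete, here is a toy sketch of an interpreter loop in which DO, I and LOOP read and write the return stack directly. It is a sketch under stated assumptions, not a design for Mill Forth: the function name run, the list-based stacks and the tiny word set are all made up for illustration, and a real Forth would compile DO … LOOP rather than interpret it token by token.

```python
def run(program):
    """Interpret a tiny Forth-like token list; returns the values printed by '.'."""
    data = []      # data stack, kept in ordinary memory
    rstack = []    # return stack, also in memory and visible to the interpreter
    out = []
    ip = 0
    while ip < len(program):          # stands in for the endless interpreter loop
        word = program[ip]
        ip += 1
        if isinstance(word, int):
            data.append(word)         # literals are pushed on the data stack
        elif word == "+":
            b, a = data.pop(), data.pop()
            data.append(a + b)
        elif word == "DO":            # usage: limit start DO ... LOOP
            start, limit = data.pop(), data.pop()
            rstack.extend([ip, limit, start])   # body address, limit, index
        elif word == "I":
            data.append(rstack[-1])   # read the loop index from the return stack
        elif word == "LOOP":
            rstack[-1] += 1           # write the loop index in place
            if rstack[-1] < rstack[-2]:
                ip = rstack[-3]       # branch back to the loop body
            else:
                del rstack[-3:]       # drop the loop frame
        elif word == ".":
            out.append(data.pop())
        else:
            raise ValueError(f"unknown word: {word!r}")
    return out


# Sum the loop indices 0..4 into an accumulator: 0 5 0 DO I + LOOP .
print(run([0, 5, 0, "DO", "I", "+", "LOOP", "."]))
```

    The point of the sketch is that I and LOOP need to inspect and update return-stack entries, which hardware-managed call/return state would not obviously allow.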

    Happy New Year,

    Larry

    • This reply was modified 9 years, 10 months ago by  LarryP. Reason: clarity
  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1619

    Thanks to Will, we now have a place on the Wiki for community ideas. If you’re logged into the forums, you can add your comments and ideas there. Feel free to chime in!

    Under this category on the top-level wiki page, I’ve started the http://millcomputing.com/wiki/Software page. (Puzzlingly, the “Software” link under the category “Community” is still appearing in red, despite the fact that it’s not empty. I don’t know why.)

    Under that, I’ve started a page about Forth or similar interpreters for the Mill; see
    http://millcomputing.com/wiki/Forth_for_the_Mill_and/or_Forth-like_variants,_better_suited_to_the_Mill

    From what I can tell, if you’re logged into the forums, you are also logged into the wiki, and can edit. I’m confining my edits to the discussion/talk tab for anything except those pages under the Community column on the top-level wiki page.

    • This reply was modified 9 years, 10 months ago by  LarryP. Reason: clarity
  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1618

    Will,

    I don’t know when that Ertl paper was/wasn’t paywalled. It turned up in a Google search I did today; I recognized the author and took a look. I hope it stays available.

    GCC computed gotos? Not the infamous “computed come from” statement? 😉

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1615

    Greetings all,

    Those of you interested in an interpreter/interpreted language for the Mill may enjoy reading the following paper, available (as of today) free online:
    “The Structure and Performance of Efficient Interpreters” by M. Anton Ertl and David Gregg. Here’s the link: http://www.jilp.org/vol5/v5paper12.pdf

    One thing this paper focuses on is that on many CPUs, branch prediction for indirect branches (e.g. via function pointers) is often much poorer than branch prediction for direct branches (like simple if/then/else constructs). I suppose I should re-watch the presentation on the Mill’s prediction mechanism, to see what’s revealed about how the Mill will handle indirect branches. This issue is much less important for compiled code than for interpreted code, but apparently still gets involved in run-time method dispatch, e.g. in polymorphic objects.
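
    As a rough illustration of why interpreter dispatch is hard on indirect-branch predictors, here is a toy model of a "predict the last target" branch target buffer. It compares a single shared dispatch branch (as in a switch-based interpreter) with one dispatch branch per opcode routine (as in threaded code). The model, the function name mispredictions and the opcode trace are assumptions for illustration; it says nothing about how the Mill predicts.

```python
def mispredictions(trace, threaded):
    """Count misses for a BTB that predicts each branch site's previous target.

    trace: sequence of executed opcodes.
    threaded: if True, every opcode routine ends in its own dispatch branch;
              if False, all dispatches share one branch site.
    """
    btb = {}       # branch site -> last observed target
    misses = 0
    for current, nxt in zip(trace, trace[1:]):
        site = current if threaded else "switch"
        if btb.get(site) != nxt:
            misses += 1
        btb[site] = nxt            # remember the most recent target
    return misses


# An interpreted loop whose body is the opcode sequence A B C, run 100 times.
trace = ["A", "B", "C"] * 100
print("shared dispatch branch:", mispredictions(trace, threaded=False))
print("per-opcode dispatch branches:", mispredictions(trace, threaded=True))
```

    In this toy trace the shared branch never sees the same target twice in a row, while each per-opcode branch always goes to the same successor once warmed up.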

    I note that Yale Patt’s work is cited several times in this paper and I recall reading other interesting papers by Ertl.

    Enjoy!

  • LarryP
    Participant
    Post count: 78

    David,

    Endianness: Internally little-endian, your choice for memory.

    From one of Ivan’s posts http://millcomputing.com/topic/metadata/#post-318
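
    As a quick reminder of what the two byte orders mean for a value in memory, here is a generic sketch using only Python's built-in integer conversions. The value 0x11223344 is arbitrary, and nothing here models how the Mill's loads and stores select a byte order.

```python
value = 0x11223344

# Little-endian: least significant byte is stored first.
little = value.to_bytes(4, "little")
# Big-endian: most significant byte is stored first.
big = value.to_bytes(4, "big")

print(little.hex(), big.hex())

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(little, "little") == value
assert int.from_bytes(big, "big") == value
```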

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1706

    Ivan wrote:

    We did decide that we would release, and support, alpha-level tooling on a “use at your own risk” basis. A minority felt that we should wait for commercial-grade for fear that alpha-grade would forever paint us with “broken and unusable”, but the majority felt that we should not try to hide our rough-shod nature.

    I’m glad that some Mill software will be released in some fashion before reaching commercial grade. I’m happy to support any NDA or other terms/guidelines to make such an alpha-grade release help — rather than hinder — bringing the Mill architecture to market.

    It seems that the genAsm assembler, specializer and a corresponding Mill simulator would be the first pieces of the toolchain available. So I’m more focused on what we can do with those tools. Forth (or an interpreted language in that vein) seems doable with that partial Mill toolchain. I don’t know enough about functional languages, though this discussion has made me learn a little about them.

  • LarryP
    Participant
    Post count: 78
    in reply to: Execution #1674

    Renesac’s questions lead me to ask two more:

    1. Of the seven phases identified in the Wiki http://millcomputing.com/wiki/Phasing , which ones *require* a clock cycle vs. which ones do not? Based on Ivan’s most recent comment and the Wiki’s entry, my best guess is that pick, conform and transfer do not require, of themselves, another clock cycle. Ivan & Co., please correct me if I’ve misread things.

    2. In the execution talk and in the Wiki under phasing, we have diagrams of the reader/op/writer phases happening in steady state — but only for that case. What I’d like is some explanation of what happens when adjacent instructions do not all use the same phases. In such cases, are there essentially bubbles in the pipeline?

    IMHO, a diagram would be very good for illustrating this case and hopefully explaining how the Mill executes when adjacent instructions use different phases from one another.
    —-

    Side note:
    The presentation on instruction decoding presents details most relevant to the exu-side decoding — but the flow-side encoding is substantially different in terms of how the header info and blocks are used! So the Wiki entry http://millcomputing.com/wiki/Decode (specifically the section titled “Flow Stream”) is good reading to understand what Ivan wrote about how and why the con operation is encoded as it is.

    • This reply was modified 9 years, 9 months ago by  LarryP.
  • LarryP
    Participant
    Post count: 78
    in reply to: Memory #1653

    The linked article mentions the Mill CPU architecture just once, in a sentence so vague that I cannot honestly tell what the author believes to be true about the Mill:

    BTW, this is a major reason I’m skeptical of the Mill architecture. Putting aside arguments about whether or not they’ll live up to their performance claims and that every chip startup I can think of failed to hit their power/performance targets, being technically excellent isn’t, in and of itself, a business model.

    The first word of the above quote is such a vague reference (the previous paragraphs were about memory barriers and what is/isn’t guaranteed on multiprocessor systems), that I cannot tell what the author is trying to say about the Mill. Has anyone else read the linked article, and better understood what the author was trying to convey?

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1630

    Ivan,

    Can genAsm’s go construct do backwards branches (that is, to a label earlier in the same function)?

    If so, then that’s sufficient for writing the endless loop of an interpreter (though I’ll welcome the syntactic sugar of better looping constructs.)

    From what I can tell, memory operations and statically allocated memory are the key other pieces needed to write a (very) basic Forth interpreter, e.g. one that starts out handling only scalar integers and pointers.

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1625

    David,

    You wrote:

    I’ve updated the wiki page on genAsm to fix some EBNF syntactic issues, and have the spec itself parsing correctly now.

    Please note that most of the wiki pages under architecture or infrastructure (except for the talk/discussion tab) are periodically overwritten from non-wiki source via the “generator.” Therefore your changes are likely to get overwritten. I strongly suggest you advise Ivan of your suggested changes and let him update the source upstream of the wiki.

    Offhand, I don’t see much benefit of interpreting genAsm, since to get anywhere, we’ll already have access to compiled genAsm. (Though feel free to convince me otherwise.) I’d rather write a more nearly user-friendly interpreter, like Forth.
