Forum Replies Created

Viewing 15 posts - 31 through 45 (of 74 total)
  • LarryP
    Participant
    Post count: 78

    Ivan,

    Sorry my question took so much time/effort to answer. That said, I’m very pleased to gain a little glimpse into your simulator, just as I am to see that this issue had already been thought through in considerable detail. Deferring faults as far as feasible seems like a win for speculation.

    One rare downside I see is that a byte vector loaded from a NaR address will contain such short NaRs that the hash in the payload may not be much help for tracking down the source of the resulting NaRs. And since the silent behavior of Nones may be nonintuitive to people (esp. those porting existing code to the Mill), some Mill-specific compiler switches (e.g. for extra tests and warnings for Nones/NaRs as addresses) may be desirable — after LLVM is generating usable Mill code.
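To make the concern concrete, here's a toy Python illustration. This is not Mill's real NaR layout — the 3-bit "kind" field and the placement of the hash are assumptions of mine — but it shows why byte-wide NaRs can't carry much of an origin hash:

```python
# Toy illustration (not Mill's real NaR layout; the "kind" field size
# and hash placement are assumptions): the origin hash must fit in
# whatever data bits remain, so byte-wide NaRs carry very little.
KIND_BITS = 3  # assumed size of a fault-kind field in the payload

def hash_bits(element_bits):
    """Bits left over for the origin hash in a NaR of the given width."""
    return element_bits - KIND_BITS

for w in (8, 16, 32, 64):
    print(f"{w}-bit NaR: {hash_bits(w)} hash bits")
```

Under these assumptions a byte NaR's hash could only distinguish a few dozen origins, which is the "not much help for tracking down the source" problem.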

    Happy new year to you and the whole Millcomputing team,

    Larry

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1606

    Transfers to and fall-throughs into a join carry data. The widths of the carried data is declared in the labelDef, and the count of carried operands must fit on the belt.

    Ivan,

    Would you kindly explain more about the data carried into joins? There are two things I’d like to better understand:

    1. when listing/describing belt operands associated with a label, does one describe the entire belt contents, or just what all the branches to that label have added to the belt, with respect to some previous (TBD?) belt state? (Or something else entirely?)

    2. If we’re writing genAsm code, then we don’t yet know the belt length — and ideally, we want this code to run correctly on any Mill, no matter its belt length. If the specializer is supposed to model an abstract Mill with an infinite belt in other (non-branch/label) contexts — e.g. by inserting fills and spills as needed — why is this seemingly-member-specific belt-length constraint exposed by genAsm branch labels? Have I misunderstood something?

  • LarryP
    Participant
    Post count: 78

    Greetings all,
    It turns out that Ivan has answered (in the metadata thread) regarding the result from a NaR and a None in a normal (read two-input, exu-side) operation: The None wins; see his response #558:
    http://millcomputing.com/topic/metadata/#post-558

    IMHO, this still leaves unclear what happens when either a None or a NaR is an input to an address calculation (e.g., as base, offset or index inputs to the address calc.), either as a separate calc or as part of a load or store operation. The rules for address calculations (on the flow side) might be different from those for exu-side ops.

    We know that Nones as data values silently act as no-ops when stored, and that inaccessible addresses return Nones when loaded from. However, I’m still puzzled re what load/store ops do when their effective address is None. My best guess is that a NaR effective address will make both loads and stores fault, but the desired behavior may well be different for an effective address of None.
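To pin down what's known versus guessed, here's a toy Python model. The exu-side rule ("None wins") is from Ivan's #558, and the store/load behavior for None data values is as described above; the NaR-address and None-address behavior is purely my guess, not documented Mill semantics:

```python
# Toy model of None/NaR propagation. Known rules vs. my guesses are
# marked in the comments; nothing here is authoritative Mill behavior.
NONE, NAR = "None", "NaR"

def exu_op(a, b, op):
    if NONE in (a, b):   # known (per #558): None wins over NaR
        return NONE
    if NAR in (a, b):    # known: NaR propagates through exu ops
        return NAR
    return op(a, b)

def load(memory, addr):
    if addr == NAR:
        raise RuntimeError("fault: NaR effective address")  # my guess
    if addr == NONE:
        return NONE           # my guess: behaves like inaccessible
    if addr not in memory:
        return NONE           # known: inaccessible loads return None
    return memory[addr]

def store(memory, addr, value):
    if addr == NAR:
        raise RuntimeError("fault: NaR effective address")  # my guess
    if addr == NONE or value == NONE:
        return                # known (for None data): silent no-op
    memory[addr] = value
```

If the real rules differ — particularly for a None effective address — that's exactly the clarification I'm hoping for.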

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1601

    Ivan,

    Thanks for the explanation, the example of labels and branching, and the updated genAsm grammar. (I didn’t mean to make you work during XMas. My apologies.)

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1599

    genAsm language question re labels and branches
    How are program labels and control transfers (e.g. conditional branches) expressed in the genAsm language? Could you possibly make an example or two available here or on the wiki?

    Thanks much!

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1597

    Here’s a use case that makes sense to me:

    If Mill hardware (e.g. an FPGA prototype of Tin or Copper) is ready before the LLVM modifications are sufficient to compile Mill executables, then a simple interpreter could serve the roles of monitor, debugger and image loader. I think that could help a lot with board bring-up. Even if it's merely possible that hardware will be ready first, I think having such an interpreter would be a worthwhile risk-reduction strategy.

    Under the above use case, I'd reverse my previous opinion:
    I'd suggest first porting a minimal version of vanilla Forth, without trying to make optimal use of the Mill's unique characteristics. But once that Forth interpreter is working and stable, we could use what we've learned from bringing it up on the Mill (or a Mill simulation) to design a new interpreted language, one better suited to the Mill's unique characteristics.

    Thoughts?

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1589

    Ivan wrote:

    You also can (and should) use the new public Wiki for design and documentation.

    Is the Wiki now public? I see a login page in the upper right corner, but no obvious way to create an account (and, unsurprisingly, forum credentials don't work). If the Wiki is open for comments and additions, please let us know how!

    Thanks

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1587

    A couple more thoughts/opinions about Forth (or another small, interpreted language) for the Mill:

    1. Several common Forth words (for example swap and drop) conflict with the belt's single-assignment nature. We could use the conform and rescue operations to force-fit Forth's semantics onto the belt, but that seems ugly and problematic — unless there's some real desire to have Forth per se on the Mill. If that ugliness were hidden inside the interpreter, I could grit my teeth and live with it. The advantage of porting Forth is that it'd be “just” a port of an existing, well-documented interpreter design, and then existing Forth code could be (relatively) quickly up and running on the Mill and demonstrated to interested parties.

    2. Even a minimalist interpreter would need (the functional equivalents of) working memory, std-input/keyboard, std-output/console and persistent storage. Having std-error would also be nice. Compared to modern languages and CPUs, a Forth-like interpreter would have rather modest needs for working memory and persistent storage. If we want to get something running — and not tear our hair out trying to code it in ConAsm — then we'd need genAsm to support basic memory operations (load, store and hopefully stack allocation/cutback) and something (simulator outcalls?) to serve as std-input and std-output. If the interpreter is running on a simulated Mill (which seems likely short term), then we could probably make a specified address range of simulated memory behave as non-volatile memory, thus finessing persistent storage (or perhaps simulate all memory as persistent).

    3. Although APL brings back other fond memories (e.g. of writing Dungeons and Dragons support code on an IBM 5100), I think an interpreted language that does scalar ops well is a better near-term target. I suggest leaving vectorization wizardry for version 2.

    4. While this kind of preliminary arm waving is fun, I think we need some clarity on how such an interpreted language could be used to help bootstrap the Mill into quicker and broader interest and use. I don’t want to go all UML on folks, but what should be our key use case(s) for Forth or similar on the Mill?
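To make point 2 concrete, here's a minimal sketch of such an interpreter's skeleton — plain Python, nothing Mill-specific; the word set, dict-based "memory" and use of stdout are all inventions for illustration:

```python
import sys

# Minimal Forth-like interpreter sketch: a data stack, a dictionary of
# words, a cell-addressed "memory", and stdout standing in for the
# console. A real REPL would also read lines from std-input.
stack, memory, words = [], {}, {}

def word(name):
    """Register a Python function as a Forth-style word."""
    def reg(fn):
        words[name] = fn
        return fn
    return reg

@word("+")
def add(): b, a = stack.pop(), stack.pop(); stack.append(a + b)

@word("!")   # ( value addr -- )  store into memory
def store(): addr, val = stack.pop(), stack.pop(); memory[addr] = val

@word("@")   # ( addr -- value )  fetch from memory
def fetch(): stack.append(memory.get(stack.pop(), 0))

@word(".")   # pop top of stack and print it
def dot(): sys.stdout.write(str(stack.pop()) + "\n")

def interpret(line):
    for tok in line.split():
        if tok in words:
            words[tok]()          # execute a known word
        else:
            stack.append(int(tok))  # everything else is a number

interpret("2 3 + 100 ! 100 @ .")   # stores 5 at addr 100, prints 5
```

Even this toy shows the needs listed above: the stack and `memory` dict are the working memory, `.` is std-output, and a designated address range of `memory` could play the role of persistent storage in simulation.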

    Thoughts, opinions, corrections? Please chime in!

  • LarryP
    Participant
    Post count: 78
    in reply to: Simulation #1580

    Tomberek, Ivan and others interested in Forth or similar:

    Ah, Forth; the name brings back good memories!

    One could certainly port Forth to the Mill — or one could design/implement another interpreted language that makes better use of the Mill’s architecture (and would thus almost certainly be faster and simpler, and likely usable sooner). Which way to go depends mainly on whether there’s real demand to run existing Forth code. People have done some amazing things in Forth (e.g. Chuck Moore, Forth’s creator, wrote his own chip-design system in Forth, if I recall correctly), but I’m not sure how many prospective Mill users/programmers/system-designers want to run existing Forth code.

    On the other hand, having a working interpreted language for the Mill (either via a simulated Mill or a TBD implementation on an FPGA) would IMHO provide a more programmer-friendly way to try out the Mill than the current assembly/simulation tools — and sooner — even if it’s different than Forth. To me, that seems like a worthwhile option to explore, especially since the LLVM toolchain sounds like it won’t be ready as soon as we might wish.

    A small, efficient, interpreted language — one designed to make use of the belt model — might well be easier to implement, especially since (as I understand it) the core interpreter will have to be written in Mill ConAsm, at least until the Mill’s specializer is ready to generate simulatable/runnable Mill executables from GenAsm.

    Ivan, any projection on when you’ll be able to simulate code written in GenAsm internally? (Or is that working already?)

    I find myself leaning somewhat toward a not-textbook-Forth interpreted language for the Mill, because many of the most common Forth “words” (such as dup, swap and over) exist to juggle values on the stack, so that the word you really want to invoke has all its operands on the stack in the correct order. It seems a shame to enforce strict stack semantics on a machine that has something arguably better (the belt) built right into it. However, if there is real demand (or even strong interest) for Forth itself on the Mill, I’d still be interested in getting that working.
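For anyone less familiar with Forth, these are the stack-juggling words in question, modeled on a plain Python list. On a stack machine they mutate positions in place — exactly the in-place shuffling that the belt's single-assignment, positions-shift-on-drop model doesn't give you directly:

```python
# Standard Forth stack-manipulation words, modeled on a Python list
# (top of stack is the end of the list). On the belt these effects
# would need conform/rescue rather than in-place mutation.
def dup(s):  s.append(s[-1])               # ( a -- a a )
def swap(s): s[-1], s[-2] = s[-2], s[-1]   # ( a b -- b a )
def over(s): s.append(s[-2])               # ( a b -- a b a )

s = [1, 2]
swap(s)   # [2, 1]
dup(s)    # [2, 1, 1]
over(s)   # [2, 1, 1, 1]
```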

    Tomberek,
    especially if you’re learning Forth and haven’t done a port, I suggest

      Threaded Interpretive Languages: Their Design and Implementation

    by R. G. Loeliger. Probably out of print, but used copies are apparently available from a certain large online book (and everything else) seller.

    I’d be interested in further discussion and possible collaboration. Feel free to email me at: ursine at (a large, free email service, hosted by g00gle.)

  • LarryP
    Participant
    Post count: 78
    in reply to: The Belt #1579

    Ivan,

    Would you kindly clarify some vocabulary in this discussion, specifically between frames and nesting numbers? You wrote:

    Say frame 17 calls frame 18, which does a MULF but returns to 17 before the MULF finishes. When the result pops out of the multiplier it is labeled as belonging to 18, known to be exited, and is discarded.

    In the above, are those (stack) frame numbers or (belt) nesting numbers? My understanding was that true calls on the Mill (not inlined or TCO’ed away) *always* get a new nesting number, and that calls may create a new stack frame, but don’t have to. So would you please clarify which you’re talking about in this context?

    Thanks!

  • LarryP
    Participant
    Post count: 78
    in reply to: Pipelining #1561

    How will the compiler convey pipelining and/or vectorizing info to the specializer?

    Ivan or other Millcomputing folks,

    Can you tell us how the compiler will convey information about pipelining and vectorizing loops to the specializer? From the GenAsm EBNF grammar on the Wiki, it looks like there are several ways you could do this, but I’ll simply ask (rather than speculate, per forum rules.)

  • LarryP
    Participant
    Post count: 78

    Must fields within a block of a Mill instruction all be the same width?

    The lecture on encoding indicates that all ops encoded in a single block are of equal width. However, if the op-count for some block is N (!= 0), we know that the block contains operations for slots 0 through N-1 (of the slots for that stream, either exu- or flow-side). So long as the bit-widths for operations in the block are fixed for a given Mill model, the total bit width of this block (and thus the location of the next block) should be trivially calculable from that count and the widths, even if the fields within the block are not of uniform width. And similarly for the existence and location of the alignment hole, used (if present) to encode a compact no-op for the other bundle stream.

    Even in the uniform-width case, the decoders must calculate/look up the starting point of block N+1, based on the starting position and op counts of this and previously parsed blocks — even if that calc is a simple multiply by a constant. So it seems to me that handling non-uniform widths within a block wouldn’t significantly complicate block decoding over the uniform-width case. Is my reasoning wrong, and/or are there other reasons why the fields in a block must have uniform widths?
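The decode arithmetic I have in mind is just this (a Python sketch; the bit-widths are invented for illustration and are not any real Mill member's numbers):

```python
# Next-block start offset from an op count, uniform vs. per-slot widths.
# Widths below are invented; the point is that either way the offset is
# a small lookup/prefix-sum on statically known, member-specific widths.
UNIFORM_BITS = 17                        # invented uniform op width
SLOT_BITS = [19, 17, 17, 14, 14, 12]     # invented per-slot op widths

def next_block_uniform(start, op_count):
    # Uniform case: one multiply by a constant.
    return start + op_count * UNIFORM_BITS

def next_block_varied(start, op_count):
    # Non-uniform case: slots 0..N-1 are occupied, and the decoder
    # knows each slot's width statically for this member.
    return start + sum(SLOT_BITS[:op_count])

print(next_block_uniform(0, 3))   # 51
print(next_block_varied(0, 3))    # 19 + 17 + 17 = 53
```

In hardware the prefix sums over `SLOT_BITS` would presumably be a precomputed table indexed by op count, which is why I suspect the non-uniform case adds little over the multiply.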

    Note: since I posed this question to Ivan (off list, so as not to speculate in a forum post), I now see details in the Wiki (for example http://millcomputing.com/wiki/Cores/Gold) that describe different bit widths for both exu and flow slot operation encodings. But, as is often said in the videos, some of the existing docs/videos are simplifications, so I’ll be curious what the Millcomputing folks say on this issue.

  • LarryP
    Participant
    Post count: 78

    In order to propagate NaRs and Nones properly, I think all ALUish operations (including mult) will have to check their inputs explicitly for NaRs and Nones. Which leads to a question: What is the proper result for NaR * None?

    Which leads to another puzzler: What should be the behavior when a None gets involved in an address calculation? And what is the Mill’s behavior when such an effective address is used by a load or a store operation?

  • LarryP
    Participant
    Post count: 78
    in reply to: What's New? #1534

    Will,

    Thanks for your answer; Terje Mathisen’s talk “Are virtual machines the future?” is exactly what I was looking for!

    I’m pleased to report that somebody did record that talk and it’s up on vimeo at:
    http://vimeo.com/108094559 (at least for now. You folks might want to download it and put it on your own website.)

    FYI, here’s the abstract:

    Mill Computing has been working for over a decade to develop a brand new, highly scalable CPU architecture. The scalability is partly achieved by having no fixed machine/assembly language, instead all programs are compiled to an intermediate (VM) machine. Upon installation on the actual target machine, a specializer is run which converts the intermediate representation into the current format. The Mill’s most unique features include having no actual registers, the ability to run completely out of cache with no actual RAM memory, an instruction set which is size-agnostic (the same opcode can work on 8, 16, 32, 64 or 128-bit variables or vectors) as well as being able to unroll algorithms in a way that would cause traps and crashes on classic architectures.

  • LarryP
    Participant
    Post count: 78
    in reply to: What's New? #1532

    Greetings all,

    I seem to recall reading (months ago?) of an upcoming presentation on the abstract Mill, viewed as a virtual machine. If I recall correctly, this was by somebody other than Ivan, possibly not an official Millcomputing person. (Please forgive me if I’m mis-remembering; my memory of what I read is hazy.)

    Unvarnished opinion: The most popular VMs (e.g. those for Java) use a stack model, imposing a one-at-a-time execution model. I believe that the stack-based JVM is a main reason why Java performance isn’t so hot.

    Does this ring a bell with anybody? If so, any pointers to an abstract, slides and/or video would be much appreciated.
    —-

    By the way, the Wiki (although it’s pre-official-announcement-of-readiness) now has — when it’s up — some fascinating new-to-me info on, among other things:

    * The instruction set,

    * The Mill abstract assembler grammar (Interestingly IMHO, it exposes few Mill-ish aspects other than supported widths and gangs),

    * More on instruction decoding,

    * Two new phases, and

    * More detail on the tin, copper, silver and gold cores, including more slot and FU details. (Note that these are almost certainly in flux.)

    Enjoy!

    PS Millcomputing folks. Please feel free to delete this post if I’ve mentioned things you’d rather not yet have on the forums. If so, my apologies.
