Mill Computing, Inc. Forums The Mill Architecture Binary distribution for a Mill

  • Author
  • anselmschueler
    Post count: 3

    Hi, I have a few concerns about the logistics and systems around binary distribution on a Mill.

    1. You describe that Mill programs are compiled to an abstract form and then specialized
    for the specific Mill family member that they need to be run on.
    What format do you think application vendors and redistributors will distribute their packages in?
    Do you believe that some will try to distribute pre-specialized binaries, perhaps in a hybrid format?

    2. You describe that Mill programs dynamically analyze their branch prediction and are able
    to write back the predictions. This, along with specialization, requires either modifying the state
    of the system or giving up performance. What about immutable systems or systems that regularly restore
    from an immutable source (examples would be containers, NixOS/Guix)? Could this be sped up by
    pre-specialized and/or pre-analyzed binaries, perhaps in a hybrid form? Would it be possible to separate
    the stateful part of the binary into a separate non-system location and load from/store to there on entry/exit?

    3. Could the dynamic branch prediction analysis be communicated back to the vendor using some sort of
    telemetry system or aggregated in some cloud store so other users can benefit?

    3. You say you want to include all operations that any Mill supports in the abstract assembly language
    and substitute emulation. After you have released your first Mill, any subsequent Mills that extend the
    operation set will require versioning the specializer. Do you have any specific versioning policy in mind?
    Do you anticipate incompatibilities from newer operations being used in shipped code that is specialized
    by an older specializer that does not know of the new operations yet? Could this be mitigated by shipping
    hybrid or pre-specialized binaries?

    I understand that you are not yet at the stage where these questions are particularly relevant but I would like to know if you have considered them before and if you have any ideas pertaining to them.

  • Findecanor
    Post count: 30

    3. Could the dynamic branch prediction analysis be communicated back to the vendor using some sort of telemetry system or aggregated in some cloud store so other users can benefit?

    I think you would meet a lot of resistance against such telemetry, citing reasons of security.

  • anselmschueler
    Post count: 3

    That’s a good point.

  • Ivan Godard
    Post count: 687

    1) App distribution in binary vs. IR:

    Certain parts of the software, such as the minimal BIOS and the boot specializer, must be distributed in binary. These will normally reside on a ROM, and the distribution of changes likely involves reflashing the ROM; vendors other than hardware manufacturers will never need to do this. Kernel vendors will distribute in IR form using the minimal canonical IR which is acceptable to the specializer in the ROM. Included in the kernel package will be a more feature-full app-IR specializer in ROM-IR form. That will be translated to native binary, and then run through itself so that the installed app-specializer has all the app-level features for its own work of translating app-IR to binary. Cascaded compiling-the-compiler is routine for cross-compilation, which is really what a Mill install is and does.
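The cascade described above can be pictured with a toy model. This is only a sketch under stated assumptions: the `Specializer` and `IRModule` classes, the feature-set names, and the `translate` method are hypothetical illustrations, not actual Mill tooling.

```python
from dataclasses import dataclass

# Hypothetical feature sets; the real IR dialects are not specified in this thread.
ROM_IR = frozenset({"core"})
APP_IR = frozenset({"core", "app-extras"})


@dataclass(frozen=True)
class IRModule:
    name: str
    uses: frozenset      # IR features the module's own code is written in
    provides: frozenset  # IR features it can translate once it is running


@dataclass(frozen=True)
class Specializer:
    name: str
    accepts: frozenset   # IR features this specializer can translate

    def translate(self, module: IRModule) -> "Specializer":
        """Translate a specializer shipped as IR into a runnable specializer."""
        missing = module.uses - self.accepts
        if missing:
            raise ValueError(f"{self.name} cannot translate {sorted(missing)}")
        return Specializer(f"{module.name} (built by {self.name})", module.provides)


# The boot specializer in ROM only accepts the minimal canonical IR.
boot = Specializer("boot specializer", ROM_IR)

# The kernel package ships the app specializer written in ROM-level IR.
app_source = IRModule("app specializer", uses=ROM_IR, provides=APP_IR)

stage1 = boot.translate(app_source)    # built by the ROM specializer
stage2 = stage1.translate(app_source)  # then run through itself
print(stage2.name)
```

In this model both stages accept the same app-level IR; what changes is which specializer produced the installed one, which is the usual shape of a compiler bootstrap.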

    Other than updating the boot ROM there is really no reason to distribute pre-specialized code in binary. Even assembler-like code can be represented in Mill IR as intrinsics. Now it is possible to have Mill configs with extra nonstandard instructions, and code that uses those won’t run if the instruction doesn’t exist in the target config at hand. But you’d still use IR for it – and get a specializer error if the requisite intrinsic is not found.

    2) Dynamic exit prediction update in read-only systems:

    Prediction update is just an optimization: it starts the predictor table off with the state from prior runs instead of empty. If the load module cannot be written then the optimization doesn’t happen. Conceivably the vendor could build a table using a mutable load module and then distribute the module as binary. The gain from the optimization is unlikely to justify the nuisance of maintaining multiple binaries for the different members.
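A minimal sketch of "prediction update is just an optimization", assuming a toy `LoadModule` with a `writable` flag and an `exit_table` dictionary. Both names are hypothetical, and the real predictor table format is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class LoadModule:
    code: bytes
    writable: bool
    # Hypothetical: exit predictions saved by prior runs.
    exit_table: dict = field(default_factory=dict)


def run(module: LoadModule, observed: dict) -> int:
    """Simulate one run; return how many predictions were available at start."""
    table = dict(module.exit_table)  # warm start from prior runs, or empty
    warm_entries = len(table)
    table.update(observed)           # the predictor learns during this run
    if module.writable:
        module.exit_table = table    # write back for the next run
    # On a read-only module the learned state is simply dropped.
    return warm_entries
```

With a writable module the second run starts with whatever the first run learned; with a read-only module every run starts empty, and the program still behaves the same apart from prediction quality.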

    3) Sharing predictor history:

    Certainly predictor history could be separated from the binary code that uses/updates it. However, the history is tied to a particular config just like the specialized code is, so you’d have the administrative bookkeeping problem of making sure that both addressed the same config.
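One way to picture that bookkeeping is a side store keyed by both the specialized code and the config it was specialized for. This is a hypothetical sketch, not an existing Mill feature: `HistoryStore`, `history_key`, and the config names are made up for illustration.

```python
import hashlib


def history_key(code: bytes, config: str) -> str:
    """Key a history table by config and specialized code together."""
    return hashlib.sha256(config.encode() + b"\0" + code).hexdigest()


class HistoryStore:
    """Toy stand-in for a mutable location kept outside the load module."""

    def __init__(self) -> None:
        self._tables: dict = {}

    def load(self, code: bytes, config: str) -> dict:
        # A miss (wrong config or changed code) just means a cold start.
        return dict(self._tables.get(history_key(code, config), {}))

    def save(self, code: bytes, config: str, table: dict) -> None:
        self._tables[history_key(code, config)] = dict(table)


store = HistoryStore()
store.save(b"specialized-code", "config-A", {"exit0": "taken"})
```

Because the key includes the config, history recorded on one family member is never applied to code specialized for another; the lookup simply misses and the predictor starts empty.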

    3 redux) Varying config binary ISAs:

    It’s common that configs lack some instructions that others have. For example, some of our test configs lack floating-point or quad (128-bit) data forms. The specializer still recognizes these in the IR, and generates calls on emulation routines, which are often inlined. The substitution is automatic – the install provides signature info and the corresponding routine for everything potentially in the IR.

    The sig/emu info is tied to an IR level. If the IR changes to a new release then the installation must be upgraded with info to match. If you present an object module that uses IR12 but only IR9 is installed then you’ll get an error in the app install. It doesn’t matter whether the actual hardware has the instructions: the specializer knows what the hardware can do, so it uses hardware if possible and emu otherwise. The IR install may provide emu routines for instructions that the particular hardware actually has; the hardware will be used.
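The selection rule can be sketched as follows. Everything named here is an assumption for illustration: the `specialize` function, the operation names, and the idea that IR levels compare as integers with older modules accepted by newer installs. The final error branch models the earlier point that an intrinsic with no match on the target produces a specializer error.

```python
class InstallError(Exception):
    """Raised when a module cannot be specialized for this installation."""


def specialize(module_ir, ops, installed_ir, hardware_ops, emulation):
    """Return a plan mapping each operation to 'hardware' or 'emulation'."""
    # The check is against the installed sig/emu info, not the hardware.
    if module_ir > installed_ir:
        raise InstallError(f"module needs IR{module_ir}, only IR{installed_ir} installed")
    plan = {}
    for op in ops:
        if op in hardware_ops:
            plan[op] = "hardware"    # hardware wins even if an emu routine exists
        elif op in emulation:
            plan[op] = "emulation"   # call (or inline) the emulation routine
        else:
            raise InstallError(f"no hardware or emulation for {op!r}")
    return plan


# Hypothetical config without floating point; emu routines cover both ops.
emu_routines = {"add": "emu_add", "addf": "emu_addf"}
plan = specialize(9, ["add", "addf"], 9, {"add"}, emu_routines)
print(plan)
```

Here `add` maps to hardware although an emulation routine is installed for it, while `addf` falls back to emulation because this config lacks it.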

    • anselmschueler
      Post count: 3

      Thanks! One question: Is it possible to split the dynamic exit prediction from the load module so it can be stored in a location that is mutable even if the load module itself isn’t, and then join them on execution again?

      • Ivan Godard
        Post count: 687

        Yes, although we don’t yet support that mode.
