Mill Computing, Inc. Forums The Mill Architecture C boolean vs mill boolean

  • goldbug
    Participant
    Post count: 53
    #3099

    While reading the documentation, I noticed that instructions like brtr and calltr check only the lowest (least significant) bit of the operand.

    In C, any nonzero value is considered true. This means that compiling C code requires an extra comparison operation to produce a Mill boolean.

    For example, this function:

    
    void foo(int p) {
       if (p) {
          bar();
       }
    }
    

    would have to compare p with zero, so it would compile to something like this:

    
    F("foo") %0;  
       neq(%0, 0)  %1,
       calltr0(%1, "bar");
    

    If, instead of checking only the lowest bit, the branch checked the OR of all bits, you would have:

    
    F("foo") %0;  
       calltr0(%0, "bar");
    

    I would think that optimizing for C would be the goal, no? Why did you choose to check only the lowest bit instead of the OR of all bits when doing conditional jumps?
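
To make the mismatch concrete, here is a minimal C sketch of the two tests being compared. The function names (c_truth, low_bit, normalized_low_bit) are made up for illustration and are not Mill operations; the code only models the semantics described in this post, not the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* C truthiness: true if any bit is set, i.e. the OR of all bits. */
bool c_truth(int64_t v) { return v != 0; }

/* Low-bit test, as described above for brtr/calltr: only bit 0 matters. */
bool low_bit(int64_t v) { return ((uint64_t)v & 1u) != 0; }

/* Comparing with zero first (the neq step) yields 0 or 1,
   so the low-bit test then agrees with C truthiness. */
bool normalized_low_bit(int64_t v) { return low_bit((int64_t)(v != 0)); }

/* A few cases where the tests agree or disagree. */
void check_examples(void) {
    assert(c_truth(2) && !low_bit(2));   /* even nonzero value: tests disagree */
    assert(c_truth(3) && low_bit(3));    /* odd value: tests agree */
    assert(!c_truth(0) && !low_bit(0));  /* zero: tests agree */
    assert(normalized_low_bit(2));       /* after the compare they agree again */
}
```

Any even nonzero value, such as 2, is true in C but false under a low-bit test, which is why the compiled code needs the neq.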

    • This topic was modified 7 years ago by  goldbug.
  • Ivan Godard
    Keymaster
    Post count: 689

    We do it for speed.

    “Checking all the bits” is not free: you need an OR tree across the full operand width. That can’t be done in the gap between cycles, where the branch is resolved, or at least not without an unattractive impact on clock rate. However, the NEQ can be in the same instruction as the BRTR but execute in opPhase, so its one-bit result is available for the branch. The explicit NEQ just makes visible the timing dependency that your proposed NEQ/BRTR merge would carry implicitly.
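
As a rough illustration of why the all-bits test costs more than the one-bit test, the following C sketch counts the levels in an OR tree and models the reduction in software. It assumes two-input OR gates, which is a simplification (real hardware can use wider gates, so actual depth will differ), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Levels of two-input OR gates needed to reduce `width` signals to one.
   Assumption: two-input gates only; real designs may use wider gates. */
unsigned or_tree_depth(unsigned width) {
    unsigned levels = 0;
    while (width > 1) {
        width = (width + 1) / 2;  /* each level halves the signal count */
        levels++;
    }
    return levels;
}

/* Software model of the same tree for a 64-bit operand:
   each fold ORs the upper half of the remaining bits into the lower half. */
unsigned or_reduce64(uint64_t v) {
    v |= v >> 32;
    v |= v >> 16;
    v |= v >> 8;
    v |= v >> 4;
    v |= v >> 2;
    v |= v >> 1;
    return (unsigned)(v & 1u);  /* bit 0 now holds the OR of all 64 bits */
}

/* The one-bit test needs no reduction at all: it just reads bit 0. */
unsigned low_bit_only(uint64_t v) { return (unsigned)(v & 1u); }
```

Under that two-input assumption a 64-bit operand needs six levels of logic before the branch can see the result, while the low-bit test needs none.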

    Predicate gangs also use the same one-bit paths. It costs nothing for an ADD (say) to also yield one-bit values giving the comparison of the result with zero; many conventional architectures do this and set the condition codes from them. A predicate gang selects one of the generated signals and drops it on the belt as a boolean with a known-to-be-one-bit value that the BRTR can use.
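
The idea can be modeled loosely in C: an add that returns its sum together with one-bit comparisons of that sum against zero, one of which is then picked to drive a branch. This is only a software analogy; the struct, field names, and functions are invented for illustration and do not correspond to actual Mill operations or encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: the sum plus one-bit comparisons of the sum with zero. */
typedef struct {
    int32_t sum;
    bool eq_zero;  /* sum == 0 */
    bool ne_zero;  /* sum != 0 */
    bool lt_zero;  /* sum <  0 */
} add_result;

add_result add_with_predicates(int32_t a, int32_t b) {
    add_result r;
    /* Add in unsigned arithmetic so wrap-around is not undefined behavior. */
    r.sum = (int32_t)((uint32_t)a + (uint32_t)b);
    r.eq_zero = (r.sum == 0);
    r.ne_zero = (r.sum != 0);
    r.lt_zero = (r.sum < 0);
    return r;
}

/* Models `if (a + b)`: select the ne_zero predicate and branch on it.
   The branch sees a value already known to be a single bit. */
bool branch_on_sum(int32_t a, int32_t b) {
    add_result r = add_with_predicates(a, b);
    return r.ne_zero;
}
```

In this model the C-style test on the sum needs no separate compare step, because the one-bit predicate is produced alongside the sum.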

    Our approach does use an encoding slot and a belt position that a merged test-and-branch would not. However, it’s more general and costs no added time.
