A subtle difference between hardware-native operations and emulation sub-graphs? (Re Mill Boolean predicate gangs)
According to the presentation on execution, exu-side/op-phase operations can pass condition codes to the next higher exu slot — which can then turn those codes into a Boolean result if so programmed.
If some Mill operation (let’s call it foo) is not native on a member, then foo will have to be emulated on that member by a sub-graph (which may be an emulating function call or an “inlined” sequence of operations). However, since this emulated foo will not have a slot of its own, I don’t see an obvious way for it to mimic a hardware-native foo’s ability to pass a CC set “horizontally” to another functional unit as a predicate gang.
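To make the concern concrete, here is a toy Python model of what I mean. It is only a sketch under my own assumptions: none of these names (`CondCodes`, `native_add`, `gang_predicate`, `emulated_add`) are real Mill operations or tool-chain APIs, and a 32-bit add built from 16-bit halves simply stands in for "foo".

```python
from dataclasses import dataclass

WIDTH = 32
MASK = (1 << WIDTH) - 1


@dataclass(frozen=True)
class CondCodes:
    """Hypothetical condition-code set produced alongside a result."""
    zero: bool
    negative: bool
    carry: bool


def native_add(a, b):
    """Stand-in for a hardware-native op: yields a result AND its codes."""
    full = (a & MASK) + (b & MASK)
    result = full & MASK
    cc = CondCodes(
        zero=(result == 0),
        negative=bool(result >> (WIDTH - 1)),
        carry=(full > MASK),
    )
    return result, cc


def gang_predicate(cc, which):
    """Stand-in for the neighbouring slot turning codes into a Boolean."""
    return getattr(cc, which)


def emulated_add(a, b):
    """Stand-in for an emulation sub-graph built from 16-bit adds.

    It returns only the result; no condition codes come out "for free".
    """
    lo = (a & 0xFFFF) + (b & 0xFFFF)
    hi = ((a >> 16) & 0xFFFF) + ((b >> 16) & 0xFFFF) + (lo >> 16)
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


# Native path: the predicate falls out of the codes passed sideways.
result, cc = native_add(0xFFFFFFFF, 1)
native_is_zero = gang_predicate(cc, "zero")

# Emulated path: the predicate has to be recomputed by an explicit extra op.
emulated_result = emulated_add(0xFFFFFFFF, 1)
emulated_is_zero = (emulated_result == 0)
```

In this model both paths agree on the value and on the predicate, but the emulated path only gets the predicate by spending an extra explicit comparison, which is the asymmetry I am asking about.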
How do the Mill architecture and tool-chain handle this apparent difference between native and emulated operations?
Thanks in advance!