Forum Replies Created
I thought of a better way. The Mill supports vector-reducing operations like all and any. I don’t know the syntax for them, but perhaps the between operation could take a number and a vector and tell you whether the number is in that range. For example:
F("bar") %0;
    con (v(4ul, 7ul)) %1,     // whatever syntax to build a vector
    betw (b1 %0, b0 %1) %2;   // true if 4 <= x && x <= 7
Since the bounds are a single vector value, the operation needs only one functional unit to calculate whether x is within the range. That would make ranges in switches trivial to optimize.
Sorry if it looks like I am spamming, the forum keeps deleting my post.
The pick operation can be used to build a range check. For example, to check if (x >= 4 && x <= 7) one could write:
F("bar") %0;
    rd (w(0ul)) %1,
    geqs (b1 %0, 4) %2,
    leqs (b2 %0, 7) %3,
    pick (b1 %2, b0 %3, b2 %1) %4;
Range checks would be fairly common in switches. In this example, if (2 <= x && x <= 3) it should call foo(), and if (4 <= x && x <= 7) it can return x. It seems like a fairly simple optimization to add to the specializer instead of a bunch of eql operations. Range checks could also be useful for performing array bounds checks.
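To make the idea concrete, here is a rough C sketch of the same thing, not anything the specializer is known to emit. in_range() mirrors the geqs/leqs/pick sequence above, and bar_ranges() is my guess at how the bar() switch posted elsewhere in this thread would look with its 2..3 and 4..7 cases folded into ranges. The names in_range and bar_ranges, and the body of foo(), are made up for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the foo() that the switch calls;
   its real body is not given anywhere in the thread. */
static int foo(int x) { return x * 10; }

/* Mirrors the conAsm sequence: geqs, leqs, then a pick that selects the
   leqs result when geqs is true and the constant 0 otherwise. */
static int in_range(int x, int lo, int hi) {
    assert(lo <= hi);        /* an empty range is a caller error */
    int ge = (x >= lo);      /* geqs */
    int le = (x <= hi);      /* leqs */
    return ge ? le : 0;      /* pick(ge, le, 0) */
}

/* bar() with the cases 2..3 and 4..7 expressed as two range checks
   instead of one equality test per case. */
static int bar_ranges(int x) {
    if (x == 0) return 1;
    if (in_range(x, 2, 3)) return foo(x);
    if (in_range(x, 4, 7)) return x;   /* covers default at 4 and cases 5, 6, 7 */
    if (x == 100) return 3;            /* case 100 breaks out to the final return */
    return 4;                          /* default */
}
```

Folding 4 into the 4..7 range works only because the default case happens to return 4, which equals x there.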
The only issue, if I understand phasing correctly, is that the result of a pick cannot be used in a calltr1 within the same instruction.
I must admit I scratched my head about the fma operation. For the life of me I can’t imagine it being used that often. It seems like an oddly specific instruction to add, but maybe I am missing something. If you can have operations that use more than one ALU, then perhaps a “between” operation that does a range check would be useful. As shown above it can easily be done with a pick, but there could be a gain if the result of the between operation was usable in calltr1.
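For what it is worth, conventional compilers often reduce a range check like this to one subtraction and one unsigned comparison, which is roughly the amount of work a single “between” operation would have to do. The sketch below shows that trick in C. It is not a Mill operation, and the name between is only an assumption for illustration.

```c
#include <assert.h>

/* Hypothetical helper: true when lo <= x && x <= hi, using a single
   comparison. Subtracting lo in unsigned arithmetic maps the range
   [lo, hi] onto [0, hi - lo]; any x outside the range wraps around to a
   value larger than hi - lo. */
static int between(int x, int lo, int hi) {
    assert(lo <= hi);                              /* caller must pass a valid range */
    unsigned span   = (unsigned)hi - (unsigned)lo; /* width of the range */
    unsigned offset = (unsigned)x  - (unsigned)lo; /* wraps if x < lo */
    return offset <= span;
}
```

The conversions to unsigned are well defined in C, so this also works for negative bounds as long as lo <= hi.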
I finally managed to digest that code, impressive indeed!
I can also see that it would be easy to make the entire function just 1 instruction in gold.
Veedrac also has the right idea about how to format conAsm. He puts the rd and con operations first, which more closely resembles phasing; he spaces instructions so that the code is easy to scan visually; and he shows in comments the slot where each result goes. I am not sure what the arrows at the end mean, though.
Do you think you could provide an example? How would this code be compiled?
int bar(int x) {
    switch (x) {
    case 0:
        return 1;
    case 2:
    case 3:
        return foo(x);
    case 5:
        return 5;
    case 6:
        return 6;
    case 7:
        return 7;
    case 100:
        break;
    default:
        return 4;
    }
    return 3;
}
- in reply to: Control-flow folding #2814
Thanks for the detailed explanation. Seems like you guys have thought of everything :). I look forward to playing around with the simulator when it is available.
- in reply to: Control-flow folding #2808
From the lecture you gave on prediction, I get that you predict exits.
In this case, there can be up to 3 exits: one for each call and the return. In fact, in this particular function there will always be at least 2 exits taken. At least one of the calls will be taken and the return will be taken. If I understood your talk correctly, you only have 1 prediction per EBB. Does that mean that your prediction would fail at least 50% of the time in this function? Or do you have some mechanism to predict both exits?
I would think this is pretty common: we (developers) often write functions that call other functions and then return.
Also, in the lecture you gave some pretty interesting numbers at the beginning of the talk about how much time is wasted on cold runs in conventional hardware. You explain what the bottlenecks are and how you are addressing them in the Mill, but you don’t mention how effective your solutions are. Do you have any numbers on how your solution performs versus conventional hardware regarding mispredictions?
- in reply to: Belt saturation in short belts #2787
Very interesting, Ivan. Could you elaborate on how you can issue multiple function calls in the same instruction? When you return from one function, does it jump straight into the next function?