Mill Computing, Inc

Forum Replies Created

Gregor Budweiser
Participant
October 25, 2018 at 12:11 pm
Post count: 3
in reply to: Pipelining on a Gold member of the Mill family #3346
Phasing is a separate issue, having nothing to do with piping. It makes a difference only in open code, not loops, where it is redundant to piping. We would still pipeline to the same degree if there were only one phase.
Oh dear, I didn’t realize that at all. In the decode talk you actually briefly mention that there are multiple execute phases (X0-X4+) per instruction/bundle. That would make dependent operations possible in a single instruction because they can execute in different cycles. I guess at this point I’ll have to carefully watch the talks again.
Thanks again. Your help is much appreciated!
Gregor Budweiser
Participant
October 25, 2018 at 5:50 am
Post count: 3
in reply to: Pipelining on a Gold member of the Mill family #3344
Many thanks!
Your explanations give a more realistic feel for what the Mill is capable of and what the relevant factors are. Assuming vector registers, one might even do a 4×4 matrix-vector multiply in a couple of cycles. That is impressive imho.
About loop-carried variables: I do see how, conceptually, the carried variables can just stay on the belt, and that the maximal loop distance is therefore proportional to the length of the belt (ignoring spill/fill ops). What I was trying to get at is that for the “i-1” version of the loop you can’t do two iterations in one instruction because of the data dependency, whereas the “i+1” version would allow more iterations per instruction because there are no dependencies between iterations (ignoring other factors/limits).
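To make the “i-1” versus “i+1” distinction concrete, here is a minimal sketch in Python. The loop bodies are assumptions, since the original loops from the thread are not quoted here: I take the “i-1” version to be a[i] = a[i-1] + b[i] and the “i+1” version to be a[i] = a[i+1] + b[i]. The function names are hypothetical, and this only illustrates the dependency structure, not anything about how the Mill schedules it.

```python
def loop_i_minus_1(a, b):
    """Assumed 'i-1' form: each iteration reads what the previous one wrote."""
    out = list(a)
    for i in range(1, len(out)):
        # Read-after-write dependency on out[i - 1]: iteration i cannot
        # start its add until iteration i - 1 has produced its result.
        out[i] = out[i - 1] + b[i]
    return out


def loop_i_plus_1(a, b):
    """Assumed 'i+1' form: each iteration reads an element not yet overwritten."""
    out = list(a)
    for i in range(len(out) - 1):
        # out[i + 1] still holds its original value here, so no iteration
        # needs a result computed by an earlier iteration.
        out[i] = out[i + 1] + b[i]
    return out


def loop_i_plus_1_any_order(a, b):
    """Same result as loop_i_plus_1, computed from the original values only.

    Because every element depends only on the inputs, the additions could
    be done in any order, or several at once.
    """
    if not a:
        return []
    return [a[i + 1] + b[i] for i in range(len(a) - 1)] + [a[-1]]
```

In the first function the additions form one long chain, so only one can complete per add latency. In the second, all additions are independent of each other, which is why several iterations could in principle share an instruction.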
So what I really wanted to ask: How can a single instruction contain/execute data-dependent integer or FPU ops (say a+b+c)?
According to the Execution talk (#6), an instruction is phased/pipelined internally. As I understand it, an instruction or bundle is decoded over 3 cycles and executed over 3 cycles, but only one of the execution cycles is an ops phase, which I take to be the only one in which the execution units can be used (is that true?). I assume a+b+c can’t be computed during this one ops phase/cycle.
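The reason a+b+c looks problematic can be sketched with a generic dataflow model: the second add needs the result of the first, so with a one-cycle add it cannot start in the same cycle. This is not a model of the Mill's phases. The one-cycle latency, the function name `earliest_cycles` and the temporary names `t` and `r` are all assumptions made for illustration.

```python
def earliest_cycles(ops, latency=1):
    """Return the earliest start cycle of each op in a dependency graph.

    ops maps a result name to the list of its operand names, with producers
    listed before their consumers. Names that are not keys of ops are plain
    inputs, assumed available at cycle 0. Every op is assumed to take
    `latency` cycles (an illustrative assumption).
    """
    start = {}

    def ready(name):
        # Cycle at which the value called `name` becomes available.
        if name not in ops:
            return 0
        return start_of(name) + latency

    def start_of(name):
        # An op can start once all of its operands are available.
        if name not in start:
            start[name] = max((ready(src) for src in ops[name]), default=0)
        return start[name]

    for name in ops:
        start_of(name)
    return start


# a + b + c as two dependent adds: t = a + b, then r = t + c.
chain = {"t": ["a", "b"], "r": ["t", "c"]}

# Two independent adds for comparison: x = a + b and y = c + d.
independent = {"x": ["a", "b"], "y": ["c", "d"]}
```

Under these assumptions `earliest_cycles(chain)` puts `t` in cycle 0 and `r` in cycle 1, while both ops in `independent` can start in cycle 0. So the dependent pair only fits in one instruction if its ops may execute in different cycles.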