There have been other symmetrically split machines; for example, the TI 8-wide VLIWs are actually dual 4-wide VLIWs with the ability to communicate between the halves. The main advantage of Paysan’s design is that it does not need to encode result locations, and its transients are single-assignment and therefore hazard-free; the Mill gets the same advantages from the Belt. The major difficulties with Paysan’s design in a modern context (an unfair standard; Paysan’s work was done 15 years ago, when a three-stage pipe and a 100MHz clock were cutting edge) are the communicating crossbar and that bane of all wide-issue machines: cache-miss stalls. The Mill solutions for both have been covered in the talks.
I also wouldn’t want to have to write an instruction scheduler for Paysan’s machine.