Difference between revisions of "Instruction Set/fmafe"

Revision as of 01:35, 3 January 2015

An exu-stream, exu-block, compute-phase operation in the binary floating point value domain. It rounds to nearest, with ties going to the even adjacent value.

native on: Silver Gold

Binary floating point fused multiply-add. The product is not rounded before the addition, so the result is rounded only once; this gives higher precision than a separate multiply and add, and it is faster too. Rounds to nearest, ties to even.

fmafe(f x, f y, f z) → f r₀

operands: like Addf [ff:f]

Returns x*y+z on the belt.
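
The effect of the single rounding can be illustrated off-hardware. The following Python sketch is a reference model only, not Mill code: the names `fmafe_model` and `separate` are hypothetical, only finite double-precision inputs are considered, and the model relies on exact rational arithmetic followed by one conversion back to a float, which rounds to nearest with ties to even.

```python
from fractions import Fraction

def fmafe_model(x: float, y: float, z: float) -> float:
    """Hypothetical reference model: x*y+z computed exactly, then rounded once."""
    exact = Fraction(x) * Fraction(y) + Fraction(z)
    return float(exact)  # the only rounding step: nearest, ties to even

def separate(x: float, y: float, z: float) -> float:
    """Unfused version: the product is rounded, then the sum is rounded again."""
    return x * y + z

# x*y is exactly 1 - 2**-60, which a separate multiply rounds to 1.0.
x = 1.0 + 2.0**-30
y = 1.0 - 2.0**-30
z = -1.0

fused = fmafe_model(x, y, z)    # keeps the low-order part: -2**-60
unfused = separate(x, y, z)     # loses it: 0.0

# Ties go to the neighbour with an even significand.
tie_down = fmafe_model(2.0**53, 1.0, 1.0)  # 2**53 + 1 rounds to 2**53
tie_up = fmafe_model(2.0**53, 1.0, 3.0)    # 2**53 + 3 rounds to 2**53 + 4
```

In this model the fused result retains the term that the separate multiply discards, which is the precision gain described above.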

encoding: fmafe(f x) , exuArgs(op arg0, op arg1)

Core	In Slots	Latencies
Silver	E0 E1	w,w:w=6 wv,wv:wv=6 d,d:d=7 dv,dv:dv=7 q,q:q=8 qv,qv:qv=8
Gold	E0 E1 E2 E3	w,w:w=6 wv,wv:wv=6 d,d:d=7 dv,dv:dv=7 q,q:q=8 qv,qv:qv=8

fmafe(f x, f y, f z, f w) → f r₀, f r₁

operands: like Fmasf [ff:f]

This is a fused multiply-add-subtract. It yields both the sum and the difference of the two products, which is an excellent way to make full use of all functional units in the two slots.
r₀ is x*y+z*w
r₁ is x*y-z*w
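
The two results can be sketched in Python as a reference model. This is an illustration under assumptions, not Mill code: the name `fmafe4_model` is hypothetical, only finite double-precision inputs are considered, and the page does not say how the intermediate products are rounded, so the sketch assumes each result is rounded once from the exact value.

```python
from fractions import Fraction

def fmafe4_model(x: float, y: float, z: float, w: float) -> tuple[float, float]:
    """Hypothetical model of the four-operand form: returns (r0, r1)."""
    xy = Fraction(x) * Fraction(y)   # exact product, no rounding
    zw = Fraction(z) * Fraction(w)   # exact product, no rounding
    r0 = float(xy + zw)              # x*y + z*w, rounded once (assumption)
    r1 = float(xy - zw)              # x*y - z*w, rounded once (assumption)
    return r0, r1

r0, r1 = fmafe4_model(3.0, 4.0, 2.0, 5.0)  # 12 + 10 and 12 - 10
```

Both results share the same two products, which is why a single operation can deliver them together.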

encoding: fmafe(f x, f y) , exuArgs(op arg0, op arg1)

Core	In Slots	Latencies
Silver	E0	w,w:w,w=6,6 wv,wv:wv,wv=6,6 d,d:d,d=7,7 dv,dv:dv,dv=7,7 q,q:q,q=8,8 qv,qv:qv,qv=8,8
Gold	E0 E2	w,w:w,w=6,6 wv,wv:wv,wv=6,6 d,d:d,d=7,7 dv,dv:dv,dv=7,7 q,q:q,q=8,8 qv,qv:qv,qv=8,8
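
The latency cells in the tables above can be read mechanically. The sketch below is a small Python helper with a hypothetical name, `parse_latencies`. It assumes each space-separated entry has the form operand tags, colon, result tags, equals sign, then one latency value per result. The tags themselves (`w`, `d`, `q`, and the `v` suffix) are not defined on this page, so the helper treats them as opaque strings.

```python
def parse_latencies(cell: str) -> dict[str, tuple[int, ...]]:
    """Map each 'operands:results' signature to its tuple of latency values."""
    table = {}
    for entry in cell.split():
        signature, _, values = entry.partition("=")
        table[signature] = tuple(int(v) for v in values.split(","))
    return table

# Cells copied from the Silver rows of the two tables.
one_result = parse_latencies(
    "w,w:w=6 wv,wv:wv=6 d,d:d=7 dv,dv:dv=7 q,q:q=8 qv,qv:qv=8"
)
two_results = parse_latencies(
    "w,w:w,w=6,6 wv,wv:wv,wv=6,6 d,d:d,d=7,7 dv,dv:dv,dv=7,7 q,q:q,q=8,8 qv,qv:qv,qv=8,8"
)
```

Looking up `one_result["d,d:d"]` then gives the latency listed for that signature, and the two-result form yields one value per result.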


Revision as of 18:53, 20 December 2014, by Generator
Revision as of 01:35, 3 January 2015, by Generator: m (Protected "Instruction Set/fmafe": generated ([Edit=<protect-level-bot>] (indefinite) [Move=<protect-level-bot>] (indefinite)))
(No difference)