The classic case of a slowdown caused by the global rounding mode is float-to-int conversion in C:
the rules say you chop (truncate toward zero), and on the 80x87 this forces the rounding mode to change twice (to chop and back again), which costs a few
dozen cycles; see http://stereopsis.com/FPU.html.
MSVC has a special option, /QIfist (https://docs.microsoft.com/en-us/cpp/build/reference/qifist-suppress-ftol?view=vs-2017), that avoids this double change and lets the conversion round in the current rounding mode instead.
- This reply was modified 5 years, 2 months ago by gideony2.