This article explains how to calculate the length of a single 3D float vector stored in an SSE register. The length, or
norm, of a vector is defined as the square root of the dot product of the vector with itself:
|v| = length3(v) = sqrt(v.x^2 + v.y^2 + v.z^2)
A single SSE register can hold a 3D vector (the highest 32 bits are unused). In a previous article we showed how to load a struct containing 3 float values into an SSE register. Alternatively, you can use the _mm_setr_ps(x, y, z, 0) intrinsic.
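As a minimal sketch of the _mm_setr_ps approach, the snippet below packs three floats into a register and copies the lanes back to memory so the layout can be checked. The helper names make_vec3 and store_lanes are hypothetical and exist only for this illustration; only the intrinsics themselves are part of the SSE API.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_setr_ps, _mm_storeu_ps */

/* Hypothetical helper: pack x, y, z into lanes 0..2 of the register.
   _mm_setr_ps takes its arguments in memory order, so x ends up in the
   lowest 32 bits and the unused highest lane is set to 0. */
static inline __m128 make_vec3(float x, float y, float z)
{
    return _mm_setr_ps(x, y, z, 0.0f);
}

/* Hypothetical helper: copy all four lanes to an array for inspection.
   The unaligned store is used so that out needs no special alignment. */
static inline void store_lanes(__m128 v, float out[4])
{
    _mm_storeu_ps(out, v);
}
```

Calling store_lanes(make_vec3(1.0f, 2.0f, 3.0f), out) leaves out[0] = 1, out[1] = 2, out[2] = 3 and out[3] = 0. Note that _mm_set_ps (without the r) takes its arguments in the reverse order.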
SSE4.1 introduced the DPPS instruction (accessible via the _mm_dp_ps intrinsic), which computes the dot product of two vectors of up to four float values each. We will now use this intrinsic to calculate the length of a 3D vector with a minimal number of instructions.
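One possible way to put this together is sketched below. It is an illustration under stated assumptions, not necessarily the shortest sequence on every CPU: the function name length3 follows the formula above, the SSE41_TARGET macro is a hypothetical convenience so that GCC and Clang accept the SSE4.1 intrinsic without a global -msse4.1 flag, and the code assumes the CPU actually supports SSE4.1.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_dp_ps (also pulls in the older SSE headers) */

/* Hypothetical convenience macro: enable SSE4.1 for a single function on
   GCC/Clang. Other compilers are assumed to need no per-function attribute. */
#if defined(__GNUC__) || defined(__clang__)
#define SSE41_TARGET __attribute__((target("sse4.1")))
#else
#define SSE41_TARGET
#endif

/* Length of the 3D vector held in lanes 0..2 of v; lane 3 is ignored. */
SSE41_TARGET
static float length3(__m128 v)
{
    /* Mask 0x71: the high nibble (0x7) selects lanes 0, 1 and 2 as inputs
       to the dot product, the low nibble (0x1) writes the sum to lane 0
       only and zeroes the other lanes. */
    __m128 dot = _mm_dp_ps(v, v, 0x71);

    /* Scalar square root of lane 0, then extract it as a float. */
    return _mm_cvtss_f32(_mm_sqrt_ss(dot));
}
```

For example, length3(_mm_setr_ps(3.0f, 4.0f, 0.0f, 0.0f)) returns 5. Because the mask excludes the highest lane from the dot product, the result does not depend on whatever value happens to be stored there.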