FastC++: Coding Cpp Efficiently: Changing the sign of float values using SSE code

Monday, March 28, 2011

Changing the sign of float values using SSE code

On practically all current platforms, the C++ float datatype uses the IEEE 754 single-precision floating point format, which defines its memory layout. It consists of a 1-bit sign, an 8-bit exponent, and 23 bits that store the fractional part of the value.

float x = [sign (1 bit) | exponent (8 bit) | fraction (23 bit)]
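
To make the layout concrete, here is a minimal sketch that splits a float into its three fields. The names FloatBits and decompose are hypothetical helpers introduced for illustration, and the code assumes float is a 32-bit IEEE 754 value.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type for illustration; not part of the original post.
struct FloatBits {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with a bias of 127
    std::uint32_t fraction;  // 23 bits
};

// Assumption: float is a 32-bit IEEE 754 single-precision value.
inline FloatBits decompose(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // copy out the raw bit pattern
    return FloatBits{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For example, decompose(-1.0f) yields sign 1, exponent 127 and fraction 0, so -1.0f and 1.0f differ only in the topmost bit.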

We can use this knowledge of the memory layout to change the sign of floating point values without any floating point arithmetic. For example, calculating the absolute value of a floating point number is equivalent to setting the sign bit to zero. In SSE we can do this for four float values simultaneously by using a binary mask and logical operations:

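#include <emmintrin.h> // SSE2 intrinsics (_mm_set1_epi32, _mm_castsi128_ps)

// Mask with only the sign bit set in each of the four lanes
static const __m128 SIGNMASK =
               _mm_castsi128_ps(_mm_set1_epi32(0x80000000));
__m128 val = /* some value */;
// andnot computes (~SIGNMASK) & val, so the mask has to be the first argument
__m128 absval = _mm_andnot_ps(SIGNMASK, val); // absval = abs(val)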
//...
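
As a self-contained sketch of the snippet above, the following wraps the mask and the and-not in a small function. The name abs4 and the use of unaligned loads and stores are assumptions made for illustration, not something the original post prescribes.

```cpp
#include <emmintrin.h>  // SSE2: _mm_set1_epi32, _mm_castsi128_ps (includes SSE)

// Hypothetical wrapper: writes |in[i]| to out[i] for i = 0..3.
inline void abs4(const float* in, float* out) {
    // Only the sign bit is set in each 32-bit lane.
    const __m128 signmask =
        _mm_castsi128_ps(_mm_set1_epi32(static_cast<int>(0x80000000u)));
    const __m128 val = _mm_loadu_ps(in);  // unaligned load of four floats
    // _mm_andnot_ps(a, b) computes (~a) & b, so the mask goes first.
    _mm_storeu_ps(out, _mm_andnot_ps(signmask, val));
}
```

Swapping the two arguments of _mm_andnot_ps would instead compute (~val) & SIGNMASK, which is not the absolute value, so the operand order matters here.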

In a similar way we can negate floating point numbers by simply flipping their highest bit, the sign bit:

__m128 val = /* some value */;
__m128 minusval = _mm_xor_ps(val, SIGNMASK); // minusval = -val
//...
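
A runnable sketch of the negation, under the same assumptions as before: negate4 is a hypothetical wrapper name, and unaligned loads and stores are used so that any float array works.

```cpp
#include <emmintrin.h>  // SSE2: _mm_set1_epi32, _mm_castsi128_ps (includes SSE)

// Hypothetical wrapper: writes -in[i] to out[i] for i = 0..3.
inline void negate4(const float* in, float* out) {
    // Only the sign bit is set in each 32-bit lane.
    const __m128 signmask =
        _mm_castsi128_ps(_mm_set1_epi32(static_cast<int>(0x80000000u)));
    const __m128 val = _mm_loadu_ps(in);
    // XOR with the mask flips the sign bit and leaves all other bits alone.
    // XOR is commutative, so the argument order does not matter here.
    _mm_storeu_ps(out, _mm_xor_ps(val, signmask));
}
```

Unlike the and-not used for the absolute value, _mm_xor_ps(val, SIGNMASK) and _mm_xor_ps(SIGNMASK, val) give the same result.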

5 comments:

Einschenker, May 26, 2011 at 12:48 PM
Just wanted to point out that there's a zero missing in your mask :)
Should be: static const __m128 SIGNMASK = _mm_castsi128_ps(_mm_set1_epi32(0x80000000));
theowl84, June 1, 2011 at 9:06 AM
Thank you, that was a typo. It's now corrected.
Anonymous, November 23, 2012 at 4:31 PM
Thank you, exactly what I was looking for.
But actually you have to use _mm_xor_ps(SIGNMASK, val) because the SSE instruction negates the first entry and not the second.
Anonymous, March 24, 2015 at 7:52 PM
Just to point out that another, slightly more efficient (yet mathematically unpleasant) way of generating SIGNMASK is _mm_set1_ps(-0.0f), as in _mm_xor_ps(val, _mm_set1_ps(-0.0f)). I don't know about the portability, but I have seen lots of references to this technique elsewhere.
Agatha Mallett, September 23, 2015 at 4:39 AM
If you compare the assembly outputs, this should be faster (three instructions):

__m128 vec = _mm_load_ps1(&f);
vec = _mm_and_ps(vec, _mm_castsi128_ps(_mm_set_epi32(0,0,0,0x7FFFFFFF)) ); // clear the sign bit of the lowest lane
f = _mm_cvtss_f32(vec);

Although, Clang's `fabsf` does it in two, which I think is minimal. Probably, it was handcoded assembly.