This article explains how to set up a vector math library that performs mathematical operations on arbitrarily sized float arrays, or vectors. It handles both aligned and unaligned pointers with minimal code overhead while retaining optimal runtime performance. This is achieved through template functions and compile-time arguments.
This tutorial is based on a simple code example: adding two arrays and storing the result in a third array. Step by step, we will introduce SSE intrinsics, loop unrolling and functor-based concepts, which allow us to build a library with different operations.
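As a rough illustration of the starting point described above, the array addition could be sketched as follows. This is a minimal sketch, not the article's actual code: the function name add_arrays and the boolean template parameter Aligned are assumptions made for this example, and it covers neither loop unrolling nor functors.

```cpp
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)

// Hypothetical helper: element-wise addition, dst[i] = a[i] + b[i].
// Aligned is a compile-time argument, so the choice between aligned and
// unaligned loads/stores is resolved by the compiler, not at run time.
// With Aligned == true, all three pointers must be 16-byte aligned.
template <bool Aligned>
void add_arrays(float* dst, const float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
    // Process four floats per iteration in one 128-bit SSE register.
    for (; i + 4 <= n; i += 4) {
        __m128 va = Aligned ? _mm_load_ps(a + i) : _mm_loadu_ps(a + i);
        __m128 vb = Aligned ? _mm_load_ps(b + i) : _mm_loadu_ps(b + i);
        __m128 sum = _mm_add_ps(va, vb);
        if (Aligned) {
            _mm_store_ps(dst + i, sum);
        } else {
            _mm_storeu_ps(dst + i, sum);
        }
    }
    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

A caller that knows its buffers are 16-byte aligned would use add_arrays<true>(dst, a, b, n), and add_arrays<false>(dst, a, b, n) otherwise. Because the flag is a template argument, each instantiation contains only the load and store instructions it needs.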
Friday, December 30, 2011
Wednesday, December 28, 2011
In this post we show how to write a simple class that represents a 3D vector and uses SSE operations for fast calculations. The class stores three float values (x, y and z) and implements all basic vector operations such as addition, subtraction, multiplication, division, cross product, dot product and length calculation. It uses aligned 128-bit memory, which allows SSE intrinsics to be used directly. In addition, it overloads the new and delete operators for arrays, which allows creating multiple instances of