Friday, March 25, 2011

Fast iteration over STL vector elements

The STL class std::vector is a great container for managing dynamic arrays. Unfortunately it introduces a slight overhead when iterating through it using an index. For example, the following code is not very efficient:

std::vector<int> data( /* some data */ );
for(int i = 0; i < data.size(); i++)
{
  foo(data[i]);
}
First, the size of the array is not stored explicitly in the array structure, therefore data.size() has to calculate the size on each call. Also, the index operator data[i] introduces some overhead such as checking for index-out-of-bounds and looking up the starting address of the array.

Using STL iterators, it's possible to get rid of these disadvantages:
for(std::vector<int>::const_iterator it = data.begin(); 
    it != data.end(); ++it)
{
  foo(*it);
}
data.begin() and data.end() just read a member element of the std::vector object and therefore are really fast. If you also need access the index in your for-loop, you can calculate it by calling index = it - data.begin().

The third alternative which gives you the fastest code for both element access and index it to transform the std::vector into a C-style array:
const int num = data.size();
const int* ptr = (num > 0) ? data.data() : nullptr;
for(int i = 0; i < num; i++)
{
  bar[i] = foo(ptr[i]);
}
Be sure to check that the array has at least one element, otherwise data.data() fails. This method is also useful if you need to pass a pointer to an array to a method which does not accept a std::vector without copying the data.
void someFunc(const int* p, int N);
/* ... */
const int num = data.size();
const int* ptr = (num > 0) ? data.data() : nullptr;
someFunc(ptr, num);
Last but not least, I would like to point out a beautiful and fast method to iterate over all elements of a std::vector: the C++ 11 range-based loop.
/* ... */
std::vector<int> data = /* ... */;
for( auto i : data )
{
  foo(i);
}
Besides a very simple syntax, the range-based for loop maximizes the speed for iterating over all elements. And that's exactly what I have found out myself. So I suggest, for simplicity and efficiency, use this feature whenever possible.

On older compilers, you could also use the BOOST_FOREACH loop from the Boost-Library.
#include <boost/foreach.hpp>
/* ... */
std::vector<int> data = /* ... */;
BOOST_FOREACH( int i, data )
{
  foo(i);
}

7 comments:

  1. I have been doing research on iterating over a Vector, and I have found the fastest way to do so: grabbing a pointer to the beginning of the array, and manually incrementing the pointer to the element and continuing to loop as long as it's less than or equal to the pointer to the last element. BOOST_FOREACH is faster than using iterators, but accessing elements with data[i] is actually faster than using iterators or boost, almost twice as fast actually; and it turns out using data.at(i) is muuch slower than all of the above. My build was done using Microsoft Visual Studio 2008.

    With this data set:
    std::vector numbers(100000000,5);

    The following took more than 440ms.
    BOOST_FOREACH( int i, numbers )
    {
    foo(i);
    }

    Whereas the following took about 150ms.
    if(numbers.size() > 0)
    {
    const size_t nums = numbers.size();
    const int* lastInt = &numbers[nums - 1];
    int* pInt = &numbers[0];
    for( ; pInt <= lastInt; ++pInt)
    {
    foo(*pInt);
    }
    }

    It's not as pretty, sure, but it's faster. Muuch faster.


    leetNightshade.com

    ReplyDelete
  2. In addition to my last post, I would like to add it of course depends on the compiler. With Visual Studio 2008 the BOOST_FOREACH didn't run as well as a custom solution; however, with Visual Studio 2010 it runs damn quick with optimizations enabled.

    ReplyDelete
  3. BOOST_FOREACH,,, Love you boost
    It help me out from a very critical situation... <3
    so fast , even faster and fastest
    now i have to read more about BOOST

    ReplyDelete
  4. This article is quite inaccurate. If you make statements about code efficiency, please make measurements about the actual performance and determine exactly what causes the differences! In this case, some ideas are presented that result in obfuscated code that is NOT faster.

    For example, the simple for loop with indexing is optimal, as long as you don't compare signed with unsigned, so the canonical form

    for(unsigned int i = 0; i != data.size(); ++i)
    {
    foo(data[i]);
    }

    will be perfectly optimized by a decent compiler. The size() operation in particular is optimized away. Note that for Visual Studio, you need to make sure it does what you want, because unfortunately it has by default bounds checking, even for release mode. Turn it off with

    #define _SECURE_SCL 0

    Of course, the most modern way to write such a loop in C++ is

    for( int d : data)
    {
    foo(d);
    }

    So *trust* your optimizer and *trust* your compiler, but *know* what the pitfalls are.

    ReplyDelete
  5. Yes, very inaccurate info. I did actual measurements by different compilers and regular looping with indexing is the fastest. Why calculate the size of the vector at each call? std::vector::at() does range-checking.

    ReplyDelete
  6. Today i was testing and mesuring a little bit the iteration of a std::vector. I would like to share my results of it with you.

    First of all, the best method really is to use rangedBased for
    http://en.cppreference.com/w/cpp/language/range-for

    for (const auto& current : data)
    {
    foo(curent)
    }

    but here are the results in the link: https://drive.google.com/file/d/0B2Lt3rFKM1U0ODVIZ2dyRERfVEE/view?usp=sharing

    and here the used code
    https://drive.google.com/file/d/0B2Lt3rFKM1U0OTlIbFN1dEJmUDg/view?usp=sharing

    VS2015 Intel compiler 17.0
    vector had size from 100 - 5000 with simple int as data
    mesured clock cycles, all cpu speedup magic disabled, to have proper results

    my advise is in this case, let the compiler do the work, use the standard.

    ReplyDelete