Home

PLF C++ Library - plf::indiesort

Introduction

Indiesort (portmanteau of 'indirect sort') is a sort-function wrapper which enables use of std::sort (or other random-access sort techniques) with non-random-access containers, and also improves sort performance for large/non-trivially-copyable types in random-access containers and arrays. It achieves this by sorting pointers/indexes to elements by the value of the elements they point to, then relocating the elements, rather than sorting the elements themselves. Elements will be moved a maximum of two times - most will be moved once or not at all.

It has a temporary memory cost of N * (sizeof(pointer) + sizeof(size_t)) for sorting non-random-access iterators/containers (or N * sizeof(element) when sizeof(element) is smaller than (sizeof(pointer) + sizeof(size_t))),
and a N * sizeof(U) cost for random-access iterators/containers, where U = the smallest unsigned integer able to store N. For example if the size of the range being sorted is <= 255, U will be unsigned char.

Similar techniques are implemented by Morwenn's Mountain sort, and elsewhere. plf::list employs a similar technique in it's sort() techniques, but is able to do it much faster for larger-than-scalar types due to the structure of a linked list and only having to change pointers, not move elements. However when using plf::list with scalar types smaller than 8 bytes, you will get a 15% on average (up to 70%) performance increase using indiesort - at the cost of losing the validity of iterators/pointers pointing to the plf::list elements (as, unlike plf::list::sort(), indiesort changes element locations). So if you're using plf::list on larger-than-scalar types, don't use indiesort with it, just use it's own sort() technique.

Indiesort should be used when:

  1. The temporary memory cost mentioned is non-problematic,
  2. The container or iterators are not random_access and therefore std::sort cannot be used, and/or
  3. The element type is large or non-trivially-movable/copyable.

It is, on average across 1, 2, 4, 48 and 490 bytes, and numbers of elements ranging from 10 to 1000000:

See the benchmarks for more information.

Motivation

This technique developed out of the plf::list project, where it was used with the linked list to create a much faster sort than was available in std::list. From there it was ported into plf::colony, which, while having a large memory cost compared to plf::list's sort, was still useful. On Jens Maurer's suggestion, I separated the sort technique from colony, and in the process, streamlined and reduced the memory cost. It can now be used with any iterator/pointer range or container. Note: plf::list's internal sort is still superior to indiesort for a linked list, as it changes the previous and next pointers of each node rather than relocating elements, whereas indiesort will always reallocate elements.

Implementation

As of v1.3, the algorithm changes for non-random-access containers/iterators if sizeof(element) <= 2 * sizeof(element *) and the element is trivially-copyable or move-assignable - in this case, it will simply copy the elements to a C-array and sort the array, then copy the elements back. This becomes faster when the type size is small as there is less data to sort. In all other cases, the original method below is used.

As of v1.4, for random-access containers/iterators the algorithm will switch to directly sorting elements via std::sort when the sizeof(element) <= 152 (the benchmarked threshold for improved performance via indirect sorting) and the type is trivially-copyable or move-assignable. Otherwise the original method below is used.

The algorithm consists of gather, sort and scatter phases:

  1. Gather pointers:
    1. Obtain the size N of the range to be sorted, either via a container's size() function, last - first for random-access iterator ranges, or ++ iteration for non-random-access iterator ranges. If size is < 2 we return.
    2. Construct an array of pointer+size_t pair structs of size N. Fill the array with pointers to elements in the range + the corresponding index of each element within that range. This is the sort_array. It's pointer member is called original_location, while it's size_t member is called original_index.
      Note: For random_access iterators, original_location can be obtained from first + original_index, and can be omitted, with the size_t reduced to the smallest unsigned integer type whose std::numeric_limits::max() is larger than N. This significantly reduces memory usage in those scenarios.
  2. Sort the sort_array via the values of the elements which the original_location members point to, by default using std::sort. This is done via use of a sort_dereferencer functor of a type similar to that used in plf::list's pointer-sort hack section.
  3. Scatter the elements to their new locations: to do this we traverse the sort_array using index i in the loop below:
    1. If the end of the array has been reached at this point we exit the loop. If sort_array[i].original_index == i this means element position is unchanged post-sort (or that it has already been sorted as part of the move-chain below), so we increment i and repeat this step.
    2. Otherwise we move the element which this pair's original_location points to, into a temporary variable of the same element type called end_value. We then store i as destination_index and sort_array[i].original_index as source_index.
    3. We move the element pointed to by sort_array[source_index].original_location, to the location pointed to by sort_array[destination_index].original_location. We then set destination_index to the value of source_index. and set source_index to sort_array[destination_index].original_index. Then set sort_array[destination_index].original_index to destination_index (ie. indicating that the element pointed to has been sorted so it can be skipped in step (a)). At this point, if source_index == i, that means the end of this move-chain has been reached, so we move the element from end_value to the location pointed to by sort_array[destination_index].original_location, increment i, and return to step (a). Otherwise we repeat this step.
  4. Once the loop is over, we destroy and free the sort_array.

License

plf::indiesort is under a permissive zlib license. This means: Software is used on 'as-is' basis. It can be modified and used in commercial software. Authors are not liable for any damages arising from its use. The distribution of a modified version of the software is subject to the following restrictions:

Download

Download here (6kb zip file) or view the repository

The indiesort function library is a simple .h header file, to be included with a #include command.
In addition if you are interested in benchmarking you can also download the plf benchmark suite (64kb zip file).

Function Descriptions and syntax

To use a sort function other than std::sort, #define PLF_SORT_FUNCTION to the name of your sort function before calling plf::indiesort. Whatever sort function you use must take arguments in the same format and order as std::sort. If PLF_SORT_FUNCTION is not defined STL algorithm.h will be included and std::sort will be used instead. To avoid the plf_indiesort.h pulling in algorithm.h PLF_SORT_FUNCTION must be defined before including plf_indiesort.h.

Basic code example

#include <functional> // std::greater
#include <vector>
#include <iostream>

#include "plf_indiesort.h"


int main()
{
	std::vector<int> vec;

	for (int i = 0; i != 60; ++i)
	{
		vec.push_back(60 - i);
	}

	plf::indiesort(vec);

	int last = 0;

	for (std::vector<int>::iterator it = vec.begin(); it != vec.end(); ++it)
	{
		if (last > *it)
		{
			std::cout << "Less-than full-container sort failed\n";
			return -1;
		}

		last = *it;
	}

	std::cout << "Less-than full-container sort passed\n";


	std::vector<int>::iterator it1 = vec.begin() + 5, it2 = vec.begin() + 50;

	plf::indiesort(it1, it2, std::greater<int>());


	last = 60;

	for (std::vector<int>::iterator it = it1; it != it2; ++it)
	{
		if (last < *it)
		{
			std::cout << "Greater-than range sort failed\n";
			return -1;
		}

		last = *it;
	}

	std::cout << "Greater-than range sort passed\n";

        return 0;
}

Benchmarks

Benchmark results for indiesort v1.02 under GCC 9.2 x64 on an Intel Xeon E3-1241 (Haswell core) are here.

Version history

Contact: footer
plf:: library and this page Copyright (c) 2023, Matthew Bentley