plf::colony is the highest-performance C++ template-based data container for high-modification scenarios with unordered data. All elements within a colony have a stable memory location, meaning that pointers/iterators to non-erased elements are valid regardless of insertions/erasures to the container. Specifically the container provides better performance than other std:: library containers when:
While the benchmarks later on the page are a better way to get a feel for the performance benefits, the general performance characteristics are:
As explored further in the benchmarks there are some vector/deque modifications which can outperform colony during iteration while maintaining pointer validity, but at a cost to usability and memory usage and without colony's insert/erasure advantages. Colony's other advantages include freeing and/or recycling of unused memory on-the-fly, guaranteed stability of pointers/references/iterators to non-erased elements (which makes programming with containers of inter-related data structures faster and easier), and broad cross-compiler support.
A colony's structure can be reasonably compared to a "bucket array", where each element location can be emptied and skipped over. Unlike some bucket arrays it does not use keys or ids, it iterates faster due to the use of an advanced skipfield type (previously a high complexity jump-counting pattern, now a low complexity jump-counting pattern) instead of a boolean field, and it frees or recycles unused memory from erased elements on-the-fly. It was initially developed predominantly for game development, so is designed to favour larger-than-scalar-type (structs/classes of greater-than 128 bits total) performance over scalar-type (float, int, etc) performance. It supports over-aligned types.
Colony is the container upon which the current standards proposal for "std::hive" is based. The name and some functionality have diverged as a result of colony having to cater for C++03/11/etc containers, some functions being seen as unnecessary, and the committee viewing colony as an inappropriate name for various reasons. A reference implementation for hive is here (C++20 and upwards only).
Note: initial motivation for the project came from video game engine development. Since this point it has become a more generalized container. Nevertheless, the initial motivation is presented below.
When working on video game engines we are predominantly dealing with collections of data where:
To meet these requirements, game developers tend to either (a) develop their own custom containers for a given scenario or (b) develop workarounds for the failings of std::vector. These workarounds are many and varied, but the most common are probably:
Advantages: Fast erasure.
Disadvantages: Slow to iterate due to branching and unpredictable number of skipfield checks between any two non-skipped elements.
Advantages: Faster iteration.
Disadvantages: Erasure still incurs some reallocation cost, can increase jitter. Insertion comparatively slow.
Advantages: Iteration is at standard vector speed.
Disadvantages: Erase could be slow if objects are large and swap cost is therefore large. Insertion cost is larger than other techniques. All references to elements incur additional costs due to a two-fold reference mechanism.
All three techniques have the disadvantage of slow singular insertions, and the first two will also continually expand memory usage when erasing and inserting over periods of time. The third deals better with this scenario as it swaps from the back rather than leaving gaps in the elements vector, however will suffer in performance if elements within the container are heavily referred to by external objects/elements.
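As a rough illustration of the boolean-flag style of workaround, the sketch below pairs a std::vector of elements with a parallel vector of "erased" flags. The names flagged_vector, erase_at and sum_live are hypothetical and exist only for this example; it is not code from any particular engine, nor from colony.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: a vector plus one "erased" flag per element.
struct flagged_vector
{
    std::vector<int> elements;
    std::vector<bool> erased;

    // Append an element and return its index.
    std::size_t insert(int value)
    {
        elements.push_back(value);
        erased.push_back(false);
        return elements.size() - 1;
    }

    // Erasure only flips a flag, so it is O(1) and nothing is shifted.
    void erase_at(std::size_t index)
    {
        assert(index < elements.size() && !erased[index]);
        erased[index] = true;
    }
};

// Iteration has to test every flag, including long runs of erased
// elements, which is where the branching cost comes from.
inline long sum_live(const flagged_vector &fv)
{
    long total = 0;
    for (std::size_t i = 0; i != fv.elements.size(); ++i)
    {
        if (!fv.erased[i]) total += fv.elements[i];
    }
    return total;
}
```

Note that erased locations are never reclaimed in this sketch, so the storage only ever grows.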
Colony is an attempt to bring a more generic solution to this domain. It has the advantage of good iteration speed while maintaining a similar erasure speed to the boolean technique described above (faster for multiple consecutive erasures), and without causing pointer invalidation upon either erasure or insertion. Its insertion speed is also much faster than a vector's. Memory from erased elements is either reused by subsequent insertions or released to the OS on-the-fly. It achieves these ends via the mechanisms described in the Intro above, examined further in the Implementation section below.
More data on the performance characteristics of the colony versus other containers and workarounds can be found in the benchmarks section; read on to get an understanding of the mechanics.
Any colony implementation is defined by 3 different aspects:
For the first of these, plf::colony uses a chained-group memory allocation pattern with a growth factor of 2, which can otherwise be described as a doubly-linked list of element "groups" - memory blocks with additional structural metadata, including in this case per-memory-block low complexity jump-counting skipfields. A growth factor of 2 tends to give the best performance in the majority of scenarios (hence why it is also used for the majority of vector implementations). Colony's minimum and maximum memory block sizes can also be adjusted to suit individual use-cases. Using a multiple-memory-block approach removes the necessity for data reallocation upon insertion (compared to a vector). Because data is not reallocated, all references, pointers and iterators to container elements stay valid post-insertion.
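The chained-block idea can be sketched in a few lines. The following is a minimal illustration only, assuming a starting block capacity of 8 and no maximum block size; chained_blocks and its members are hypothetical names and this is not colony's actual code (it has no erasure, skipfield or iterators).

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical sketch of chained memory blocks with a growth factor of 2.
template <class T>
class chained_blocks
{
    struct block
    {
        std::vector<T> elements;
        std::size_t capacity; // fixed once the block is created
    };

    std::list<block> blocks;   // doubly-linked chain of blocks
    std::size_t next_capacity; // capacity the next new block will get

public:
    explicit chained_blocks(std::size_t first_capacity = 8) : next_capacity(first_capacity)
    {
        assert(first_capacity != 0);
    }

    // The returned pointer stays valid for the container's lifetime,
    // because a block is never grown past the capacity it reserved.
    T *insert(const T &value)
    {
        if (blocks.empty() || blocks.back().elements.size() == blocks.back().capacity)
        {
            blocks.emplace_back();
            blocks.back().capacity = next_capacity;
            blocks.back().elements.reserve(next_capacity);
            next_capacity *= 2; // growth factor of 2
        }
        blocks.back().elements.push_back(value);
        return &blocks.back().elements.back();
    }

    std::size_t block_count() const { return blocks.size(); }
};
```

Inserting 1000 elements with these assumptions creates blocks of 8, 16, 32, 64, 128, 256 and 512 elements, and a pointer taken to the first element remains usable throughout.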
For the second of these aspects, plf::colony uses an advanced skipfield type called a low complexity jump-counting pattern (versions prior to v5.0 utilized the high complexity jump-counting skipfield pattern). The third aspect is currently an intrusive list of groups containing erased elements, combined with per-group free-lists of skipblocks. A skipblock, in colony terminology, refers to a run of contiguous erased elements - including runs of only a single erased element. Versions prior to v5.0 utilized free lists of individual skipped elements, and versions prior to v4.5 utilized a stack of pointers to individual skipped elements.
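To give a feel for how a jump-counting skipfield lets iteration hop over a whole skipblock in one step, here is a simplified sketch for a single block. It assumes the first and last node of each skipblock store the skipblock's length, with one extra always-zero node at the end of the field. The type and function names are hypothetical, and colony's real implementation differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Simplified, hypothetical sketch of a low complexity jump-counting
// skipfield for one block of n elements.
struct jump_counting_skipfield
{
    // 0 = element present. For a run of erased elements (a skipblock) the
    // first and last nodes hold the run length; interior nodes are merely
    // non-zero. The extra trailing node is always 0.
    std::vector<unsigned short> field;

    explicit jump_counting_skipfield(std::size_t n) : field(n + 1, 0)
    {
        assert(n <= std::numeric_limits<unsigned short>::max());
    }

    std::size_t size() const { return field.size() - 1; }

    void erase(std::size_t i)
    {
        assert(i < size() && field[i] == 0);
        const std::size_t left = (i != 0) ? field[i - 1] : 0; // run ending at i - 1
        const std::size_t right = field[i + 1];               // run starting at i + 1

        field[i] = 1; // any non-zero value marks the node as erased
        const unsigned short merged = static_cast<unsigned short>(left + right + 1);
        field[i - left] = merged;  // first node of the (merged) skipblock
        field[i + right] = merged; // last node of the (merged) skipblock
    }

    // Index of the first non-erased element, or size() if there is none.
    std::size_t first() const { return field[0]; }

    // Index of the next non-erased element after i, or size() if none.
    std::size_t next(std::size_t i) const
    {
        ++i;
        return i + field[i]; // adds 0, or jumps the whole skipblock
    }
};

inline std::vector<std::size_t> live_indices(const jump_counting_skipfield &s)
{
    std::vector<std::size_t> result;
    for (std::size_t i = s.first(); i != s.size(); i = s.next(i)) result.push_back(i);
    return result;
}
```

The point of the design is that next() contains no loop and no branch: however long the run of erased elements is, it is skipped with a single addition.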
Due to a std::vector being the most widely-used and commonly-understood of the std:: library containers, we will contrast the storage mechanisms of a colony with that of a vector, starting with insertion:
The lack of reallocation is also why insertion into a colony is faster than insertion into a std::vector. Now we look at erasure. When an element is erased in a std::vector, the following happens:
When an element is erased in a colony, this happens:
This is why a colony has a performance advantage over std::vector when erasing.
Upon subsequent insertions to a colony post-erasure, colony will check to see if its intrusive list of groups-with-erasures is empty (described earlier). If empty, it inserts to the end of the colony, creating a new group if the last group is already full. If the intrusive list of groups-with-erasures isn't empty, the colony takes the first group within that list and re-uses the first memory location from the free list of skipblocks within that group. If you erase all elements in any given group in a colony, the group is removed from the colony's chain of groups and released to the OS, and removed from the intrusive list of groups-with-erasures.
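The reuse-before-append policy can be illustrated with a much simpler structure than the one colony uses. The sketch below assumes a single block and a plain stack of erased slot indices (closer in spirit to the pre-v4.5 stack approach mentioned earlier than to the current intrusive lists); recycling_block and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-block sketch of erased-slot reuse.
class recycling_block
{
    std::vector<int> slots;
    std::vector<bool> live;
    std::vector<std::size_t> erased_slots; // stack of reusable locations

public:
    // Reuse an erased location if one exists, otherwise append to the back.
    std::size_t insert(int value)
    {
        if (!erased_slots.empty())
        {
            const std::size_t index = erased_slots.back();
            erased_slots.pop_back();
            slots[index] = value;
            live[index] = true;
            return index;
        }
        slots.push_back(value);
        live.push_back(true);
        return slots.size() - 1;
    }

    // Mark the location as erased and remember it for later reuse.
    void erase(std::size_t index)
    {
        assert(index < slots.size() && live[index]);
        live[index] = false;
        erased_slots.push_back(index);
    }

    int at(std::size_t index) const
    {
        assert(index < slots.size() && live[index]);
        return slots[index];
    }

    std::size_t slot_count() const { return slots.size(); }
};
```

After erasing slot 2 of five elements, the next insertion lands in slot 2 again, and only the insertion after that extends the storage.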
plf::colony is under a permissive zlib license. This means: Software is used on 'as-is' basis. It can be modified and used in commercial software. Authors are not liable for any damages arising from its use. The distribution of a modified version of the software is subject to the following restrictions:
Download here or view the repository
The colony library is a simple .h header file, to be included with a #include command.
In addition if you are interested in benchmarking you can also download the plf benchmark suite.
A conan package for colony is available here.
A Build2 package for colony is available here.
Colony meets the requirements of the C++ Container, AllocatorAwareContainer, and ReversibleContainer concepts.
For the most part the syntax and semantics of colony functions are very similar to all std:: c++ libraries. Formal description is as follows:
template <class T, class Allocator = std::allocator<T>, typename plf::colony_priority Priority = plf::performance> class colony
T - the element type. In general T must meet the requirements of Erasable, CopyAssignable and CopyConstructible. However, if emplace is utilized to insert elements into the colony, and no functions which involve copying or moving are utilized, T is only required to meet the requirements of Erasable. If move-insert is utilized instead of emplace, T must also meet the requirements of MoveConstructible.
Allocator - an allocator that is used to acquire memory to store the elements. The type must meet the requirements of Allocator. The behaviour is undefined if Allocator::value_type is not the same as T.
Priority - either plf::performance or plf::memory_use. In the context of this implementation this changes the skipfield type from unsigned short (performance) to unsigned char (memory_use), unless the type sizeof is <= 10 bytes, in which case the skipfield type will always be unsigned char (this always performs better for these smaller types due to the effect on cache). The skipfield type is used to form the skipfield which skips over erased elements. The maximum size of element memory blocks is also constrained by this type's bit-depth (due to the nature of a jump-counting skipfield). As an example, unsigned short on most platforms is 16-bit which thereby constrains the size of individual memory blocks to a maximum of 65535 elements (the default size is less than this, 8192 elements, for performance reasons). unsigned short has been found to be the optimal type for performance with elements larger than 10 bytes, based on benchmarking. However this template parameter is also used in a variety of scenarios within functions relating to performance or memory usage. In the case of small collections (e.g. < 512 elements) in a memory-constrained environment, it can be useful to reduce the memory usage of the skipfield by reducing the skipfield bit depth to unsigned char. In addition the reduced skipfield size will reduce cache saturation in this case without impacting iteration speed due to the low total number of elements. Since these scenarios are on a per-case basis, it is best to leave the control in the user's hands for the most part.
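The selection rule described for Priority can be written down as a small compile-time sketch. Everything here (the priority enum, skipfield_for, max_block_capacity, large_element) is a hypothetical stand-in used to restate the rule; it is not colony's internal code.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for the two priority values described above.
enum class priority { performance, memory_use };

// unsigned char when memory use is prioritised or the element is small
// (<= 10 bytes), otherwise unsigned short.
template <class T, priority P>
struct skipfield_for
{
    typedef typename std::conditional<
        (P == priority::memory_use || sizeof(T) <= 10),
        unsigned char,
        unsigned short>::type type;
};

// The skipfield type's bit-depth caps how many elements one block can hold.
template <class T, priority P>
constexpr std::size_t max_block_capacity()
{
    return std::numeric_limits<typename skipfield_for<T, P>::type>::max();
}

struct large_element { char bytes[32]; }; // sizeof is 32, so larger than 10 bytes
```

With these assumptions an int element always gets an unsigned char skipfield, while large_element gets unsigned short under the performance priority and unsigned char under memory_use.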
#include <iostream>
#include "plf_colony.h"

int main(int argc, char **argv)
{
    plf::colony<int> i_colony;

    // Insert 100 ints:
    for (int i = 0; i != 100; ++i)
    {
        i_colony.insert(i);
    }

    // Erase half of them: erase() returns an iterator to the next element,
    // which the loop's ++it then steps past, so every second element survives.
    for (plf::colony<int>::iterator it = i_colony.begin(); it != i_colony.end(); ++it)
    {
        it = i_colony.erase(it);
    }

    // Total the remaining ints:
    int total = 0;
    for (plf::colony<int>::iterator it = i_colony.begin(); it != i_colony.end(); ++it)
    {
        total += *it;
    }

    std::cout << "Total: " << total << std::endl;
    std::cin.get();
    return 0;
}
#include <iostream>
#include "plf_colony.h"

int main(int argc, char **argv)
{
    plf::colony<int> i_colony;
    plf::colony<int>::iterator it;
    plf::colony<int *> p_colony;
    plf::colony<int *>::iterator p_it;

    // Insert 100 ints to i_colony and pointers to those ints to p_colony:
    for (int i = 0; i != 100; ++i)
    {
        it = i_colony.insert(i);
        p_colony.insert(&(*it));
    }

    // Erase half of the ints:
    for (it = i_colony.begin(); it != i_colony.end(); ++it)
    {
        it = i_colony.erase(it);
    }

    // Erase half of the int pointers:
    for (p_it = p_colony.begin(); p_it != p_colony.end(); ++p_it)
    {
        p_it = p_colony.erase(p_it);
    }

    // Total the remaining ints via the pointer colony (pointers will still be valid even after insertions and erasures):
    int total = 0;
    for (p_it = p_colony.begin(); p_it != p_colony.end(); ++p_it)
    {
        total += *(*p_it);
    }

    std::cout << "Total: " << total << std::endl;

    if (total == 2500)
    {
        std::cout << "Pointers still valid!" << std::endl;
    }

    std::cin.get();
    return 0;
}
All read-only operations, swap, std::swap, splice, operator= && (source), reserve, trim_capacity | Never. |
clear, operator= & (destination), operator= && (destination) | Always. |
reshape | Only if memory blocks exist whose capacities do not fit within the supplied limits. |
shrink_to_fit | Only if capacity() != size(). |
erase | Only for the erased element. If an iterator is == end() it will be invalidated if the back element of the colony is erased (similar to deque (22.3.8)). Likewise if a reverse_iterator is == rend() it will be invalidated if the front element of the colony is erased. The same applies with cend() and crend() for const_iterator and const_reverse_iterator respectively. |
insert, emplace | If an iterator is == end() or == begin() it may be invalidated by a subsequent insert/emplace. Likewise if a reverse_iterator is == rend() or == rbegin() it may be invalidated by a subsequent insert/emplace. The same applies with cend(), cbegin() and crend(), crbegin() for const_iterator and const_reverse_iterator respectively. |
Member type | Definition |
value_type | T |
allocator_type | Allocator |
size_type | Allocator::size_type (pre-c++11) |
difference_type | Allocator::difference_type (pre-c++11) |
reference | Allocator::reference (pre-c++11) |
const_reference | Allocator::const_reference (pre-c++11) |
pointer | Allocator::pointer (pre-c++11) |
const_pointer | Allocator::const_pointer (pre-c++11) |
iterator | BidirectionalIterator |
const_iterator | Constant BidirectionalIterator |
reverse_iterator | BidirectionalIterator |
const_reverse_iterator | Constant BidirectionalIterator |
Note: from this point onward, if a feature (e.g. noexcept) is specified which is only available in a particular language version, the function itself is available for all language versions, but that aspect of it is only available when that language version or higher is used, except where otherwise specified.
standard | constexpr colony() noexcept(noexcept(allocator_type)) |
copy | colony(const colony &source) |
move (C++11 and upwards) | colony(colony &&source) noexcept Note: postcondition state of source colony is the same as that of an empty colony. |
fill | colony(size_type n, const allocator_type &alloc = allocator_type()) |
range | template<typename InputIterator> |
initializer list | colony(const std::initializer_list<value_type> &element_list, const allocator_type &alloc = allocator_type()) |
ranges v3 (C++20 and upwards) | template<class range_type> |
colony<T> a_colony
Default constructor - default minimum group size is implementation-defined, default maximum group size is std::numeric_limits<unsigned short>::max() (unsigned char when plf::memory_use is used). You cannot set the group sizes from the constructor in this scenario, but you can call the reshape() member function after construction has occurred.
Example: plf::colony<int> int_colony;
colony<T, the_allocator<T> > a_colony(const allocator_type &alloc = allocator_type())
Default constructor, but using a custom memory allocator e.g. something other than std::allocator.
Example: plf::colony<int, tbb::allocator<int> > int_colony;
Example2:
// Using an instance of an allocator as well as its type
tbb::allocator<int> alloc_instance;
plf::colony<int, tbb::allocator<int> > int_colony(alloc_instance);
colony<T> colony(size_type n, plf::colony_limits minmax)
Fill constructor with value_type unspecified, so the value_type's default constructor is used. n specifies the number of elements to create upon construction. If n is larger than minmax.min, the size of the groups created will be either n or minmax.max, whichever is smaller. Setting the group sizes can be a performance advantage if you know in advance roughly how many objects are likely to be stored in your colony long-term - or at least the rough scale of storage. In that case, using this can stop many small initial groups being allocated (reserve() will achieve a similar result).
Example: plf::colony<int> int_colony(62);
colony<T> colony(const std::initializer_list<value_type> &element_list)
Using an initialiser list to insert into the colony upon construction.
Example: std::initializer_list<int> el = {3, 5, 2, 1000};
plf::colony<int> int_colony(el);
colony<T> a_colony(const colony &source)
Copy all contents from source colony, removes any empty (erased) element
locations in the process. Size of groups created is either the total size
of the source colony, or the maximum group size of the source colony,
whichever is the smaller.
Example: plf::colony<int>
int_colony_2(int_colony_1);
colony<T> a_colony(colony &&source)
Move all contents from source colony, does not remove any erased element locations or alter any of the source group sizes. Source colony is now empty and can be safely destructed or otherwise used.
Example: // Fill-construct a colony with min and max group sizes set at 512 elements. Fill with 50 instances of int = 5.
plf::colony<int> int_colony_1(50, 5, plf::colony_limits(512, 512));
// Move all data to int_colony_2. All of the above characteristics are now applied to int_colony_2.
plf::colony<int> int_colony_2(std::move(int_colony_1));
All iterators are bidirectional but also provide >, <, >= and <= for convenience (for example, in for loops). Functions for iterator, reverse_iterator, const_iterator and const_reverse_iterator follow:
operator *
operator ->
operator ++
operator --
operator =
operator ==
operator !=
operator <
operator >
operator <=
operator >=
operator <=> (C++20 and upwards)
base() (reverse_iterator and const_reverse_iterator only)
All operators have O(1) amortised time-complexity. Originally there were +=, -=, + and - operators, however the time complexity of these varied from O(n) to O(1) depending on the underlying state of the colony, averaging in at O(log n). As such they were not includable in the iterator functions (as per C++ standards). These have been transplanted to colony's advance(), next(), prev() and distance() member functions. Greater-than/lesser-than operator usage indicates whether an iterator is higher/lower in position compared to another iterator in the same colony (i.e. closer to the end/beginning of the colony). const_iterator is provided for all C++ language versions.
single element | iterator insert (value_type &val) |
fill | iterator insert (size_type n, value_type &val) |
range | template <class InputIterator> iterator insert (InputIterator first, InputIterator last) |
move (C++11 and upwards) | iterator insert (value_type&& val) |
initializer list | iterator insert (std::initializer_list<value_type> il) |
ranges v3 (C++20 and upwards) | template<class range_type> |
iterator insert(const value_type &element)
iterator insert(const_iterator hint, const value_type &element) C++20 and upwards
If a hint is supplied, it is ignored (included solely to support std::insert_iterator, it is an inactive parameter). Inserts the element supplied to the colony, using the object's copy-constructor. Will insert the element into a previously erased element slot if one exists, otherwise will insert to back of colony. Returns iterator to location of inserted element. Example:
plf::colony<unsigned int> i_colony;
i_colony.insert(23);
iterator insert(value_type &&element)
iterator insert(const_iterator hint, value_type &&element) C++20 and upwards
If a hint is supplied, it is ignored. Moves the element supplied to the colony, using the object's move-constructor. Will insert the element in a previously erased element slot if one exists, otherwise will insert to back of colony. Returns iterator to location of inserted element. Example:
std::string string1 = "Some text";
plf::colony<std::string> data_colony;
data_colony.insert(std::move(string1));
void insert (const size_type n, const value_type &val)
Inserts n copies of val into the colony. Will insert the element into a previously erased element slot if one exists, otherwise will insert to back of colony. Example:
plf::colony<unsigned int> i_colony;
i_colony.insert(10, 3);
template <class InputIterator> void insert (const InputIterator &first, const InputIterator &last)
Inserts a series of value_type elements from an external source into a colony holding the same value_type (e.g. int, float, a particular class, etcetera). Stops inserting once it reaches last. Example:
// Insert all contents of colony2 into colony1:
colony1.insert(colony2.begin(), colony2.end());
void insert (const std::initializer_list<value_type> &il)
Copies elements from an initializer list into the colony. Will insert the element in a previously erased element slot if one exists, otherwise will insert to back of colony. Example:
std::initializer_list<int> some_ints =
{4, 3, 2, 5};
plf::colony<int> i_colony;
i_colony.insert(some_ints);
iterator emplace(Arguments &&...parameters) C++11 and upwards
iterator emplace_hint(const_iterator hint, Arguments &&...parameters) C++20 and upwards
The hint in emplace_hint is ignored (included solely to support std::insert_iterator). Constructs new element directly within colony. Will insert the element in a previously erased element slot if one exists, otherwise will insert to back of colony. Returns iterator to location of inserted element. "...parameters" are whatever parameters are required by the object's constructor. Example:
class simple_class
{
private:
    int number;
public:
    simple_class(int a_number): number (a_number) {};
};

plf::colony<simple_class> simple_classes;
simple_classes.emplace(45);
fill | void assign (size_type n, value_type &val) |
range | template <class InputIterator> void assign (InputIterator first, InputIterator last) |
initializer list | void assign (std::initializer_list<value_type> il) |
ranges v3 (C++20 and upwards) | template<class range_type> |
void assign (const size_type n, const value_type &val)
Assigns n copies of val to the colony. Colony will erase all current contents and attempt to preserve existing memory blocks.
Example:
plf::colony<unsigned int> i_colony;
i_colony.assign(10, 3);
template <class InputIterator> void assign (const InputIterator &first, const InputIterator &last)
Assigns a series of value_type elements from an external source to the colony. Example:
// Assign all contents of colony2 into colony1:
colony1.assign(colony2.begin(), colony2.end());
void assign (const std::initializer_list<value_type> &il)
Assigns elements from an initializer list to the colony. Example:
std::initializer_list<int> some_ints = {4, 3, 2, 5};
plf::colony<int> i_colony;
i_colony.assign(some_ints);
single element | iterator erase(const_iterator it) |
range | iterator erase(const_iterator first, const_iterator last) |
iterator erase(const const_iterator it)
Removes the element pointed to by the supplied iterator, from the colony. Returns an iterator pointing to the next non-erased element in the colony (or to end() if no more elements are available). This must return an iterator because if a colony group becomes entirely empty, it will be removed from the colony, invalidating the existing iterator. Attempting to erase a previously-erased element results in undefined behaviour (this is checked for via an assert in debug mode). Example:
plf::colony<unsigned int>
data_colony(50);
plf::colony<unsigned int>::iterator an_iterator;
an_iterator = data_colony.insert(23);
an_iterator = data_colony.erase(an_iterator);
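Because erase() hands back the next valid iterator, conditional erasure during iteration follows the usual "assign from erase, otherwise increment" idiom. The sketch below is written against any container whose single-element erase() returns the following iterator; std::list is used as a stand-in so the example is self-contained, and erase_matching and is_even are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Erases every element for which should_erase returns true and reports
// how many were removed. Assumes Container::erase(iterator) returns an
// iterator to the element after the erased one.
template <class Container, class Predicate>
std::size_t erase_matching(Container &container, Predicate should_erase)
{
    std::size_t erased = 0;
    for (typename Container::iterator it = container.begin(); it != container.end();)
    {
        if (should_erase(*it))
        {
            it = container.erase(it); // continue from the returned iterator
            ++erased;
        }
        else
        {
            ++it; // only advance when nothing was erased
        }
    }
    return erased;
}

inline bool is_even(int value) { return value % 2 == 0; }
```

Leaving the increment out of the for statement avoids stepping past the element that erase() returned.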
iterator erase(const const_iterator first, const const_iterator last)
Erases all elements of a given colony from first to the element before the last iterator. This function is optimized for multiple consecutive erasures and will always be faster than sequential single-element erase calls in that scenario. Example:
plf::colony<int>::iterator iterator1 = colony1.begin();
colony1.advance(iterator1, 10);
plf::colony<int>::iterator iterator2 = colony1.begin();
colony1.advance(iterator2, 20);
colony1.erase(iterator1, iterator2);
[[nodiscard]] bool empty() const noexcept
Returns a boolean indicating whether the colony is currently empty of
elements.
Example: if (object_colony.empty()) return;
size_type size() const noexcept
Returns total number of elements currently stored in container.
Example: std::cout << i_colony.size()
<< std::endl;
size_type max_size() const noexcept
Returns the maximum number of elements that the allocator can store in
the container.
Example: std::cout << i_colony.max_size()
<< std::endl;
size_type capacity() const noexcept
Returns total number of elements currently able to be stored in
container without expansion.
Example: std::cout << i_colony.capacity()
<< std::endl;
size_type memory() const noexcept
Returns the memory use of the container plus its elements. Does not include memory dynamically-allocated by the elements themselves.
Example: std::cout << i_colony.memory() << std::endl;
void shrink_to_fit()
Reduces container capacity to the amount necessary to store all currently stored elements, consolidates elements and removes any erased locations. If the total number of elements is larger than the maximum group size, the resultant capacity will be equal to ((total_elements / max_group_size) + 1) * max_group_size (rounding down at division). Invalidates all pointers, iterators and references to elements within the container.
Example: i_colony.shrink_to_fit();
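As a worked example of the capacity formula quoted for shrink_to_fit(), the helper below simply evaluates it with integer division. capacity_after_shrink is a hypothetical name used for illustration and is not part of colony's interface.

```cpp
#include <cassert>
#include <cstddef>

// Evaluates ((total_elements / max_group_size) + 1) * max_group_size,
// with the division rounding down. Only meaningful when total_elements
// is larger than max_group_size, as stated in the description above.
constexpr std::size_t capacity_after_shrink(std::size_t total_elements, std::size_t max_group_size)
{
    return ((total_elements / max_group_size) + 1) * max_group_size;
}
```

For 1000 elements and a maximum group size of 256, the division gives 3, so the result is 4 * 256 = 1024.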
void trim_capacity() noexcept
void trim_capacity(size_type n) noexcept
Deallocates any unused groups retained during erase() or unused since the last reserve(). Unlike shrink_to_fit(), this function cannot invalidate iterators or pointers to elements, nor will it shrink the capacity to match size(). If the n parameter is specified, a group will not be removed if doing so would reduce the total capacity of the colony below n elements.
Example: i_colony.trim_capacity();
void reserve(const size_type reserve_amount)
Preallocates memory space sufficient to store the number of elements indicated by reserve_amount. This function is useful from a performance perspective when the user is inserting elements singly, but the overall number of insertions is known in advance. By reserving, colony can forgo creating many smaller memory block allocations (due to colony's growth factor) and reserve larger memory blocks instead.
Example: i_colony.reserve(15);
void clear() noexcept
Destroys all elements and sets size() to zero. Currently existing groups are put in the 'reserved groups' list for future use. To clear these groups as well, follow up clear() with shrink_to_fit(), or just call reset() instead (faster).
Example: object_colony.clear();
void reset() noexcept
Destroys all elements, sets size() to zero, and deallocates all groups. Unlike clear(), existing groups are not retained. Faster than clear() + shrink_to_fit().
Example: object_colony.reset();
void reshape(plf::colony_limits minmax)
Changes the minimum and maximum internal group sizes, in terms of number of elements stored per group. If the colony is not empty and either minmax.min is larger than the smallest group in the colony, or minmax.max is smaller than the largest group in the colony, the colony will be internally copy-constructed into a new colony which uses the new group sizes, invalidating all pointers/iterators/references. If trying to change group sizes with a colony storing a non-copyable/movable type, an exception is thrown. If minmax.min or minmax.max are outside of the colony's hard limits, an exception is thrown.
Example: object_colony.reshape(plf::colony_limits(100, 1000));
constexpr plf::colony_limits block_capacity_limits() const noexcept
Returns the current minimum and maximum block limits within the .min and .max fields of a limits struct.
Example: plf::colony_limits temp = object_colony.block_capacity_limits();
static constexpr plf::colony_limits block_capacity_hard_limits() noexcept
Returns the current minimum and maximum hard block limits within the .min and .max fields of a limits struct. In this context 'hard' means that if the user supplies a block capacity limit to a constructor or reshape() which is above the hard maximum or below the hard minimum, an exception is thrown. As a static member function, this is designed to be callable before instantiating a colony using user-defined limits, or before calling reshape().
static constexpr plf::colony_limits block_capacity_default_limits() noexcept
Returns the current minimum and maximum default block limits for a given colony type within the .min and .max fields of a limits struct.
static constexpr size_type block_capacity_default_min() noexcept
Returns the default minimum block limit for a given colony type.
static constexpr size_type block_capacity_default_max() noexcept
Returns the default maximum block limit for a given colony type.
void swap(colony &source) noexcept(std::allocator_traits<allocator_type>::propagate_on_container_swap::value || std::allocator_traits<allocator_type>::is_always_equal::value)
Swaps the colony's contents with that of source.
Example: object_colony.swap(other_colony);
void sort();
template <class comparison_function>
void sort(comparison_function compare);
Sort the content of the colony. By default this compares the colony content using a less-than operator, unless the user supplies a comparison function (i.e. same conditions as std::list's sort function). Uses std::sort internally but will use another sort function if its name (including namespace) is defined as the macro PLF_COLONY_SORT_FUNCTION before including the colony .h. The supplied sort function must take the same parameters as std::sort.
Example: // Sort a colony of integers in ascending order:
int_colony.sort();
// Sort a colony of doubles in descending order:
double_colony.sort(std::greater<double>());
void unique()
template <class comparison_function>
void unique(comparison_function compare);
Removes duplicates of values which are consecutive in the iterative sequence. It is assumed that sort() will have been called before this function, or that elements have been inserted in a specific order. By default this compares the colony content using a == operator, unless the user supplies a comparison function (i.e. same conditions as std::list's unique function).
Example: // Remove duplicates in a colony of integers in ascending order:
int_colony.unique();
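Since sort() and unique() are described as following the same conditions as std::list's member functions, the sort-then-unique pattern can be tried out with std::list as a stand-in. This sketch uses only the standard library; sorted_unique and sorted_descending are hypothetical helper names.

```cpp
#include <cassert>
#include <functional>
#include <list>

// Sort ascending, then drop duplicates. unique() only removes
// consecutive duplicates, which is why the sort comes first.
inline std::list<int> sorted_unique(std::list<int> values)
{
    values.sort();   // compares with operator <
    values.unique(); // compares neighbours with operator ==
    return values;
}

// Sort with a user-supplied comparison function instead of operator <.
inline std::list<int> sorted_descending(std::list<int> values)
{
    values.sort(std::greater<int>());
    return values;
}
```

Calling unique() on the unsorted input {3, 1, 3, 2, 1} would remove nothing, because no two equal values are adjacent.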
void splice(colony &source)
Transfer all elements from source colony into destination colony without invalidating pointers/iterators to either colony's elements (in other words the destination takes ownership of the source's memory blocks). After the splice, the source colony is empty. Splicing is much faster than range-moving or copying all elements from one colony to another. Colony does not guarantee a particular order of elements after splicing, for performance reasons; the insertion location of source elements in the destination colony is chosen based on the most positive performance outcome for subsequent iterations/insertions. For example if the destination colony is {1, 2, 3, 4} and the source colony is {5, 6, 7, 8} the destination colony post-splice could be {1, 2, 3, 4, 5, 6, 7, 8} or {5, 6, 7, 8, 1, 2, 3, 4}, depending on internal state of both colonies and prior insertions/erasures.
Note: If the minimum group size of the source is smaller than the destination's, the destination will change its minimum group size to match the source. The same applies for maximum group sizes (if the source's is larger, the destination will adjust its own).
Example: // Splice two colonies of integers together:
colony<int> colony1 = {1, 2, 3, 4}, colony2 = {5, 6, 7, 8};
colony1.splice(colony2);
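The ownership-transfer behaviour of splice can be observed with std::list, whose splice likewise relinks existing nodes rather than copying elements. This is only an analogy: std::list keeps a defined element order after splicing, whereas colony does not guarantee one. splice_all is a hypothetical helper name.

```cpp
#include <cassert>
#include <list>

// Moves every node of 'source' to the end of 'destination' and returns
// the address of source's former first element, so the caller can check
// that the element itself did not move. Requires a non-empty source.
inline const int *splice_all(std::list<int> &destination, std::list<int> &source)
{
    assert(!source.empty());
    const int *first_source_element = &source.front();
    destination.splice(destination.end(), source);
    return first_source_element;
}
```

After the call the source is empty, the destination holds all eight elements in the usage from the example above, and the returned pointer still refers to the value 5.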
colony_data * data();
Returns a pointer to a dynamically-allocated struct with the following members:
colony::aligned_pointer_type * const block_pointers; // dynamically-allocated array of pointers to element memory blocks
unsigned char * * const bitfield_pointers; // dynamically-allocated
array of pointers to bitfields in the form of unsigned char arrays
// these represent whether an element is erased or not (0 for erased).
// they can be reinterpret_cast'd to whatever SIMD mask bitdepth is needed.
size_t * const block_capacities; // dynamically-allocated array of the number of elements in each memory block
const size_t number_of_blocks; // size of each of the arrays above
This can be used with SIMD intrinsics to perform gather/scatter operations on a colony's non-erased elements. Currently AVX512 is the only Intel instruction set with performance-efficient gather/scatter ops. The structure must be freed with delete once you have finished with it. See the test suite for a more thorough inspection of the structure.
Example:
colony<int> i_colony(10000, 5);
colony<int>::colony_data *data = i_colony.data();
// do stuff
delete data;
colony & operator = (const colony &source)
Copy the elements from another colony to this colony, clearing this
colony of existing elements first.
Example: // Manually swap data_colony1 and data_colony2 in C++03
data_colony3 = data_colony1;
data_colony1 = data_colony2;
data_colony2 = data_colony3;
colony & operator = (colony &&source) noexcept(std::allocator_traits<allocator_type>::propagate_on_container_move_assignment::value || std::allocator_traits<allocator_type>::is_always_equal::value) (C++11 and upwards)
Move the elements from another colony to this colony, clearing this colony of existing elements first. The source colony is left empty and in a valid state (the same as a new colony without any insertions), so it can be safely destructed or used in any regular way.
Example: // Manually swap data_colony1 and data_colony2 in C++11
data_colony3 = std::move(data_colony1);
data_colony1 = std::move(data_colony2);
data_colony2 = std::move(data_colony3);
colony & operator = (const std::initializer_list<value_type> il)
Copy the elements from an initializer list to this colony, clearing this colony of existing elements first.
friend bool operator == (const colony &lh, const colony &rh) noexcept
Compare contents of one colony to this colony. Returns a boolean as
to whether their iterative sequences are equal.
Example: if (object_colony == object_colony2)
return;
friend bool operator != (const colony &lh, const colony &rh) noexcept
Compare contents of another colony to this colony. Returns a boolean as
to whether their iterative sequences are not equal.
Example: if (object_colony != object_colony2)
return;
iterator begin(), const_iterator begin(), iterator end(), const_iterator end(), const_iterator cbegin(),
const_iterator cend()
Return iterators pointing to, respectively, the first element of the colony and the element one-past the end of the colony.
reverse_iterator rbegin(), const_reverse_iterator rbegin(), reverse_iterator rend(), const_reverse_iterator rend(),
const_reverse_iterator crbegin(), const_reverse_iterator crend()
Return reverse iterators pointing to, respectively, the last element of the colony and the element one-before the first element of the colony.
iterator get_iterator(const_pointer element_pointer) noexcept
const_iterator get_const_iterator(const_pointer element_pointer) const noexcept
(non-O(1))
Getting a pointer from an iterator is simple: dereference it, then take the address, i.e. "&(*the_iterator);". Getting an iterator from a pointer is typically not so simple; this function enables the user to do exactly that. It is expected to be useful where external containers store pointers to colony elements instead of iterators (as colony iterators are 3 times the size of an element pointer) and the program wants to erase the element being pointed to or possibly change the element being pointed to. Converting a pointer to an iterator using this method and then erasing is about 20% slower on average than erasing when you already have the iterator. This is less dramatic than it sounds, as it is still faster than, or roughly equal to, std:: container erasure times. However this is generally a slower, lookup-based operation. If the lookup doesn't find a non-erased element based on that pointer, it returns end(). Otherwise it returns an iterator pointing to the element in question. For const pointers, use a const_cast when supplying the pointer to the function, and for const_iterator, simply supply a const_iterator for the return value. Example:
plf::colony<a_struct> data_colony;
plf::colony<a_struct>::iterator an_iterator;
a_struct struct_instance;
an_iterator = data_colony.insert(struct_instance);
a_struct *struct_pointer = &(*an_iterator);
plf::colony<a_struct>::iterator another_iterator = data_colony.get_iterator(struct_pointer);
if (an_iterator == another_iterator) std::cout << "Iterator is correct" << std::endl;
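To illustrate why a pointer-to-iterator conversion is a lookup rather than an O(1) operation, here is a minimal sketch that assumes the blocks are simply searched one by one for the address range containing the pointer. The Block, Location and locate names are hypothetical, the erased flags are a plain boolean array rather than a jump-counting skipfield, and this is not colony's actual implementation.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

struct Block
{
    std::vector<int> elements;  // element storage for this block
    std::vector<bool> erased;   // true where the element has been erased
};

struct Location
{
    bool found;         // false plays the role of "returns end()"
    std::size_t block;  // which block holds the element
    std::size_t index;  // position inside that block
};

// Search each block for the one whose address range contains the pointer.
Location locate(const std::vector<Block> &blocks, const int *element)
{
    std::less<const int *> before; // well-defined ordering for unrelated pointers
    for (std::size_t b = 0; b != blocks.size(); ++b)
    {
        const int *first = blocks[b].elements.data();
        const int *last = first + blocks[b].elements.size();
        if (!before(element, first) && before(element, last))
        {
            const std::size_t index = static_cast<std::size_t>(element - first);
            if (blocks[b].erased[index]) break; // erased element: not found
            return Location{true, b, index};
        }
    }
    return Location{false, 0, 0};
}
```

The cost grows with the number of blocks, which is consistent with the function being documented as non-O(1).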
bool is_active(const const_iterator it) const
This is similar to get_iterator(pointer), but for iterators. For a given iterator, returns whether or not the location the iterator points to is still valid, i.e. that it points to an active non-erased element in a memory block which is still allocated. It does not differentiate between memory blocks which have been re-used: the only thing it reports is whether the memory block pointed to by the iterator exists in an active state in this particular colony, and whether there is an element being pointed to by the iterator, in that memory block, which is also in an active, non-erased state.
static constexpr size_type block_metadata_memory(const size_type block_capacity) noexcept
Given a block element capacity, tells you, for this particular colony type, what the resultant amount of metadata (including skipfield) allocated for that block will be, in bytes.
static constexpr size_type block_allocation_amount(const size_type block_capacity) noexcept
Given a block element capacity, tells you, for this particular colony type, what the resultant allocation for that block will be, in bytes. In terms of this version of colony, the block and skipfield are allocated at the same time (other block metadata is allocated separately). So this function is expected to be primarily useful for developers working in environments where an allocator or system only allocates a specific capacity of memory per call, eg. some embedded platforms.
static constexpr size_type max_elements_per_allocation(const size_type allocation_amount) noexcept
Given a fixed allocation amount, tells you, for this particular colony type, exactly what the maximum element capacity per-block can be. In other words, if an allocator can only supply a fixed or low amount of memory per allocation, this function tells the developer the upper limit of what their max block capacity limit should be set to in the constructor (or in reshape()). Like the above this function is expected to be primarily useful for developers working in embedded environments. The return value is bounded by the colony's maximum hard block capacity limit (255 or 65535 depending on the type and priority_type used) and will return zero if the return value is below the colony's minimum hard block capacity limit (currently 3).
template <class iterator_type> void advance(iterator_type &iterator, const distance_type distance)
Increments/decrements the iterator supplied by the positive or negative
amount indicated by distance. Speed of incrementation will almost
always be faster than using the ++ operator on the iterator for increments
greater than 1. In some cases it may approximate O(1). The iterator_type
can be an iterator, const_iterator, reverse_iterator or
const_reverse_iterator.
Example: colony<int>::iterator it = i_colony.begin();
i_colony.advance(it, 20);
template <class iterator_type> iterator_type next(iterator_type &iterator, const distance_type distance)
Creates a copy of the iterator supplied, then increments/decrements this copy by the positive or negative amount indicated by distance.
Example: colony<int>::iterator it = i_colony.next(i_colony.begin(), 20);
template <class iterator_type> iterator_type prev(iterator_type &iterator, const distance_type distance)
Creates a copy of the iterator supplied, then decrements/increments this copy by the positive or negative amount indicated by distance.
Example: colony<int>::iterator it2 = i_colony.prev(i_colony.end(), 20);
template <class iterator_type> difference_type distance(const iterator_type &first, const iterator_type &last)
Measures the distance between two iterators, returning the result, which will be negative if the second iterator supplied is before the first iterator supplied in terms of its location in the colony.
Example: colony<int>::iterator it = i_colony.next(i_colony.begin(), 20);
colony<int>::iterator it2 = i_colony.prev(i_colony.end(), 20);
std::cout << "Distance: " << i_colony.distance(it, it2) << std::endl;
template <class container_type> void std::swap(container_type &A, container_type &B)
Swaps colony A's contents with that of colony B (assumes both colonies
have same element type, allocator type, etc).
Example: swap(object_colony, other_colony);
template <class container_type, class predicate> size_type std::erase_if(container_type A, predicate pred)
Erases all elements in the colony which match predicate pred. Returns the number of erased elements.
template <class container_type> size_type std::erase(container_type A, container_type::value_type v)
Erases all elements in the colony which == v. Returns the number of erased elements.
template <plf::colony_iterator_concept it_type>
class reverse_iterator<it_type> : public it_type::reverse_type
Overload for std::reverse_iterator, for returning a reverse_iterator from a supplied colony iterator/const_iterator. Primarily intended for use with ranges v3.
Benchmark results for colony v7.39 under GCC 11.2 x64 on an Intel 2018 i5 are here.
Benchmark results for colony v6.11 under GCC 9.2 x64 on an Intel Xeon E3-1241 (Haswell) are here.
Old benchmark results for colony v3.87 under GCC 5.1 x64 on an Intel E8500 (Core2) are here,
and under MSVC 2015 update 3 on an Intel Xeon E3-1241 (Haswell) are here. There is no commentary for the MSVC results.
A combination of a linked-list of increasingly-large memory blocks with metadata. Part of this metadata is the head of a free-list of erased elements within that group, and also a 'next' pointer to the next group which has erased element locations which can be re-used. Another part of the metadata is a low complexity jump-counting skipfield, which is used to skip over erased elements during iteration in O(1) amortised time. The end result is a bidirectional, unordered C++ template data container which maintains positive performance characteristics while ensuring pointer stability to non-erased container elements.
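The skipping behaviour can be illustrated with a single block. The following is a minimal sketch in the spirit of a low complexity jump-counting skipfield, under these assumptions: only the first and last node of each run of erased elements hold the run length, active elements hold 0, one extra trailing node always stays 0, and erased slots are never re-used. The SkipBlock type and its members are hypothetical and are not colony's actual code.

```cpp
#include <cstddef>
#include <vector>

struct SkipBlock
{
    std::vector<int> elements;
    std::vector<unsigned short> skipfield; // one node per element, plus one trailing 0

    explicit SkipBlock(std::size_t n) : elements(n), skipfield(n + 1, 0)
    {
        for (std::size_t i = 0; i != n; ++i) elements[i] = static_cast<int>(i) + 1;
    }

    // Erase the active element at index i using a constant number of writes.
    void erase(std::size_t i)
    {
        const unsigned short left = (i > 0) ? skipfield[i - 1] : 0; // run ending just before i
        const unsigned short right = skipfield[i + 1];              // run starting just after i
        const unsigned short run = static_cast<unsigned short>(left + right + 1);
        skipfield[i - left] = run;  // first node of the merged run
        skipfield[i + right] = run; // last node of the merged run
    }

    // Visit every active element, reading one skipfield node per step.
    long sum_active() const
    {
        long sum = 0;
        const std::size_t n = elements.size();
        for (std::size_t i = skipfield[0]; i < n; ++i, i += skipfield[i])
            sum += elements[i];
        return sum;
    }
};
```

Each increment is one addition plus one skipfield read regardless of how many consecutive elements were erased, which is what a boolean field cannot offer.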
As mentioned, it is worthwhile for performance reasons in situations where the order of container elements is not important and:
Under these circumstances a colony will generally out-perform other std:: containers. In addition, because it never invalidates pointer references to container elements (except when the element being pointed to has been previously erased) it may make many programming tasks involving inter-relating structures in an object-oriented or modular environment much faster, and could be considered in those circumstances.
Some ideal situations to use a colony: cellular/atomic simulation, persistent octtrees/quadtrees, game entities or destructible-objects in a video game, particle physics, anywhere where objects are being created and destroyed continuously. Also, anywhere where a vector of pointers to dynamically-allocated objects or a std::list would typically end up being used in order to preserve object references but where order is unimportant.
In most cases the time complexity of operations results from the fundamental requirements of the container itself, not the implementation. In some cases, such as erase complexity, this can change a little based on the implementation, but the change is minor.
One of the requirements of colony is that pointers to non-erased elements stay valid regardless of insertion/erasure within the container. For this reason the container must use multiple memory blocks. If a single memory block were used, like in a std::vector, reallocation of elements would occur when the container expanded (and the elements were copied to a larger memory block). Hence, colony will insert into existing memory blocks when able, and create a new memory block when all existing memory blocks are full.
Because colony (by default) has a growth pattern for new memory blocks, the number of insertions before a new memory block is required increases as more insertions take place. Hence the insertion time complexity is O(1) amortised.
This follows the same principle - there will occasionally be a need to create a new memory block, but since this is infrequent it is O(N) amortised.
Generally speaking erasure is a simple matter of destructing the element in question and updating the skipfield. Since we started using the low complexity jump-counting pattern (v5.00) instead of the high complexity jump-counting skipfield pattern, that skipfield update is always O(1). However, when a memory block becomes empty of non-erased elements it must be freed to the OS (or stored for future insertions, depending on implementation) and removed from the colony's sequence of memory blocks. If it were not, we would end up with non-O(1) amortised iteration, since there would be no way to predict how many empty memory blocks there would be between the current memory block being iterated over and the next memory block with non-erased (active) elements in it. Since removal of memory blocks is infrequent the time complexity becomes O(1) amortised.
In this case, where the element is non-trivially destructible, the time complexity is O(N) amortised, with infrequent deallocation necessary from the removal of an empty memory block as noted above. However where the elements are trivially-destructible, if the range spans an entire memory block at any point, that block and its skipfield can simply be removed without doing any individual writes to its skipfield or individual destruction of elements, potentially making this an O(1) operation.
In addition (when dealing with trivially-destructible types) for those memory blocks where only a portion of elements are erased by the range, if no prior erasures have occurred in that memory block you can erase that range in O(1) time, as there will be no need to check within the range for previously erased elements, and the low complexity jump-counting pattern only requires two skipfield writes to indicate a range of skipped nodes. The reason you would need to check for previously erased elements within that portion's range is so you can update the metadata for that memory block to accurately reflect how many non-erased elements remain within the block. If that metadata is not present, there is no way to ascertain when a memory block is empty of non-erased elements and hence needs to be removed from the colony. The reasoning for why empty memory blocks must be removed is included in the Erase(single) section, above.
However in most cases the erase range will not perfectly match the size of all memory blocks, and with typical usage of a colony there are usually some prior erasures in most memory blocks. So for example when dealing with a colony of a trivially-destructible type, you might end up with a tail portion of the first memory block in the erasure range being erased in O(N) time, the second and intermediary memory block being completely erased and freed in O(1) time, and only a small front portion of the third and final memory block in the range being erased in O(N) time. Hence the time complexity for trivially-destructible elements approximates O(log n) on average, being between O(1) and O(N) depending on the start and end of the erasure range.
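The O(1) partial-block case described above (trivially-destructible elements, no prior erasures in the block) can be sketched as two skipfield writes. This is an illustration under stated assumptions rather than colony's code: the skipfield has one extra trailing node that stays 0, and the function names erase_range and count_active are hypothetical. A real container would also subtract the run length from the block's count of non-erased elements.

```cpp
#include <cstddef>
#include <vector>

// Mark [first, last) as erased in a block that has no earlier erasures.
// No destructors are run (trivially-destructible elements are assumed) and
// no node inside the range is touched.
void erase_range(std::vector<unsigned short> &skipfield, std::size_t first, std::size_t last)
{
    const unsigned short run = static_cast<unsigned short>(last - first);
    skipfield[first] = run;    // write 1: first node of the run
    skipfield[last - 1] = run; // write 2: last node of the run
}

// Count the elements still reachable by iteration.
std::size_t count_active(const std::vector<unsigned short> &skipfield)
{
    const std::size_t capacity = skipfield.size() - 1; // last node is the trailing 0
    std::size_t count = 0;
    for (std::size_t i = skipfield[0]; i < capacity; ++i, i += skipfield[i]) ++count;
    return count;
}
```

If the block already contained erasures, the range would have to be walked to keep the count of non-erased elements accurate, which is where the O(N) cost comes from.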
This relies on basic iteration so is O(N).
Colony only does full-container splicing, not partial-container splicing (use range-insert with std::move to achieve the latter, albeit with the loss of pointer validity to the moved range). When splicing, the memory blocks from the source colony are transferred to the destination colony without processing the individual elements or skipfields, and are inserted after the destination's final block. However if there are unused element memory spaces at the back of the destination container (i.e. the final memory block is not full), the skipfield nodes corresponding to those empty spaces must be altered to indicate that these are skipped elements. Luckily as of v5.00 this is also an O(1) operation, hence the overall operation is O(1).
Generally the time complexity is O(1), as the skipfield pattern used (current) must allow for skipping of multiple erased elements. However every so often iteration will involve a transition to the next/previous memory block in the colony's sequence of blocks, depending on whether we are doing ++ or --. At this point a read of the next/previous memory block's corresponding skipfield is necessary, in case the first/last element(s) in that memory block are erased and hence skipped. So for every block transition, 2 reads of the skipfield are necessary instead of 1. Hence the time complexity is O(1) amortised.
Skipfields must be per-block and independent between memory blocks, as otherwise you would end up with a vector for a skipfield, which would need a range erased every time a memory block was removed from the colony (see notes under Erase above), and reallocation to larger skipfield memory block when the colony expanded. Both of these procedures carry reallocation costs, meaning you could have thousands of skipfield nodes needing to be reallocated based on a single erasure (from within a memory block which only had one non-erased element left and hence would need to be removed from the colony). This is unacceptable latency for any fields involving high timing sensitivity (all of SG14).
For any implementation these should generally be stored as member variables and so returning them is O(1).
The reasoning for this is similar to that of Erase(multiple), above. Memory blocks which are completely traversed by the increment/decrement distance can be skipped in O(1) time by reading the metadata for the block which details the number of non-erased elements in the block. If the current iterator location is not at the very beginning of a memory block, then the distance within that block will need to be processed in O(N) time, unless no erasures have occurred within that block, in which case the skipfield does not need to be checked and traversal is simply a matter of pointer arithmetic and checking the distance between the current iterator position and the end of the memory block, which is an O(1) operation.
Similarly for the final block involved in the traversal (if the traversal spans multiple memory blocks!) if no erasures have occurred in that memory block pointer arithmetic can be used and the procedure is O(1), otherwise it is O(N) (++ iterating over the skipfield). We can therefore say that the overall time complexity for these operations might approximate O(log N), while being between O(1) and O(N) amortised respectively.
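A minimal sketch of the block-skipping idea follows, assuming forward movement only and assuming that the blocks where the traversal starts and ends contain no erasures, so movement inside them is plain index arithmetic. Blocks in between may contain erasures, since only their count of non-erased elements is read. BlockInfo, Position and advance_forward are hypothetical names, not part of colony's interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct BlockInfo
{
    std::size_t capacity; // element slots in the block
    std::size_t active;   // non-erased elements in the block (metadata)
};

struct Position
{
    std::size_t block; // index of the block
    std::size_t index; // slot inside the block; index == capacity means "past the end"
};

// Move pos forward by distance, skipping whole blocks via their metadata.
Position advance_forward(const std::vector<BlockInfo> &blocks, Position pos, std::size_t distance)
{
    // Elements left in the starting block (erasure-free by assumption).
    std::size_t remaining = blocks[pos.block].capacity - pos.index;
    while (distance >= remaining && pos.block + 1 < blocks.size())
    {
        distance -= remaining; // skip the rest of this block in one step
        ++pos.block;
        pos.index = 0;
        remaining = blocks[pos.block].active; // metadata read, no skipfield walk
    }
    pos.index += std::min(distance, remaining); // clamp at the end of the last block
    return pos;
}
```

When the start or end block does contain erasures, that portion has to be walked through the skipfield instead, giving the O(N) bound mentioned above.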
A deque is reasonably dissimilar to a colony - being a double-ended queue, it requires a different internal framework. It typically uses a vector of memory blocks, whereas the colony implementation uses a linked list of memory blocks, essentially. A deque can't technically use a linked list of memory blocks because it will make some random_access iterator operations (e.g. + operator) non-O(1). In addition, being a random-access container, having a growth factor for memory blocks in a deque is problematic (not impossible though).
A deque and colony have no comparable performance characteristics except for insertion (assuming a good deque implementation). Deque erasure performance varies wildly depending on the implementation compared to std::vector, but is generally similar to vector erasure performance. A deque invalidates pointers to subsequent container elements when erasing elements, which a colony does not.
Unlike a std::vector, a colony can be read from and inserted into at the same time (assuming different locations for read and write), however it cannot be iterated over and written to at the same time. If we look at a (non-concurrent implementation of) std::vector's thread-safe matrix to see which basic operations can occur at the same time, it reads as follows (please note push_back() is the same as insertion in this regard):
std::vector | Insertion | Erasure | Iteration | Read |
Insertion | No | No | No | No |
Erasure | No | No | No | No |
Iteration | No | No | Yes | Yes |
Read | No | No | Yes | Yes |
In other words, multiple reads and iterations over iterators can happen simultaneously, but the potential reallocation and pointer/iterator invalidation caused by insertion/push_back and erasure means those operations cannot occur at the same time as anything else.
Colony on the other hand does not invalidate pointers/iterators to non-erased elements during insertion and erasure, resulting in the following matrix:
plf::colony | Insertion | Erasure | Iteration | Read |
Insertion | No | No | No | Yes |
Erasure | No | No | No | Mostly* |
Iteration | No | No | Yes | Yes |
Read | Yes | Mostly* | Yes | Yes |
* Erasures will not invalidate iterators unless the iterator points to the erased element.
In other words, reads may occur at the same time as insertions and erasures (provided that the element being erased is not the element being read), multiple reads and iterations may occur at the same time, but iterations may not occur at the same time as an erasure or insertion, as either of these may change the state of the skipfield which is being iterated over. Note that iterators pointing to end() may be invalidated by insertion.
So, colony could be considered more inherently thread-safe than a (non-concurrent implementation of) std::vector, but still has some areas which would require mutexes or atomics to navigate in a multithreaded environment.
With insert() and emplace(), insertion position is essentially random unless no erasures have been made, or an equal number of erasures and insertions have been made.
Testing so far indicates that storing pointers and then using get_iterator() when or if you need to do an erase operation on the element being pointed to yields better performance than storing iterators and performing erase directly on the iterator. This is simply due to the size of iterators (3 pointers in width).
In the special case where many, many elements are being continually
erased/inserted in realtime, you might want to experiment with limiting the
size of your internal memory groups in the constructor. The form of this is
as follows:
plf::colony<object> a_colony(plf::colony_limits(500, 5000));
where the first number is the minimum size of the internal memory groups and the second is the maximum size. Note these can be the same size, resulting in an unchanging group size for the lifetime of the colony (unless reshape is called or operator = is called).
One reason to do this is that it is slightly slower to obtain an element location from the list of groups-with-erasures (subsequently utilising that group's free list of erased locations), than it is to insert a new element to the end of the colony (the default behaviour when there are no previously-erased elements). If there are any erased elements in the colony, the colony will recycle those memory locations, unless the entire group is empty, at which point it is freed to memory. So if a group size is large and many, many erasures occur but the group is not completely emptied, iterative performance may suffer due to large memory gaps between any two non-erased elements. In that scenario you may want to experiment with benchmarking and limiting the minimum/maximum sizes of the groups, and find the optimal size for a particular use case.
Please note that the fill, range and initializer-list constructors can also take group size parameters, making it possible to construct filled colonies using custom group sizes.
Though I am happy to be proven wrong I suspect colony is its own abstract data type. While it is similar to a multiset or bag, those utilize key values and are not sortable (by means other than automatically by key value). Colony does not utilize key values, is sortable, and does not provide the sort of functionality one would find in a bag (e.g. counting the number of times a specific value occurs). Some have suggested similarities to deque - but as described earlier the three core aspects of colony are:
The only aspect out of these which deque also shares is a multiple-memory-block allocation pattern - not a strong association. As a result, deques do not guarantee pointer validity to non-erased elements post insertion or erasure, as colony does. Similarly if we look at a multiset, an unordered one could be implemented on top of a colony by utilising a hash table (and would in fact be more efficient than most non-flat implementations) but the fact that there is a necessity to add something to make it a multiset (to take and store key values) means colony is not a multiset.
All operations which allocate memory have strong exception guarantees and will roll back if an exception occurs, except for operator = which has a basic exception guarantee (see below). For colony, iterators are bounded by asserts in debug mode, but unbounded in release mode, so it is possible for an incorrect use of an iterator (iterating past end(), for example) to trigger an out-of-bounds memory exception.
The reason why operator = only has a basic guarantee is that it does not utilize the copy-swap idiom, as the copy-swap idiom significantly increases the chance of any exception occurring - this is because the most common way an exception tends to occur during a copy operation is due to a lack of memory space, and the copy-swap idiom doubles the memory requirement for a copy operation by constructing the copy before destructing the original data.
This is doubly inappropriate in game development, which colony was initially designed for, where memory constraints are often critical and the runtime lag from memory spikes can cause detrimental game performance. So in the case of colony if a non-memory-allocation-based exception occurs during copy, the = operators will have already destructed their data, so the containers will be empty and cannot roll back - hence they have a basic guarantee, not a strong guarantee.
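The trade-off between the two guarantees can be sketched with standard-library types only. In the illustration below std::list stands in for the container, Fragile is a hypothetical element type whose copy constructor can be made to throw, and assign_in_place / assign_copy_swap are hypothetical names for the two strategies; none of this is colony's actual assignment code.

```cpp
#include <list>
#include <stdexcept>

int copies_until_throw = -1; // -1: never throw; 0: the next copy throws

struct Fragile
{
    int value;
    explicit Fragile(int v) : value(v) {}
    Fragile(const Fragile &other) : value(other.value)
    {
        if (copies_until_throw == 0) throw std::runtime_error("copy failed");
        if (copies_until_throw > 0) --copies_until_throw;
    }
};

// Basic guarantee: the old contents are destroyed first, then copies are made.
// Peak memory is one copy of the data, but a throw loses the old contents.
void assign_in_place(std::list<Fragile> &destination, const std::list<Fragile> &source)
{
    destination.clear();
    for (const Fragile &element : source) destination.push_back(element);
}

// Strong guarantee via copy-and-swap: the copy is built while the old
// contents still exist, so both are held in memory at the same time.
void assign_copy_swap(std::list<Fragile> &destination, const std::list<Fragile> &source)
{
    std::list<Fragile> temporary(source); // may throw; destination untouched
    destination.swap(temporary);
}
```

With the first strategy a failed copy leaves the destination valid but without its previous elements; with the second it is unchanged, at the price of the temporary's memory.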
It was found under GCC and clang that the high-modification scenario benchmarks suffered an 11% overall performance loss when iterator copy and move constructors/operators were not explicitly defined, possibly because GCC did not know to vectorize the operations without explicit code or because the compiler implemented copy/swap idiom instead of simply copying directly. No performance gain was found under any tests when removing the copy and move constructors/operators.
Two reasons:
++ and -- iterator operations become O(random) in terms of time complexity, making them non-compliant with the C++ standard. At the moment they are O(1) amortised, typically one update for both skipfield and element pointers, but two if a skipfield jump takes the iterator beyond the bounds of the current group and into the next group. But if empty groups are allowed, there could be anywhere between 1 and size_type empty groups between the current element and the next. Essentially you get the same scenario as you do when iterating over a boolean skipfield. It would be possible to move these to the back of the colony as trailing groups, but this may create performance issues if any of the groups are not at their maximum size (see below).
It's faster. In attempting on two separate occasions to switch to an index-based iterator approach, utilising three separate pointers was consistently faster.
Group sizes start from either the default minimum size (8 elements, larger if the type stored is small) or an amount defined by the programmer (with a minimum of 3 elements). Subsequent group sizes then increase the total capacity of the colony by a factor of 2 (so, 1st group 8 elements, 2nd 8 elements, 3rd 16 elements, 4th 32 elements etcetera) until the maximum group size is reached. The default maximum group size is the maximum possible number that the skipfield bitdepth is capable of representing (std::numeric_limits<skipfield_type>::max()). By default the skipfield bitdepth is 16 so the maximum size of a group is 65535 elements. However the skipfield bitdepth is also a template parameter which can be set to any unsigned integer - unsigned char, unsigned int, Uint_64, etc. Unsigned short (guaranteed to be at least 16 bit, equivalent to C++11's uint_least16_t type) was found to have the best performance in real-world testing due to the balance between memory contiguousness, memory waste and the restriction on skipfield update time complexity. Initially the design also fitted the use-case of gaming better (as games tend to utilize lower numbers of elements than some other fields), and that was the primary development field at the time.
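The growth pattern described above can be worked through numerically. The sketch below assumes the rule is exactly "each new group doubles the total capacity, clamped between the minimum and maximum group size", with the defaults of 8 and 65535 taken from the text; block_capacities is a hypothetical helper and the real implementation may differ in detail.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Capacities of the groups needed to hold element_count elements when each
// new group doubles the total capacity, clamped to [min_block, max_block].
std::vector<std::size_t> block_capacities(std::size_t element_count,
                                          std::size_t min_block = 8,
                                          std::size_t max_block = 65535)
{
    std::vector<std::size_t> capacities;
    std::size_t total = 0;
    while (total < element_count)
    {
        // A group equal in size to the current total doubles that total.
        const std::size_t next = std::min(std::max(total, min_block), max_block);
        capacities.push_back(next);
        total += next;
    }
    return capacities;
}
```

For 100 elements this yields groups of 8, 8, 16, 32 and 64, so only five allocations are needed, and the number of insertions between allocations keeps growing until the maximum group size is reached.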
reinterpret_cast<element_pointer_type>(skipfield) - elements?
Because it's faster. While it can be obtained that way, having it precalculated gives a small but significant benefit. And it's only an additional 16 bits per group under 32-bit code (under 64-bit code there is typically no additional size increase, due to struct padding and the size of pointers).
No and yes. Yes if you're careful, no if you're not.
On platforms which support scatter and gather operations you can use
colony with SIMD as much as you want, using gather to load elements
from disparate or sequential locations, into a SIMD register. Then use
scatter to push the post-SIMD-process values elsewhere after. The
data() function is useful for this.
On platforms which require elements to be contiguous in memory for SIMD
processing, this is more complicated. When you have a bunch of erasures
in a colony, there's no guarantee that your objects will be contiguous
in memory, even though they are sequential during iteration. Some of
them may also be in different memory blocks to each other. In these
situations if you want to use SIMD with colony, you must do the
following:
Generally if you want to use SIMD without gather/scatter, it's probably preferable to use a vector or an array.
sizeof(value_type *) * 2, no performance difference.
last in a loop and return value is necessary to make sure iterator == end() end condition is reached.
Comment updating.
Some tooling functions made public for access across multiple plf:: libraries.
Other minor code reductions.
Correction to copy constructor.
skipfield_type parameter removed.
Contact:
plf:: library and this page Copyright (c) 2023, Matthew Bentley