### C++ how is STL set/map iterator implemented?

Users of STL containers enjoy the immense power of iterators provided

by different types of containers. The sequence container iterators

are easy to understand and implement conceptually. What about the

associate containers, such as set and map?

STL set and map are normally implemented through red black tree, a type

of self balancing binary search tree. Thus iterators of a set or a map

is internally represented by a pointer to a red black tree node in practice.

Let's look at some code involving the related entities:

template<_Tp>

set<_Tp> s;

node<_Tp> n;

rb_tree > rbt;

typedef node<_Tp> * set<_Tp>::iterator;

typedef node<_Tp> * rb_tree >::iterator;

Let's leave a few template parameters out of the discussion for now to illustrate

the concept. For curious readers, those parameters defines behavior of node allocation

and comparison...

Let's recall the different ways a tree can be iterated or traversed: depth first

traversal (DFS) and breath first traversal (BFS). In DFS, there are 3 orders of traversal:

pre-order (S L R), in-order (L S R), and post-order (L R S). A BST in-order DFS yields

a sorted list of nodes. (L: left, S: self, R: right)

We would stop here now if we are interested in read only set or map traversal.

However node erasure is an equally important operation. And node based containers (

such as list, set, and map) gurantees that erase(iterator) invalides the iterator

being erased only. This is a curious feature. It's easy to understand why this is so

for list. But for red black tree based set and map, how does STL gurantees this

behavior?

An important observation of the set and map containers is that iteration of these

containers always yields

DFS problem. There is an invariance in set/map iteration: order of nodes is predefined.

Consider the following sample code:

The output is:

./test_set_iter

node value: 1 2 3 4 5 10

node value: 1 2 4 5 10

node value: 1 2 3 4 5 10

node value: 2 3 4 5 10

node value: 1 2 4 5 10

As we can see, no matter how the iterations are done, there is a predefined

ascending order of nodes. This means that as long as the underlying red black

tree can do in-order DFS, iteration behavior is guranteed. And as a self-balancing

BST, in-order DFS is guranteed for red black tree.

Note that the 4th row prints 2 3 4 5 10, why is it doing so when we are erasing element 3? This confused me at the begining but the behavior is absolutely correct. Can you explain it?

by different types of containers. The sequence container iterators

are easy to understand and implement conceptually. What about the

associate containers, such as set and map?

STL set and map are normally implemented through red black tree, a type

of self balancing binary search tree. Thus iterators of a set or a map

is internally represented by a pointer to a red black tree node in practice.

Let's look at some code involving the related entities:

template<_Tp>

set<_Tp> s;

node<_Tp> n;

rb_tree

typedef node<_Tp> * set<_Tp>::iterator;

typedef node<_Tp> * rb_tree

Let's leave a few template parameters out of the discussion for now to illustrate

the concept. For curious readers, those parameters defines behavior of node allocation

and comparison...

Let's recall the different ways a tree can be iterated or traversed: depth first

traversal (DFS) and breath first traversal (BFS). In DFS, there are 3 orders of traversal:

pre-order (S L R), in-order (L S R), and post-order (L R S). A BST in-order DFS yields

a sorted list of nodes. (L: left, S: self, R: right)

We would stop here now if we are interested in read only set or map traversal.

However node erasure is an equally important operation. And node based containers (

such as list, set, and map) gurantees that erase(iterator) invalides the iterator

being erased only. This is a curious feature. It's easy to understand why this is so

for list. But for red black tree based set and map, how does STL gurantees this

behavior?

An important observation of the set and map containers is that iteration of these

containers always yields

**sorted**result. This takes us back to the BST in-order

DFS problem. There is an invariance in set/map iteration: order of nodes is predefined.

Consider the following sample code:

#include < set>

#include < iostream>

using namespace std;

void display(const set & s){

cout << "node value: ";

set::const_iterator it = s.begin();

for(;it != s.end(); it++) cout << " " << *it;

cout << endl;

}

int main(){

set s;

s.insert(4);

s.insert(1);

s.insert(10);

s.insert(2);

s.insert(3);

s.insert(5);

display(s);

s.erase(3);

display(s);

s.insert(3);

display(s);

cout << "node value: ";

set::const_iterator it = s.begin();

while(it != s.end()){

if(*it == 3) s.erase(it++);

else ++it;

if(it != s.end()) cout << " " << *it;

}

cout << endl;

display(s);

}

The output is:

./test_set_iter

node value: 1 2 3 4 5 10

node value: 1 2 4 5 10

node value: 1 2 3 4 5 10

node value: 2 3 4 5 10

node value: 1 2 4 5 10

As we can see, no matter how the iterations are done, there is a predefined

ascending order of nodes. This means that as long as the underlying red black

tree can do in-order DFS, iteration behavior is guranteed. And as a self-balancing

BST, in-order DFS is guranteed for red black tree.

Note that the 4th row prints 2 3 4 5 10, why is it doing so when we are erasing element 3? This confused me at the begining but the behavior is absolutely correct. Can you explain it?

<< Home