Chapter 12: Abstract Containers

C++ offers several predefined datatypes, all part of the Standard Template Library, which can be used to implement solutions to frequently occurring problems. The datatypes discussed in this chapter are all containers: you can put stuff inside them, and you can retrieve the stored information from them.

The interesting part is that the kind of data that can be stored inside these containers has been left unspecified at the time the containers were constructed. That's why they are spoken of as abstract containers.

Abstract containers rely heavily on templates, covered in chapter 21 and beyond. To use abstract containers, only a minimal grasp of the template concept is required. In C++ a template is in fact a recipe for constructing a function or a complete class. The recipe tries to abstract the functionality of the class or function as much as possible from the data on which the class or function operates. As the data types on which the templates operate were not known when the template was implemented, the datatypes are either inferred from the context in which a function template is used, or they are mentioned explicitly when a class template is used (the term that's used here is instantiated). In situations where the types are explicitly mentioned, the angle bracket notation is used to indicate which data types are required. For example, below (in section 12.2) we'll encounter the pair container, which requires the explicit mentioning of two data types. Here is a pair object containing both an int and a string:

    pair<int, string> myPair;
The object myPair is defined as an object holding both an int and a string.

The angle bracket notation is used intensively in the upcoming discussion of abstract containers. Actually, understanding this part of templates is the only real requirement for using abstract containers. Now that we've introduced this notation, we can postpone the more thorough discussion of templates to chapter 21, and concentrate on their use in this chapter.

Most abstract containers are sequential containers: they contain data that can be stored and retrieved in some sequential way. Examples are the array, implementing a fixed-sized array; a vector, implementing an extendable array; the list, implementing a data structure that allows for the easy insertion or deletion of data; the queue, also called a FIFO (first in, first out) structure, in which the first element that is entered is the first element to be retrieved again; and the stack, which is a first in, last out (FILO or LIFO) structure.

In addition to sequential containers several special containers are available. The pair is a basic container in which a pair of values (of types that are left open for further specification) can be stored, like two strings, two ints, a string and a double, etc.. Pairs are often used to return data elements that naturally come in pairs. For example, the map is an abstract container storing keys and their associated values. Elements of these maps are returned as pairs.

A variant of the pair is the complex container, implementing operations that are defined on complex numbers.

A tuple (cf. section 22.6) generalizes the pair container to a data structure accomodating any number of different data types.

All abstract containers described in this chapter as well as the string and stream datatypes (cf. chapters 5 and 6) are part of the Standard Template Library.

All but the unordered containers support the following basic set of operators:

Note that before a user-defined type (usually a class-type) can be stored in a container, the user-defined type should at least support: Sequential containers can also be initialized using initializer lists.

Most containers (exceptions are the stack (section 12.4.11), priority_queue (section 12.4.5), and queue (section 12.4.4) containers) support members to determine their maximum sizes (through their member function max_size).

Virtually all containers support copy construction. If the container supports copy construction and the container's data type supports move construction, then move construction is automatically used for the container's data elements when a container is initialized with an anonymous temporary container.

Closely linked to the standard template library are the generic algorithms. These algorithms may be used to perform frequently occurring tasks or more complex tasks than is possible with the containers themselves, like counting, filling, merging, filtering etc.. An overview of generic algorithms and their applications is given in chapter 19. Generic algorithms usually rely on the availability of iterators, representing begin and end-points for processing data stored inside containers. The abstract containers usually support constructors and members expecting iterators, and they often have members returning iterators (comparable to the string::begin and string::end members). In this chapter the iterator concept is not further investigated. Refer to chapter 18 for this.

The url http://www.sgi.com/Technology/STL is worth visiting as it offers more extensive coverage of abstract containers and the standard template library than can be provided by the C++ annotations.

Containers often collect data during their lifetimes. When a container goes out of scope, its destructor tries to destroy its data elements. This only succeeds if the data elements themselves are stored inside the container. If the data elements of containers are pointers to dynamically allocated memory then the memory pointed to by these pointers is not destroyed, resulting in a memory leak. A consequence of this scheme is that the data stored in a container should often be considered the `property' of the container: the container should be able to destroy its data elements when the container's destructor is called. So, normally containers should not contain pointers to data. Also, a container should not be required to contain const data, as const data prevent the use of many of the container's members, like the assignment operator.

12.1: Notations used in this chapter

In this chapter about containers, the following notational conventions are used: Some containers, e.g., the map container, contain pairs of values, usually called `keys' and `values'. For such containers the following notational convention is used in addition:

12.2: The `pair' container

The pair container is a rather basic container. It is used to store two elements, called first and second, and that's about it. Before using pair containers the header file <utility> must be included.

The pair's data types are specified when the pair object is defined (or declared) using the template's angle bracket notation (cf. chapter 21). Examples:

    pair<string, string> piper("PA28", "PH-ANI");
    pair<string, string> cessna("C172", "PH-ANG");
here, the variables piper and cessna are defined as pair variables containing two strings. Both strings can be retrieved using the first and second fields of the pair type:
    cout << piper.first << '\n' <<      // shows 'PA28'
            cessna.second << '\n';      // shows 'PH-ANG'
The first and second members can also be used to reassign values:
    cessna.first = "C152";
    cessna.second = "PH-ANW";
If a pair object must be completely reassigned, an anonymous pair object can be used as the right-hand operand of the assignment. An anonymous variable defines a temporary variable (which receives no name) solely for the purpose of (re)assigning another variable of the same type. Its generic form is
    type(initializer list)
Note that when a pair object is used the type specification is not completed by just mentioning the containername pair. It also requires the specification of the data types which are stored within the pair. For this the (template) angle bracket notation is used again. E.g., the reassignment of the cessna pair variable could have been accomplished as follows:
    cessna = pair<string, string>("C152", "PH-ANW");
In cases like these, the type specification can become quite elaborate, which has caused a revival of interest in the possibilities offered by the typedef keyword. If many pair<type1, type2> clauses are used in a source, the typing effort may be reduced and readability might be improved by first defining a name for the clause, and then using the defined name later. E.g.,
    typedef pair<string, string> pairStrStr;

    cessna = pairStrStr("C152", "PH-ANW");
Apart from this (and the basic set of operations (assignment and comparisons)) the pair offers no further functionality. It is, however, a basic ingredient of the upcoming abstract containers map, multimap and hash_map.

C++ also offers a generalized pair container: the tuple, covered in section 22.6.

12.3: Allocators

Most containers use a special object for allocating the memory that is managed by them. This object is called an allocator, and it's type is (usually by default) specified when a container is constructed. A container's allocator can be obtained using the container's get_allocator member, which returns a copy of the allocator used by the container. Allocators offer the following members:

Here is an example, using the allocator of a vector of strings (see section 12.4.2 below for a description of the vector container):

#include <iostream>
#include <vector>
#include <string>

using namespace std;

int main()
{
    vector<string> vs;

    auto allocator = vs.get_allocator();        // get the allocator

    string *sp = allocator.allocate(3);         // alloc. space for 3 strings

    allocator.construct(&sp[0], "hello world"); // initialize 1st string
    allocator.construct(&sp[1], sp[0]);         // use the copy constructor
    allocator.construct(&sp[2], 12, '=');       // string of 12 = chars

    cout << sp[0] << '\n' <<                    // show the strings
            sp[1] << '\n' <<
            sp[2] << '\n' <<
            "could have allocated " << allocator.max_size() << " strings\n";

    for (size_t idx = 0; idx != 3; ++idx)
        allocator.destroy(sp + idx);            // delete the string's
                                                // contents

    allocator.deallocate(sp, 3);                // and delete sp itself again.
}

12.4: Available Containers

12.4.1: The `array' container

The array class implements a fixed-size array. Before using the array container the <array> header file must be included.

To define a std::array both the data type of its elements and its size must be specified: the data type is given after an opening angular bracket, immediately following the `array' container name. The array's size is provided after the data type specification. Finally, a closing angular bracket completes the array's type. Specifications like this are common practice with containers. The combination of array, type and size defines a type. As a result, array<string, 4> defines another type than array<string, 5>, and a function explicitly defining an array<Type, N> parameter will not accept an array<Type, M> argument if N and M are unequal.

The array's size may may be defined as 0 (although such an array probably has little use as it cannot store any element). The elements of an array are stored contiguously. If array<Type, N> arr has been defined, then &arr[n] + m == &arr[n + m, assuming 0 <= n < N and assuming 0 <= n + m < N.

The following constructors, operators, and member functions are available:

Using an array rather than a standard C style array offers several advantages: In general, when looking for a sequential data structure, the array or vector (introduced in the next section) should be your `weapon of choice'. Only if these containers demonstrably do not fit the problem at hand you should use another type of container.

12.4.2: The `vector' container

The vector class implements an expandable array. Before using the vector container the <vector> header file must be included.

The following constructors, operators, and member functions are available:

12.4.3: The `list' container

The list container implements a list data structure. Before using a list container the header file <list> must be included.

The organization of a list is shown in figure 8.

Figure 8: A list data-structure

Figure 8 shows that a list consists of separate list-elements, connected by pointers. The list can be traversed in two directions: starting at Front the list may be traversed from left to right, until the 0-pointer is reached at the end of the rightmost list-element. The list can also be traversed from right to left: starting at Back, the list is traversed from right to left, until eventually the 0-pointer emanating from the leftmost list-element is reached.

As a subtlety note that the representation given in figure 8 is not necessarily used in actual implementations of the list. For example, consider the following little program:

    int main()
    {
        list<int> l;
        cout << "size: " << l.size() << ", first element: " <<
                l.front() << '\n';
    }
When this program is run it might actually produce the output:
    size: 0, first element: 0
Its front element can even be assigned a value. In this case the implementor has chosen to provide the list with a hidden element. The list actually is a circular list, where the hidden element serves as terminating element, replacing the 0-pointers in figure 8. As noted, this is a subtlety, which doesn't affect the conceptual notion of a list as a data structure ending in 0-pointers. Note also that it is well known that various implementations of list-structures are possible (cf. Aho, A.V., Hopcroft J.E. and Ullman, J.D., (1983) Data Structures and Algorithms (Addison-Wesley)).

Both lists and vectors are often appropriate data structures in situations where an unknown number of data elements must be stored. However, there are some rules of thumb to follow when selecting the appropriate data structure.

At present lists aren't as useful anymore as they used to be (when computers were much slower and more memory-constrained). Except maybe for some rare cases, a vector should be the preferred container; even when implementing algorithms traditionally using lists.

Other considerations related to the choice between lists and vectors should also be given some thought. Although it is true that the vector is able to grow dynamically, the dynamic growth requires data-copying. Clearly, copying a million large data structures takes a considerable amount of time, even on fast computers. On the other hand, inserting a large number of elements in a list doesn't require us to copy non-involved data. Inserting a new element in a list merely requires us to juggle some pointers. In figure 9 this is shown: a new element is inserted between the second and third element, creating a new list of four elements.

Figure 9: Adding a new element to a list

Removing an element from a list is also fairly easy. Starting again from the situation shown in figure 8, figure 10 shows what happens if element two is removed from our list. Again: only pointers need to be juggled. In this case it's even simpler than adding an element: only two pointers need to be rerouted.

Figure 10: Removing an element from a list

To summarize the comparison between lists and vectors: it's probably best to conclude that there is no clear-cut answer to the question what data structure to prefer. There are rules of thumb, which may be adhered to. But if worse comes to worst, a profiler may be required to find out what's best.

The list container offers the following constructors, operators, and member functions:

12.4.4: The `queue' container

The queue class implements a queue data structure. Before using a queue container the header file <queue> must be included.

A queue is depicted in figure 11.

Figure 11: A queue data-structure

In figure 11 it is shown that a queue has one point (the back) where items can be added to the queue, and one point (the front) where items can be removed (read) from the queue. A queue is therefore also called a FIFO data structure, for first in, first out. It is most often used in situations where events should be handled in the same order as they are generated.

The following constructors, operators, and member functions are available for the queue container:

Note that the queue does not support iterators or a subscript operator. The only elements that can be accessed are its front and back element. A queue can be emptied by:

12.4.5: The `priority_queue' container

The priority_queue class implements a priority queue data structure. Before using a priority_queue container the <queue> header file must have been included.

A priority queue is identical to a queue, but allows the entry of data elements according to priority rules. A real-life priority queue is found, e.g., at airport check-in terminals. At a terminal the passengers normally stand in line to wait for their turn to check in, but late passengers are usually allowed to jump the queue: they receive a higher priority than other passengers.

The priority queue uses operator< of the data type stored in the priority queue to decide about the priority of the data elements. The smaller the value, the lower the priority. So, the priority queue could be used to sort values while they arrive. A simple example of such a priority queue application is the following program: it reads words from cin and writes a sorted list of words to cout:

#include <iostream>
#include <string>
#include <queue>
using namespace std;

int main()
{
    priority_queue<string> q;
    string word;

    while (cin >> word)
        q.push(word);

    while (q.size())
    {
        cout << q.top() << '\n';
        q.pop();
    }
}

Unfortunately, the words are listed in reversed order: because of the underlying <-operator the words appearing later in the ASCII-sequence appear first in the priority queue. A solution to that problem is to define a wrapper class around the string datatype, reversing string's operator<. Here is the modified program:

#include <iostream>
#include <string>
#include <queue>

class Text
{
    std::string d_s;

    public:
        Text(std::string const &str)
        :
            d_s(str)
        {}
        operator std::string const &() const
        {
            return d_s;
        }
        bool operator<(Text const &right) const
        {
            return d_s > right.d_s;
        }
};

using namespace std;

int main()
{
    priority_queue<Text> q;
    string word;

    while (cin >> word)
        q.push(word);

    while (q.size())
    {
        word = q.top();
        cout << word << '\n';
        q.pop();
    }
}

Other possibilities to achieve the same exist. One would be to store the contents of the priority queue in, e.g., a vector, from which the elements can be read in reversed order.

The following constructors, operators, and member functions are available for the priority_queue container:

Note that the priority queue does not support iterators or a subscript operator. The only element that can be accessed is its top element. A priority queue can be emptied by:

12.4.6: The `deque' container

The deque (pronounce: `deck') class implements a doubly ended queue data structure (deque). Before using a deque container the header file <deque> must be included.

A deque is comparable to a queue, but it allows for reading and writing at both ends. Actually, the deque data type supports a lot more functionality than the queue, as illustrated by the following overview of available member functions. A deque is a combination of a vector and two queues, operating at both ends of the vector. In situations where random insertions and the addition and/or removal of elements at one or both sides of the vector occurs frequently using a deque should be considered.

The following constructors, operators, and member functions are available for deques:

12.4.7: The `map' container

The map class offers a (sorted) associative array. Before using a map container the <map> header file must be included.

A map is filled with key/value pairs, which may be of any container-accepted type. Since types are associated with both the key and the value, we must specify two types in the angle bracket notation, comparable to the specification we've seen with the pair container (cf. section 12.2). The first type represents the key's type, the second type represents the value's type. For example, a map in which the key is a string and the value is a double can be defined as follows:

    map<string, double> object;
The key is used to access its associated information. That information is called the value. For example, a phone book uses the names of people as the key, and uses the telephone number and maybe other information (e.g., the zip-code, the address, the profession) as value. Since a map sorts its keys, the key's operator< must be defined, and it must be sensible to use it. For example, it is generally a bad idea to use pointers for keys, as sorting pointers is something different than sorting the values pointed at by those pointers.

The two fundamental operations on maps are the storage of Key/Value combinations, and the retrieval of values, given their keys. The index operator using a key as the index, can be used for both. If the index operator is used as lvalue, the expression's rvalue is inserted into the map. If it is used as rvalue, the key's associated value is retrieved. Each key can be stored only once in a map. If the same key is entered again, the new value replaces the formerly stored value, which is lost.

A specific key/value combination can implicitly or explicitly be inserted into a map. If explicit insertion is required, the key/value combination must be constructed first. For this, every map defines a value_type which may be used to create values that can be stored in the map. For example, a value for a map<string, int> can be constructed as follows:

    map<string, int>::value_type siValue("Hello", 1);
The value_type is associated with the map<string, int>: the type of the key is string, the type of the value is int. Anonymous value_type objects are also often used. E.g.,
    map<string, int>::value_type("Hello", 1);
Instead of using the line map<string, int>::value_type(...) over and over again, a typedef is frequently used to reduce typing and to improve readability:
    typedef map<string, int>::value_type StringIntValue
Using this typedef, values for the map<string, int> may now be constructed using:
    StringIntValue("Hello", 1);
Alternatively, pairs may be used to represent key/value combinations used by maps:
    pair<string, int>("Hello", 1);

12.4.7.1: The `map' constructors

The following constructors are available for the map container:

12.4.7.2: The `map' operators

The map supports, in addition to the standard operators for containers, the index operator.

The index operator may be used to retrieve or reassign individual elements of the map. The argument of the index operator is called a key.

If the provided key is not available in the map, a new data element is automatically added to the map using the default value or default constructor to initialize the value part of the new element. This default value is returned if the index operator is used as an rvalue.

When initializing a new or reassigning another element of the map, the type of the right-hand side of the assignment operator must be equal to (or promotable to) the type of the map's value part. E.g., to add or change the value of element "two" in a map, the following statement can be used:

    mapsm["two"] = MyClass();

12.4.7.3: The `map' public members

The following member functions are available for the map container:

12.4.7.4: The `map': a simple example

As mentioned at the beginning of section 12.4.7, the map represents a sorted associative array. In a map the keys are sorted. If an application must visit all elements in a map the begin and end iterators must be used.

The following example illustrates how to make a simple table listing all keys and values found in a map:

    #include <iostream>
    #include <iomanip>
    #include <map>

    using namespace std;

    int main()
    {
        pair<string, int>
            pa[] =
            {
                pair<string,int>("one", 10),
                pair<string,int>("two", 20),
                pair<string,int>("three", 30),
            };
        map<string, int>
            object(&pa[0], &pa[3]);

        for
        (
            map<string, int>::iterator it = object.begin();
                it != object.end();
                    ++it
        )
            cout << setw(5) << it->first.c_str() <<
                    setw(5) << it->second << '\n';
    }
    /*
        Generated output:
      one   10
    three   30
      two   20
    */

12.4.8: The `multimap' container

Like the map, the multimap class implements a (sorted) associative array. Before using a multimap container the header file <map> must be included.

The main difference between the map and the multimap is that the multimap supports multiple values associated with the same key, whereas the map contains single-valued keys. Note that the multimap also accepts multiple identical values associated with identical keys.

The map and the multimap have the same set of constructors and member functions, with the exception of the index operator which is not supported with the multimap. This is understandable: if multiple entries of the same key are allowed, which of the possible values should be returned for object[key]?

Refer to section 12.4.7 for an overview of the multimap member functions. Some member functions, however, deserve additional attention when used in the context of the multimap container. These members are discussed below.

Although the functions lower_bound and upper_bound act identically in the map and multimap containers, their operation in a multimap deserves some additional attention. The next example illustrates lower_bound, upper_bound and equal_range applied to a multimap:
    #include <iostream>
    #include <map>
    using namespace std;

    int main()
    {
        pair<string, int> pa[] =
        {
            pair<string,int>("alpha", 1),
            pair<string,int>("bravo", 2),
            pair<string,int>("charley", 3),
            pair<string,int>("bravo", 6),   // unordered `bravo' values
            pair<string,int>("delta", 5),
            pair<string,int>("bravo", 4),
        };
        multimap<string, int> object(&pa[0], &pa[6]);

        typedef multimap<string, int>::iterator msiIterator;

        msiIterator it = object.lower_bound("brava");

        cout << "Lower bound for `brava': " <<
                it->first << ", " << it->second << '\n';

        it = object.upper_bound("bravu");

        cout << "Upper bound for `bravu': " <<
                it->first << ", " << it->second << '\n';

        pair<msiIterator, msiIterator>
            itPair = object.equal_range("bravo");

        cout << "Equal range for `bravo':\n";
        for (it = itPair.first; it != itPair.second; ++it)
            cout << it->first << ", " << it->second << '\n';
        cout << "Upper bound: " << it->first << ", " << it->second << '\n';

        cout << "Equal range for `brav':\n";
        itPair = object.equal_range("brav");
        for (it = itPair.first; it != itPair.second; ++it)
            cout << it->first << ", " << it->second << '\n';
        cout << "Upper bound: " << it->first << ", " << it->second << '\n';
    }
    /*
        Generated output:

        Lower bound for `brava': bravo, 2
        Upper bound for `bravu': charley, 3
        Equal range for `bravo':
        bravo, 2
        bravo, 6
        bravo, 4
        Upper bound: charley, 3
        Equal range for `brav':
        Upper bound: bravo, 2
    */

In particular note the following characteristics:

12.4.9: The `set' container

The set class implements a sorted collection of values. Before using set containers the <set> header file must be included.

A set contains unique values (of a container-acceptable type). Each value is stored only once.

A specific value can be explicitly created: Every set defines a value_type which may be used to create values that can be stored in the set. For example, a value for a set<string> can be constructed as follows:

    set<string>::value_type setValue("Hello");
The value_type is associated with the set<string>. Anonymous value_type objects are also often used. E.g.,
    set<string>::value_type("Hello");
Instead of using the line set<string>::value_type(...) over and over again, a typedef is often used to reduce typing and to improve readability:
    typedef set<string>::value_type StringSetValue
Using this typedef, values for the set<string> may be constructed as follows:
    StringSetValue("Hello");
Alternatively, values of the set's type may be used immediately. In that case the value of type Type is implicitly converted to a set<Type>::value_type.

The following constructors, operators, and member functions are available for the set container:

12.4.10: The `multiset' container

Like the set, the multiset class implements a sorted collection of values. Before using multiset containers the header file <set> must be included.

The main difference between the set and the multiset is that the multiset supports multiple entries of the same value, whereas the set contains unique values.

The set and the multiset have the same set of constructors and member functions. Refer to section 12.4.9 for an overview of the multiset member functions. Some member functions, however, behave slightly different than their counterparts of the set container. Those members are mentioned here.

Although the functions lower_bound and upper_bound act identically in the set and multiset containers, their operation in a multiset deserves some additional attention. With a multiset container lower_bound and upper_bound produce the same result for non-existing keys: they both return the first element having a key exceeding the provided key.

Here is an example showing the use of various member functions of a multiset:

    #include <iostream>
    #include <set>

    using namespace std;

    int main()
    {
        string
            sa[] =
            {
                "alpha",
                "echo",
                "hotel",
                "mike",
                "romeo"
            };

        multiset<string>
            object(&sa[0], &sa[5]);

        object.insert("echo");
        object.insert("echo");

        multiset<string>::iterator
            it = object.find("echo");

        for (; it != object.end(); ++it)
            cout << *it << " ";
        cout << '\n';

        cout << "Multiset::equal_range(+NOTRANS(&euml;)ch\")\n";
        pair
        <
            multiset<string>::iterator,
            multiset<string>::iterator
        >
            itpair = object.equal_range("ech");

        if (itpair.first != object.end())
            cout << "lower_bound() points at " << *itpair.first << '\n';
        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << '\n' <<
                object.count("ech") << " occurrences of 'ech'" << '\n';

        cout << "Multiset::equal_range(+NOTRANS(&euml;)cho\")\n";
        itpair = object.equal_range("echo");

        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << '\n' <<
                object.count("echo") << " occurrences of 'echo'" << '\n';

        cout << "Multiset::equal_range(+NOTRANS(&euml;)choo\")\n";
        itpair = object.equal_range("echoo");

        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << '\n' <<
                object.count("echoo") << " occurrences of 'echoo'" << '\n';
    }
    /*
        Generated output:

        echo echo echo hotel mike romeo
        Multiset::equal_range("ech")
        lower_bound() points at echo

        0 occurrences of 'ech'
        Multiset::equal_range("echo")
        echo echo echo
        3 occurrences of 'echo'
        Multiset::equal_range("echoo")

        0 occurrences of 'echoo'
    */

12.4.11: The `stack' container

The stack class implements a stack data structure. Before using stack containers the header file <stack> must be included.

A stack is also called a first in, last out (FILO or LIFO) data structure as the first item to enter the stack is the last item to leave. A stack is an extremely useful data structure in situations where data must temporarily remain available. For example, programs maintain a stack to store local variables of functions: the lifetime of these variables is determined by the time these functions are active, contrary to global (or static local) variables, which live for as long as the program itself lives. Another example is found in calculators using the Reverse Polish Notation (RPN), in which the operands of operators are kept in a stack, whereas operators pop their operands off the stack and push the results of their work back onto the stack.

As an example of the use of a stack, consider figure 12, in which the contents of the stack is shown while the expression (3 + 4) * 2 is evaluated. In the RPN this expression becomes 3 4 + 2 *, and figure 12 shows the stack contents after each token (i.e., the operands and the operators) is read from the input. Notice that each operand is indeed pushed on the stack, while each operator changes the contents of the stack.

Figure 12: The contents of a stack while evaluating 3 4 + 2 *

The expression is evaluated in five steps. The caret between the tokens in the expressions shown on the first line of figure 12 shows what token has just been read. The next line shows the actual stack-contents, and the final line shows the steps for referential purposes. Note that at step 2, two numbers have been pushed on the stack. The first number (3) is now at the bottom of the stack. Next, in step 3, the + operator is read. The operator pops two operands (so that the stack is empty at that moment), calculates their sum, and pushes the resulting value (7) on the stack. Then, in step 4, the number 2 is read, which is dutifully pushed on the stack again. Finally, in step 5 the final operator * is read, which pops the values 2 and 7 from the stack, computes their product, and pushes the result back on the stack. This result (14) could then be popped to be displayed on some medium.

From figure 12 we see that a stack has one location (the top) where items can be pushed onto and popped off the stack. This top element is the stack's only immediately visible element. It may be accessed and modified directly.

Bearing this model of the stack in mind, let's see what we formally can do with the stack container. For the stack, the following constructors, operators, and member functions are available:

12.4.12: The `unordered_map' container (`hash table')

In C++ hash tables are available as objects of the class unordered_map.

Before using unordered_map or unordered_multimap containers the header file <unordered_map> must be included.

The unordered_map class implements an associative array in which the elements are stored according to some hashing scheme. As discussed, the map is a sorted data structure. The keys in maps are sorted using the operator< of the key's data type. Generally, this is not the fastest way to either store or retrieve data. The main benefit of sorting is that a listing of sorted keys appeals more to humans than an unsorted list. However, a by far faster way to store and retrieve data is to use hashing.

Hashing uses a function (called the hash function) to compute an (unsigned) number from the key, which number is thereupon used as an index in the table storing the keys and their values. This number is called the bucket number. Retrieval of a key is as simple as computing the hash value of the provided key, and looking in the table at the computed index location: if the key is present, it is stored in the table, at the computed bucket location and its value can be returned. If it's not present, the key is not currently stored in the container.

Collisions occur when a computed index position is already occupied by another element. For these situations the abstract containers have solutions available. A simple solution, used by unordered_maps, consists of using linear chaining, which uses linked list to store colliding table elements.

The term unordered_map is used rather than hash to avoid name collisions with hash tables developed before they were added to the language.

Because of the hashing method, the efficiency of a unordered_map in terms of speed should greatly exceed the efficiency of the map. Comparable conclusions may be drawn for the unordered_set, the unordered_multimap and the unordered_multiset.

12.4.12.1: The `unordered_map' constructors

When defining an unordered_map type five template arguments must be specified :

The generic definition of an unordered_map container looks like this:

    std::unordered_map <KeyType, ValueType, hash type, predicate type,
                        allocator type>
When KeyType is std::string or a built-in type then default types are available for the hash type and the predicate type. In practice the allocator type is not specified, as the default allocator suffices. In these cases an unordered_map object can be defined by merely specifying the key- and value types, like this:
    std::unordered_map<std::string, ValueType> hash(size_t size = implSize);
Here, implSize is the container's default initial size, which is specified by the implementor. The map's size is automatically enlarged by the unordered_map when necessary, in which case the container rehashes all its elements. In practice the default size argument provided by the implementor is completely satisfactory.

The KeyType frequently consists of text. So, a unordered_map using a std::string KeyType is frequently used. Be careful not to use a plain char const * key_type as two char const * values pointing to equal C-strings stored at different locations are considered to be different keys, as their pointer values rather than their textual contents are compared. Here is an example showing how a char const * KeyType can be used. Note that in the example no arguments are specified when constructing months, since default values and constructors are available:

    #include <unordered_map>
    #include <iostream>
    #include <string>
    #include <cstring>

    using namespace std;

    struct EqualCp
    {
        bool operator()(char const *l, char const *r) const
        {
            return strcmp(l, r) == 0;
        }
    };
    struct HashCp
    {
        size_t operator()(char const *str) const
        {
            return std::hash<std::string>()(str);
        }
    };
    int main()
    {
        unordered_map<char const *, int, HashCp, EqualCp> months;
        // or explicitly:
            unordered_map<char const *, int, HashCp, EqualCp>
                                      monthsTwo(61, HashCp(), EqualCp());

        months["april"] = 30;
        months["november"] = 31;

        string apr("april");    // different pointers, same string

        cout << "april     -> " << months["april"] << '\n' <<
                "april     -> " << months[apr.c_str()] << '\n';
    }

If other KeyTypes must be used, then the unordered_map's constructor requires (constant references to) a hash function object, computing a hash value from a key value, and a predicate function object, returning true if two unordered_map::key_type objects are identical. A generic algorithm (see chapter 19) exists performing tests of equality (i.e., equal_to). These tests can be used if the key's data type supports the equality operator. Alternatively, an overloaded operator== or specialized function object could be constructed returning true if two keys are equal and false otherwise.

Constructors

The unordered_map supports the following constructors:

The following example shows a program using an unordered_map containing the names of the months of the year and the number of days these months (usually) have. Then, using the subscript operator the days in several months are displayed (the predicate used here is the generic algorithm equal_to<string>, which is provided by the compiler as the default fourth argument of the unordered_map constructor):

    #include <unordered_map>
    #include <iostream>
    #include <string>
    using namespace std;

    int main()
    {
        unordered_map<string, int> months;

        months["january"] = 31;
        months["february"] = 28;
        months["march"] = 31;
        months["april"] = 30;
        months["may"] = 31;
        months["june"] = 30;
        months["july"] = 31;
        months["august"] = 31;
        months["september"] = 30;
        months["october"] = 31;
        months["november"] = 30;
        months["december"] = 31;

        cout << "september -> " << months["september"] << '\n' <<
                "april     -> " << months["april"] << '\n' <<
                "june      -> " << months["june"] << '\n' <<
                "november  -> " << months["november"] << '\n';
    }
    /*
        Generated output:
    september -> 30
    april     -> 30
    june      -> 30
    november  -> 30
    */

12.4.12.2: The `unordered_map' public members

The unordered_map supports the index operator operating identically to the map's index operator: a (const) reference to the ValueType associated with the provided KeyType's value is returned. If not yet available, the key is added to the unordered_map, and a default ValueType value is returned. In addition, it supports operator==.

The unordered_map provides the following member functions (key_type, value_type etc. refer to the types defined by the unordered_map):

12.4.12.3: The `unordered_multimap' container

The unordered_multimap allows multiple objects using the same keys to be stored in an unordered map. The unordered_multimap container offers the same set of members and constructors as the unordered_map, but without the unique-key restriction imposed upon the unordered_map.

The unordered_multimap does not offer operator[] and does not offer the at members.

Below all members are described whose behavior differs from the behavior of the corresponding unordered_map members:

12.4.13: The `unordered_set' container

The set container, like the map container, orders its elements. If ordering is not an issue, but fast lookups are, then a hash-based set and/or multi-set may be preferred. C++ provides such hash-based sets and multi-sets: the unordered_set and unordered_multiset.

Before using these hash-based set containers the header file <unordered_set> must be included.

Elements stored in the unordered_set are immutable, but they can be inserted and removed from the container. Different from the unordered_map, the unordered_set does not use a ValueType. The set merely stores elements, and the stored element itself is its own key.

The unordered_set has the same constructors as the unordered_map, but the set's value_type is equal to its key_type.

When defining an unordered_set type four template arguments must be specified :

The generic definition of an unordered_set container looks like this:

    std::unordered_set <KeyType, hash type, predicate type, allocator type>
When KeyType is std::string or a built-in type then default types are available for the hash type and the predicate type. In practice the allocator type is not specified, as the default allocator suffices. In these cases an unordered_set object can be defined by merely specifying the key- and value types, like this:
    std::unordered_set<std::string> rawSet(size_t size = implSize);
Here, implSize is the container's default initial size, which is specified by the implementor. The set's size is automatically enlarged when necessary, in which case the container rehashes all its elements. In practice the default size argument provided by the implementor is completely satisfactory.

The unordered_set supports the following constructors:

The unordered_set does not offer an index operator, and it does not offer an at member. Other than those, it offers the same members as the unordered_map. Below the members whose behavior differs from the behavior of the unordered_map are discussed. For a description of the remaining members, please refer to section 12.4.12.2.

12.4.13.1: The `unordered_multiset' container

The unordered_multiset allows multiple objects using the same keys to be stored in an unordered set. The unordered_multiset container offers the same set of members and constructors as the unordered_set, but without the unique-key restriction imposed upon the unordered_set.

Below all members are described whose behavior differs from the behavior of the corresponding unordered_set members:

12.4.14: Heterogeneous lookup

The associative containers offered by C++ allow us to find a value (or values) matching a given key. Traditionally, the type of the key used for the lookup must match the container's key type.

Since the C++14 standard allows arbitrary lookup key types can be used, if a comparison operator is available to compare that type with the container's key type. Thus, a char const * key (or any other type for which an operator< overload for std::string is available) can be used to lookup values in a map<std::string, ValueType>. This is called heterogeneous lookup.

Heterogeneous lookup is allowed when the comparator given to the associative container does allow this. The standard library classes std::less and std::greater were augmented to allow heterogeneous lookup.

12.5: The `complex' container

The complex container defines the standard operations that can be performed on complex numbers. Before using complex containers the header file <complex> must be included.

The complex number's real and imaginary types are specified as the container's data type. Examples:

    complex<double>
    complex<int>
    complex<float>
Note that the real and imaginary parts of complex numbers have the same datatypes.

When initializing (or assigning) a complex object, the imaginary part may be omitted from the initialization or assignment resulting in its value being 0 (zero). By default, both parts are zero.

Below it is silently assumed that the used complex type is complex<double>. Given this assumption, complex numbers may be initialized as follows:

Anonymous complex values may also be used. In the next example two anonymous complex values are pushed on a stack of complex numbers, to be popped again thereafter:
    #include <iostream>
    #include <complex>
    #include <stack>

    using namespace std;

    int main()
    {
        stack<complex<double>>
            cstack;

        cstack.push(complex<double>(3.14, 2.71));
        cstack.push(complex<double>(-3.14, -2.71));

        while (cstack.size())
        {
            cout << cstack.top().real() << ", " <<
                    cstack.top().imag() << "i" << '\n';
            cstack.pop();
        }
    }
    /*
        Generated output:
    -3.14, -2.71i
    3.14, 2.71i
    */

The following member functions and operators are defined for complex numbers (below, value may be either a primitve scalar type or a complex object):

12.6: Unrestricted Unions

We end this chapter about abstract containers with a small detour, introducing extensions to the union concept. Although unions themselves aren't `abstract containers', having covered containers puts us in a good position to introduce and illustrate unrestricted unions.

Whereas traditional unions can only contain primitive data, unrestricted unions allow addition of data fields of types for which non-trivial constructors have been defined. Such data fields commonly are of class-types. Here is an example of such an unrestricted union:

    union Union
    {
        int u_int;
        std::string u_string;
    };
One of its fields defines a constructor, turning this union into an unrestricted union. As an unrestricted union defines at least one field of a type having a constructor the question becomes how these unions can be constructed and destroyed.

The destructor of a union consisting of, e.g. a std::string and an int should of course not call the string's destructor if the union's last (or only) use referred to its int field. Likewise, when the std::string field is used, and a switch is made next from the std::string to the int field, std::string's destructor should be called before any assignment to the double field takes place.

The compiler does not solve the issue for us, and in fact does not implement default constructors or destructors for unrestricted unions at all. If we try to define an unrestricted union like the one shown above, an error message like the following is issued:

    error: use of deleted function 'Union::Union()'
    error: 'Union::Union()' is implicitly deleted because the default
            definition would be ill-formed:
    error: union member 'Union::u_string' with non-trivial
            'std::basic_string<...>::basic_string() ...'

12.6.1: Implementing the destructor

Although the compiler won't provide (default) implementations for constructors and destructors of unrestricted unions, we can. The task isn't difficult, but there are some caveats.

Consider our unrestricted union's destructor. It clearly should destroy u_string's data if that is its currently active field; but it should do nothing if u_int is its currently active field. But how does the destructor know what field to destroy? It doesn't as the unrestricted union holds no information about what field is currently active.

Here is one way to tackle this problem:

If we embed the unrestricted union in a larger aggregate, like a class or a struct, then the class or struct can be provided with a tag data member storing the currently active union-field. The tag can be an enumeration type, defined by the aggregate. The unrestricted union may then be controlled by the aggregate. Under this approach we start out with an explicit empty implementations of the destructor, as there's no way to tell the destructor itself what field to destroy:

    Data::Union::~Union()
    {};

12.6.2: Embedding an unrestricted union in a surrounding class

Next, we embed the unrestricted union in a surrounding aggregate: class Data. The aggregate is provided with an enum Tag, declared in its public section, so Data's users may request tags. Union itself is for Data's internal use only, so Union is declared in Data's private section. Using a struct Data rather than class Data we start out in a public section, saving us from having to specify the initial public: section for enum Tag:
    struct Data
    {
        enum Tag
        {
            INT,
            STRING
        };

        private:
            union Union
            {
                int         u_int;
                std::string u_string;

                ~Union();           // no actions
                // ... to do: declarations of members
            };

            Tag d_tag;
            Union d_union;
    };

Data's constructors receive int or string values. To pass these values on to d_union, we need Union constructors for the various union fields; matching Data constructors also initialize d_tag to proper values:

    Data::Union::Union(int value)
    :
        u_int(value)
    {}
    Data::Union::Union(std::string const &str)
    :
        u_string(str)
    {}

    Data::Data(std::string const &str)
    :
        d_tag(STRING),
        d_union(str)
    {}
    Data::Data(int value)
    :
        d_tag(INT),
        d_union(value)
    {}

12.6.3: Destroying an embedded unrestricted union

Data's destructor has a data member which is an unrestricted union. As the union's destructor can't perform any actions, the union's proper destruction is delegated to a member, Union::destroy destroying the fields for which destructors are defined. As d_tag stores the currently used Union field, Data's destructor can pass d_tag to Union::destroy to inform it about which field should be destroyed.

Union::destroy does not need to perform any action for INT tags, but for STRING tags the memory allocated by u_string must be returned. For this an explicit destructor call is used. Union::destroy and Data's destructor are therefore implemented like this:

    void Data::Union::destroy(Tag myTag)
    {
        if (myTag == Tag::STRING)
            u_string.~string();
    }

    Data::~Data()
    {
        d_union.destroy(d_tag);
    }

12.6.4: Copy and move constructors

Union's copy and move constructors suffer from the same problem as Union's destructor does: the union does not know which is its currently active field. But again: Data does, and by defining `extended' copy and move constructors, also receiving a Tag argument, these extended constructors can perform their proper initializations. The Union's copy- and move-constructors are deleted, and extended copy- and move constructors are declared:
    Union(Union const &other) = delete;
    Union &operator=(Union const &other) = delete;

    Union(Union const &other, Tag tag);
    Union(Union &&tmp, Tag tag);

Shortly we'll encounter a situation where we must be able to initialize a block of memory using an existing Union object. This task can be performed by copy members, whose implementations are trivial, and which may be used by the above constructors. They can be declared in Union's private section, and have identical parameter lists as the above constructors:

    void copy(Union const &other, Tag tag);
    void copy(Union &&other, Tag tag);

The constructors merely have to call these copy members:

    inline Data::Union::Union(Union const &other, Tag tag)
    {
        copy(other, tag);
    }

    inline Data::Union::Union(Union &&tmp, Tag tag)
    {
        copy(std::move(tmp), tag);
    }

Interestingly, no `initialization followed by assignment' happens here: d_union has not been initialized in any way by the time we reach the statement blocks of the above constructors. But upon reaching the statement blocks, d_union memory is merely raw memory. This is no problem, as the copy members use placement new to initialize the Union's memory:

    void Data::Union::copy(Union const &other, Tag otag)
    {
        if (tag == INT)
            u_int = other.u_int;
        else
            new (this) string(other.u_string);
    }

    void Data::Union::copy(Union &&tmp, Tag tag)
    {
        if (tag == INT)
            u_int = tmp.u_int;
        else
            new (this) string(std::move(tmp.u_string));
    }

12.6.5: Assignment

To assign a Data object to another data object, we need an assignment operator. The standard mold for the assignment operator looks like this:
    Class &Class::operator=(Class const &other)
    {
        Class tmp(other);
        swap(*this, tmp);
        return *this;
    }
This implementation is exception safe: it offers the `commit or roll-back' guarantee (cf. section 9.6). But can it be applied to Data?

It depends. It depends on whether Data objects can be fast swapped (cf. section 9.6.1.1) or not. If Union's fields can be fast swapped then we can simply swap bytes and we're done. In that case Union does not require any additional members (to be specific: it won't need an assignment operator).

But now assume that Union's fields cannot be fast swapped. How to implement an exception-safe assignment (i.e., an assignment offering the `commit or roll-back' guarantee) in that case? The d_tag field clearly isn't a problem, so we delegate the responsibility for proper assignment to Union, implementing Data's assignment operators as follows:

    Data &Data::operator=(Data const &rhs)
    {
        if (d_union.assign(d_tag, rhs.d_union, rhs.d_tag))
            d_tag = rhs.d_tag;
        return *this;
    }

    Data &Data::operator=(Data &&tmp)
    {
        if (d_union.assign(d_tag, std::move(tmp.d_union), tmp.d_tag))
            d_tag = tmp.d_tag;
        return *this;
    }

But now for Union::assign. Assuming that both Unions use different fields, but swapping objects of the separate types is allowed. Now things may go wrong. Assume the left-side union uses type X, the right-side union uses type Y and both types use allocation. First, briefly look at standard swapping. It involves three steps:

Usually we assume that these steps do not throw exceptions, as swap itself shouldn't throw exceptions. How could we implement swapping for our union? Assume the fields are known (easily done by passing Tag values to Union::swap):

By C++-standard requirement, the in-place destruction won't throw. Since the standard swap also performs an assignment that part should work fine as well. And since the standard swap also does a copy construction the placement new operations should perform fine as well, and if so, Union may be provided with the following swap member:

    void Data::Union::swap(Tag myTag, Union &other, Tag oTag)
    {
        Union tmp(*this, myTag);    // save lhs

        destroy(myTag);             // destroy lhs
        copy(other, oTag);          // assign rhs

        other.destroy(oTag);        // destroy rhs
        other.copy(tmp, myTag);     // save lhs via tmp
    }

Now that swap is available Data's assignment operators are easily realized:

    Data &Data::operator=(Data const &rhs)
    {
        Data tmp(rhs);  // tmp(std::move(rhs)) for the move assignment

        d_union.swap(d_tag, tmp.d_union, tmp.d_tag);
        swap(d_tag, tmp.d_tag);

        return *this;
    }

What if the Union constructors could throw? In that case we can provide Data with an 'commit or roll-back' assignment operator like this:

    Data &Data::operator=(Data const &rhs)
    {
        Data tmp(rhs);
                            // rolls back before throwing an exception
        d_union.assign(d_tag, rhs.d_union, rhs.d_tag);
        d_tag = rhs.d_tag;

        return *this;
    }

How to implement Union::assign? Here are the steps assign must take:

Here is the implementation of the `commit or roll-back' Union::assign:
    void Data::Union::assign(Tag myTag, Union const &other, Tag otag)
    {
        char saved[sizeof(Union)];
        memcpy(saved, this, sizeof(Union));     // raw copy: saved <- *this
        try
        {
            copy(other, otag);                  // *this = other: may throw
            fswap(*this,                        // *this <-> saved
                    *reinterpret_cast<Union *>(saved));
            destroy(myTag);                     // destroy original *this
            memcpy(this, saved, sizeof(Union)); // install new *this
        }
        catch (...)                             // copy threw
        {
            memcpy(this, saved, sizeof(Union)); // roll back: restore *this
            throw;
        }
    }
The source distribution contains yo/containers/examples/unrestricted2.cc offering a small demo-program in which the here developed Data class is used.