19 December 2007

STL Overview

Any programmer developing a moderately complicated system soon realizes that efficient coding requires the use of fundamental data structures like lists, queues and stacks as well as algorithms to process the data stored in these data structures.

Most Java programmers would be familiar with various implementations of these data structures in the util package. Until 1994, C++ did not have, as a part of its standard library, support for such data structures. This implied that any programmer developing software that required the use of such structures either had to build such a library or purchase one being offered commercially.

The former option had risks of increased development time and costs associated with it. Reducing such risks meant customizing the library to the application for which it was being developed, resulting in its incompatibility with other application that will be developed in the future.

The latter option although reduced the risks associated with the first option but introduced following further risks,

  1. Issues related to potential incompatibility with platforms, development environments and other libraries to which the software has to be linked,
  2. Future costs of upgrades. All programmers are well aware of incremental costs of software upgrades that offer more and/or improved features,
  3. Potential performance and extensibility issues related to the characteristics of implementation. For example, the use of virtual functions in internal objects can not only reduce the performance of the library but also make it difficult to use in a shared memory system. Algorithms in such libraries are usually implemented as class members thereby restricting change.

To overcome these issues, the ANSI (American National Standards Institute) decided to select STL as a standard set of collection classes and algorithms. STL is based on the successful research and development at Hewlett Packard Laboratories in the area of generic programming.

Typical features of STL are as follows,

  1. Efficiency. STL does not use inheritance or virtual functions.
  2. Type safety. The extensive use of C++ templates makes STL type safe.
  3. Consistency. STL container classes use iterators that are generalized pointers to the data contained within the containers. Iterators allow access of data within the containers in a manner similar to C arrays.
  4. Extensibility.
    1. STL algorithms are stand-alone functions that do not access STL containers directly. Rather they operate over the data via the iterators.
    2. STL allows the use of function objects. These are functions that can be encapsulated and associated with data. These can be created, stored and destroyed like objects.
    3. STL memory management does not explicitly use new and delete operators. STL containers use special allocator objects to allocate and deallocate storage. If an application requires different memory allocation and deallocation policies, these can be implemented using specialised allocator objects. These allocators can then replace the standard allocators in the containers to provide custom memory management. This article will, however, only consider the use of standard allocators.

STL Containers

STL containers are classes whose objects can be used to build collections of data of same type. For example, objects of vector and list classes are used to build a one dimensional collection of data (although two dimensional collections are possible as lists of lists), maps are used to build lookup tables and sets provide a unique collection of data (i.e., each data element exists only once).

STL containers can be divided into the following three categories,

  1. Sequential containers. These containers arrange the data they contain in a linear manner.
  2. Associative containers. These containers maintain data in structures suitable for fast associative look-up.
  3. Adapters. These provide different but more appropriate interfaces to containers belonging to the above-mentioned categories.
  4. Specific containers belonging to each category are listed in the following table.

    Category

    Containers

    Characteristics

    Sequential

    vector

    Provides a linear and contiguous storage (similar to an array) that allows fast inserts at the end only.

    list

    It is an implementation of a doubly linked list that allows fast inserts anywhere.

    deque

    Provides a linear but non-contiguous storage that allows fast inserts at extremities.

    Associative

    multiset

    It is an implementation of set where duplicates are allowed and provides fast associative lookup.

    set

    It is an implementation of set where no duplicates are allowed and provides fast associative lookup.

    multimap

    It is an implementation of a key to value mapping structure where a single key can be mapped to many values (1 to many mappings).

    map

    It is an implementation of a key to value mapping structure where a single key can only be mapped to one value (1 to 1 mapping).

    Adapter

    stack

    It is an implementation of a first in last out data structure.

    queue

    It is an implementation of a first in first out data structure.

    priority_queue

    A queue that maintains items in a sorted order

The coming sections will discuss using vectors, lists, sets, maps, stacks and queues in programs. The remaining containers have APIs similar to other containers in their categories.

No comments: