home page -> teaching -> data structures and algorithms -> iterators

Iterators

Motivation

to abstract the details of parsing a collection

    struct Node {
        Element info;
        Node* next;
    };
    
    struct List {
        Node* begin;
    };
    ...
    for(Node* p = list.begin ; p != nullptr ; p = p->next) {
        process(p->info);
    }

Such code must be written each time we parse the list. Imagine the above for some tree-based collection...

Domain and operations

In most of the following, we assume a linear collection (like a list or vector) - elements are put in a well-defined order, controlled by the user; the user is able to add a new element at a position of his choice.

There are several possibilities for the domain

  1. references to all elements of a collection,
  2. references to all elements, plus a special value representing "past the last element",
  3. all positions between two elements, including a position right before the first element and one just after the last element.

Required operations:

Additional simple operations:

Additional operations for special kinds of iterators:

Additional operations for modifying the container:

Discussion

Domain type 1 vs 2

Domain type 1 does not require a special "out of band" value. This makes implementation simpler in cases such as:

  • the collection is a vector of size exactly 256, and the iterator is represented as a 8-bit integer;
  • the collection is a vector and iteration goes backwards, towards index 0 (note that c++ standard allows a pointer just past the end of an array, but not just before the first element; in other words, if we have an array int t[10], then setting int* p=t+10 is legal, although p cannot be dereferenced; however, int* p=t-1 is illegal and, in rare cases, can make the program crash.
  • circular lists.

    Domain type 1 makes using the iterator much harder. Parsing the collection needs an explicit test against empty collection, and then the loop must be with exit in the middle (process, test, increment).

    Usage of a type 1 iterator:

    if(!list.isEmpty()) {
        List::Iterator it=list.iterator();
        while(true) {
            process(*it);
            if(!it.hasNext()) break;
            ++it;
        }
    }
    

    Usage of a type 2 iterator:

    for(List::Iterator it=list.iterator() ; it.isValid() ; ++it) {
        process(*it);
    }
    

    The better usability of type 2 is given by the additional possible value - an iterator on a collection with n elements has n+1 possible values. It allows 2 things: a valid iterator value even for empty collections, and to put the "at end" test after the "go next" instead of before.

    Domain type 2 vs 3

    Domain of type 3 is used by Java collection framework. It is also used implicitly by (sequential access) files - the current position in a file behaves like an iterator pointing between the last read data and the data to be read next.

    With domain type 3 iterators, operation "get current" cannot exist as such, since the iterator points between elements. It can only go together with "go next".

    This makes usage a bit harder:

    List::Iterator it=list.iterator();
    while(it.hasNext()) {
        int* next = it.next();
        *next = *next * 10;
    }
    

    Domain of type 2 is asymmetric. This makes iterating backwards (for a bidirectional iterator) quite different from iterating forward:

    Iterator it = list.end(); // just past the end of the list
    while(it.isNotFirst()) {
        --it;
        process(*it);
    } 
    

    Example: bidirectional iterators for a double-linked list

    The big issue is what is the value of an iterator past-the-end of the list (for type 2 domain). A regular iterator points to a node of the list.

    Usually, the last element of the list has the next pointer null. With this approach, when doing go next on an iterator pointing to the last element, the iterator becomes null. Afterwards, it is impossible to execute go previous.

    The solution is to create a dummy node, that is used only for having its address given to the past-the-end iterator. See the full solution.

    Radu-Lucian LUPŞA
    2016-03-06