Previous Section
 < Free Open Study > 
Next Section


9.2 Heaps

Logical Level

Like a binary search tree, a heap is a binary tree that satisfies two properties, one concerning its shape and the other concerning the order of its elements. The shape property is simple: A heap must be a complete binary tree. The order property says that, for every node in the heap, the value stored in that node is greater than or equal to the value in each of its children. (The heap as a data structure is not to be confused with an unrelated concept of the same name. Heap is also a synonym for the free store-the area of memory available for dynamically allocated data.)

Heap A complete binary tree, each of whose elements contains a value that is greater than or equal to the value of each of its children

Figure 9.2 shows two heaps that contain the letters "A" through "J." Notice that the placement of the values differs in the two trees, but the shape remains the same: a complete binary tree of ten elements. Note also that both heaps have the same root node. A group of values can be stored in a binary tree in many ways and still satisfy the order property of heaps. Because of the shape property, we know that the shape of all heaps with a given number of elements is the same. Because of the order property, we know that the root node always contains the largest value in the heap. This fact gives us a hint as to how we might use this data structure. The special feature of heaps is that we always know the location of the maximum value: It is in the root node.

Click To expand
Figure 9.2: Two heaps containing the letters "A" through "J"

When we refer to a "heap" in this section, we mean the structure defined above. It might also be called a "maximum heap," because the root node contains the maximum value in the structure. We can also create a "minimum heap," each of whose elements contains a value that is less than or equal to the value of each of its children.

Suppose we want to remove the element with the largest value from a heap. The largest element resides in the root node, so we can easily remove it, as illustrated in Figure 9.3(a) on page 535. Of course, its removal leaves a hole at the root position. Because the heap's tree must be complete, we decide to fill the hole with the bottom rightmost element from the heap; now the structure satisfies the shape property (Figure 9.3b). However, the replacement value came from the bottom of the tree, where the smaller values are located; the tree no longer satisfies the order property of heaps.

This situation suggests one of the basic heap operations: Given a complete binary tree whose elements satisfy the heap order property except in the root position, repair the structure so that it again becomes a heap. This operation, called ReheapDown, involves moving the element down from the root position until it ends up in a position where the order property is satisfied (see Figure 9.3c). ReheapDown has the following specification.

Click To expand
Figure 9.3: The ReheapDown operation

We have tried to make this operation fairly general, by telling it where to find the root and the bottom rightmost element of the heap. Letting the root be a parameter, instead of just assuming that we start at the root of the whole heap, generalizes this routine, allowing us to perform the reheap operation on any subtree as well as on the original heap.

Now suppose we want to add an element to the heap. Where do we put it? The shape property tells us that the tree must be complete, so we put the new element in the next bottom rightmost place in the tree, as illustrated in Figure 9.4(a). Now the shape property is satisfied, but the order property may be violated. This situation illustrates the need for another basic heap operation: Given a complete binary tree containing N elements, whose first N - 1 elements satisfy the order property of heaps, repair the structure so that it again becomes a heap. To fix this structure, we need to float the Nth element up in the tree until it is in its correct place (see Figure 9.4b). This operation is called ReheapUp. Here is the specification:

Application Level

Heaps are unusual structures. Like arrays, they are used only as implementation structures for higher-level classes. The main application for a heap is as an implementation structure for a priority queue.

Implementation Level

Although we have graphically depicted heaps as binary trees with nodes and links, it would be very impractical to implement the heap operations using the usual linked-tree representation. The shape property of heaps tells us that the binary tree is complete, so we know that it never has any holes in it. Thus we can easily store the tree in an array with implicit links, as discussed in Chapter 8. Figure 9.5 shows how the values in a heap would be stored in such a representation. If a heap with numElements elements is implemented this way, the shape property says that the heap elements are stored in numElements consecutive slots in the array, with the root element placed in the first slot (index 0) and the last leaf node placed in the slot with index numElements - 1. The order property says that, for every nonleaf node heap.elements[index],

Click To expand
Figure 9.4: The ReheapUp operation
heap.elements[index] >= heap.elements[index * 2 + 1]

and, if there is a right child,

heap.elements[index] >= heap.elements[index * 2 + 2]

Click To expand
Figure 9.5: Heap values in an array representation

We use the following declarations to support this heap implementation:

template<class ItemType>
// Assumes ItemType is either a built-in simple type or a class
// with overloaded relational operators.
struct HeapType
{
  void ReheapDown(int root, int bottom);
  void ReheapUp(int root, int bottom);
  ItemType* elements;   // Array to be allocated dynamically
  int numElements;
};

This declaration is somewhat different from anything we have used so far. We are making HeapType be a struct with member functions. Why not make it into a class? Because heaps are seldom used alone. They are acted upon by other structures rather than being active. We define the functions that restore the heap property as part of the struct, but we also allow access to the data members. Our algorithms are very general, with the positions of both the root and the bottom elements passed as parameters. Additionally, we have chosen to include the data member numElements to record the number of elements on the heap, although our example algorithms do not use it.

We've specified the utility operations ReheapDown and ReheapUp to fix heaps that are "broken" at one end or the other. Now let's look at these operations in more detail.

When ReheapDown is first called, two possibilities arise. If the value in the root node (heap.elements [0]) is greater than or equal to the values in its children, the order property remains intact and we don't have to do anything. Otherwise, we know that the maximum value of the tree is in either the root node's left child (heap.elements[1]) or the right child (heap.elements[2]). We must swap one of these values with the smaller value in the root. Now the subtree rooted at the node that is swapped is a heap-except (possibly) for its root node. We apply the same process again, asking whether the value in this node is greater than or equal to the values in its children. We continue to test smaller and smaller subtrees of the original heap, moving our original root node down, until (1) the root of the current subtree is a leaf node or (2) the value in the root of the current subtree is greater than or equal to the values in both its children.

The algorithm for this function is given here and illustrated with an example in Figure 9.6. At the start, root is the index of the node that (possibly) violates the heap order property.

Click To expand
Figure 9.6: The ReheapDown operation

This algorithm is recursive. In the general case, we swap the value in the root node with its larger child, and then repeat the process. On the recursive call, we specify maxChild as the root of the heap; this shrinks the size of the tree still to be processed, satisfying the smaller-caller question. Two base cases exist: (1) if heap.elements [root] is a leaf, and (2) if the heap order property is already intact. In either of these cases, we do nothing. How can we tell if heap.elements[root] is a leaf? If the calculated position of the left child is greater than bottom, then it is a leaf node. Look back at Figure 9.5. The node with F is the first leaf node. The index of the node with F is 5, so its left child would be in index position 11 if it exists. Because 11 is greater than 9 (bottom), F's left child is not in the heap, so F is a leaf node.

To determine maxChild, we first check whether the current root node has only a single child. If so, it is a left child (because the tree is complete), and we set maxChild to its index. Otherwise, we compare the values in the two child nodes and set maxChild to the index of the node that has the larger value.

The following code gives the whole function. It uses a utility function, Swap, that swaps the values of its two parameters. (Because this function is trivial, we do not show its implementation here.)

template<class ItemType>
void HeapType<ItemType>::ReheapDown(int root, int bottom)
// Post: Heap property is restored.
{
  int maxChild;
  int rightChild;
  int leftChild;

  leftChild = root*2+1;
  rightChild = root*2+2;
  if (leftChild <= bottom)
  {
    if (leftChild == bottom)
      maxChild = leftChild;
    else
    {
      if (elements[leftChild] <= elements[rightChild])
        maxChild = rightChild;
      else
        maxChild = leftChild;
    }
    if (elements[root] < elements[maxChild])
    {
      Swap(elements[root], elements[maxChild]);
      ReheapDown(maxChild, bottom);
    }
  }
}

The complementary operation, ReheapUp, takes a leaf node that violates the order property of heaps and moves it up until its correct position is found. We compare the value in the bottom node with the value in its parent node. If the parent's value is smaller, the order property is violated, so we swap the two nodes. Then we examine the parent, repeating the process until (1) the current node is the root of the heap or (2) the value in the current node is less than or equal to the value in its parent node. The algorithm for this function is given below and illustrated in Figure 9.7.

Click To expand
Figure 9.7: The ReheapUp operation

This algorithm is also recursive. In the general case, we swap the (current) "bottom" node with its parent and reinvoke the function. On the recursive call, we specify parent as the bottom node; this shrinks the size of the tree still to be processed, so the smaller-caller question can be answered affirmatively. Two base cases exist: (1) if we have reached the root node or (2) if the heap order property is satisfied. In either case, we exit the function without doing anything.

How do we find the parent node? This task is not an easy one in a binary tree linked together with pointers from parent to child nodes. As we saw earlier, however, it is very simple in our implicit link implementation:

parent = (index - 1) / 2;

We can now code the whole function:

template<class ItemType>
void HeapType<ItemType>::ReheapUp(int root, int bottom)
// Post: Heap property is restored.
{
  int parent;

  if (bottom > root)
  {
    parent = (bottom-1) / 2;
    if (elements[parent] < elements[bottom])
    {
      Swap(elements[parent], elements[bottom]);
      ReheapUp(root, parent);
    }
  )
}

Application Level Revisited

A heap is an excellent way to implement a priority queue. The following code shows how we declare our PQType class. For brevity, we omit the copy constructor and leave its coding as an exercise.

class FullPQ()
{};
class EmptyPQ()
{}:
template<class ItemType>
class PQType
{
public:
  PQType(int);
  ~PQType();
  void MakeEmpty();
  bool IsEmpty() const;
  bool IsFull() const;
  void Enqueue(ItemType newItem);
  void Dequeue(ItemType& item);
private:
  int length;
  HeapType<ItemType> items;
  int maxItems;
};

template<class ItemType>
PQType<ItemType>::PQType(int max)
{
  maxItems = max;
  items.elements = new ItemType[max];
  length = 0;
}

template<class ItemType>
void PQType<ItemType>::MakeEmpty()
{
  length = 0;
}

template<class ItemType>
PQType<ItemType>::~PQType()
{
  delete [] items.elements;
}

We keep the number of elements in a priority queue in the data member length. Using the heap implementation described earlier, the elements are stored in the first length slots of the array items.elements. Because of the order property, we know that the largest element is in the root-that is, in the first array slot (index 0).

Let's look first at the Dequeue operation. The root element is returned to the caller. After we remove the root, we are left with two subtrees, each of which satisfies the heap property. Of course, we cannot leave a hole in the root position, as that would violate the shape property. Because we have removed an element, length - 1 elements are now left in the priority queue, stored in array slots 1 through length - 1. If we fill the hole in the root position with the bottom element, array slots 0 through length - 2 contain the heap elements. The heap shape property is now intact, but the order property may be violated. The resulting structure is not a heap, but it is almost a heap-all of the nodes except the root node satisfy the order property. This problem is an easy one to correct, as we already have a heap operation to perform exactly this task: ReheapDown. Here is our algorithm for Dequeue:

The Enqueue operation involves adding an element in its "appropriate" place in the heap. Where is this place? If the new element's priority is larger than the current root element's priority, we know that the new element belongs in the root. But that's not the typical case; we want a more general solution. To start, we can put the new element at the bottom of the heap, in the next available leaf position (review Figure 9.4). Now the array contains elements in the first length + 1 slots, preserving the heap shape property. The resulting structure is probably not a heap, but it's almost a heap-the order property is violated in the last leaf position. This problem is easy to solve using the ReheapUp operation. Here is our algorithm for Enqueue:

template<class ItemType>
void PQType<ItemType>::Dequeue(ItemType& item)
// Post: element with highest priority has been removed
//       from the queue; a copy is returned in item.
{
  if (length == 0)
    throw EmptyPQ();
  else
  {
    item = items.elements[0];
    items.elements[0] = items.elements[length-1];
    length--;
    items.ReheapDown(0, length-1);
  }
}

template<class ItemType>
void PQType<ItemType>::Enqueue(ItemType newItem)
// Post: newItem is in the queue.
{
  if (length == maxItems)
    throw FullPQ();
  else
  {
    length++;
    items.elements[length-1] = newItem;
    items.ReheapUp(0, length-1);
  }
}
template<class ItemType>
bool PQType<ItemType>::IsFull() const
// Post: Returns true if the queue is full and false otherwise.
{
  return length == maxItems;
}

template<class ItemType>
bool PQType<ItemType>::IsEmpty() const
// Post: Returns true if the queue is empty and false otherwise.
{
  return length == 0;
}

Heaps versus Other Priority Queue Representations

How efficient is the heap implementation of a priority queue? The MakeEmpty, IsEmpty, and IsFull operations are trivial, so we examine only the operations to add and remove elements.

Enqueue puts the new element into the next free leaf node in the heap. This array position can be accessed directly, so this part of the operation has O(1) complexity. Next, we invoke ReheapUp to correct the order. This operation moves the new element up the tree, level by level; because a complete tree is of minimum height, at most log2N levels exist above the new element (N = length). Thus Enqueue is an O(log2N) operation.

Dequeue removes the element in the root node and replaces it with the bottom rightmost leaf node. Both of these elements in the array can be accessed directly, so this part of the operation has O(1) complexity. Next, we invoke ReheapDown to correct the order. This operation moves the root element down in the tree, level by level. At most log2N levels exist below the root, so Dequeue is also an O(log2N) operation.

How does this implementation compare to the others mentioned earlier in this section? If we implement the priority queue with a linked list, sorted from largest to smallest priority, Dequeue merely removes the first node from the list-an O(1) operation. Enqueue, however, may have to search all the elements in the list to find the appropriate insertion place; it is an O(N) operation.

If we implement the priority queue using a binary search tree, the efficiency of the operations depends on the shape of the tree. When the tree is bushy, both Dequeue and Enqueue are O(log2N) operations. In the worst case, if the tree degenerates to a linked list sorted from smallest to largest priority, both Dequeue and Enqueue have O(N) complexity. Table 9.1 summarizes the efficiency of the various implementations.

Table 9.1: Comparison of Priority Queue Implementations
 

Enqueue

Dequeue

Heap

O(log2N)

O(log2N)

Linked list

O(N)

O(1)

Binary search tree

   
  • Balanced

O(log2N)

O(log2N)

  • Skewed

O(N)

O(N)

Overall, the binary search tree looks good, if it is balanced. It can, however, become skewed, which reduces the efficiency of the operations. The heap, on the other hand, is always a tree of minimum height. It is not a good structure for accessing a randomly selected element, but that is not one of the operations defined for priority queues. The accessing protocol of a priority queue specifies that only the largest (or highest-priority) element can be accessed. The linked list is a good choice for this operation (assuming the list is sorted from largest to smallest), but we may have to search the entire list to find the place to add a new element. For the operations specified for priority queues, therefore, the heap represents an excellent choice.



Previous Section
 < Free Open Study > 
Next Section
Converted from CHM to HTML with chm2web Pro 2.85 (unicode)