6.6 Mutating Binary Search Trees

Now that we have seen how searching works on binary search trees, we will study the two mutating methods of the Multiset/Collection ADT: insertion and deletion. Insertion is covered in lab, so here we’ll only discuss deletion.

The basic idea is quite straightforward: Given an item to delete, we take the same approach as __contains__ to search for the item. If we find it, it will be at the root of a subtree (possibly a very small one—even just a leaf), where we delete it:

    def delete(self, item: Any) -> None:
        """Remove *one* occurrence of <item> from this BST.

        Do nothing if <item> is not in the BST.
        """
        if self.is_empty():
            pass
        elif self._root == item:
            self.delete_root()
        elif item < self._root:
            self._left.delete(item)
        else:
            self._right.delete(item)

    def delete_root(self) -> None:
        """Remove the root of this tree.

        Precondition: this tree is *non-empty*.
        """

Note that we are again using the strategy of defining a helper method, delete_root, to pull out part of the required functionality that’s a little tricky. This keeps our methods from growing too long, and also helps us break down a larger task into smaller steps.

We now need to work on delete_root. One thing we might try is to set self._root = None. Certainly this would remove the old value of the root, but this only works if the tree consists of just the root (with no children), so removing it makes the tree empty. In this case, we need to make sure that we also set _left and _right to None as well, to ensure the representation invariant is satisfied.

    def delete_root(self):
        if self._left.is_empty() and self._right.is_empty():
            self._root = None
            self._left = None
            self._right = None

What about the case when the tree has at least one other item? We can’t just set self._root = None, leaving a root value of None and yet a child that isn’t None; this would violate our representation invariant. We can think of it as leaving a “hole” in the tree. We can analyse the tree structure to detect two “easy” special cases: when at least one of the subtrees is empty, but the other one isn’t. In these cases, we can simply “promote” the other subtree up.

    def delete_root(self) -> None:
        if self._left.is_empty() and self._right.is_empty():
            self._root = None
            self._left = None
            self._right = None
        elif self._left.is_empty():
            # "Promote" the right subtree.
            self._root, self._left, self._right = \
                self._right._root, self._right._left, self._right._right
        elif self._right.is_empty():
            # "Promote" the left subtree.
            self._root, self._left, self._right = \
                self._left._root, self._left._left, self._left._right

Finally, we need to handle the case that both subtrees are non-empty. Rather than restructure the entire tree, we can fill the “hole” at the root by replacing the root item with another value from the tree (and then removing that other value from where it was). The key insight is that there are only two values we could replace it with and still maintain the BST property: the maximum (or, rightmost) value in the left subtree, or the minimum (or, leftmost) value in the right subtree. We’ll pick the left subtree here.

    def delete_root(self) -> None:
        if self._left.is_empty() and self._right.is_empty():
            self._root = None
            self._left = None
            self._right = None
        elif self._left.is_empty():
            # "Promote" the right subtree.
            self._root, self._left, self._right = \
                self._right._root, self._right._left, self._right._right
        elif self._right.is_empty():
            # "Promote" the left subtree.
            self._root, self._left, self._right = \
                self._left._root, self._left._left, self._left._right
        else:
            self._root = self._left.extract_max()

    def extract_max(self) -> object:
        """Remove and return the maximum item stored in this tree.

        Precondition: this tree is *non-empty*.
        """

We’ve once again kicked out the hard part to another helper, extract_max. Finding the maximum item is easy: just keep going right to bigger and bigger values until you can’t anymore. And removing that maximum is much easier than our initial problem of BST deletion because that maximum has at most one child, on the left. (How do we know that?) Here’s the method:

    def extract_max(self) -> object:
        """Remove and return the maximum item stored in this tree.

        Precondition: this tree is *non-empty*.
        """
        if self._right.is_empty():
            max_item = self._root
            # Once again, "Promote" the left subtree.
            self._root, self._left, self._right = \
                self._left._root, self._left._left, self._left._right
            return max_item
        else:
            return self._right.extract_max()

The single base case here is actually handling two scenarios: one in which self has a left (but no right) child, and one in which it has no children (i.e., it is a leaf). Confirm for yourself that both of these scenarios are possible, and that the single base case handles both of them correctly.

One deletion exercise

Try implementing delete_all, which is similar to delete_item except that it deletes all occurrences of an item from a BST. Think carefully about how to handle duplicate elements!