26. Binary Search Trees

  • With binary trees covered, a more specific type of binary trees can be discussed

    • Namely, the binary search tree

  • Compared to a binary tree, binary search trees have specific properties and constraints that must be followed

  • These properties and constraints can be taken advantage of for improvements in certain functionality

26.1. Binary Search Tree Definition

  • A binary search tree is a binary tree where:

    • All values in the left subtree are less than the value in the root node

    • All values in the right subtree are greater than or equal to the value in the root node

Note

One could make the left subtree less than or equal to and the right subtree strictly greater than. All that matters is that only on subtree includes the equal to part.

Alternatively, one could make the right subtree exclusively greater than, but this would disallow duplicates.

../../_images/binary_search_tree_no_yes.png

Two binary trees. The binary tree on the left is a binary search tree. The binary tree on the right is not a binary search tree. This is because the value 16, which is in the left subtree of the subtree with the root 15, violates this property — 16 is greater than 15, thus 16 must be to the right of 15 if it was a binary search tree.

../../_images/binary_search_tree_base.png

Arbitrary binary search tree.

  • In the above example, note that the leftmost node is the smallest value in the tree and the rightmost is the largest

26.1.1. A Binary Search Tree is a Binary Tree

  • A binary search tree is a special case of a binary tree

    • Therefore, it will have all the operations a binary tree would have

  • But a few additional operations will be added

    • Add, but based on the important ordering

    • Remove, but must preserve ordering

    • Remove max

    • Remove min

26.2. Searching a Binary Search Tree

  • The special ordering property enables a more efficient search through the tree when compared to a search regular binary tree

  • For this section, assume the use of a linked implementation of a binary search tree

    • More on this below

26.3. Binary Search Tree Operations

  • Getting the size of the collection, checking if it’s is empty, and getting iterators should be relatively simple by this stage

  • The adding to and removing from the binary search tree is where the complexity comes in

26.3.1. Add

  • Adding is going to be similar to a binary search

  • Do a binary search until there is no child

  • Insert the element where the child does not exist

  • New nodes will be added as a leaf

 1Define add
 2    If the node does not exist
 3        Insert the new node here with the element to be added
 4
 5    If the node exists
 6        If the element is less than the node
 7            Call add on the left child node
 8
 9        If the element is greater than or equal to the node
10            Call add on the right child node
  • Starting an existing binary search tree, follow the pseudocode to add the value 13

../../_images/binary_search_tree_add0.png

Adding the value 13 to a binary search tree.

  • Starting with an empty tree, follow the pseudocode to add the following 6 elements

../../_images/binary_search_tree_add1.png

The creation of a binary search tree by starting with an empty tree and adding the elements 26, 15, 38,, 31, 7, and 34. Note that when the value 26 is added, it is being added to an empty tree, thus the root node does not exist.

26.3.2. Min & Remove Min

  • Given the nature of the binary search tree ordering, the node with the minimum element must be in the leftmost node in the tree

  • There are three cases to consider when looking for the minimum node:

    1. The leftmost node is the root (root has no left child)

    2. The leftmost is a leaf node (no children)

    3. The leftmost node is an interior node

  • Finding the minimum is simple

    • Assuming a root node exists, keep going to the left subtree until there is no more left subtree

  • Removing will be a little trickier since it may be removing something that has children that must remain in the tree

  • And further, not only must the children stay in the tree, but the proper binary search tree ordering must be preserved

  • Consider the following examples and think of how one would need to manage potential children of a node being removed

../../_images/binary_search_tree_remove_min0.png
../../_images/binary_search_tree_remove_min1.png
../../_images/binary_search_tree_remove_min2.png

Case 1

  • The minimum value is in the root node, this means that

    • There is no left subtree

    • There may be a right subtree

    • Therefore, simply replace the root with the root of the right subtree (which may be null)

  • The ordering is preserved since

    • Nothing is to the left of the root

    • Everything to the right of the root is larger than the value in the root

    • The right subtree is a binary search tree (by definition)

    • Therefore, if the right subtree becomes the new root, the binary search tree’s order will be maintained

Case 2

  • The minimum is a leaf node, this means that

    • There are no children to deal with

    • Therefore, just remove the node

  • By eliminating only a leaf node, the ordering will not be affected

Case 3

  • The minimum is an interior node, this means that

    • The node has no left subtree — otherwise the node would not contain the minimum value since the minimum must be the leftmost node

    • A right subtree exists

    • Therefore, replace it with the node’s right subtree’s root

      • The node being removed’s parent’s left child will become the node being removed’s right child

  • The ordering is preserved in the same way as case 1

    • Nothing is to the left of the node

    • Everything to the right of the node to be removed is larger than the value in the node

    • The right subtree is a binary search tree (by definition)

    • Therefore, if the right subtree replaces the node being removed, the binary search tree’s order will be maintained

All the Cases are the Same?

  • One may have noticed that the rules for each case are actually the same

  • Replace the node with the right subtree’s root

  • This is perhaps more obvious for cases 1 and 3

    • Consider that a leaf node’s right subtree is null

26.3.3. Max & Remove Max

  • Finding and/or removing the maximum value will be very similar to finding and/or removing the minimum

  • Instead of always going to the left subtree, go to the right to find the maximum

  • There are still the three cases to deal with, but like with minimum, they can all be addressed in the same way

    • Replace the node with the left subtree’s root

26.3.4. General Remove

  • A general remove is a little more complex than the remove min or max

../../_images/binary_search_tree_remove_uh.png

How would one remove the value 26, which happens to be the tree’s root, from this binary search tree?

  • In the above example, there may not be an immediately obvious or clear way to address the problem

Case 1

  • If the node being removed is a leaf node, then simply remove it

    • In the above example, if 7 or 34 is to be removed, just remove it

Case 2

  • If the node being removed is an interior node, then things get hairy

  • In some cases this may seem simple — if 15 is removed, just replace it with it’s child

  • But what happens if removing 26 from the tree?

  • The trick here is to replace the node with its inorder predecessor or successor

  • If 26 is removed

    • 15, the value in the tree that comes right before 26, could replace it

    • Or 31, the value in the tree that comes right after 26, could replace it

  • The reason this will work is, if looking for the predecessor

    • All values in the left subtree are smaller than the root and everything in the right subtree

    • The largest value in the left subtree will be greater than or equal to all other values in the left subtree

    • The largest value in the left subtree will be less than all values in the right subtree

    • Therefore, the largest value in the left subtree can replace 26 without destroying the ordering since

      • It’s greater than or equal to everything in the left subtree

      • And smaller than everything in the right subtree

  • Below is an example of removing multiple elements from a binary search tree

../../_images/binary_search_tree_remove1.png
../../_images/binary_search_tree_remove2.png
../../_images/binary_search_tree_remove3.png

26.3.5. Contains

  • Like the other data structures, a way to check if a given element exists within the collection is needed

  • Although an exhaustive depth first search through the tree would work, as discussed for the general binary tree would work

    • \(O(n)\)

  • Here, due to the nature of the binary search tree ordering, a binary search can be used

    • \(O(log_{2}(n))\)

26.3.6. Counting the Number of a Given Element

  • Similar to contains, to count the number of times a given element exists within a binary search tree, a binary search can be used

  • However, instead of returning True or False as soon as the element is found, continue the search after finding an element to continue counting

26.4. Degenerate vs. Balanced

  • A balanced tree has the property that for any node in the tree, the height of its left and right subtrees can differ by at most 1

    • Remember, the height of an empty tree is -1

../../_images/binary_search_tree_balance.png

The left tree is balanced as the heights of all subtrees in the tree differ by no more than 1. The tree on the right is not balanced since the root’s left subtree has a height of -1 and the right subtree has a height of 1 — a difference greater than 1.

  • The balanced property is important since it’s part of the reason the binary search tree is efficient to search

  • Consider adding the numbers 3, 5, 9, 12, 18, 20 in that order to an empty binary search tree

../../_images/binary_search_tree_degenerate_animation.gif

An extreme example of an imbalanced binary search tree. This was caused by adding the elements 3, 5, 9, 12, 18, 20 in order. Each element added is larger than all elements in the tree, which causes each newly added element to be added to the right subtrees.

  • This particular case is called a degenerate binary tree

    • It’s effectively a linear data structure, not a tree

../../_images/binary_search_tree_complexity.png
  • Consider a binary search

  • With a balanced binary search tree, going to the left or right subtrees eliminates roughly half the elements

    • This is what gives the \(O(log(n))\) search

  • With a degenerate tree, where it’s basically a linear data structure, all \(n\) elements must be searched through

    • Only one element is eliminated from the search every time a node is investigated

    • Thus, the search through a degenerate tree would be \(O(n)\)

  • Fortunately there exist strategies for keeping binary search trees balanced, but these are outside the scope of this course

    • Two popular examples are AVL Trees and Red-Black Trees

26.5. For next time

26.6. For Next Time

  • Read Chapter 11 Sections 1 – 3

    • 17 pages

26.6.1. Playing Code