************************ Searching and Complexity ************************ * Before continuing, we should take a moment and reflect on what we have learned so far * The Python programming environment is a relatively familiar sight (Colab, IPython, PyCharm, Spyder) * You know how to ``print`` and ``input`` information * You know about types, values, and variables * You have been making use of functions * You have learned about basic logic and know how to use ``if`` and ``else`` * ``for`` and ``while`` loops are tools you have used * You learned how to use objects * Linear data structures, like a ``List``, are familiar and you are used to working with reference variables * You know how to read and write data from/to files * You have learned how to write classes and define your own objects * You can look at Python code and have a reasonable understanding of what it is doing * You know how to write your own code, one step at a time * You know the importance of testing your code and how to write tests * The last few topics of the course have been heavy on data and data structures * At this stage of the course, we have actually equipped ourselves with the majority of tools programmers use on a daily basis * Here, we will start to focus more on algorithms and how we can use all the tools we have learned to solve more interesting problems * Further, we will take some time to think about the amount of work the algorithm does and what makes an algorithm *good* Linear Search ============= * By now we have seen linear search multiple times * Given our familiarity of the problem and the linear search algorithm, we will use it as a starting point to begin thinking a little deeper into our algorithms .. code-block:: python :linenos: def linear_search(haystack, needle): for element in haystack: if element == needle: return True return False * Having a correct algorithm for a problem is clearly important * But we also want to make sure we have a *good* algorithm * However, what does it mean for an algorithm to be *good*? .. admonition:: Activity :class: activity Discuss this with your neighbours: * On average, how many iterations of the loop does ``linear_search`` need before it finds the ``needle``? * If the element you are looking for is *not* in the list, how many times will the loop run? * What would be the situation to cause the loop to only need to run one iteration? * How much *space* does your algorithm require? * Is your solution the best possible? * Might there exist some super clever algorithm that is somehow better (faster) than yours? .. raw:: html Complexity Analysis ------------------- * The above activity highlights the major ideas used to discuss the *goodness* of an algorithm * More precisely, we analyze the *Computational Complexity* of an algorithm * The amount of resources the algorithm needs to run * The main focus is the *time complexity* and the *space* complexity * Time complexity is the number of basic operations the algorithm requires * Space complexity is the amount of memory the algorithm requires * To keep things simple, it is common to think about the worst case scenario for the complexity analysis * We also like to consider the amount of work required based on the size of the input * For example, the worst case scenario for ``linear_search`` is if the ``needle`` does not exist within ``haystack`` * If ``haystack`` has length of 10, the loop will run 10 times before we conclude that ``needle`` is not there * If ``haystack`` has length of 100, the loop will run 100 times * If ``haystack`` has length of 10,000,000, the loop will run 10,000,000 times * If ``haystack`` has length of :math:`n`, the loop will run :math:`n` times * As for space complexity, this ``linear_search`` only requires space for storing ``haystack`` * Assuming ``haystack`` has length :math:`n`, then we require :math:`n` amount of space Binary Search ============= .. admonition:: Activity :class: activity I will guess a number between :math:`0` -- :math:`1023` in :math:`10` or less guesses. There is a catch however --- you have to tell me if the number is *higher* or *lower* than my guess before I guess again. * This is an example of a *binary search* * Like linear search, binary search is used to find an element within some collection * However, if I was doing a linear search for a number between :math:`0` -- :math:`1023`, I could only guarantee that I would find the element in :math:`1024` guesses * But with binary search, I was able to do it in :math:`10` or less guesses * Though, this required the higher/lower information --- the data was sorted * A linear search has no such requirement Complexity Analysis ------------------- * The magic with binary search is that, with every guess I made, I was able to eliminate half of the remaining numbers * My first guess was :math:`512` --- if you said *lower* I know the number is between :math:`0` -- :math:`511`, if you said *higher* I know it's between :math:`513` -- :math:`1023` * To generalize the idea, if I had :math:`n` numbers, and I guess the number :math:`\frac{n}{2}` * If you say lower, then the number must be between :math:`0` -- :math:`(\frac{n}{2} - 1)` * If you said higher, then the number must be between :math:`(\frac{n}{2} + 1)` -- :math:`(n - 1)` * With linear search, every guess only eliminated one number * Once again, let's consider the worse case scenario --- the ``needle`` is not within the ``haystack`` of size :math:`1024` * One guess gets me to :math:`512` numbers * Two guesses gets me to :math:`256` numbers * Three guesses get me to :math:`128` * Four gets me to :math:`64` * Five gets me to :math:`32` * Six gets me :math:`16` * Seven :math:`8` * Eight :math:`4` * Nine :math:`2` * Ten :math:`1` * Originally, with linear search, the relationship between the input :math:`n` and the amount of work is :math:`n` * If there are :math:`n` things in the ``haystack``, I have to look at all :math:`n` * If we doubled the size to :math:`2n`, the amount of things needed to be looked at also doubles to :math:`2n` * With binary search however, the relationship between the size of the input :math:`n` and the amount of work is :math:`log_{2}(n)` * Doubling the size to :math:`2n` only adds one more guess * Given that binary search requires :math:`log_{2}(n)` basic operations vs. linear search's :math:`n`, binary search is the clear winner * But, there is no such thing as a free lunch * With binary search, we have the catch that the data must be sorted * This is a *very* common pattern in developing algorithms * The more *general* your algorithm is, the worse the solution * The more you know about the *structure* of your problem, the more opportunities you have to use that knowledge to improve your algorithm Linear Search in Other Programming Languages ============================================ * At this stage we have been programming exclusively in Python * However, there are many other programming languages * Learning a new programming language may feel intimidating, but you may be surprised at how similar many of them are * First, the underlying algorithms are the same, regardless of the language --- a linear search is a linear search * Second, even the syntax between many languages are remarkably similar * Below is a collection of linear search algorithms in various popular programming languages * The purpose of their inclusion here is to get a sense of how similar and dissimilar programing languages can be * Despite never learning the various languages, chances are you can still understand much of the code completely Python ------ .. code-block:: python :linenos: def linear_search(haystack: list[int], needle: int) -> bool: for i in range(len(haystack)): if haystack[i] == needle: return True return False Java ---- .. code-block:: java :linenos: public static boolean linearSearch(int[] haystack, int needle){ for(int i = 0 ; i < haystack.length ; i++){ if(haystack[i] == needle){ return true; } } return false; } C# -- .. code-block:: c# :linenos: public static boolean linearSearch(int[] haystack, int needle){ for(int i = 0 ; i < haystack.length ; i++){ if(haystack[i] == needle){ return true; } } return false; } C++ --- .. code-block:: cpp :linenos: bool linear_search(std::vector haystack, int needle){ for(int i = 0 ; i < haystack.size() ; i++){ if(haystack[i] == needle){ return true; } } return false; } C - .. code-block:: c :linenos: bool linear_search(int haystack[], int n, int needle){ for(int i = 0 ; i < n ; i++){ if(haystack[i] == needle){ return true; } } return false; } Haskell ------- * Below you will see a linear search that looks quite different from the previous * Haskell is an entirely different kind of programming language --- it is a *functional* programming language * It is, for better or worse, not nearly as popular as the languages seen in the above examples .. code-block:: haskell :linenos: linear_search :: Eq a => [a] -> a -> Bool linear_search [] _ = False linear_search (x:xs) y = x==y || linear_search xs y For Next Class ============== * Read `Chapter 14 of the text `_