A search is an algorithm that takes a collection and a target item and determines whether the target is in the collection, often returning the index of the target.
The simplest search algorithm is a “linear search,” which traverses the items of the collection in order, stopping if it finds the target. In the worst case it has to traverse the entire collection, so the run time is linear.
in operator for sequences uses a linear search; so do string methods like
If the elements of the sequence are in order, you can use a bisection search, which is O(logn). Bisection search is similar to the algorithm you probably use to look a word up in a dictionary (a real dictionary, not the data structure). Instead of starting at the beginning and checking each item in order, you start with the item in the middle and check whether the word you are looking for comes before or after. If it comes before, then you search the first half of the sequence. Otherwise you search the second half. Either way, you cut the number of remaining items in half.
If the sequence has 1,000,000 items, it will take about 20 steps to find the word or conclude that it’s not there. So that’s about 50,000 times faster than a linear search.
Write a function called
bisection that takes a sorted list and a target value and returns the index of the value in the list, if it’s there, or
None if it’s not.
Or you could read the documentation of the
bisect module and use that!
Bisection search can be much faster than linear search, but it requires the sequence to be in order, which might require extra work.
There is another data structure, called a hashtable that is even faster—it can do a search in constant time—and it doesn’t require the items to be sorted. Python dictionaries are implemented using hashtables, which is why most dictionary operations, including the
in operator, are constant time.