Detailed Analysis of the Binary Search

As established in CS1, the best-case running time of a binary search on n sorted elements is O(1), corresponding to finding the element we are looking for on the very first comparison. We also showed that the worst-case running time is O(log n).

This leads to the question: What is the average case running time of a binary search on n sorted elements?

In order to answer this question, we will make some assumptions:

1) The value we are searching for is in the array.

2) Each value is equally likely to be in the array.

3) The size of the array is n = 2^k - 1, where k is a positive integer.

The first assumption isn't necessary, but makes life easier so we don't have to assign a probability to how often a search fails.

The second assumption is necessary since we don't actually know how often each value would be searched for.

The third assumption will make our math easier, since the sum we have to calculate will follow a clear pattern. (The general result we obtain still holds without this assumption.)

First, we note that using exactly 1 comparison, we can find 1 element. Using exactly 2 comparisons, there are 2 possible elements we can find. In general, using exactly k comparisons, we can find 2^(k-1) elements. (To see this, consider doing a binary search on the array 2, 5, 6, 8, 12, 17, 19. 8 would be found in 1 comparison, 5 and 17 in 2 comparisons, and 2, 6, 12 and 19 would be found in 3 comparisons.)

The expected number of comparisons made when running the algorithm is the sum, over all elements, of the number of comparisons needed to find that element multiplied by the probability that it is the one being searched for (which is 1/n under our assumptions). If we let p(j) represent the number of comparisons it takes to find element j, then the sum we have is:

average = (1/n) · [p(1) + p(2) + ... + p(n)]

Now, the trick will be to determine that sum. But we have already outlined that p(j) will be 1 for one value of j, 2 for 2 values of j, 3 for 4 values of j, etc. Since n = 2^k - 1, we can rewrite the sum as follows:

average = (1/n) · [1·2^0 + 2·2^1 + 3·2^2 + ... + k·2^(k-1)]

This is because the comparison count j appears exactly 2^(j-1) times in the original sum.

We can determine the sum using the technique shown in lab. Let S denote the sum, and write out both S and 2S:

S = 1·2^0 + 2·2^1 + 3·2^2 + ... + k·2^(k-1)

2S = 1·2^1 + 2·2^2 + 3·2^3 + ... + k·2^k

Subtracting the bottom equation from the top, we get the following:

S - 2S = (2^0 + 2^1 + 2^2 + ... + 2^(k-1)) - k·2^k

-S = (2^k - 1) - k·2^k

S = k·2^k - 2^k + 1

S = (k-1)·2^k + 1

Thus, the average run-time of the binary search is

average = S/n = ((k-1)·2^k + 1) / (2^k - 1), which is approximately k - 1.

So, for this particular algorithm, the average-case run-time is much closer to the worst-case run-time than to the best-case run-time. (Notice that the worst-case number of comparisons is k, while the average number of comparisons is approximately k - 1, where k = log₂(n+1).)

Recurrence Relations and More Analysis

A powerful tool in algorithm analysis for recursive methods is recurrence relations. You should have seen some of this in CS1.

Let's use the Towers of Hanoi as an example. Let T(n) denote the minimal number of moves it takes to transfer a tower of n disks. We know that T(1) = 1. We also previously made the following observation:

In order to move a tower of n disks, we are FORCED to move a tower of n-1 disks, then move the bottom disk, followed by moving a tower of n-1 disks again. Thus we have:

T(n) ≥ T(n-1) + 1 + T(n-1), so

T(n) ≥ 2T(n-1) + 1.

But we know we can achieve equality using the method prescribed in class last time, so it follows that

T(n) = 2T(n-1) +1, and T(1) = 1.

We will use iteration to solve this recurrence relation:

T(n) = 2T(n-1) + 1

= 2(2T(n-2) + 1) + 1

= 4T(n-2) + 3

= 4(2T(n-3) + 1) + 3

= 8T(n-3) + 7

...

= 2^j T(n-j) + 2^j - 1, after j steps of iteration; taking j = n-1:

= 2^(n-1) T(1) + 2^(n-1) - 1

= 2^(n-1) + 2^(n-1) - 1

= 2·2^(n-1) - 1

= 2^n - 1.

Now, let's use induction to verify the result that T(n) = 2^n - 1.

Base case: n = 1. LHS = T(1) = 1; RHS = 2^1 - 1 = 1.

Inductive hypothesis: Assume that for an arbitrary value n = k, T(k) = 2^k - 1.

Inductive step: Prove that for n = k+1, T(k+1) = 2^(k+1) - 1.

T(k+1) = 2T(k) + 1, using the given recurrence

= 2(2^k - 1) + 1, using the inductive hypothesis

= 2^(k+1) - 2 + 1

= 2^(k+1) - 1, and the proof is finished.

The Change Problem

"The Change Store" was an old SNL skit (a pretty dumb one...) where they would say things like, "You need change for a 20? We'll give you two tens, or a ten and two fives, or four fives, etc."

If you are a dorky-minded CS 2 student, you might ask yourself (after you ask yourself why those writers get paid so much for writing the crap that they do), "Given a certain amount of money, how many different ways are there to make change for that amount of money?"

Let us simplify the problem as follows:

Given a positive integer n, how many ways can we make change for n cents using pennies, nickels, dimes and quarters?

Recursively, we could break down the problem as follows:

To make change for n cents we could:

1) Give the customer a quarter. Then we have to make change for n-25 cents.

2) Give the customer a dime. Then we have to make change for n-10 cents.

3) Give the customer a nickel. Then we have to make change for n-5 cents.

4) Give the customer a penny. Then we have to make change for n-1 cents.

If we let T(n) = number of ways to make change for n cents, we get the formula

T(n) = T(n-25)+T(n-10)+T(n-5)+T(n-1)

Is there anything wrong with this?

If you plug in the initial conditions T(1) = 1, T(0) = 1, and T(n) = 0 if n < 0, you will get an answer, but it is too large: this recurrence counts each ordering of the same coins as a separate way to make change.