Topic 8 - Searching Techniques

Download as pdf or txt
Download as pdf or txt
You are on page 1of 40

TOPIC 8:

SEARCHING
TECHNIQUES
Sequential Search on an Unordered File
Sequential Search on an Ordered File
OUTLINE Linear Search
Binary Search
Interpolation Search
There are some very common
problems that we use computers to
solve:

-Searching through a lot of records


for a specific record or set of
COMMON records

- P l a c i n g r e c o r d s i n o r d e r, w h i c h w e
PROBLEMS call sorting

-There are numerous algorithms to


perform searches and sorts. We will
briefly explore a few common ones.
SEARCHING
A question you should always ask when
selecting a search algorithm is “How
fast does the search have to be?”
The reason is that, in general, the faster
the algorithm is, the more complex it is.
Bottom line: you don’t always need to
use or should use the fastest
algorithm.
Let’s explore the following search
algorithms, keeping speed in mind.
Sequential (linear) search
Binary search
Sequential Search on an Unordered File

Sequential Search on an Unordered File

Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record != key) and (still more records) )
Get the next record
End_while

When do we know that there wasn’t a record in the file that


matched the key?
Sequential Search on an Ordered File

• Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record < key) and (still more records) )
Get the next record
End_while
If ( record = key )
Then success
Else there is no match in the file
End_else
• When do we know that there wasn’t a record in the file that matched the key?
Sequential Search of
Ordered vs. Unordered List

• Let’s do a comparison.
• If the order was ascending alphabetical on customer’s last names, how would the search for
John Adams on the ordered list compare with the search on the unordered list?

o Unordered list
–if John Adams was in the list?
–if John Adams was not in the list?
o Ordered list
–if John Adams was in the list?
–if John Adams was not in the list?
Ordered vs Unordered (con’t)

• How about George Washington?


o Unordered
– if George Washington was in the list?
– If George Washington was not in the list?
o Ordered
– if George Washington was in the list?
– If George Washington was not in the list?
• How about James Madison?
Ordered vs Unordered (con’t)

• Observation: the search is faster on an ordered list only when the item
being searched for is not in the list.
• Also, keep in mind that the list has to first be placed in order for the
ordered search.
• Conclusion: the efficiency of these algorithms is roughly the same.
• So, if we need a faster search, we need a completely different algorithm.
• How else could we search an ordered file?
Linear Search

• It is a very simple search algorithm.


• In this type of search, a sequential search is made over all items one by one.
• Every item is checked and if a match is found, then that particular item is returned, otherwise
the search continues till the end of the data collection.
Algorithm
Linear Search ( Array A, Value x)

Step 1: Set i to 1
Step 2: if i > n then go to step 7
Step 3: if A[i] = x then go to step 6
Step 4: Set i to i + 1
Step 5: Go to Step 2
Step 6: Print Element x Found at index i and go to step 8
Step 7: Print element not found
Step 8: Exit
Pseudocode

procedure linear_search (list, value)

for each item in the list


if match item == value
return the item's location
end if
end for

end procedure
Binary Search

• If we have an ordered list and we know how many things


are in the list (i.e., number of records in a file), we can use
a different strategy.
• The binary search gets its name because the algorithm
continually divides the list into two parts.
Binary Search

• It is a fast search algorithm with run-time complexity


of O(log n).
• This search algorithm works on the principle of divide
and conquer.

Note: For this algorithm to work properly, the data


collection should be in the sorted form.
Binary Search

• Binary search looks for a particular item by comparing the


middle most item of the collection.
• If a match occurs, then the index of the item is returned.
• If the middle item is greater than the item, then the item is
searched in the sub-array to the left of the middle item.
Otherwise, the item is searched for in the sub-array to the
right of the middle item.
• This process continues on the sub-array as well until the size
of the sub-array reduces to zero.
How a Binary Search Works

Always look at the center


value. Each time you get to
discard half of the remaining
list.
Is this fast ?
How a Binary Search Works

For a binary search to work, it is mandatory for the target array to be


sorted. We shall learn the process of binary search with a pictorial
example. The following is our sorted array and let us assume that we
need to search the location of value 31 using binary search.
How a Binary Search Works (con’t)
First, we shall determine half of the array by using this formula −

mid = low + (high - low) / 2

Here it is, 0 + (9 - 0 ) / 2 = 4 (integer value of 4.5). So, 4 is the mid of


the array.
How a Binary Search Works (con’t)
Now we compare the value stored at location 4, with the value being
searched, i.e. 31. We find that the value at location 4 is 27, which is not a
match. As the value is greater than 27 and we have a sorted array, so we
also know that the target value must be in the upper portion of the array.

We change our low to mid + 1 and find the new mid value again.

low = mid + 1
mid = low + (high - low) / 2
How a Binary Search Works (con’t)
Our new mid is 7 now. We compare the value stored at location 7 with
our target value 31.

The value stored at location 7 is not a match, rather it is more than what we
are looking for. So, the value must be in the lower part from this location.
How a Binary Search Works (con’t)
Hence, we calculate the mid again. This time it is 5.

We compare the value stored at location 5 with our target value. We find
that it is a match.

We conclude that the target value 31 is stored at location 5.


How a Binary Search Works (con’t)

Binary search halves the


searchable items and thus
reduces the count of
comparisons to be made to very
less numbers.
How Fast is a Binary Search?

• Worst case: 11 items in the list took 4 tries


• How about the worst case for a list with 32 items ?
o 1st try - list has 16 items
o 2nd try - list has 8 items
o 3rd try - list has 4 items
o 4th try - list has 2 items
o 5th try - list has 1 item
How Fast is a Binary Search? (con’t)
List has 250 items List has 512 items

1st try - 125 items 1st try - 256 items


2nd try - 63 items 2nd try - 128 items
3rd try - 32 items 3rd try - 64 items
4th try - 16 items 4th try - 32 items
5th try - 8 items 5th try - 16 items
6th try - 4 items 6th try - 8 items
7th try - 2 items 7th try - 4 items
8th try - 1 item 8th try - 2 items
9th try - 1 item
What’s the Pattern?

• List of 11 took 4 tries


• List of 32 took 5 tries
• List of 250 took 8 tries
• List of 512 took 9 tries

• 32 = 25 and 512 = 29
• 8 < 11 < 16 23 < 11 < 24
• 128 < 250 < 256 27 < 250 < 28
A Very Fast Algorithm!
• How long (worst case) will it take to find an
item in a list 30,000 items long?
210 = 1024 213 = 8192
211 = 2048 214 = 16384
212 = 4096 215 = 32768

• So, it will take only 15 tries!


Lg n Efficiency

• We say that the binary search algorithm


runs in log2 n time. (Also written as lg n)
• Lg n means the log to the base 2 of some
value of n.
• 8 = 23 lg 8 = 3 16 = 24 lg 16 = 4
• There are no algorithms that run faster than
lg n time.
Procedure binary_search
A ← sorted array
n ← size of array
x ← value to be searched

Set lowerBound = 0
Pseudocode Set upperBound = n

while (lowerBound <= upperBound)

set midPoint = lowerBound + (upperBound- lowerBound) / 2

if A[midPoint] = x
EXIT: x found at location midpoint
if A[midPoint] < x
set lowerBound = midPoint + 1

if A[midPoint] > x
set upperBound = midPoint - 1

if upperBound < lowerBound


EXIT: x does not exists.
end while

end procedure
Interpolation Search

• Interpolation is a searching technique that finds a specified


value in a sorted array.
• The concept of interpolation search is similar to how we
search for names in a telephone book. For example, when
looking for a name “Bob” in a telephone directory, we know
that it will be near the extreme left, so applying a binary
search technique by dividing the list in two halves each time
is not a good idea. We must start scanning the extreme left in
the first pass itself.
Interpolation Search

• It is an improved variant of binary search, this algorithm


works on the probing position of the required value.

Note: for this algorithm to work properly, the data collection


should be in a sorted form and equally distributed.

Binary search has a huge advantage of time complexity over


linear search. Linear search has worst-case complexity of Ο(n)
whereas binary search has Ο(log n).
Positioning in Binary Search

• In binary search, if the desired data is not found then the rest
of the list is divided in two parts, lower and higher. The
search is carried out in either of them.
Positioning Probing in Interpolation
Search

• Interpolation search finds a particular item by computing the


probe position. Initially, the probe position is the position of
the middle most item of the collection.
Positioning Probing in Interpolation
Search (con’t)

• If a match occurs, then the index of the item is returned. To


split the list into two parts, we use the following method −

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])


where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
A[n] = Value stored at index n in the list
Positioning Probing in Interpolation
Search (con’t)

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])


where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
A[n] = Value stored at index n in the list
Positioning Probing in Interpolation
Search (con’t)

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])


where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
A[n] = Value stored at index n in the list
Positioning Probing in Interpolation
Search (con’t)
23 67 90 111 211 345 365 400 413 999

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])


where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
A[n] = Value stored at index n in the list
Positioning Probing in Interpolation
Search (con’t)

• If the middle item is greater than the item, then the probe position is
again calculated in the sub-array to the right of the middle item.
Otherwise, the item is searched in the subarray to the left of the middle
item. This process continues on the sub-array as well until the size of
subarray reduces to zero.

• Runtime complexity of interpolation search algorithm is Ο(log (log n)) as


compared to Ο(log n) of BST in favorable situations.
Algorithm
• As it is an improvisation of the existing BST algorithm, we are
mentioning the steps to search the 'target' data value index, using
position probing −

Step 1 − Start searching data from middle of the list.


Step 2 − If it is a match, return the index of the item, and exit.
Step 3 − If it is not a match, probe position.
Step 4 − Divide the list using probing formula and find the new middle.
Step 5 − If data is greater than middle, search in higher sub-list.
Step 6 − If data is smaller than middle, search in lower sub-list.
Step 7 − Repeat until match.
A → Array list
N → Size of A
X → Target Value

Procedure Interpolation_Search()

Set Lo → 0
Set Mid → -1
Set Hi → N-1

While X does not match

Pseudocode if Lo equals to Hi OR A[Lo] equals to A[Hi]


EXIT: Failure, Target not found
end if

Set Mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X -


A[Lo])

if A[Mid] = X
EXIT: Success, Target found at Mid
else
if A[Mid] < X
Set Lo to Mid+1
else if A[Mid] > X
Set Hi to Mid-1
end if
end if
End While
End Procedure
Questions?

You might also like