Topic 8 - Searching Techniques

TOPIC 8:
SEARCHING
TECHNIQUES
Sequential Search on an Unordered File
Sequential Search on an Ordered File
OUTLINE Linear Search
Binary Search
Interpolation Search
There are some very common
problems that we use computers to
solve:
-Searching through a lot of records

for a specific record or set of
COMMON records
- P l a c i n g r e c o r d s i n o r d e r, w h i c h w e
PROBLEMS call sorting
-There are numerous algorithms to

perform searches and sorts. We will
briefly explore a few common ones.
SEARCHING
A question you should always ask when
selecting a search algorithm is “How
fast does the search have to be?”
The reason is that, in general, the faster
the algorithm is, the more complex it is.
Bottom line: you don’t always need to
use or should use the fastest
algorithm.
Let’s explore the following search
algorithms, keeping speed in mind.
Sequential (linear) search
Binary search
Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record != key) and (still more records) )
Get the next record
End_while
When do we know that there wasn’t a record in the file that

matched the key?
Sequential Search on an Ordered File
• Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record < key) and (still more records) )
Get the next record
End_while
If ( record = key )
Then success
Else there is no match in the file
End_else
• When do we know that there wasn’t a record in the file that matched the key?
Sequential Search of
Ordered vs. Unordered List
• Let’s do a comparison.
• If the order was ascending alphabetical on customer’s last names, how would the search for
John Adams on the ordered list compare with the search on the unordered list?
o Unordered list
–if John Adams was in the list?
–if John Adams was not in the list?
o Ordered list
–if John Adams was in the list?
–if John Adams was not in the list?
Ordered vs Unordered (con’t)
• How about George Washington?

o Unordered
– if George Washington was in the list?
– If George Washington was not in the list?
o Ordered
– if George Washington was in the list?
– If George Washington was not in the list?
• How about James Madison?
Ordered vs Unordered (con’t)
• Observation: the search is faster on an ordered list only when the item
being searched for is not in the list.
• Also, keep in mind that the list has to first be placed in order for the
ordered search.
• Conclusion: the efficiency of these algorithms is roughly the same.
• So, if we need a faster search, we need a completely different algorithm.
• How else could we search an ordered file?
Linear Search
• It is a very simple search algorithm.

• In this type of search, a sequential search is made over all items one by one.
• Every item is checked and if a match is found, then that particular item is returned, otherwise
the search continues till the end of the data collection.
Algorithm
Linear Search ( Array A, Value x)
Step 1: Set i to 1
Step 2: if i > n then go to step 7
Step 3: if A[i] = x then go to step 6
Step 4: Set i to i + 1
Step 5: Go to Step 2
Step 6: Print Element x Found at index i and go to step 8
Step 7: Print element not found
Step 8: Exit
Pseudocode
procedure linear_search (list, value)
for each item in the list

if match item == value
return the item's location
end if
end for
end procedure
Binary Search
• If we have an ordered list and we know how many things

are in the list (i.e., number of records in a file), we can use
a different strategy.
• The binary search gets its name because the algorithm
continually divides the list into two parts.
Binary Search
• It is a fast search algorithm with run-time complexity

of O(log n).
• This search algorithm works on the principle of divide
and conquer.
Note: For this algorithm to work properly, the data

collection should be in the sorted form.
Binary Search
• Binary search looks for a particular item by comparing the

middle most item of the collection.
• If a match occurs, then the index of the item is returned.
• If the middle item is greater than the item, then the item is
searched in the sub-array to the left of the middle item.
Otherwise, the item is searched for in the sub-array to the
right of the middle item.
• This process continues on the sub-array as well until the size
of the sub-array reduces to zero.
How a Binary Search Works
Always look at the center

value. Each time you get to
discard half of the remaining
list.
Is this fast ?
How a Binary Search Works
For a binary search to work, it is mandatory for the target array to be

sorted. We shall learn the process of binary search with a pictorial
example. The following is our sorted array and let us assume that we
need to search the location of value 31 using binary search.
How a Binary Search Works (con’t)
First, we shall determine half of the array by using this formula −
mid = low + (high - low) / 2
Here it is, 0 + (9 - 0 ) / 2 = 4 (integer value of 4.5). So, 4 is the mid of

the array.
Now we compare the value stored at location 4, with the value being
searched, i.e. 31. We find that the value at location 4 is 27, which is not a
match. As the value is greater than 27 and we have a sorted array, so we
also know that the target value must be in the upper portion of the array.
We change our low to mid + 1 and find the new mid value again.
low = mid + 1
mid = low + (high - low) / 2
Our new mid is 7 now. We compare the value stored at location 7 with
our target value 31.
The value stored at location 7 is not a match, rather it is more than what we
are looking for. So, the value must be in the lower part from this location.
Hence, we calculate the mid again. This time it is 5.
We compare the value stored at location 5 with our target value. We find
that it is a match.
We conclude that the target value 31 is stored at location 5.

Binary search halves the

searchable items and thus
reduces the count of
comparisons to be made to very
less numbers.
How Fast is a Binary Search?
• Worst case: 11 items in the list took 4 tries

• How about the worst case for a list with 32 items ?
o 1st try - list has 16 items
o 2nd try - list has 8 items
o 3rd try - list has 4 items
o 4th try - list has 2 items
o 5th try - list has 1 item
How Fast is a Binary Search? (con’t)
List has 250 items List has 512 items
1st try - 125 items 1st try - 256 items

2nd try - 63 items 2nd try - 128 items
3rd try - 32 items 3rd try - 64 items
4th try - 16 items 4th try - 32 items
8th try - 1 item 8th try - 2 items
9th try - 1 item
What’s the Pattern?
• List of 11 took 4 tries

• 32 = 25 and 512 = 29
• 8 < 11 < 16 23 < 11 < 24
• 128 < 250 < 256 27 < 250 < 28
A Very Fast Algorithm!
• How long (worst case) will it take to find an
item in a list 30,000 items long?
210 = 1024 213 = 8192
211 = 2048 214 = 16384
212 = 4096 215 = 32768
• So, it will take only 15 tries!

Lg n Efficiency
• We say that the binary search algorithm

runs in log2 n time. (Also written as lg n)
• Lg n means the log to the base 2 of some
value of n.
• 8 = 23 lg 8 = 3 16 = 24 lg 16 = 4
• There are no algorithms that run faster than
lg n time.
Procedure binary_search
A ← sorted array
n ← size of array
x ← value to be searched
Set lowerBound = 0
Pseudocode Set upperBound = n
while (lowerBound <= upperBound)
set midPoint = lowerBound + (upperBound- lowerBound) / 2
if A[midPoint] = x
EXIT: x found at location midpoint
if A[midPoint] < x
set lowerBound = midPoint + 1
if A[midPoint] > x
set upperBound = midPoint - 1
if upperBound < lowerBound

EXIT: x does not exists.
end while
end procedure
• Interpolation is a searching technique that finds a specified

value in a sorted array.
• The concept of interpolation search is similar to how we
search for names in a telephone book. For example, when
looking for a name “Bob” in a telephone directory, we know
that it will be near the extreme left, so applying a binary
search technique by dividing the list in two halves each time
is not a good idea. We must start scanning the extreme left in
the first pass itself.
• It is an improved variant of binary search, this algorithm

works on the probing position of the required value.
Note: for this algorithm to work properly, the data collection

should be in a sorted form and equally distributed.
Binary search has a huge advantage of time complexity over

linear search. Linear search has worst-case complexity of Ο(n)
whereas binary search has Ο(log n).
Positioning in Binary Search
• In binary search, if the desired data is not found then the rest
of the list is divided in two parts, lower and higher. The
search is carried out in either of them.
Positioning Probing in Interpolation
Search
• Interpolation search finds a particular item by computing the

probe position. Initially, the probe position is the position of
the middle most item of the collection.
Search (con’t)
• If a match occurs, then the index of the item is returned. To

split the list into two parts, we use the following method −
mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
A[n] = Value stored at index n in the list
Search (con’t)

where −
A = list
Search (con’t)

where −
A = list
Search (con’t)
23 67 90 111 211 345 365 400 413 999

where −
A = list
Search (con’t)
• If the middle item is greater than the item, then the probe position is
again calculated in the sub-array to the right of the middle item.
Otherwise, the item is searched in the subarray to the left of the middle
item. This process continues on the sub-array as well until the size of
subarray reduces to zero.
• Runtime complexity of interpolation search algorithm is Ο(log (log n)) as

compared to Ο(log n) of BST in favorable situations.
Algorithm
• As it is an improvisation of the existing BST algorithm, we are
mentioning the steps to search the 'target' data value index, using
position probing −
Step 1 − Start searching data from middle of the list.

Step 2 − If it is a match, return the index of the item, and exit.
Step 3 − If it is not a match, probe position.
Step 4 − Divide the list using probing formula and find the new middle.
Step 5 − If data is greater than middle, search in higher sub-list.
Step 6 − If data is smaller than middle, search in lower sub-list.
Step 7 − Repeat until match.
A → Array list
N → Size of A
X → Target Value
Procedure Interpolation_Search()
Set Lo → 0
Set Mid → -1
Set Hi → N-1
While X does not match
Pseudocode if Lo equals to Hi OR A[Lo] equals to A[Hi]

EXIT: Failure, Target not found
end if
Set Mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X -

A[Lo])
if A[Mid] = X
EXIT: Success, Target found at Mid
else
if A[Mid] < X
Set Lo to Mid+1
else if A[Mid] > X
Set Hi to Mid-1
end if
end if
End While
End Procedure
Questions?

Topic 8 - Searching Techniques

Uploaded by

Copyright:

Available Formats

Topic 8 - Searching Techniques

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Topic 8 - Searching Techniques

Uploaded by

Copyright:

Available Formats

TOPIC 8:

-Searching through a lot of records

-There are numerous algorithms to

Sequential Search on an Unordered File

When do we know that there wasn’t a record in the file that

• How about George Washington?

• It is a very simple search algorithm.

procedure linear_search (list, value)

for each item in the list

• If we have an ordered list and we know how many things

• It is a fast search algorithm with run-time complexity

Note: For this algorithm to work properly, the data

• Binary search looks for a particular item by comparing the

Always look at the center

For a binary search to work, it is mandatory for the target array to be

mid = low + (high - low) / 2

Here it is, 0 + (9 - 0 ) / 2 = 4 (integer value of 4.5). So, 4 is the mid of

We conclude that the target value 31 is stored at location 5.

Binary search halves the

• Worst case: 11 items in the list took 4 tries

1st try - 125 items 1st try - 256 items

• List of 11 took 4 tries

• So, it will take only 15 tries!

• We say that the binary search algorithm

while (lowerBound <= upperBound)

set midPoint = lowerBound + (upperBound- lowerBound) / 2

if upperBound < lowerBound

• Interpolation is a searching technique that finds a specified

• It is an improved variant of binary search, this algorithm

Note: for this algorithm to work properly, the data collection

Binary search has a huge advantage of time complexity over

• Interpolation search finds a particular item by computing the

• If a match occurs, then the index of the item is returned. To

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

• Runtime complexity of interpolation search algorithm is Ο(log (log n)) as

Step 1 − Start searching data from middle of the list.

While X does not match

Pseudocode if Lo equals to Hi OR A[Lo] equals to A[Hi]

Set Mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X -

You might also like