cs3230 Lec01b Beamer

Asymptotic Analysis
S. Halim YJ. Chang
School of Computing
National University of Singapore
CS3230 Lec01b; Tue, 13 Aug 2024

Overview
Introduction
Problem-Solving Example: Fibonacci
Model of Computation: RAM
Asymptotic Analysis
Big O (upper bound)
Ω (lower bound)
New notation Θ (tight bound)
Little-o and ω
Taking Limits
Algorithm
Algorithm
“A sequence of unambiguous and executable instructions for
solving a problem (given a valid input, obtain a valid output)”
Let’s elaborate:
▶ What are the valid inputs?
▶ What is the meaning of unambiguous instructions?
▶ What is the meaning of executable instructions?
▶ Are all algorithms deterministic?
▶ Are all algorithms terminate?
Details
We assume that the algorithm does not need to concern itself with
invalid input, e.g., for Fib(n) later, we will assume that n will
always be a non-negative Integer
Unambiguous instructions are precisely stated, no room for doubt
Instructions should be executable (implementable) on the target

machine, i.e., no magic
*Deterministic: Most of the time, we expect each instruction to be

deterministic, though in some cases we allow randomness or
nondeterminism (we will talk about randomness/nondeterminism
when we deal with them)
*Termination: The algorithm should terminate after finitely many

instructions are executed (exception: Halting problem)
Pseudocode
We can give an algorithm already written in a particular

programming language, pros and cons:
▶ Unambiguous1
▶ Clear
▶ Quite tedious
▶ Harder to understand
Alternative: Pseudocode (we will use this going forward)

▶ Slightly informal
▶ Still precise enough to understand exactly what instructions
are, and how to implement it in some programming language
1
Unless we do not understand that language
An Example
In Python (source code)

In Pseudocode:
A = [(1, 2, 3), (4, 5, 6)]
[*zip(*A)] Given a 2D matrix of size n × m,
transpose it into an m × n matrix
Do you know what is this?
Some Properties of Good Algorithms
There can be many possible algorithms for solving a problem
Given the choices, we prefer:

▶ Correctness (the most important property)
▶ Efficiency (time/space/resources)
▶ Generality: Applicable to a wide range on inputs
and not dependent on a particular computer/device
▶ Usability as a ‘subroutine’ for other problems
▶ Simplicity: so that it is easy to code, understand, debug, etc
▶ Well documented (easy to understand and to extend it)
Some objectives may have trade-offs: simplicity vs efficiency

Design and Analysis of Algorithms
Designing an algorithm is both science and art

You need to know the relevant techniques
But you also need creativity, intuition, perseverance
There is no formula for designing a good efficient algorithm

Every new problem may need a fresh approach
So, learn lots of techniques/strategies/paradigms
By observing the properties of a problem and using the techniques,
one can often design a good algorithm for the given problem
Paradigms
▶ Complete Search (for example, using brute force,

backtracking, branch and bound)
▶ Divide and Conquer (D&C)
▶ Dynamic Programming (DP)
▶ Greedy Algorithm
▶ Deterministic versus non-deterministic strategies
▶ Iterative Improvement
Problem-Solving
The general steps:

1. Understand the problem
2. Design a method to solve the problem
3. Convert it into an algorithm/pseudocode
4. Choose data structures
5. Prove correctness of the algorithm
6. Analyze the complexity of the algorithm
(time/space/resources needed)
7. PS: Implement that correct and efficient algorithm
Live Problem Solving
Solve https://open.kattis.com/problems/rijeci
TL;DR: Given K (1 ≤ K ≤ 45) output Fib(K − 1) and Fib(K )

because |′ A′ | = Fib(K − 1) and |′ B ′ | = Fib(K )
where Fib(n) is the n-th term of Fibonacci number
Fibonacci Numbers
▶ Fib(0) = 0
▶ Fib(1) = 1
▶ For n > 1, Fib(n) = Fib(n − 1) + Fib(n − 2)
▶ First 10 terms: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . .
Problem: Given n as input, compute Fib(n)
We will look at two algorithms:

▶ Recursive algorithm
▶ Iterative algorithm
PS: Yes, there are other (faster) algorithms

Recursive algorithm to compute Fib(n)
define Fib(n)
if n <= 1
return n
else
return Fib(n-1)+Fib(n-2)
Simple, direct recursive implementation from the Fib(n) definition

Live Problem Solving - with recursive Fib(n)
We use the fast C++ to implement recursive Fib(n),

but it gets ‘Time Limit Exceeded’, why?
The live C++ code works, i.e., gets ‘Time Limit Exceeded’ :)
PS: probably no need to re-try
Iterative algorithm to compute Fib(n)
define IFib(n)
if n <= 1
return n
else
prev2 = 0
prev1 = 1
for i = 2 to n
temp = prev1
prev1 = prev1+prev2
prev2 = temp
return prev1
Live Problem Solving - with iterative IFib(n)
We have seen that the recursive Fib(n) (in fast C++) is too slow
But if we use the slow Python to implement iterative IFib(n),

it gets ‘Accepted’ (despite running IFib(n) twice), why?
The live Python code works, i.e., gets Accepted :)
Optional: those who haven’t try, please re-code this yourself
Analysis of an Algorithm
We analyze the resources needed by an algorithm:

▶ Time – in this course, we will mostly concentrate on time
▶ Space – in this course, we assume all data fits in memory
Sometimes, we do trade-offs:
▶ If space is not an issue, most of the time, we sacrifice
(or use more) space to gain faster time
▶ For some applications (e.g., Big Data), we may have to
sacrifice time so that we are able to process the data
Actual time needed to run an algorithm depends on the machine

used, and this is not easy to calculate/measure
Model of Computation: RAM
Random-Access Machine (RAM) model is simple

and close to how real computers work:
▶ Each instruction takes a constant amount of time: fetch the
instruction, execute, store back the results in the memory
▶ We count the number of basic instructions needed
▶ The time complexity is based on input size (more details soon)
RAM, Continued
▶ Word is basic unit of memory

In this course, you can usually assume each number
(or relevant item) can be stored in one word
▶ RAM is an array of words, storing instructions and data
It takes one unit of time to access any word (this is important)
▶ Each arithmetic or logical operation (+, -, *, /, mod, AND,
OR, NOT, etc) takes a constant amount of time (notice that
exponent operation is not constant – see D&C lecture later)
▶ Details of word size and different time taken by different
instructions are important, but USUALLY do not have a large
impact; so we usually ignore it, unless it makes a difference
▶ We need to be careful: when numbers are very large (and thus
cannot fit in one word), the complexity depends on number of
bits/words needed to store the number
For our Fib(n) and IFib(n) analysis
For /rijeci, F (45) = 1 134 903 170 < 231 − 1,

fits in 32-bit signed Integer
For large computation of Fib(n),

the resulting number can be very large
To address the above, one can consider computing the Fibonacci

numbers modulo some m (for example 2wordsize )
We omit this detail in our first analysis to simplify discussion

Analysis of recursive algorithm to compute Fib(n)
Let T (n) be the number of
operations done by Fib(n)
T (0) = T (1) = 2
define Fib(n) (if+return)
if n <= 1
return n For n ≥ 2, T (n) =
else T (n − 1) + T (n − 2) + 6
return Fib(n-1)+Fib(n-2) (if+else+two function
calls+add+return)
See the recursion tree@VisuAlgo
(which is a big tree for large n) So T (n) ≥ Fib(n)
This ‘brute force’ Fib(n) is bad We can show that

n−2
Fib(n) ≥ 2 2 (How?)
T (n) is exponential in n
n−2
How to show that Fib(n) ≥ 2 2 ?
Let Fib(1) = 1
Notice that for n ≥ 2, (including n ≥ 3)
Fib(n) = Fib(n − 1) + Fib(n − 2),
Fib(n) ≥ Fib(n − 2) + Fib(n − 2),
Fib(n) ≥ 2 · Fib(n − 2),
i.e., after two terms, the value of Fib(n) will double (or more), i.e.,
Fib(1), Fib(3), Fib(5), Fib(7), Fib(9), . . . = 1, 2, 5, 13, 34, . . .
Between 1 to n, there are ⌈ n−2

2 ⌉ doubling steps
This takes care of odd vs even n cases
n−2
Hence Fib(n) ≥ 2 2
Analysis of iterative algorithm to compute Fib(n)
define IFib(n)
if n <= 1
return n For n ≥ 2,
else T (n) ≈ 4 + (n − 1) ∗ 5 + 1
prev2 = 0 (if+else+two assignments
prev1 = 1 + (n − 1) iterations,
for i = 2 to n each takes ≈ 5 steps
temp = prev1 +return)
prev1 = prev1+prev2
prev2 = temp So T (n) ≈ 5n, linear in n
return prev1 This is much faster than
the recursive version that
See the recursion DAG@VisuAlgo
runs exponential in n
(small and proportional to n)
This is ‘Dynamic Programming’

(DP) (to be revisited later)
Actual Running Time
Running Time of an Algorithm
▶ We often give the running time in terms of the size of the

input (usually parameter n)
▶ Size of the input can be the number of items (e.g., sorting n
Integers) or length of inputs coded in binary (e.g., Integer n in
Fib(n) requires log n bits encoding – details in the second half)
▶ We usually perform these analysis:
▶ Worst-case analysis: T (n) is the maximum time needed for
any input of size (at most) n
▶ Average-case analysis: T (n) is the expected time taken over
all inputs of size n; either all inputs are equally probable, or we
know the probability distribution over the inputs of size n
▶ We usually do not consider best-case analysis, as inputs that
trigger best-case are usually not the typical ones
▶ It is difficult to compute the exact number of operations
(as seen earlier), thus we often give upper bounds instead
Question 2 at VisuAlgo Online Quiz
Which algorithm is more efficient?

Algorithm 1: Algorithm 2:
T 1(n) = 100n + 1000 T 2(n) = n2 + 5
The answer is ‘it depends’ (see helper Excel file)
Algorithm 2 can be more efficient on small n, i.e., when n < 110
Algorithm 1 is more efficient on large n, especially when n ≥ 110

(this is more important)
Time complexity really matters only for large-sized input,

thus we only compare for asymptotically large values of input size
Asymptotic Analysis
Why we do not measure the actual run time:

▶ Different machines have different speeds,
i.e., new gaming desktop if fast vs 10-years old laptop is slow
but Kattis online judge (one judging server) is optimized for
lower runtime variability across runs; so it is ‘fair enough’
▶ Different programming languages have different runtimes,
i.e., C++ is fast vs Python is slow
We prefer to do asymptotic analysis:

▶ For large inputs, how does the runtime behave?
▶ Comparison of algorithms is based on the asymptotic analysis
▶ We often ignore lower terms and constant multiplicative
factors in the asymptotic analysis
Most common asymptotic notation: Big O (upper bound)
For the following discussion on asymptotics, assume f and g are

functions of one parameter n
f ∈ O(g ) if there exists constant c > 0 and n0 > 0 such that for
all n ≥ n0 : 0 ≤ f (n) ≤ c · g (n)
Interpretation: g is an upper bound on f
O(g ) = {f : there exists constant c > 0 and n0 > 0 such that for
all n ≥ n0 , 0 ≤ f (n) ≤ c · g (n)}
We sometimes also write f = O(g ), though not 100% correct
We frequently write f (n) = O(g (n)), though technically, n should

not have been used (there can be more than one parameter)
Similarly for other asymptotic notations; PS: we accept all versions

Pictorial interpretation of Big O notation
Apologize for poor image quality, we plan to redraw these figures
PS: we need to edit f (n) = O(g (n)) to f (n) ∈ O(g (n))
Big O notation is an upper bound notation

So, saying f (n) is at least O(g (n)) is not correct
Big O (upper bound)
Example: 100n + 1000 ∈ O(n2 )

▶ 0 ≤ 100n + 1000 (for any positive n)
▶ 0 ≤ 100n + 1000 ≤ 101n (for n ≥ 1000)
▶ 0 ≤ 100n + 1000 ≤ 101n ≤ 101n2 (for n ≥ 1000)
i.e., we can set c = 101 and n0 = 1000
Hence, 100n + 1000 ∈ O(n2 )
But is this upper bound tight?

No, we can also show that 100n + 1000 ∈ O(n) using the same
c = 101 and n0 = 1000
Is this the only c and n0 to show that 100n + 1000 ∈ O(n2 )?

No, we can also show that 100n + 1000 ∈ O(n2 ) with:
c = 101 and n0 = 1001 (or any larger n0 ),
c = 1100 (or any larger c) and n0 = 1, etc
Question 3 at VisuAlgo Online Quiz
Let f (n) = 10n3 + 5n + 15 and g (n) = n4
We want to prove that f (n) ∈ O(g (n)) by showing that

0 ≤ f (n) ≤ c · g (n) for all n ≥ n0
What should be the appropriate c and n0 ? (there are > 1 answers)

A). c = 2, n0 = 10
B). c = 1, n0 = 11
C). c = 5, n0 = 1
D). c = 1, n0 = 10
Solution
Reminder: f (n) = 10n3 + 5n + 15 and g (n) = n4

We want to show that 0 ≤ f (n) ≤ c · g (n) for all n ≥ n0
Option C). c = 5, n0 = 1 is incorrect, e.g., for n = n0 = 1:

f (1) = 30; 5 · g (1) = 5 · 1 = 5; so f (n) > c · g (n)
Option D). c = 1, n0 = 10 is incorrect, e.g., for n = n0 = 10:

f (10) = 10 065; 1 · g (10) = 1 · 10 000 = 10 000; so f (n) > c · g (n)
Option A). c = 2, n0 = 10 is correct, i.e., for n ≥ 10, we have:

10n3 + (5n + 15) ≤ 10n3 + (20n) ≤ 10n3 + (10n3 ) ≤ 2n4
Option B). c = 1, n0 = 11 is correct, i.e., for n ≥ 11, we have:

10 · 113 + 5 · 11 + 15 ≤ 11 · 113
5 · 11 + 15 ≤ 113 (the gap will grow with larger n ≥ 10.0641)
Tips: set c = 1 and n0 to be a large value; see if the gap grows
New notation Ω (lower bound)
f ∈ Ω(g ) if there exists constant c > 0 and n0 > 0 such that for
all n ≥ n0 : 0 ≤ c · g (n) ≤ f (n)
Interpretation: g is a lower bound on f

Ω (lower bound)
Example: n2 ∈ Ω(100n + 1000)

We swap f (n) and g (n) from the earlier Big O example
This is the complementary property of asymptotic notations
1
▶ 0 ≤ 101 · (100n + 1000) ≤ n2 for n ≥ 1000
1
i.e., we can set c = 101 and n0 = 1000
just set this c to be the reciprocal of the c in Big O analysis
Again, there are many other possible c and n0
PS: We usually have f (n) as the more complex function and g (n)
to be the simpler one, i.e., 7n2 + 5n + 77 ∈ Ω(n2 )
Pictorial interpretation of Ω-notation
New notation Θ (tight bound)
f ∈ Θ(g ) if there exists constants c1 , c2 > 0 and n0 > 0 such that

for all n ≥ n0 : 0 ≤ c1 · g (n) ≤ f (n) ≤ c2 · g (n)
Interpretation: g is a tight bound on f
We will frequently do Θ analysis in CS3230

Θ-notation (tight bound)
Example: 10n2 + n ∈ Θ(n2 )

▶ 0 ≤ 21 n2 ≤ (10n2 + n) ≤ 11n2 for n ≥ 2
i.e., c1 = 12 , c2 = 11, and n0 = 2, from max(2, 1)
again, these are not the only valid constants c1 , c2 , and n0
Hence, 10n2 + n ∈ Θ(n2 )
Pictorial interpretation of Θ-notation
O, Ω, and Θ
Θ(g ) = O(g ) ∩ Ω(g )

Little-o (strict upper bound)
f ∈ o(g ) if for any constant c > 0, there exists n0 > 0 such that
for all n ≥ n0 : 0 ≤ f (n) < c · g (n) (notice for any constant
c > 0 instead of there exists constant c > 0 and < instead of ≤)
PS: some textbooks define Little-o using ≤ instead of <

This will only change the chosen c and/or n0
Example: n ∈ o(n2 )
1
For any constant c > 0, let n0 = 1 + c (setting n0 = 2 is also ok)
Then, for n ≥ n0 , n < c · n2
But n2 − n ∈ / o(n2 )
Let’s say we pick c = 21 (just need to show one counterexample),
for any n0 and large enough n, we have:
n2 − n > 12 n2
1 2
2n > n
n2 > 2n
ω (strict lower bound)
f ∈ ω(g ) if for any constant c > 0, there exists n0 > 0 such that
for all n ≥ n0 : 0 ≤ c · g (n) < f (n)
Example: n2 − 36 ∈ ω(n) √
For any constant c > 0, let n0 > 36 + c,
0 ≤ c · n < n2 − 36
Asymptotic Notation: Taking Limits
Assume f (n), g (n) > 0, we have:

f (n)
▶ limn→∞ g (n) = 0 ⇒ f (n) ∈ o(g (n))
▶ limn→∞ gf (n)
(n) < ∞ ⇒ f (n) ∈ O(g (n))
f (n)
▶ 0 < limn→∞ g (n) < ∞ ⇒ f (n) ∈ Θ(g (n))
f (n)
▶ limn→∞ g (n) > 0 ⇒ f (n) ∈ Ω(g (n))
▶ limn→∞ gf (n)
(n) = ∞ ⇒ f (n) ∈ ω(g (n))
It is easier to show o, Θ, vs ω using limits

limn→∞ gf (n)
(n) = 0 ⇒ f (n) ∈ o(g (n))
Proof:
By definition of limit, limn→∞ gf (n)

(n) = 0, means
∀ϵ > 0, ∃n0 > 0, such that ∀n ≥ n0 ,
f (n)
g (n) < ϵ
Hence, for any constant c > 0 (i.e., we can set c = ϵ), ∃n0 > 0,
such that ∀n ≥ n0 ,
f (n) < ϵ · g (n), i.e.,
f (n) < c · g (n),
f (n) ∈ o(g (n))
We will prove at least one other during Tut01

Example
By limit, show that n6 + 233n2 ∈ ω(n2 )

n6 +233n2 n4 +233
limn→∞ n2
= limn→∞ 1 = ∞ ⇒ f (n) ∈ ω(g (n))
Asymptotic Notation: Some Properties
▶ Reflexivity: For O, Ω, and Θ,

f (n) ∈ O(f (n)), similarly for Ω and Θ
▶ Transitivity: For all five: O, Ω, Θ, o, and ω
f (n) ∈ O(g (n)) and g (n) ∈ O(h(n)) implies f (n) ∈ O(h(n))
▶ Symmetry:
f (n) ∈ Θ(g (n)) iff g (n) ∈ Θ(f (n))
▶ Complementary:
f (n) ∈ O(g (n)) iff g (n) ∈ Ω(f (n))
f (n) ∈ o(g (n)) iff g (n) ∈ ω(f (n))
We will prove some of these during Tut01
See Asymptotic Analysis-Useful Facts.pdf for math refresher

Acknowledgement
The slides are modified from previous editions of this course and
similar course elsewhere
List of credits: Erik D. Demaine, Charles E. Leiserson, Surender

Baswana, Leong Hon Wai, Lee Wee Sun, Ken Sung, Diptarka
Chakraborty, Steven Halim, Sanjay Jain

cs3230 Lec01b Beamer

Uploaded by

Copyright:

Available Formats

cs3230 Lec01b Beamer

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

cs3230 Lec01b Beamer

Uploaded by

Copyright:

Available Formats

Asymptotic Analysis

S. Halim YJ. Chang

CS3230 Lec01b; Tue, 13 Aug 2024

Problem-Solving Example: Fibonacci

Model of Computation: RAM

Unambiguous instructions are precisely stated, no room for doubt

Instructions should be executable (implementable) on the target

*Deterministic: Most of the time, we expect each instruction to be

*Termination: The algorithm should terminate after finitely many

We can give an algorithm already written in a particular

Alternative: Pseudocode (we will use this going forward)

In Python (source code)

There can be many possible algorithms for solving a problem

Given the choices, we prefer:

Some objectives may have trade-offs: simplicity vs efficiency

Designing an algorithm is both science and art

There is no formula for designing a good efficient algorithm

▶ Complete Search (for example, using brute force,

The general steps:

TL;DR: Given K (1 ≤ K ≤ 45) output Fib(K − 1) and Fib(K )

Problem: Given n as input, compute Fib(n)

We will look at two algorithms:

PS: Yes, there are other (faster) algorithms

Simple, direct recursive implementation from the Fib(n) definition

TL;DR: Given K (1 ≤ K ≤ 45) output Fib(K − 1) and Fib(K )

We use the fast C++ to implement recursive Fib(n),

TL;DR: Given K (1 ≤ K ≤ 45) output Fib(K − 1) and Fib(K )

But if we use the slow Python to implement iterative IFib(n),

We analyze the resources needed by an algorithm:

Actual time needed to run an algorithm depends on the machine

Random-Access Machine (RAM) model is simple

▶ Word is basic unit of memory

For /rijeci, F (45) = 1 134 903 170 < 231 − 1,

For large computation of Fib(n),

To address the above, one can consider computing the Fibonacci

We omit this detail in our first analysis to simplify discussion

This ‘brute force’ Fib(n) is bad We can show that

Between 1 to n, there are ⌈ n−2

This is ‘Dynamic Programming’

▶ We often give the running time in terms of the size of the

Which algorithm is more efficient?

Algorithm 2 can be more efficient on small n, i.e., when n < 110

Algorithm 1 is more efficient on large n, especially when n ≥ 110

Time complexity really matters only for large-sized input,

Why we do not measure the actual run time:

We prefer to do asymptotic analysis:

For the following discussion on asymptotics, assume f and g are

Interpretation: g is an upper bound on f

We sometimes also write f = O(g ), though not 100% correct

We frequently write f (n) = O(g (n)), though technically, n should

Similarly for other asymptotic notations; PS: we accept all versions

Big O notation is an upper bound notation

Example: 100n + 1000 ∈ O(n2 )

But is this upper bound tight?

Is this the only c and n0 to show that 100n + 1000 ∈ O(n2 )?

Let f (n) = 10n3 + 5n + 15 and g (n) = n4

We want to prove that f (n) ∈ O(g (n)) by showing that

What should be the appropriate c and n0 ? (there are > 1 answers)

Reminder: f (n) = 10n3 + 5n + 15 and g (n) = n4