CS599: Convex and Combinatorial Optimization Fall 2013 Lectures 5-6: Convex Functions

CS599: Convex and Combinatorial Optimization
Fall 2013
Lectures 5-6: Convex Functions
Instructor: Shaddin Dughmi

Announcements
HW1 is out, due Thursday 9/26

Make sure you get email from me
Today: Convex Functions
Read all of B&V Chapter 3.
Outline
1 Convex Functions
2 Examples of Convex and Concave Functions
3 Convexity-Preserving Operations
Convex Functions
A function f : Rn → R is convex if the line segment between any points
on the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)
Convex Functions 1/24

Convex Functions
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)
Inequality called Jensen’s inequality (basic form)

Convex Functions
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

f is convex iff its restriction to any line {x + tv : t ∈ R} is convex

Convex Functions
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

f is strictly convex if inequality strict when x 6= y.

Convex Functions
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

f is strictly convex if inequality strict when x 6= y.
Analogous definition when the domain of f is a convex subset D
of Rn
Concave and Affine Functions
A function is f : Rn → R is concave if −f is convex. Equivalently:

Line segment between any points on the graph of f lies below f .
If x, y ∈ Rn and θ ∈ [0, 1], then
f (θx + (1 − θ)y) ≥ θf (x) + (1 − θ)f (y)

Concave and Affine Functions
A function is f : Rn → R is concave if −f is convex. Equivalently:

Line segment between any points on the graph of f lies below f .
If x, y ∈ Rn and θ ∈ [0, 1], then
f (θx + (1 − θ)y) ≥ θf (x) + (1 − θ)f (y)
f : Rn → R is affine if it is both concave and convex. Equivalently:

Line segment between any points on the graph of f lies on the
graph of f .
f (x) = a| x + b for some a ∈ Rn and b ∈ R.
We will now look at some equivalent definitions of convex functions
First Order Definition

A differentiable f : Rn → R is convex if and only if the first-order
approximation centered at any point x underestimates f everywhere.
f (y) ≥ f (x) + (5f (x))| (y − x)

We will now look at some equivalent definitions of convex functions
First Order Definition

A differentiable f : Rn → R is convex if and only if the first-order
approximation centered at any point x underestimates f everywhere.
f (y) ≥ f (x) + (5f (x))| (y − x)
Local information → global information

If 5f (x) = 0 then x is a global minimizer of f
Second Order Definition
A twice differentiable f : Rn → R is convex if and only if its derivative is
nondecreasing in all directions. Formally,
52 f (x) 0
for all x.

Second Order Definition
A twice differentiable f : Rn → R is convex if and only if its derivative is
nondecreasing in all directions. Formally,
52 f (x) 0
for all x.
Intepretation
Recall definition of PSD: z | 52 f (x)z > 0 for all z ∈ Rn
At x + δ~z, infitisimal change in gradient is in roughly in the same
direction as ~z
Graph of f curves upwards along any line
When n = 1, this is f 00 (x) ≥ 0.

Epigraph
The epigraph of f is the set of points above the graph of f . Formally,
epi(f ) = {(x, t) : t ≥ f (x)}

Epigraph
The epigraph of f is the set of points above the graph of f . Formally,
epi(f ) = {(x, t) : t ≥ f (x)}
Epigraph Definition
f is a convex function if and only if its epigraph is a convex set.

Jensen’s Inequality (General Form)
f : Rn → R is convex if and only if
P x1 , . . . , xk in the domain of f , and θ1 , . . . , θk ≥ 0 such

For every
that i θi = 1, we have
X X
f( θi x i ) ≤ θi f (xi )
i i
Given a probability measure D on the domain of f , and x ∼ D,
f (E[x]) ≤ E[f (x)]

Jensen’s Inequality (General Form)
f : Rn → R is convex if and only if
P x1 , . . . , xk in the domain of f , and θ1 , . . . , θk ≥ 0 such

For every
that i θi = 1, we have
X X
f( θi x i ) ≤ θi f (xi )
i i
Given a probability measure D on the domain of f , and x ∼ D,
f (E[x]) ≤ E[f (x)]
Adding noise to x can only increase f (x) in expectation.

Local and Global Optimality
Local minimum
x is a local minimum of f if there is a an open ball B containing x
where f (y) ≥ f (x) for all y ∈ B.

When f is convex, x is a local minimum of f if and only if it is a global
minimum.

Local minimum
x is a local minimum of f if there is a an open ball B containing x
where f (y) ≥ f (x) for all y ∈ B.

When f is convex, x is a local minimum of f if and only if it is a global
minimum.
This fact underlies much of the tractability of convex optimization.

Sub-level sets
p
Level sets of f (x, y) = x2 + y 2
Sublevel set
The α-sublevel set of f is {x ∈ domain(f ) : f (x) ≤ α}.

Sub-level sets
p
Sublevel set
Fact
Every sub-level set of a convex function is a convex set.
This fact also underlies tractability of convex optimization

Sub-level sets
p
Sublevel set
Fact
Every sub-level set of a convex function is a convex set.
This fact also underlies tractability of convex optimization
Note: converse false, but nevertheless useful check.

Other Basic Properties
Continuity
Convex functions are continuous.

Other Basic Properties
Continuity
Convex functions are continuous.
Extended-value extension
If a function f : D → R is convex on its domain, and D is convex, then
it can be extended to a convex function on Rn . by setting f (x) = ∞
whenever x ∈ / D.
S
This simplifies notation. ResultingSfunction fe : D → R ∞ is “convex”
with respect to the ordering on R ∞

Outline
1 Convex Functions
Functions on the reals
Affine: ax + b
Exponential: eax convex for any a ∈ R
Powers: xa convex on R++ when a ≥ 1 or a ≤ 0, and concave for
0≤a≤1
Logarithm: log x concave on R++ .
Examples of Convex and Concave Functions 10/24

Norms
Norms are convex.
||θx + (1 − θ)y|| ≤ ||θx|| + ||(1 − θ)y|| = θ||x|| + (1 − θ)||y||
Uses both norm axioms: triangle inequality, and homogeneity.

Applies to matrix norms, such as the spectral norm (radius of
induced ellipsoid)

Norms
Norms are convex.
||θx + (1 − θ)y|| ≤ ||θx|| + ||(1 − θ)y|| = θ||x|| + (1 − θ)||y||
Uses both norm axioms: triangle inequality, and homogeneity.

Applies to matrix norms, such as the spectral norm (radius of
induced ellipsoid)
Max
maxi xi is convex
max(θx + (1 − θ)y)i = max(θxi + (1 − θ)yi )

i i
≤ max θxi + max(1 − θ)yi
i i
= θ max xi + (1 − θ) max yi
i i
If i’m allowed to pick the maximum entry of θx and θy independently, I

can do only better.
Log-sum-exp: log(ex1 + ex2 + . . . + exn ) is
convex
1
Geometric mean: ( ni=1 xi ) n is concave
Q
Log-determinant: log det X is concave

Quadratic form: x| Ax is convex iff A 0
Other examples in book
f (x, y) = log(ex + ey )

Log-sum-exp: log(ex1 + ex2 + . . . + exn ) is
convex
1
Geometric mean: ( ni=1 xi ) n is concave
Q
Log-determinant: log det X is concave

Quadratic form: x| Ax is convex iff A 0
Other examples in book
f (x, y) = log(ex + ey )
Proving convexity often comes down to case-by-case reasoning,

involving:
Definition: restrict to line and check Jensen’s inequality
Write down the Hessian and prove PSD
Express as a combination of other convex functions through
convexity-preserving operations (Next)

Outline
1 Convex Functions
Nonnegative Weighted Combinations
If f1 , f2 , . . . , fk are convex, and w1 , w2 , . . . , wk ≥ 0, then
g = w1 f1 + w2 f2 . . . + wk fk is convex.
Convexity-Preserving Operations 13/24

proof (k = 2)

x+y x+y x+y
g = w1 f1 + w2 f2
2 2 2
f1 (x) + f1 (y) f2 (x) + f2 (y)
≤ w1 + w2
2 2
g(x) + g(y)
=
2

R
Extends to integrals g(x) = y w(y)fy (x) with w(y) ≥ 0, and therefore
expectations Ey fy (x).

R
Extends to integrals g(x) = y w(y)fy (x) with w(y) ≥ 0, and therefore
expectations Ey fy (x).
Worth Noting
Minimizing the expectation of a random convex cost function is also a
convex optimization problem!
A stochastic convex optimization problem is a convex optimization
problem.

Example: Stochastic Facility Location
Average Distance
k customers located at y1 , y2 , . . . , yk ∈ Rn
n
Pa 1facility at x ∈ R , average distance to a customer is
If I place
g(x) = i k ||x − yi ||

Example: Stochastic Facility Location
Average Distance
n
Pa 1facility at x ∈ R , average distance to a customer is
If I place
g(x) = i k ||x − yi ||
Since distance to any one customer is convex in x, so is the

average distance.
Extends to probability measure over customers
Implication
Convex functions are a convex cone in the vector space of functions
from Rn to R.
The set of convex functions is the intersection of an infinite set of

homogeneous linear inequalities indexed by x, y, θ
f (θx + (1 − θ)y) − θf (x) + (1 − θ)f (y) ≤ 0

Composition with Affine Function
If f : Rn → R is convex, and A ∈ Rn×m , b ∈ Rn , then
g(x) = f (Ax + b)
is a convex function from Rm to R.

g(x) = f (Ax + b)
Proof
(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f (Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f )

g(x) = f (Ax + b)
Proof
(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f (Ax + b) ⇐⇒ (Ax + b, t) ∈ epi(f )

g(x) = f (Ax + b)
Proof
(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f (Ax + b) ⇐⇒ (Ax + b, t) ∈ epi(f )
epi(g) is the inverse image of epi(f ) under the affine mapping
(x, t) → (Ax + b, t)
Examples
||Ax + b|| is convex
max(Ax + b) is convex
| | |
log(ea1 x+b1 + ea2 x+b2 + . . . + ean x+bn ) is convex

Maximum
If f1 , f2 are convex, then g(x) = max {f1 (x), f2 (x)} is also convex.
Generalizes to the maximum of any number of functions, maxki=1 fi (x),

and also to the supremum of an infinite set of functions supy fy (x).

Maximum
If f1 , f2 are convex, then g(x) = max {f1 (x), f2 (x)} is also convex.
Generalizes to the maximum of any number of functions, maxki=1 fi (x),

and also to the supremum of an infinite set of functions supy fy (x).
\
epi g = epi f1 epi f2
Example: Robust Facility Location
Maximum Distance
If I place a facility at x ∈ Rn , maximum distance to a customer is
g(x) = maxi ||x − yi ||

Maximum Distance
g(x) = maxi ||x − yi ||
Since distance to any one customer is convex in x, so is the

worst-case distance.

Maximum Distance
g(x) = maxi ||x − yi ||
Worth Noting
When a convex cost function is uncertain, minimizing the worst-case
cost is also a convex optimization problem!
A robust (in the worst-case sense) convex optimization problem is
a convex optimization problem.
Other Examples
Maximum eigenvalue of a symmetric matrix A is convex in A
max {v | Av : ||v|| = 1}
Sum of k largest components of a vector x is convex in x

n o
max ~1S · x : |S| = k

Minimization
If f (x, y) is convex and C is convex and nonempty, then
g(x) = inf y∈C f (x, y) is convex.

Minimization
If f (x, y) is convex and C is convex and nonempty, then
g(x) = inf y∈C f (x, y) is convex.
Proof (for C = Rk )
epi g is the projection of epi f onto hyperplane y = 0.
f (x, y) = x2 + y 2 g(x) = x2
Example
Distance from a convex set C
f (x, y) = inf ||x − y||

y∈C

Composition Rules
If g : Rn → Rk and h : Rk → R, then f = h ◦ g is convex if
gi are convex, and h is convex and nondecreasing in each
argument.
gi are concave, and h is convex and nonincreasing in each
argument.
Proof (n = k = 1)
f 00 (x) = h00 (g(x))g 0 (x)2 + h0 (g(x))g 00 (x)

Perspective
If f is convex then g(x, t) = tf (x/t) is also convex.
Proof
epi g is inverse image of epi f under the perspective function.

CS599: Convex and Combinatorial Optimization Fall 2013 Lectures 5-6: Convex Functions

Uploaded by

Copyright:

Available Formats

CS599: Convex and Combinatorial Optimization Fall 2013 Lectures 5-6: Convex Functions

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

CS599: Convex and Combinatorial Optimization Fall 2013 Lectures 5-6: Convex Functions

Uploaded by

Copyright:

Available Formats

CS599: Convex and Combinatorial Optimization

Instructor: Shaddin Dughmi

HW1 is out, due Thursday 9/26

2 Examples of Convex and Concave Functions

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Convex Functions 1/24

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)

Convex Functions 1/24

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)

Convex Functions 1/24

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)

Convex Functions 1/24

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)

A function is f : Rn → R is concave if −f is convex. Equivalently:

Convex Functions 2/24

A function is f : Rn → R is concave if −f is convex. Equivalently:

f : Rn → R is affine if it is both concave and convex. Equivalently:

First Order Definition

f (y) ≥ f (x) + (5f (x))| (y − x)

Convex Functions 3/24

First Order Definition

f (y) ≥ f (x) + (5f (x))| (y − x)

Local information → global information

Convex Functions 4/24

Convex Functions 4/24

epi(f ) = {(x, t) : t ≥ f (x)}

Convex Functions 5/24

epi(f ) = {(x, t) : t ≥ f (x)}

Convex Functions 5/24

f : Rn → R is convex if and only if

P x1 , . . . , xk in the domain of f , and θ1 , . . . , θk ≥ 0 such

Given a probability measure D on the domain of f , and x ∼ D,

f (E[x]) ≤ E[f (x)]

Convex Functions 6/24

f : Rn → R is convex if and only if

P x1 , . . . , xk in the domain of f , and θ1 , . . . , θk ≥ 0 such

Given a probability measure D on the domain of f , and x ∼ D,

f (E[x]) ≤ E[f (x)]

Adding noise to x can only increase f (x) in expectation.

Convex Functions 6/24

Local and Global Optimality

Convex Functions 7/24

Local and Global Optimality

This fact underlies much of the tractability of convex optimization.

Convex Functions 7/24

Convex Functions 8/24

This fact also underlies tractability of convex optimization

Convex Functions 8/24

This fact also underlies tractability of convex optimization

Note: converse false, but nevertheless useful check.

Convex Functions 9/24

Convex Functions 9/24

2 Examples of Convex and Concave Functions

Examples of Convex and Concave Functions 10/24