CS599: Convex and Combinatorial Optimization Fall 2013 Lectures 5-6: Convex Functions

Download as pdf or txt
Download as pdf or txt
You are on page 1of 55

CS599: Convex and Combinatorial Optimization

Fall 2013
Lectures 5-6: Convex Functions

Instructor: Shaddin Dughmi


Announcements

HW1 is out, due Thursday 9/26


Make sure you get email from me
Today: Convex Functions
Read all of B&V Chapter 3.
Outline

1 Convex Functions

2 Examples of Convex and Concave Functions

3 Convexity-Preserving Operations
Convex Functions
A function f : Rn → R is convex if the line segment between any points
on the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Convex Functions 1/24


Convex Functions
A function f : Rn → R is convex if the line segment between any points
on the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)

Convex Functions 1/24


Convex Functions
A function f : Rn → R is convex if the line segment between any points
on the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)


f is convex iff its restriction to any line {x + tv : t ∈ R} is convex

Convex Functions 1/24


Convex Functions
A function f : Rn → R is convex if the line segment between any points
on the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)


f is convex iff its restriction to any line {x + tv : t ∈ R} is convex
f is strictly convex if inequality strict when x 6= y.

Convex Functions 1/24


Convex Functions
A function f : Rn → R is convex if the line segment between any points
on the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

Inequality called Jensen’s inequality (basic form)


f is convex iff its restriction to any line {x + tv : t ∈ R} is convex
f is strictly convex if inequality strict when x 6= y.
Analogous definition when the domain of f is a convex subset D
of Rn
Convex Functions 1/24
Concave and Affine Functions

A function is f : Rn → R is concave if −f is convex. Equivalently:


Line segment between any points on the graph of f lies below f .
If x, y ∈ Rn and θ ∈ [0, 1], then
f (θx + (1 − θ)y) ≥ θf (x) + (1 − θ)f (y)

Convex Functions 2/24


Concave and Affine Functions

A function is f : Rn → R is concave if −f is convex. Equivalently:


Line segment between any points on the graph of f lies below f .
If x, y ∈ Rn and θ ∈ [0, 1], then
f (θx + (1 − θ)y) ≥ θf (x) + (1 − θ)f (y)

f : Rn → R is affine if it is both concave and convex. Equivalently:


Line segment between any points on the graph of f lies on the
graph of f .
f (x) = a| x + b for some a ∈ Rn and b ∈ R.
Convex Functions 2/24
We will now look at some equivalent definitions of convex functions

First Order Definition


A differentiable f : Rn → R is convex if and only if the first-order
approximation centered at any point x underestimates f everywhere.

f (y) ≥ f (x) + (5f (x))| (y − x)

Convex Functions 3/24


We will now look at some equivalent definitions of convex functions

First Order Definition


A differentiable f : Rn → R is convex if and only if the first-order
approximation centered at any point x underestimates f everywhere.

f (y) ≥ f (x) + (5f (x))| (y − x)

Local information → global information


If 5f (x) = 0 then x is a global minimizer of f
Convex Functions 3/24
Second Order Definition
A twice differentiable f : Rn → R is convex if and only if its derivative is
nondecreasing in all directions. Formally,

52 f (x)  0

for all x.

Convex Functions 4/24


Second Order Definition
A twice differentiable f : Rn → R is convex if and only if its derivative is
nondecreasing in all directions. Formally,

52 f (x)  0

for all x.

Intepretation
Recall definition of PSD: z | 52 f (x)z > 0 for all z ∈ Rn
At x + δ~z, infitisimal change in gradient is in roughly in the same
direction as ~z
Graph of f curves upwards along any line
When n = 1, this is f 00 (x) ≥ 0.

Convex Functions 4/24


Epigraph
The epigraph of f is the set of points above the graph of f . Formally,

epi(f ) = {(x, t) : t ≥ f (x)}

Convex Functions 5/24


Epigraph
The epigraph of f is the set of points above the graph of f . Formally,

epi(f ) = {(x, t) : t ≥ f (x)}

Epigraph Definition
f is a convex function if and only if its epigraph is a convex set.

Convex Functions 5/24


Jensen’s Inequality (General Form)

f : Rn → R is convex if and only if

P x1 , . . . , xk in the domain of f , and θ1 , . . . , θk ≥ 0 such


For every
that i θi = 1, we have
X X
f( θi x i ) ≤ θi f (xi )
i i

Given a probability measure D on the domain of f , and x ∼ D,

f (E[x]) ≤ E[f (x)]

Convex Functions 6/24


Jensen’s Inequality (General Form)

f : Rn → R is convex if and only if

P x1 , . . . , xk in the domain of f , and θ1 , . . . , θk ≥ 0 such


For every
that i θi = 1, we have
X X
f( θi x i ) ≤ θi f (xi )
i i

Given a probability measure D on the domain of f , and x ∼ D,

f (E[x]) ≤ E[f (x)]

Adding noise to x can only increase f (x) in expectation.

Convex Functions 6/24


Local and Global Optimality

Local minimum
x is a local minimum of f if there is a an open ball B containing x
where f (y) ≥ f (x) for all y ∈ B.

Local and Global Optimality


When f is convex, x is a local minimum of f if and only if it is a global
minimum.

Convex Functions 7/24


Local and Global Optimality

Local minimum
x is a local minimum of f if there is a an open ball B containing x
where f (y) ≥ f (x) for all y ∈ B.

Local and Global Optimality


When f is convex, x is a local minimum of f if and only if it is a global
minimum.

This fact underlies much of the tractability of convex optimization.

Convex Functions 7/24


Sub-level sets

p
Level sets of f (x, y) = x2 + y 2

Sublevel set
The α-sublevel set of f is {x ∈ domain(f ) : f (x) ≤ α}.

Convex Functions 8/24


Sub-level sets

p
Level sets of f (x, y) = x2 + y 2

Sublevel set
The α-sublevel set of f is {x ∈ domain(f ) : f (x) ≤ α}.

Fact
Every sub-level set of a convex function is a convex set.

This fact also underlies tractability of convex optimization

Convex Functions 8/24


Sub-level sets

p
Level sets of f (x, y) = x2 + y 2

Sublevel set
The α-sublevel set of f is {x ∈ domain(f ) : f (x) ≤ α}.

Fact
Every sub-level set of a convex function is a convex set.

This fact also underlies tractability of convex optimization

Note: converse false, but nevertheless useful check.


Convex Functions 8/24
Other Basic Properties

Continuity
Convex functions are continuous.

Convex Functions 9/24


Other Basic Properties

Continuity
Convex functions are continuous.

Extended-value extension
If a function f : D → R is convex on its domain, and D is convex, then
it can be extended to a convex function on Rn . by setting f (x) = ∞
whenever x ∈ / D.
S
This simplifies notation. ResultingSfunction fe : D → R ∞ is “convex”
with respect to the ordering on R ∞

Convex Functions 9/24


Outline

1 Convex Functions

2 Examples of Convex and Concave Functions

3 Convexity-Preserving Operations
Functions on the reals

Affine: ax + b
Exponential: eax convex for any a ∈ R
Powers: xa convex on R++ when a ≥ 1 or a ≤ 0, and concave for
0≤a≤1
Logarithm: log x concave on R++ .

Examples of Convex and Concave Functions 10/24


Norms
Norms are convex.

||θx + (1 − θ)y|| ≤ ||θx|| + ||(1 − θ)y|| = θ||x|| + (1 − θ)||y||

Uses both norm axioms: triangle inequality, and homogeneity.


Applies to matrix norms, such as the spectral norm (radius of
induced ellipsoid)

Examples of Convex and Concave Functions 11/24


Norms
Norms are convex.

||θx + (1 − θ)y|| ≤ ||θx|| + ||(1 − θ)y|| = θ||x|| + (1 − θ)||y||

Uses both norm axioms: triangle inequality, and homogeneity.


Applies to matrix norms, such as the spectral norm (radius of
induced ellipsoid)

Max
maxi xi is convex

max(θx + (1 − θ)y)i = max(θxi + (1 − θ)yi )


i i
≤ max θxi + max(1 − θ)yi
i i
= θ max xi + (1 − θ) max yi
i i

If i’m allowed to pick the maximum entry of θx and θy independently, I


can do only better.
Examples of Convex and Concave Functions 11/24
Log-sum-exp: log(ex1 + ex2 + . . . + exn ) is
convex
1
Geometric mean: ( ni=1 xi ) n is concave
Q

Log-determinant: log det X is concave


Quadratic form: x| Ax is convex iff A  0
Other examples in book
f (x, y) = log(ex + ey )

Examples of Convex and Concave Functions 12/24


Log-sum-exp: log(ex1 + ex2 + . . . + exn ) is
convex
1
Geometric mean: ( ni=1 xi ) n is concave
Q

Log-determinant: log det X is concave


Quadratic form: x| Ax is convex iff A  0
Other examples in book
f (x, y) = log(ex + ey )

Proving convexity often comes down to case-by-case reasoning,


involving:
Definition: restrict to line and check Jensen’s inequality
Write down the Hessian and prove PSD
Express as a combination of other convex functions through
convexity-preserving operations (Next)

Examples of Convex and Concave Functions 12/24


Outline

1 Convex Functions

2 Examples of Convex and Concave Functions

3 Convexity-Preserving Operations
Nonnegative Weighted Combinations
If f1 , f2 , . . . , fk are convex, and w1 , w2 , . . . , wk ≥ 0, then
g = w1 f1 + w2 f2 . . . + wk fk is convex.

Convexity-Preserving Operations 13/24


Nonnegative Weighted Combinations
If f1 , f2 , . . . , fk are convex, and w1 , w2 , . . . , wk ≥ 0, then
g = w1 f1 + w2 f2 . . . + wk fk is convex.

proof (k = 2)
     
x+y x+y x+y
g = w1 f1 + w2 f2
2 2 2
f1 (x) + f1 (y) f2 (x) + f2 (y)
≤ w1 + w2
2 2
g(x) + g(y)
=
2

Convexity-Preserving Operations 13/24


Nonnegative Weighted Combinations
If f1 , f2 , . . . , fk are convex, and w1 , w2 , . . . , wk ≥ 0, then
g = w1 f1 + w2 f2 . . . + wk fk is convex.

R
Extends to integrals g(x) = y w(y)fy (x) with w(y) ≥ 0, and therefore
expectations Ey fy (x).

Convexity-Preserving Operations 13/24


Nonnegative Weighted Combinations
If f1 , f2 , . . . , fk are convex, and w1 , w2 , . . . , wk ≥ 0, then
g = w1 f1 + w2 f2 . . . + wk fk is convex.

R
Extends to integrals g(x) = y w(y)fy (x) with w(y) ≥ 0, and therefore
expectations Ey fy (x).

Worth Noting
Minimizing the expectation of a random convex cost function is also a
convex optimization problem!
A stochastic convex optimization problem is a convex optimization
problem.

Convexity-Preserving Operations 13/24


Example: Stochastic Facility Location

Average Distance
k customers located at y1 , y2 , . . . , yk ∈ Rn
n
Pa 1facility at x ∈ R , average distance to a customer is
If I place
g(x) = i k ||x − yi ||

Convexity-Preserving Operations 14/24


Example: Stochastic Facility Location

Average Distance
k customers located at y1 , y2 , . . . , yk ∈ Rn
n
Pa 1facility at x ∈ R , average distance to a customer is
If I place
g(x) = i k ||x − yi ||

Since distance to any one customer is convex in x, so is the


average distance.
Extends to probability measure over customers
Convexity-Preserving Operations 14/24
Implication
Convex functions are a convex cone in the vector space of functions
from Rn to R.

The set of convex functions is the intersection of an infinite set of


homogeneous linear inequalities indexed by x, y, θ

f (θx + (1 − θ)y) − θf (x) + (1 − θ)f (y) ≤ 0

Convexity-Preserving Operations 15/24


Composition with Affine Function
If f : Rn → R is convex, and A ∈ Rn×m , b ∈ Rn , then
g(x) = f (Ax + b)

is a convex function from Rm to R.

Convexity-Preserving Operations 16/24


Composition with Affine Function
If f : Rn → R is convex, and A ∈ Rn×m , b ∈ Rn , then
g(x) = f (Ax + b)

is a convex function from Rm to R.

Proof
(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f (Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f )

Convexity-Preserving Operations 16/24


Composition with Affine Function
If f : Rn → R is convex, and A ∈ Rn×m , b ∈ Rn , then
g(x) = f (Ax + b)

is a convex function from Rm to R.

Proof
(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f (Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f )
(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f (Ax + b) ⇐⇒ (Ax + b, t) ∈ epi(f )

Convexity-Preserving Operations 16/24


Composition with Affine Function
If f : Rn → R is convex, and A ∈ Rn×m , b ∈ Rn , then
g(x) = f (Ax + b)

is a convex function from Rm to R.

Proof
(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f (Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f )
(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f (Ax + b) ⇐⇒ (Ax + b, t) ∈ epi(f )
epi(g) is the inverse image of epi(f ) under the affine mapping
(x, t) → (Ax + b, t)
Convexity-Preserving Operations 16/24
Examples
||Ax + b|| is convex
max(Ax + b) is convex
| | |
log(ea1 x+b1 + ea2 x+b2 + . . . + ean x+bn ) is convex

Convexity-Preserving Operations 17/24


Maximum
If f1 , f2 are convex, then g(x) = max {f1 (x), f2 (x)} is also convex.

Generalizes to the maximum of any number of functions, maxki=1 fi (x),


and also to the supremum of an infinite set of functions supy fy (x).

Convexity-Preserving Operations 18/24


Maximum
If f1 , f2 are convex, then g(x) = max {f1 (x), f2 (x)} is also convex.

Generalizes to the maximum of any number of functions, maxki=1 fi (x),


and also to the supremum of an infinite set of functions supy fy (x).

\
epi g = epi f1 epi f2
Convexity-Preserving Operations 18/24
Example: Robust Facility Location

Maximum Distance
k customers located at y1 , y2 , . . . , yk ∈ Rn
If I place a facility at x ∈ Rn , maximum distance to a customer is
g(x) = maxi ||x − yi ||

Convexity-Preserving Operations 19/24


Example: Robust Facility Location

Maximum Distance
k customers located at y1 , y2 , . . . , yk ∈ Rn
If I place a facility at x ∈ Rn , maximum distance to a customer is
g(x) = maxi ||x − yi ||

Since distance to any one customer is convex in x, so is the


worst-case distance.

Convexity-Preserving Operations 19/24


Example: Robust Facility Location

Maximum Distance
k customers located at y1 , y2 , . . . , yk ∈ Rn
If I place a facility at x ∈ Rn , maximum distance to a customer is
g(x) = maxi ||x − yi ||

Worth Noting
When a convex cost function is uncertain, minimizing the worst-case
cost is also a convex optimization problem!
A robust (in the worst-case sense) convex optimization problem is
a convex optimization problem.
Convexity-Preserving Operations 19/24
Other Examples
Maximum eigenvalue of a symmetric matrix A is convex in A

max {v | Av : ||v|| = 1}

Sum of k largest components of a vector x is convex in x


n o
max ~1S · x : |S| = k

Convexity-Preserving Operations 20/24


Minimization
If f (x, y) is convex and C is convex and nonempty, then
g(x) = inf y∈C f (x, y) is convex.

Convexity-Preserving Operations 21/24


Minimization
If f (x, y) is convex and C is convex and nonempty, then
g(x) = inf y∈C f (x, y) is convex.

Proof (for C = Rk )
epi g is the projection of epi f onto hyperplane y = 0.

f (x, y) = x2 + y 2 g(x) = x2
Convexity-Preserving Operations 21/24
Example
Distance from a convex set C

f (x, y) = inf ||x − y||


y∈C

Convexity-Preserving Operations 22/24


Composition Rules
If g : Rn → Rk and h : Rk → R, then f = h ◦ g is convex if
gi are convex, and h is convex and nondecreasing in each
argument.
gi are concave, and h is convex and nonincreasing in each
argument.

Proof (n = k = 1)
f 00 (x) = h00 (g(x))g 0 (x)2 + h0 (g(x))g 00 (x)

Convexity-Preserving Operations 23/24


Perspective
If f is convex then g(x, t) = tf (x/t) is also convex.

Proof
epi g is inverse image of epi f under the perspective function.

Convexity-Preserving Operations 24/24

You might also like