Constrained Optimization

Sample Research Questions

Designing a geometric 3D shape (cylinder, cone, etc.) to maximize volume for a fixed surface area.
Minimizing the surface area required to construct a 3D shape that can hold a given volume of contents.
For example, for a cylinder holding a fixed volume $V$, the surface area is minimized when the height equals the diameter:

$$
\begin{aligned}
V &= \pi r^2 h \implies h = \frac{V}{\pi r^2} \\
S &= 2\pi r^2 + 2\pi r h = 2\pi r^2 + \frac{2\pi r V}{\pi r^2} = 2\pi r^2 + \frac{2V}{r} \\
\frac{dS}{dr} &= 4\pi r - \frac{2V}{r^2} = 0 \\
4\pi r &= \frac{2V}{r^2} \implies V = 2\pi r^3 \\
\pi r^2 h &= 2\pi r^3 \quad [V = \pi r^2 h] \\
\therefore h &= 2r
\end{aligned}
$$
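A quick symbolic check of this result is sketched below; it assumes sympy is available (the library choice is an assumption, not part of the derivation) and confirms that the stationary point of $S(r)$ satisfies $h = 2r$.

```python
# Minimal sympy sketch (assumes sympy is installed) verifying the derivation above.
import sympy as sp

r, V = sp.symbols('r V', positive=True)

h = V / (sp.pi * r**2)                    # from the volume constraint V = pi r^2 h
S = 2 * sp.pi * r**2 + 2 * sp.pi * r * h  # surface area S(r) after substitution

r_star = sp.solve(sp.diff(S, r), r)[0]    # dS/dr = 0  ->  r* = (V / (2 pi))**(1/3)
h_star = h.subs(r, r_star)

print(sp.simplify(h_star / r_star))       # prints 2, i.e. the optimal cylinder has h = 2r
```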

Problem Definition

An optimization problem can be formulated by incorporating the following elements.
Decision Variables
The controllable inputs of an objective function that represent decisions to be made within the problem.
Objective Function
Mathematical expression that calculates the objective value.
Objective Value
The value that the optimization problem seeks to maximize or minimize; evaluating the objective function by substituting decision variables yields the objective value.
Constraints
Logical conditions or real-world limitations that restrict the possible values of decision variables.
Optimization Process
The goal of constrained optimization is to fine-tune the decision variables $(x, y)$ so that they yield the highest (for a maximization problem) or the lowest (for a minimization problem) objective value when substituted into the objective function.
Constraints screen for feasible solutions, which satisfy all equality/inequality conditions. Solutions that fail to satisfy even one of the constraints are considered infeasible and therefore excluded from consideration.
The objective function is then evaluated at every feasible solution, and the resulting objective values are compared in order to select the optimal solution (the set of decision variables) that yields the most desirable objective value.
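As an illustrative sketch of this process, a brute-force search can enumerate candidate solutions, discard infeasible ones, and keep the feasible solution with the best objective value. The toy problem below (maximize $xy$ subject to $x + y \le 10$ over non-negative integers) is an assumption chosen for brevity, not a problem from this page.

```python
# Illustrative brute-force search: filter by the constraint, then compare objective values.
best_solution, best_value = None, float('-inf')

for x in range(0, 11):            # candidate values for decision variable x
    for y in range(0, 11):        # candidate values for decision variable y
        if x + y > 10:            # constraint check: infeasible solutions are skipped
            continue
        value = x * y             # objective function evaluated at (x, y)
        if value > best_value:    # keep the feasible solution with the best objective value
            best_solution, best_value = (x, y), value

print(best_solution, best_value)  # (5, 5) with objective value 25
```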
In the mathematics IA, students should strive to define and solve an optimization problem that is more difficult than the typical optimization problems they might encounter in the HL syllabus. For example, a multivariate objective function, whose objective value is simultaneously influenced by more than one input variable, significantly raises the difficulty of the optimization problem.

Single Variate Optimization

The simplest approach to optimizing a multivariate problem is to use substitution to rewrite the multivariate objective function as a single variate function.
Suppose we seek to maximize the volume of a cylinder with fixed surface area by manipulating its radius and height.
Decision variables: $(r, h)$
Objective Function: $V(r, h) = \pi r^2 h$
Objective Value: $\pi r^2 h$
Constraint: $2\pi r h + 2\pi r^2 = S$, where $S$ is a constant value.
Maximize $V = \pi r^2 h$
subject to $2\pi r^2 + 2\pi r h = S$
Algebraically manipulate the constraint to express $h$ in terms of $r$, then use substitution to convert $V$ into a single variate function.
$h = \frac{S - 2\pi r^2}{2\pi r}$
$V(r) = \pi r^2 \times \frac{S - 2\pi r^2}{2\pi r}$
Solve $\frac{dV}{dr} = 0$ to find the radius value that maximizes the volume (confirmed by $\frac{d^2V}{dr^2} < 0$). Once this $r$ value is found, substitute it into $h = \frac{S - 2\pi r^2}{2\pi r}$ to find the corresponding height.
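The same substitution steps can be carried out symbolically. The sketch below assumes sympy is available and is only one possible way to organize the computation.

```python
# Minimal sympy sketch (assumes sympy is installed) of the substitution approach above.
import sympy as sp

r, S = sp.symbols('r S', positive=True)

h = (S - 2 * sp.pi * r**2) / (2 * sp.pi * r)   # h in terms of r, from the constraint
V = sp.pi * r**2 * h                           # single variate objective V(r)

r_star = sp.solve(sp.diff(V, r), r)[0]         # dV/dr = 0  ->  r* = sqrt(S / (6 pi))
h_star = sp.simplify(h.subs(r, r_star))        # corresponding height (equals 2 r*)
d2V = sp.diff(V, r, 2).subs(r, r_star)         # second derivative at r*: negative -> maximum

print(r_star, h_star, sp.simplify(d2V))
```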

Gradient Descent

Numerical methods may also be used to find the optimal value of a function.
Gradient
A continuously differentiable real-valued function $f(x, y)$ increases the fastest in the direction of $\nabla f(x, y)$, while decreasing the fastest in the direction of $-\nabla f(x, y)$.
$\nabla f(x, y) = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle$
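For example (an illustrative case, not one used elsewhere on this page), if $f(x, y) = x^2 + y^2$, then $\nabla f(x, y) = \langle 2x, 2y \rangle$: at every point the gradient points directly away from the origin, which is the direction in which $f$ increases fastest.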
Gradient Descent Algorithm
Gradient descent algorithm for a single variate function:
$x_{i+1} = x_i - \alpha \nabla f(x_i)$
Gradient descent algorithm for a multivariate function:
$\langle x_{i+1}, y_{i+1} \rangle = \langle x_i, y_i \rangle - \alpha \nabla f(x_i, y_i)$
Repeat until convergence, i.e., until moving from $f(x_i, y_i)$ to $f(x_{i+1}, y_{i+1})$ offers a negligible improvement in the objective value, below a predefined threshold, updating the variable values $x_i, y_i$ with each iteration of the algorithm.
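A minimal sketch of this loop is given below. The objective $f(x, y) = (x - 3)^2 + (y + 1)^2$, the step size, and the threshold are all illustrative assumptions, not values fixed by the method.

```python
# Illustrative gradient descent loop for a two-variable minimization problem.
def f(x, y):
    return (x - 3) ** 2 + (y + 1) ** 2      # assumed objective with minimum at (3, -1)

def grad_f(x, y):
    return 2 * (x - 3), 2 * (y + 1)         # analytic gradient of f

x, y = 0.0, 0.0                             # initial point (x0, y0)
alpha = 0.1                                 # step size
threshold = 1e-8                            # convergence threshold on the improvement in f

for _ in range(10_000):
    gx, gy = grad_f(x, y)
    x_new, y_new = x - alpha * gx, y - alpha * gy    # move against the gradient (minimization)
    if abs(f(x, y) - f(x_new, y_new)) < threshold:   # negligible improvement -> converged
        break
    x, y = x_new, y_new

print(x, y)                                 # approaches the minimizer (3, -1)
```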
Investigation
Start from an initial point $(x_0, y_0)$, then compute the gradient $\nabla f(x_0, y_0)$. For a problem that seeks to maximize $f(x, y)$, nudge the point in the direction of maximal increase, $\nabla f(x_0, y_0)$, by an incremental step size $\alpha$. For a minimization problem, successive iterations should guide the point in the direction of maximal decrease, $-\nabla f(x_0, y_0)$, by step size $\alpha$.
After one iteration of the gradient descent algorithm the current solution is updated to $(x_1, y_1)$. Compute the new gradient $\nabla f(x_1, y_1)$ and move in this direction by step size $\alpha$. Repeating these steps allows the initial point $(x_0, y_0)$ to incrementally crawl towards the optimal point $(x^*, y^*)$ as illustrated below.
At the optimal point, the gradient may be zero: $\nabla f(x^*, y^*) = 0$ indicates that the objective value $f(x^*, y^*)$ can no longer be improved. The direction of maximal increase of $f(x, y)$ being the zero vector $\langle 0, 0 \rangle$ would imply that the current position $(x, y)$ is the maximum (or minimum) point.
Extension:
What are the pitfalls of gradient descent or other numerical methods for optimization?
Discuss how gradient descent algorithms may return a local maximum/minimum instead of the true global maximum/minimum.
Explore how the gradient descent algorithm may produce different outcomes depending on the initial point $(x_0, y_0)$ chosen, as in the sketch after this list.
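The sketch below illustrates the second point with a made-up single variate objective, $f(x) = x^4 - 3x^2 + x$, which has two local minima; the starting points and step size are assumptions.

```python
# Illustrative example: gradient descent settles in different minima for different x0.
def f(x):
    return x ** 4 - 3 * x ** 2 + x

def df(x):
    return 4 * x ** 3 - 6 * x + 1       # derivative of f

def gradient_descent(x0, alpha=0.01, steps=5_000):
    x = x0
    for _ in range(steps):
        x = x - alpha * df(x)           # descend along -f'(x)
    return x

print(gradient_descent(-2.0))   # ~ -1.30, the global minimum of this example
print(gradient_descent(2.0))    # ~  1.13, only a local minimum
```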

Lagrange Multipliers Method

Constrained optimization problems in the form:
Maximize (or Minimize) $f(x, y)$
subject to $g(x, y) = c$
where $f(x, y)$ and $g(x, y)$ are smooth functions and $c$ is a scalar constant,
can be solved by finding $(x, y)$ that satisfy $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some scalar constant $\lambda$, together with the constraint $g(x, y) = c$ itself.
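As an illustrative sketch (assuming sympy is available), the cylinder problem from earlier can be solved directly from these conditions, $\nabla f = \lambda \nabla g$ together with the constraint $g(r, h) = S$:

```python
# Minimal sympy sketch of the Lagrange conditions for: maximize V = pi r^2 h
# subject to 2 pi r^2 + 2 pi r h = S (assumes sympy is installed).
import sympy as sp

r, h, lam, S = sp.symbols('r h lambda S', positive=True)

f = sp.pi * r**2 * h                       # objective f(r, h)
g = 2 * sp.pi * r**2 + 2 * sp.pi * r * h   # constraint function g(r, h)

# Stationarity (grad f = lambda * grad g) plus the constraint g = S.
equations = [
    sp.Eq(sp.diff(f, r), lam * sp.diff(g, r)),
    sp.Eq(sp.diff(f, h), lam * sp.diff(g, h)),
    sp.Eq(g, S),
]

solution = sp.solve(equations, [r, h, lam], dict=True)[0]
print(sp.simplify(solution[h] / solution[r]))   # prints 2, i.e. h = 2r at the optimum
```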
Geometric Interpretation of Lagrange Multipliers
If the domain of $(x, y)$ were unrestricted, the maximum value of $f(x, y)$ could be found at the unconstrained optimal point.
However, the constraint $g(x, y) = c$ restricts the scope of the search for possible $(x, y)$ solutions to points along the bounded $g(x, y) = c$ curve (green).
Level curves $f(x, y) = z$ (red) are analogous to contour lines; $f(x, y) = z$ encompasses all $(x, y)$ that yield the same objective value (altitude) when substituted into the objective function $f(x, y)$.
The constrained optimal point $(x^*, y^*)$ is found at the intersection between the constraint curve (green) and one of the possible level curves (red) where the gradients are parallel to one another.
Mathematically speaking,
$\nabla f(x^*, y^*) \parallel \nabla g(x^*, y^*)$
$\nabla f(x^*, y^*) = \lambda \nabla g(x^*, y^*)$