Constrained Optimization

Sample Research Questions

Designing a geometric 3D shape (cylinder, cone, etc.) to maximize volume for a fixed surface area.
Minimizing the surface area required to construct a 3D shape that can hold a given volume of contents.
For example, for a cylinder holding a fixed volume $V$, the surface area is minimized when the height equals the diameter:

$$
\begin{aligned}
V &= \pi r^2 h \implies h = \frac{V}{\pi r^2} \\
S &= 2\pi r^2 + 2\pi r h = 2\pi r^2 + \frac{2\pi r V}{\pi r^2} = 2\pi r^2 + \frac{2V}{r} \\
\frac{dS}{dr} &= 4\pi r - \frac{2V}{r^2} = 0 \\
4\pi r &= \frac{2V}{r^2} \implies V = 2\pi r^3 \\
\pi r^2 h &= 2\pi r^3 \quad [V = \pi r^2 h] \\
\therefore h &= 2r
\end{aligned}
$$
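A quick symbolic check of this result is sketched below; it assumes sympy is available (the library choice is an assumption, not part of the derivation) and confirms that the stationary point of $S(r)$ satisfies $h = 2r$.

```python
# Minimal sympy sketch (assumes sympy is installed) verifying the derivation above.
import sympy as sp

r, V = sp.symbols('r V', positive=True)

h = V / (sp.pi * r**2)                    # from the volume constraint V = pi r^2 h
S = 2 * sp.pi * r**2 + 2 * sp.pi * r * h  # surface area S(r) after substitution

r_star = sp.solve(sp.diff(S, r), r)[0]    # dS/dr = 0  ->  r* = (V / (2 pi))**(1/3)
h_star = h.subs(r, r_star)

print(sp.simplify(h_star / r_star))       # prints 2, i.e. the optimal cylinder has h = 2r
```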

Problem Definition

An optimization problem can be formulated by incorporating the following elements.
Decision Variables
The controllable inputs of an objective function that represent decisions to be made within the problem.
Objective Function
Mathematical expression that calculates the objective value.
Objective Value
The value that the optimization problem seeks to maximize or minimize; evaluating the objective function by substituting decision variables yields the objective value.
Constraints
Logical conditions or real-world limitations that restrict the possible values of decision variables.
Optimization Process
The goal of constrained optimization is to fine-tune the decision variables $(x, y)$ so that they yield the highest (for a maximization problem) or the lowest (for a minimization problem) objective value when substituted into the objective function.
Constraints screen for feasible solutions, which satisfy all equality/inequality conditions. Solutions that fail to satisfy even one of the constraints are considered infeasible and therefore excluded from consideration.
The objective function is then evaluated at every feasible solution, and the resulting objective values are compared in order to select the optimal solution (the set of decision variables) that yields the most desirable objective value.
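As an illustrative sketch of this process, a brute-force search can enumerate candidate solutions, discard infeasible ones, and keep the feasible solution with the best objective value. The toy problem below (maximize $xy$ subject to $x + y \le 10$ over non-negative integers) is an assumption chosen for brevity, not a problem from this page.

```python
# Illustrative brute-force search: filter by the constraint, then compare objective values.
best_solution, best_value = None, float('-inf')

for x in range(0, 11):            # candidate values for decision variable x
    for y in range(0, 11):        # candidate values for decision variable y
        if x + y > 10:            # constraint check: infeasible solutions are skipped
            continue
        value = x * y             # objective function evaluated at (x, y)
        if value > best_value:    # keep the feasible solution with the best objective value
            best_solution, best_value = (x, y), value

print(best_solution, best_value)  # (5, 5) with objective value 25
```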
In the mathematics IA, students should strive to define and solve an optimization problem that is more difficult than the typical optimization problems they might encounter in the HL syllabus. For example, a multivariate objective function, whose objective value is simultaneously influenced by more than one input variable, significantly raises the difficulty of the optimization problem.

Single Variate Optimization

The simplest approach to optimizing a multivariate problem is to use substitution to rewrite the multivariate objective function as a single variate function.
Suppose we seek to maximize the volume of a cylinder with fixed surface area by manipulating its radius and height.
Decision variables: $(r, h)$
Objective Function: $V(r, h) = \pi r^2 h$
Objective Value: $\pi r^2 h$
Constraint: $2\pi r h + 2\pi r^2 = S$, where $S$ is a constant value.
Maximize $V = \pi r^2 h$
subject to $2\pi r^2 + 2\pi r h = S$
Algebraically manipulate the constraint to express $h$ in terms of $r$, then use substitution to convert $V$ into a single variate function.
$h = \frac{S - 2\pi r^2}{2\pi r}$
$V(r) = \pi r^2 \times \frac{S - 2\pi r^2}{2\pi r}$
Solve $\frac{dV}{dr} = 0$ to find the radius value that maximizes the volume (confirmed by $\frac{d^2V}{dr^2} < 0$). Once this $r$ value is found, substitute it into $h = \frac{S - 2\pi r^2}{2\pi r}$ to find the corresponding height.
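The same substitution steps can be carried out symbolically. The sketch below assumes sympy is available and is only one possible way to organize the computation.

```python
# Minimal sympy sketch (assumes sympy is installed) of the substitution approach above.
import sympy as sp

r, S = sp.symbols('r S', positive=True)

h = (S - 2 * sp.pi * r**2) / (2 * sp.pi * r)   # h in terms of r, from the constraint
V = sp.pi * r**2 * h                           # single variate objective V(r)

r_star = sp.solve(sp.diff(V, r), r)[0]         # dV/dr = 0  ->  r* = sqrt(S / (6 pi))
h_star = sp.simplify(h.subs(r, r_star))        # corresponding height (equals 2 r*)
d2V = sp.diff(V, r, 2).subs(r, r_star)         # second derivative at r*: negative -> maximum

print(r_star, h_star, sp.simplify(d2V))
```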

Gradient Descent

Numerical methods may also be used to find the optimal value of a function.
Gradient
A continuously differentiable real-valued function $f(x, y)$ increases the fastest in the direction of $\nabla f(x, y)$, while decreasing the fastest in the direction of $-\nabla f(x, y)$.
$\nabla f(x, y) = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle$
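For example (an illustrative case, not one used elsewhere on this page), if $f(x, y) = x^2 + y^2$, then $\nabla f(x, y) = \langle 2x, 2y \rangle$: at every point the gradient points directly away from the origin, which is the direction in which $f$ increases fastest.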
Gradient Descent Algorithm
Gradient descent algorithm for a single variate function:
$x_{i+1} = x_i - \alpha \nabla f(x_i)$
Gradient descent algorithm for a multivariate function:
$\langle x_{i+1}, y_{i+1} \rangle = \langle x_i, y_i \rangle - \alpha \nabla f(x_i, y_i)$
Repeat until convergence, i.e., until moving from $f(x_i, y_i)$ to $f(x_{i+1}, y_{i+1})$ offers a negligible improvement in the objective value, below a predefined threshold, updating the variable values $x_i, y_i$ with each iteration of the algorithm.
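A minimal sketch of this loop is given below. The objective $f(x, y) = (x - 3)^2 + (y + 1)^2$, the step size, and the threshold are all illustrative assumptions, not values fixed by the method.

```python
# Illustrative gradient descent loop for a two-variable minimization problem.
def f(x, y):
    return (x - 3) ** 2 + (y + 1) ** 2      # assumed objective with minimum at (3, -1)

def grad_f(x, y):
    return 2 * (x - 3), 2 * (y + 1)         # analytic gradient of f

x, y = 0.0, 0.0                             # initial point (x0, y0)
alpha = 0.1                                 # step size
threshold = 1e-8                            # convergence threshold on the improvement in f

for _ in range(10_000):
    gx, gy = grad_f(x, y)
    x_new, y_new = x - alpha * gx, y - alpha * gy    # move against the gradient (minimization)
    if abs(f(x, y) - f(x_new, y_new)) < threshold:   # negligible improvement -> converged
        break
    x, y = x_new, y_new

print(x, y)                                 # approaches the minimizer (3, -1)
```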
Investigation
Start from an initial point $(x_0, y_0)$, then compute the gradient $\nabla f(x_0, y_0)$. For a problem that seeks to maximize $f(x, y)$, nudge the point in the direction of maximal increase, $\nabla f(x_0, y_0)$, by an incremental step size $\alpha$. For a minimization problem, successive iterations should guide the point in the direction of maximal decrease, $-\nabla f(x_0, y_0)$, by step size $\alpha$.
After one iteration of the gradient descent algorithm the current solution is updated to $(x_1, y_1)$. Compute the new gradient $\nabla f(x_1, y_1)$ and move in this direction by step size $\alpha$. Repeating these steps allows the initial point $(x_0, y_0)$ to incrementally crawl towards the optimal point $(x^*, y^*)$ as illustrated below.
At the optimal point, the gradient may be zero: $\nabla f(x^*, y^*) = 0$ indicates that the objective value $f(x^*, y^*)$ can no longer be improved. The direction of maximal increase of $f(x, y)$ being the zero vector $\langle 0, 0 \rangle$ would imply that the current position $(x, y)$ is the maximum (or minimum) point.
Extension:
What are the pitfalls of gradient descent or other numerical methods for optimization?
Discuss how gradient descent algorithms may return a local maximum/minimum instead of the true global maximum/minimum.
Explore how the gradient descent algorithm may produce different outcomes depending on the initial point $(x_0, y_0)$ chosen, as in the sketch after this list.
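The sketch below illustrates the second point with a made-up single variate objective, $f(x) = x^4 - 3x^2 + x$, which has two local minima; the starting points and step size are assumptions.

```python
# Illustrative example: gradient descent settles in different minima for different x0.
def f(x):
    return x ** 4 - 3 * x ** 2 + x

def df(x):
    return 4 * x ** 3 - 6 * x + 1       # derivative of f

def gradient_descent(x0, alpha=0.01, steps=5_000):
    x = x0
    for _ in range(steps):
        x = x - alpha * df(x)           # descend along -f'(x)
    return x

print(gradient_descent(-2.0))   # ~ -1.30, the global minimum of this example
print(gradient_descent(2.0))    # ~  1.13, only a local minimum
```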

Lagrange Multipliers Method

Constrained optimization problems in the form:
Maximize (or Minimize) $f(x, y)$
subject to $g(x, y) = c$
where $f(x, y)$ and $g(x, y)$ are smooth functions and $c$ is a scalar constant,
can be solved by finding $(x, y)$ that satisfy $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some scalar constant $\lambda$, together with the constraint $g(x, y) = c$ itself.
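As an illustrative sketch (assuming sympy is available), the cylinder problem from earlier can be solved directly from these conditions, $\nabla f = \lambda \nabla g$ together with the constraint $g(r, h) = S$:

```python
# Minimal sympy sketch of the Lagrange conditions for: maximize V = pi r^2 h
# subject to 2 pi r^2 + 2 pi r h = S (assumes sympy is installed).
import sympy as sp

r, h, lam, S = sp.symbols('r h lambda S', positive=True)

f = sp.pi * r**2 * h                       # objective f(r, h)
g = 2 * sp.pi * r**2 + 2 * sp.pi * r * h   # constraint function g(r, h)

# Stationarity (grad f = lambda * grad g) plus the constraint g = S.
equations = [
    sp.Eq(sp.diff(f, r), lam * sp.diff(g, r)),
    sp.Eq(sp.diff(f, h), lam * sp.diff(g, h)),
    sp.Eq(g, S),
]

solution = sp.solve(equations, [r, h, lam], dict=True)[0]
print(sp.simplify(solution[h] / solution[r]))   # prints 2, i.e. h = 2r at the optimum
```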
Geometric Interpretation of Lagrange Multipliers
If the domain of $(x, y)$ were unrestricted, the maximum value of $f(x, y)$ could be found at the unconstrained optimal point.
However, the constraint $g(x, y) = c$ restricts the scope of the search for possible $(x, y)$ solutions to points along the bounded $g(x, y) = c$ curve (green).
Level curves $f(x, y) = z$ (red) are analogous to contour lines; $f(x, y) = z$ encompasses all $(x, y)$ that yield the same objective value (altitude) when substituted into the objective function $f(x, y)$.
The constrained optimal point $(x^*, y^*)$ is found at the intersection between the constraint curve (green) and one of the possible level curves (red) where the gradients are parallel to one another.
Mathematically speaking,
$\nabla f(x^*, y^*) \parallel \nabla g(x^*, y^*)$
$\nabla f(x^*, y^*) = \lambda \nabla g(x^*, y^*)$