Introduction

This post is a bit of a mathematical diversion from the previous post on robust optimisation. To follow along with the derivation of the Bertsimas and Sim robustness model we’re going to need to optimise an LP minimisation programme which contains a second ’nested’ LP maximisation programme in one of its constraints.

It isn’t possible to solve such a programme directly so we will have to find a way to reformulate it. The reformulation itself uses the interesting ‘duality’ result for LPs and is, at least in my opinion, interesting enough to not simply quote from the paper. In the end I decided that it was bulky enough that instead of jamming it midway through a piece focussing on different definitions and implementations of ‘robustness’ it should go into its own short post.

Nested optimisation problems

Suppose you’re solving a minimisation problem which contains a constraint like this:

\begin{align*} \sum_{j=1}^{n} \bar a_{j} x_{j} + \max_{z} \Bigg\{ \sum_{j=1}^{n} c_{j} z_{j} : \sum_{j=1}^{n} z_{j} \le \Gamma,\ 0 \le z_{j} \le 1 \Bigg\} \le b \end{align*}

where $c_j$ are some coefficients possibly depending on $x$, $\Gamma \in \mathbb{Z}$, and $x$ is our decision variable.

We’re faced with a maximisation problem nested inside a constraint of a minimisation problem. This isn’t a plain LP, so off-the-shelf LP solvers (e.g. Gurobi) can’t handle it directly: we’re asking “what’s the worst-case value of this ‘inner’ maximisation problem?” while simultaneously trying to minimise the outer problem.

Let’s label the ‘inner’ problem (1):

\begin{align*} \max_{z} \quad & \sum_{j=1}^n c_j z_j \\ \text{s.t.}\quad & \sum_{j=1}^n z_j \le \Gamma, \\ & 0 \le z_j \le 1 \quad \forall j. \end{align*} \tag{1}
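As a quick sanity check, the inner problem (1) is easy to solve numerically. Here's a minimal sketch using `scipy.optimize.linprog` with made-up data (`linprog` minimises, so we negate the objective):

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data (not from the post): n = 4 coefficients, budget Γ = 2.
c = np.array([3.0, 1.0, 4.0, 2.0])
gamma = 2
n = len(c)

# linprog minimises, so negate the objective to maximise sum_j c_j z_j.
res = linprog(
    -c,
    A_ub=np.ones((1, n)),      # sum_j z_j <= Γ
    b_ub=[gamma],
    bounds=[(0.0, 1.0)] * n,   # 0 <= z_j <= 1
)
print(-res.fun)  # worst case picks the Γ largest c_j: 4 + 3 = 7
```

With an integer $\Gamma$ and the box constraints, the optimum simply selects the $\Gamma$ largest $c_j$, which is exactly the “worst-case budget” intuition behind the constraint.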

The approach taken in [1], which I will follow here, is to use duality to convert the inner maximisation into a minimisation. If we can find a dual representation of this problem, we can replace the maximisation with a minimisation over the resulting dual variables, $w$. Since both the outer problem and the inner problem are then minimisations, we can combine them: instead of “minimise over $x$, where each $x$ requires solving a separate optimisation over $w$”, we can simply minimise over both $x$ and $w$ jointly. The inner minimisation gets absorbed by treating $w$ as additional decision variables alongside $x$. Proceeding step-by-step:

Step 1: Convert to Standard Form

To take the dual of (1), we first need to rewrite it in standard LP form: only equality constraints are allowed and all variables must be constrained to be non-negative. We therefore replace the ‘budget’ constraint with an equality by introducing a slack variable $\sigma \ge 0$:

\sum_j z_j + \sigma = \Gamma, \quad \sigma \ge 0.

and we replace each upper bound $z_j \le 1$ with a slack variable $s_j \ge 0$:

z_j + s_j = 1, \quad s_j \ge 0.

The constraints can be written compactly as $Au = b$ if we collect all of our variables into a single vector:

u=(z s σ)0R2n+1. u = \begin{pmatrix} z \ s \ \sigma \end{pmatrix} \ge 0 \in \mathbb{R}^{2n+1}.

and we take

\begin{align*} A = \begin{pmatrix} I_n & I_n & 0 \\ \mathbf{1}^\top & 0^\top & 1 \end{pmatrix}, \qquad b = \begin{pmatrix} \mathbf{1} \\ \Gamma \end{pmatrix}. \end{align*}

Only the $z_j$ variables appear in the objective, so the cost vector is simply $c = (c_1, \dots, c_n, 0, \dots, 0)$.
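The block structure is straightforward to assemble in code. A small sketch with NumPy, using hypothetical values for $n$, $\Gamma$ and the $c_j$:

```python
import numpy as np

# Hypothetical data matching the construction above: n = 3, Γ = 2.
n, gamma = 3, 2
c_inner = np.array([2.0, 5.0, 1.0])

I = np.eye(n)
# Block rows: z_j + s_j = 1 (n rows), then sum_j z_j + σ = Γ (1 row).
A = np.block([
    [I, I, np.zeros((n, 1))],
    [np.ones((1, n)), np.zeros((1, n)), np.ones((1, 1))],
])
b = np.concatenate([np.ones(n), [gamma]])
c = np.concatenate([c_inner, np.zeros(n + 1)])  # only the z_j carry cost

print(A.shape, b.shape, c.shape)  # (4, 7) (4,) (7,)
```

As expected, $A$ has $n + 1 = 4$ rows and $2n + 1 = 7$ columns, one per component of $u = (z, s, \sigma)$.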

Thus (1) is equivalent to

\begin{aligned} \max_{u} \quad & c^\top u \\ \text{s.t.}\quad & A u = b, \\ & u \ge 0. \end{aligned}

Step 2: Take the Dual

A maximisation LP in standard form and its dual are given by, respectively:

\max \{ c^\top u : A u = b,\ u \ge 0 \} \qquad \min \{ b^\top w : A^\top w \ge c \}.

where the dual variable, $w$, has dimension $n+1$: $w_j$ (for $j \ge 1$) maps to the original constraint $z_j + s_j = 1$ and $w_0$ corresponds back to $\sum_j z_j + \sigma = \Gamma$.

The dual objective is given by

b^\top w = \sum_{j=1}^n 1 \cdot w_j + \Gamma w_0 = \sum_{j=1}^n w_j + \Gamma w_0.

Next we need to look at the dual constraints $A^\top w \ge c$.

For the columns corresponding to $z_j$: column $j$ of $A$ has a 1 in row $j$ and a 1 in the last row, so we get $w_j + w_0 \ge c_j$ for all $j$. For the columns corresponding to $s_j$: column $j$ of the second block has a 1 in row $j$ and a 0 in the last row, so we have $w_j \ge 0$ for all $j$. Lastly, the column corresponding to $\sigma$ gives $w_0 \ge 0$.

Putting this together, we can write the dual problem as

\begin{align*} \min_{w} \quad & \sum_{j=1}^n w_j + \Gamma w_0 \\ \text{s.t.}\quad & w_j + w_0 \ge c_j, \quad j = 1,\dots,n, \\ & w_j \ge 0,\quad j = 1,\dots,n, \\ & w_0 \ge 0. \end{align*} \tag{2}
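We can also check strong duality numerically by solving (1) and (2) side by side. A sketch with `scipy.optimize.linprog` on made-up data; note that (2) is rewritten with $\le$ rows because `linprog` expects upper-bound inequalities:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data (not from the post).
c = np.array([3.0, 1.0, 4.0, 2.0])
gamma = 2
n = len(c)

# Primal (1): maximise c·z subject to sum_j z_j <= Γ, 0 <= z_j <= 1.
primal = linprog(-c, A_ub=np.ones((1, n)), b_ub=[gamma], bounds=[(0, 1)] * n)

# Dual (2): minimise sum_j w_j + Γ w_0 s.t. w_j + w_0 >= c_j, w >= 0.
# Variables ordered (w_1, ..., w_n, w_0); flip signs for linprog's <= form.
obj = np.concatenate([np.ones(n), [gamma]])
A_ub = np.hstack([-np.eye(n), -np.ones((n, 1))])  # -(w_j + w_0) <= -c_j
dual = linprog(obj, A_ub=A_ub, b_ub=-c, bounds=[(0, None)] * (n + 1))

print(-primal.fun, dual.fun)  # both 7.0: the optima coincide
```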

Phew, we made it.

Step 3: Reformulate the Original Constraint

We started out with the following awkward constraint

\sum_{j=1}^n \bar a_{j} x_{j} + \max_{z} \left\{ \sum_{j=1}^n c_{j} z_{j} : \sum_{j=1}^n z_{j} \le \Gamma,\ 0 \le z_{j} \le 1 \right\} \le b.

By strong duality, the optimal value of the primal problem, (1), is equal to the optimal value of the dual problem, (2). (Strong duality applies here because the primal is feasible, e.g. $z = 0$, and bounded.) So we can replace the maximisation with the minimisation:

\sum_{j=1}^n \bar a_j x_j + \min_{w} \left\{ \sum_{j=1}^n w_j + \Gamma w_0 : w_j + w_0 \ge c_j \ \forall j,\ w \ge 0 \right\} \le b.

Recall that the outer problem was also an optimisation, a minimisation over $x$, so we can pull out the minimisation over $w$ and treat the $w_j$ as additional decision variables. This gives us an equivalent formulation that’s purely linear:

\begin{align*} &\sum_{j=1}^n \bar a_j x_j + \sum_{j=1}^n w_j + \Gamma w_0 \le b,\\ &w_j + w_0 \ge c_j, \quad j = 1,\dots,n,\\ &w_j \ge 0,\quad j=1,\dots,n,\\ &w_0 \ge 0. \end{align*}
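To see the reformulation end to end, here's a toy outer problem stitched together with the linearised constraint. Everything here (the outer objective $d$, the data values, the lower bounds on $x$) is invented purely for illustration, and the $c_j$ are treated as fixed coefficients:

```python
import numpy as np
from scipy.optimize import linprog

# Toy outer problem (all data hypothetical): minimise d·x subject to the
# linearised robust constraint, jointly over (x, w, w_0).
n = 3
d = np.array([1.0, 2.0, 1.5])      # outer objective coefficients
a_bar = np.array([4.0, 3.0, 5.0])  # nominal coefficients ā_j
c = np.array([1.0, 0.5, 2.0])      # inner-problem coefficients c_j
gamma, b = 2, 20.0

# Decision vector: (x_1..x_n, w_1..w_n, w_0), length 2n + 1.
obj = np.concatenate([d, np.zeros(n + 1)])
# Robust constraint: ā·x + sum_j w_j + Γ w_0 <= b.
row_robust = np.concatenate([a_bar, np.ones(n), [gamma]])
# Dual feasibility rows in linprog's <= form: -(w_j + w_0) <= -c_j.
rows_dual = np.hstack([np.zeros((n, n)), -np.eye(n), -np.ones((n, 1))])
A_ub = np.vstack([row_robust, rows_dual])
b_ub = np.concatenate([[b], -c])

# Require x >= 1 so the problem isn't trivially solved by x = 0.
bounds = [(1.0, None)] * n + [(0.0, None)] * (n + 1)
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x[:n], res.fun)  # x sits at its lower bound, objective d·x = 4.5
```

The solver handles $x$, $w$ and $w_0$ as one flat variable vector: exactly the “minimise over both jointly” step described above.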

The nested maximisation has gone and we’ve replaced it with $n+1$ additional variables and a few extra linear constraints.

In the next part (the last of this mini-series), we’ll use this reformulation to follow along with the Bertsimas and Sim robust optimisation model derivation. Then we’ll implement it and compare the results to the Soyster model from part 1.

References

[1]: Bertsimas, D. and Sim, M., “The Price of Robustness”, Operations Research 52(1), 2004.