Introduction

This post is a bit of a mathematical diversion from the previous post on robust optimisation. To follow along with the derivation of the Bertsimas and Sim robustness model we’re going to need to optimise an LP minimisation programme which contains a second ’nested’ LP maximisation programme in one of its constraints.

It isn’t possible to solve such a programme directly so we will have to find a way to reformulate it. The reformulation relies on the classic ‘duality’ result for LPs and is, at least in my opinion, interesting enough not to simply quote from the paper. In the end I decided it was bulky enough that, instead of jamming it midway through a piece focussing on different definitions and implementations of ‘robustness’, it should go into its own short post.

Nested optimisation problems

Suppose you’re solving a minimisation problem which contains a constraint like this:

$$ \begin{align*} \sum_{j=1}^{n} \bar a_{j} x_{j} + \max_{z} \Bigg\{ \sum_{j=1}^{n} c_{j} z_{j} : \sum_{j=1}^{n} z_{j} \le \Gamma,\ 0 \le z_{j} \le 1 \Bigg\} \le b \end{align*} $$

where \(c_j\) are some coefficients possibly depending on \(x\), \(\Gamma\in\mathbb{Z}\) and \(x\) is our decision variable.

We’re faced with a maximisation problem nested inside a constraint of a minimisation problem. This nested (bilevel) structure is something off-the-shelf LP solvers (e.g. Gurobi) can’t handle directly: we’re asking “what’s the worst-case value of this ‘inner’ maximisation problem?” while simultaneously trying to minimise the outer problem.

Let’s label the ‘inner’ problem (1):

$$ \begin{align*} \max_{z} \quad & \sum_{j=1}^n c_j z_j \\ \text{s.t.}\quad & \sum_{j=1}^n z_j \le \Gamma, \\ & 0 \le z_j \le 1 \quad \forall j. \end{align*} \tag{1} $$
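Before we dive into duality, it's worth noting that (1) has a simple combinatorial solution when \(\Gamma\) is an integer: switch on the \(\Gamma\) largest positive coefficients. Here's a quick sanity check of that rule against brute-force enumeration of the 0/1 vertices (the numbers are my own toy example, not from the paper):

```python
from itertools import product

def inner_max(c, gamma):
    """Greedy value of (1): since Gamma is an integer, an optimum sits at a
    0/1 vertex, i.e. pick the Gamma largest positive coefficients."""
    return sum(sorted((cj for cj in c if cj > 0), reverse=True)[:gamma])

def inner_max_brute(c, gamma):
    """Enumerate every 0/1 choice of z with sum(z) <= Gamma."""
    best = 0.0
    for z in product((0, 1), repeat=len(c)):
        if sum(z) <= gamma:
            best = max(best, sum(cj * zj for cj, zj in zip(c, z)))
    return best

c = [3.0, -1.0, 4.0, 1.5, 0.5]
print(inner_max(c, 2), inner_max_brute(c, 2))  # 7.0 7.0
```

The agreement here is just a spot check for one instance, but it makes the "worst case = top-\(\Gamma\) deviations" intuition concrete.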

The approach taken in [1], which I will follow here, is to use duality to convert the inner maximisation into a minimisation. If we can find a dual representation of this problem, we can replace the maximisation with a minimisation over the resulting dual variables, which we’ll call \(w\). Since both the outer problem and the inner problem are then minimisations, we can combine them: instead of “minimise over \(x\), where each \(x\) requires solving a separate optimisation over \(w\)”, we can simply “minimise over both \(x\) and \(w\) jointly”. The inner minimisation gets absorbed by treating \(w\) as additional decision variables alongside \(x\). Proceeding step-by-step:

Step 1: Convert to Standard Form

To take the dual of (1), we first need to rewrite it in standard LP form: only equality constraints are allowed and all variables must be constrained to be non-negative. So we replace the ‘budget’ constraint with an equality by introducing a slack variable \(\sigma \ge 0\):

$$ \sum_j z_j + \sigma = \Gamma, \quad \sigma \ge 0. $$

and we replace each upper bound \(z_j \le 1\) with a slack variable \(s_j \ge 0\):

$$ z_j + s_j = 1, \quad s_j \ge 0. $$

The constraints can be written compactly as \(A u = b\) if we collect all of our variables into a single vector:

$$ u = \begin{pmatrix} z \\ s \\ \sigma \end{pmatrix} \in \mathbb{R}^{2n+1}, \quad u \ge 0. $$

and we take

$$ A = \begin{pmatrix} I_n & I_n & 0 \\ \mathbf{1}^\top & 0^\top & 1 \end{pmatrix}, \qquad b = \begin{pmatrix} \mathbf{1} \\ \Gamma \end{pmatrix}. $$

Only the \(z_j\) variables appear in the objective, so the cost vector is simply \(c = (c_1, \ldots, c_n, 0, \ldots, 0)\).

Thus (1) is equivalent to

$$ \begin{aligned} \max_{u} \quad & c^\top u \\ \text{s.t.}\quad & A u = b, \\ & u \ge 0. \end{aligned} $$
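To make the bookkeeping concrete, here's a small sketch (plain Python, with \(n = 3\) and \(\Gamma = 2\) chosen arbitrarily) that assembles \(A\) and \(b\) and checks that a feasible \(z\), completed with its slacks, satisfies \(Au = b\):

```python
n, gamma = 3, 2

# A has n+1 rows: n rows for z_j + s_j = 1, plus one budget row
# for sum(z) + sigma = Gamma. Columns follow u = (z, s, sigma).
A = [[0.0] * (2 * n + 1) for _ in range(n + 1)]
for j in range(n):
    A[j][j] = 1.0          # z_j in row j
    A[j][n + j] = 1.0      # s_j in row j
    A[n][j] = 1.0          # z_j in the budget row
A[n][2 * n] = 1.0          # sigma in the budget row
b = [1.0] * n + [float(gamma)]

# Any feasible z, padded with its slacks, satisfies A u = b exactly.
z = [1.0, 0.5, 0.0]
u = z + [1.0 - zj for zj in z] + [gamma - sum(z)]
print([sum(aij * uj for aij, uj in zip(row, u)) for row in A])
# [1.0, 1.0, 1.0, 2.0]  -- i.e. equal to b
```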

Step 2: Take the Dual

A maximisation LP in standard form and its dual are given, respectively, by:

$$ \max \{ c^\top u : A u = b,\ u \ge 0 \} $$ $$ \min \{ b^\top w : A^\top w \ge c \}. $$

The dual variable, \(w\), has dimension \(n+1\): \(w_j\) (for \(j \ge 1\)) corresponds to the original constraint \(z_j + s_j = 1\), and \(w_0\) corresponds to the budget constraint \(\sum_j z_j + \sigma = \Gamma\).

The dual objective is given by

$$ b^\top w = \sum_{j=1}^n 1 \cdot w_j + \Gamma w_0 = \sum_{j=1}^n w_j + \Gamma w_0. $$

Next we need to look at the dual constraints \(A^\top w \ge c\).

For columns corresponding to \(z_j\): column \(j\) of \(A\) has a 1 in row \(j\) and a 1 in the last row, so we get \(w_j + w_0 \ge c_j\) for all \(j\). For columns corresponding to \(s_j\): column \(n+j\) has a 1 in row \(j\) and a 0 in the last row, so we get \(w_j \ge 0\) for all \(j\). Lastly, the column corresponding to \(\sigma\) gives \(w_0 \ge 0\).

Putting this together, we can write the dual problem as

$$ \begin{align*} \min_{w} \quad & \sum_{j=1}^n w_j + \Gamma w_0 \\ \text{s.t.}\quad & w_j + w_0 \ge c_j, \quad j = 1,\dots,n, \\ & w_j \ge 0,\quad j = 1,\dots,n, \\ & w_0 \ge 0. \end{align*} \tag{2} $$

Phew, we made it.
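We can also sanity-check strong duality numerically. The sketch below (toy numbers again, mine not the paper's) builds a feasible point of (2) whose objective attains the primal optimum: take \(w_0\) to be the \((\Gamma+1)\)-th largest coefficient, clamped at zero, and set \(w_j = \max(c_j - w_0, 0)\).

```python
def dual_solution(c, gamma):
    """A feasible dual point of (2) attaining the primal optimum:
    w0 = the (gamma+1)-th largest coefficient (clamped at 0),
    w_j = max(c_j - w0, 0)."""
    ranked = sorted(c, reverse=True)
    w0 = max(ranked[gamma], 0.0) if gamma < len(c) else 0.0
    w = [max(cj - w0, 0.0) for cj in c]
    return w0, w

c, gamma = [3.0, -1.0, 4.0, 1.5, 0.5], 2
w0, w = dual_solution(c, gamma)
assert all(wj + w0 >= cj for wj, cj in zip(w, c))  # dual feasibility
print(sum(w) + gamma * w0)  # 7.0, the sum of the top-2 coefficients (4 + 3)
```

The dual objective lands exactly on the value the greedy "top-\(\Gamma\) positive coefficients" rule gives for the primal, as strong duality promises.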

Step 3: Reformulate the Original Constraint

We started out with the following awkward constraint

$$ \sum_{j=1}^n \bar a_{j} x_{j} + \max_{z} \left\{ \sum_{j=1}^n c_{j} z_{j} : \sum_{j=1}^n z_{j} \le \Gamma,\ 0 \le z_{j} \le 1 \right\} \le b. $$

By strong duality, the optimal value of the primal problem (1) equals the optimal value of the dual problem (2), so we can replace the maximisation with the minimisation:

$$ \sum_{j=1}^n \bar a_j x_j + \min_{w} \left\{ \sum_{j=1}^n w_j + \Gamma w_0 : w_j + w_0 \ge c_j \ \forall j,\ w \ge 0 \right\} \le b $$

Recall that the outer problem was also an optimisation, a minimisation over \(x\), so we can pull out the minimisation over \(w\) and treat the \(w_j\) as additional decision variables. This gives us an equivalent formulation that’s purely linear:

$$ \begin{align*} &\sum_{j=1}^n \bar a_j x_j + \sum_{j=1}^n w_j + \Gamma w_0 \le b,\\ &w_j + w_0 \ge c_j, \quad j = 1,\dots,n,\\ &w_j \ge 0,\quad j=1,\dots,n,\\ &w_0 \ge 0. \end{align*} $$

The nested maximisation has gone and we’ve replaced it with \(n+1\) additional variables and a few extra linear constraints.
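To see the reformulation end-to-end, here's a sketch that solves one small robust instance with SciPy's `linprog`. The data are my own toy numbers, not from the paper: \(n = 2\), nominal coefficients \(\bar a = (1, 1)\), deviations \(\hat a = (0.5, 0.5)\), \(\Gamma = 1\), \(b = 10\), and we maximise \(x_1 + x_2\). In the Bertsimas and Sim setting \(c_j = \hat a_j x_j\), so the dual constraints \(w_j + w_0 \ge \hat a_j x_j\) remain linear in \((x, w)\).

```python
from scipy.optimize import linprog

# Decision vector: (x1, x2, w1, w2, w0). linprog minimises, so negate
# the objective to maximise x1 + x2.
obj = [-1.0, -1.0, 0.0, 0.0, 0.0]
A_ub = [
    [1.0, 1.0, 1.0, 1.0, 1.0],    # a_bar.x + sum(w) + Gamma*w0 <= b  (Gamma = 1)
    [0.5, 0.0, -1.0, 0.0, -1.0],  # a_hat_1*x1 - w1 - w0 <= 0
    [0.0, 0.5, 0.0, -1.0, -1.0],  # a_hat_2*x2 - w2 - w0 <= 0
]
b_ub = [10.0, 0.0, 0.0]
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 5)
print(res.x[:2], -res.fun)  # x = (4, 4): the worst case eats 0.5*4 = 2 of budget
```

By symmetry the worst case here adds \(\max(0.5 x_1, 0.5 x_2)\) to the nominal row, so the optimum is \(x_1 = x_2 = 4\) with \(2 \cdot 4 + 2 = 10\) hitting the right-hand side exactly; the LP recovers this without ever seeing the nested maximisation.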

In the next part (the last of this mini-series), we’ll use this reformulation to follow along with the Bertsimas and Sim robust optimisation model derivation. Then we’ll implement it and compare the results to the Soyster model from part 1.

References

[1]: Bertsimas, D. and Sim, M., “The Price of Robustness”, Operations Research, 52(1), 2004.