D.48 About La­grangian mul­ti­pli­ers

This note will derive the Lagrangian multipliers for an example problem. Only calculus will be used. The example problem will be to find a stationary point of a function $f$ of four variables if there are two constraints. Different numbers of variables and constraints work out in much the same way as this example.

The four variables that the example function $f$ depends on will be denoted by $x_1$, $x_2$, $x_3$, and $x_4$. The two constraints will be taken to be equations of the form $g(x_1,x_2,x_3,x_4)=0$ and $h(x_1,x_2,x_3,x_4)=0$, for suitable functions $g$ and $h$. Constraints can always be brought into such a form by taking everything in the constraint's equation to the left-hand side of the equals sign.

So the ex­am­ple prob­lem is:

\mbox{stationarize:} && f(x_1,x_2,x_3,x_4) \\
\mbox{subject to:} && g(x_1,x_2,x_3,x_4)=0,\ h(x_1,x_2,x_3,x_4)=0

Sta­tion­ar­ize means to find lo­ca­tions where the func­tion has a min­i­mum or a max­i­mum, or any other point where it does not change un­der small changes of the vari­ables $x_1,x_2,x_3,x_4$ as long as these sat­isfy the con­straints.

The first thing to note is that rather than considering $f$ to be a function of $x_1,x_2,x_3,x_4$, you can consider it instead to be a function of $g$ and $h$ and only two additional variables from $x_1,x_2,x_3,x_4$, say $x_3$ and $x_4$:

f(x_1,x_2,x_3,x_4)=\tilde f(g,h,x_3,x_4)

The rea­son you can do that is that you should in prin­ci­ple be able to re­con­struct the two miss­ing vari­ables $x_1$ and $x_2$ given $g$, $h$, $x_3$, and $x_4$.

As a re­sult, any small change in the func­tion $f$, re­gard­less of con­straints, can be writ­ten us­ing the ex­pres­sion for a to­tal dif­fer­en­tial as:

{\rm d}f =
\frac{\partial \tilde f}{\partial g}{\rm d}g +
\frac{\partial \tilde f}{\partial h}{\rm d}h +
\frac{\partial \tilde f}{\partial x_3}{\rm d}x_3 +
\frac{\partial \tilde f}{\partial x_4}{\rm d}x_4.

At the desired stationary point, acceptable changes in variables are those that keep $g$ and $h$ constant at zero; they have ${\rm d}g=0$ and ${\rm d}h=0$. So for $f$ to be stationary under all acceptable changes of variables, you must have that the final two terms are zero for any changes in variables. This means that the partial derivatives in the final two terms must be zero since the changes ${\rm d}x_3$ and ${\rm d}x_4$ can be arbitrary.
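As an illustration, the change of variables and the vanishing derivatives can be checked numerically. The sketch below assumes a hypothetical example that is not taken from the text: $f=x_1^2+x_2^2+x_3^2+x_4^2$ with constraints $g=x_1+x_2+x_3+x_4-1$ and $h=x_1-x_2-1$. The helper names `reconstruct`, `f_tilde`, and `central_diff` are made up for this sketch, and the stationary point used is the one that belongs to this assumed example.

```python
# Hypothetical example (an assumption, not from the text):
#   f = x1^2 + x2^2 + x3^2 + x4^2
#   g = x1 + x2 + x3 + x4 - 1
#   h = x1 - x2 - 1

def reconstruct(g, h, x3, x4):
    """Recover x1 and x2 from g, h, x3, x4 by inverting the definitions of g and h."""
    total = g + 1 - x3 - x4   # equals x1 + x2
    diff = h + 1              # equals x1 - x2
    return (total + diff) / 2, (total - diff) / 2

def f_tilde(g, h, x3, x4):
    """The same function f, expressed in the variables g, h, x3, x4."""
    x1, x2 = reconstruct(g, h, x3, x4)
    return x1**2 + x2**2 + x3**2 + x4**2

def central_diff(func, args, index, step=1e-6):
    """Central-difference estimate of the partial derivative of func in one argument."""
    up = list(args)
    down = list(args)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)

# Stationary point of the assumed example, in the variables (g, h, x3, x4).
point = (0.0, 0.0, 0.25, 0.25)

x1, x2 = reconstruct(*point)               # the two "missing" variables
df_dx3 = central_diff(f_tilde, point, 2)   # should vanish
df_dx4 = central_diff(f_tilde, point, 3)   # should vanish
df_dg = central_diff(f_tilde, point, 0)    # need not vanish

print("x1, x2 =", x1, x2)
print("d f~/d x3 =", df_dx3, " d f~/d x4 =", df_dx4)
print("d f~/d g =", df_dg)
```

For this assumed example the derivatives with respect to $x_3$ and $x_4$ come out as zero to within rounding, while the derivative with respect to $g$ does not.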

For changes in variables that do go out of bounds, the change in $f$ will not be zero; that change will be given by the first two terms on the right-hand side. So, the erroneous changes in $f$ due to going out of bounds are these first two terms, and if we subtract them, we get zero net change for any arbitrary change in variables:

{\rm d}f -
\frac{\partial \tilde f}{\partial g}{\rm d}g -
\frac{\partial \tilde f}{\partial h}{\rm d}h = 0 \mbox{ always.}

In other words, if we pe­nal­ize the change in $f$ for go­ing out of bounds by amounts ${\rm d}{g}$ and ${\rm d}{h}$ at the rate above, any change in vari­ables will pro­duce a pe­nal­ized change of zero, whether it stays within bounds or not.

The two derivatives at the stationary point in the expression above are the Lagrangian multipliers or penalty factors, call them $\epsilon_1=\partial\tilde{f}/\partial{g}$ and $\epsilon_2=\partial\tilde{f}/\partial{h}$. In those terms

{\rm d}f - \epsilon_1 {\rm d}g - \epsilon_2{\rm d}h = 0

for any change in the variables $g,h,x_3,x_4$, and therefore for any change in the original variables $x_1,x_2,x_3,x_4$. Therefore, the change in the penalized function

f - \epsilon_1 g - \epsilon_2 h

is zero for any change in the variables $x_1,x_2,x_3,x_4$.
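A small numerical check may make the role of the penalty terms more concrete. The sketch below again assumes a hypothetical example that is not from the text: $f=x_1^2+x_2^2+x_3^2+x_4^2$, $g=x_1+x_2+x_3+x_4-1$, $h=x_1-x_2-1$. For this assumed example the stationary point is $(3/4,-1/4,1/4,1/4)$ with $\epsilon_1=1/2$ and $\epsilon_2=1$; these values are hardcoded here as assumptions. The displacement `dx` is an arbitrary small step that deliberately violates both constraints.

```python
# Hypothetical example (an assumption, not from the text).
def f(x):
    return sum(v * v for v in x)

def g(x):
    return x[0] + x[1] + x[2] + x[3] - 1

def h(x):
    return x[0] - x[1] - 1

# Stationary point and multipliers of the assumed example.
stationary = [0.75, -0.25, 0.25, 0.25]
eps1, eps2 = 0.5, 1.0

def penalized(x):
    """The penalized function f - eps1*g - eps2*h."""
    return f(x) - eps1 * g(x) - eps2 * h(x)

# An arbitrary small step that goes "out of bounds": g and h both change.
dx = [1e-3, -2e-3, 3e-3, 5e-4]
moved = [a + b for a, b in zip(stationary, dx)]

change_f = f(moved) - f(stationary)
change_g = g(moved) - g(stationary)
change_h = h(moved) - h(stationary)
change_penalized = penalized(moved) - penalized(stationary)

print("change in f:          ", change_f)
print("change in g, h:       ", change_g, change_h)
print("change in penalized f:", change_penalized)
```

In this sketch the change in $f$ itself is of the same order as the step, while the change in the penalized function is only of the order of the step squared, which is what zero change to first order means for a finite step.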

In practical application, explicitly computing the Lagrangian multipliers $\epsilon_1$ and $\epsilon_2$ as the derivatives of function $\tilde{f}$ is not needed. You get four equations by putting the derivatives of the penalized function $f-\epsilon_1g-\epsilon_2h$ with respect to $x_1$ through $x_4$ equal to zero, and the two constraints provide two more equations. Six equations are enough to find the six unknowns $x_1$ through $x_4$, $\epsilon_1$ and $\epsilon_2$.
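The six-equation procedure can be sketched for a case where the equations happen to be linear. The example below is hypothetical and not from the text: $f=x_1^2+x_2^2+x_3^2+x_4^2$, $g=x_1+x_2+x_3+x_4-1$, $h=x_1-x_2-1$. Setting the derivatives of $f-\epsilon_1g-\epsilon_2h$ with respect to $x_1$ through $x_4$ to zero gives four linear equations, and the two constraints give two more. The helper `solve_linear` is a plain Gauss-Jordan elimination written for this sketch; for a general nonlinear $f$, $g$, or $h$ a nonlinear solver would be needed instead.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (no nonzero pivot found).
    """
    n = len(rhs)
    # Augmented matrix with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]

# Unknowns, in order: x1, x2, x3, x4, eps1, eps2.
equations = [
    [2, 0, 0, 0, -1, -1],   # d/dx1: 2*x1 - eps1 - eps2 = 0
    [0, 2, 0, 0, -1,  1],   # d/dx2: 2*x2 - eps1 + eps2 = 0
    [0, 0, 2, 0, -1,  0],   # d/dx3: 2*x3 - eps1 = 0
    [0, 0, 0, 2, -1,  0],   # d/dx4: 2*x4 - eps1 = 0
    [1, 1, 1, 1,  0,  0],   # g = 0: x1 + x2 + x3 + x4 = 1
    [1, -1, 0, 0, 0,  0],   # h = 0: x1 - x2 = 1
]
right_hand_side = [0, 0, 0, 0, 1, 1]

solution = solve_linear(equations, right_hand_side)
x1, x2, x3, x4, eps1, eps2 = solution
print("x =", x1, x2, x3, x4)
print("eps1 =", eps1, " eps2 =", eps2)
```

Exact rational arithmetic is used only so that the result can be compared without rounding concerns; floating-point elimination would serve equally well for this small system.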