Hessian Matrix
The Hessian matrix is a square matrix that contains all the second-order partial derivatives of a function: $$ H = \begin{pmatrix}\frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\\frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2}\end{pmatrix} $$ The main diagonal contains the pure second derivatives - those taken twice with respect to the same variable. The off-diagonal entries are the mixed partial derivatives - first with respect to one variable, then the other. For twice continuously differentiable functions the mixed partials coincide (Schwarz's theorem), so the Hessian is symmetric.
This matrix is an essential tool for analyzing the nature of critical points in functions of multiple variables.
In particular, for functions of two variables like $f(x, y)$, it helps determine whether a critical point is a local minimum, a local maximum, or a saddle point.
How to use the Hessian to analyze critical points
To classify a critical point as a maximum, a minimum, or a saddle point:
- Identify the critical points: solve the system where the first partial derivatives are zero.
- Construct the Hessian: compute the second-order partial derivatives and form the matrix.
- Evaluate the Hessian at each critical point by plugging in the coordinates.
- Compute the determinant $\Delta$ of the Hessian: $$ \Delta = \left( \frac{\partial^2 f}{\partial x^2} \cdot \frac{\partial^2 f}{\partial y^2} \right) - \left( \frac{\partial^2 f}{\partial x \partial y} \right)^2 $$
- Use the value of the determinant to determine the type of critical point:
- If $\Delta > 0$ and $\frac{\partial^2 f}{\partial x^2} > 0$, the point is a local minimum.
- If $\Delta > 0$ and $\frac{\partial^2 f}{\partial x^2} < 0$, the point is a local maximum.
- If $\Delta < 0$, the point is a saddle point.
- If $\Delta = 0$, the Hessian test is inconclusive, and further analysis is needed.
With just a few algebraic steps, you can determine whether the function attains a maximum, a minimum, or exhibits more complex behavior at a critical point.
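The classification rules above can be sketched in a few lines of plain Python (a minimal illustration of the test, assuming the second partials have already been evaluated at the critical point):

```python
# Two-variable second-derivative test, given the second partial
# derivatives evaluated at a critical point.
def classify_critical_point(fxx, fyy, fxy):
    delta = fxx * fyy - fxy ** 2  # determinant of the Hessian
    if delta > 0 and fxx > 0:
        return "local minimum"
    if delta > 0 and fxx < 0:
        return "local maximum"
    if delta < 0:
        return "saddle point"
    return "inconclusive"  # delta == 0: further analysis needed

print(classify_critical_point(2, 2, 0))  # f = x^2 + y^2 at (0, 0) -> local minimum
```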
A practical example
Consider the function:
$$ f(x, y) = xy $$
The first partial derivatives are:
$$ \frac{\partial f}{\partial x} = y $$
$$ \frac{\partial f}{\partial y} = x $$
These derivatives vanish at the point $(0,0)$, identifying it as a critical point where a local extremum might occur.
To classify it, we turn to the Hessian matrix.
The second-order partial derivatives are:
$$ \frac{\partial^2 f}{\partial x^2} = 0 \qquad \frac{\partial^2 f}{\partial y^2} = 0 \qquad \frac{\partial^2 f}{\partial x \partial y} = 1 $$
Thus, the Hessian is:
$$ H = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} $$
Since the matrix contains no variables, we don’t need to substitute any coordinates.
The determinant is:
$$ \Delta = 0 \cdot 0 - 1 \cdot 1 = -1 $$
The determinant is negative, so the point $(0,0)$ is a saddle point.
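The same conclusion can be reached numerically: the sketch below approximates the second partials of $f(x, y) = xy$ at the origin with central finite differences (the step size $h$ is an arbitrary choice):

```python
def f(x, y):
    return x * y

# Central-difference approximations of the second partials at (x, y).
def hessian_entries(f, x, y, h=1e-4):
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fyy, fxy

fxx, fyy, fxy = hessian_entries(f, 0.0, 0.0)
delta = fxx * fyy - fxy**2
print(fxx, fyy, fxy, delta)  # approximately 0, 0, 1, -1: a saddle point
```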
Example 2
Now consider the function:
$$ f(x,y) = x^4 - 2xy + y^4 $$
The first-order partial derivatives are:
$$ \frac{\partial f}{\partial x} = 4x^3 - 2y \qquad \frac{\partial f}{\partial y} = 4y^3 - 2x $$
To find the critical points, solve the system:
$$ \begin{cases} 4x^3 - 2y = 0 \\ 4y^3 - 2x = 0 \end{cases} \Rightarrow \begin{cases} y = 2x^3 \\ 32x^9 - 2x = 2x\left(16x^8 - 1\right) = 0 \Rightarrow x = 0 \text{ or } x = \pm \frac{1}{\sqrt{2}} \end{cases} $$
This yields three critical points:
- $(0,0)$
- $\left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$
- $\left( -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right)$
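As a sanity check, the gradient should vanish at all three candidate points; a quick numerical confirmation in plain Python:

```python
import math

# Gradient of f(x, y) = x^4 - 2xy + y^4.
def grad(x, y):
    return (4 * x**3 - 2 * y, 4 * y**3 - 2 * x)

r = 1 / math.sqrt(2)
for p in [(0.0, 0.0), (r, r), (-r, -r)]:
    gx, gy = grad(*p)
    assert abs(gx) < 1e-12 and abs(gy) < 1e-12  # gradient vanishes (up to rounding)
```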
Next, compute the second-order partial derivatives:
$$ \frac{\partial^2 f}{\partial x^2} = 12x^2 \qquad \frac{\partial^2 f}{\partial y^2} = 12y^2 \qquad \frac{\partial^2 f}{\partial x \partial y} = -2 $$
So the Hessian becomes:
$$ H = \begin{pmatrix} 12x^2 & -2 \\ -2 & 12y^2 \end{pmatrix} $$
We now evaluate the Hessian at each critical point:
- At $(0,0)$:
$$ H = \begin{pmatrix} 0 & -2 \\ -2 & 0 \end{pmatrix}, \quad \Delta = -4 < 0 $$
The determinant is negative, so this point is a saddle point.
- At $\left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$:
$$ H = \begin{pmatrix} 6 & -2 \\ -2 & 6 \end{pmatrix}, \quad \Delta = 36 - 4 = 32 > 0 $$
Since both the determinant and $\frac{\partial^2 f}{\partial x^2}$ are positive, this is a local minimum.
- At $\left( -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right)$:
The Hessian is identical to the previous case:
$$ H = \begin{pmatrix} 6 & -2 \\ -2 & 6 \end{pmatrix} $$
By symmetry, this point is also a local minimum.
In summary, the function has two local minima at $\left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$ and $\left( -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right)$, and a saddle point at $(0,0)$.
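The whole classification can be replayed with the exact second partials ($f_{xx} = 12x^2$, $f_{yy} = 12y^2$, $f_{xy} = -2$); a short Python sketch:

```python
import math

# Second-derivative test for f(x, y) = x^4 - 2xy + y^4, using its
# exact second partials: f_xx = 12x^2, f_yy = 12y^2, f_xy = -2.
def classify(x, y):
    fxx, fyy, fxy = 12 * x**2, 12 * y**2, -2
    delta = fxx * fyy - fxy**2
    if delta > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    return "saddle point" if delta < 0 else "inconclusive"

r = 1 / math.sqrt(2)
for point in [(0, 0), (r, r), (-r, -r)]:
    print(point, classify(*point))
```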
Hessian Matrix with Constraints
When a function is subject to one or more constraints, the standard Hessian test for critical points no longer applies directly: the constraint restricts the directions along which the function is allowed to vary.
In such cases, the Lagrange multiplier method is used to locate constrained critical points. Their nature is then assessed using an augmented Hessian matrix.
To begin, we define the Lagrangian function $ \mathcal{L} $ as:
$$ \mathcal{L}(x, y, \lambda) = f(x, y) - \lambda g(x, y) $$
Here, $f(x,y)$ is the objective function, and $g(x,y) = 0$ defines the constraint - that is, the equation describing the admissible domain.
We compute the partial derivatives of the Lagrangian and set them equal to zero:
$$ \begin{cases} \frac{\partial \mathcal{L}}{\partial x} = 0 \\ \frac{\partial \mathcal{L}}{\partial y} = 0 \\ \frac{\partial \mathcal{L}}{\partial \lambda} = 0 \end{cases} \quad \Rightarrow \quad \text{Lagrange system} $$
Solving this system yields the constrained critical points of the function.
To determine whether a constrained critical point corresponds to a local maximum or minimum, we construct the augmented Hessian matrix, which incorporates the second derivatives of the Lagrangian as well as the gradient of the constraint:
$$ H_L = \begin{pmatrix} \frac{\partial^2 \mathcal{L}}{\partial x^2} & \frac{\partial^2 \mathcal{L}}{\partial x \partial y} & \frac{\partial g}{\partial x} \\ \frac{\partial^2 \mathcal{L}}{\partial y \partial x} & \frac{\partial^2 \mathcal{L}}{\partial y^2} & \frac{\partial g}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & 0 \end{pmatrix} $$
This augmented Hessian accounts for both the curvature of the objective function and the geometric structure imposed by the constraint.
This bordered form is the standard second-order test for rigorously classifying constrained critical points.
Note. Some references discuss the Hessian of the Lagrangian: $$ H = \begin{pmatrix} \frac{\partial^2 \mathcal{L}}{\partial x^2} & \frac{\partial^2 \mathcal{L}}{\partial x \partial y} \\ \frac{\partial^2 \mathcal{L}}{\partial y \partial x} & \frac{\partial^2 \mathcal{L}}{\partial y^2} \end{pmatrix} $$ However, this matrix alone is insufficient: it neglects the directions along which movement is constrained. Only the augmented Hessian yields a complete and conclusive test.
Once the matrix $H_L$ is constructed, we evaluate it at each critical point, compute its determinant, and examine its sign. For two variables and one constraint:
- If the determinant is positive, the point is a constrained local maximum.
- If the determinant is negative, it is a constrained local minimum.
- If the determinant vanishes, the test is inconclusive, and further analysis is required.
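For the two-variable, one-constraint case, the determinant of this $3 \times 3$ bordered matrix has a compact closed form; the sketch below encodes it together with the sign rule (a minimal illustration of the test, not a general-purpose implementation):

```python
# det of [[Lxx, Lxy, gx], [Lxy, Lyy, gy], [gx, gy, 0]] via cofactor
# expansion, simplified to a closed form.
def bordered_hessian_det(Lxx, Lxy, Lyy, gx, gy):
    return 2 * Lxy * gx * gy - Lxx * gy**2 - Lyy * gx**2

# Sign rule for two variables and one constraint.
def classify_constrained(Lxx, Lxy, Lyy, gx, gy):
    d = bordered_hessian_det(Lxx, Lxy, Lyy, gx, gy)
    if d > 0:
        return "constrained local maximum"
    if d < 0:
        return "constrained local minimum"
    return "inconclusive"
```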
Example: The Hessian Matrix under a Constraint
We analyze the function:
$$ f(x, y) = 3x + 4y $$
subject to the constraint that $(x, y)$ lies on the boundary of the circle:
$$ x^2 + y^2 = 25 $$
We rewrite the constraint in standard form: $ g(x,y) = 0 $
$$ g(x, y) = x^2 + y^2 - 25 = 0 $$
Next, we construct the Lagrangian, here written with a plus sign in front of the multiplier (the choice of sign only flips the sign of $\lambda$, not the critical points): $ L(x, y, \lambda) = f(x, y) + \lambda \cdot g(x, y) $
$$ L(x, y, \lambda) = 3x + 4y + \lambda(x^2 + y^2 - 25) $$
We compute the partial derivatives with respect to $x$, $y$, and $\lambda$, and set them to zero:
$$ \begin{cases} \frac{\partial L}{\partial x} = 3 + 2\lambda x = 0 \\ \frac{\partial L}{\partial y} = 4 + 2\lambda y = 0 \\ x^2 + y^2 = 25 \end{cases} $$
Solving the first two equations:
$$ \lambda = -\frac{3}{2x} \qquad \lambda = -\frac{2}{y} $$
Equating the two expressions for $\lambda$:
$$ \frac{3}{2x} = \frac{2}{y} \Rightarrow 3y = 4x \Rightarrow y = \frac{4}{3}x $$
Substitute into the constraint equation:
$$ x^2 + \left( \frac{4}{3}x \right)^2 = 25 \Rightarrow x^2 + \frac{16}{9}x^2 = 25 \Rightarrow \frac{25}{9}x^2 = 25 $$
Solving for $x$:
$$ x^2 = 9 \Rightarrow x = \pm 3 $$
Substituting back into $y = \frac{4}{3}x$, we find $y = 4$ when $x = 3$ and $y = -4$ when $x = -3$.
Thus, the constrained critical points are:
- $ P_1 = (3, 4) $
- $ P_2 = (-3, -4) $
Evaluate $f$ at these points:
- $f(3, 4) = 3 \cdot 3 + 4 \cdot 4 = 9 + 16 = 25$
- $f(-3, -4) = -9 - 16 = -25$
These values already suggest that:
- $P_1 = (3, 4)$ is a constrained maximum
- $P_2 = (-3, -4)$ is a constrained minimum
We now construct the augmented Hessian:
$$ H_L = \begin{pmatrix} L_{xx} & L_{xy} & g_x \\ L_{yx} & L_{yy} & g_y \\ g_x & g_y & 0 \end{pmatrix} $$
The required derivatives are:
$$ L_{xx} = 2\lambda \qquad L_{yy} = 2\lambda \qquad L_{xy} = L_{yx} = 0 $$
$$ g_x = 2x \qquad g_y = 2y $$
So the matrix becomes:
$$ H_L = \begin{pmatrix} 2\lambda & 0 & 2x \\ 0 & 2\lambda & 2y \\ 2x & 2y & 0 \end{pmatrix} $$
- At $P_1 = (3, 4)$:
We have $x = 3$, $y = 4$, and from the Lagrange equations, $\lambda = -\frac{1}{2}$. Substituting in:
$$ H_L = \begin{pmatrix} -1 & 0 & 6 \\ 0 & -1 & 8 \\ 6 & 8 & 0 \end{pmatrix} $$
Expanding the determinant along the first row:
$$ \Delta = -1 \cdot (0 - 64) + 6 \cdot (0 + 6) = 64 + 36 = 100 > 0 $$
So $P_1$ is confirmed as a constrained maximum.
- At $P_2 = (-3, -4)$:
Here $x = -3$, $y = -4$, and $\lambda = \frac{1}{2}$. The matrix becomes:
$$ H_L = \begin{pmatrix} 1 & 0 & -6 \\ 0 & 1 & -8 \\ -6 & -8 & 0 \end{pmatrix} $$
The determinant is:
$$ \Delta = 1 \cdot (0 - 64) + (-6) \cdot (0 + 6) = -64 - 36 = -100 < 0 $$
Therefore, $P_2$ is a constrained minimum.
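The two bordered-Hessian determinants can be cross-checked in plain Python with an explicit $3 \times 3$ cofactor expansion:

```python
# Determinant of a 3x3 matrix by cofactor expansion along the first row.
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

H1 = [[-1, 0, 6], [0, -1, 8], [6, 8, 0]]    # P1 = (3, 4), lambda = -1/2
H2 = [[1, 0, -6], [0, 1, -8], [-6, -8, 0]]  # P2 = (-3, -4), lambda = 1/2
print(det3(H1), det3(H2))  # 100 -100
```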
Both direct evaluation and the augmented Hessian method lead to the same conclusion:
- The constrained maximum occurs at $P_1 = (3,4)$
- The constrained minimum occurs at $P_2 = (-3, -4)$
The augmented Hessian provides a rigorous framework for verifying the nature of constrained critical points - especially when the function is nonlinear and simple evaluation of $f(x,y)$ is insufficient.