Matrix Derivative

The derivative of a matrix with respect to either a scalar or a vector variable is computed element by element: each entry of the matrix is differentiated just as an ordinary scalar function would be.

Below are the main cases with practical examples:

Derivative of a Matrix with Respect to a Scalar Variable

The derivative of a matrix \( A(t) \), whose elements depend on a scalar variable \( t \), is a new matrix where each element is obtained by differentiating the corresponding element of \( A(t) \) with respect to \( t \).

A Practical Example 

Let’s consider this 2x2 matrix:

$$ \mathbf{A}(t) = \begin{bmatrix} t^2 & \sin(t) \\ e^t & t + 1 \end{bmatrix} $$

This is a matrix function \( \mathbf{A}(t) \) that depends on the scalar variable \( t \).

To find the derivative of \( \mathbf{A}(t) \) with respect to \( t \), I differentiate each element of the matrix separately with respect to \( t \).

The derivatives of each element are:

  • \( \frac{d}{dt} \left( t^2 \right) = 2t \)
  • \( \frac{d}{dt} \left( \sin(t) \right) = \cos(t) \)
  • \( \frac{d}{dt} \left( e^t \right) = e^t \)
  • \( \frac{d}{dt} \left( t + 1 \right) = 1 \)

Therefore, the derivative of \( \mathbf{A}(t) \) with respect to \( t \) is:

$$ \frac{d\mathbf{A}(t)}{dt} = \begin{bmatrix} 2t & \cos(t) \\ e^t & 1 \end{bmatrix} $$

This illustrates a matrix derivative where the elements depend on a scalar variable \( t \). 
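As a quick check, here is a minimal SymPy sketch (the choice of library is mine, not part of the computation above): differentiating a SymPy Matrix differentiates each element, exactly as done by hand.

```python
import sympy as sp

t = sp.symbols('t')

# The matrix A(t) from the example above
A = sp.Matrix([[t**2,      sp.sin(t)],
               [sp.exp(t), t + 1    ]])

# Differentiating a Matrix acts element by element
dA_dt = A.diff(t)

print(dA_dt)  # Matrix([[2*t, cos(t)], [exp(t), 1]])
```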

Derivative of a Matrix with Respect to a Vector

The derivative of a matrix \( A(\mathbf{x}) \), whose elements depend on a vector \( \mathbf{x} = [x_1, x_2, \ldots, x_n]^T \), collects the partial derivatives of every element of \( A(\mathbf{x}) \) with respect to every component of \( \mathbf{x} \).

Since there is one matrix of partial derivatives for each component of \( \mathbf{x} \), the result is a three-dimensional array (a tensor) rather than an ordinary matrix.

A Practical Example

Consider a matrix \( \mathbf{B}(\mathbf{x}) \) dependent on a vector \( \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \).

$$ \mathbf{B}(\mathbf{x}) = \begin{bmatrix} x_1^2 & x_1 x_2 \\ x_1 + x_2 & x_2^2 \end{bmatrix} $$

To find the derivative of matrix \( \mathbf{B} \) with respect to vector \( \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \), I calculate the partial derivatives of each matrix element with respect to each vector component \( x_1 \) and \( x_2 \).

The partial derivatives with respect to \( x_1 \) are as follows:

  • \( \frac{\partial}{\partial x_1} (x_1^2) = 2x_1 \)
  • \( \frac{\partial}{\partial x_1} (x_1 x_2) = x_2 \)
  • \( \frac{\partial}{\partial x_1} (x_1 + x_2) = 1 \)
  • \( \frac{\partial}{\partial x_1} (x_2^2) = 0 \)

Thus, the matrix of partial derivatives with respect to \( x_1 \) is:

$$ \frac{\partial \mathbf{B}}{\partial x_1} = \begin{bmatrix} 2x_1 & x_2 \\ 1 & 0 \end{bmatrix} $$

The partial derivatives with respect to \( x_2 \) are:

  • \( \frac{\partial}{\partial x_2} (x_1^2) = 0 \)
  • \( \frac{\partial}{\partial x_2} (x_1 x_2) = x_1 \)
  • \( \frac{\partial}{\partial x_2} (x_1 + x_2) = 1 \)
  • \( \frac{\partial}{\partial x_2} (x_2^2) = 2x_2 \)

So, the matrix of partial derivatives with respect to \( x_2 \) is:

$$ \frac{\partial \mathbf{B}}{\partial x_2} = \begin{bmatrix} 0 & x_1 \\ 1 & 2x_2 \end{bmatrix} $$

By combining these partial derivatives, we form the Jacobian of \( \mathbf{B} \), which can be visualized as a three-dimensional array (a tensor):

$$ \mathbf{J}(\mathbf{x}) = \begin{bmatrix} \frac{\partial \mathbf{B}}{\partial x_1}, \frac{\partial \mathbf{B}}{\partial x_2} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 2x_1 & x_2 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & x_1 \\ 1 & 2x_2 \end{bmatrix} \end{bmatrix} $$

The final result is the Jacobian of \( \mathbf{B} \) with respect to \( \mathbf{x} \).

In this case, since \( \mathbf{B} \) has dimensions \( 2 \times 2 \) and \( \mathbf{x} \) has dimensions \( 2 \times 1 \), the Jacobian is a tensor of size \( 2 \times 2 \times 2 \) rather than an ordinary matrix.
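If it helps, the same kind of check can be done here with SymPy (again, my own choice of tool): each call to diff produces one \( 2 \times 2 \) slice, and derive_by_array stacks the slices into the \( 2 \times 2 \times 2 \) tensor.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# The matrix B(x) from the example above
B = sp.Matrix([[x1**2,   x1*x2],
               [x1 + x2, x2**2]])

# One 2x2 matrix of partial derivatives per component of x
dB_dx1 = B.diff(x1)   # Matrix([[2*x1, x2], [1, 0]])
dB_dx2 = B.diff(x2)   # Matrix([[0, x1], [1, 2*x2]])

# Stacking both slices gives the full derivative as a tensor
J = sp.derive_by_array(B, (x1, x2))
print(J.shape)  # (2, 2, 2)
```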

Example 2

Consider a function mapping a vector \( \mathbf{x} = [x, y]^T \) to a matrix \( A(\mathbf{x}) = \begin{bmatrix} xy & x^2 \\ y^2 & xy \end{bmatrix} \).

To compute the derivative of \( A \) with respect to \( \mathbf{x} \), I find all the partial derivatives:

  • Derivative with respect to \( x \): $$ \frac{\partial A}{\partial x} = \begin{bmatrix} y & 2x \\ 0 & y \end{bmatrix} $$
  • Derivative with respect to \( y \): $$ \frac{\partial A}{\partial y} = \begin{bmatrix} x & 0 \\ 2y & x \end{bmatrix} $$

These matrices form the Jacobian of the matrix function with respect to the vector \( \mathbf{x} \).

$$ \mathbf{J}(\mathbf{x}) = \begin{bmatrix} \frac{\partial A}{\partial x} & \frac{\partial A}{\partial y} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} y & 2x \\ 0 & y \end{bmatrix} & \begin{bmatrix} x & 0 \\ 2y & x \end{bmatrix} \end{bmatrix} $$
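As before, a short SymPy sketch (my own verification, not part of the notes) confirms the two slices and also evaluates them at a sample point, say \( (x, y) = (1, 2) \).

```python
import sympy as sp

x, y = sp.symbols('x y')

A = sp.Matrix([[x*y,  x**2],
               [y**2, x*y ]])

dA_dx = A.diff(x)  # Matrix([[y, 2*x], [0, y]])
dA_dy = A.diff(y)  # Matrix([[x, 0], [2*y, x]])

# Evaluating both slices at the sample point (x, y) = (1, 2)
print(dA_dx.subs({x: 1, y: 2}))  # Matrix([[2, 2], [0, 2]])
print(dA_dy.subs({x: 1, y: 2}))  # Matrix([[1, 0], [4, 1]])
```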

The same procedure extends to matrices and vectors of any size.

 
 

Please feel free to point out any errors or typos, or share suggestions to improve these notes. English isn't my first language, so if you notice any mistakes, let me know, and I'll be sure to fix them.
