Resource Garage: Hessian Matrix and Jacobian Matrix

Hessian Matrix

In mathematics, the Hessian matrix or Hessian is a square matrix of second-order partial derivatives of a function. It describes the local curvature of a function of many variables. The Hessian matrix was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Hesse originally used the term "functional determinants".

Given the real-valued function

$f(x_1, x_2, \dots, x_n),\,\!$

if all second partial derivatives of f exist and are continuous over the domain of the function, then the Hessian matrix of f is

$H(f)_{ij}(\mathbf x) = D_i D_j f(\mathbf x)\,\!$

where x = (x₁, x₂, ..., x_n) and D_i is the differentiation operator with respect to the ith argument. Thus

$H(f) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n} \\[2.2ex] \dfrac{\partial^2 f}{\partial x_2\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2\,\partial x_n} \\[2.2ex] \vdots & \vdots & \ddots & \vdots \\[2.2ex] \dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_n\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}.$

Because f is often clear from context, $H(f)(\mathbf x)$ is frequently abbreviated to $H(\mathbf x)$ .

The Hessian matrix is related to the Jacobian matrix by $H(f)(\mathbf x)$ = $J(\nabla \! f)(\mathbf x)$ .

The determinant of the above matrix is also sometimes referred to as the Hessian.^[1]

Hessian matrices are used in large-scale optimization problems within Newton-type methods because they are the coefficient of the quadratic term of a local Taylor expansion of a function. That is,

$y=f(\mathbf{x}+\Delta\mathbf{x})\approx f(\mathbf{x}) + J(\mathbf{x})\Delta \mathbf{x} +\frac{1}{2} \Delta\mathbf{x}^\mathrm{T} H(\mathbf{x}) \Delta\mathbf{x}$

where J is the Jacobian matrix, which is a vector (the gradient) for scalar-valued functions. The full Hessian matrix can be difficult to compute in practice; in such situations, quasi-Newton algorithms have been developed that use approximations to the Hessian. The best-known quasi-Newton algorithm is the BFGS algorithm

Jacobian Matrix

In vector calculus, the Jacobian matrix is the matrix of all first-order partial derivatives of a vector-valued function. Specifically, suppose $F : \mathbb{R}^n \rightarrow \mathbb{R}^m$ is a function (which takes as input real n-tuples and produces as output real m-tuples). Such a function is given by m real-valued component functions, $F_1(x_1,\ldots,x_n),\ldots,F_m(x_1,\ldots,x_n)$ . The partial derivatives of all these functions with respect to the variables $x_1,\ldots,x_n$ (if they exist) can be organized in an m-by-n matrix, the Jacobian matrix $J$ of $F$ , as follows:

$J=\begin{bmatrix} \dfrac{\partial F_1}{\partial x_1} & \cdots & \dfrac{\partial F_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_m}{\partial x_1} & \cdots & \dfrac{\partial F_m}{\partial x_n} \end{bmatrix}.$

Resource Garage

Pages

Sunday, May 26, 2013

Hessian Matrix and Jacobian Matrix

No comments:

Post a Comment