From: rusin@vesuvius.math.niu.edu (Dave Rusin)
Subject: Re: Hessian of a matrix
Date: 12 Jul 1999 14:39:25 GMT
Newsgroups: sci.math

In article <7mbg90$a08$1@nnrp1.deja.com>,  <pi_3_14159@hotmail.com> wrote:

>Would someone be so kind as to define the Hessian of a matrix and outline a
>few of its properties, please?

Is it possible you mean "the Hessian matrix"? Given a function  f : R^n -> R 
twice-differentiable at some point  a  in  R^n, the Hessian matrix of  f
at  a  is the  n x n  matrix whose  (i,j)  entry is  d^2f/dx_i dx_j.
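
To make the definition concrete, here is a minimal sketch that estimates
the entries  d^2f/dx_i dx_j  at a point by central differences. The helper
name  hessian, the step size  h, and the sample function are illustrative
assumptions, not anything from a particular library.

```python
def hessian(f, a, h=1e-4):
    """Approximate the n x n Hessian of f at the point a (a list of floats)."""
    n = len(a)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f with coordinate i moved by si*h and j by sj*h.
                p = list(a)
                p[i] += si * h
                p[j] += sj * h
                return f(p)
            # Central-difference estimate of d^2f/dx_i dx_j at a.
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def f(x):
    # Sample function (an assumption): f(x, y) = x^2 y + y^3.
    return x[0] ** 2 * x[1] + x[1] ** 3


# By hand, the Hessian at (1, 2) is [[2y, 2x], [2x, 6y]] = [[4, 2], [2, 12]].
H = hessian(f, [1.0, 2.0])
```

The estimate should be symmetric up to rounding, which matches the equality
of mixed partials for sufficiently smooth  f.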

It's relevant because the Taylor series for f at a  begins
     f(a) + grad(f)(a).(v-a) + (v-a)^t H(f)(a) (v-a) + ...
that is, (for sufficiently well-behaved functions) the Hessian captures the
behaviour of the function near  a  up to the quadratic level.

In particular, it is the signature of  H  at a critical point  a  which
determines whether  a  is a local minimum, a local maximum, or some kind
of saddle point (provided  H  is nonsingular there; otherwise the test is
inconclusive). You can use any of the tests you learned in linear
algebra to decide whether  H  is positive definite (a local minimum) or
negative definite (a local maximum).
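
One such linear-algebra test is Sylvester's criterion, which looks at the
signs of the leading principal minors. The sketch below is only one possible
choice of test; the names  det  and  classify  are made up for illustration,
and the cofactor expansion is only sensible for small matrices.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def classify(H):
    """Classify a critical point from its symmetric Hessian H (list of lists)."""
    n = len(H)
    # Leading principal minors D_1, ..., D_n.
    minors = [det([row[:k] for row in H[:k]]) for k in range(1, n + 1)]
    if all(d > 0 for d in minors):
        return "local minimum"      # H positive definite
    if all((-1) ** k * d > 0 for k, d in enumerate(minors, start=1)):
        return "local maximum"      # H negative definite
    if minors[-1] != 0:
        return "saddle"             # nonsingular but indefinite
    return "degenerate"             # det H = 0: this test does not decide


results = [
    classify([[2, 0], [0, 3]]),
    classify([[-2, 0], [0, -3]]),
    classify([[1, 0], [0, -1]]),
    classify([[1, 0], [0, 0]]),
]
```

With floating-point entries the exact comparisons against zero would need
a tolerance; that is left out here to keep the sketch short.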

dave
==============================================================================

From: "Ingram" <ingram@mail.telepac.pt>
Subject: Hessian matrix
Date: Tue, 13 Jul 1999 02:15:52 +0100
Newsgroups: [missing]
To: "Dave Rusin" <rusin@vesuvius.math.niu.edu>

Hi, thanks for replying,

: Is it possible you mean "the Hessian matrix"? Given a function  f : R^n -> R
: twice-differentiable at some point  a  in  R^n, the Hessian matrix of  f
: at  a  is the  n x n  matrix whose  (i,j)  entry is  d^2f/dx_i dx_j.

This is great, thanks.

: 
: It's relevant because the Taylor series for f at a  begins
:      f(a) + grad(f)(a).(v-a) + (v-a)^t H(f)(a) (v-a) + ...
: that is, (for sufficiently well-behaved functions) the Hessian captures the
: behaviour of the function near  a  up to the quadratic level.

This I don't fully get... What do you mean in the first line? I understand
it to mean f(a) begins..., yet you are using f(a) in the next line.  Is
grad(f)(a) = grad(f(a))? If not, I don't get this either, and what is v,
please? A general vector? And the power of t???  Also is there a dot
product missing after H(f)(a)?

Also, in the final paragraph, to determine the "signature"? By this do you
mean the determinant? Sorry to be so dense!

Thanks for your help,
Piers Ingram
==============================================================================

From: Dave Rusin <rusin@math.niu.edu>
Subject: Re:  Hessian matrix
Date: Mon, 12 Jul 1999 21:00:56 -0500 (CDT)
Newsgroups: [missing]
To: ingram@mail.telepac.pt

Just noticed a typo: you need a "1/2" multiplying the quadratic term,
just like the 1-dimensional Taylor series, in
>      f(a) + grad(f)(a).(v-a) + (v-a)^t H(f)(a) (v-a) + ...
Now, what this series (when corrected) is approximating is  f(v)  where
v  is a point near  a. Would it help if I called it "the Taylor series for f at 
points near a"? I just mean the same idea as for 1-dimensional functions.

>Is grad(f)(a) = grad(f(a))?
Um, well, yes and no. If  f  is a function, "grad(f)" is an n-tuple of
functions, namely   (df/dx_1, ..., df/dx_n). You need to evaluate this
n-tuple at  a, that is, you need  grad(f)(a) = ( df/dx_1(a), ..., df/dx_n(a) ).
It isn't really right to call it  "grad(f(a))", since if  f  is a function
and  a  is a point, then  f(a)  is a number; what is  grad  of a number? zero?
I don't know, but it's not what I want in any case.
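
The distinction may be easier to see in code: grad  takes a function and
returns a new function, which is then evaluated at  a. This is only a
sketch using central differences; the step size  h  and the sample
function are assumptions made for illustration.

```python
def grad(f, h=1e-6):
    """Return the function a -> (df/dx_1(a), ..., df/dx_n(a)), approximately."""
    def grad_f(a):
        out = []
        for i in range(len(a)):
            up = list(a)
            up[i] += h
            down = list(a)
            down[i] -= h
            out.append((f(up) - f(down)) / (2 * h))
        return out
    return grad_f


def f(x):
    # Sample function (an assumption): f(x, y) = x^2 + 3xy.
    return x[0] ** 2 + 3 * x[0] * x[1]


a = [1.0, 2.0]
g = grad(f)(a)   # grad(f) is a function; evaluating it at a gives a vector
# By hand, grad(f) = (2x + 3y, 3x), so grad(f)(a) = (8, 3).
# grad(f(a)) would pass the number f(a) = 7 where a function is expected.
```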

>And the power of t?
Eh? There is no  t  here. Maybe your "t" is my "v"?

>is there a dot product missing after H(f)(a)?
Well, I guess there is _something_ missing.  H(f)(a)  is a matrix,
(v-a) is a (column) vector, and  (v-a)^t  is its transpose. You need to
take the matrix product of  (row vector) times (matrix) times (column vector)
(times 1/2, sorry again for the typo).
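
Spelled out as a computation, with the 1/2 included, the approximation
might look like the sketch below. The function name  quadratic_taylor  and
the sample function are illustrative assumptions; the gradient and Hessian
are worked out by hand for that sample.

```python
def quadratic_taylor(fa, g, H, a, v):
    """f(a) + grad(f)(a).(v-a) + (1/2) (v-a)^t H(f)(a) (v-a)."""
    d = [vi - ai for vi, ai in zip(v, a)]                # the vector v - a
    linear = sum(gi * di for gi, di in zip(g, d))        # dot product
    Hd = [sum(H[i][j] * d[j] for j in range(len(d)))     # matrix times column
          for i in range(len(d))]
    quadratic = sum(di * hdi for di, hdi in zip(d, Hd))  # row times the result
    return fa + linear + 0.5 * quadratic


def f(x):
    # Sample function (an assumption): f(x, y) = x^2 y + y^3.
    return x[0] ** 2 * x[1] + x[1] ** 3


a = [1.0, 2.0]
v = [1.1, 2.1]
fa = f(a)                        # 10
g = [4.0, 13.0]                  # (2xy, x^2 + 3y^2) at a
H = [[4.0, 2.0], [2.0, 12.0]]    # [[2y, 2x], [2x, 6y]] at a
approx = quadratic_taylor(fa, g, H, a, v)
exact = f(v)
```

Here  approx  comes out to about 11.8 while  f(v)  is about 11.802, so the
quadratic terms already capture the function well this close to  a.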

>to determine the "signature"? 
It's not the determinant, no. You need to read a linear algebra book to learn
about canonical forms for quadratic forms. Up to a change of basis, every
such function   v^t M v  can be changed to  (v_1^2 + ... + v_k^2) - (v_{k+1}^2 + ...
+ v_r^2)  where  r  is the rank of  M  and the number  k <= r  is an
intrinsic invariant of the (symmetric) matrix  M. The signature records the
number of plus signs and the number of minus signs, that is, the pair
(k, r-k); some books use the difference of the two counts instead.
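
As a rough illustration, the counts of plus and minus signs can be read
off from the pivots of symmetric elimination (this relies on Sylvester's
law of inertia). The sketch assumes every leading principal minor of  M  is
nonzero, so no row exchanges are needed and the rank is  n; the function
name  signature  and the tolerance are made up for the example.

```python
def signature(M, tol=1e-12):
    """Return (plus count, minus count) for a symmetric matrix M.

    Assumption: all leading principal minors are nonzero, so plain
    elimination without row exchanges never meets a zero pivot.
    """
    A = [list(map(float, row)) for row in M]
    n = len(A)
    pos = neg = 0
    for k in range(n):
        pivot = A[k][k]
        if abs(pivot) < tol:
            raise ValueError("zero pivot: this simple sketch does not handle it")
        if pivot > 0:
            pos += 1
        else:
            neg += 1
        # Eliminate column k below the pivot.
        for i in range(k + 1, n):
            factor = A[i][k] / pivot
            for j in range(k, n):
                A[i][j] -= factor * A[k][j]
    return pos, neg


# Both matrices have determinant 1, yet their signatures differ.
sig_identity = signature([[1, 0], [0, 1]])
sig_negative = signature([[-1, 0], [0, -1]])
sig_mixed = signature([[2, 1], [1, -3]])
```

The first two examples show why the determinant alone cannot stand in for
the signature: the identity is positive definite and its negative is
negative definite, but both have determinant 1.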

You really should just consult a multivariable calculus book.

dave