gradient.Rd
Given a hyper2
object and a point in probability space,
function gradient()
returns the gradient of the log-likelihood;
function hessian()
returns the bordered Hessian matrix. By
default, both functions are evaluated at the maximum likelihood estimate
for p, as given by maxp()
.
A hyper2
object
Components of a hyper2
object
A vector of probabilities
Character vector of names
Boolean, with default TRUE
meaning to return the
bordered Hessian and FALSE
meaning to return the Hessian
(warning: this option does not respect the unit sum constraint)
A bordered Hessian matrix, understood to have a single
constraint (the unit sum) at the last row and column; the output of
hessian(border=TRUE)
Boolean with default FALSE
meaning for function
is_ok_hessian()
to return whether or not M
corresponds to a negative-definite matrix, and TRUE
meaning
to return more details
Function gradient()
returns the gradient of the log-likelihood
function. If the hyper2
object is of size n, then argument
probs
may be a vector of length n−1 or n; in the
former case it is interpreted as indep(p)
. In both cases, the
returned gradient is a vector of length n−1.
The function returns the derivative of the loglikelihood with respect to
the n−1 independent components of
(p1,…,pn), namely
(p1,…,pn−1). The fillup
value pn is calculated as
1−(p1+⋯+pn−1).
Function gradientn()
returns the gradient of the loglikelihood
function but ignores the unit sum constraint. If the hyper2
object is of size n, then argument probs
must be a vector
of length n, and the function returns a named vector of length
n. The last element of the vector is not treated differently from
the others; all n elements are treated as independent. The sum
need not equal one.
Function hessian()
returns the bordered Hessian, a matrix
of size n+1×n+1, which is useful when using
Lagrange's method of undetermined multipliers. The first row and column
correspond to the unit sum constraint, ∑p1=1.
Row and column names of the matrix are the pnames()
of the
hyper2
object, plus “usc
” for “Unit Sum
Constraint”.
The unit sum constraint borders could have been added with idiom
magic::adiag(0,pad=1,hess)
, which might be preferable.
Function is_ok_hessian()
returns the result of the second
derivative test for the maximum likelihood estimate being a local
maximum on the constraint hypersurface. This is a generalization of the
usual unconstrained problem, for which the test is the Hessian's being
negative-definite.
Function hessian_lowlevel()
is a low-level helper function that
calls the C++ routine.
Further examples and discussion is given in file
inst/gradient.Rmd
. See also the discussion at maxp on the
different optimization routines available.
Function gradient()
returns a vector of length n−1 with
entries being the gradient of the log-likelihood with respect to the
n−1 independent components of
(p1,…,pn), namely
(p1,…,pn−1). The fillup
value pn is calculated as
1−(p1,…,pn−1).
If argument border
is TRUE
, function hessian()
returns an n-by-n matrix of second derivatives; the borders
are as returned by gradient()
. If border
is FALSE
,
ignore the fillup value and return an n−1-by-n−1 matrix.
Calling hessian()
at the evaluate will not return exact zeros for
the constraint on the fillup value; gradient()
is used and this
does not return exactly zeros at the evaluate.
data(chess)
p <- c(1/2,1/3)
delta <- rnorm(2)/1e5 # delta needs to be quite small
deltaL <- loglik(p+delta,chess) - loglik(p,chess)
deltaLn <- sum(delta*gradient(chess,p + delta/2)) # numeric
deltaL - deltaLn # should be small [zero to first order]
#> [1] 5.110577e-12
H <- hessian(icons)
is_ok_hessian(H)
#> [1] 7.126371e-02 -2.728246e+01 1.824722e+04 -7.317948e+06