This document gives a brief description of the estimation methods for population type data that can be used with NONMEM Version V. These include, in particular, a few methods that are new with this version, the centered and hybrid methods. The more important changes from the earlier edition published in 1992, but not all changes, are highlighted with the use of vertical bars in the right margin. This document contains no information about how to communicate with the NONMEM program.
To read this document it may be helpful to have some familiarity with the notation used with the representation of statistical models for the NONMEM program. See discussions of models in NONMEM Users Guide - Part I, but if one’s interest is only in using NONMEM with PREDPP, see discussions of models in NONMEM Users Guides - Parts V and VI. Particular notation used in this Guide VII is given next.
The jth observation from the ith individual is
denoted . Each individual
may have a different number of observations. Each
observation may be measured on a different scale:
continuous, categorical, ordered categorical,
discrete-ordinal.†
----------
An individual can have multivariate observations,
each of different lengths. However, the multivariate nature
of an observation is suppressed, as this is not relevant to
the descriptions given in this document, and so the separate
(scalar-valued) observations comprising the multivariate
observations are all separately indexed by j. Each
multivariate observation may have a different length. The
vector of all the observations from the ith individual is
denoted .
It is assumed that there exists a separate
statistical model for each .
This model is called the intraindividual model or the
individual model for the ith individual. It is
parameterized by
, a
(vector-valued) parameter common to all the separate
intraindividual models, and
, a (vector-valued) parameter specific to the
intraindividual model for
.
Under this model, the likelihood of
for the data
(conditional on
) is denoted by
, the dependence on
being supressed in the
notation. This likelihood is called here the conditional
likelihood of
When all the elements of
are measured on a
continuous scale, an often-used intraindividual model is
given by the multivariate normal model with mean
and variance-covariance
matrix
(usually,
is comprised of parameters
which are the only ones
affecting
, and other
parameters which, along with
, affect
).††
----------
This type of model shall be referred to as the
mean-variance model It is usually expressed in terms
of a multivariate normal vector
with mean 0 and
variance-covariance matrix
. In the notation used here, the parameter
includes
(ignoring the matrix
structure of
). For
example,
where is an
instance of a univariate normal variable
with variance
. (When
is multivariate, the
observation
is modeled in
terms of a single instance of this multivariate random
vector. A few other observations as well may be modeled in
terms of this same instance, and thus under the
model, all such observations are correlated and comprise a
multivariate observation.) In this example,
is
(the mean of
), and
is
(the variance of
). Since the ratio of the
standard deviation of
to
the mean of
is the constant
, this particular model is
called the constant coefficient of variation
model.
The dependence of
on
is often a consequence of
the intraindividual variance depending on the mean function,
as with the above example, which in turn depends on
. This dependence
represents an interaction between
and
. With the (homoscedastic)
model expressed by
there is no such interaction;
is just
. There are two variants of
the first-order conditional estimation method described in
chapter II, one that takes this interaction into account and
another that ignores it.
When an intraindividual model involving
is presented to NM-TRAN
(the "front-end" of the NONMEM system), the model
is automatically transformed. A linearization of the right
side of the equation is used: a first-order approximation in
about 0, the mean value of
. Since the approximate
model is linear in
, it is
a mean-variance model. Clearly, if the given model is itself
a mean-variance model, the transformed model is identical to
the given model. Consider, for example, an intraindividual
model where the elements of
are regarded as lognormally distributed (because the
normally distributed
appear
as logarithms):
In this case the transformed model is the constant cv model given above. (Therefore, no matter whether the given intraindividual model or the constant cv model is presented to NM-TRAN, the results of the analysis will be the same.)
Alternatively, the user might be able to
transform the data so that a mean-variance model applies to
the transformed data, which can then be presented directly
to NM-TRAN. With the above example, and using the log
transformation on the data
, an appropriate mean-variance model to present to NM-TRAN
would be
(Actually, NM-TRAN allows one to explicitly
accomplish the log transformation of both the data and the
.) The results of the
analysis differ depending on whether or not the log
transformation is used. Without the log transformation, the
values of the
are regarded
as arithmetic means (under the approximate model obtained by
linearizing), and with the log transformation, these values
are regarded as geometric means. Use of the log
transformation (when this can be done; when there are no
or
with value 0) can often
lead to a better analysis.
It is also assumed that as individuals are
sampled randomly from the population, the
are also being sampled
randomly (and statistically independently), although these
values are not observable. The value
is called the random
interindividual effect for
. It is assumed that the
are instances of the random
vector
, normally
distributed with mean 0 and variance-covariance matrix
. The density function of
this distribution (at
) is
denoted by
.
Often, some quantity P (viewed as a function of
values of the covariates and the
) is common to different
intraindividual models. For example, a clearance parameter
may be common to different intraindividual models, but its
value differs between different intraindividual models
because the values of the covariates and the
differ. The randomness of
the
in the population
induces randomness in P. The quantity P is said to be a
randomly dispersed parameter When speaking of its
distribution, we are imagining that the values of the
covariates are fixed, so that indeed, there is a unique
distribution in question.
From the above assumptions, the (marginal)
likelihood of and
for the data
is given by
In general, this integral is difficult to compute exactly. The likelihood for all the data is given by
The first-order estimation method was the first
population estimation method available with NONMEM. This
method produces estimates of the population parameters
and
, but it does not produce
estimates of the random interindividual effects. An estimate
of
is nonetheless
obtainable, conditional on the first-order estimates for
and
(or on any other values for
these parameters), by maximizing the empirical Bayes
posterior density of
,
given
:
, with respect to
. In other words, the
estimate is the mode of the posterior distribution. Since
this estimate is obtained after values for
and
are obtained, it is called
the posthoc estimate When a mean-variance model is
used, and a request is put to NONMEM to compute a posthoc
estimate, by default this estimate is computed using
. In other words, the
intraindividual variance-covariance is assumed to be the
same as that for the mean individual the hypothetical
individual having the mean interindividual effect, 0, and
sharing the same values of the covariates as has the ith
individual). However, it is also possible to obtain the
posterior mode without this assumption.
The posterior density can be maximized using any
given values for and
. Since the resulting
estimate for
is obtained
conditionally on these values, it is sometimes called a
conditional estimate at these values, to emphasize
its conditional nature.
In contrast with the first-order method, the
conditional estimation methods to be described produce
estimates of the population parameters and,
simultaneously, estimates of the random
interindividual effects. With each different method, a
different approximation to the likelihood function (1) is
used, and (2) is maximized with respect to
and
. The approximation to (1)
at the values
and
depends on an estimate
, and as this estimate
itself depends on the values
and
, the approximation gives
rise to a further dependence of
on the values of
and
, one not expressed in
(1). Consequently, as different values
and
are tried, different
estimates
are obtained
as a part of the maximization of (2). The estimates
at the values
and
that maximize (2)
constitute the estimates of the random interindividual
effects produced by the method (except for the hybrid
method†). The estimate
also depends on
, and so, the
approximation gives rise to a further dependence of
on
, one also not expressed
in (1).
----------
In contrast with the first-order method, a
conditional estimation method involves multiple
maximizations within a maximization. The estimate
is the value of
that maximizes the
posterior distribution of
given
(except for the
hybrid method††). For each different value of
and
that is tried by the
maximization algorithm used to maximize (2), first a value
is found that maximizes
the posterior distribution given
, then a value
is found that maximizes
the posterior distribution given
, etc. Therefore,
maximizing (2) is a very difficult and CPU intensive task.
The numerical methods by which this is accomplished are not
described in this document.
----------
Fortunately, it often suffices to use the
first-order method; a conditional estimation method is not
needed, or if it is, sometimes it is needed only minimally
during the course of a data analysis. Some guidance is given
in chapter III. Briefly, the need for a conditional
estimation method increases with the degree to which the
intraindividual models are nonlinear in the
. Population
pharmacokinetic models are often actually rather linear in
this respect, although the degree of nonlinearity increases
with the degree of multiple dosing. Population
pharmacodynamic models are more nonlinear. The potential for
a conditional estimation method to produce results different
from those obtained with the first-order estimation method
decreases as the amount of data per individual decreases,
since a conditional estimation method uses conditional
estimates of the
, which
are all shrunken to 0, and the shrinkage is greater the less
the amount of data per individual. Many population analyses
involve little amounts of data per individual.
The conditional estimation methods that are available with NONMEM and which are described in chapter II are: the first-order conditional estimation method (with and without interaction when mean-variance models are used, and with or without centering), the Laplacian method (with and without centering), and the hybrid method (a hybrid between the first-order and first-order conditional estimation methods). For purposes of description here and in other NONMEM Users Guides, the term
conditional estimation methods refers not only to these population estimation methods, but also to methods for obtaining conditional estimates themselves.
To summarize, each of the (population)
conditional estimation methods involves maximizing (2), but
each uses a different approximation to (1). Actually,
is minimized with respect
to
and
. This is called the
objective function Its minimum value serves as a
useful statistic for comparing models. Standard errors for
the estimates (indeed, an estimated asymptotic
variance-covariance matrix for all the estimates) is
obtained by computing derivatives of the objective
function.
TOP
TABLE OF CONTENTS
NEXT CHAPTER ...