# Model estimation

Model estimation is the use of statistical analysis techniques to find the parameter values that most likely explain observed data. It is a component of Model Calibration and Validation.

## Notation

Consider a statistical process where an outcome $y_i$ is determined by a set of explanatory variables $x_{1i}, x_{2i}, \ldots, x_{pi}$ through a *linear* equation,

$$ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \ldots + \beta_p x_{pi} + \varepsilon_i $$

where $\varepsilon_i$ is a *residual* term capturing the part of $y_i$ not explained by the variables.

The notation for this equation can be simplified by using a matrix representation. Collect the $n$ outcomes into a vector $Y$, the parameters into a vector $\beta$, the residuals into a vector $\varepsilon$, and the explanatory variables (with a leading column of ones for the intercept) into a design matrix

$$ X = \begin{bmatrix} 1 & x_{11} & x_{21} & \ldots & x_{p1} \\ 1 & x_{12} & x_{22} & \ldots & x_{p2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1n} & x_{2n} & \ldots & x_{pn} \end{bmatrix} $$

The linear equation above then becomes

$$ Y = X\beta + \varepsilon $$

The goal of *model estimation* is to find estimates $\hat{\beta}$ of the parameters $\beta$ that best explain the observed data.
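As a concrete illustration, the design matrix $X$ can be assembled by stacking a column of ones with the predictor columns. The data below is hypothetical, chosen only to show the shape of the construction:

```python
import numpy as np

# Hypothetical predictors: n = 3 observations of p = 2 variables.
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([0.5, 0.1, 0.9])

# Design matrix X: a leading column of ones (for the intercept)
# followed by one column per explanatory variable.
X = np.column_stack([np.ones(3), x1, x2])
print(X.shape)  # (3, 3)
```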

### Ordinary Least Squares

Assume we have a linear equation $Y = X\hat{\beta} + e$. One natural criterion for choosing $\hat{\beta}$ is to minimize the *sum of squared residuals*, or the distance between the observed values $Y$ and the fitted values $X\hat{\beta}$:

$$ e'e = (Y - X\hat{\beta})'(Y - X\hat{\beta}) $$

If we take the derivative of this sum with respect to $\hat{\beta}$ and set it equal to zero, we obtain

$$ \hat{\beta} = (X'X)^{-1}X'Y $$

This estimator is referred to as the *Ordinary Least Squares* (OLS) estimator.
If we assume that the residuals are independently and identically normally distributed with mean zero and constant variance, then standard errors, confidence intervals, and hypothesis tests for $\hat{\beta}$ follow directly.
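A minimal numerical sketch of the OLS estimator $\hat{\beta} = (X'X)^{-1}X'Y$, using NumPy and hypothetical data (in practice, solving the normal equations is preferred over forming the inverse explicitly):

```python
import numpy as np

# Hypothetical example data: n = 5 observations, one predictor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Design matrix: column of ones for the intercept, then the predictor.
X = np.column_stack([np.ones_like(x), x])

# OLS estimator: solve (X'X) beta = X'y rather than inverting X'X.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # [intercept, slope]
```

For this data the fitted slope is 1.96 and the intercept 0.14, which can be verified against the closed-form simple-regression formulas.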

### Maximum Likelihood Estimation

OLS is powerful and adequate in many situations; however, there may be cases where
the assumptions of OLS modeling (normally distributed residuals, constant variance, a continuous outcome) do not hold. A more general approach is *maximum likelihood estimation* (MLE).

In a linear model, we assume that the points follow a normal (Gaussian) probability
distribution, with mean $x_i\beta$ and variance $\sigma^2$. The probability density of observing a single point $y_i$ is then

$$ f(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - x_i\beta)^2}{2\sigma^2}\right) $$

What we want to find is the parameters $\beta$ and $\sigma^2$ that *maximize* this
probability for all points simultaneously. The joint probability of the observed data, viewed as a function of the parameters, is the *likelihood function*:

$$ \mathcal{L} = \prod_{i = 1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - x_i\beta)^2}{2\sigma^2}\right) $$

For various reasons, it's easier to use the log of the likelihood function:

$$\log(\mathcal{L}) = -\frac{n}{2}\log(2\pi) -\frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i = 1}^n (y_i - x_i\beta)^2 $$

Most MLE programs work by having a computer attempt to find values of $\beta$ and $\sigma^2$ that maximize the log-likelihood, iteratively improving the estimates with a numerical optimization algorithm until convergence.
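This numerical approach can be sketched by minimizing the negative log-likelihood with a general-purpose optimizer. The example below uses SciPy's Nelder-Mead method on the same hypothetical data as before; parameterizing $\log\sigma$ instead of $\sigma$ is a common trick (an assumption here, not part of the original text) to keep the variance positive during the search:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical example data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
X = np.column_stack([np.ones_like(x), x])
n = len(y)

def neg_log_likelihood(params):
    # params = [beta0, beta1, log_sigma]; optimizing log(sigma)
    # keeps the implied variance strictly positive.
    beta = params[:2]
    sigma2 = np.exp(2.0 * params[2])
    resid = y - X @ beta
    return (n / 2) * np.log(2 * np.pi) \
         + (n / 2) * np.log(sigma2) \
         + (resid @ resid) / (2 * sigma2)

result = minimize(neg_log_likelihood, x0=np.zeros(3), method="Nelder-Mead")
beta_mle = result.x[:2]
print(beta_mle)  # should closely match the OLS estimates
```

For a linear model with normal residuals, the MLE of $\beta$ coincides with the OLS estimator, which provides a useful sanity check on the optimizer's output.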