Model estimation

Model estimation is the use of statistical analysis techniques to find the parameter values that most likely explain observed data. It is a component of Model Calibration and Validation.

# Notation

Consider a statistical process where an outcome $y_i$ is a function of various predictor variables $x_{1i}, x_{2i}, \ldots, x_{pi}$. It may be desirable to explain this process with a linear equation,

$$ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \ldots + \beta_p x_{pi} + \varepsilon_i $$

where $\beta_0, \beta_1, \ldots, \beta_p$ are parameters that "explain" the relationship between the predictor variables and the outcome variable. The residual term $\varepsilon_i$ is a random variable that accounts for the difference between the observed value $y_i$ and the output of the linear function.

The notation for this equation can be simplified by using a matrix $X$ whose columns are the different predictor variables and whose rows are the different observations in a dataset. The first column consists of ones and corresponds to the intercept $\beta_0$:

$$ X = \begin{bmatrix} 1 & x_{11} & x_{21} & \ldots & x_{p1}\\ 1 & x_{12} & x_{22} & \ldots & x_{p2}\\ \vdots & \vdots & \vdots & & \vdots\\ 1 & x_{1n} & x_{2n} & \ldots & x_{pn} \end{bmatrix} $$

The linear equation above then becomes $y = X\beta + \varepsilon$. The purpose of model estimation is to find estimates $\hat{\beta}$ of the parameters $\beta$ that minimize the difference between the true observed response $y$ and the "fitted" response $\hat{y} = X\hat{\beta}$.
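As a concrete illustration, here is a minimal sketch in Python (using NumPy, with made-up sizes and parameter values) of assembling such a design matrix and simulating data from the linear model:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 2                       # observations and predictors (illustrative sizes)
predictors = rng.normal(size=(n, p))

# Design matrix: first column of ones for the intercept beta_0,
# remaining columns hold the predictor variables (one column per variable).
X = np.column_stack([np.ones(n), predictors])

# Simulate y = X beta + epsilon for some hypothetical "true" parameters.
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)
```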

# Ordinary Least Squares

Assume we have a linear equation $y = X\beta + \varepsilon$ and we want to find the estimates $\hat{\beta}$; one plausible method would be to find the values that minimize the sum of squared residuals, or the distance between $y$ and $X\hat{\beta}$:

$$ \sum_{i=1}^{n} (y_i - x_i\hat{\beta})^2 = (y - X\hat{\beta})^\top (y - X\hat{\beta}) $$

where $x_i$ is the $i$-th row of $X$.

If we take the derivative of this sum with respect to $\hat{\beta}$, set it equal to zero, and solve for $\hat{\beta}$, we arrive at the following estimator equation:

$$ \hat{\beta} = (X^\top X)^{-1} X^\top y $$

This estimator is referred to as the Ordinary Least Squares (OLS) estimator. If we assume that the residuals $\varepsilon$ are normally distributed with variance $\sigma^2$, the variance of the OLS estimates is

$$ \operatorname{Var}(\hat{\beta}) = \sigma^2 (X^\top X)^{-1} $$
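Continuing the sketch above, both formulas can be computed directly; the sketch uses `np.linalg.solve` on the normal equations rather than an explicit matrix inverse, which is the numerically safer choice:

```python
# OLS estimate: beta_hat = (X'X)^{-1} X'y, via the normal equations.
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Estimate the residual variance sigma^2 (dividing by n - k, where k is
# the number of columns of X, gives the unbiased estimate).
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - X.shape[1])

# Var(beta_hat) = sigma^2 (X'X)^{-1}
var_beta_hat = sigma2_hat * np.linalg.inv(XtX)

print(beta_hat)   # should be close to beta_true from the simulation
```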

# Maximum Likelihood Estimation

OLS is powerful and adequate in many situations; however, there may be cases where the assumptions of OLS modeling (normally distributed $\varepsilon$, etc.) are violated. This is especially common in transportation engineering, where the outcome variable is often discrete (for example, a traveler's choice of mode). In these cases, it is more common to use maximum likelihood estimation (MLE).

In a linear model, we assume that the points $y_i$ follow a normal (Gaussian) probability distribution with mean $x_i\beta$ and variance $\sigma^2$: $y_i \sim N(x_i\beta, \sigma^2)$. The equation of this probability density function is:

$$ f(y_i \mid x_i, \beta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_i - x_i\beta)^2}{2\sigma^2} \right) $$

What we want to find is the parameters $\beta$ and $\sigma^2$ that maximize this probability across all points $y_i$. This is the "likelihood" function, $\mathcal{L}$, which for independent observations is the product of the individual densities:

$$ \mathcal{L}(\beta, \sigma^2 \mid y, X) = \prod_{i=1}^{n} f(y_i \mid x_i, \beta, \sigma^2) $$

Because a product of many small probabilities is numerically unwieldy and awkward to differentiate, it's easier to use the log of the likelihood function, which turns the product into a sum:

$$\log(\mathcal{L}) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i = 1}^{n}(y_i - x_i\beta)^2 $$
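As a sanity check, this closed-form expression can be coded directly and compared against SciPy's normal log-density (reusing `X`, `y`, `beta_hat`, and `sigma2_hat` from the sketches above):

```python
from scipy import stats

def log_likelihood(beta, sigma2, X, y):
    """Gaussian log-likelihood of a linear model, per the formula above."""
    n = len(y)
    resid = y - X @ beta
    return (-n / 2 * np.log(2 * np.pi)
            - n / 2 * np.log(sigma2)
            - resid @ resid / (2 * sigma2))

# Summing the normal log-density over all observations should reproduce
# the closed-form expression exactly.
ll = log_likelihood(beta_hat, sigma2_hat, X, y)
ll_check = stats.norm.logpdf(y, loc=X @ beta_hat, scale=np.sqrt(sigma2_hat)).sum()
assert np.isclose(ll, ll_check)
```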

Most MLE programs work by having a computer search for the values of $\beta$ and $\sigma^2$ that maximize the value of this likelihood function. Note that for a linear model with normally distributed residuals, the MLE and OLS estimates of $\beta$ are equivalent. MLE is most suitable for problems where an analytical solution is difficult or does not exist.
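Here is a minimal sketch of that numerical search, assuming SciPy's general-purpose `minimize` routine: it minimizes the negative log-likelihood over $(\beta, \log\sigma^2)$ and confirms that the result matches the OLS estimate from earlier:

```python
from scipy.optimize import minimize

def neg_log_likelihood(theta, X, y):
    # theta packs (beta_0, ..., beta_p, log sigma^2); optimizing over
    # log sigma^2 keeps the variance positive without explicit constraints.
    beta, log_sigma2 = theta[:-1], theta[-1]
    return -log_likelihood(beta, np.exp(log_sigma2), X, y)

k = X.shape[1]
theta0 = np.zeros(k + 1)            # arbitrary starting values
result = minimize(neg_log_likelihood, theta0, args=(X, y), method="BFGS")

beta_mle = result.x[:-1]
print(np.allclose(beta_mle, beta_hat, atol=1e-4))   # MLE ~ OLS for the linear model
```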
