4.37 ON SELECTING REGRESSORS TO MAXIMIZE THEIR SIGNIFICANCE

A common problem in applied regression analysis is to select the variables that enter a linear regression. Examples are selection among capital stock series constructed with different depreciation assumptions, or the use of variables that depend on unknown parameters, such as Box-Cox transformations, linear splines with parametric knots, and exponential functions with parametric decay rates. It is often computationally convenient to estimate such models by least squares, with the variables selected from the possible candidates by enumeration, grid search, or Gauss-Newton iteration so as to maximize their conventional least squares significance level; we term this method Prescreened Least Squares (PLS). This note shows that PLS is equivalent to direct estimation by nonlinear least squares, and is thus statistically consistent under mild regularity conditions. However, the standard errors and test statistics provided by least squares are biased, because they ignore the search over candidate variables. When the explanatory variables are smooth in the parameters that index the selection alternatives, a Gauss-Newton auxiliary regression is a convenient procedure for obtaining consistent covariance matrix estimates. In cases where smoothness is absent, or the true index parameter is isolated, covariance matrix estimates obtained by kernel-smoothing or bootstrap methods appear from examples to be reasonably accurate in samples of moderate size.
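As a concrete illustration, the following is a minimal sketch in Python (NumPy only) of PLS for a single Box-Cox regressor, together with the covariance corrections discussed above. The data-generating process, the grid for the index parameter lam, and the helper names box_cox, ols, pls_fit, and dbox_cox_dlam are illustrative assumptions for this sketch, not code from the note. The grid search keeps the exponent that maximizes the regressor's conventional t-statistic; since the fit without the candidate regressor is the same for every exponent, this is the same as minimizing the residual sum of squares, i.e. nonlinear least squares on the grid.

import numpy as np

rng = np.random.default_rng(0)

def box_cox(x, lam):
    """Box-Cox transform; tends to log(x) as lam -> 0."""
    return np.log(x) if abs(lam) < 1e-8 else (x ** lam - 1.0) / lam

def ols(y, X):
    """OLS coefficients, residuals, and the conventional covariance matrix."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)
    return beta, resid, sigma2 * np.linalg.inv(X.T @ X)

def pls_fit(y, x, grid):
    """Prescreened least squares: keep the grid value of lam that maximizes
    the regressor's |t|; equivalently, the value minimizing the residual
    sum of squares, i.e. nonlinear least squares over the grid."""
    best = None
    for lam in grid:
        X = np.column_stack([np.ones_like(x), box_cox(x, lam)])
        beta, _, cov = ols(y, X)
        t = beta[1] / np.sqrt(cov[1, 1])
        if best is None or abs(t) > abs(best[0]):
            best = (t, lam, beta, cov)
    return best

# Simulated data (an assumption of this sketch) with true lam = 0.5.
n = 200
x = rng.uniform(0.5, 4.0, n)
y = 1.0 + 2.0 * box_cox(x, 0.5) + rng.normal(0.0, 0.5, n)
grid = np.linspace(-1.0, 2.0, 61)

t_stat, lam_hat, beta_hat, cov_naive = pls_fit(y, x, grid)
se_naive = np.sqrt(cov_naive[1, 1])  # conditions on lam_hat, hence biased

# Gauss-Newton auxiliary regression: regress y on the gradient of the
# regression function with respect to ALL parameters (intercept, slope,
# lam), evaluated at the estimates; the conventional OLS covariance of
# this regression estimates the nonlinear-least-squares covariance.
def dbox_cox_dlam(x, lam):
    """Derivative of the Box-Cox transform in lam; limit (ln x)^2 / 2 at 0."""
    if abs(lam) < 1e-8:
        return 0.5 * np.log(x) ** 2
    return (x ** lam * np.log(x)) / lam - (x ** lam - 1.0) / lam ** 2

G = np.column_stack([np.ones_like(x),
                     box_cox(x, lam_hat),
                     beta_hat[1] * dbox_cox_dlam(x, lam_hat)])
_, _, cov_gn = ols(y, G)
se_gn = np.sqrt(cov_gn[1, 1])

# Bootstrap alternative: resample and rerun the entire search, so the
# variability induced by selecting lam is reflected in the standard error.
B = 300
draws = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)
    draws[b] = pls_fit(y[idx], x[idx], grid)[2][1]  # slope coefficient
se_boot = draws.std(ddof=1)

print(f"lam_hat = {lam_hat:.2f}, slope = {beta_hat[1]:.3f}")
print(f"SE: naive OLS = {se_naive:.3f}, Gauss-Newton = {se_gn:.3f}, "
      f"bootstrap = {se_boot:.3f}")

Grid search stands in here for the enumeration or Gauss-Newton iteration mentioned above. On typical draws the Gauss-Newton and bootstrap standard errors exceed the naive OLS figure, which conditions on the selected exponent and so understates the sampling variability of the slope.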