ML Cheat Sheet
OLS vs. Bootstrap Estimates and Model Stability
OLS (Ordinary Least Squares) is a fundamental method for estimating the parameters of a linear regression model. It works by minimizing the sum of the squared residuals (the differences between actual and predicted values) to find the best-fitting line.
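A minimal sketch of an OLS fit, using NumPy's least-squares solver on hypothetical synthetic data (the true intercept 2 and slope 3 are assumptions chosen for illustration):

```python
import numpy as np

# Hypothetical synthetic data: y = 2 + 3x + noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 1, 100)

# Design matrix with an intercept column; lstsq minimizes ||X @ beta - y||^2
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("intercept, slope:", beta)  # should recover roughly (2, 3)
```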
| Aspect | OLS Estimates | Bootstrap Estimates | Interpretation for Stability |
|---|---|---|---|
| Definition | Estimates derived from the entire dataset by minimizing squared errors. | Estimates obtained by resampling the dataset multiple times. | If bootstrap estimates match OLS, the model is stable. |
| Purpose | Finds the best linear fit for the given dataset. | Evaluates the variability and bias of the OLS estimates. | Close estimates indicate robustness. |
| Sensitivity to Data | Can be sensitive if outliers or small sample sizes exist. | Reduces sensitivity by checking results across multiple resamples. | Small differences suggest the model generalizes well. |
| Bias | Can be biased if assumptions (normality, independence) are violated. | Measures bias by comparing to OLS estimates. | Small bias means OLS estimates are reliable. |
| Variance (Uncertainty) | Uses standard errors from the single dataset. | Provides a distribution of estimates to assess uncertainty. | If bootstrap SE ≈ OLS SE, the model is stable. |
| When They Match | Estimates are considered reliable on the dataset. | Confirms that resampling does not significantly change estimates. | Model is not highly sensitive to variations in data. |
| When They Differ | Model may be unstable or sensitive to specific observations. | Suggests high variability or bias in the model. | Model may require improvements (e.g., more data, regularization). |
Key Takeaway ✅ If bootstrap estimates closely match the OLS estimates, the model is stable, robust, and reliable for generalization to new data.
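The comparison above can be sketched as follows: fit OLS once on the full dataset, then refit on many resampled (with replacement) datasets and compare the bootstrap mean and standard errors with the single-fit estimates. The data and the number of resamples (B = 1000) are illustrative assumptions:

```python
import numpy as np

# Hypothetical synthetic data for illustration
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
y = 2 + 3 * x + rng.normal(0, 1, n)
X = np.column_stack([np.ones(n), x])

# OLS fit on the full dataset
ols_beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Bootstrap: refit OLS on B datasets resampled with replacement
B = 1000
boot = np.empty((B, 2))
for b in range(B):
    idx = rng.integers(0, n, n)  # resample row indices with replacement
    bb, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    boot[b] = bb

boot_mean = boot.mean(axis=0)        # ≈ ols_beta → low bias, stable model
boot_se = boot.std(axis=0, ddof=1)   # compare against the OLS standard errors

print("OLS estimates:   ", ols_beta)
print("bootstrap mean:  ", boot_mean)
print("bootstrap SE:    ", boot_se)
```

If `boot_mean` sits close to `ols_beta` and `boot_se` is small and comparable to the OLS standard errors, the table's stability criteria are met; large gaps would signal sensitivity to particular observations.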