Modern datasets in the social sciences often contain a large number of potential independent variables for a given research problem. Traditional OLS regression, while still widely used, handles multicollinearity poorly and does not provide an optimal solution for model or variable selection. Moreover, the assumptions OLS regression relies on are frequently violated in social science data. LASSO (Least Absolute Shrinkage and Selection Operator) regression, a type of penalized regression, addresses the problem of variable selection directly. As the penalty parameter lambda (λ) increases, coefficients are shrunk toward zero and some are set exactly to zero, while model performance is measured at each value of lambda. Because several methods are available for choosing the optimal lambda, LASSO offers a robust approach to variable selection.
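For readers who prefer a formula, the penalty described above can be written as follows. This is a sketch of the standard LASSO objective for a linear model, not necessarily the exact notation used in the course. The symbols are assumptions introduced here: n observations, p predictors, intercept \beta_0, and a 1/(2n) scaling of the squared-error term, which is one common convention.

```latex
\hat{\beta}(\lambda)
  = \arg\min_{\beta_0,\,\beta}
    \left\{
      \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \beta_0 - x_i^{\top}\beta \bigr)^2
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\},
  \qquad \lambda \ge 0
```

With λ = 0 the penalty term vanishes and the objective reduces to ordinary least squares. As λ grows, the absolute-value penalty pushes more coefficients to exactly zero, which is what makes LASSO a variable selection method.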
Participants will explore the core concept of LASSO and learn to implement, interpret, and report LASSO models through practical examples in R.
No prior R knowledge is required!
Course language: English
Course Agenda
- Problems and assumptions of OLS
- Variable selection (Filters, Wrappers, Embedded methods)
- Penalized regressions
- Overview of LASSO (λ, model performance measure, optimization)
- Examples of LASSO from the literature
- Implementing LASSO in R
- Reporting LASSO
- LASSO and Shapley
Instructor: Zoltán Brys (TK SZI)