  • Pdslasso standard errors and variable selection

    Hello,

    I have two questions related to the pdslasso command, and I would appreciate any help you can offer.

    1. I am trying to test the null hypothesis that two coefficients estimated with pdslasso are equal. I am using a SUR framework with suest, but because suest cannot take estimates stored from pdslasso, I run OLS regressions that include the controls pdslasso selected. The issue is that the standard errors from OLS with the pdslasso-selected controls differ from the standard errors reported by pdslasso. This Stack Exchange post has an example showing that the SEs differ (apparently because pdslasso does not apply a finite-sample correction).

    My question: is there a way to adjust my OLS regression so that it produces the same SEs as pdslasso?

    2. Is it possible to make the command select variables in pairs rather than one by one? For example, if my exogenous variables are x1, x2, y1, and y2, I want it to evaluate the pairs (x1, y1) and (x2, y2) jointly rather than each variable separately. The specific issue is that x1 has missing data. In a standard OLS regression, I would replace the missing values of x1 with the variable's mean (or any other value) and add a control y1, a dummy variable equal to 1 if x1 is missing. pdslasso sometimes selects x1 but not y1, or y1 but not x1.

    Thanks!!

  • #2
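    Regarding 1, you can "undo" the finite-sample correction (FSC) that regress applies and repost the "uncorrected" VCV matrix into the estimation results.

    The examples below use pdslasso from Ahrens, Hansen, and Schaffer, and erepost from Ben Jann. The tables are produced with eststo and esttab from Ben Jann's estout package.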

    With OLS standard errors:
    Code:
    use https://statalasso.github.io/dta/AJR.dta, clear
    pdslasso logpgp95 avexpr (lat_abst edes1975 avelf temp* humid* steplow-oilres)
    eststo pdslasso
    regress logpgp95 avexpr edes1975 avelf zinc
    eststo regress
    matrix V = e(V)
    matrix V = V * (e(N)-e(df_m)-1) / (e(N))
    erepost V = V
    eststo regress_nofpc
    esttab pdslasso regress regress_nofpc, b(3) se(8) wide nogaps label
    Result:
    Code:
    -----------------------------------------------------------------------------------------------------------
                                  (1)                          (2)                          (3)                
                         log PPP GD~a                 log PPP GD~a                 log PPP GD~a                
    -----------------------------------------------------------------------------------------------------------
    average protection~o        0.391*** (0.05618621)        0.391*** (0.05851858)        0.391*** (0.05618621)
    % of European d~1975        0.009**  (0.00318402)        0.009**  (0.00331620)        0.009**  (0.00318402)
    ethno fract avg 5i~v       -0.997*** (0.24744530)       -0.997*** (0.25771708)       -0.997*** (0.24744530)
    zinc                       -0.008    (0.02806045)       -0.008    (0.02922527)       -0.008    (0.02806045)
    Constant                    5.764*** (0.37737065)        5.764*** (0.39303579)        5.764*** (0.37737065)
    -----------------------------------------------------------------------------------------------------------
    Observations                   64                           64                           64                
    -----------------------------------------------------------------------------------------------------------
    Standard errors in parentheses
    * p<0.05, ** p<0.01, *** p<0.001
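
    As a quick check of the OLS factor used above, the same rescaling can be applied to a single standard error by hand, without erepost. This is only a sketch under the thread's setup: it assumes the AJR data and the four regressors from the example, and the scalar names fsc and se_uncorrected are illustrative, not part of any package.

    ```stata
    * Sketch: rescale one OLS standard error by hand (no erepost needed)
    use https://statalasso.github.io/dta/AJR.dta, clear
    regress logpgp95 avexpr edes1975 avelf zinc
    * regress inflates the VCV by N/(N-k), with k = e(df_m) + 1; undo it with (N-k)/N
    scalar fsc = (e(N) - e(df_m) - 1) / e(N)
    * standard errors scale with the square root of the VCV factor
    scalar se_uncorrected = _se[avexpr] * sqrt(fsc)
    display %10.8f se_uncorrected
    ```

    If the data are unchanged, the displayed value should agree with the avexpr standard error in columns (1) and (3) of the table above.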
    With robust standard errors:
    Code:
    pdslasso logpgp95 avexpr (lat_abst edes1975 avelf temp* humid* steplow-oilres), rob
    eststo pdslasso
    regress logpgp95 avexpr edes1975 avelf stepmid drystep iron silv zinc, vce(robust)
    eststo regress
    matrix V = e(V)
    matrix V = V * (e(N)-e(df_m)-1) / (e(N))
    erepost V = V
    eststo regress_nofpc
    esttab pdslasso regress regress_nofpc, b(3) se(8) wide nogaps label
    Result:
    Code:
    -----------------------------------------------------------------------------------------------------------
                                  (1)                          (2)                          (3)                
                         log PPP GD~a                 log PPP GD~a                 log PPP GD~a                
    -----------------------------------------------------------------------------------------------------------
    average protection~o        0.393*** (0.07266765)        0.393*** (0.07838803)        0.393*** (0.07266765)
    % of European d~1975        0.009**  (0.00277404)        0.009**  (0.00299241)        0.009**  (0.00277404)
    ethno fract avg 5i~v       -0.949**  (0.30702948)       -0.949**  (0.33119877)       -0.949**  (0.30702948)
    stepmid                     0.139    (0.12825508)        0.139    (0.13835129)        0.139    (0.12825508)
    drystep                     0.007    (0.21636786)        0.007    (0.23340029)        0.007    (0.21636786)
    iron                        0.012    (0.04380704)        0.012    (0.04725552)        0.012    (0.04380704)
    silv                        0.057    (0.03583759)        0.057    (0.03865872)        0.057    (0.03583759)
    zinc                       -0.067    (0.06124370)       -0.067    (0.06606479)       -0.067    (0.06124370)
    Constant                    5.731*** (0.43711688)        5.731*** (0.47152663)        5.731*** (0.43711688)
    -----------------------------------------------------------------------------------------------------------
    Observations                   64                           64                           64                
    -----------------------------------------------------------------------------------------------------------
    Standard errors in parentheses
    * p<0.05, ** p<0.01, *** p<0.001
    With clustered standard errors:
    Code:
    egen lat_grp = cut(lat_abst), group(8)
    pdslasso logpgp95 avexpr (lat_abst edes1975 avelf temp* humid* steplow-oilres), cluster(lat_grp)
    eststo pdslasso
    regress logpgp95 avexpr edes1975 avelf stepmid drystep iron silv, vce(cluster lat_grp)
    eststo regress
    matrix V = e(V)
    matrix V = V * ( (e(N_clust)-1) / e(N_clust) ) * ( (e(N)-e(df_m)-2) / (e(N)-1) )
    erepost V = V
    eststo regress_nofpc
    esttab pdslasso regress regress_nofpc, b(3) se(8) wide nogaps label
    Result:
    Code:
    -----------------------------------------------------------------------------------------------------------
                                  (1)                          (2)                          (3)                
                         log PPP GD~a                 log PPP GD~a                 log PPP GD~a                
    -----------------------------------------------------------------------------------------------------------
    average protection~o        0.390*** (0.06341711)        0.390*** (0.07190825)        0.390*** (0.06341711)
    % of European d~1975        0.009*** (0.00211770)        0.009**  (0.00240124)        0.009**  (0.00211770)
    ethno fract avg 5i~v       -1.003**  (0.33403376)       -1.003*   (0.37875868)       -1.003*   (0.33403376)
    stepmid                     0.029    (0.10728250)        0.029    (0.12164693)        0.029    (0.10728250)
    drystep                     0.024    (0.13227958)        0.024    (0.14999094)        0.024    (0.13227958)
    iron                       -0.029    (0.02863587)       -0.029    (0.03247003)       -0.029    (0.02863587)
    silv                        0.023    (0.01843916)        0.023    (0.02090804)        0.023    (0.01843916)
    Constant                    5.768*** (0.40077824)        5.768*** (0.45443981)        5.768*** (0.40077824)
    -----------------------------------------------------------------------------------------------------------
    Observations                   64                           64                           64                
    -----------------------------------------------------------------------------------------------------------
    Standard errors in parentheses
    * p<0.05, ** p<0.01, *** p<0.001
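
    A note on the clustered factor: it subtracts 2 rather than 1 from e(N)-e(df_m). The posted standard errors are consistent with a numerator of 56 = 64 - 8, that is, N minus the number of coefficients including the constant, so e(df_m) does not seem to equal the number of regressors in this clustered example. The sketch below takes k from the width of e(b) instead. That choice is an assumption on my part: it only works when no terms are omitted from e(b), and the scalar names k and fsc are illustrative.

    ```stata
    * Sketch: cluster factor with k taken from e(b) instead of e(df_m)
    use https://statalasso.github.io/dta/AJR.dta, clear
    egen lat_grp = cut(lat_abst), group(8)
    regress logpgp95 avexpr edes1975 avelf stepmid drystep iron silv, vce(cluster lat_grp)
    * Assumption: k = number of columns of e(b), constant included, no omitted terms
    scalar k = colsof(e(b))
    * Undo G/(G-1) * (N-1)/(N-k), the small-sample factor regress uses with vce(cluster)
    scalar fsc = ((e(N_clust) - 1) / e(N_clust)) * ((e(N) - k) / (e(N) - 1))
    display %10.8f _se[avexpr] * sqrt(fsc)
    ```

    With 8 clusters, 64 observations, and 8 coefficients, the factor is (7/8) * (56/63) = 7/9, so the displayed value should agree with the avexpr standard error in columns (1) and (3) of the clustered table.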



    • #3
      Hi Nils,

      Thank you for your detailed response. It was extremely helpful!
