  • Double Machine Learning (DML) in STATA (xporegress command)

    Dear all,

    I am trying to use the command that STATA has for DML (cross-fit partialing-out lasso linear regression; xporegress), but I have some questions I could not find answers to:
    1. I know that variables usually have to be standardized for lasso methods. Does the command do this automatically? Moreover, since DML is about inference rather than prediction, should the variables of interest also be standardized?
    2. On standardization, I know there is a whole debate about whether categorical variables should be standardized. I have a few among my controls, and I am currently just entering them as dummy variables (with the i. prefix). Do you think that is enough?
    3. Lastly, I am also considering log-transforming some of my continuous variables (e.g., average income and expenditures). Do you think this is feasible after standardizing the variables?
    4. Is there a way to automatically create interactions between variables as controls?
    To put it all in context, I am trying to estimate the effects of two variables (one categorical/dummy and one continuous) while accounting for a bunch of controls.
    If you have any ideas about these questions or any general comments, please let me know.
    Many thanks in advance.
    Last edited by Cristian Laurentiu; 05 Jun 2021, 11:47.

  • #2
    Cross-posted at https://stats.stackexchange.com/ques...tandardization

    3. If standardization implies that values are variously negative and positive, you cannot usefully take logarithms thereafter, because the logarithm is undefined for zero and negative values.
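
    A minimal sketch of one possible ordering, not a definitive recipe: take logarithms on the original positive scale first, and standardize the logged variable only if a standardized version is wanted. All variable names here (y, d1, d2, income, expend, region, sector) are hypothetical. The sketch also assumes that controls() accepts Stata's factor-variable notation, which bears on question 4 as well.

    ```stata
    * Hypothetical variable names throughout; adapt to your own dataset.
    * Take logs while the variables are still on their original, positive scale.
    generate double ln_income = ln(income)
    generate double ln_expend = ln(expend)

    * Optional: standardize the logged variable (mean 0, SD 1) afterwards.
    egen double z_ln_income = std(ln_income)

    * d1 is a 0/1 dummy and d2 is continuous: the two variables of interest.
    * Assumption: controls() accepts factor-variable notation, so i. expands
    * categorical controls into indicators and # builds interactions.
    xporegress y d1 d2, controls(i.region i.sector c.ln_income c.ln_expend ///
        i.region#c.ln_income i.sector#c.ln_expend)
    ```

    Standardizing before taking logs would leave ln() returning missing values for every observation at or below the mean, which is why the order matters.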
