
  • 2 questions about variable selection in matched data and how to deal with a nominal variable with many (70) levels (same data set)

    I’m using Stata 13.1. I have a 3 to 1 propensity score matched data set with one observation per subject. I was considering using -lars- as a model building/variable selection method and then analyzing with -clogit- but I do not believe that -lars- (as implemented in Stata) can be used with matched data. Perhaps there is a sort of conditional version of lasso or another method for use in Stata? I haven't had luck finding anything like that here.

    I also have a nominal dependent variable (employment - included in the propensity score calculation) in the same data set and it consists of 70 job classifications. Some of the levels are on the sparse side (<5) so I was wondering if there was a method to combine levels of this variable?
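
    For reference, the matched analysis step mentioned above might look like the sketch below. It only illustrates fitting -clogit- on the matched sets and does no variable selection. All variable names (outcome, treated, age, match_id) are hypothetical placeholders, and match_id is assumed to identify each 3:1 matched set.

    ```stata
    * Conditional logistic regression, conditioning on the matched set.
    * outcome, treated, age and match_id are hypothetical variable names.
    clogit outcome treated age, group(match_id) or
    ```

    The group() option tells -clogit- which observations belong to the same matched set, and the or option reports odds ratios instead of coefficients.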

  • #2
    Hello Joseph,

    Welcome to the Stata Forum/Statalist.

    With regard to your second question, if I understood it correctly, no special method is needed: you can simply aggregate the levels beforehand. In any case, I fear that 70 levels (with or without the propensity score) would be too many to present graphically or numerically.
    Best regards,

    Marcos
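
    As a minimal sketch of aggregating levels beforehand, one rule-based option is to pool every sparse classification into a single "other" category. It assumes employment is a numeric variable, that 999 is not already used as a code, and that "sparse" means fewer than 5 subjects. The names n_level and employment_agg are made up for illustration.

    ```stata
    * Number of subjects in each job classification
    bysort employment: generate n_level = _N

    * Copy the original variable, then pool sparse levels into code 999
    generate employment_agg = employment
    replace employment_agg = 999 if n_level < 5 & !missing(employment)

    * Inspect the aggregated variable
    tabulate employment_agg
    ```

    A frequency rule like this is reproducible, but it may pool jobs that have little in common, so grouping by subject-matter knowledge could still be preferable.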



    • #3
      Thank you Marcos. I was concerned that using a potentially subjective method might introduce bias. Does Stata have anything comparable to fused lasso?



      • #4
        Marcos, I was able to aggregate the levels. Now that I have done this, is there a conditional lasso in Stata? Or at least something similar that takes matching into account?



        • #5
          You should be perfectly capable of doing a search of official Stata yourself; try "hsearch lasso"

          I note that there is a user-written command that might be what you are looking for; type "search lars"



          • #6
            Hello Joseph,

            You may wish to take a close look at the user-written plogit. I don't have any experience with it. Judging by this post from a different forum, the help file is apparently not detailed enough.
            What is more, you will probably need to use some functionality of R from within Stata to get the best performance. Sorry, I have no experience with this sort of "match" (I mean, using R within Stata), because I feel more comfortable just "dating" (excuse the pun) both separately. That said, you will surely find a couple of members here in the Forum who are highly skilled on this topic as well.
            Best regards,

            Marcos



            • #7
              Originally posted by Rich Goldstein View Post
              You should be perfectly capable of doing a search of official Stata yourself; try "hsearch lasso"

              I note that there is a user-written command that might be what you are looking for; type "search lars"
              Thank you. As I mentioned, I did find "lars" but did not see any mention of its use with matched data. I ran "hsearch lasso" as you suggested, and since it brought up "lars" I checked it again. I may send an email to the author and ask him.



              • #8
                Originally posted by Marcos Almeida View Post
                Hello Joseph,

                You may wish to take a close look at the user-written plogit. I don't have any experience with it. Judging by this post from a different forum, the help file is apparently not detailed enough.
                What is more, you will probably need to use some functionality of R from within Stata to get the best performance. Sorry, I have no experience with this sort of "match" (I mean, using R within Stata), because I feel more comfortable just "dating" (excuse the pun) both separately. That said, you will surely find a couple of members here in the Forum who are highly skilled on this topic as well.
                Thank you, Marcos. I did try the "plogit" command and you are correct, the "help" is somewhat spartan. If I wish to use the lasso option, it appears I will need to experiment with different values of delta and/or epsilon; otherwise it returns the same answer as "logit". I may try contacting one of the authors, since it is unclear whether it is appropriate for use with matched data. I will also search this forum for posts by members who have "matched" (as you said) R and Stata.



                • #9
                  I did receive a response from Adrian Mander that stated "The program doesn't do anything fancy like matched data, and I'm not sure how you can trick it into considering the matched data as a single line of data". So it appears I won't be using "lars" for my matched data. I will see what I can find on the forum regarding members using R and Stata together.



                  • #10
                    Hello Joseph,

                    Thank you very much for sharing with the Forum the information on the (im)possibility of using plogit with matched data.
                    Best regards,

                    Marcos

