  • Logistic regression

    To whom it may concern,
    I am working on the topic "risk factors of child mortality" using logistic regression. I have selected 5 risk factors, and I want to know what happens if I include one risk factor chosen at random, then 2 risk factors (any two of the 5), then 3, and so on. Thanks!

  • #2
    Is your dependent variable binary? If so, then just sequentially run models:

    Code:
    logit depvar explavar(s), vce(cluster clustvar)
    Are your data longitudinal?
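
    A minimal sketch of what "sequentially run models" could look like, assuming a 0/1 outcome named child_mortality and five 0/1 risk factors named A to E (placeholder names, not taken from the actual data). Each pass adds one more regressor to the list; the cluster option is left out here because no cluster variable has been named.

    Code:
    * Placeholder names: child_mortality (outcome), A-E (risk factors)
    local rhs
    foreach v in A B C D E {
        * append the next risk factor and refit the model
        local rhs `rhs' `v'
        logit child_mortality `rhs'
    }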



    • #3
      Thank you for your reply. All my variables are coded 0/1 (the dependent variable is child mortality: 1 = dead, 0 = alive), so I think they are binary. The data are cross-sectional. May I ask what the (s), cluster, and clustvar mean in the code?



      • #4
        The (s) indicates that you may start with a single explanatory variable and then move on to several (hence the plural "variables").

        Clustering your standard errors (by child ID in your case) corrects for heteroscedasticity (it invokes Rogers standard errors, which are robust to heteroscedasticity) and, to give just the gist of the rest, relaxes the assumption that observations within a cluster are independent (you may want to read this paper: https://www.nber.org/papers/w24003; it explains clustering very well).
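
        A hedged sketch of the two options side by side, assuming hypothetical variable names (A to E for the risk factors, household_id for a grouping variable; neither comes from the actual data). With cross-sectional data that have one row per child, clustering on a child identifier should give essentially the same standard errors as vce(robust), so clustering mainly matters when children are grouped at a higher level.

        Code:
        * Hypothetical names: A-E (risk factors), household_id (grouping variable)
        * heteroscedasticity-robust standard errors
        logit child_mortality A B C D E, vce(robust)
        * cluster-robust standard errors, allowing correlation within households
        logit child_mortality A B C D E, vce(cluster household_id)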



        • #5
          Thank you for your reply. I have already run the multiple logistic regression with clustering. Maybe I did not state my question clearly: what I want is more like a loop, I guess. I would like the system to automatically and randomly choose two/three/four of the five risk factors and then run the logistic regression. Thank you!



          • #6
            Olivia, I think more details may help us understand your question. Is this what you'd like to do: run four logit regressions with one, two, three, and four regressors, respectively, where the regressors in each regression are selected randomly from the five?



            • #7
              Thank you for your reply.
              For example, my five risk factors are A, B, C, D, E, and my outcome is child_mortality.
              For 2 regressors, I want every combination of two of my risk factors: "logit child_mortality A B", "logit child_mortality A C", "logit child_mortality A D", etc.
              Similarly for 3 and 4 regressors. Doing this manually would be a lot of work, so what I want is a 'loop' that builds the combinations and runs the regressions automatically. I hope that makes it clear.
              Thank you.



              • #8
                I have a clumsy method, loops within loops, shown below. I believe there are easier ways.

                Suppose the five factors are x1, x2, x3, x4 and x5. The code below runs logit regressions for all combinations of three regressors.

                Code:
                forvalues i = 1(1)3 {
                    forvalues j = `=`i'+1'(1)4 {
                        forvalues k = `=`j'+1'(1)5 {
                            logit child_mortality x`i' x`j' x`k'
                        }
                    }
                }
                The code for the other cases is similar.
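
                As one illustration of a "similar" case, here is a sketch for all combinations of two regressors, under the same assumption that the factors are named x1 to x5. The inner index always starts one above the outer index, so each pair is run once and no factor is paired with itself.

                Code:
                * all pairs x`i' x`j' with i < j (assumes factors named x1-x5)
                forvalues i = 1(1)4 {
                    forvalues j = `=`i'+1'(1)5 {
                        logit child_mortality x`i' x`j'
                    }
                }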



                • #9
                  Thank you for your reply. But I have a question: if i, j, k are my 3 factors, does that mean I still have to combine the three factors manually? That is, I choose x1, x2, x3, and for the second round I have to manually change it to x1, x2, x3, rather than having the combinations built automatically from the 5 factors? Or do I misunderstand? Thank you!



                  • #10
                    Sorry, it should be "for the second round, x1, x2, x4".



                    • #11
                      I am not aware of any command that automatically goes through all combinations of regressors. My code above essentially tells Stata how to move from one combination to the next, which still simplifies the procedure. Otherwise, you would have to type 10 lines of code manually, like:

                      Code:
                      logit child_mortality x1 x2 x3
                      logit child_mortality x1 x2 x4
                      logit child_mortality x1 x2 x5
                      logit child_mortality x1 x3 x4
                      logit child_mortality x1 x3 x5
                      logit child_mortality x1 x4 x5
                      logit child_mortality x2 x3 x4
                      logit child_mortality x2 x3 x5
                      logit child_mortality x2 x4 x5
                      logit child_mortality x3 x4 x5
                      The advantage of the loops becomes more pronounced as the total number of factors grows. But again, there may be easier ways to do this.
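
                      The 10 lines above are the "5 choose 3" combinations. As a quick check of how many models each case involves, a small sketch using Stata's comb() function (the total of 5 factors is taken from this thread; change it for other problems):

                      Code:
                      * number of distinct models for each number of regressors k out of 5
                      forvalues k = 1/5 {
                          display "regressors = `k': " comb(5, `k') " models"
                      }

                      For 5 factors this gives 5, 10, 10, 5 and 1 models, 31 in total, and the count roughly doubles with every additional factor.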



                      • #12
                        You can use the command tuples to loop over all combinations of a list of variables.
                        Code:
                        ssc install tuples
                        Code:
                        tuples x1 x2 x3 x4 x5
                        
                        forvalues i = 1/`ntuples' {
                            logit child_mortality `tuple`i''
                        }
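
                        Without options, tuples builds every non-empty combination of the five variables (31 in total). To restrict the loop to combinations of one size, a sketch using the min() and max() options of tuples (assuming the installed version supports them; see help tuples) could look like this:

                        Code:
                        * only combinations of exactly two regressors
                        tuples x1 x2 x3 x4 x5, min(2) max(2)

                        forvalues i = 1/`ntuples' {
                            * show which combination is being fitted
                            display as text "Model `i': `tuple`i''"
                            logit child_mortality `tuple`i''
                        }

                        Changing both options to 3 or 4 gives the other cases.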



                        • #13
                          Thank you all for your advice. I came up with new code, which is easier and faster. This is an example for 2 regressors.
                          Code:
                          foreach x in A B C D E {
                              foreach y in A B C D E {
                                  * run each unordered pair once: skip x == y and reversed orderings
                                  if "`x'" < "`y'" {
                                      logit child_mortality `x' `y', or
                                  }
                              }
                          }
