  • Matrix for saving results of logistic regression

    Dear Stata Users,

    I run logistic regressions with the dependent variable HED and 8 independent variables (v1-v8), separately for each of 35 countries (COU).
    What I want to do is save the odds ratios for each variable and country. That is, every participant from a given country should carry the same value for each independent variable, namely that country's odds ratio.
    This can also be in a new dataset.
    Based on that I would like to run a cluster analysis to cluster the countries based on the results of the logistic regression.

    Can anyone help me with saving the coefficients? I learnt about the matrix command and found something like this on the forum:
    Code:
    matrix list r(table)
    matrix p_`x' = r(table)
    matrix f_`x' = p_`x'[1, 1...]
    matrix l_`x' = (`x')
    ... but I really do not know how to apply it to my analyses.


    My command for the logistic regression is:

    Code:
    levelsof COU, local(levels)
    foreach j of local levels {
    display `j'
    *forval j = 100/`r(max)' {
    *di "{title:`: label (nation) `j''}"
    logistic HED v1 v2 v3 v4 v5 v6 v7 v8 if COU == `j'
    }
    I would be really happy to get your help. It does not necessarily have to be the matrix command; I am happy to hear about other ways of saving the results as well!

    Best
    Anne

  • #2
    In this example the occupational categories play the role of country

    Code:
    // open example data
    frames reset
    sysuse nlsw88, clear
    
    // prepare the data
    
    gen byte occat = cond(occupation < 3, 1,                    ///
                     cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                     if !missing(occupation)
    label variable occat "occupational category"
    label define occat 1 "white collar" ///
                       2 "skilled"      ///
                       3 "unskilled"
    label value occat occat
    
    // make a copy of the data and use that to create a "dataset" of odds ratios
    frame copy default or
    frame change or
    statsby _b _se, by(occat) clear : logit union i.south grade, or
    
    // statsby stores the coefficients (log odds ratios);
    // to turn them into odds ratios we use or = exp(coefficient)
    gen or_south   = exp(_stat_2)
    gen or_grade   = exp(union_b_grade)
    gen odds_cons  = exp(union_b_cons)
    
    // use the delta method to compute the standard errors:
    // se_or = exp(coefficient)*se_coefficient
    gen se_south = exp(_stat_2)*_stat_6
    gen se_grade = exp(union_b_grade)*union_se_grade
    gen se_cons  = exp(union_b_cons)*union_se_cons
    
    // the confidence interval is computed using the confidence interval
    // of the coefficients and then exponentiating the upper and lower bounds
    // see: https://www.stata.com/support/faqs/statistics/delta-rule/
    gen lb_south = exp(_stat_2       - invnormal(0.975)*_stat_6)
    gen lb_grade = exp(union_b_grade - invnormal(0.975)*union_se_grade)
    gen lb_cons  = exp(union_b_cons  - invnormal(0.975)*union_se_cons)
    gen ub_south = exp(_stat_2       + invnormal(0.975)*_stat_6)
    gen ub_grade = exp(union_b_grade + invnormal(0.975)*union_se_grade)
    gen ub_cons  = exp(union_b_cons  + invnormal(0.975)*union_se_cons)
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
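
    Since the stated aim is a cluster analysis of the countries, a minimal sketch of that next step may help. It assumes the one-row-per-group dataset that statsby leaves behind, and it names the collected statistics explicitly so that no _stat_# variables need to be looked up (b_south and b_grade are names made up for this illustration). Clustering on the coefficient (log odds ratio) scale, Ward's linkage, and the choice of two groups are all assumptions of the sketch, not recommendations from the thread:

    ```stata
    // open example data and build the grouping variable used above
    sysuse nlsw88, clear
    gen byte occat = cond(occupation < 3, 1,                    ///
                     cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                     if !missing(occupation)

    // one row per group; b_south and b_grade are hypothetical names
    statsby b_south=_b[south] b_grade=_b[grade], by(occat) clear : ///
        logit union south grade

    // odds ratios for reporting
    gen or_south = exp(b_south)
    gen or_grade = exp(b_grade)

    // hierarchical clustering of the groups on the log odds ratios
    // (Ward's linkage and 2 groups are arbitrary choices for this sketch)
    cluster wardslinkage b_south b_grade, name(ward)
    cluster generate grp = groups(2), name(ward)

    list occat or_south or_grade grp
    ```

    With 35 countries and 8 predictors the same pattern applies, with COU in by() and one named statistic per predictor.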


    • #3
      Thank you very much for the fast reply, this was indeed very helpful.
      What I used was
      statsby _b _se, by(COU) clear : logistic HED v1 v2 v3 v4 v5 v6 v7 v8
      and it worked, but I think I did not express my goal clearly enough: I would like to have the value for each participant of each country, and now there is only one observation per country.
      What do I have to change?

      Thank you so much!

      Best
      Anne


      • #4
        Code:
        // open example data
        frames reset
        sysuse nlsw88, clear
        
        // prepare the data
        
        gen byte occat = cond(occupation < 3, 1,                    ///
                         cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                         if !missing(occupation)
        label variable occat "occupational category"
        label define occat 1 "white collar" ///
                           2 "skilled"      ///
                           3 "unskilled"
        label value occat occat
        
        // make a copy of the data and use that to create a "dataset" of odds ratios
        frame copy default or
        frame change or
        statsby _b, by(occat) clear : logit union i.south grade, or
        
        // statsby stores the coefficients (log odds ratios);
        // to turn them into odds ratios we use or = exp(coefficient)
        gen or_south   = exp(_stat_2)
        gen or_grade   = exp(union_b_grade)
        gen odds_cons  = exp(union_b_cons)
        
        // move back to the original data
        frame change default
        
        // establish a link between or and default
        // 9 observations aren't matched because they have missing values on occat
        frlink m:1 occat , frame(or)
        
        // move the or variables to the original data
        frget or_south or_grade odds_cons, from(or)
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------


        • #5
          Sorry, I do not really get it.

          Firstly,
          frame copy default or
          frame change or
          does not work; Stata does not recognise a command frame?

          Then, I do not understand why I need to convert the coefficients to odds ratios, because with my command logistic I already get odds ratios, right?

          Sorry for my stupid questions.

          Maybe
          statsby _b _se, by(COU) clear : logistic HED v1 v2 v3 v4 v5 v6 v7 v8
          (or something similar) is already the solution. Here I have one value for each country. Is there a possibility to add it to my previous dataset, filled in for each participant? It should be matched via the country variable or so (I think you tried to show me, but it did not work, sorry!).


          • #6
            As to the first question: what version of Stata are you using?

            As to the second question: logistic displays the odds ratios, but under the hood it is still a regular logistic regression. This means that the coefficients that logistic leaves behind, and that statsby collects, are the log(odds ratios) and not the odds ratios.
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------
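
            A minimal sketch of this point, using the nlsw88 example data from earlier in the thread rather than the original poster's data (the scalar name b_logistic is made up for the illustration). The coefficient stored after logistic is on the log scale and has to be exponentiated to recover the displayed odds ratio:

            ```stata
            sysuse nlsw88, clear

            // logistic shows odds ratios in its output table
            logistic union south grade
            scalar b_logistic = _b[grade]

            // what is displayed is the odds ratio, what is stored is its logarithm
            display "stored coefficient (log odds ratio): " _b[grade]
            display "odds ratio shown in the table:       " exp(_b[grade])

            // logit without the or option shows the stored coefficient directly
            logit union south grade
            display "coefficient stored by logit:         " _b[grade]
            ```

            Both commands should leave the same _b[grade] behind; only the displayed table differs.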


            • #7
              To the first question: it is Stata 15.1.

              To the second one: thanks, that's an important aspect. Got it!


              • #8
                What I tried is:
                foreach x of numlist 100 191 196 203 208 233 246 300 352 380 428 440 470 528 578 616 620 642 703 705 752 804 {
                display `x'
                logistic HED v1 v2 v3 v4 v5 v6 v7 v8 [pweight=WEIGHT] if COUNTRY==`x'
                matrix p_`x' = r(table)
                matrix f_`x' = p_`x'[1, 1...]
                matrix l_`x' = (`x')
                matrix p_`x' = f_`x', l_`x'
                }
                but I don't know how to go on.
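
                One way such a loop could be continued, as a sketch only: collect one row per group in a single matrix and then turn that matrix into a dataset with svmat. The sketch uses the nlsw88 example data and the occat grouping from earlier in the thread instead of the country data, exponentiates the stored coefficients itself rather than reading r(table), and the matrix names row and results are made up for the illustration:

                ```stata
                sysuse nlsw88, clear
                gen byte occat = cond(occupation < 3, 1,                    ///
                                 cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                                 if !missing(occupation)

                // start from a clean slate in case the matrix already exists
                capture matrix drop results

                levelsof occat, local(levels)
                foreach x of local levels {
                    quietly logistic union south grade if occat == `x'
                    // group identifier followed by exp(stored coefficient)
                    matrix row = (`x', exp(_b[south]), exp(_b[grade]), exp(_b[_cons]))
                    // append the row; nullmat() lets the first pass start from nothing
                    matrix results = nullmat(results) \ row
                }
                matrix colnames results = occat or_south or_grade odds_cons
                matrix list results

                // turn the matrix into a dataset with one observation per group
                clear
                svmat double results, names(col)
                list
                ```

                The resulting dataset has the same shape as the one statsby produces, so it can be saved or merged back in the same way.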


                • #9
                  Originally posted by anne jagdberg:
                  To the first question: it is Stata 15.1.
                  That is important information; so important that the Statalist FAQ asks you to mention it at the very beginning. There is nothing wrong with using an older version of Stata, but if you tell us nothing we have to make an assumption about which version you are using, and that assumption is that you are using the latest version. So now you know: the next time you ask a question on Statalist, mention your version of Stata.

                  Adapting my example to that older version, which does not have frames (they were introduced in Stata 16), gives:

                  Code:
                  // open example data
                  sysuse nlsw88, clear
                  
                  // prepare the data
                  
                  gen byte occat = cond(occupation < 3, 1,                    ///
                                   cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                                   if !missing(occupation)
                  label variable occat "occupational category"
                  label define occat 1 "white collar" ///
                                     2 "skilled"      ///
                                     3 "unskilled"
                  label value occat occat
                  
                  // make and save temporary copy of the data
                  tempfile tofill
                  save `tofill'
                  
                  // create a "dataset" of odds ratios
                  statsby _b, by(occat) clear : logit union i.south grade, or
                  
                  // statsby stores the coefficients (log odds ratios);
                  // to turn them into odds ratios we use or = exp(coefficient)
                  gen or_south   = exp(_stat_2)
                  gen or_grade   = exp(union_b_grade)
                  gen odds_cons  = exp(union_b_cons)
                  
                  // keep only the relevant variables
                  keep occat or_south or_grade odds_cons
                  
                  // merge those variables to the original data
                  merge 1:m occat using `tofill'

                  ---------------------------------
                  Maarten L. Buis
                  University of Konstanz
                  Department of history and sociology
                  box 40
                  78457 Konstanz
                  Germany
                  http://www.maartenbuis.nl
                  ---------------------------------
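
                  An alternative to the statsby-and-merge route above, sketched under the same example data: loop over the groups and write each group's odds ratios straight into the original dataset, so that every participant carries the value of their group. It uses no frames and no merge. The variable names or_south and or_grade follow the example above; applying it to the country data would mean replacing occat, union, south and grade with COU, HED and v1-v8, which is an assumption about that dataset:

                  ```stata
                  sysuse nlsw88, clear
                  gen byte occat = cond(occupation < 3, 1,                    ///
                                   cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                                   if !missing(occupation)

                  // empty variables that will hold each group's odds ratios
                  gen double or_south = .
                  gen double or_grade = .

                  levelsof occat, local(levels)
                  foreach j of local levels {
                      quietly logistic union south grade if occat == `j'
                      // every observation in the current group gets that group's odds ratio
                      quietly replace or_south = exp(_b[south]) if occat == `j'
                      quietly replace or_grade = exp(_b[grade]) if occat == `j'
                  }

                  // one distinct value per group, repeated for each participant
                  tabulate occat, summarize(or_grade)
                  ```

                  Observations with a missing group value keep missing odds ratios, which mirrors the unmatched observations in the merge.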


                  • #10
                    Thank you very much, and my sincere apologies for not mentioning the version. Next time I will do so for sure.
                    Again, thanks a lot for your patience and for giving advice so quickly.

                    Best
                    Anne
