  • Reference Category in Ordered Logit

    Dear all,

    I am using Stata 12 for Mac. I am running an ordered logit regression to examine the relationship between employment status and poverty. My regressor is employment status with 6 levels (1=self-employed, 2=self-employed with temporary workers, 3=self-employed with permanent workers, 4=employee, 5=casual worker, 6=family worker), and I want to use one status, "employee", as the reference category. How do I create a dummy variable for this?

    gen self_employ=1 if employ_status==1
    replace self_employ=0 if employ_status==4

    Is that command right? If so, how should I treat the missing values that the other employment statuses get in that variable? Will they cause a problem in the regression?

    Any feedback or help would be much appreciated!

    Thanks

  • #2
    See -help fvvarlist-. If employment status is an explanatory variable, just do something like

    ologit poverty ib3.employ_status

    If employment status is your dependent variable, it doesn't make any sense to refer to a reference category in ologit.
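
A minimal sketch of the factor-variable approach, assuming (as in #1) that employ_status is coded 1 to 6 with employee = 4, so the base level would be requested with ib4. rather than ib3. The data are simulated purely for illustration, and poverty is a hypothetical ordered outcome with three levels.

```stata
clear
set seed 12345
set obs 600

* Simulated, purely illustrative data (not from the thread)
* employ_status takes the values 1..6, coded as in #1
generate employ_status = 1 + floor(6*runiform())
* poverty is a hypothetical ordered outcome with values 1..3
generate poverty = 1 + floor(3*runiform())

* ib4. makes level 4 (employee) the base category of the regressor
ologit poverty ib4.employ_status

* The base level is kept in e(b) with its coefficient fixed at 0
matrix b = e(b)
matrix list b
```

Any other status can be made the base by changing the number after ib.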
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Hi Richard,

      Yes, it is solved now! Thank you so much for your answer.



      • #4
        Tungga:
        if you want to deal with the missing-values issue, you should first check whether the missingness is informative.
        You may want to take a look at -mi- and related entries in the Stata .pdf manual.
        You may also be interested in Paul Allison's http://www.sagepub.in/textbooks/Book9419 and in www.missingdata.org.uk/; the latter is maintained by Jonathan Bartlett, whose posts appear on this forum from time to time.
        Stata applies listwise deletion to observations with missing values in any of the variables (dependent and/or independent); this reduces your sample size (with the attendant statistical problems).
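
A minimal sketch of how listwise deletion shows up in practice, assuming the auto dataset shipped with Stata (in which rep78 has some missing values). It is only meant to show where to look, not how to handle the missing data.

```stata
sysuse auto, clear

* rep78 has missing values in this example dataset
misstable summarize rep78
count if missing(rep78)

* Estimation commands drop incomplete observations without an error
ologit rep78 i.foreign
display "observations in data: " _N "   used in estimation: " e(N)

* e(sample) flags the observations actually used
count if !e(sample)
```

Comparing _N with e(N) after any estimation command is a quick way to see how many observations were lost.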
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          I thought Tungga was referring to the missing values that would have been created with the originally proposed coding strategy. That shouldn't be a concern when using factor variables. But if the problem is indeed broader than what I thought then I second Carlo's advice. Those are good references.
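
A minimal sketch of the difference, assuming simulated data with employ_status coded 1 to 6 as in #1 and a hypothetical three-level poverty outcome. With the coding proposed in #1, every status other than 1 and 4 is missing in the dummy and is therefore dropped; with factor-variable notation all statuses stay in the sample.

```stata
clear
set seed 2024
set obs 600

* Simulated, purely illustrative data (not from the thread)
generate employ_status = 1 + floor(6*runiform())
generate poverty = 1 + floor(3*runiform())

* Coding proposed in #1 (with == for the comparison):
* statuses 2, 3, 5 and 6 are left missing
generate self_employ = 1 if employ_status == 1
replace self_employ = 0 if employ_status == 4
count if missing(self_employ)

* Those observations are dropped from the estimation sample
ologit poverty self_employ
scalar n_dummy = e(N)

* With factor-variable notation every status is kept
ologit poverty ib4.employ_status
scalar n_factor = e(N)

display "N with single dummy: " n_dummy "   N with ib4.: " n_factor
```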
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology
          StataNow Version: 19.5 MP (2 processor)

          WWW: https://www3.nd.edu/~rwilliam



          • #6
            This topic is far from my field. That said, IMHO, it is debatable whether the outcome variable in #1 is really ordered.

            If it is not, a multinomial logistic regression with -mlogit- could be an interesting solution.

            By the way, with -mlogit- you may select the baseline outcome level as you wish. You need not stick with the lowest or the highest level.

            You may choose any level as "the" reference, but it is better to select one of the most prevalent ones.
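
A minimal sketch of choosing the base outcome in -mlogit-, assuming simulated data in which employ_status (coded 1 to 6 as in #1) is the outcome and age is a hypothetical covariate. When baseoutcome() is omitted, -mlogit- uses the most frequent outcome as the base.

```stata
clear
set seed 4321
set obs 900

* Simulated, purely illustrative data (not from the thread)
generate employ_status = 1 + floor(6*runiform())
generate age = 20 + floor(40*runiform())

* Default: the most frequent outcome is the base
mlogit employ_status age
display "default base outcome: " e(baseout)

* Explicit choice: employee (code 4) as the base outcome
mlogit employ_status age, baseoutcome(4)
display "chosen base outcome: " e(baseout)
```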
            Best regards,

            Marcos



            • #7
              Dear all,
              How do I specify the reference category for the dependent variable in an ordered logit in Stata? When I use ib. with the dependent variable, I get the following message: depvar may not be a factor variable.
              Best regards



              • #8
                Nahed:
                see Richard's helpful reply #2 in this thread.
                In addition, quoting the -ologit- entry in the Stata .pdf manual (page 1833 for Stata 16):
                "ologit and oprobit begin by tabulating the dependent variable. Category i = 1 is defined as the minimum value of the variable, i = 2 as the next ordered value, and so on, for the empirically determined k categories."
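
Since the categories are simply taken in increasing order of the dependent variable, the only thing that can be changed is the direction of the ordering. A minimal sketch, assuming the auto dataset shipped with Stata and a hypothetical reversed copy of rep78 (rep78_rev is not part of the dataset): reversing the order flips the sign of the coefficient and leaves its magnitude unchanged.

```stata
sysuse auto, clear

* Original ordering: 1 is the lowest category, 5 the highest
ologit rep78 i.foreign
scalar b_orig = _b[1.foreign]

* Hypothetical reversed outcome: 5 becomes 1, ..., 1 becomes 5
generate rep78_rev = 6 - rep78
ologit rep78_rev i.foreign
scalar b_rev = _b[1.foreign]

display "original: " b_orig "   reversed: " b_rev
```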
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Thank you for your answer. So, if I understand correctly, Stata by default takes category 1 as the reference in an ordered logit.
                  Best regards



                  • #10
                    Nahed:
                    the following toy-example hopefully clarifies what happens behind the -ologit- curtain:
                    Code:
                    . use "C:\Program Files\Stata16\ado\base\a\auto.dta"
                    . ologit rep78 i.foreign
                    
                    Iteration 0:   log likelihood = -93.692061 
                    Iteration 1:   log likelihood = -79.696089 
                    Iteration 2:   log likelihood = -79.034005 
                    Iteration 3:   log likelihood = -79.029244 
                    Iteration 4:   log likelihood = -79.029243 
                    
                    Ordered logistic regression                     Number of obs     =         69
                                                                    LR chi2(1)        =      29.33
                                                                    Prob > chi2       =     0.0000
                    Log likelihood = -79.029243                     Pseudo R2         =     0.1565
                    
                    ------------------------------------------------------------------------------
                           rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                         foreign |
                        Foreign  |    2.98155   .6203644     4.81   0.000     1.765658    4.197442
                    -------------+----------------------------------------------------------------
                           /cut1 |  -3.158382   .7224269                     -4.574313   -1.742452
                           /cut2 |  -1.362642   .3557343                     -2.059868   -.6654154
                           /cut3 |   1.232161   .3431227                      .5596532     1.90467
                           /cut4 |   3.246209   .5556657                      2.157124    4.335293
                    ------------------------------------------------------------------------------
                    
                    . mat list e(b)
                    
                    e(b)[1,6]
                             rep78:      rep78:          /:          /:          /:          /:
                                0b.          1.                                               
                           foreign     foreign        cut1        cut2        cut3        cut4
                    y1           0     2.98155  -3.1583824  -1.3626418   1.2321614   3.2462088
                    
                    . mat list e(cat)
                    
                    e(cat)[1,5]
                        c1  c2  c3  c4  c5
                    r1   1   2   3   4   5
                    
                    .
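
The output above has no omitted category for rep78: all five categories are handled through the four cutpoints. A minimal sketch of how the cutpoints turn into probabilities, assuming the auto dataset shipped with Stata and the column order of e(b) shown above (the scalar names are only illustrative):

```stata
sysuse auto, clear
ologit rep78 i.foreign

* Columns of e(b): 0b.foreign, 1.foreign, cut1, ..., cut4
matrix b = e(b)
scalar beta = b[1,2]
scalar cut1 = b[1,3]
scalar cut4 = b[1,6]

* P(rep78 = 1) = invlogit(cut1 - xb)
* P(rep78 = 5) = 1 - invlogit(cut4 - xb)
scalar p1_domestic = invlogit(cut1)
scalar p5_foreign = 1 - invlogit(cut4 - beta)

* Compare with the probabilities returned by predict
predict double p1 p2 p3 p4 p5, pr
summarize p1 if foreign == 0, meanonly
display "by hand: " p1_domestic "   predict: " r(mean)
summarize p5 if foreign == 1, meanonly
display "by hand: " p5_foreign "   predict: " r(mean)
```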
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Dear all,
                      What are the hypotheses to be tested before running an ordered logit regression?
                      Best regards.



                      • #12
                        Nahed:
                        the trivial assumption to be tested is that the levels of your categorical variable (regressand) are actually ranked (bad; decent; good).
                        Conversely, if the levels of your regressand are, say, the colors of a traffic light (green; yellow; red), you should go with -mlogit-.
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #13
                          Thank you, Carlo. Could you please clarify the difference between odds ratios and marginal effects in an ordered logit regression?
                          Best regards



                          • #14
                            Nahed:
                            your first question can be answered with the help of the usual toy example:
                            Code:
                            . use "https://www.stata-press.com/data/r16/fullauto.dta"
                            (Automobile Models)
                            
                            . ologit rep78 i.foreign, or
                            Iteration 0:   log likelihood = -93.692061
                            Iteration 1:   log likelihood = -79.696089
                            Iteration 2:   log likelihood = -79.034005
                            Iteration 3:   log likelihood = -79.029244
                            Iteration 4:   log likelihood = -79.029243
                            
                            Ordered logistic regression                     Number of obs     =         69
                                                                            LR chi2(1)        =      29.33
                                                                            Prob > chi2       =     0.0000
                            Log likelihood = -79.029243                     Pseudo R2         =     0.1565
                            
                            ------------------------------------------------------------------------------
                                   rep78 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                 foreign |
                                Foreign  |   19.71836   12.23257     4.81   0.000     5.845417    66.51596
                            -------------+----------------------------------------------------------------
                                   /cut1 |  -3.158382   .7224269                     -4.574313   -1.742452
                                   /cut2 |  -1.362642   .3557343                     -2.059868   -.6654154
                                   /cut3 |   1.232161   .3431227                      .5596532     1.90467
                                   /cut4 |   3.246209   .5556657                      2.157124    4.335293
                            ------------------------------------------------------------------------------
                            Note: Estimates are transformed only in the first equation.
                            
                            . ologit rep78 i.foreign
                            
                            Iteration 0:   log likelihood = -93.692061
                            Iteration 1:   log likelihood = -79.696089
                            Iteration 2:   log likelihood = -79.034005
                            Iteration 3:   log likelihood = -79.029244
                            Iteration 4:   log likelihood = -79.029243
                            
                            Ordered logistic regression                     Number of obs     =         69
                                                                            LR chi2(1)        =      29.33
                                                                            Prob > chi2       =     0.0000
                            Log likelihood = -79.029243                     Pseudo R2         =     0.1565
                            
                            ------------------------------------------------------------------------------
                                   rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                 foreign |
                                Foreign  |    2.98155   .6203644     4.81   0.000     1.765658    4.197442
                            -------------+----------------------------------------------------------------
                                   /cut1 |  -3.158382   .7224269                     -4.574313   -1.742452
                                   /cut2 |  -1.362642   .3557343                     -2.059868   -.6654154
                                   /cut3 |   1.232161   .3431227                      .5596532     1.90467
                                   /cut4 |   3.246209   .5556657                      2.157124    4.335293
                            ------------------------------------------------------------------------------
                            
                            . di exp(2.98155)
                            19.718356
                            
                            .
                            As far as your (too broad) second question is concerned, please see the -margins- entry in the Stata .pdf manual. Thanks.
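
As a rough starting point, a minimal sketch assuming the auto dataset shipped with Stata: the odds ratio is the exponentiated coefficient and is the same for every category of rep78, whereas a marginal effect is a change in the predicted probability of one particular category (here rep78 = 5, picked only for illustration).

```stata
sysuse auto, clear
ologit rep78 i.foreign

* Odds ratio: exponentiated coefficient, one value for all categories
display "odds ratio: " exp(_b[1.foreign])

* Marginal effect: change in P(rep78 = 5) when foreign goes from 0 to 1
margins, dydx(foreign) predict(outcome(5))
matrix m = r(b)
matrix list m
```

Repeating -margins- with outcome(1) to outcome(4) gives the effect on each of the other categories.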
                            Kind regards,
                            Carlo
                            (Stata 19.0)



                            • #15
                              Dear all,
                              How do I set a reference category for the dependent variable in a binary logit?
                              Best regards
