  • Constraining coefficients to equality for all non-base categories

    When using a categorical dependent variable (as in a multinomial logit), one can constrain the coefficients on x to equality for, say, categories cat2 and cat3 (cat1 being the base category) by defining the following constraint:
    Code:
    constraint 1 [cat2=cat3]: x
    Suppose that one has N categories and wants to constrain the coefficients on x to equality for all N-1 non-base categories. Is there a way to do this by defining a single constraint? I mean something like
    Code:
    constraint 1 [(all categories)]: x
    where (all categories) is what I don't know how to specify.

    Many thanks to anyone who can provide a clue.

  • #2
    The easiest way to do this is not to use constraints at all. Create a new variable equal to 0 for the base category of x and 1 for all the non-base categories (and missing for any observations where x is missing). Then run your regression using this new variable instead of x.
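
    In code, that might look like the following minimal sketch (the names y and x_bin, and the assumption that x's base category is coded 1, are placeholders, not part of the original suggestion):
    Code:
    * collapse the categorical regressor into a base/non-base indicator
    gen byte x_bin = (x != 1) if !missing(x)   // assumes x's base category is coded 1
    regress y x_bin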



    • #3
      Thanks Clyde, but I'm not sure I get your point. Here's a simple example of what I would like to do:
      Code:
      clear all
      set seed 666
      
      set obs 1000
      gen id = _n
      
      gen x = runiform()
      gen eps = runiform()
      
      gen tot = x+eps
      sum tot, d
      
      gen y = .
      replace y = 0 if tot <= .71
      replace y = 1 if tot >  .71 & tot <= 1.01
      replace y = 2 if tot > 1.01 & tot <= 1.31
      replace y = 3 if tot > 1.31
      
      drop tot
      
      tab y
      
      constraint 1 [1=2]: x
      constraint 2 [2=3]: x
      mlogit y x, baseoutcome(0) constraints(1 2)
      The way I understand your suggestion is:
      Code:
      gen x1 = y != 0
      mlogit y x1, baseoutcome(0)
      which does not work because there is no variability in x1 among the non-base categories.



      • #4
        I think you just need one more constraint: a set of constraints to make the coefficients equal, plus one more to pin down the constrained value. You don't get sensible results in your example because it is just a toy dataset, but it shows the idea.

        Code:
        constraint 1 [1=2]: x
        constraint 2 [2=3]: x
        constraint 3 [1]: x=1
        mlogit y x,  baseoutcome(0) constraints(1 2 3)



        • #5
          Sorry, I misread what you wrote. I thought you were trying to constrain the levels of an explanatory variable, not an outcome. Here's how it works in that situation, in case somebody is interested:
          Code:
          . sysuse auto, clear
          (1978 automobile data)
          
          .
          . regress price mpg i.rep78
          
                Source |       SS           df       MS      Number of obs   =        69
          -------------+----------------------------------   F(5, 63)        =      4.39
                 Model |   149020603         5  29804120.7   Prob > F        =    0.0017
              Residual |   427776355        63  6790100.88   R-squared       =    0.2584
          -------------+----------------------------------   Adj R-squared   =    0.1995
                 Total |   576796959        68  8482308.22   Root MSE        =    2605.8
          
          ------------------------------------------------------------------------------
                 price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   mpg |  -280.2615   61.57666    -4.55   0.000    -403.3126   -157.2103
                       |
                 rep78 |
                    2  |   877.6347   2063.285     0.43   0.672     -3245.51     5000.78
                    3  |   1425.657   1905.438     0.75   0.457    -2382.057    5233.371
                    4  |   1693.841   1942.669     0.87   0.387    -2188.274    5575.956
                    5  |   3131.982   2041.049     1.53   0.130    -946.7282    7210.693
                       |
                 _cons |   10449.99   2251.041     4.64   0.000     5951.646    14948.34
          ------------------------------------------------------------------------------
          
          .
          . constraint 1 3.rep78 = 2.rep78
          
          . constraint 2 4.rep78 = 3.rep78
          
          . constraint 3 5.rep78 = 4.rep78
          
          . cnsreg price mpg i.rep78, constraints(1 2 3)
          
          Constrained linear regression                        Number of obs =        69
                                                               F(2, 66)      =      9.18
                                                               Prob > F      =    0.0003
                                                               Root MSE      = 2614.9289
          
           ( 1)  - 2.rep78 + 3.rep78 = 0
           ( 2)  - 3.rep78 + 4.rep78 = 0
           ( 3)  - 4.rep78 + 5.rep78 = 0
          ------------------------------------------------------------------------------
                 price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   mpg |  -226.7809   54.05666    -4.20   0.000    -334.7085   -118.8533
                       |
                 rep78 |
                    2  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                    3  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                    4  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                    5  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                       |
                 _cons |   9326.899   2169.696     4.30   0.000      4994.96    13658.84
          ------------------------------------------------------------------------------
          
          .
          . recode rep78 (3/5 = 2), gen(rep78_red)
          (59 differences between rep78 and rep78_red)
          
          . regress price mpg i.rep78_red
          
                Source |       SS           df       MS      Number of obs   =        69
          -------------+----------------------------------   F(2, 66)        =      9.18
                 Model |   125498634         2  62749316.9   Prob > F        =    0.0003
              Residual |   451298325        66  6837853.41   R-squared       =    0.2176
          -------------+----------------------------------   Adj R-squared   =    0.1939
                 Total |   576796959        68  8482308.22   Root MSE        =    2614.9
          
          ------------------------------------------------------------------------------
                 price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   mpg |  -226.7809   54.05666    -4.20   0.000    -334.7085   -118.8533
           2.rep78_red |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                 _cons |   9326.899   2169.696     4.30   0.000      4994.96    13658.84
          ------------------------------------------------------------------------------
          Note that the single coefficient for rep78_red gives exactly the same results as the several constrained coefficients for the original levels of rep78.
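
          Purely as a sketch, you can also verify the equivalence programmatically by pulling both coefficients with _b[] (this reuses the constraints and rep78_red defined above):
          Code:
          * compare the constrained and the recoded coefficient directly
          quietly cnsreg price mpg i.rep78, constraints(1 2 3)
          local b_cns = _b[2.rep78]
          quietly regress price mpg i.rep78_red
          display "constrained: " `b_cns' "   recoded: " _b[2.rep78_red]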

          You can, in fact, do something similar with an outcome variable, but it doesn't work quite as well because you lose all the information about the constant terms at those levels.
          Code:
          . constraint 1 [2=3]:mpg
          
          . constraint 2 [3=4]:mpg
          
          . constraint 3 [4=5]:mpg
          
          . mlogit rep78 mpg, constraints(1 2 3) baseoutcome(1)
          
          Iteration 0:  Log likelihood = -93.692061  
          Iteration 1:  Log likelihood = -93.689471  
          Iteration 2:  Log likelihood = -93.689468  
          
          Multinomial logistic regression                         Number of obs =     69
                                                                  Wald chi2(1)  =   0.01
          Log likelihood = -93.689468                             Prob > chi2   = 0.9431
          
           ( 1)  [2]mpg - [3]mpg = 0
           ( 2)  [3]mpg - [4]mpg = 0
           ( 3)  [4]mpg - [5]mpg = 0
          ------------------------------------------------------------------------------
                 rep78 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
          1            |  (base outcome)
          -------------+----------------------------------------------------------------
          2            |
                   mpg |   .0090203   .1263047     0.07   0.943    -.2385324    .2565731
                 _cons |   1.195531   2.768717     0.43   0.666    -4.231054    6.622117
          -------------+----------------------------------------------------------------
          3            |
                   mpg |   .0090203   .1263047     0.07   0.943    -.2385324    .2565731
                 _cons |   2.517287   2.752113     0.91   0.360    -2.876755     7.91133
          -------------+----------------------------------------------------------------
          4            |
                   mpg |   .0090203   .1263047     0.07   0.943    -.2385324    .2565731
                 _cons |   2.006462   2.756148     0.73   0.467    -3.395488    7.408412
          -------------+----------------------------------------------------------------
          5            |
                   mpg |   .0090203   .1263047     0.07   0.943    -.2385324    .2565731
                 _cons |   1.513985   2.762554     0.55   0.584     -3.90052    6.928491
          ------------------------------------------------------------------------------
          
          .
          . mlogit rep78_red mpg, baseoutcome(1)
          
          Iteration 0:  Log likelihood =  -9.052649  
          Iteration 1:  Log likelihood = -9.0500588  
          Iteration 2:  Log likelihood = -9.0500558  
          Iteration 3:  Log likelihood = -9.0500558  
          
          Multinomial logistic regression                         Number of obs =     69
                                                                  LR chi2(1)    =   0.01
                                                                  Prob > chi2   = 0.9426
          Log likelihood = -9.0500558                             Pseudo R2     = 0.0003
          
          ------------------------------------------------------------------------------
             rep78_red | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
          1            |  (base outcome)
          -------------+----------------------------------------------------------------
          2            |
                   mpg |   .0090207    .126305     0.07   0.943    -.2385325    .2565739
                 _cons |   3.320776    2.74877     1.21   0.227    -2.066714    8.708266
          ------------------------------------------------------------------------------
          As you can see, the coefficient of mpg in the rep78_red regression equals, apart from the tiny rounding errors that are common with maximum likelihood estimation, the constrained mpg coefficients in the constrained model. The constrained model, however, still gives you separate constant terms at each level, which are lost in my approach.
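
          One concrete payoff of keeping those constants, shown here only as a minimal sketch: you can still run Wald tests on them (the bracketed names [2], [3], ... refer to the outcome-level equations), which is impossible after collapsing the outcome into rep78_red.
          Code:
          * refit the constrained model, then test two of the retained constants
          quietly mlogit rep78 mpg, constraints(1 2 3) baseoutcome(1)
          test [2]_cons = [3]_cons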



          • #6
            My reading of the question is not that the OP does not know how to specify multiple constraints, but that he is asking whether there is a convenient way to specify them all at once. With tuples from SSC, one can write a short program that simplifies this.

            Code:
            ssc install tuples, replace
            The following can be generalized to any variable with a finite number of levels (within limits). Only the line marked SPECIFY VARIABLE needs to change.

            Code:
            macro drop _all
            *LOAD DATASET
            sysuse auto, clear
            *SPECIFY VARIABLE (change only this line)
            local var "rep78"
            levelsof `var', local(all)
            *drop the base (first) level; assumes it is a single character
            local all= substr("`all'", 2,.)
            *form all pairs that include the first non-base level
            tuples `all', max(2) min(2) conditionals(1)
            local constraints
            forval i= 1/`ntuples'{
                constraint `i' `=word("`tuple`i''", 1)'.`var' = `=word("`tuple`i''", 2)'.`var'
                local constraints "`constraints' `i'"
            }
            cnsreg price mpg i.`var', constraints(`constraints')
            Result:

            Code:
            . cnsreg price mpg i.`var', constraints(`constraints')
            
            Constrained linear regression                        Number of obs =        69
                                                                 F(2, 66)      =      9.18
                                                                 Prob > F      =    0.0003
                                                                 Root MSE      = 2614.9289
            
             ( 1)  2.rep78 - 5.rep78 = 0
             ( 2)  2.rep78 - 4.rep78 = 0
             ( 3)  2.rep78 - 3.rep78 = 0
            ------------------------------------------------------------------------------
                   price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     mpg |  -226.7809   54.05666    -4.20   0.000    -334.7085   -118.8533
                         |
                   rep78 |
                      2  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                      3  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                      4  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                      5  |    1696.45   1876.498     0.90   0.369      -2050.1    5442.999
                         |
                   _cons |   9326.899   2169.696     4.30   0.000      4994.96    13658.84
            ------------------------------------------------------------------------------
            
            .
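
            If you would rather avoid the tuples dependency, a plain loop can build the same chain of constraints. A minimal sketch, assuming as above that the first level returned by levelsof is the base category:

            Code:
            * chain each non-base level to the next one
            sysuse auto, clear
            levelsof rep78, local(lv)
            gettoken base lv : lv    // drop the base (first) level
            local constraints
            local j 0
            local prev
            foreach l of local lv {
                if "`prev'" != "" {
                    local ++j
                    constraint `j' `prev'.rep78 = `l'.rep78
                    local constraints `constraints' `j'
                }
                local prev `l'
            }
            cnsreg price mpg i.rep78, constraints(`constraints')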



            • #7
              Thank you all, these are very useful tips!
