
  • Using margins with a factor with no base level

    I am using fractional regression to understand the effect of belonging to different categories on the dependent variable. I have three independent variables: a categorical variable c1 and two continuous variables x1 and x2. The dependent variable, y, naturally lies in the interval [0,1]. (Sample data at the end of the post.)

    I am interested in the difference between each level of the factor and the sample average. To do so I use

    Code:
    fracreg logit y x1 x2 ibn.c1, noconst
    so that the categorical variable has no base level.

    However, when I try to compute the average partial effects using margins, the command shows the partial effects as differences from the first level.

    Code:
    . margins, dydx(c1)
    
    Average marginal effects                                   Number of obs = 100
    Model VCE: Robust
    
    Expression: Conditional mean of y, predict()
    dy/dx wrt:  2.c1 3.c1 4.c1
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              c1 |
              2  |   .1283419   .0954621     1.34   0.179    -.0587604    .3154443
              3  |    .284335   .1916978     1.48   0.138    -.0913857    .6600558
              4  |   .3168349    .250151     1.27   0.205     -.173452    .8071219
    ------------------------------------------------------------------------------
    Note: dy/dx for factor levels is the discrete change from the base level.
    Is there any way in which I can force margins to compute the partial effects as differences from the sample average?

    I have also tried to use fvset

    Code:
    fvset base none c1
    
    fracreg logit y x1 x2 i.c1, noconst
    
    margins, dydx(c1)
    But margins in this case returns "option k() invalid"

    The last alternative that comes to my mind is to code the categorical variable as actual dummies in the dataset and then use replace and predict to compute the partial effects, along the lines of pages 325 and 326 of the Stata Journal paper describing margins. However, in that case I don't know how to compute the p-values using robust standard errors.
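    One route that might give this quantity directly (a sketch on my part, not run on this model) is Stata's grand-mean contrast operator g., which asks margins to report each level's predictive margin as a deviation from the average over the levels, with delta-method standard errors:

    Code:
    fracreg logit y x1 x2 i.c1
    margins g.c1
    If the contrast operators behave as documented, this yields the "difference from the average" comparison without recoding any dummies.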


    Sample data
    Code:
    clear
    input float(y x1 x2) byte c1
     .11370341  3.3505356   -2.410667 3
      .6222994   1.949865    .6029335 2
      .6092747    .441348  -3.0782905 1
      .6233795   1.405452   1.2707415 2
      .8609154   3.044252   1.4059036 3
      .6403106  1.5586594   -3.811766 2
    .009495757   1.826056    1.877843 2
      .2325505    .429377   -.4489842 1
      .6660838     3.8507  -1.3476336 4
      .5142511   3.952387    .8915749 4
      .6935913   1.615322    2.561234 2
     .54497486   3.470164     3.13026 3
      .2827336   .2534688   -2.401092 1
      .9234335    2.73624   -.8753899 3
     .29231584   2.754327   .29276296 3
      .8372957   1.255912   .13203815 2
      .2862233   2.554077    -.890072 3
      .2668208   2.564945  -4.6842246 3
      .1867228   1.971874   -.5567647 2
      .2322259   3.424002  -2.3636072 3
     .31661245   .5814337   .06625055 1
      .3026934  1.6852854   -1.675434 2
       .159046   .8030344  -.23522216 1
     .03999592   3.915496   -1.483045 4
     .21879955  1.9798528    -.816694 2
      .8105986  1.4978696   -.7555415 2
      .5256975  3.4015625   2.6567335 3
      .9146582  2.7776034   1.4328727 3
       .831345  1.4310325   -.4645784 2
     .04577027   1.494418   -2.452454 2
      .4560915   .7670175    2.095076 1
     .26518667   3.129965  -2.8728144 3
      .3046722 .013693668  -2.3649378 1
      .5073069  3.7240875  -1.5524096 4
      .1810962  2.7577405   -1.944627 3
      .7596706  1.0110087   -1.807315 2
     .20124803   2.661737    2.574408 3
      .2588098  1.9072373   -.2123752 2
      .9921504  3.6190605   -1.431742 4
      .8073524  1.2220386   1.3434443 2
      .5533336   3.715097   .39526725 4
      .6464061  1.0443107  -2.4401934 2
      .3118243  1.7724918  -1.1219302 2
      .6218192   2.830579   1.0674653 3
      .3297702   3.611778    3.900631 4
      .5019975  2.0524924   3.1638865 2
      .6770945  2.0020962    1.675575 2
     .48499125  .19916683 -.026139325 1
      .2439288   .9975606   -3.451592 2
      .7654598   3.153924   -1.461707 3
     .07377988    .792657   -.4308277 1
      .3096866  3.5243714    .9804133 4
      .7172717   1.457271   .11878685 2
      .5045459  1.0797999    .1166241 2
     .15299895   2.359898    3.676726 2
      .5039335   .6618024    .7753018 1
      .4939609   1.956279  -.56110144 2
      .7512002   3.853535   -.8570355 4
      .1746498  2.3184118   .19064987 2
      .8483924  3.1631916   -1.626663 3
      .8648338  2.3153849     2.73638 2
     .04185728  2.1674986  -.57507205 2
     .31718215   3.050287  -1.0571048 3
     .01374994   .5690214  -1.7460258 1
     .23902573   1.569427    1.871786 2
      .7064946   3.214833   -2.554535 3
      .3080948  3.7960474   -.9374797 4
     .50854754  .15380874    .6250325 1
     .05164662  2.6767836   -.2166446 3
     .56456983   .6523331   .50158083 1
      .1214802   .4653192   -2.632387 1
      .8928364   .5296923   -.8601509 1
    .014627255   1.564098   -3.408219 2
      .7831211   .8629734  -1.6842747 1
     .08996134   2.788128   2.2965012 3
     .51918995   .6285074    .5056631 1
      .3842667   1.362064   .17057437 2
      .0700525     1.3325  -1.8672522 2
      .3206444   2.539804  -1.1136392 3
      .6684954   1.622833  -2.0327406 2
      .9264005   2.671998    .1070505 3
      .4719097   3.980308   1.5302672 4
     .14261535  3.6151764    .5983896 4
     .54426974     1.3612  -1.2352315 2
     .19617465  1.0291947   1.1055746 2
      .8985805   1.288754     .775817 2
      .3894998  1.9730633    1.686422 2
      .3108708   .0756949   -.8812551 1
     .16002867  1.1404877  -2.3587308 2
      .8961859   2.634834  -.16913813 3
      .1663938  .51937264    .6055554 1
      .9004246   .4633999   3.1181874 1
     .13407819  .12892419  -.13932505 1
     .13161413  1.1543957    .8540788 2
      .1052875  2.3100424   -.1665264 2
     .51158357    .154222   4.3127956 1
      .3001991   2.905073    2.675936 3
    .026716895    3.58549   -.4646089 4
      .3096474    2.95456   -.2548082 3
      .7421197  1.6208978  -.04337081 2
    end


  • #2
    You may try:

    Code:
    margins c1
    Best regards,

    Marcos



    • #3
      Originally posted by Marcos Almeida View Post
      You may try:

      Code:
      margins c1
      This would give me the mean of the dependent variable for each category. How can I compute the partial effect and its standard error?



      • #4
        You can't do that.
        You need something to compare the data to in order to estimate a marginal effect.
        For example, if you used a linear regression, how would you manually estimate marginal effects for dummies without a base?



        • #5
          Originally posted by FernandoRios View Post
          You can't do that.
          You need something to compare the data to in order to estimate a marginal effect.
          For example, if you used a linear regression, how would you manually estimate marginal effects for dummies without a base?
          The base level is when all dummies are set to zero. Please refer to the Stata Journal paper describing margins, pages 325-326 (18 and 19 of the pdf file).

          Partial effects in the case of categorical variables are the differences between the value of the response function when the dummy equals 1 and its value when the dummy equals 0.

          Here is the code section describing this issue in the paper

          Code:
          . * Replicate AME for black without using margins
          . clonevar xblack = black
          . quietly logit diabetes i.xblack i.female age, nolog
          . replace xblack = 0
          (1086 real changes made)
          . predict adjpredwhite
          (option pr assumed; Pr(diabetes))
          . replace xblack = 1
          (10335 real changes made)
          . predict adjpredblack
          (option pr assumed; Pr(diabetes))
          . generate meblack = adjpredblack - adjpredwhite
          . summarize adjpredwhite adjpredblack meblack
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
          adjpredwhite |      10335    .0443248    .0362422     .005399   .1358214
          adjpredblack |      10335     .084417    .0663927    .0110063   .2436938
               meblack |      10335    .0400922    .0301892    .0056073   .1078724
          One could easily apply this line of reasoning with n dummies corresponding to the n levels of the factor. However, as stated in the post, the problem with this solution is that I would not know how to compute robust standard errors, p-values etc.
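          As a side note, the delta-method standard error for exactly this replicated quantity is what margins itself reports after the same fit, so (if I read the paper correctly) the manual replace/predict replication and margins should agree on the point estimate while margins supplies the inference:

          Code:
          quietly logit diabetes i.xblack i.female age, vce(robust)
          margins, dydx(xblack)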



          • #6
            The partial effect of c1 means how the response would change if c1 changes from one level to another level. So you may predict responses at every level of c1, and compare any pair of levels for your need.

            Code:
            margins c1, pwcompare



            • #7
              Originally posted by Fei Wang View Post
              The partial effect of c1 means how the response would change if c1 changes from one level to another level. So you may predict responses at every level of c1, and compare any pair of levels for your need.

              Code:
              margins c1, pwcompare
               This is nice, but it is not exactly what I need. I am interested in the partial effect relative to the sample average, not in comparisons between the different categories.



              • #8
                 When you include the full set of dummies and exclude the constant, one of the dummy coefficients plays the role of the excluded constant; you know this from the dummy variable trap. As Fernando points out, the marginal effect for factor levels is the discrete change from the base level, so that is the standard interpretation. Nonetheless, to get what you want, you can create a dummy for the base category and include it in the regression, then use the -xi- prefix to create the other dummies, having specified that the base be excluded. The following illustrates this using linear regression, where the coefficients are themselves marginal effects, but it should work for nonlinear models as well.

                Code:
                sysuse auto, clear
                regress mpg weight turn ibn.rep78, nocons
                margins, dydx(*)
                *WANTED
                gen rep1=1.rep78
                char rep78[omit] 1
                xi: regress mpg weight turn rep1 i.rep78, nocons
                margins, dydx(*)
                Res.:

                Code:
                . regress mpg weight turn ibn.rep78, nocons
                
                      Source |       SS           df       MS      Number of obs   =        69
                -------------+----------------------------------   F(7, 62)        =    383.92
                       Model |   32856.975         7  4693.85357   Prob > F        =    0.0000
                    Residual |  758.024975        62  12.2262093   R-squared       =    0.9774
                -------------+----------------------------------   Adj R-squared   =    0.9749
                       Total |       33615        69  487.173913   Root MSE        =    3.4966
                
                ------------------------------------------------------------------------------
                         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                             |
                       rep78 |
                          1  |   43.12012   6.326649     6.82   0.000     30.47332    55.76691
                          2  |   42.87318   6.235771     6.88   0.000     30.40805    55.33831
                          3  |   42.49602     5.7424     7.40   0.000     31.01713    53.97492
                          4  |   42.24647   5.557165     7.60   0.000     31.13785    53.35508
                          5  |   44.85245   5.421293     8.27   0.000     34.01544    55.68946
                ------------------------------------------------------------------------------
                
                .
                . margins, dydx(*)
                
                Average marginal effects                        Number of obs     =         69
                Model VCE    : OLS
                
                Expression   : Linear prediction, predict()
                dy/dx w.r.t. : weight turn 2.rep78 3.rep78 4.rep78 5.rep78
                
                ------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                             |
                       rep78 |
                          2  |  -.2469336   2.780014    -0.09   0.930    -5.804101    5.310234
                          3  |  -.6240918   2.561763    -0.24   0.808    -5.744984      4.4968
                          4  |  -.8736495   2.626997    -0.33   0.741    -6.124941    4.377642
                          5  |   1.732332   2.755398     0.63   0.532     -3.77563    7.240293
                ------------------------------------------------------------------------------
                Note: dy/dx for factor levels is the discrete change from the base level.
                
                .
                . *WANTED
                
                .
                . gen rep1=1.rep78
                (5 missing values generated)
                
                .
                . char rep78[omit] 1
                
                .
                . xi: regress mpg weight turn rep1 i.rep78, nocons
                i.rep78           _Irep78_1-5         (naturally coded; _Irep78_1 omitted)
                
                      Source |       SS           df       MS      Number of obs   =        69
                -------------+----------------------------------   F(7, 62)        =    383.92
                       Model |   32856.975         7  4693.85357   Prob > F        =    0.0000
                    Residual |  758.024975        62  12.2262093   R-squared       =    0.9774
                -------------+----------------------------------   Adj R-squared   =    0.9749
                       Total |       33615        69  487.173913   Root MSE        =    3.4966
                
                ------------------------------------------------------------------------------
                         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                        rep1 |   43.12012   6.326649     6.82   0.000     30.47332    55.76691
                   _Irep78_2 |   42.87318   6.235771     6.88   0.000     30.40805    55.33831
                   _Irep78_3 |   42.49602     5.7424     7.40   0.000     31.01713    53.97492
                   _Irep78_4 |   42.24647   5.557165     7.60   0.000     31.13785    53.35508
                   _Irep78_5 |   44.85245   5.421293     8.27   0.000     34.01544    55.68946
                ------------------------------------------------------------------------------
                
                .
                . margins, dydx(*)
                
                Average marginal effects                        Number of obs     =         69
                Model VCE    : OLS
                
                Expression   : Linear prediction, predict()
                dy/dx w.r.t. : weight turn rep1 _Irep78_2 _Irep78_3 _Irep78_4 _Irep78_5
                
                ------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                        rep1 |   43.12012   6.326649     6.82   0.000     30.47332    55.76691
                   _Irep78_2 |   42.87318   6.235771     6.88   0.000     30.40805    55.33831
                   _Irep78_3 |   42.49602     5.7424     7.40   0.000     31.01713    53.97492
                   _Irep78_4 |   42.24647   5.557165     7.60   0.000     31.13785    53.35508
                   _Irep78_5 |   44.85245   5.421293     8.27   0.000     34.01544    55.68946
                ------------------------------------------------------------------------------
                
                .



                • #9
                  Thank you for your suggestion; using dummies goes in the right direction. However, I am not sure that margins computes the partial effects of the factor correctly in this way: the average effects when one dummy is set to 0 or 1 are computed using the actual values of the other dummies rather than all zeros. This can be solved by using the at() option of the margins command to manually set the rest of the dummies to 0.
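                  A sketch of that at() fix, reusing the variable names from the auto example in #8 (not run, so treat it as illustrative):

                  Code:
                  xi: regress mpg weight turn rep1 i.rep78, nocons
                  margins, dydx(rep1) at(_Irep78_2=0 _Irep78_3=0 _Irep78_4=0 _Irep78_5=0)
                  This holds the other dummies at zero while rep1 changes, which matches the "all other dummies set to zero" baseline described above.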

                  Thank you, everybody, for all the answers!

