
  • Using margins with a factor with no base level

    I am using fractional regression to understand the effect of belonging to different categories on the dependent variable. I have three independent variables: a categorical variable c1 and two continuous variables x1 and x2. The dependent variable, y, naturally lies in the interval [0,1]. (Sample data at the end of the post.)

    I am interested in the difference between each level of the factor and the sample average. To do so I use

    Code:
    fracreg logit y x1 x2 ibn.c1, noconst
    so that the categorical variable has no base level.

    However, when I try to compute the average partial effects using margins, the command shows the partial effects as differences from the first level.

    Code:
    . margins, dydx(c1)
    
    Average marginal effects                                   Number of obs = 100
    Model VCE: Robust
    
    Expression: Conditional mean of y, predict()
    dy/dx wrt:  2.c1 3.c1 4.c1
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              c1 |
              2  |   .1283419   .0954621     1.34   0.179    -.0587604    .3154443
              3  |    .284335   .1916978     1.48   0.138    -.0913857    .6600558
              4  |   .3168349    .250151     1.27   0.205     -.173452    .8071219
    ------------------------------------------------------------------------------
    Note: dy/dx for factor levels is the discrete change from the base level.
    Is there any way in which I can force margins to compute the partial effects as differences from the sample average?

    I have also tried to use fvset

    Code:
    fvset base none c1
    
    fracreg logit y x1 x2 i.c1, noconst
    
    margins, dydx(c1)
    But margins in this case returns "option k() invalid"

    The last alternative that comes to my mind is to code the categorical variable as actual dummies in the dataset and then use replace and predict to compute the partial effects, along the lines of pages 325 and 326 of the Stata Journal paper describing margins. However, in that case I don't know how to compute the p-values using robust standard errors.
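    One route that might give this quantity directly (a sketch on my part, not run on this model) is Stata's grand-mean contrast operator g., which asks margins to report each level's predictive margin as a deviation from the average over the levels, with delta-method standard errors:

    Code:
    fracreg logit y x1 x2 i.c1
    margins g.c1
    If the contrast operators behave as documented, this yields the "difference from the average" comparison without recoding any dummies.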


    Sample data
    Code:
    clear
    input float(y x1 x2) byte c1
     .11370341  3.3505356   -2.410667 3
      .6222994   1.949865    .6029335 2
      .6092747    .441348  -3.0782905 1
      .6233795   1.405452   1.2707415 2
      .8609154   3.044252   1.4059036 3
      .6403106  1.5586594   -3.811766 2
    .009495757   1.826056    1.877843 2
      .2325505    .429377   -.4489842 1
      .6660838     3.8507  -1.3476336 4
      .5142511   3.952387    .8915749 4
      .6935913   1.615322    2.561234 2
     .54497486   3.470164     3.13026 3
      .2827336   .2534688   -2.401092 1
      .9234335    2.73624   -.8753899 3
     .29231584   2.754327   .29276296 3
      .8372957   1.255912   .13203815 2
      .2862233   2.554077    -.890072 3
      .2668208   2.564945  -4.6842246 3
      .1867228   1.971874   -.5567647 2
      .2322259   3.424002  -2.3636072 3
     .31661245   .5814337   .06625055 1
      .3026934  1.6852854   -1.675434 2
       .159046   .8030344  -.23522216 1
     .03999592   3.915496   -1.483045 4
     .21879955  1.9798528    -.816694 2
      .8105986  1.4978696   -.7555415 2
      .5256975  3.4015625   2.6567335 3
      .9146582  2.7776034   1.4328727 3
       .831345  1.4310325   -.4645784 2
     .04577027   1.494418   -2.452454 2
      .4560915   .7670175    2.095076 1
     .26518667   3.129965  -2.8728144 3
      .3046722 .013693668  -2.3649378 1
      .5073069  3.7240875  -1.5524096 4
      .1810962  2.7577405   -1.944627 3
      .7596706  1.0110087   -1.807315 2
     .20124803   2.661737    2.574408 3
      .2588098  1.9072373   -.2123752 2
      .9921504  3.6190605   -1.431742 4
      .8073524  1.2220386   1.3434443 2
      .5533336   3.715097   .39526725 4
      .6464061  1.0443107  -2.4401934 2
      .3118243  1.7724918  -1.1219302 2
      .6218192   2.830579   1.0674653 3
      .3297702   3.611778    3.900631 4
      .5019975  2.0524924   3.1638865 2
      .6770945  2.0020962    1.675575 2
     .48499125  .19916683 -.026139325 1
      .2439288   .9975606   -3.451592 2
      .7654598   3.153924   -1.461707 3
     .07377988    .792657   -.4308277 1
      .3096866  3.5243714    .9804133 4
      .7172717   1.457271   .11878685 2
      .5045459  1.0797999    .1166241 2
     .15299895   2.359898    3.676726 2
      .5039335   .6618024    .7753018 1
      .4939609   1.956279  -.56110144 2
      .7512002   3.853535   -.8570355 4
      .1746498  2.3184118   .19064987 2
      .8483924  3.1631916   -1.626663 3
      .8648338  2.3153849     2.73638 2
     .04185728  2.1674986  -.57507205 2
     .31718215   3.050287  -1.0571048 3
     .01374994   .5690214  -1.7460258 1
     .23902573   1.569427    1.871786 2
      .7064946   3.214833   -2.554535 3
      .3080948  3.7960474   -.9374797 4
     .50854754  .15380874    .6250325 1
     .05164662  2.6767836   -.2166446 3
     .56456983   .6523331   .50158083 1
      .1214802   .4653192   -2.632387 1
      .8928364   .5296923   -.8601509 1
    .014627255   1.564098   -3.408219 2
      .7831211   .8629734  -1.6842747 1
     .08996134   2.788128   2.2965012 3
     .51918995   .6285074    .5056631 1
      .3842667   1.362064   .17057437 2
      .0700525     1.3325  -1.8672522 2
      .3206444   2.539804  -1.1136392 3
      .6684954   1.622833  -2.0327406 2
      .9264005   2.671998    .1070505 3
      .4719097   3.980308   1.5302672 4
     .14261535  3.6151764    .5983896 4
     .54426974     1.3612  -1.2352315 2
     .19617465  1.0291947   1.1055746 2
      .8985805   1.288754     .775817 2
      .3894998  1.9730633    1.686422 2
      .3108708   .0756949   -.8812551 1
     .16002867  1.1404877  -2.3587308 2
      .8961859   2.634834  -.16913813 3
      .1663938  .51937264    .6055554 1
      .9004246   .4633999   3.1181874 1
     .13407819  .12892419  -.13932505 1
     .13161413  1.1543957    .8540788 2
      .1052875  2.3100424   -.1665264 2
     .51158357    .154222   4.3127956 1
      .3001991   2.905073    2.675936 3
    .026716895    3.58549   -.4646089 4
      .3096474    2.95456   -.2548082 3
      .7421197  1.6208978  -.04337081 2
    end


  • #2
    You may try:

    Code:
    margins c1
    Best regards,

    Marcos



    • #3
      Originally posted by Marcos Almeida View Post
      You may try:

      Code:
      margins c1
      This would give me the mean of the dependent variable for each category. How can I compute the partial effect and its standard error?



      • #4
        You can't do that.
        You need something to compare the data to in order to estimate a marginal effect.
        For example, if you used a linear regression, how would you manually estimate marginal effects for dummies without a base?



        • #5
          Originally posted by FernandoRios View Post
          You can't do that.
          You need something to compare the data to in order to estimate a marginal effect.
          For example, if you used a linear regression, how would you manually estimate marginal effects for dummies without a base?
          The base level is when all dummies are set to zero. Please refer to the Stata Journal paper describing margins, pages 325-326 (18 and 19 of the pdf file).

          Partial effects in the case of categorical variables are the differences between the value of the response function when the dummy equals 1 and its value when the dummy equals 0.

          Here is the code section describing this issue in the paper

          Code:
          . * Replicate AME for black without using margins
          . clonevar xblack = black
          . quietly logit diabetes i.xblack i.female age, nolog
          . replace xblack = 0
          (1086 real changes made)
          . predict adjpredwhite
          (option pr assumed; Pr(diabetes))
          . replace xblack = 1
          (10335 real changes made)
          . predict adjpredblack
          (option pr assumed; Pr(diabetes))
          . generate meblack = adjpredblack - adjpredwhite
          . summarize adjpredwhite adjpredblack meblack
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
          adjpredwhite |      10335    .0443248    .0362422     .005399   .1358214
          adjpredblack |      10335     .084417    .0663927    .0110063   .2436938
               meblack |      10335    .0400922    .0301892    .0056073   .1078724
          One could easily apply this line of reasoning with n dummies corresponding to the n levels of the factor. However, as stated in the post, the problem with this solution is that I would not know how to compute robust standard errors, p-values etc.
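          As a side note, the delta-method standard error for exactly this replicated quantity is what margins itself reports after the same fit, so (if I read the paper correctly) the manual replace/predict replication and margins should agree on the point estimate while margins supplies the inference:

          Code:
          quietly logit diabetes i.xblack i.female age, vce(robust)
          margins, dydx(xblack)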



          • #6
            The partial effect of c1 means how the response would change if c1 changes from one level to another level. So you may predict responses at every level of c1, and compare any pair of levels for your need.

            Code:
            margins c1, pwcompare



            • #7
              Originally posted by Fei Wang View Post
              The partial effect of c1 means how the response would change if c1 changes from one level to another level. So you may predict responses at every level of c1, and compare any pair of levels for your need.

              Code:
              margins c1, pwcompare
               This is nice, but it is not exactly what I need. I am interested in the partial effect relative to the sample average, not in comparisons between the different categories.



              • #8
                 When you include the full set of dummies and exclude the constant, one of the dummy coefficients plays the role of the excluded constant; you know this from the dummy variable trap. As Fernando points out, the marginal effect for factor levels is the discrete change from the base level, so that is the standard interpretation. Nonetheless, to get what you want, you can create a dummy for the base category and include it in the regression, then use the -xi- prefix to create the other dummies, having specified that the base be excluded. The following illustrates this using linear regression, where the coefficients are themselves marginal effects, but it should work for nonlinear models as well.

                Code:
                sysuse auto, clear
                regress mpg weight turn ibn.rep78, nocons
                margins, dydx(*)
                *WANTED
                gen rep1=1.rep78
                char rep78[omit] 1
                xi: regress mpg weight turn rep1 i.rep78, nocons
                margins, dydx(*)
                Res.:

                Code:
                . regress mpg weight turn ibn.rep78, nocons
                
                      Source |       SS           df       MS      Number of obs   =        69
                -------------+----------------------------------   F(7, 62)        =    383.92
                       Model |   32856.975         7  4693.85357   Prob > F        =    0.0000
                    Residual |  758.024975        62  12.2262093   R-squared       =    0.9774
                -------------+----------------------------------   Adj R-squared   =    0.9749
                       Total |       33615        69  487.173913   Root MSE        =    3.4966
                
                ------------------------------------------------------------------------------
                         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                             |
                       rep78 |
                          1  |   43.12012   6.326649     6.82   0.000     30.47332    55.76691
                          2  |   42.87318   6.235771     6.88   0.000     30.40805    55.33831
                          3  |   42.49602     5.7424     7.40   0.000     31.01713    53.97492
                          4  |   42.24647   5.557165     7.60   0.000     31.13785    53.35508
                          5  |   44.85245   5.421293     8.27   0.000     34.01544    55.68946
                ------------------------------------------------------------------------------
                
                .
                . margins, dydx(*)
                
                Average marginal effects                        Number of obs     =         69
                Model VCE    : OLS
                
                Expression   : Linear prediction, predict()
                dy/dx w.r.t. : weight turn 2.rep78 3.rep78 4.rep78 5.rep78
                
                ------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                             |
                       rep78 |
                          2  |  -.2469336   2.780014    -0.09   0.930    -5.804101    5.310234
                          3  |  -.6240918   2.561763    -0.24   0.808    -5.744984      4.4968
                          4  |  -.8736495   2.626997    -0.33   0.741    -6.124941    4.377642
                          5  |   1.732332   2.755398     0.63   0.532     -3.77563    7.240293
                ------------------------------------------------------------------------------
                Note: dy/dx for factor levels is the discrete change from the base level.
                
                .
                . *WANTED
                
                .
                . gen rep1=1.rep78
                (5 missing values generated)
                
                .
                . char rep78[omit] 1
                
                .
                . xi: regress mpg weight turn rep1 i.rep78, nocons
                i.rep78           _Irep78_1-5         (naturally coded; _Irep78_1 omitted)
                
                      Source |       SS           df       MS      Number of obs   =        69
                -------------+----------------------------------   F(7, 62)        =    383.92
                       Model |   32856.975         7  4693.85357   Prob > F        =    0.0000
                    Residual |  758.024975        62  12.2262093   R-squared       =    0.9774
                -------------+----------------------------------   Adj R-squared   =    0.9749
                       Total |       33615        69  487.173913   Root MSE        =    3.4966
                
                ------------------------------------------------------------------------------
                         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                        rep1 |   43.12012   6.326649     6.82   0.000     30.47332    55.76691
                   _Irep78_2 |   42.87318   6.235771     6.88   0.000     30.40805    55.33831
                   _Irep78_3 |   42.49602     5.7424     7.40   0.000     31.01713    53.97492
                   _Irep78_4 |   42.24647   5.557165     7.60   0.000     31.13785    53.35508
                   _Irep78_5 |   44.85245   5.421293     8.27   0.000     34.01544    55.68946
                ------------------------------------------------------------------------------
                
                .
                . margins, dydx(*)
                
                Average marginal effects                        Number of obs     =         69
                Model VCE    : OLS
                
                Expression   : Linear prediction, predict()
                dy/dx w.r.t. : weight turn rep1 _Irep78_2 _Irep78_3 _Irep78_4 _Irep78_5
                
                ------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      weight |  -.0046742   .0010856    -4.31   0.000    -.0068442   -.0025042
                        turn |   -.186099   .2028659    -0.92   0.363    -.5916222    .2194242
                        rep1 |   43.12012   6.326649     6.82   0.000     30.47332    55.76691
                   _Irep78_2 |   42.87318   6.235771     6.88   0.000     30.40805    55.33831
                   _Irep78_3 |   42.49602     5.7424     7.40   0.000     31.01713    53.97492
                   _Irep78_4 |   42.24647   5.557165     7.60   0.000     31.13785    53.35508
                   _Irep78_5 |   44.85245   5.421293     8.27   0.000     34.01544    55.68946
                ------------------------------------------------------------------------------
                
                .



                • #9
                  Thank you for your suggestion; using dummies goes in the right direction. However, I am not sure that margins computes the partial effects of the factor correctly in this way: the average effects when one dummy is set to 0 or 1 are computed using the actual values of the other dummies rather than all zeros. This can be solved by using the at() option of the margins command to manually set the rest of the dummies to 0.
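                  A sketch of that at() fix, reusing the variable names from the auto example in #8 (not run, so treat it as illustrative):

                  Code:
                  xi: regress mpg weight turn rep1 i.rep78, nocons
                  margins, dydx(rep1) at(_Irep78_2=0 _Irep78_3=0 _Irep78_4=0 _Irep78_5=0)
                  This holds the other dummies at zero while rep1 changes, which matches the "all other dummies set to zero" baseline described above.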

                  Thank you, everybody, for all the answers!

