  • How to compute linear combination of parameters when variables are categorical, for each level of one categorical variable?

    Dear Statalist users,

    I need to estimate the mean value (and the standard error) of the linear combination of parameters when the predictors are categorical.

    Here is my model:

    reg y i.x1 x1#x2, ro

    where y is a continuous variable (monthly earnings of individual i), x1 is a field of professional training followed by individual i (computer, English, finance...), x2 is the firm in which individual i has followed her professional training.

    I want to estimate, for each x1 separately, its marginal effect on y (and the corresponding standard error), taking every firm-specific effect into account.

    Since "lincom" cannot be combined with "bysort", and given the categorical nature of my explanatory variables, I'm a bit lost.

    How should I proceed?

    Thanks in advance for your help!

    Martin

  • #2
    Welcome to Statalist, Martin.

    The margins command would seem to be useful in your situation.

    If you are not already familiar with the margins command, you can find a nice overview of margins prepared by Richard Williams, a frequent contributor here, at https://www3.nd.edu/~rwilliam/stats3/Margins01.pdf with a more detailed paper in the Stata Journal at http://www.stata-journal.com/article...article=st0260. I'll also note that Margins01.pdf is followed by Margins02.pdf ... Margins05.pdf covering more specialized topics.



    • #3
      William has pointed you to the surest path to mastering the -margins- command.

      Please check whether this toy example fits your needs:

      Code:
      . sysuse nlsw88.dta
      (NLSW, 1988 extract)
      
      . regress wage tenure i.race##i.married, ro
      
      Linear regression                               Number of obs     =      2,231
                                                      F(6, 2224)        =      22.27
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.0480
                                                      Root MSE          =      5.632
      
      --------------------------------------------------------------------------------
                     |               Robust
                wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      ---------------+----------------------------------------------------------------
              tenure |   .1885916   .0194354     9.70   0.000     .1504782     .226705
                     |
                race |
              black  |  -2.192357    .412885    -5.31   0.000    -3.002037   -1.382676
              other  |  -.0846836      1.689    -0.05   0.960    -3.396866    3.227499
                     |
             married |
            married  |  -1.085912   .3521421    -3.08   0.002    -1.776473   -.3953501
                     |
        race#married |
      black#married  |   1.233947   .5485089     2.25   0.025     .1583035     2.30959
      other#married  |   .9840779    2.07827     0.47   0.636    -3.091474     5.05963
                     |
               _cons |   7.772679   .3514197    22.12   0.000     7.083534    8.461823
      --------------------------------------------------------------------------------
      
      . margins, at(married=0) at(married=1)
      
      Predictive margins                              Number of obs     =      2,231
      Model VCE    : Robust
      
      Expression   : Linear prediction, predict()
      
      1._at        : married         =           0
      
      2._at        : married         =           1
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               _at |
                1  |   8.331075   .2405536    34.63   0.000     7.859342    8.802809
                2  |   7.576319    .142638    53.12   0.000     7.296601    7.856036
      ------------------------------------------------------------------------------
      Hopefully that helps.
      Best regards,

      Marcos



      • #4
        Hi-

        Many thanks for your help, the "margins" command indeed seems relevant for my problem.

        Just one additional question:

        Since my x1 variable of interest (training field) takes 250 values, is it possible to ask "margins" to compute (and report) the marginal effects automatically for every single value, rather than typing "at(x1=0) at(x1=1) ... at(x1=250)" by hand?

        Again, many thanks for your prompt and kind answer,

        Martin



        • #5
          margins, at(x1=(1(1)250))



          • #6
            Even easier:

            Code:
            margins x1
            That will give you the adjusted predicted values of y at each value of x1.
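
            As a minimal sketch of what this looks like in practice, here it is on the nlsw88 toy data from #3. Treating race as a stand-in for x1 and married as a stand-in for x2 is only an assumption for illustration, not Martin's actual data:

            Code:
            * Toy data from #3; race stands in for x1, married for x2 (assumed mapping)
            sysuse nlsw88, clear
            regress wage tenure i.race##i.married, vce(robust)
            * One adjusted prediction, with its delta-method standard error, per level of race
            margins race
            With 250 levels the output table will be long, but no at() list has to be typed.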

            By the way, there appears to be a misspecification of your model. Although -reg y i.x1 x1#x2- is syntactically legal, it is not a legitimate model. When you include an interaction term, you must include both of its constituents. (You can omit one of them if there is another variable in the model with which it is collinear. In that case you are omitting it in the code, but the information about it is still there, so the model is well specified.) One way to do this with minimal typing, and also to ensure that you never leave one out by accident, is to use the ## operator.

            Code:
            regress y i.x1##i.x2
            The use of ## prevents you from making errors of omission. It also saves keystrokes. And if there is a variable in the model collinear with x1 or x2 that warrants its omission, Stata will do that for you automatically. In fact, as a bonus, if that is supposed to be the case but there is a problem with the data so it is not, the failure of Stata to omit it will be an early warning sign of trouble.
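
            To see what ## expands to, the following sketch fits the model both ways and puts the estimates side by side. It again assumes the nlsw88 toy data from #3, with race and married standing in for x1 and x2; the stored-estimate names "short" and "long" are arbitrary:

            Code:
            sysuse nlsw88, clear
            * Shorthand: both main effects plus their interaction
            regress wage i.race##i.married
            estimates store short
            * The same model with every term written out
            regress wage i.race i.married i.race#i.married
            estimates store long
            * The two columns should be identical
            estimates table short long, b(%9.4f) se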

            Also it's not clear to me exactly what your goal is. You refer to wanting the marginal effect of x1 for each value of x1, but that does not make any sense. The marginal effect of x1 varies according to the value of x2, but not by value of x1. What varies with the value of x1 is the marginal effect of x2. So you can do either (or both) of these:

            Code:
            margins x1, dydx(x2)
            margins x2, dydx(x1)
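
            As a hedged illustration on the nlsw88 toy data from #3 (race standing in for x1 and married for x2, which is an assumption and not Martin's data), the first command reports the effect of being married separately for each race. Because the original question was about -lincom-, the sketch also writes out one of those effects as an explicit linear combination of coefficients; in a linear model the two approaches should agree:

            Code:
            sysuse nlsw88, clear
            regress wage tenure i.race##i.married, vce(robust)
            * Effect of married (1 vs. 0) at each level of race
            margins race, dydx(married)
            * Effect of each race level (vs. the base level) at each level of married
            margins married, dydx(race)
            * The same quantity by hand for race == 2 (black):
            * main effect of married plus the black#married interaction
            lincom 1.married + 2.race#1.married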
