  • margins with xi: prefix

    Hello everyone,

    I'm using -metareg- from SSC, which doesn't accept factor-variable notation directly.

    When I type - xi: metareg y i.x - for example, I get the output with "_I" notation. So far so good.

    But I wonder whether there is a way to get the margins.

    I searched on the Stata Forum and found this old thread, still lacking an answer.

    Thank you in advance.
    Best regards,

    Marcos

  • #2
    Hi Marcos,
    Have you considered making a small modification to the program so that it allows factor-variable notation?
    If you open the ado-file metareg.ado,
    you could change
    Code:
     syntax varlist(min=1 numeric) [if] [in] , [ /*
    with

    Code:
     syntax varlist(fv min=1 numeric) [if] [in] , [ /*
    I haven't tested this, but chances are that simple change will enable the use of factor notation.
    HTH
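
Editing a copy rather than the installed file keeps the original intact. A minimal sketch of how that might be set up, assuming metareg is already installed from SSC. The copy name mymetareg.ado is hypothetical, and the program defined inside the copy would need to be renamed to match. Like the suggestion in #2, this has not been tested against metareg itself.

```stata
* locate the installed ado-file; findfile leaves its full path in r(fn)
findfile metareg.ado
local src "`r(fn)'"

* work on a copy (hypothetical name) in the current directory
copy "`src'" "mymetareg.ado", replace

* open the copy in the Do-file Editor, then change
*   varlist(min=1 numeric)   to   varlist(fv min=1 numeric)
* and rename the program defined inside to mymetareg
doedit "mymetareg.ado"
```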



    • #3
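      This is simply due to how the coefficient is named. margins expects factor-variable coefficients in e(b) to be named "#.var" (for example, 1.foreign), whereas xi: creates names such as _Iforeign_1. The workaround is to install Ben Jann's erepost command from SSC (ssc install erepost) and rename the factor variable's column in e(b).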

      Code:
      sysuse auto, clear
      regress mpg weight i.foreign
      margins i.foreign
      xi: regress mpg weight i.foreign
      mat l e(b)
      *RENAME FACTOR VARIABLE
      local colnms: coln e(b)
      local colnms = subinstr("`colnms'","_Iforeign_1", "1.foreign", .)
      mat b= e(b)
      mat colnames b= `colnms'
      *PROGRAM TO RENAME e(b)
      capt prog drop repb
      program repb, eclass
      erepost b= b, rename
      end
      *RUN PROGRAM
      repb
      margins i.foreign
      Results:

      Code:
      *WITH FVARS
      . margins i.foreign
      
      Predictive margins                              Number of obs     =         74
      Model VCE    : OLS
      
      Expression   : Linear prediction, predict()
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           foreign |
         Domestic  |   21.78785   .5091123    42.80   0.000     20.77271    22.80299
          Foreign  |   20.13782   .8535566    23.59   0.000     18.43587    21.83976
      ------------------------------------------------------------------------------
      
      *WORKAROUND
      
      . margins i.foreign
      Warning: cannot perform check for estimable functions.
      
      Predictive margins                              Number of obs     =         74
      Model VCE    : OLS
      
      Expression   : Linear prediction, predict()
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           foreign |
         Domestic  |   21.78785   .5091123    42.80   0.000     20.77271    22.80299
          Foreign  |   20.13782   .8535566    23.59   0.000     18.43587    21.83976
      ------------------------------------------------------------------------------
      Last edited by Andrew Musau; 25 Mar 2020, 14:55.



      • #4
        FernandoRios Thank you very much for the suggestion. A nice strategy. I will try it out!

        Andrew Musau Thank you very much for the suggestion. Since I have a 5-level categorical variable, I just needed to adapt your code. It worked perfectly!
        Best regards,

        Marcos



        • #5
          A word of caution.

          The suggestion in #2 will cause syntax to accept factor variable notation. That, however, by no means guarantees that the code will handle those factor variables correctly. There is a huge difference between allowing, e.g., i.foreign in a variable list that syntax parses and correctly handling the resulting `varlist', which merely expands to i.foreign, in the estimation. If you are lucky, you will get an error message from the code following the syntax command; if you are not, you may believe that everything went fine when it actually did not.

          Similar arguments apply to the suggestion in #3. While you will indeed get the simple margins, you might not get the correct results for, e.g., the dydx() option. This is because margins actually expects not only 1.foreign but also 0b.foreign, i.e., the base level, in the matrix e(b). Things get more complicated with categorical variables that have more than two levels, interactions, etc. I have discussed a similar approach at length here.
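
The naming difference can be seen by listing the column names of e(b) after each fit. A small sketch using the auto data from #3; the macro names fvnames and xinames are only illustrative.

```stata
sysuse auto, clear

* factor-variable notation: e(b) carries a column for the base level too
regress mpg weight i.foreign
local fvnames : colnames e(b)
display "`fvnames'"

* xi: prefix: only the generated dummy appears, with no base-level column
xi: regress mpg weight i.foreign
local xinames : colnames e(b)
display "`xinames'"
```

The first list should include 0b.foreign alongside 1.foreign, while the second should contain only _Iforeign_1 for the factor. Renaming that one column therefore still leaves e(b) without the base-level entry that margins looks for.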

          The bottom line is this: When you mess with the internals of Stata like that, it is often easy to work around error messages; it is therefore also very easy to produce incorrect results without realizing it.

          Best
          Daniel
          Last edited by daniel klein; 26 Mar 2020, 00:45.



          • #6
            Similar arguments apply to the suggestion in #3. While you will indeed get the simple margins, you might not get the correct results for, e.g., the dydx() option. This is because margins actually expects not only 1.foreign but also 0b.foreign, i.e., the base level, in the matrix e(b). Things get more complicated with categorical variables that have more than two levels, interactions, etc.
            I agree with daniel klein's point concerning the -dydx()- option with categorical variables having more than 2 levels. Here, margins, not finding a base level, will drop one of the existing categories. A quick fix is to add a variable that will be omitted from the regression and to give its column the base-level name (here 1b.rep78) when renaming the columns of e(b). Here is the approach, following #3.

            Code:
            sysuse auto, clear
            regress mpg weight i.rep78
            margins, dydx(*) atmeans
            *GENERATE ZERO VAR
            gen zero=0
            xi: regress mpg weight zero i.rep78
            mat l e(b)
            *RENAME FACTOR VARIABLE
            local colnms: coln e(b)
            forval i=2/5{
            local colnms = subinstr("`colnms'","_Irep78_`i'", "`i'.rep78", .)
            }
            local colnms = subinstr("`colnms'","o.zero", "1b.rep78", .)
            mat b= e(b)
            mat colnames b= `colnms'
            *PROGRAM TO RENAME e(b)
            capt prog drop repb
            program repb, eclass
            erepost b= b, rename
            end
            *RUN PROGRAM
            repb
            margins, dydx(*) atmeans
            As long as the interactions are specified properly when using the xi prefix (i.e., coefficient estimates are the same as would be obtained using factor variables) and their columns appropriately renamed subsequently, I do not see why the margins results should be affected. As stated in #3, in my opinion, the observed difference is simply a naming issue, but I would love to be corrected otherwise.

            Results:

            Code:
            *FVARS
            . margins, dydx(*) atmeans
            
            Conditional marginal effects                    Number of obs     =         69
            Model VCE    : OLS
            
            Expression   : Linear prediction, predict()
            dy/dx w.r.t. : weight 2.rep78 3.rep78 4.rep78 5.rep78
            at           : weight          =    3032.029 (mean)
                           1.rep78         =    .0289855 (mean)
                           2.rep78         =     .115942 (mean)
                           3.rep78         =    .4347826 (mean)
                           4.rep78         =    .2608696 (mean)
                           5.rep78         =    .1594203 (mean)
            
            ------------------------------------------------------------------------------
                         |            Delta-method
                         |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  weight |   -.005503    .000601    -9.16   0.000     -.006704    -.004302
                         |
                   rep78 |
                      2  |  -.4786043   2.765035    -0.17   0.863    -6.004085    5.046877
                      3  |  -.4715623   2.553145    -0.18   0.854    -5.573614     4.63049
                      4  |  -.5990319   2.606599    -0.23   0.819    -5.807905    4.609841
                      5  |   2.086276   2.724817     0.77   0.447    -3.358836    7.531388
            ------------------------------------------------------------------------------
            Note: dy/dx for factor levels is the discrete change from the base level.
            
            *WORKAROUND
            
            . margins, dydx(*) atmeans
            Warning: cannot perform check for estimable functions.
            
            Conditional marginal effects                    Number of obs     =         69
            Model VCE    : OLS
            
            Expression   : Linear prediction, predict()
            dy/dx w.r.t. : weight 2.rep78 3.rep78 4.rep78 5.rep78
            at           : weight          =    3032.029 (mean)
                           1.rep78         =    .0289855 (mean)
                           2.rep78         =     .115942 (mean)
                           3.rep78         =    .4347826 (mean)
                           4.rep78         =    .2608696 (mean)
                           5.rep78         =    .1594203 (mean)
            
            ------------------------------------------------------------------------------
                         |            Delta-method
                         |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  weight |   -.005503    .000601    -9.16   0.000     -.006704    -.004302
                         |
                   rep78 |
                      2  |  -.4786043   2.765035    -0.17   0.863    -6.004085    5.046877
                      3  |  -.4715623   2.553145    -0.18   0.854    -5.573614     4.63049
                      4  |  -.5990319   2.606599    -0.23   0.819    -5.807905    4.609841
                      5  |   2.086276   2.724817     0.77   0.447    -3.358836    7.531388
            ------------------------------------------------------------------------------
            Note: dy/dx for factor levels is the discrete change from the base level.



            • #7
              Originally posted by Andrew Musau
              As long as the interactions are specified properly when using the xi prefix (i.e., coefficient estimates are the same as would be obtained using factor variables) and their columns appropriately renamed subsequently, I do not see why the margins results should be affected. As stated in #3, in my opinion, the observed difference is simply a naming issue, but I would love to be corrected otherwise.
              Basically, I agree. That is why I have suggested essentially the same "bogus-approach" a couple of years ago.

              Yet, the margins command is very powerful and I do not feel sufficiently confident to state that changing the matrix names of e(b) would yield correct results in general (non-linear and/or mixed models, constraints, etc.). Nobody has claimed that. But I see how the proposed solutions could be misinterpreted to imply a general workaround. Therefore, I just feel it is important to explicitly state that ad-hoc fixes, hacking Stata's internals, might have side effects that are then hard to spot. In other words: before using those suggestions, make sure you fully understand why the code works, both in a technical and a statistical sense.
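
One way to check a renaming workaround is to compare its margins against a factor-variable fit of the same model, wherever such a fit is possible. The sketch below does this for the regress example from #3. It assumes erepost is installed (ssc install erepost), and the matrix names m_fv and m_xi are only illustrative. No such reference fit exists for metareg itself, so agreement here does not establish correctness there.

```stata
sysuse auto, clear

* reference: factor-variable fit
regress mpg weight i.foreign
margins i.foreign
matrix m_fv = r(b)

* workaround from #3: fit with xi:, then rename the column in e(b)
xi: regress mpg weight i.foreign
local colnms : colnames e(b)
local colnms = subinstr("`colnms'", "_Iforeign_1", "1.foreign", .)
matrix b = e(b)
matrix colnames b = `colnms'
capture program drop repb
program repb, eclass
    erepost b = b, rename
end
repb
margins i.foreign
matrix m_xi = r(b)

* largest relative difference between the two sets of margins
display mreldif(m_fv, m_xi)
```

A value at or near zero indicates that the two sets of point estimates agree for this model; the same comparison could be repeated on r(V) for the variances.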

              Best
              Daniel
              Last edited by daniel klein; 26 Mar 2020, 07:04.



              • #8
                Thank you daniel klein , Andrew Musau and FernandoRios for the enlightening discussion.

                In #4, when I mentioned an adaptation, it was practically the same as Andrew's code in #6.

                By the way, with regard to the (margins) results, I also did a kind of sensitivity analysis under a Bayesian approach, which gave similar results, and I believe this is somewhat reassuring.
                Best regards,

                Marcos
