
  • #16
    The approach of duplicating a variable may be fine if you want only the coefficients, but I am guessing it will cause you grief if you also want the marginal effects.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #17
      Also, my factor variable has a lot of levels, so specifying a constraint for each level that should not be interacted is extremely cumbersome.
      If I want to impose constraints, I find it very useful to first run the command with the coefl option. Then it is easy to name the parameters I want to constrain. You can just copy and paste the parameter names. For example,

      Code:
      webuse nhanes2f, clear
      logit diabetes i.race weight i.race#c.weight, coefl nolog
      constraint 1 _b[3.race#c.weight] = 0
      logit diabetes i.race weight i.race#c.weight, constraints(1) nolog
      Code:
      . webuse nhanes2f, clear
      
      . logit diabetes i.race weight i.race#c.weight, coefl nolog
      
      Logistic regression                             Number of obs     =     10,335
                                                      LR chi2(5)        =      65.47
                                                      Prob > chi2       =     0.0000
      Log likelihood = -1966.3317                     Pseudo R2         =     0.0164
      
      -------------------------------------------------------------------------------
           diabetes |      Coef.  Legend
      --------------+----------------------------------------------------------------
               race |
             Black  |   .0257155  _b[2.race]
             Other  |   .3318753  _b[3.race]
                    |
             weight |   .0169948  _b[weight]
                    |
      race#c.weight |
             Black  |   .0064931  _b[2.race#c.weight]
             Other  |  -.0026229  _b[3.race#c.weight]
                    |
              _cons |  -4.313413  _b[_cons]
      -------------------------------------------------------------------------------
      
      . constraint 1 _b[3.race#c.weight] = 0
      
      . logit diabetes i.race weight i.race#c.weight, constraints(1) nolog
      
      Logistic regression                             Number of obs     =     10,335
                                                      Wald chi2(4)      =      73.49
      Log likelihood = -1966.3384                     Prob > chi2       =     0.0000
      
       ( 1)  [diabetes]3.race#c.weight = 0
      -------------------------------------------------------------------------------
           diabetes |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      --------------+----------------------------------------------------------------
               race |
             Black  |   .0219021   .5441304     0.04   0.968    -1.044574    1.088378
             Other  |    .157999   .3465437     0.46   0.648    -.5212141    .8372122
                    |
             weight |   .0169442   .0030932     5.48   0.000     .0108816    .0230068
                    |
      race#c.weight |
             Black  |   .0065437   .0066256     0.99   0.323    -.0064423    .0195297
             Other  |          0  (omitted)
                    |
              _cons |    -4.3096   .2389319   -18.04   0.000    -4.777898   -3.841302
      -------------------------------------------------------------------------------
      
      .
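      With many levels, the constraints do not have to be typed one by one. The following is a minimal sketch under stated assumptions: it assumes the base level is the lowest level of race (the default), and the local macro names (keep, levs, base, rest, clist) are made up for illustration. It loops over the non-base levels, skips the one level whose interaction should stay free, and defines a zero constraint for each remaining interaction.

      ```stata
      webuse nhanes2f, clear
      constraint drop _all                 // start with no constraints defined
      local keep 2                         // illustrative choice: interaction left unconstrained
      levelsof race, local(levs)
      gettoken base rest : levs            // assumes the lowest level is the base level
      local clist
      local i 0
      foreach l of local rest {
          if `l' == `keep' {
              continue                     // leave this interaction free
          }
          local ++i
          constraint `i' _b[`l'.race#c.weight] = 0
          local clist `clist' `i'          // collect the constraint numbers
      }
      logit diabetes i.race weight i.race#c.weight, constraints(`clist') nolog
      ```

      With keep set to 2, this defines the same single constraint as in the example above.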
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology


      • #18
        Tech Support has informed me that a Stata developer will comment here shortly.



        • #19
          Level values of a factor variable are a property of said variable
          throughout a varlist specification. This means that levels
          specified on a factor variable in one term within a varlist will
          propagate to the other terms containing that factor variable.

          Let's start with a simple model with a factor variable used in a single
          main-effects term.

          In the following example, the i. operator on rep78
          tells regress to treat rep78 as a factor variable
          and to find all its levels in the estimation sample.

          Code:
          sysuse auto
          regress price mpg i.rep78
          Stata only searched for the levels of rep78 because no levels
          were explicitly specified.
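
          One way to see which levels Stata finds, without fitting a model, is fvexpand, which expands a factor-variable list and returns the result in r(varlist). This is only a small sketch for checking; note that fvexpand looks at the data in memory (optionally restricted with if or in) rather than at an estimation sample.

          ```stata
          sysuse auto, clear
          * Expand the term and show the indicator variables Stata would create
          fvexpand i.rep78
          display "`r(varlist)'"
          ```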

          You are able to restrict which levels to use in a model by explicitly
          specifying them. You can specify the levels as part of the i.
          operator

          Code:
          regress price mpg i(1 3 5).rep78
          or by spelling out each indicator variable explicitly

          Code:
          regress price mpg 1.rep78 3.rep78 5.rep78
          By default, the lowest of the levels specified is used as the base level.

          You can even specify all levels and pick which ones to "omit" by using
          the o. operator. Again, this can be done by specifying the
          levels as part of the o. operator

          Code:
          regress price mpg i(1 3 5)o(2 4).rep78
          or by spelling out each indicator variable

          Code:
          regress price mpg 1.rep78 2o.rep78 3.rep78 4o.rep78 5.rep78
          The challenge here is to understand what it means when a factor variable
          participates in more than one term within a varlist.

          Remember, level values of a factor variable are a property of said
          variable throughout a varlist specification.

          Here is Ariel's original test case:

          Code:
          regress price c.mpg##1.rep78 i.rep78
          The first regressor term is

          Code:
          c.mpg##1.rep78
          which expands to

          Code:
          mpg 1.rep78 c.mpg#1.rep78
          Since a level for rep78 was specified, the next regressor term

          Code:
          i.rep78
          expands to

          Code:
          1.rep78
          Duplicate elements of factor variable terms reduce down, so Ariel's
          original test case translates into

          Code:
          regress price mpg 1.rep78 c.mpg#1.rep78
          This is not what Ariel wanted. Joseph Coveney then pointed out that the
          following did not do what was expected either.

          Code:
          regress price 2b.rep78 3.rep78 4.rep78 5.rep78 c.mpg##1.rep78
          This translates to

          Code:
          regress price 1.rep78           ///
                        2b.rep78          ///
                        3.rep78           ///
                        4.rep78           ///
                        5.rep78           ///
                        c.mpg##1.rep78    ///
                        c.mpg##2b.rep78   ///
                        c.mpg##3.rep78    ///
                        c.mpg##4.rep78    ///
                        c.mpg##5.rep78
          Based on the discussion, it appears that Ariel wants the following:

          Code:
          regress price mpg               ///
                        1.rep78           ///
                        2b.rep78          ///
                        3.rep78           ///
                        4.rep78           ///
                        5.rep78           ///
                        c.mpg#1.rep78     ///
                        co.mpg#2o.rep78   ///
                        co.mpg#3o.rep78   ///
                        co.mpg#4o.rep78   ///
                        co.mpg#5o.rep78   ///
                        , allbase
          A shorter syntax for this is

          Code:
          regress price co.mpg##b(2)o(2/5).rep78, allbase
          This can be generalized with a few lines of Stata code:

          Code:
          local case 1
          levelsof rep78, local(levs)
          local levs : list levs - case
          gettoken base : levs
          regress price co.mpg##i(`case')b(`base')o(`levs').rep78, allbase
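
          The expansions described above can be checked without fitting any model. As a hedged sketch, fvexpand expands a factor-variable list and leaves the result in r(varlist), so the two varlists that did not behave as expected can be inspected directly:

          ```stata
          sysuse auto, clear
          * Ariel's original varlist: the level given in the interaction
          * also applies to the i.rep78 term
          fvexpand c.mpg##1.rep78 i.rep78
          display "`r(varlist)'"

          * Joseph Coveney's variant: the levels given in the main effects
          * also apply to the interaction term
          fvexpand 2b.rep78 3.rep78 4.rep78 5.rep78 c.mpg##1.rep78
          display "`r(varlist)'"
          ```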



          • #20
            Thanks Jeff. This should definitely be in a FAQ or in the manual.

            Still, I wonder why it works this way. You say

            "Remember, level values of a factor variable are a property of said variable throughout a varlist specification."
            Why? If I say

            Code:
            regress price i.rep78 c.mpg c.mpg#1.rep78
            why can't I just get the one interaction term I want along with all the terms for rep78?

            The code you came up with works, but it is far from intuitive, at least to me. Is there some reason that the code I prefer could actually cause serious problems, at least in some situations?



            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology