  • manually created interaction term versus ## command

    I would like to include an interaction between two continuous variables in an OLS model. I originally computed the interaction term by hand, i.e. gen new_variable = variable_A * variable_B, and included both variables and the interaction term in the model. Next, I fitted the same model using c.variable_A##c.variable_B. I expected the coefficients to be the same, but they differ dramatically. I use Stata 13. Why are the coefficients not identical?

  • #2
    Are you including the lower-order terms (i.e. 'variable_A' and 'variable_B') in addition to 'new_variable' in your first ('by hand') model? This is what the factor-variable operator ## does, and what you should do.


    If this is not the reason for the differences, show us the exact commands you typed.
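
    A minimal sketch of the point about lower-order terms, using the auto dataset shipped with Stata rather than the original poster's data (the variable name tw is made up for illustration). If the by-hand model leaves out either main effect, its coefficients will not match those from ##; once both are included, the two specifications should agree.

    ```stata
    sysuse auto, clear
    generate tw = trunk * weight      // hand-made product term (hypothetical name)

    * Product term only: main effects omitted, so this is a different model
    regress price tw

    * Product term plus both main effects
    regress price trunk weight tw

    * ## expands to trunk, weight, and c.trunk#c.weight
    regress price c.trunk##c.weight
    ```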

    Best
    Daniel



    • #3
      Katharina should follow Daniel's advice, as the following example shows:
      Code:
      sysuse auto.dta
      g Int= trunk* weight
      reg price trunk weight Int
      reg price trunk weight c.trunk#c.weight
      I hope this helps.

      Kind regards,
      Carlo
      (Stata 18.0 SE)



      • #4
        Hi,
        Sorry to ask this question, but could you please clarify the difference between the last two Stata commands that you wrote?
        reg price trunk weight Int
        reg price trunk weight c.trunk#c.weight
        Thank you



        • #5
          Sugandha:
          actually, there's no substantive difference, as you can see from the following toy example (I also add a third command that fits the same regression model):
          Code:
          . sysuse auto.dta
          (1978 Automobile Data)
          
          . g Int= trunk* weight
          
          .
          . reg price trunk weight Int
          
                Source |       SS           df       MS      Number of obs   =        74
          -------------+----------------------------------   F(3, 70)        =     10.01
                 Model |   190636755         3  63545585.1   Prob > F        =    0.0000
              Residual |   444428641        70  6348980.58   R-squared       =    0.3002
          -------------+----------------------------------   Adj R-squared   =    0.2702
                 Total |   635065396        73  8699525.97   Root MSE        =    2519.7
          
          ------------------------------------------------------------------------------
                 price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                 trunk |  -287.6778   309.9737    -0.93   0.357    -905.9009    330.5452
                weight |   1.172154   1.510518     0.78   0.440    -1.840479    4.184786
                   Int |   .0754153   .0979483     0.77   0.444    -.1199365     .270767
                 _cons |   3284.654   4248.161     0.77   0.442    -5188.037    11757.34
          ------------------------------------------------------------------------------
          
          .
          . reg price trunk weight c.trunk#c.weight
          
                Source |       SS           df       MS      Number of obs   =        74
          -------------+----------------------------------   F(3, 70)        =     10.01
                 Model |   190636755         3  63545585.1   Prob > F        =    0.0000
              Residual |   444428641        70  6348980.58   R-squared       =    0.3002
          -------------+----------------------------------   Adj R-squared   =    0.2702
                 Total |   635065396        73  8699525.97   Root MSE        =    2519.7
          
          ----------------------------------------------------------------------------------
                     price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -----------------+----------------------------------------------------------------
                     trunk |  -287.6778   309.9737    -0.93   0.357    -905.9009    330.5452
                    weight |   1.172154   1.510518     0.78   0.440    -1.840479    4.184786
                           |
          c.trunk#c.weight |   .0754153   .0979483     0.77   0.444    -.1199365     .270767
                           |
                     _cons |   3284.654   4248.161     0.77   0.442    -5188.037    11757.34
          ----------------------------------------------------------------------------------
          
          . reg price  c.trunk##c.weight
          
                Source |       SS           df       MS      Number of obs   =        74
          -------------+----------------------------------   F(3, 70)        =     10.01
                 Model |   190636755         3  63545585.1   Prob > F        =    0.0000
              Residual |   444428641        70  6348980.58   R-squared       =    0.3002
          -------------+----------------------------------   Adj R-squared   =    0.2702
                 Total |   635065396        73  8699525.97   Root MSE        =    2519.7
          
          ----------------------------------------------------------------------------------
                     price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -----------------+----------------------------------------------------------------
                     trunk |  -287.6778   309.9737    -0.93   0.357    -905.9009    330.5452
                    weight |   1.172154   1.510518     0.78   0.440    -1.840479    4.184786
                           |
          c.trunk#c.weight |   .0754153   .0979483     0.77   0.444    -.1199365     .270767
                           |
                     _cons |   3284.654   4248.161     0.77   0.442    -5188.037    11757.34
          ----------------------------------------------------------------------------------
          
          .
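          However, these commands differ in how well they work with postestimation commands: the second one works with -margins- and -marginsplot-, which then know that the product term is built from -trunk- and -weight- (whereas the first one does not, because -Int- is just another variable as far as Stata is concerned). The third command does the same job as the second one, but with more parsimonious code.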
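
          As a rough sketch of what the factor-variable notation buys you (the weight values 2000, 3000 and 4000 are arbitrary choices for illustration, not taken from this thread): after the ## version, -margins- can report the marginal effect of -trunk- at chosen values of -weight-, and -marginsplot- can graph the result.

          ```stata
          sysuse auto, clear
          regress price c.trunk##c.weight

          * Marginal effect of trunk on price at three illustrative weights
          margins, dydx(trunk) at(weight=(2000 3000 4000))

          * Plot those marginal effects with their confidence intervals
          marginsplot
          ```

          With a hand-made product variable such as -Int-, -margins, dydx(trunk)- would hold -Int- fixed and return only the coefficient on -trunk-.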
          Kind regards,
          Carlo
          (Stata 18.0 SE)



          • #6

            Ok, thank you so much for your reply. I shall keep the differences in mind.
            Thanks a lot.



            • #7
              Dear Sir Carlo Lazzaro, when I run this command on my data, it shows this error:
              . g int=intro* CFO_sd
              too few variables specified
              Would you please guide me as to why it's not working?



              • #8
                int is a reserved word indicating a storage type for variables. Choose a different name. See [U] 11.3 on your set-up or at https://www.stata.com/manuals/u11.pdf
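
                A small sketch of what goes wrong and one way around it, using the auto dataset instead of the poster's variables (the names trunk_x_weight and trunk_x_weight_d are only illustrative). Because -int- is a storage type, -generate- reads it as the type and then finds no variable name before the equals sign.

                ```stata
                sysuse auto, clear

                * Fails: -int- is parsed as a storage type, so no new variable name is found
                capture noisily generate int = trunk * weight
                display "return code: " _rc

                * Works: a descriptive name that is not a reserved word
                generate trunk_x_weight = trunk * weight

                * Works: an explicit storage type followed by the new variable name
                generate double trunk_x_weight_d = trunk * weight
                ```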



                • #9
                  ...or use an upper-case initial I (that is, -Int- vs -int-).
                  Kind regards,
                  Carlo
                  (Stata 18.0 SE)



                  • #10
                    Int would be a different name, so that's not a different suggestion. My advice, however, is to avoid reserved words or words close to them completely. All of us are in situations where we need to understand our own code later and many of us are in situations where other people need to understand our code. Minor tricks, dodges or fudges don't help either goal. Whatever your code, I would write in a comment like

                    Code:
                    * Note that -int- is a reserved word: See [U] 11.3
                    as a reminder to whoever reads the code, including yourself.



                    • #11
                      Nick is obviously correct.
                      My previous reply was not intended as a different suggestion.
                      I should have been more detailed in adding that -Int- was the name of the variable I created in the toy-example reported in #5: that's why I mentioned it.
                      Admittedly, I'd forgotten that -int- is a reserved word that Stata uses for a storage type, and I've just refreshed my memory by taking a look at -help data_types-: thanks to Nick for his valuable reminder.
                      Kind regards,
                      Carlo
                      (Stata 18.0 SE)
