
  • Durbin Watson

    I ran a simple multiple regression: 1 DV, 2 IVs. I wanted to check the independence of the IVs, so I ran the Durbin-Watson test with -estat dwatson- and got this error message: time variable not set, use tsset varname ...

    I do not understand this. The Stata manual says to simply type estat dwatson.

    Any ideas?

  • #2
    The Durbin-Watson test is a test of serial correlation of the regression residuals, which requires knowing the chronological order of the observations. The only way Stata will know that is if you first -tsset- the data. See -help tsset- for the use of that command.
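
    For data that do have a chronological order, a minimal sketch of that workflow would be (the ordering variable -obsno- here is hypothetical; substitute whatever variable records the order of your observations):

    Code:
    * declare the time variable so Stata knows the order of the observations
    tsset obsno
    * fit the regression, then run the Durbin-Watson test on its residuals
    regress y x1 x2
    estat dwatson

    If your data are purely cross-sectional with no natural ordering, there is nothing sensible to -tsset-, and the test is not meaningful in the first place.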

    However, I think most people would not describe testing for serial correlation of the residuals as a test of the "independence of the IVs." So if you mean something other than looking for serial correlation of residuals, then you need to be clear about what exactly you want to check for and then find an appropriate approach to do that, which would not be a Durbin-Watson test.



    • #3
      Thanks, Clyde. I got my information about using Durbin-Watson from https://www.statology.org/multiple-l...n-assumptions/ . Apparently the rather simplistic instruction at Item 3, Independence of IVs, is either not correct or too abbreviated. Further checking suggests vif may be used to test the independence of the IVs, but that is the same test used to determine multicollinearity. If vif is used for both, then it would seem multicollinearity and independence of IVs are one and the same, at least for practical purposes.
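
      For reference, the usual way to get variance inflation factors in Stata is as a postestimation command after the regression (variable names y, x1, x2 assumed, as elsewhere in this thread):

      Code:
      regress y x1 x2
      * variance inflation factors for the predictors; values near 1 suggest little collinearity
      estat vif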

      Thanks again for your prompt and helpful response.



      • #4
        The source you give does mention serial correlation twice and autocorrelation once under that heading. But it's not a very clear explanation at all.

        Twice, the assumption (following others, I prefer to say "ideal condition") that residuals are normally distributed is referred to as an assumption of multivariate normality. But the residuals (or more pedantically the errors) are a single distribution; there is nothing multivariate there.

        I would start by looking at the correlations between your predictor variables and at a scatter plot matrix. I wish that the terminology "independent variables" would just fade away.
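
        With just two predictors, the first of those is a one-liner (variable names assumed):

        Code:
        * pairwise correlation between the two predictors
        correlate x1 x2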

        It's really hard to be so concise without introducing scope for misunderstanding. For example, adding the square of a predictor to model a quadratic relationship usually means adding a variable -- that square -- that is highly correlated with the original predictor, which rather goes against other advice. I think the apparent contradiction can be resolved, but in a much longer discussion.
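
        One standard device in that longer discussion is centering a predictor before squaring it, which typically reduces the correlation between the linear and quadratic terms a great deal. A sketch, with assumed variable names:

        Code:
        * center x1 at its mean, then square the centered version
        summarize x1, meanonly
        generate x1c = x1 - r(mean)
        generate x1c2 = x1c^2
        * the centered pair is usually far less correlated than x1 and its raw square
        correlate x1c x1c2
        regress y x1c x1c2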



        • #5
          Thank you, Nick, for this clarification and for confirming that the source I used was, to say the least, confusing.

          I don't quite understand what you mean by "the correlations between your predictor variables and a scatter plot matrix." Scatter plot matrix of what?



          • #6
            A scatterplot matrix of your data, including the response and the predictors. Suppose your response (DV in your terminology) is called y and your predictors (IVs) are called x1 x2. Then

            Code:
            graph matrix x1 x2 y, half
            gives you a scatterplot matrix, with the detail that the last-named variable y appears on the vertical axis of each graph on the last row, as is conventional.

            In your case, you have just two rows. Naturally the option half is not compulsory, but it is usually a good idea.



            • #7
              Thanks.
