  • Testing panel data for normality -- is sktest appropriate?

    Hello list,

    After searching Statalist and the Web, I can't seem to find guidance on what seems like a simple question: What is the appropriate test for normality for a panel data set?

    My dependent variable is a ratio (megawatts per state-year), my panel IDs are states, my time period is 12 years and my regressors are all numeric (ratios and intervals).

    I am attaching the output from sktest. As is evident, most of the variables are skewed. Can I report these results, or is there an alternative that's more appropriate for panel data?

    Thanks!

    -nick


  • #2
    One answer is that possibly no test is of any use at all. Without knowing anything much about your panel, I'd guess abstractly at different means and standard deviations across panels, and at no reason to suppose that the pooled result is normal, except that it would seem to make the researcher's life simpler.

    Which assumption of which model you are using implies that the pooled distribution should be normal at all? Whenever normality enters, it is as an assumption that errors are normal conditional on some structure, a quite different ball game.

    This kind of exercise underlies my puzzlement:

    Code:
     
    webuse grunfeld 
    
    foreach v in invest kstock mvalue { 
        qnorm `v', name(`v')
    }
    There could be many reasons to transform your variables, but not this.

    (You could standardize first by mean and SD for each panel, which disposes of one point here, except that it doesn't, as you presumably don't intend to feed the standardized variables into your model.)
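
    A minimal sketch of that within-panel standardization, assuming the same Grunfeld data as above with company as the panel identifier. The mean_, sd_ and z_ variable names are made up for illustration:

    Code:
    webuse grunfeld, clear

    foreach v in invest kstock mvalue {
        * panel-specific mean and SD of each variable
        egen double mean_`v' = mean(`v'), by(company)
        egen double sd_`v' = sd(`v'), by(company)
        * standardize within company, then look at the pooled z-scores
        generate double z_`v' = (`v' - mean_`v') / sd_`v'
        qnorm z_`v', name(z_`v', replace)
    }

    This only removes differences in level and spread between panels; it says nothing about whether the errors of any particular model are normal.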



    • #3
      I'm using pooled OLS, xtreg and a spatial panel regression (xsmle).

      Following Kennedy's A Guide To Econometrics (2008), I'm concerned that my data violate Assumption One of the CLR model, because the relationship between Y and the Xs looks nonlinear. Or does that not apply to a panel?
      Last edited by Nick Cain; 17 Jun 2015, 13:43.



      • #4
        I don't know what Kennedy (2008) is (full references please), but normality and linearity are different things, and sktest is not appropriate for checking linearity. I don't understand the structure of your data at all, but why not start with a lowess curve, or try either fp or cubic splines (mkspline)?
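
        A rough sketch of the lowess starting point, using the Grunfeld data as a stand-in because the poster's own variables are not shown. invest plays the role of the response and mvalue the suspect predictor, and the graph names are arbitrary:

        Code:
        webuse grunfeld, clear

        * pooled smooth of the response against one predictor
        lowess invest mvalue, name(lowess_pooled, replace)

        * the same smooth drawn separately for each panel
        twoway (scatter invest mvalue) (lowess invest mvalue), by(company) name(lowess_bypanel, replace)

        If the smooth bends in the same way within most panels, that is a hint that a curved term is worth trying in the model.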



        • #5
          Rich: Thanks for your note. Indeed, they are not the same thing. I started with sktest, but have also run Q-Q plots and augmented component-plus-residual plots. I see a problem with nonlinearity between the DV and the regressors, so I'm just trying to understand the best approaches to diagnosing the problem.



          • #6
            Perhaps you could add a polynomial term for the suspect predictor(s) as one way of exploring nonlinearity.
            Code:
            summarize suspectful, meanonly
            generate double centered_suspectful = suspectful - r(mean)
            xtreg response c.unimpeachable c.centered_suspectful##c.centered_suspectful, i(entity) mle nolog
            estimates store Full
            xtreg response c.unimpeachable c.centered_suspectful, i(entity) mle nolog
            lrtest Full // Or just inspect the regression coefficients and their standard errors



            • #7
              first, I don't see non-linearity as a problem - unless, of course, you ignore it

              second, I gave you some suggestions (fp and mkspline - see the help files); Joseph has given you another that is simpler but less flexible



              • #8
                Thanks for both your comments. From what I'm reading about fp and mkspline, they are both for univariate regressions so I'm not sure how I would apply them to a multivariate panel data set?



                • #9
                  Hi folks. Attached is a scatter plot showing my DV vs years and also a scatter plot of my DV against one of several predictors (DV is megawatts of wind power per state-year).

                  I know from previous analysis that the growth in wind power (in many states) follows a nonlinear pattern.

                  I've been asked to account for this in my panel regression equation and so that's my interest in testing linear relationships and normality of data.

                  Any thoughts about the best estimation approach given that at least one of my predictors is time invariant?



                  • #10
                    mkspline makes splines for a variable (restricted cubic splines with its cubic option), which you can then include with other variables in your model

                    try "mfp" instead of "fp"

                    these are each very flexible; if you want to start more simply, you can follow Joseph's suggestion in #6
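
                    a sketch of both suggestions, assuming the Grunfeld data as a stand-in for the poster's panel (invest as response, mvalue as the suspect predictor, kstock as another regressor); the mv_rcs stub name is made up, and mfp is shown with pooled regress only, so check help mfp for the estimators it supports:

                    Code:
                    webuse grunfeld, clear
                    xtset company year

                    * restricted cubic spline basis for mvalue; 4 knots give mv_rcs1-mv_rcs3
                    mkspline mv_rcs = mvalue, cubic nknots(4)
                    xtreg invest mv_rcs* kstock, fe

                    * joint test of the nonlinear spline terms
                    testparm mv_rcs2 mv_rcs3

                    * multivariable fractional polynomials in a pooled regression
                    mfp: regress invest mvalue kstock

                    the spline variables enter the model like any other regressors, so the remaining predictors stay in the equation unchanged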

                    note that your photos are not readable, at least by me - this is discussed in the FAQ



                    • #11
                      Thanks Rich -- that is very helpful.

                      I searched the FAQ, but didn't see any advice on image formats. I'm reattaching these as JPGs instead of PNGs.



                      • #12
                        still can barely see these; what I was referring to is the following from paragraph 12 of the FAQ:
                        Code:
                         You can attach Stata graphs or other images. Note, however, that Stata graphs and other images are highly readable when inserted as .png file attachments (start with the Clipboard icon) and far less readable if inserted as photos (using the Camera icon).
                          Screenshots are possible but often do not help much. Even if they are legible, and they often are not, they do not allow copy and paste.



                        • #13
                          Ok, does this work better?
                          Attached Files
