
  • Regression on panel data with autocorrelation

    Hello,

    I’m working on the analysis part of my dissertation. I should probably put my question to a biostatistician, but since my faculty doesn’t offer that possibility I thought I would try my luck here. Sorry if this is really off the mark…

    I’m trying to analyze a dataset that consists of 1 dependent variable (number of cases resistant to a certain drug per 1000 inhabitants), 5 independent variables for the same year (doses prescribed of a particular drug per 1000 inhabitants) and 5 independent variables for the previous year (doses prescribed of a particular drug per 1000 inhabitants).

    These data were measured over 5 years in every province (96 provinces). I therefore have 5*96=480 rows in my dataset, each of which includes that 1 dependent and those 10 independent variables. That means my data look something like this:
    Year  Prov  Resis (dependent)  UseA_lag0  UseA_lag1  UseB_lag0  UseB_lag1  UseC_lag0  UseC_lag1
    2008  21    5,453              425,25     390,64     459,02     424,41     592,79     558,18
    2009  21    6,458              430,85     425,25     464,62     459,02     598,39     592,79
    2010  21    7,001              485,45     430,85     519,22     464,62     652,99     598,39
    2011  21    7,009              490,46     485,45     524,23     519,22     658        652,99
    2012  21    7,401              486,45     490,46     520,22     524,23     653,99     658
    2008  45    3,005              385,65     486,45     419,42     520,22     553,19     653,99
    2009  45    3,452              390,64     385,65     424,41     419,42     558,18     553,19
    There are a couple more variables that I am correcting for, such as age, gender, etc., which I am leaving out here for the sake of simplicity.

    I would like to see if last year’s value of the independent variables can predict the dependent variable (one national model).

    My question is not about how to construct lags. I know there are very nice ways of creating lag variables, but as you can see I just added them as separate variables.
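To make the lag layout concrete, here is a minimal Python sketch (an illustration only, not Stata code) of what a previous-year value looked up within each province amounts to. It uses the UseA_lag0 column from the sample rows above, with decimal commas written as decimal points. The names `rows` and `add_lag` are hypothetical. In this sketch the 2008 rows get no lag simply because 2007 is not among the sample rows.

```python
# Sample rows taken from the table: (year, province, UseA_lag0).
rows = [
    (2008, 21, 425.25),
    (2009, 21, 430.85),
    (2010, 21, 485.45),
    (2011, 21, 490.46),
    (2012, 21, 486.45),
    (2008, 45, 385.65),
    (2009, 45, 390.64),
]


def add_lag(rows):
    """Return (year, province, use, use_previous_year) tuples.

    The lag is looked up by (province, year - 1), so it never crosses
    from one province into another. A missing lag is None.
    """
    by_key = {(prov, year): use for year, prov, use in rows}
    return [
        (year, prov, use, by_key.get((prov, year - 1)))
        for year, prov, use in rows
    ]


lagged = add_lag(rows)
for record in lagged:
    print(record)
```

Looking the lag up by province and year, rather than taking the value from the row above, is the design choice that keeps the first year of one province from picking up the last year of another.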

    My question is twofold:

    1) I think I should call this type of data “autocorrelated” because of the repeated measurements in the same area over time, but also “panel data” because the measurements are grouped by province. Is that correct?

    2) I initially thought I would just use linear regression (the regress command). However, I guess I need to take the autocorrelation into account, meaning that the value for a given province in 2009 is correlated with its value in 2008. I think I would know how to do that on its own, but the panel structure makes it a bit too complicated for me. My second question therefore is: which command could I use to fit a linear regression that takes both the autocorrelation and the panel structure into account?
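To illustrate what "the 2009 value is correlated with the 2008 value" means in terms of the data, here is a rough Python sketch (again an illustration, not Stata) that pairs each Resis value with the previous year's value in the same province and computes a plain Pearson correlation over those pairs. The names `resis`, `lag_pairs` and `pearson` are hypothetical, and only the seven sample rows are used. This is a crude descriptive check under those assumptions, not a formal test for serial correlation: pooled like this, the number also reflects differences in level between provinces.

```python
from math import sqrt

# Resis values from the sample rows, keyed by (province, year),
# with decimal commas written as decimal points.
resis = {
    (21, 2008): 5.453,
    (21, 2009): 6.458,
    (21, 2010): 7.001,
    (21, 2011): 7.009,
    (21, 2012): 7.401,
    (45, 2008): 3.005,
    (45, 2009): 3.452,
}


def lag_pairs(values):
    """Pair each value with the previous year's value in the same province."""
    pairs = []
    for (prov, year), current in sorted(values.items()):
        previous = values.get((prov, year - 1))
        if previous is not None:
            pairs.append((previous, current))
    return pairs


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


pairs = lag_pairs(resis)
r = pearson([p for p, _ in pairs], [c for _, c in pairs])
print(len(pairs), "within-province pairs, lag-1 correlation:", round(r, 3))
```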

    Sorry if this really isn't the type of forum where I should ask this. Like I said, I thought I’d try my luck.

    Best regards,

    Michiel