Hi all, I am trying to get my feet wet with panel data and multilevel modeling for the first time. I am struggling a little to understand how this terminology is used across disciplines, and how to apply it in Stata code (I am more familiar with R).
I have panel data of individuals, within families (some families have multiple individuals surveyed), measured at 2 time points. I would offer a -dataex- example, but with only 99 observations the model does not converge. My actual data has about 4,000 observations (and 2,300 groups/families). My questions are more general than about my specific data, though.
For a mixed model, I understand the code to be something like this, where y, x1, and x2 are all measured at both t1 and t2:
Code:
mixed y x1 x2 i.wave || family_id: || individual_id:
So this would be a three-level model, with random intercepts by individual_id and family_id, with individual_id nested within family_id. Is that accurate? It also seems like this tends to be the preferred model specification in most circumstances; is that intuition correct?
Then, what I'm additionally trying to understand is the difference between the code above and the code below, -xtreg- with re:
Code:
xtset individual_id wave
xtreg y x1 x2 i.wave, re
Are these very similar? Does this effectively allow random intercepts as well, similar to the above?
Finally, what about simply employing a "standard" linear regression, but using clustered SEs to account for clustering within families? Like so:
Code:
reg y x1 x2 i.wave ..., vce(cluster family_id)
Where ... represents additional time-invariant controls that are accounted for with -xtreg- and -mixed-. I suppose this would be less beneficial because it does not account for all possible variation within 'individual_id', compared to the other two methods?
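In case it helps to pin down the terminology, here is how I would write the three specifications in equation form. This is only my own reading of the commands, not something I have confirmed. The subscripts f (family), i (individual), and t (wave), the wave-2 dummy w_t, and the symbols u_f, v_{fi}, and \varepsilon are my own notation, and the extra controls in the -reg- version are left as an ellipsis.

```latex
% (1) mixed y x1 x2 i.wave || family_id: || individual_id:
%     random intercept u_f for family f, and v_{fi} for individual i within family f
y_{fit} = \beta_0 + \beta_1 x_{1,fit} + \beta_2 x_{2,fit} + \gamma w_t
          + u_f + v_{fi} + \varepsilon_{fit},
\qquad u_f \sim N(0, \sigma_u^2), \quad v_{fi} \sim N(0, \sigma_v^2), \quad
\varepsilon_{fit} \sim N(0, \sigma_\varepsilon^2)

% (2) xtset individual_id wave, then xtreg y x1 x2 i.wave, re
%     random intercept v_i for the individual only; no family-level term
y_{it} = \beta_0 + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \gamma w_t
         + v_i + \varepsilon_{it}

% (3) reg y x1 x2 i.wave ..., vce(cluster family_id)
%     no random intercepts; only the standard errors allow for
%     correlation of \varepsilon_{it} within a family
y_{it} = \beta_0 + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \gamma w_t
         + \dots + \varepsilon_{it}
```

Written this way, (2) looks like (1) without the family-level term u_f, and (3) has no random intercepts at all and only adjusts the standard errors for within-family correlation. Please correct me if I have misread any of the commands.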
