Stata says the panel is strongly balanced but there are many observations missing

Jamie Holmes

Join Date: Jul 2014

Posts: 15
#1


12 Aug 2014, 09:18

The xtdescribe command tells me the panel is strongly balanced. However, many countries in my sample have missing data for the entire period for some variables. How can a panel be balanced if observations are missing?
Tags: None
Joe Canner

Join Date: Mar 2014

Posts: 580
#2

12 Aug 2014, 09:43

Here is the definition of panel balance from the xtset documentation:

The terms balanced and unbalanced are often used to describe whether a panel dataset is missing
some observations. If a dataset does not contain a time variable, then panels are considered
balanced if each panel contains the same number of observations; otherwise, the panels are unbalanced.

When the dataset contains a time variable, panels are said to be strongly balanced if each panel contains the
same time points, weakly balanced if each panel contains the same number of observations but not the same
time points, and unbalanced otherwise.

My reading of this is that you don't need to have complete data for every variable to have a strongly balanced data set. However, you should provide the complete results from xtdescribe so that we can comment more intelligently.
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2581
#3

12 Aug 2014, 09:43

xtdescribe (and xtset) analyse the data structure based on the panel and time identifiers only. As long as you have one row in your data set for each panel-time identifier combination (without missing values in these identifier variables), those commands will tell you that your data set is strongly balanced (irrespective of any missing values in your remaining variables).
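
A minimal sketch of how one might see this in practice, assuming the example dataset from webuse pig (variables id, week and weight) rather than the data discussed in this thread. Restricting xtdescribe with an if qualifier shows the pattern for only those observations where a chosen variable is nonmissing, which is closer to the sample an estimation command would actually use:

Code:

* load an example panel (an assumption; not the poster's data)
webuse pig, clear
xtset id week

* set some values of weight to missing for illustration
replace weight = . in 1/3

* the identifiers are complete, so the panel is still reported as strongly balanced
xtset
xtdescribe

* pattern restricted to observations with nonmissing weight
xtdescribe if !missing(weight)

* count of nonmissing weight values per panel, tabulated over observations
egen n_weight = count(weight), by(id)
tabulate n_weight

With several variables, the same idea would use !missing(var1, var2, ...) with the variables of the intended model.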

https://www.kripfganz.de/stata/
Comment
Jamie Holmes

Join Date: Jul 2014

Posts: 15
#4

12 Aug 2014, 10:07

summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     country |      6256        68.5     39.2619          1        136
        year |      6256      1985.5    13.27698       1963       2008
incomeineq~y |      3732    42.10842    7.101112   20.57831   59.95708
pppconvert~g |      5481    6469.255    9427.672   49.07564   121189.6
realintere~s |      2976    6.688341     25.7861  -97.81207    789.799
-------------+--------------------------------------------------------
inflationg~r |      4903    36.66549    347.9358  -30.18327   13611.63
externalde~i |      2303    63.14364    84.38922         .1     2687.7
grosscentr~p |      2303    53.81016    59.59731        2.3     1209.3
humancapit~o |      3877    61.73231    33.83302     .18163   162.3487
tradeopenn~s |      1440    7.549694    9.867625        .47     254.58
-------------+--------------------------------------------------------
         v11 |         0

. xtset country year
       panel variable:  country (strongly balanced)
        time variable:  year, 1963 to 2008
                delta:  1 unit

. xt describe
unrecognized command: xt
r(199);

. xtdescribe

 country:  1, 2, ..., 136                                    n =        136
    year:  1963, 1964, ..., 2008                             T =         46
           Delta(year) = 1 unit
           Span(year)  = 46 periods
           (country*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                        46      46      46        46        46      46      46

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+------------------------------------------------
      136    100.00  100.00 |  1111111111111111111111111111111111111111111111
 ---------------------------+------------------------------------------------
      136    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Comment
Jamie Holmes

Join Date: Jul 2014

Posts: 15
#5

12 Aug 2014, 10:09

OK, so how do I analyse the balancedness/unbalancedness of the data? Or can I assume from my observations that it is unbalanced? I want to know whether to use ANOVA or ML methods to estimate the variance components. Thanks...
Comment
Roberto Ferrer

Join Date: Apr 2014

Posts: 449
#6

12 Aug 2014, 10:30

I follow Joe Canner's citation with some examples:

Code:

* load data
webuse pig, clear
keep if inrange(id, 1, 3)
xtset id week

* strongly balanced
list, sepby(id)
xtset

* still strongly balanced
replace weight = . in 1
list, sepby(id)
xtset

* unbalanced
drop if missing(weight)
list, sepby(id)
xtset

* load data (again)
webuse pig, clear
keep if inrange(id, 1, 3)
xtset id week

* weakly balanced
replace week = week + 1 if id == 1
list, sepby(id)
xtset

You should:

1. Read the FAQ carefully.

2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.
Comment
