
  • Stata says the panel is strongly balanced but there are many observations missing

    The xtdescribe command tells me the panel is strongly balanced. However, many countries in my sample have missing data for the entire period for some variables. How can a panel be balanced if observations are missing?

  • #2
    Here is the definition of panel balance from the xtset documentation:

    The terms balanced and unbalanced are often used to describe whether a panel dataset is missing
    some observations. If a dataset does not contain a time variable, then panels are considered
    balanced if each panel contains the same number of observations; otherwise, the panels are unbalanced.

    When the dataset contains a time variable, panels are said to be strongly balanced if each panel contains the
    same time points, weakly balanced if each panel contains the same number of observations but not the same
    time points, and unbalanced otherwise.
    My reading of this is that you don't need to have complete data for every variable to have a strongly balanced data set. However, you should provide the complete results from xtdescribe so that we can comment more intelligently.
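The classification quoted above can also be read back programmatically. This is a minimal sketch, assuming the pig example dataset shipped via webuse and assuming that xtset leaves its classification in the stored result r(balanced); it only illustrates that blanking out a non-identifier variable does not change what xtset reports.

```stata
* load a small example panel (requires internet access for webuse)
webuse pig, clear
keep if inrange(id, 1, 3)

* declare the panel structure and show the classification xtset stored
xtset id week
display "`r(balanced)'"

* set one outcome value to missing; the identifiers id and week are untouched
replace weight = . if id == 1 & week == 1

* re-declare the panel: the classification depends on id and week only
xtset id week
display "`r(balanced)'"
```

Both display lines should show the same classification, because no row was dropped and neither identifier changed.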


    • #3
      xtdescribe (and xtset) analyse the data structure based on the panel and time identifiers only. As long as you have one row in your data set for each panel-time identifier combination (without missing values in these identifier variables), then those commands will tell you that your data set is strongly balanced (irrespective of any missing values in your remaining variables).
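The two conditions named above (one row per panel-time combination, no missing identifiers) can be checked directly. A minimal sketch, assuming the pig example dataset from webuse stands in for the real data; with the thread's data the identifiers would be country and year.

```stata
* load a small example panel (requires internet access for webuse)
webuse pig, clear

* isid exits with an error unless id and week jointly identify each row
isid id week

* assert exits with an error if either identifier is missing in any row
assert !missing(id, week)

* count how many rows each panel contributes
bysort id: generate n_rows = _N
summarize n_rows
```

If isid and assert pass silently and n_rows has the same minimum and maximum, every panel has the same number of uniquely identified rows, whatever the other variables contain.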
      https://www.kripfganz.de/stata/


      • #4
        summarize

            Variable |       Obs        Mean   Std. Dev.        Min        Max
        -------------+--------------------------------------------------------
             country |      6256        68.5     39.2619          1        136
                year |      6256      1985.5    13.27698       1963       2008
        incomeineq~y |      3732    42.10842    7.101112   20.57831   59.95708
        pppconvert~g |      5481    6469.255    9427.672   49.07564   121189.6
        realintere~s |      2976    6.688341     25.7861  -97.81207    789.799
        -------------+--------------------------------------------------------
        inflationg~r |      4903    36.66549    347.9358  -30.18327   13611.63
        externalde~i |      2303    63.14364    84.38922         .1     2687.7
        grosscentr~p |      2303    53.81016    59.59731        2.3     1209.3
        humancapit~o |      3877    61.73231    33.83302     .18163   162.3487
        tradeopenn~s |      1440    7.549694    9.867625        .47     254.58
        -------------+--------------------------------------------------------
                 v11 |         0

        . xtset country year
        panel variable: country (strongly balanced)
        time variable: year, 1963 to 2008
        delta: 1 unit

        . xt describe
        unrecognized command: xt
        r(199);

        . xtdescribe

         country:  1, 2, ..., 136                                  n =        136
            year:  1963, 1964, ..., 2008                           T =         46
                   Delta(year) = 1 unit
                   Span(year)  = 46 periods
                   (country*year uniquely identifies each observation)

        Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                                46      46      46        46        46      46      46

             Freq.  Percent    Cum. |  Pattern
         ---------------------------+------------------------------------------------
              136    100.00  100.00 |  1111111111111111111111111111111111111111111111
         ---------------------------+------------------------------------------------
              136    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


        • #5
          OK, so how do I analyse whether the data are balanced or unbalanced? Or can I assume from my observations that they are unbalanced? I want to know whether to use ANOVA or ML methods to estimate the variance components. Thanks.


          • #6
            I follow Joe Canner's citation with some examples:
            Code:
            * load data
            webuse pig, clear
            keep if inrange(id, 1, 3)
            xtset id week
            
            * strongly balanced
            list, sepby(id)
            xtset
            
            * still strongly balanced
            replace weight = . in 1
            list, sepby(id)
            xtset
            
            * unbalanced
            drop if missing(weight)
            list, sepby(id)
            xtset
            
            * load data (again)
            webuse pig, clear
            keep if inrange(id, 1, 3)
            xtset id week
            
            * weakly balanced
            replace week = week + 1 if id == 1
            list, sepby(id)
            xtset
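One possible way to look at balance in the sample that an analysis would actually use is to count, per panel, the rows with complete data on the variables of interest. This is a sketch only, again assuming the pig example dataset; the variable names nmiss, complete and T_used are made up for illustration, and weight stands in for whatever variables enter the model.

```stata
* load a small example panel (requires internet access for webuse)
webuse pig, clear
keep if inrange(id, 1, 3)
xtset id week

* blank out the first three weeks of weight for panel 1
replace weight = . if id == 1 & week <= 3

* pattern of the rows that have a nonmissing weight
xtdescribe if !missing(weight)

* flag complete rows; list every analysis variable inside rowmiss()
egen nmiss = rowmiss(weight)
generate byte complete = nmiss == 0

* usable observations per panel
bysort id: egen T_used = total(complete)
tabulate T_used
```

If T_used differs across panels, the estimation sample is unbalanced in the usual sense even though xtset still reports the full data set as strongly balanced.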
            You should:

            1. Read the FAQ carefully.

            2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

            3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

            4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.
