  • Weakly balanced when each panel contains same timepoints?

    Hi Statalisters

    I have a data set with monthly sex-specific counts of suicides and suicide attempts for all regions in a country over several years. (Males are odd-numbered and females are even-numbered on the id variable.)

    The data set has an observation for each id in every time period and should, to the best of my knowledge, be "strongly balanced". However, when I declare the data as panel data with:

    Code:
    xtset id eventdate, monthly
    I get the following feedback:

    Code:
    xtset id eventdate, monthly
           panel variable:  id (weakly balanced)
            time variable:  eventdate, January2012 to December2017
                    delta:  1 month
    Here is an extract of the data:

    Code:
    alive_wo_physical_harm    alive_w_phys_harm    dead    month    year    eventdate    regionnum    datasrc    id    regionnumeric    sexnumeric
    10    0    3    1    2012    January2012    1    0    1    North central region    male
    20    0    0    1    2012    January2012    1    1    2    North central region    female
    8    0    9    2    2012    February2012    1    0    3    North central region    male
    11    0    0    2    2012    February2012    1    1    4    North central region    female
    7    1    9    3    2012    March2012    1    0    5    North central region    male
    16    0    6    3    2012    March2012    1    1    6    North central region    female
    5    0    3    4    2012    April2012    1    0    7    North central region    male
    17    0    7    4    2012    April2012    1    1    8    North central region    female
    7    0    7    5    2012    May2012    1    0    9    North central region    male
    12    0    5    5    2012    May2012    1    1    10    North central region    female
    "alive_wo_physical_harm", "alive_w_phys_harm" and "dead" are the suicide (attempt) variables, while "eventdate" is the time variable (formatted as month and year). I would use dataex, but I'm currently on a work computer with download restrictions (I can upload later if required).

    Hopefully some of you have an answer to this.

    Best
    Tarjei

  • #2
    A data set is weakly balanced if each panel contains the same number of observations but not the same time points.
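
    As a minimal sketch of that definition, using made-up data (the variable names id and t are hypothetical and not from the original post), the two panels below each have two observations but at different time points, so by the definition above xtset should classify them as weakly balanced:

    ```stata
    * Toy data: two panels, same number of observations, different time points
    clear
    input id t
    1 1
    1 2
    2 2
    2 3
    end

    * Declare the panel structure; xtset stores the balance status in r(balanced)
    xtset id t
    display "`r(balanced)'"
    ```

    If every panel covered exactly the same time points, the same check would report a strongly balanced panel instead.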


    • #3
      Thank you for your reply Scott Merryman. I am wondering if the data should be strongly balanced instead of weakly balanced, as there should be data available for all individuals in all time periods. Thus, I was hoping some might see a problem with the organization of data.

      Here is another look at the data organization:

      Code:
             id      eventdate   sexnum~c          regionnumeric   a~l_harm   a~s_harm   dead  
        1.    1    January2012       male   North central region         10          0      3  
        2.    2    January2012     female   North central region         20          0      0  
        3.    3   February2012       male   North central region          8          0      9  
        4.    4   February2012     female   North central region         11          0      0  
        5.    5      March2012       male   North central region          7          1      9  
        6.    6      March2012     female   North central region         16          0      6  
        7.    7      April2012       male   North central region          5          0      3  
        8.    8      April2012     female   North central region         17          0      7  
        9.    9        May2012       male   North central region          7          0      7  
       10.   10        May2012     female   North central region         12          0      5
      The first observation is for individual 1, defined as male in region 1 in month 1, year 1.

      The second observation is for individual 2, defined as female in region 1 in month 1, year 1.

      The third observation is for individual 3, defined as male in region 1 in month 2, year 1. And so on ...




      • #4
        It is not clear how the data are sorted, but if they are sorted by eventdate, the listing suggests that id = 1 and id = 2 begin in January 2012 while id = 3 begins in February 2012. The panels would then not cover the same time periods.

        The example below creates indicators marking where each panel begins and ends, which can be used to find the panels that produce the weakly balanced result.

        Code:
        . webuse grunfeld,clear
        
        . replace year = year +10 if com == 1
        (20 real changes made)
        
        . xtset
               panel variable:  company (weakly balanced)
                time variable:  year, 1935 to 1964
                        delta:  1 year
        
        . bys com (year) : gen byte begin = _n ==1
        
        . bys com (year) : gen byte end = _n ==_N
        
        . tab year if begin == 1
        
               year |      Freq.     Percent        Cum.
        ------------+-----------------------------------
               1935 |          9       90.00       90.00
               1945 |          1       10.00      100.00
        ------------+-----------------------------------
              Total |         10      100.00
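
        A similar check could be run on data shaped like the extract in #3. The sketch below is only an illustration on made-up data: it assumes that id was generated as a running row number, so that every id has a single observation, and it rebuilds the panel identifier from region and sex. The variables female, n_obs and panelid are hypothetical names, not variables from the original data set.

        ```stata
        * Toy data: 2 regions x 2 sexes x 3 months = 12 rows (made up)
        clear
        set obs 12
        gen byte regionnum = cond(_n <= 6, 1, 2)
        gen byte female    = mod(_n, 2) == 0
        gen eventdate      = tm(2012m1) + floor(mod(_n - 1, 6) / 2)
        format eventdate %tm

        * Assumption: id is a running row number, as the listing in #3 suggests
        gen id = _n

        * Count the observations per id; a value of 1 everywhere means that
        * id identifies rows rather than panels
        bysort id: gen n_obs = _N
        summarize n_obs

        * Rebuild the panel identifier from region and sex, then declare the panel
        egen panelid = group(regionnum female)
        xtset panelid eventdate, monthly
        ```

        With one identifier per region-sex combination, each panel in this toy example spans the same three months.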


        • #5
          Thanks! I tried sorting by id and eventdate, which seemed to solve the balancing problem (the data were formerly sorted by region and eventdate):

          Code:
          . sort id eventdate
          
          . xtset id eventdate, monthly
                 panel variable:  id (strongly balanced)
                  time variable:  eventdate, 2012m1 to 2017m12
                          delta:  1 month
          
          . xtdescribe
          
                id:  1, 2, ..., 12                                     n =         12
          eventdate:  2012m1, 2012m2, ..., 2017m12                     T =         72
                     Delta(eventdate) = 1 month
                     Span(eventdate)  = 72 periods
                     (id*eventdate uniquely identifies each observation)
          
          Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                                  72      72      72        72        72      72      72
          
               Freq.  Percent    Cum. |  Pattern
           ---------------------------+--------------------------------------------------------------------------
                 12    100.00  100.00 |  111111111111111111111111111111111111111111111111111111111111111111111111
           ---------------------------+--------------------------------------------------------------------------
                 12    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
          However, I would like to analyze regional monthly suicide rates both combined and by gender. For this I need an aggregated suicide rate (based on the suicide rates for both sexes), which I tried to create with:

          Code:
          gen agg_suiciderate = suiciderate if female==0 + suiciderate if female==1
          But this does not yield an aggregated regional monthly suicide rate.

          The suicide rate was constructed by

          Code:
          // suicide proportion of regional population
          gen deadprop=dead/regpop
          
          // suiciderate per 100 000
          gen suiciderate=deadprop*100000
          Do you (or anyone else) have any input on a solution to this?
          Last edited by Tarjei W. Havneraas; 17 Aug 2018, 07:28.


          • #6
            It is helpful to put different questions in different threads.

            This syntax is invalid, because the if qualifier applies to the whole command and cannot be used inside an expression:
            Code:
            gen agg_suiciderate = suiciderate if female==0 + suiciderate if female==1
            Perhaps you want something like:
            Code:
            egen agg_suiciderate = total(suiciderate), by(timevar)
            egen agg_suiciderate_sex = total(suiciderate), by(timevar sex)
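
            For a rate per region and month, the grouping would presumably also need the region variable. The following is a rough sketch on made-up data, assuming that regpop is the total regional population for both sexes (so the sex-specific rates share one denominator); the population figure of 500000 and the variable names female and dead_total are hypothetical:

            ```stata
            * Toy data: one region, two months, one row per sex (values made up)
            clear
            input byte regionnum int eventdate byte female int dead long regpop
            1 624 0 3 500000
            1 624 1 0 500000
            1 625 0 9 500000
            1 625 1 0 500000
            end
            format eventdate %tm

            * Sum the deaths of both sexes within each region and month
            egen dead_total = total(dead), by(regionnum eventdate)

            * Combined suicide rate per 100 000, using the shared regional population
            gen agg_suiciderate = dead_total / regpop * 100000

            list regionnum eventdate female dead dead_total agg_suiciderate, sepby(eventdate)
            ```

            Under that assumption the result equals the sum of the two sex-specific rates. If regpop were instead the sex-specific population, deaths and populations would have to be summed separately before dividing.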
