  • Panel data with multiple observations per year

    Hello,

    I have an unbalanced panel dataset of publications per year. The problem is that some researchers have more than one publication in the same year, which causes trouble when I try to declare the data as panel data with the -xtset- command. I need to keep all the observations per year. Does anybody know how I can solve this?

    Here is an example of my data.
    The variables are:
    ID = identifier for each researcher
    Year = year, from 2001 to 2016
    Publication = "serial" number for each publication

    For example, ID 2 has two publications in 2004.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float ID int Year long Publication
     1 2005  396825
     1 2007  367683
     1 2008  362677
     1 2009  351832
     2 2004  423891
     2 2004  423893
     3 2009  351679
     3 2010  521555
     3 2011  844898
     3 2013 1069785
     4 2008  360711
     5 2008  360711
     6 2007  366906
     6 2007  367058
     7 2011  828701
     7 2011  874350
     7 2014 1166926
     7 2016 1365157
     8 2004  420745
     8 2006  376102
     8 2006  375126
     8 2006  502579
     8 2008  371095
     8 2010  500463
     8 2011  844477
     8 2011  844851
     8 2014 1113611
     8 2015 1244850
     9 2011  844657
     9 2011  857577
     9 2011  879488
     9 2011  858148
    10 2016 1347977
    11 2009  346410
    11 2012  951627
    11 2013 1039889
    11 2014 1136447
    12 2012  949008
    12 2013 1038440
    12 2015 1300809
    12 2015 1298752
    13 2011  877564
    13 2013 1030556
    13 2014 1110092
    13 2015 1217700
    14 2011  850128
    15 2006   53589
    15 2010  532591
    16 2007  368421
    16 2007  501775
    end


  • #2
    Hi,

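    I am not sure whether this is right, but I guess you could generate a new variable with the group() function of -egen- (note that variable names are case-sensitive):
    Code:
    egen group = group(Year Publication)
    Then you can -xtset- the data with "ID" as the panel variable and "group" as the time variable.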

    Best wishes,



    • #3
      Ludmila:
      if you cannot fine-tune your -timevar- (say, by translating it into DDMMYY based on the reference dates), you can -xtset- your data using -panelvar- only, provided that you do not plan to use time-series operators, such as lags and leads.
      Kind regards,
      Carlo
      (Stata 18.0 SE)
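
      As a minimal sketch of this suggestion, using only the variables from the example in #1: -duplicates report- shows how many ID-Year combinations repeat, and -xtset- with the panel variable alone accepts the data as they are.

      Code:
      * Count repeated ID-Year combinations (the reason -xtset ID Year- is rejected)
      duplicates report ID Year

      * Declare the panel variable only; no time variable is set
      xtset ID

      With this declaration, time-series operators such as L. and F. are unavailable, because no time variable is defined.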

      Comment


      • #4
        You most likely want to reshape wide, so that each ID-year appears no more than once, but you can have Publication 1 in that year, Publication 2 in that year, and so forth, as separate variables.

        Code:
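        * Number the publications within each ID-Year; sorting on Publication makes the numbering reproducible
        bysort ID Year (Publication): gen j = _n
        * One row per ID-Year, with Publication1, Publication2, ... as separate variables
        reshape wide Publication, i(ID Year) j(j)
        xtset ID Year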



        • #5
          Thank you Carlo.
          Thank you Danial.

          If I -xtset- my data using -panelvar- only, can I still use all panel data models, such as fixed-effects or random-effects models?



          • #6
            Ludmila:
            yes, you can.
            Kind regards,
            Carlo
            (Stata 18.0 SE)
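
            A hedged sketch of what that could look like. The names y and x are hypothetical placeholders for an outcome and a regressor; they are not variables in the example posted in #1.

            Code:
            * Panel variable only, as suggested in #3
            xtset ID

            * y and x are hypothetical; replace them with your own variables
            * Fixed-effects (within) estimator
            xtreg y x, fe
            * Random-effects (GLS) estimator
            xtreg y x, re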
