
  • Problem in collapse command and two year variables

    Dear Stata Listers,

    I hope you are doing well. I have the following raw data (the first 8 observations are as follows):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str31 idcaseyear int yearfiled str5 bench int yeardecision byte StateWins int caselag str31 judgename byte judge_number
    "720" 2004 "karhc" 2005 1 1 "Syed Zahid Husain" 1
    "1325" 1999 "karhc" 2005 . 6 "Rehmat Hussain Jafferi" 1
    "2497" 2001 "karhc" 2003 0 2 "Sabihuddin Ahmad" 2
    "2497" 2001 "karhc" 2003 0 2 "Ali Aslam Jafri" 2
    "198" 2003 "karhc" 2003 1 0 "Wahid Bux Brohi" 2
    "198" 2003 "karhc" 2003 1 0 "Rahmat Hussain Jafferi" 2
    "319" 1998 "karhc" 2002 0 4 "Sabihuddin Ahmad" 2
    "319" 1998 "karhc" 2002 0 4 "Syed Ali Aslam Jafri" 2
    end

    From this raw judicial case level data, the district-decisionyear panel (based on averages) is created as follows:

    Code:
    collapse (mean) StateWins (mean) caselag (mean) judge_number, by(bench yeardecision)
    encode bench, gen(district_bench)
    xtset district_bench yeardecision

    However, I also want to keep the yearfiled variable. I cannot use by(bench yeardecision yearfiled), because I am then not allowed to run xtset district_bench yeardecision: the collapse creates repeated decision-year values within a bench.

    In the end, I was hoping to run a regression that considers the sample of cases with yearfiled < 2010 where I still have the district-yeardecision panel.

    Something like
    Code:
    regress StateWins Reform_Exposure i.yeardecision i.district_bench##c.yeardecision if yearfiled < 2010, vce(cluster district_bench)

    Look forward to help on this matter. Thank you.

    Cheers,
    Roger
    P.S.: yearfiled is the year the case was filed, whereas yeardecision is the year the case was decided/adjudicated. I want to keep my panel as a district-decision-year panel, but use yearfiled to restrict the sample to cases filed before 2010, whether they were decided before or after 2010, i.e. over my entire case decision period.

  • #2
    Unfortunately "somehow" is not a precise specification of what you want to be done.

    But you could, for example, do this

    Code:
    collapse (mean) StateWins (mean) caselag (mean) judge_number, by(bench yeardecision yearfiled)
    My wild guess is that you don't or shouldn't want that either.
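
    To see why that collapse cannot be combined with a time variable in xtset, here is a minimal sketch. It assumes the -dataex- example from #1 is in memory and uses only variables from that example.

    ```stata
    * Assumes the -dataex- example from #1 is in memory.
    collapse (mean) StateWins caselag judge_number, by(bench yeardecision yearfiled)

    * Each bench-yeardecision pair can now appear once per filing year.
    duplicates report bench yeardecision

    * With repeated years inside a bench, xtset with a time variable is refused
    * (Stata reports repeated time values within panel).
    encode bench, gen(district_bench)
    capture noisily xtset district_bench yeardecision
    ```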

    Approaching from another angle: Your difficulty seems less with collapse than with thinking how to model your data. From the sound of it

    Code:
    xtset district_bench
    may be worthwhile, but if there are repeated instances of any year variable within districts, not only is it illegal to specify a year variable to xtset, it makes no sense either. If it's part of the process that repetitions within a year are entirely possible and you don't want to make use of lagged variables (perhaps the main incentive to specifying a time variable in xtset) then it is fine just to specify a panel identifier.
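
    A minimal sketch of the panel-identifier-only route, assuming district_bench has already been created with encode as in #1. The xtreg call is only an illustration with variables from the example data, not a suggested model.

    ```stata
    * Declare the panel identifier only; no time variable is set,
    * so repeated years within a bench are not a problem.
    xtset district_bench

    * Commands that need no lags still work, e.g. within/between summaries:
    xtsum StateWins caselag

    * Illustrative only: a fixed-effects fit that uses no time-series operators.
    xtreg StateWins caselag i.yeardecision, fe vce(cluster district_bench)
    ```

    Time-series operators such as L. are not available under this declaration, because they need a time variable.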

    Approaching from yet another angle: Nothing in your intended regression depends on xtset, so far as I can see, so this question may be irrelevant too.
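
    If the aim is a bench-by-decision-year panel built only from cases filed before 2010, one possible reading is to restrict the case-level data first and collapse afterwards. This is a sketch under that assumption, using the example data from #1. Reform_Exposure is not in that example, so it would have to be added to the collapse or merged in before the regression in #1 could be run.

    ```stata
    * Assumes the case-level data from #1 is in memory.
    * Keep only cases filed before 2010, whatever their decision year.
    keep if yearfiled < 2010

    * Average within bench and decision year, as in #1.
    collapse (mean) StateWins caselag judge_number, by(bench yeardecision)
    encode bench, gen(district_bench)

    * Confirm one row per bench and decision year, then declare the panel.
    isid bench yeardecision
    xtset district_bench yeardecision
    ```

    Because yearfiled is applied before the collapse, it no longer needs to appear in by().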
    Last edited by Nick Cox; 31 Aug 2017, 03:54.
