  • Problem with Manipulating data to construct a scatter plot (RDD)

    Dear all,

    I hope you are doing well. I want to draw a judge-time graph from case-level data. Specifically, I have the following data set, where each row is a judge on a case filed in the courts (e.g. case 5052013, heard by two judges, Amin Ud Din Khan and Abid Aziz Khan Sheikh, forms the first two rows):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str31 idcaseyear int(yearfiled yeardecision) str32 judgename byte StateWins int caselag byte(lawyer_number AFR criminal constitutional) int pagesjudgenum
    "5052013"  2013 2013 "Amin Ud Din Khan"           . 0 2 1 0 1 .
    "5052013"  2013 2013 "Abid Aziz Sheikh"           0 0 2 1 0 1 .
    "162007"   2007 2009 "Syed Iftikhar Hussain Shah" . 6 3 1 0 0 .
    "162007"   2007 2009 "Shoaib Saeed"               . 6 3 1 0 0 .
    "792013"   2013 2013 "Imtiaz Ahmed"               . 0 2 1 0 1 .
    "792013"   2013 2013 "Amin Ud Din Khan"           0 0 2 1 0 1 .
    "3632012"  2012 2013 "Syed Iftikhar Hussain Shah" . 1 2 1 0 0 .
    "3672004"  2004 2007 "Abdus Sattar Asghar"        . 9 3 1 0 0 .
    "102012"   2012 2013 "Syed Mansoor Ali Shah"      . 1 4 1 0 0 .
    "46382011" 2011 2013 "Umar Ata Bandial"           . 2 2 1 0 1 .
    end
    I want to draw a scatter plot with "StateWins" on the y-axis and "yeardecision" on the x-axis, where each point is an individual judge (judgename).

    Basically, I want to compare judges just before the decision year 2010 with judges just after the 2010 threshold.

    I am not sure how to arrange my data to get a judge-time scatter plot.

    Any help in this regard will really be appreciated. Thank you.

    Kind Regards,
    Roger
    Last edited by Roger More; 23 Sep 2017, 02:46.

  • #2
    Thanks for providing a -dataex-. In terms of just the syntax to create a scatterplot, you are looking for something like this:

    Code:
    scatter StateWins yeardecision, mlabel(judgename)
    but based on the structure of your data, I bet that isn't giving you a satisfactory graph.


    Here are some other ideas:

    Code:
    *1. a few data manipulations to make this example work better with a graph since your outcome was always zero in the example:
    keep StateWins judgename yeardecision
    encode judgename, g(judge)
    desc
    
    
    replace State = 1 in -5/l
    replace State = 0 in 1/-6
    
    **You can see how this isn't really helpful, but it's what you requested:
    scatter State year, mlabel(judge)
    
    **perhaps you want the predicted estimate of StateWins from a model rather than the 0/1 outcome? e.g.,
    
    expand 100
    replace State = rbinomial(1, .4)
    replace year = int(2007+runiform()*5)
    
    
    logit State i.year#i.judge
    margins year#judge
    marginsplot, noci
    
    
    **2.  So, perhaps a better way to think about this is to sum up the # of wins by judge/year and by judge in the pre/post-2010 period, and then plot them as a bar graph or connected plot like:
    
    bys judge year: egen numwins = total(StateWins)
    scatter numwins year, mlabel(judge) //this still isn't ideal
    
    
    graph hbar (mean) numwins, over(year) over(judge)
    
    gen thresh = cond(year<2010, "<2010", ">=2010") if !missing(year)
    bys judge thresh: egen numwins2 = total(StateWins)
    graph hbar (mean) numwins2 , over(thresh) over(judge)
    
    tw (connected numwins year if thresh=="<2010", sort) (connected numwins year if thresh==">=2010", sort) , by(judge) legend(order(1 "<2010" 2 ">=2010"))
    Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX
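
    A further variation on idea 2, offered only as a sketch: win counts depend on how many cases a judge decided, so a share of cases won may compare judges more fairly. The code below assumes StateWins, judgename and yeardecision are in memory, treats 2010 as the first "after" year, and uses made-up names (post, winshare). It has not been run on the full data.

    ```stata
    * Sketch only; run from a do-file (the /// continuations need one).
    preserve
    keep if !missing(StateWins, yeardecision)
    gen byte post = yeardecision >= 2010      // assumed cut: 2010 counts as "after"

    * Share of decided cases won by the state, per judge and period.
    collapse (mean) winshare = StateWins, by(judgename post)

    * One row per judge: winshare0 = before 2010, winshare1 = 2010 onward.
    reshape wide winshare, i(judgename) j(post)

    * Each marker is a judge; markers below the 45-degree line are judges
    * whose state win share was lower from 2010 onward than before.
    twoway (scatter winshare1 winshare0)             ///
           (function y = x, range(0 1)),             ///
           xtitle("State win share before 2010")     ///
           ytitle("State win share, 2010 onward")    ///
           legend(off)
    restore
    ```

    Judges with decided cases in only one of the two periods have a missing share on one axis and so do not appear in the plot.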



    • #3
      Thank you very much. I will work through these examples and see if I get what I want. Thank you again!

      Cheers!



      • #4
        Dear Eric,

        Thank you so much for your help. I have gone through your examples and learned a lot from them. I think your idea of using a model to compute predicted probabilities is very intuitive, but with 1000 judges it would be hard to get a clear picture of the data from line graphs or histograms for each judge. Perhaps you could suggest something.

        Basically, I want to compare the StateWins of judges before and after the 2010 threshold, for ONLY those judges that have the identifier AfterReformJudge == 0.
        That is, I want to compare the state wins of the half of my judge sample (about 500) that has AfterReformJudge == 0, before and after the 2010 yeardecision threshold. The hypothesis I am testing is whether the state wins of these judges fell after 2010.

        Do you have any suggestions for showing this in a single graph when there are many judges? (In particular, I would be interested in comparing judges just before and after 2010 for an RDD kind of interpretation.)

        My example dataset now with the AfterReformJudge is as follows:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int yeardecision str32 judgename byte StateWins int caselag byte AfterReformJudge str31 idcaseyear
        2009 "Abdus Sattar Asghar" . 2 0 "131442012"
        2010 "Abdus Sattar Asghar" . 0 0 "25212014"
        2013 "Abdus Sattar Asghar" . 0 0 "233842013"
        2009 "Abid Aziz Sheikh"    0 1 1 "14102"    
        2009 "Abid Aziz Sheikh"    . 3 1 "1452010"  
        2013 "Abid Aziz Sheikh"    . 3 1 "1452010"  
        2013 "Abid Aziz Sheikh"    . 6 1 "1652007"  
        2013 "Abid Aziz Sheikh"    . 0 1 "1322009"  
        end
        If it would be possible for you to help me, it would be great, thank you in any case!!

        Cheers,
        Roger
        Last edited by Roger More; 23 Sep 2017, 16:46.
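
        One option that scales to many judges is to stop plotting individual judges and plot yearly win shares instead, with a separate fit on each side of 2010. The following is a minimal sketch under assumptions, not a tested recipe. It uses the variable names from the -dataex- example above, treats 2010 as the first "after" year, and introduces made-up names (post, judgeid, winshare, ncases). It needs the full data set: the eight-row excerpt has no non-missing StateWins among the AfterReformJudge == 0 rows.

        ```stata
        * Sketch only; run from a do-file (the /// continuations need one).
        gen byte post = yeardecision >= 2010 if !missing(yeardecision)
        egen long judgeid = group(judgename)

        * Simple before/after comparison for the AfterReformJudge == 0 judges,
        * with standard errors clustered by judge. This is not a formal RD estimate.
        logit StateWins i.post if AfterReformJudge == 0, vce(cluster judgeid)

        preserve
        keep if AfterReformJudge == 0 & !missing(StateWins, yeardecision)

        * One row per decision year: share of cases the state won and the
        * number of cases behind that share.
        collapse (mean) winshare = StateWins (count) ncases = StateWins, by(yeardecision)

        * Markers are sized by the number of cases; the two lfit lines are
        * separate linear fits before and from 2010.
        twoway (scatter winshare yeardecision [aweight = ncases], msymbol(Oh)) ///
               (lfit winshare yeardecision if yeardecision <  2010)            ///
               (lfit winshare yeardecision if yeardecision >= 2010),           ///
               xline(2009.5)                                                   ///
               xtitle("Decision year")                                         ///
               ytitle("Share of cases won by the state")                       ///
               legend(order(1 "Yearly share" 2 "Fit before 2010" 3 "Fit from 2010"))
        restore
        ```

        A visible jump between the two fitted lines at 2010 would be the graphical counterpart of the post coefficient in the logit. With only a handful of distinct years on each side, both should be read as descriptive.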

