  • Drop with special criterion (Panel)

    Hi,

    I'm dealing with a panel dataset of firms interviewed once a year in the period between 2010 and 2020.
    I want to investigate the impact of a policy introduced in 2015, but I would like to consider only those firms that had a particular characteristic before 2015.
    Every observation in my dataset has a variable gpkey (i.e. the firm identification code) and a variable year (i.e. the reference year), along with the other variables of interest.

    I kept only those firms for which I have at least one observation before 2015 and at least one observation after 2015:
    bysort gpkey (year): generate tokeep = year[1]<2015 & year[_N]>=2015
    keep if tokeep

    Now, I created a dummy that becomes 1 when (wages<10 & year<=2015)
    I want to keep only those firms for which dummy=1.

    If I do a simple

    keep if dummy==1

    I keep only observations before 2015, whereas I want to keep all FIRMS that have at least one observation before 2015 with dummy=1. In other words, if firm A has been observed in 2013 and in 2017, and dummy=1 for the 2013 observation, I want to keep both observations.

    I'm really a beginner at Stata and I'm struggling

    I'd be very grateful to anyone who can help me out.

    Best,

    Mike

  • #2
    This FAQ covers techniques in this area: https://www.stata.com/support/faqs/d...ble-recording/. You want to append to the code:

    Code:
    bys gpkey: egen wanted= max(dummy)
    keep if wanted
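
    As a minimal sketch of what these two lines do, assuming a made-up toy dataset (the firm codes and wage values below are hypothetical, and dummy is generated as described in #1):

    ```stata
    * Hypothetical toy data: two firms, one of which has wages < 10 before 2015
    clear
    input gpkey year wages
    1234 2012 12
    1234 2013 8
    1234 2016 15
    5678 2013 20
    5678 2017 25
    end

    * Dummy as described in #1
    generate byte dummy = wages < 10 & year <= 2015

    * Spread the firm-level maximum of dummy to every observation of the firm
    bysort gpkey: egen wanted = max(dummy)
    list, sepby(gpkey)

    * Keeps all three observations of firm 1234 and drops firm 5678 entirely
    keep if wanted
    assert _N == 3
    ```

    Because max() is taken within gpkey, wanted is 1 on every row of a firm that has dummy == 1 in at least one year, so the whole firm is kept or dropped together.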

    • #3
      Thanks Andrew.
      However, your code doesn't completely fix the problem.
      Suppose I have a firm with gpkey 1234 that has been surveyed in 2012, in 2013 and in 2016. So I have 3 observations with gpkey 1234, one for each year. Suppose, however, that wages were <10 only in 2013, so dummy=1 only for the 2013 observation.
      With your code, I keep all 3 observations, when I actually need to keep only the 2013 observation and the 2016 observation, dropping the 2012 observation.

      How can I fix that?

      Thx in advance!!

      • #4
        In #1, you said you wanted to keep all FIRMS [emphasis yours] if dummy == 1. I would have understood that to mean what the code in #2 accomplishes as well.

        If I understand your new request correctly, you want to keep firms that have dummy == 1 at some point in time but, among those firms' observations, discard any before 2015 for which dummy != 1.
        You could do this as:
        Code:
        bys gpkey: egen wanted= max(dummy) // AS PER ANDREW MUSAU
        replace wanted = 0 if year < 2015 & dummy != 1
        keep if wanted
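
        A minimal sketch of how this behaves on the firm described in #3, assuming hypothetical wage values (only the years and the fact that wages < 10 in 2013 come from the thread):

        ```stata
        * Hypothetical data for firm 1234: surveyed in 2012, 2013 and 2016
        clear
        input gpkey year wages
        1234 2012 12
        1234 2013 8
        1234 2016 15
        end

        * Dummy as described in #1
        generate byte dummy = wages < 10 & year <= 2015

        * Flag the whole firm, then unflag pre-2015 rows where dummy is not 1
        bysort gpkey: egen wanted = max(dummy)
        replace wanted = 0 if year < 2015 & dummy != 1
        keep if wanted

        * Only the 2013 and 2016 observations remain; 2012 is dropped
        list
        assert _N == 2
        ```

        One caveat for real data: if dummy is missing for every observation of a firm, max() returns missing, and Stata treats missing as true in keep if wanted, so it may be worth checking for missing values first.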
        Last edited by Clyde Schechter; 28 Dec 2022, 18:48.

        • #5
          My apologies for not making it clear in my previous answer: Andrew Musau's solution was perfect for my initial question. The problem was in my question, which was not phrased correctly, and not in his answer.
          The problem I actually meant was the second one, and thanks, Clyde Schechter, because your addition fixes it perfectly.

          Sorry for my mental confusion, but I'm still a beginner both in Stata and in statistical analysis...

          Thanks again, Happy New Year to you all.
