Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Tracking a particular variable in a panel dataset over time

    Hi everyone,

    I have the following panel dataset. I would like to track ASINs that change from being Prime=0 in year=1 to becoming Prime=1 in years 2 to 5. Is there a way to keep only these ASINs in the dataset? Similarly, I would like to track ASINs that changed from Prime=1 in year=1 to becoming Prime=0 in years 2 to 5. Is there a way to also keep only this subsample in the dataset? Thank you so much for your help.

    ASIN Year Prime
    "B00008CQRK" 1 0
    "B00008CQRK" 2 0
    "B00008CQRK" 3 0
    "B00008CQRK" 4 0
    "B00008CQRK" 5 0
    "B0000E5JK7" 1 0
    "B0000E5JK7" 2 0
    "B0000E5JK7" 3 0
    "B0000E5JK7" 4 0
    "B0000E5JK7" 5 0
    "B0000E5JK8" 1 0
    "B0000E5JK8" 2 0
    "B0000E5JK8" 3 0
    "B0000E5JK8" 4 0
    "B0000E5JK8" 5 0
    "B0000E5JKC" 1 0
    "B0000E5JKC" 2 0
    "B0000E5JKC" 3 0
    "B0000E5JKC" 4 0
    "B0000E5JKC" 5 0
    "B0000E5JRF" 1 1
    "B0000E5JRF" 2 1
    "B0000E5JRF" 3 1
    "B0000E5JRF" 4 1
    "B0000E5JRF" 5 1
    "B0000TLES8" 1 0
    "B0000TLES8" 2 0
    "B0000TLES8" 3 0
    "B0000TLES8" 4 0
    "B0000TLES8" 5 0
    "B0000VYKXC" 1 0
    "B0000VYKXC" 2 1
    "B0000VYKXC" 3 1
    "B0000VYKXC" 4 0
    "B0000VYKXC" 5 1
    "B0001Y9U88" 1 0
    "B0001Y9U88" 2 0
    "B0001Y9U88" 3 0
    "B0001Y9U88" 4 0
    "B0001Y9U88" 5 0
    "B0002AHT84" 1 1
    "B0002AHT84" 2 0
    "B0002AHT84" 3 0
    "B0002AHT84" 4 0
    "B0002AHT84" 5 0
    "B0002HV730" 1 1
    "B0002HV730" 2 1
    "B0002HV730" 3 1
    "B0002HV730" 4 1
    "B0002HV730" 5 1
    "B00060NI9Y" 1 1
    "B00060NI9Y" 2 1
    "B00060NI9Y" 3 1
    "B00060NI9Y" 4 1
    "B00060NI9Y" 5 0
    "B00060NUEW" 1 0
    "B00060NUEW" 2 0
    "B00060NUEW" 3 0
    "B00060NUEW" 4 0
    "B00060NUEW" 5 0
    "B00060O578" 1 0
    "B00060O578" 2 0
    "B00060O578" 3 0
    "B00060O578" 4 0
    "B00060O578" 5 0
    "B00061EMH0" 1 0
    "B00061EMH0" 2 0
    "B00061EMH0" 3 0
    "B00061EMH0" 4 0
    "B00061EMH0" 5 0
    "B00067LYEI" 1 0
    "B00067LYEI" 2 0
    "B00067LYEI" 3 0
    "B00067LYEI" 4 0
    "B00067LYEI" 5 0
    "B0007G96XO" 1 0
    "B0007G96XO" 2 0
    "B0007G96XO" 3 0
    "B0007G96XO" 4 0
    "B0007G96XO" 5 0
    "B0007M3E8Q" 1 0
    "B0007M3E8Q" 2 1
    "B0007M3E8Q" 3 1
    "B0007M3E8Q" 4 0
    "B0007M3E8Q" 5 0
    "B0007VDRHK" 1 0
    "B0007VDRHK" 2 0
    "B0007VDRHK" 3 0
    "B0007VDRHK" 4 0
    "B0007VDRHK" 5 0
    "B0008JEZ8G" 1 0
    "B0008JEZ8G" 2 0
    "B0008JEZ8G" 3 0
    "B0008JEZ8G" 4 0
    "B0008JEZ8G" 5 0
    "B000BD3Y7C" 1 1
    "B000BD3Y7C" 2 1
    "B000BD3Y7C" 3 1
    "B000BD3Y7C" 4 1
    "B000BD3Y7C" 5 1
    Last edited by Chinonso Etumnu; 12 Jun 2022, 01:42.

  • #2
    It is a little unclear whether you are looking for all values of Prime in years 2 to 5 being 1 or 0, or just any such value occurring (at least once).

    The main trick here is to exploit the equivalence:

    Any 1 in say 0000 1111 0001 0100 corresponds to a maximum value of 1 in the sequence (because if the maximum were 0, no value is 1).

    All 1 in say 0000 1111 0001 0100 corresponds to a minimum value of 1 in the sequence (because if the minimum were 0, the values are not all 1.

    with a related trick to find "any 0" or "all 0".

    Much more at https://www.stata.com/support/faqs/d...ble-recording/

    I also use a technique of bundling each panel's history into one string. More at https://journals.sagepub.com/doi/pdf...36867X20909698

    Thanks for your data example -- which needs a small tweak to run as desired.

    Code:
    clear 
    input str10 ASIN Year Prime
    "B00008CQRK" 1 0
    "B00008CQRK" 2 0
    "B00008CQRK" 3 0
    "B00008CQRK" 4 0
    "B00008CQRK" 5 0
    "B0000E5JK7" 1 0
    "B0000E5JK7" 2 0
    "B0000E5JK7" 3 0
    "B0000E5JK7" 4 0
    "B0000E5JK7" 5 0
    "B0000E5JK8" 1 0
    "B0000E5JK8" 2 0
    "B0000E5JK8" 3 0
    "B0000E5JK8" 4 0
    "B0000E5JK8" 5 0
    "B0000E5JKC" 1 0
    "B0000E5JKC" 2 0
    "B0000E5JKC" 3 0
    "B0000E5JKC" 4 0
    "B0000E5JKC" 5 0
    "B0000E5JRF" 1 1
    "B0000E5JRF" 2 1
    "B0000E5JRF" 3 1
    "B0000E5JRF" 4 1
    "B0000E5JRF" 5 1
    "B0000TLES8" 1 0
    "B0000TLES8" 2 0
    "B0000TLES8" 3 0
    "B0000TLES8" 4 0
    "B0000TLES8" 5 0
    "B0000VYKXC" 1 0
    "B0000VYKXC" 2 1
    "B0000VYKXC" 3 1
    "B0000VYKXC" 4 0
    "B0000VYKXC" 5 1
    "B0001Y9U88" 1 0
    "B0001Y9U88" 2 0
    "B0001Y9U88" 3 0
    "B0001Y9U88" 4 0
    "B0001Y9U88" 5 0
    "B0002AHT84" 1 1
    "B0002AHT84" 2 0
    "B0002AHT84" 3 0
    "B0002AHT84" 4 0
    "B0002AHT84" 5 0
    "B0002HV730" 1 1
    "B0002HV730" 2 1
    "B0002HV730" 3 1
    "B0002HV730" 4 1
    "B0002HV730" 5 1
    "B00060NI9Y" 1 1
    "B00060NI9Y" 2 1
    "B00060NI9Y" 3 1
    "B00060NI9Y" 4 1
    "B00060NI9Y" 5 0
    "B00060NUEW" 1 0
    "B00060NUEW" 2 0
    "B00060NUEW" 3 0
    "B00060NUEW" 4 0
    "B00060NUEW" 5 0
    "B00060O578" 1 0
    "B00060O578" 2 0
    "B00060O578" 3 0
    "B00060O578" 4 0
    "B00060O578" 5 0
    "B00061EMH0" 1 0
    "B00061EMH0" 2 0
    "B00061EMH0" 3 0
    "B00061EMH0" 4 0
    "B00061EMH0" 5 0
    "B00067LYEI" 1 0
    "B00067LYEI" 2 0
    "B00067LYEI" 3 0
    "B00067LYEI" 4 0
    "B00067LYEI" 5 0
    "B0007G96XO" 1 0
    "B0007G96XO" 2 0
    "B0007G96XO" 3 0
    "B0007G96XO" 4 0
    "B0007G96XO" 5 0
    "B0007M3E8Q" 1 0
    "B0007M3E8Q" 2 1
    "B0007M3E8Q" 3 1
    "B0007M3E8Q" 4 0
    "B0007M3E8Q" 5 0
    "B0007VDRHK" 1 0
    "B0007VDRHK" 2 0
    "B0007VDRHK" 3 0
    "B0007VDRHK" 4 0
    "B0007VDRHK" 5 0
    "B0008JEZ8G" 1 0
    "B0008JEZ8G" 2 0
    "B0008JEZ8G" 3 0
    "B0008JEZ8G" 4 0
    "B0008JEZ8G" 5 0
    "B000BD3Y7C" 1 1
    "B000BD3Y7C" 2 1
    "B000BD3Y7C" 3 1
    "B000BD3Y7C" 4 1
    "B000BD3Y7C" 5 1
    end 
    
    egen Prime1 = mean(cond(Year == 1, Prime, .)), by(ASIN)
    
    egen Any1later = max(cond(Year > 1, Prime, .)), by(ASIN)
    egen All1later = min(cond(Year > 1, Prime, .)), by(ASIN)
    egen Any0later = max(cond(Year > 1, Prime == 0, .)), by(ASIN)
    egen All0later = min(cond(Year > 1, Prime == 0, .)), by(ASIN)
    
    bysort ASIN (Year) : gen history = strofreal(Prime)
    by ASIN : replace history = history[_n-1] + strofreal(Prime) if _n > 1 
    by ASIN : replace history = history[_N]
    
    egen tag = tag(ASIN)
    
    list ASIN history Any1 All1 Any0 All0 if tag , noobs  
    
     +------------------------------------------------------------------+
      |       ASIN   history   Any1la~r   All1la~r   Any0la~r   All0la~r |
      |------------------------------------------------------------------|
      | B00008CQRK     00000          0          0          1          1 |
      | B0000E5JK7     00000          0          0          1          1 |
      | B0000E5JK8     00000          0          0          1          1 |
      | B0000E5JKC     00000          0          0          1          1 |
      | B0000E5JRF     11111          1          1          0          0 |
      |------------------------------------------------------------------|
      | B0000TLES8     00000          0          0          1          1 |
      | B0000VYKXC     01101          1          0          1          0 |
      | B0001Y9U88     00000          0          0          1          1 |
      | B0002AHT84     10000          0          0          1          1 |
      | B0002HV730     11111          1          1          0          0 |
      |------------------------------------------------------------------|
      | B00060NI9Y     11110          1          0          1          0 |
      | B00060NUEW     00000          0          0          1          1 |
      | B00060O578     00000          0          0          1          1 |
      | B00061EMH0     00000          0          0          1          1 |
      | B00067LYEI     00000          0          0          1          1 |
      |------------------------------------------------------------------|
      | B0007G96XO     00000          0          0          1          1 |
      | B0007M3E8Q     01100          1          0          1          0 |
      | B0007VDRHK     00000          0          0          1          1 |
      | B0008JEZ8G     00000          0          0          1          1 |
      | B000BD3Y7C     11111          1          1          0          0 |
      +------------------------------------------------------------------+
    
    .
    Your indicator variables are then possibly

    Code:
    gen wanted1 = Prime1 == 1 & Any1 == 1
    gen wanted2 = Prime1 == 1 & All1 == 1
    gen wanted3 = Prime1 == 0 & Any0 == 1
    gen wanted4 = Prime1 == 0 & All0 == 1
    and so on.

    Comment


    • #3
      It is a little unclear whether you are looking for all values of Prime in years 2 to 5 being 1 or 0, or just any such value occurring (at least once).

      The main trick here is to exploit the equivalence:

      Any 1 in say 0000 1111 0001 0100 corresponds to a maximum value of 1 in the sequence (because if the maximum were 0, no value is 1).

      All 1 in say 0000 1111 0001 0100 corresponds to a minimum value of 1 in the sequence (because if the minimum were 0, the values are not all 1.

      with a related trick to find "any 0" or "all 0".

      Much more at https://www.stata.com/support/faqs/d...ble-recording/

      I also use a technique of bundling each panel's history into one string. More at https://journals.sagepub.com/doi/pdf...36867X20909698

      Thanks for your data example -- which needs a small tweak to run as desired.

      Code:
      clear 
      input str10 ASIN Year Prime
      "B00008CQRK" 1 0
      "B00008CQRK" 2 0
      "B00008CQRK" 3 0
      "B00008CQRK" 4 0
      "B00008CQRK" 5 0
      "B0000E5JK7" 1 0
      "B0000E5JK7" 2 0
      "B0000E5JK7" 3 0
      "B0000E5JK7" 4 0
      "B0000E5JK7" 5 0
      "B0000E5JK8" 1 0
      "B0000E5JK8" 2 0
      "B0000E5JK8" 3 0
      "B0000E5JK8" 4 0
      "B0000E5JK8" 5 0
      "B0000E5JKC" 1 0
      "B0000E5JKC" 2 0
      "B0000E5JKC" 3 0
      "B0000E5JKC" 4 0
      "B0000E5JKC" 5 0
      "B0000E5JRF" 1 1
      "B0000E5JRF" 2 1
      "B0000E5JRF" 3 1
      "B0000E5JRF" 4 1
      "B0000E5JRF" 5 1
      "B0000TLES8" 1 0
      "B0000TLES8" 2 0
      "B0000TLES8" 3 0
      "B0000TLES8" 4 0
      "B0000TLES8" 5 0
      "B0000VYKXC" 1 0
      "B0000VYKXC" 2 1
      "B0000VYKXC" 3 1
      "B0000VYKXC" 4 0
      "B0000VYKXC" 5 1
      "B0001Y9U88" 1 0
      "B0001Y9U88" 2 0
      "B0001Y9U88" 3 0
      "B0001Y9U88" 4 0
      "B0001Y9U88" 5 0
      "B0002AHT84" 1 1
      "B0002AHT84" 2 0
      "B0002AHT84" 3 0
      "B0002AHT84" 4 0
      "B0002AHT84" 5 0
      "B0002HV730" 1 1
      "B0002HV730" 2 1
      "B0002HV730" 3 1
      "B0002HV730" 4 1
      "B0002HV730" 5 1
      "B00060NI9Y" 1 1
      "B00060NI9Y" 2 1
      "B00060NI9Y" 3 1
      "B00060NI9Y" 4 1
      "B00060NI9Y" 5 0
      "B00060NUEW" 1 0
      "B00060NUEW" 2 0
      "B00060NUEW" 3 0
      "B00060NUEW" 4 0
      "B00060NUEW" 5 0
      "B00060O578" 1 0
      "B00060O578" 2 0
      "B00060O578" 3 0
      "B00060O578" 4 0
      "B00060O578" 5 0
      "B00061EMH0" 1 0
      "B00061EMH0" 2 0
      "B00061EMH0" 3 0
      "B00061EMH0" 4 0
      "B00061EMH0" 5 0
      "B00067LYEI" 1 0
      "B00067LYEI" 2 0
      "B00067LYEI" 3 0
      "B00067LYEI" 4 0
      "B00067LYEI" 5 0
      "B0007G96XO" 1 0
      "B0007G96XO" 2 0
      "B0007G96XO" 3 0
      "B0007G96XO" 4 0
      "B0007G96XO" 5 0
      "B0007M3E8Q" 1 0
      "B0007M3E8Q" 2 1
      "B0007M3E8Q" 3 1
      "B0007M3E8Q" 4 0
      "B0007M3E8Q" 5 0
      "B0007VDRHK" 1 0
      "B0007VDRHK" 2 0
      "B0007VDRHK" 3 0
      "B0007VDRHK" 4 0
      "B0007VDRHK" 5 0
      "B0008JEZ8G" 1 0
      "B0008JEZ8G" 2 0
      "B0008JEZ8G" 3 0
      "B0008JEZ8G" 4 0
      "B0008JEZ8G" 5 0
      "B000BD3Y7C" 1 1
      "B000BD3Y7C" 2 1
      "B000BD3Y7C" 3 1
      "B000BD3Y7C" 4 1
      "B000BD3Y7C" 5 1
      end 
      
      egen Prime1 = mean(cond(Year == 1, Prime, .)), by(ASIN)
      
      egen Any1later = max(cond(Year > 1, Prime, .)), by(ASIN)
      egen All1later = min(cond(Year > 1, Prime, .)), by(ASIN)
      egen Any0later = max(cond(Year > 1, Prime == 0, .)), by(ASIN)
      egen All0later = min(cond(Year > 1, Prime == 0, .)), by(ASIN)
      
      bysort ASIN (Year) : gen history = strofreal(Prime)
      by ASIN : replace history = history[_n-1] + strofreal(Prime) if _n > 1 
      by ASIN : replace history = history[_N]
      
      egen tag = tag(ASIN)
      
      list ASIN history Any1 All1 Any0 All0 if tag , noobs  
      
       +------------------------------------------------------------------+
        |       ASIN   history   Any1la~r   All1la~r   Any0la~r   All0la~r |
        |------------------------------------------------------------------|
        | B00008CQRK     00000          0          0          1          1 |
        | B0000E5JK7     00000          0          0          1          1 |
        | B0000E5JK8     00000          0          0          1          1 |
        | B0000E5JKC     00000          0          0          1          1 |
        | B0000E5JRF     11111          1          1          0          0 |
        |------------------------------------------------------------------|
        | B0000TLES8     00000          0          0          1          1 |
        | B0000VYKXC     01101          1          0          1          0 |
        | B0001Y9U88     00000          0          0          1          1 |
        | B0002AHT84     10000          0          0          1          1 |
        | B0002HV730     11111          1          1          0          0 |
        |------------------------------------------------------------------|
        | B00060NI9Y     11110          1          0          1          0 |
        | B00060NUEW     00000          0          0          1          1 |
        | B00060O578     00000          0          0          1          1 |
        | B00061EMH0     00000          0          0          1          1 |
        | B00067LYEI     00000          0          0          1          1 |
        |------------------------------------------------------------------|
        | B0007G96XO     00000          0          0          1          1 |
        | B0007M3E8Q     01100          1          0          1          0 |
        | B0007VDRHK     00000          0          0          1          1 |
        | B0008JEZ8G     00000          0          0          1          1 |
        | B000BD3Y7C     11111          1          1          0          0 |
        +------------------------------------------------------------------+
      
      .
      Your indicator variables are then possibly

      Code:
      gen wanted1 = Prime1 == 1 & Any1 == 1
      gen wanted2 = Prime1 == 1 & All1 == 1
      gen wanted3 = Prime1 == 0 & Any0 == 1
      gen wanted4 = Prime1 == 0 & All0 == 1
      and so on. The indicators can also be generated directly from the history variable, but more care is needed there if you have any gaps or missing values.

      Comment


      • #4
        Apologies for the repetition.

        Comment

        Working...
        X