Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating a variable that counts the appearance of a particular value in rolling bases (across rows)




    Dear Statalist,



    I am having some difficulties in creating a variable I need. I would greatly appreciate it if you could give me any suggestions or advice.

    I want to create a variable that estimates the total number of distinct markets a firm entered over the past three(five) years. For example, firm 66, as of 1999, entered only one market - market 6 - over the past 3 years. The value should be 1 for the variable I want.

    I created the below code, but it considers the same market to be different. For the aforementioned case, my code generated the value of 2 instead of 1 as it counts market 6 twice.

    gen id=market
    ren market market_

    reshape wide market_, i(firm year) j(id)

    foreach var of varlist market_* {
    replace `var'=1 if `var'~=.
    replace `var'=0 if `var'==.
    }

    foreach var of varlist market_* {
    rangestat (cum) `var', by(firm) interval(year -5 -1)
    ren `var'_cum cum_`var'_5

    rangestat (cum) `var', by(firm) interval(year -3 -1)
    ren `var'_cum cum_`var'_3


    }
    *
    egen market_sum3 = rowtotal(cum_market_*_3)
    egen market_sum5 = rowtotal(cum_market_*_5)


    I would appreciate it if you could give me any useful tips. Thank you so much.



    FYI.
    firm: firm id
    market: market id
    product: product id


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(firm year market product)
     66 1995  6  1
     66 1996  6  2
     66 1998  6  3
     66 1999  6  4
     66 2001  4  5
     66 2004  6  6
     66 2004  6  7
     66 2005 10  8
    150 1977 17  9
    150 1985  8 10
    150 1988  1 11
    150 1989 10 12
    150 1990  1 13
    150 1990  1 14
    150 1990  1 15
    150 1990  1 16
    150 1990  1 17
    150 1991  1 18
    150 1992  1 19
    150 1992  1 20
    150 1992  1 21
    150 1992  1 22
    150 1992  1 23
    150 1992  1 24
    150 1992  1 25
    150 1993  1 26
    150 1993  1 27
    150 1993  1 28
    150 1993  1 29
    150 1993  1 30
    150 1993  1 31
    150 1994  1 32
    150 2011  1 33
    end

  • #2
    This question comes up quite often, as witness (e.g https://www.statalist.org/forums/for...evious-5-years and its links.

    This solution uses rangerun and rangestat from SSC.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(firm year market product)
     66 1995  6  1
     66 1996  6  2
     66 1998  6  3
     66 1999  6  4
     66 2001  4  5
     66 2004  6  6
     66 2004  6  7
     66 2005 10  8
    150 1977 17  9
    150 1985  8 10
    150 1988  1 11
    150 1989 10 12
    150 1990  1 13
    150 1990  1 14
    150 1990  1 15
    150 1990  1 16
    150 1990  1 17
    150 1991  1 18
    150 1992  1 19
    150 1992  1 20
    150 1992  1 21
    150 1992  1 22
    150 1992  1 23
    150 1992  1 24
    150 1992  1 25
    150 1993  1 26
    150 1993  1 27
    150 1993  1 28
    150 1993  1 29
    150 1993  1 30
    150 1993  1 31
    150 1994  1 32
    150 2011  1 33
    end
    
    program count_distinct 
    qui tab market 
    gen distinct = r(r)
    end
    
    rangerun count_distinct, use(market) inter(year -3 -1) by(firm)
    
    list, sepby(firm)
    
        +-------------------------------------------+
         | firm   year   market   product   distinct |
         |-------------------------------------------|
      1. |   66   1995        6         1          . |
      2. |   66   1996        6         2          1 |
      3. |   66   1998        6         3          1 |
      4. |   66   1999        6         4          1 |
      5. |   66   2001        4         5          1 |
      6. |   66   2004        6         6          1 |
      7. |   66   2004        6         7          1 |
      8. |   66   2005       10         8          1 |
         |-------------------------------------------|
      9. |  150   1977       17         9          . |
     10. |  150   1985        8        10          . |
     11. |  150   1988        1        11          1 |
     12. |  150   1989       10        12          1 |
     13. |  150   1990        1        13          2 |
     14. |  150   1990        1        14          2 |
     15. |  150   1990        1        15          2 |
     16. |  150   1990        1        16          2 |
     17. |  150   1990        1        17          2 |
     18. |  150   1991        1        18          2 |
     19. |  150   1992        1        19          2 |
     20. |  150   1992        1        20          2 |
     21. |  150   1992        1        21          2 |
     22. |  150   1992        1        22          2 |
     23. |  150   1992        1        23          2 |
     24. |  150   1992        1        24          2 |
     25. |  150   1992        1        25          2 |
     26. |  150   1993        1        26          1 |
     27. |  150   1993        1        27          1 |
     28. |  150   1993        1        28          1 |
     29. |  150   1993        1        29          1 |
     30. |  150   1993        1        30          1 |
     31. |  150   1993        1        31          1 |
     32. |  150   1994        1        32          1 |
     33. |  150   2011        1        33          . |
         +-------------------------------------------+

    Comment


    • #3
      Dear Nick,


      Thank you so much for your help. I should've explored further to see if a similar question was posted.
      Have a great weekend. Thank you again.

      Comment

      Working...
      X