
  • Creating a dummy: D=1 if X occurred in the last Y months

    Hi guys, I've got a panel and I'm trying to create a dummy which is 1 if the value of another dummy variable ==1 at any point in the last 6 months, and 0 otherwise.

    I haven't yet worked out how to code two if statements in Stata, so any help is greatly appreciated.

    Alex



  • #2
    It would help if you made your data names and structures clear, e.g. by using dataex (SSC; see http://www.statalist.org/forums/forum/general-stata-discussion/general/1301224-new-on-ssc-dataex-a-command-to-generate-a-properly-formatted-data-example-for-statalist) to give a simple worked example. With a monthly panel dataset that has been tsset or xtset, this could need no if qualifiers at all, e.g.

    Code:
    gen any6 = (L1.some + L2.some + L3.some + L4.some + L5.some + L6.some) > 0
    as if any of 6 previous values is 1, then their sum is positive.



    • #3
      Replying to Nick Cox (#2):

      Hi, it is a monthly panel dataset running from months 4 to 31 (vbl is monthy). I didn't think of something as simple as that so thanks!

      However, the first 5 months get set to 1 even if the value of the dummy (Dinsurance) is 0. How do I prevent the first month from being set to 1?


      (Reading about the dataex command as we speak)
      Last edited by Alex Jeffries; 06 Jul 2015, 04:11.



      • #4
        Indeed, missings will bite at the beginning of each panel, so simple tricks can be too simple. max() would be safer.

        Code:
        gen any6 = max(L1.some, L2.some, L3.some, L4.some, L5.some, L6.some)
        Also check out tsegen (SSC).
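
        To see why the sum goes wrong at the start of each panel and max() does not, here is a minimal check of how Stata treats missing values. It is only a sketch to run line by line; the results noted in the comments follow from the rule that numeric missing counts as larger than any number.

        ```stata
        * A sum involving a missing lag is itself missing
        * displays .
        display 1 + .

        * Missing counts as larger than any number, so the comparison is true
        * displays 1
        display (. > 0)

        * max() ignores missing arguments
        * displays 0
        display max(., 0)

        * max() returns missing only when every argument is missing
        * displays .
        display max(., .)
        ```

        So with the sum, any observation with fewer than 6 available lags evaluates to missing > 0, which is 1. With max(), only observations with no available lags at all come out as missing.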



        • #5
          Replying to Nick Cox (#4):
          Much appreciated



          • #6
            Here's the tsegen route.

            Code:
            . webuse grunfeld
            
            . su
            
                Variable |       Obs        Mean    Std. Dev.       Min        Max
            -------------+--------------------------------------------------------
                 company |       200         5.5    2.879489          1         10
                    year |       200      1944.5    5.780751       1935       1954
                  invest |       200    145.9583    216.8753        .93     1486.7
                  mvalue |       200    1081.681     1314.47      58.12     6241.7
                  kstock |       200    276.0172    301.1039         .8     2226.3
            -------------+--------------------------------------------------------
                    time |       200        10.5    5.780751          1         20
            
            . tsset
                   panel variable:  company (strongly balanced)
                    time variable:  year, 1935 to 1954
                            delta:  1 year
            
            . gen high = mvalue > 500
            
            . tsegen last6 = rowmax(L(1/6).high)
            (10 missing values generated)
            
            . list mvalue high last6 company if company <= 2, sepby(company)
            
                 +---------------------------------+
                 | mvalue   high   last6   company |
                 |---------------------------------|
              1. | 3078.5      1       .         1 |
              2. | 4661.7      1       1         1 |
              3. | 5387.1      1       1         1 |
              4. | 2792.2      1       1         1 |
              5. | 4313.2      1       1         1 |
              6. | 4643.9      1       1         1 |
              7. | 4551.2      1       1         1 |
              8. | 3244.1      1       1         1 |
              9. | 4053.7      1       1         1 |
             10. | 4379.3      1       1         1 |
             11. | 4840.9      1       1         1 |
             12. | 4900.9      1       1         1 |
             13. | 3526.5      1       1         1 |
             14. | 3254.7      1       1         1 |
             15. | 3700.2      1       1         1 |
             16. | 3755.6      1       1         1 |
             17. |   4833      1       1         1 |
             18. | 4924.9      1       1         1 |
             19. | 6241.7      1       1         1 |
             20. | 5593.6      1       1         1 |
                 |---------------------------------|
             21. | 1362.4      1       .         2 |
             22. | 1807.1      1       1         2 |
             23. | 2676.3      1       1         2 |
             24. | 1801.9      1       1         2 |
             25. | 1957.3      1       1         2 |
             26. | 2202.9      1       1         2 |
             27. | 2380.5      1       1         2 |
             28. | 2168.6      1       1         2 |
             29. | 1985.1      1       1         2 |
             30. | 1813.9      1       1         2 |
             31. | 1850.2      1       1         2 |
             32. | 2067.7      1       1         2 |
             33. | 1796.7      1       1         2 |
             34. | 1625.8      1       1         2 |
             35. |   1667      1       1         2 |
             36. | 1677.4      1       1         2 |
             37. | 2289.5      1       1         2 |
             38. | 2159.4      1       1         2 |
             39. | 2031.3      1       1         2 |
             40. | 2115.5      1       1         2 |
                 +---------------------------------+
            Here there's no information at all on the previous values for the first value, so missing is returned automatically.

            tsegen is a fairly general tool here so long as a suitable function exists for egen.
            Last edited by Nick Cox; 06 Jul 2015, 05:14.
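
            As one illustration of that generality, the same pattern with a different egen function gives a count rather than an indicator. This is a sketch only: the variable name n6 is made up for the example, and it assumes tsegen has been installed from SSC.

            ```stata
            * tsegen is user-written: ssc install tsegen
            webuse grunfeld, clear
            gen high = mvalue > 500

            * Number of the previous 6 years in which high was 1.
            * rowtotal() treats missing lags as 0, so early years count
            * only the lags that actually exist.
            tsegen n6 = rowtotal(L(1/6).high)

            list company year high n6 if company == 1, sepby(company)
            ```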



            • #7
              The anymatch() egen function can also be used with tsegen for this problem. The advantage is that it does not generate missing values at all.

              Code:
               tsegen last6 = anymatch(L(1/6).high), val(1)
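
              To compare the two choices side by side, something like the following could be run on the same Grunfeld example. It is a sketch, not a recommendation: max6 and any6 are names invented here, and tsegen is assumed to be installed from SSC.

              ```stata
              * tsegen is user-written: ssc install tsegen
              webuse grunfeld, clear
              gen high = mvalue > 500

              * rowmax(): missing when there are no previous values at all
              tsegen max6 = rowmax(L(1/6).high)

              * anymatch(): 0 or 1 everywhere, never missing
              tsegen any6 = anymatch(L(1/6).high), values(1)

              * The two should differ only in the first year of each panel
              list company year high max6 any6 if year == 1935
              ```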



              • #8
                I'd argue that a missing value in the first observation of each panel is most likely to be the right answer here. There really is no information on previous values. Still, that's a substantive choice, and anymatch() is an alternative choice here.
