Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Identify first observation in panel data and filter based on cut off value

    Hi,

    I have a panel data with securities (id) and month (ymdate), return (ex_ret) and market cap (mcap in $m) etc. Now I want to only keep securities that has initial mcap > $10 (first observation in data set). If it starts above $10 and goes below $10 at a later point in time, I still want to keep them in my data set.

    Any suggestion/help is much appreciated. I have included a snapshot of data.

    Thanks,
    John

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long id float(ex_ret ymdate) int year byte month float mcap
     5    .007135526 408 1994  1  95.32637
     5     .04686265 409 1994  2  111.5997
     5     -.0521462 410 1994  3 120.20005
     5     -.0205943 411 1994  4 118.55016
     5    -.03191356 412 1994  5 115.30073
     5   .0003904014 413 1994  6 108.75031
     5     .02415652 414 1994  7 112.24284
     5     .05768865 415 1994  8 119.38356
     5    .006272078 416 1994  9  123.8092
     5    -.04605908 417 1994 10  127.4961
     5   -.033803094 418 1994 11 125.62856
     5    -.07200204 419 1994 12  115.3449
     5    -.08217537 420 1995  1 103.74718
     5    -.04307023 421 1995  2  95.07434
     5    -.09054957 422 1995  3  78.60113
     5    .016433334 423 1995  4  75.64328
     5   -.006503144 424 1995  5  69.77741
     5 -.00028255107 425 1995  6  62.30355
     5    .004845794 426 1995  7  59.46569
     5     -.0335671 427 1995  8  54.49944
     5        -.0043 428 1995  9  46.45511
     5    -.05573758 429 1995 10  44.24756
     5    -.08221418 430 1995 11 36.724445
     5    .002151282 431 1995 12  33.28134
     5  -.0004807766 432 1996  1  30.08087
     5   -.008338808 433 1996  2 26.936356
     5    -.07077898 434 1996  3  22.63101
     5   -.019309897 435 1996  4  12.20499
     5    -.03222314 436 1996  5 11.884175
     5     .09798147 437 1996  6  8.665211
     5    -.05694231 438 1996  7  8.209953
     5    -.05392935 439 1996  8  4.001592
    11         .0594 408 1994  1         .
    11        -.0275 409 1994  2         .
    11        -.0164 410 1994  3         .
    11         -.004 411 1994  4         .
    11         .0343 412 1994  5         .
    11         .0201 413 1994  6         .
    11        -.0155 414 1994  7      15.9
    11         .0049 415 1994  8         .
    11        -.0037 416 1994  9         .
    11        -.0195 417 1994 10         .
    11        -.0149 418 1994 11         .
    11        -.0192 419 1994 12         .
    11         .0007 420 1995  1         .
    11         .0028 421 1995  2         .
    11         .0154 422 1995  3         .
    11         .0173 423 1995  4         .
    12         .0415 410 1994  3 14.923252
    12        -.0003 413 1994  6 16.368946
    12          .026 416 1994  9  19.79441
    12         .0133 419 1994 12   21.1824
    12         .0387 422 1995  3 22.145514
    12          .035 425 1995  6 25.975367
    12         .0199 428 1995  9         .
    12         .0299 431 1995 12        60
    12         .0268 434 1996  3        61
    12         .0262 437 1996  6        80
    12         .0133 440 1996  9        83
    12         .0221 443 1996 12        87
    12         .0367 446 1997  3        80
    12         .0352 449 1997  6        90
    12         .0304 452 1997  9        96
    12         .0226 455 1997 12       118
    12          .037 458 1998  3       151
    13        -.0697 410 1994  3         .
    13         .0959 413 1994  6         .
    13         .3213 416 1994  9        28
    13         .0076 419 1994 12         .
    13         .1414 422 1995  3        31
    13         .0933 425 1995  6        33
    13         .0477 428 1995  9         .
    13         .0561 431 1995 12      38.1
    13         .1641 434 1996  3         .
    13         -.021 437 1996  6         .
    13         .0136 440 1996  9        30
    13        -.0556 443 1996 12        40
    13         .0457 446 1997  3         .
    13         .1803 449 1997  6         .
    13         .1296 452 1997  9         .
    13         .0152 455 1997 12         .
    13         .3681 458 1998  3         .
    13         .0609 461 1998  6         .
    13         .0394 464 1998  9         .
    13         .3012 467 1998 12       106
    13           .05 470 1999  3       110
    14         .0058 408 1994  1         .
    14        -.0007 409 1994  2         .
    14        -.0467 410 1994  3         .
    14        -.0216 411 1994  4         .
    14        -.0244 412 1994  5         .
    14        -.0195 413 1994  6         .
    14         .0118 414 1994  7         .
    14         .0009 415 1994  8         .
    14        -.0066 416 1994  9         2
    15         .0241 408 1994  1 18.121399
    15         .0391 409 1994  2 19.069885
    15         .0031 410 1994  3  19.21628
    15         .0216 411 1994  4  20.59685
    15         .0094 412 1994  5 20.934223
    end
    format %tm ymdate


  • #2
    The gist of it is simple enough:

    Code:
    by id (ymdate), sort: keep if mcap[1] > 10
    But there is a wrinkle. Several of the firms in the example have a missing value of mcap in their first month in the data. In Stata, missing value is interpreted as being larger than any number. So the above code will retain these firms. You don't say what you want to do with these firms. If you want to drop them:

    Code:
    by id (ymdate), sort: keep if mcap[1] > 10 & !missing(mcap[1])
    I suppose there is another possibility: you might want to keep them if their first non-missing mcap value in the data, whenever that happens to be, is > 10. That would be:

    Code:
    gen byte no_mcap = missing(mcap)
    by id (no_mcap ymdate), sort: keep if mcap[1] > 10

    Comment


    • #3
      Hi Clyde,

      Thanks for the detailed solution and the various options. This is indeed very helpful.

      Best,
      John.

      Comment


      • #4
        Hi Clyde,

        As a follow up, I am trying to create a dummy variable based on mcap and multiple conditions.
        - the security stops reporting (column dead is populated with a month number, the security goes dead on the last month of reporting for the that id)
        - trailing average 6 month return (ex_ret) is less than zero (going back from the last reporting date)
        - trialing mcap is declined over the last year (12 months) leading up to going dead

        The dummy variable by id will be 1 on the last month of reporting for the security if it is dead and all the other conditions are met.

        I am attaching a snapshot of the data:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input long id int year byte month float newdate int(incep dead monthnum) float(ex_ret mcap)
         5 1994  1 408 154 199 168    .007135526  95.32637
         5 1994  2 409 154 199 169     .04686265  111.5997
         5 1994  3 410 154 199 170     -.0521462 120.20005
         5 1994  5 412 154 199 172    -.03191356 115.30073
         5 1994  6 413 154 199 173   .0003904014 108.75031
         5 1994  7 414 154 199 174     .02415652 112.24284
         5 1994  8 415 154 199 175     .05768865 119.38356
         5 1994  9 416 154 199 176    .006272078  123.8092
         5 1994 10 417 154 199 177    -.04605908  127.4961
         5 1994 11 418 154 199 178   -.033803094 125.62856
         5 1994 12 419 154 199 179    -.07200204  115.3449
         5 1995  1 420 154 199 180    -.08217537 103.74718
         5 1995  2 421 154 199 181    -.04307023  95.07434
         5 1995  4 423 154 199 183    .016433334  75.64328
         5 1995  5 424 154 199 184   -.006503144  69.77741
         5 1995  6 425 154 199 185 -.00028255107  62.30355
         5 1995  7 426 154 199 186    .004845794  59.46569
         5 1995  9 428 154 199 188        -.0043  46.45511
         5 1995 10 429 154 199 189    -.05573758  44.24756
         5 1995 11 430 154 199 190    -.08221418 36.724445
         5 1995 12 431 154 199 191    .002151282  33.28134
         5 1996  1 432 154 199 192  -.0004807766  30.08087
         5 1996  2 433 154 199 193   -.008338808 26.936356
         5 1996  3 434 154 199 194    -.07077898  22.63101
         5 1996  4 435 154 199 195   -.019309897  12.20499
         5 1996  5 436 154 199 196    -.03222314 11.884175
         5 1996  6 437 154 199 197     .09798147  8.665211
         5 1996  8 439 154 199 199    -.05392935  4.001592
        11 1994  1 408 120 183 168         .0594         .
        11 1994  2 409 120 183 169        -.0275         .
        11 1994  3 410 120 183 170        -.0164         .
        11 1994  4 411 120 183 171         -.004         .
        11 1994  5 412 120 183 172         .0343         .
        11 1994  6 413 120 183 173         .0201         .
        11 1994  8 415 120 183 175         .0049         .
        11 1994  9 416 120 183 176        -.0037         .
        11 1994 10 417 120 183 177        -.0195         .
        11 1994 11 418 120 183 178        -.0149         .
        11 1994 12 419 120 183 179        -.0192         .
        11 1995  1 420 120 183 180         .0007         .
        11 1995  2 421 120 183 181         .0028         .
        11 1995  4 423 120 183 183         .0173         .
        12 1994  3 410 104 218 170         .0415 14.923252
        12 1994  6 413 104 218 173        -.0003 16.368946
        12 1994  9 416 104 218 176          .026  19.79441
        12 1994 12 419 104 218 179         .0133   21.1824
        12 1995  6 425 104 218 185          .035 25.975367
        12 1995 12 431 104 218 191         .0299        60
        12 1996  3 434 104 218 194         .0268        61
        12 1996  6 437 104 218 197         .0262        80
        12 1996  9 440 104 218 200         .0133        83
        12 1996 12 443 104 218 203         .0221        87
        12 1997  3 446 104 218 206         .0367        80
        12 1997  6 449 104 218 209         .0352        90
        12 1997  9 452 104 218 212         .0304        96
        12 1997 12 455 104 218 215         .0226       118
        12 1998  3 458 104 218 218          .037       151
        13 1994  3 410  86 230 170        -.0697         .
        13 1994  6 413  86 230 173         .0959         .
        13 1994  9 416  86 230 176         .3213        28
        13 1994 12 419  86 230 179         .0076         .
        13 1995  9 428  86 230 188         .0477         .
        13 1995 12 431  86 230 191         .0561      38.1
        13 1996  3 434  86 230 194         .1641         .
        13 1996  6 437  86 230 197         -.021         .
        13 1996  9 440  86 230 200         .0136        30
        13 1996 12 443  86 230 203        -.0556        40
        13 1997  3 446  86 230 206         .0457         .
        13 1997  6 449  86 230 209         .1803         .
        13 1997  9 452  86 230 212         .1296         .
        13 1997 12 455  86 230 215         .0152         .
        13 1998  3 458  86 230 218         .3681         .
        13 1998  6 461  86 230 221         .0609         .
        13 1998  9 464  86 230 224         .0394         .
        13 1998 12 467  86 230 227         .3012       106
        14 1994  2 409 160 176 169        -.0007         .
        14 1994  3 410 160 176 170        -.0467         .
        14 1994  4 411 160 176 171        -.0216         .
        14 1994  5 412 160 176 172        -.0244         .
        14 1994  6 413 160 176 173        -.0195         .
        14 1994  7 414 160 176 174         .0118         .
        14 1994  8 415 160 176 175         .0009         .
        14 1994  9 416 160 176 176        -.0066         2
        15 1994  1 408 120   . 168         .0241 18.121399
        15 1994  2 409 120   . 169         .0391 19.069885
        15 1994  3 410 120   . 170         .0031  19.21628
        15 1994  4 411 120   . 171         .0216  20.59685
        15 1994  5 412 120   . 172         .0094 20.934223
        15 1994  6 413 120   . 173        -.0582  19.50379
        15 1994  7 414 120   . 174         .0249   20.4468
        15 1994  9 416 120   . 176         .0091 21.878357
        15 1994 10 417 120   . 177         .0268 23.323545
        15 1994 11 418 120   . 178        -.0108  23.13353
        15 1994 12 419 120   . 179         .0317  23.26434
        15 1995  1 420 120   . 180         .0019  30.16556
        15 1995  2 421 120   . 181         .0273  31.36802
        15 1995  4 423 120   . 183         .0041 36.430573
        15 1995  5 424 120   . 184        -.0209  35.75147
        15 1995  6 425 120   . 185         .0264 37.159645
        15 1995  7 426 120   . 186         .0409  39.63629
        end
        format %tm newdate

        Any help/suggestion is much appreciated.

        Best,
        John.

        Comment


        • #5
          John,

          This is unrelated, from a Stata perspective, to the original topic of the post. You should start a new thread for this so that others who search for solutions will be able to find this.

          Next, it is a bad idea, in general, to address your posts to a specific person. It just discourages others who might be able to help you sooner from responding. More to the point, this is a community and these posts are not a dialog between two particular people. I understand that when there is a back and forth working out the fine points of a particular problem it feels natural to address the person already engaged in the problem, and in that situation probably others would not jump in anyway unless they saw a better solution than the one on offer, or if the post went unanswered for a long time. But even then, it is better to just put the post up for the whole community.

          Your conditions are not clear to me. When you post back in a new thread, please clarify the following:

          1. What do you mean by "- the security stops reporting (column dead is populated with a month number, the security goes dead on the last month of reporting for the that id)?" As a description of your data sample this is false. id 13 has dead == 230, but the last monthnum is 227. And id 15 has missing values for dead, rather than being the final reporting month 186. So what is going on here?

          2. Does 6 month trailing average mean from 1 month before through 6 months before, or does it mean the current month through 5 months before?

          3. What does "trialing mcap is declined over the last year (12 months) leading up to going dead" mean? I truly have no idea what you're getting at here.

          Comment


          • #6
            Hi Clyde,

            Thanks for your comments. I have started a new thread. I have tried to answer all your questions.

            In some cases, I may have incomplete data so monthnum may be smaller than dead (month number) . Also if the dead column has missing values, that means the security is alive.

            For id 15 since the security is alive, the final reporting month will depend on the data set (in my case it goes till 12/2013) but the sample data set doesn't show that.

            Best,
            John.

            Comment

            Working...
            X