Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to tag the most recent fiscal year end value based on the calandar time?

    Dear Stata experts,

    I would like to find the most recent fiscal-year-end investment based on the calendar time, so for example, on calandar time Jan2017 for the firm i, I would like to identify the most recent available date of fiscal year end (fym) and it could be on Dec2016 or Oct2016, which depends on the data availability. I've found some related discussions on forum, but still couldn't get the result I want after several trials. The difficulty here is to identify the most recent fiscal year end according to the calandar time, do you know how to adjust my following codes to satisfy that? Many thanks for your help in advance!

    Here are my trial codes:
    Code:
    egen whenlast = max(cond(!missing(inv), fym, .)), by(permno year)
    
    egen mostrecent = total(cond(fym <= whenlast, inv, .)), by(permno year)
    Here is my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double permno float(fym ym year inv)
    10001 329 329 1987          .
    10001 341 341 1988  .04923744
    10001 353 353 1989          .
    10001 365 365 1990  .12323585
    10001 377 377 1991  .00625579
    10001 389 389 1992          .
    10001 401 401 1993          .
    10001 413 413 1994 .028103314
    10001 425 425 1995          .
    10001 437 437 1996          .
    10001 449 449 1997  .03011862
    10001 461 461 1998  .04490795
    10001 473 473 1999          .
    10001 485 485 2000          .
    10001 497 497 2001  .00778224
    10001 509 509 2002          .
    10001 521 521 2003          .
    10001 533 533 2004 .036568023
    10001 545 545 2005 .007604275
    10001 557 557 2006 .030736925
    10001 569 569 2007  .02768201
    10001 587 587 2008  .03786452
    10001 599 599 2009          .
    10001 611 611 2010  .02021314
    10001 623 623 2011          .
    10001 635 635 2012          .
    10001 647 647 2013          .
    10001 659 659 2014          .
    10001 671 671 2015  .06002158
    10005 335 335 1987          .
    10005 347 347 1988   .3489565
    10005 359 359 1989  .11902058
    10006 179 179 1974          .
    10006 191 191 1975 .009795986
    10006 203 203 1976          .
    10006 215 215 1977          .
    10006 227 227 1978          .
    10006 239 239 1979          .
    10006 251 251 1980          .
    10006 263 263 1981 .003754211
    10006 275 275 1982          .
    10010 333 333 1987          .
    10010 345 345 1988 .013301323
    10010 357 357 1989 .016951194
    10010 369 369 1990          .
    10010 381 381 1991          .
    10010 393 393 1992          .
    10010 405 405 1993  .21574366
    10011 349 361 1990          .
    10011 361 373 1991  .11395872
    10011 373 385 1992  .04502336
    10011 385 397 1993          .
    10011 397 409 1994          .
    10011 409 421 1995          .
    10011 421 433 1996 .018474387
    10011 433 445 1997          .
    10012 325 337 1988          .
    10012 337 349 1989          .
    10012 349 361 1990          .
    10012 361 373 1991  .10245787
    10012 373 385 1992  .01330823
    10012 385 397 1993 .025130806
    10012 397 409 1994          .
    10012 409 421 1995  .04874434
    10012 421 433 1996          .
    10012 433 445 1997  .04086566
    10012 445 457 1998  .01229265
    10012 457 469 1999  .06302412
    10012 469 481 2000          .
    10012 481 493 2001  .06938227
    10012 493 505 2002  .06559808
    10012 505 517 2003 .009870172
    10012 517 529 2004          .
    10014 170 182 1975          .
    10014 182 194 1976 .003883881
    10015 299 299 1984          .
    10016 335 335 1987          .
    10016 347 347 1988 .012448853
    10016 359 359 1989 .014390157
    10016 371 371 1990  .00545693
    10016 383 383 1991          .
    10016 395 395 1992  .02559407
    10016 407 407 1993 .008933168
    10016 419 419 1994 .035720687
    10016 431 431 1995 .008099532
    10016 443 443 1996  .02214086
    10016 455 455 1997 .024634454
    10016 467 467 1998  .07937292
    10016 479 479 1999   .0356203
    10018 333 333 1987          .
    10018 345 345 1988  .08612505
    10018 357 357 1989          .
    10018 369 369 1990  .12523298
    10018 381 381 1991  .04002704
    10018 393 393 1992          .
    10018 405 405 1993          .
    10018 417 417 1994   .1547711
    10019 329 329 1987          .
    10019 341 341 1988          .
    10019 353 353 1989          .
    end
    format %tm fym
    format %tm ym
    Last edited by Jae Li; 28 Jan 2019, 08:24.

  • #2
    I think what you want is this:
    Code:
    //    CREATE A DATA SET WITH AVAILABLE INV
    preserve
    drop if missing(inv)
    tempfile candidates
    save `candidates'
    restore
    
    rangejoin fym . ym using `candidates', by(permno)
    by permno year (fym), sort: keep if _n == _N
    rename fym_U whenlast
    rename inv_U mostrecent
    drop *_U
    -rangejoin- is written by Robert Picard and is available from SSC. To use it, you must also have installed -rangestat-, by Robert Picard, Nick Cox, and Roberto Ferrer, also available from SSC.

    Comment


    • #3
      Dear @Clyde Schechter, thank you for your swift reply!

      It's always nice to learn new techniques from you! Thank you for introduing me about the -rangejoin-! I read the help file and would like to confirm a few questions if possible. First, fym_U and inv_U and all variables with _U are from original dataset called "candidates", is that right? After running -rangejoin-, how does the variable of fym change exactly? What does the variable of whenlast mean exactly?

      Many thanks for your time and patience in advance! Hope to hear from you soon!
      Last edited by Jae Li; 29 Jan 2019, 09:31.

      Comment


      • #4
        First, fym_U and inv_U and all variables with _U are from original dataset called "candidates", is that right?
        Yes, that's right. -rangejoin- automatically adds a _U suffix to each variable that is brings into the data set from the -using- data.

        After running -rangejoin-, how does the variable of fym change exactly?
        It doesn't.

        What does the variable of whenlast mean exactly?
        I chose the name whenlast because that is what you used in the trial code you showed in #1. It is the value of the last fiscal year end (fym) among the observations for the current permno which precedes or equals the current value of variable ym.

        Comment


        • #5
          @Clyde Schechter Hi Clyde, thank you for your explanations! So if I understand it right and please do correct me if I am wrong, that is, the major difference between -rangejoin- and other similar commands(e.g., -merge- or -joinby-) is the master dataset can be merged with the using dataset by using -rangejoin- if the keyvar of the using dataset is within a certain range of boundaries in the master dataset, but the other commands are not able to let the using dataset to fit in such range criteria, is that correct?

          Many thanks for your help with my inquiries!

          Comment


          • #6
            Yes, in terms of the end results that are produced, that is correct. Of course, you could also obtain the same results by using -merge- or -joinby- (depending on the -merge-ability of the keys) and then impose the range by then using a -keep if...- command. So one of the real strengths of -rangejoin- is not only does it produce these results, it does it much faster and uses less memory as well. In a small data set the difference will not be noticeable, but in a large one the difference in speed is dramatic.

            Comment

            Working...
            X