Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Extracting year from str6 variable

    Dear stata experts,

    I am trying to create a new variable "fyear" which will extract the year from FQ variable.

    I am using the following code;

    gen fyear=real(substr(FQ,-4,4))
    (465,662 missing values generated)

    But only getting missing values. Please help me.

    I highly appreciate any help.


    This is how my data looks.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long gvkey str6 FQ float DFCC
     10043 "1987q4"  -.04554077
     10043 "1988q1"  -.03759932
     10043 "1988q2" -.034101963
     10043 "1988q3"  -.03365408
     10043 "1988q4"  -.03182709
     10043 "1989q1"  -.03142329
     10043 "1989q2" -.031413224
     10043 "1989q3" -.031401183
     10043 "1989q4" -.031644043
     10043 "1990q1"  -.03064395
     10043 "1990q2"   -.0292534
     10043 "1990q3"  -.03242006
     10043 "1990q4" -.031500693
     10043 "1991q1" -.030841853
     10043 "1991q2" -.034647018
     10043 "1991q2" -.030616906
     10043 "1991q2"  -.03145055
     10043 "1991q2" -.034269977
     10043 "1991q2" -.030243317
     10043 "1991q2"  -.02850146
     10043 "1991q2" -.029622065
     10043 "1991q2"  -.02861151
     10043 "1991q2"   -.0288074
     10043 "1991q2" -.025928466
     10043 "1991q2" -.030616906
     10043 "1991q2"  -.02875322
     10043 "1991q2"  -.02556408
     10043 "1991q2"  -.02544752
     10043 "1991q2" -.032088842
     10043 "1991q2" -.028605396
    105691 "2009q2"  -.08628523
    105691 "2009q3"  -.12604311
    105691 "2009q4"  -.11671578
    105691 "2009q4"  -.11671578
    105691 "2010q1"  -.10291442
    105691 "2010q2"  -.11359666
    105691 "2010q3"  -.11458436
    105691 "2010q4"  -.11394295
    105691 "2010q4"  -.11394295
    105691 "2011q1"  -.11097002
    105691 "2011q2"    -.108717
    105691 "2011q3"  -.10790505
    105691 "2011q4"  -.10566665
    105691 "2011q4"  -.10566665
    105691 "2012q1"  -.10272849
    105691 "2012q2"   -.1071661
    105691 "2012q3"  -.10784652
    105691 "2012q4"  -.09705043
    105691 "2012q4"  -.09697405
    105691 "2013q1"  -.11146338
    105691 "2013q2"  -.12216824
    105691 "2013q3"   -.1186447
    105691 "2013q4"  -.11992262
    105691 "2013q4"  -.12097118
    105691 "2014q1"   -.1182868
    105691 "2014q2"  -.11872981
    105691 "2014q3"  -.11989788
    105691 "2014q4"   -.1224336
    105691 "2014q4"   -.1248684
    105691 "2015q1"   -.1249793
    105691 "2015q2"  -.12670094
    105691 "2015q3"  -.12323944
    105691 "2015q4"  -.12871118
    105691 "2015q4"   -.1229784
    105691 "2016q1"  -.12866525
    105691 "2016q2"  -.11631726
    105691 "2016q3"  -.11187375
    105691 "2016q4"  -.11399497
    105691 "2016q4"  -.11258118
    105691 "2017q1"    -.110648
    105691 "2017q2"  -.11038414
    105691 "2017q3"  -.11028598
    105691 "2017q4"  -.10893972
    105691 "2017q4"  -.11012165
    105691 "2018q1"  -.10689934
    105691 "2018q2"  -.10302497
    105691 "2018q3"  -.10005387
    105691 "2018q4"  -.09747024
    105691 "2019q1"  -.09689133
    105691 "2019q2"  -.09382968
    105691 "2019q3"  -.10240284
    105691 "2019q4"   -.0991138
    105691 "2020q1"  -.09569377
    105691 "2020q2"   -.0881667
    105691 "2020q3"  -.09336258
    105691 "2020q4"  -.09517936
    105691 "2021q1"  -.09456224
    105691 "2021q2"  -.09559565
    105691 "2021q3"   -.0949631
    105691 "2021q4"  -.09434204
    105691 "2022q1"  -.09460559
    105691 "2022q2"  -.09165271
    105691 "2022q2"   -.1004464
    105691 "2022q2"  -.09449553
    105691 "2022q2"   -.0984618
    105691 "2022q2"  -.10130215
    105691 "2022q2"  -.10588802
    106716 "2000q2"  -.02625747
    106716 "2000q3"  -.04297526
    106716 "2000q4"  -.04441883
    end


  • #2
    I recommend:
    Code:
    gen year = real(substr(FQ, 1, 4))
    Actually, having the fiscal quarter as a string variable is also not very useful. So what I would almost surely do were I in your shoes is first convert FQ to a Stata internal format quarterly date variable, and then extract the date from that.
    Code:
    gen fiscal_quarter = quarterly(FQ, "YQ")
    assert missing(fiscal_quarter) == missing(FQ)
    format fiscal_quarter %tq
    gen fiscal_year = yofd(dofq(fiscal_quarter))
    This way both fiscal_year and fiscal_quarter will be suitable for date calculations.

    Comment


    • #3
      Dear Clyde Schechter,

      Thank you so much for the useful response.

      The only thing is I am getting this error message

      assert missing(fiscal_quarter) == missing(FQ)
      50 contradictions in 465,662 observations
      assertion is false
      r(9);


      It would be kind if you please help in this regard.

      Best regards,
      Ismat

      Comment


      • #4
        You have 50 problematic observations. So, no harm in looking at them

        Code:
        list fiscal_quarter FQ if missing(fiscal_quarter) != missing(FQ)
        
        tab fiscal_quarter FQ if missing(fiscal_quarter) != missing(FQ), missing

        Comment


        • #5
          Dear Nick Cox,

          Thank you so much for the suggestion.

          Best regards
          ,
          Ismat

          Comment

          Working...
          X