
  • New variable to count months

    Hi, I have 12 observations per year, one for each month, but all with the same date. I would like to create a new variable called month that assigns the number of the month to each observation. This is what my dataset looks like:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long gvkey int year str10 datadate float lvratio
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 1999 "31/12/1999" .5424352
    246759 2000 "31/12/2000" .7837051
    246759 2000 "31/12/2000" .7837051
    246759 2000 "31/12/2000" .7837051
    246759 2000 "31/12/2000" .7837051
    246759 2000 "31/12/2000" .7837051
    246759 2000 "31/12/2000" .7837051
    246759 2000 "31/12/2000" .7837051
    246759 2000 "31/12/2000" .7837051
    end
    And I would like something like this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long gvkey int year str10 datadate float lvratio month
    246759 1999 "31/12/1999" .5424352  1
    246759 1999 "31/12/1999" .5424352  2
    246759 1999 "31/12/1999" .5424352  3
    246759 1999 "31/12/1999" .5424352  4
    246759 1999 "31/12/1999" .5424352  5
    246759 1999 "31/12/1999" .5424352  6
    246759 1999 "31/12/1999" .5424352  7
    246759 1999 "31/12/1999" .5424352  8
    246759 1999 "31/12/1999" .5424352  9
    246759 1999 "31/12/1999" .5424352  10
    246759 1999 "31/12/1999" .5424352  11
    246759 1999 "31/12/1999" .5424352  12
    246759 2000 "31/12/2000" .7837051  1
    246759 2000 "31/12/2000" .7837051  2
    246759 2000 "31/12/2000" .7837051  3
    246759 2000 "31/12/2000" .7837051  4
    246759 2000 "31/12/2000" .7837051  5
    246759 2000 "31/12/2000" .7837051  6
    246759 2000 "31/12/2000" .7837051  7
    246759 2000 "31/12/2000" .7837051  8
    end
    What is an easy way to do this?

    Thanks

  • #2

    Code:
    * see the fine help 
    help datetime 
    
    gen month = month(daily(datadate, "DMY"))

    Proof of concept

    Code:
    . di month(daily("31/12/2000", "DMY"))
    12



    • #3
      Hi Nick, this generates the month variable, but it assigns the number 12 to every observation, because all the dates are in December. Because I merged a yearly and a monthly dataset, I ended up with 12 observations with the same date, one for each month of the year. What I want is what I showed in the second code block (where I added month manually), so that the variable month is a running count of observations within each gvkey and year.
      Last edited by Juan Gonzalex; 27 May 2022, 07:31.



      • #4
        I found the solution:

        Code:
        by gvkey year: gen byte month = _n



        • #5
          You did ask for that.

          That solution in #4 is very dangerous, as you can't be sure that such a sort will preserve your desired order of months.
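
          One way to guard against this can be sketched as follows. It assumes that the current row order already reflects the intended month order; the variable name obsno is hypothetical and is not part of the original data.

          ```stata
          * Assumption: rows are currently in the intended month order.
          * Record that order before any sort can disturb it.
          gen long obsno = _n

          * Sort by gvkey and year, breaking ties on the recorded order,
          * then number the observations within each gvkey-year group.
          bysort gvkey year (obsno): gen byte month = _n
          ```

          Putting obsno in parentheses makes the within-group order explicit, so the numbering does not depend on how ties happen to be resolved by the sort.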

          On the other hand, your data example includes 12 duplicates for every year, which raises the question of why you are holding data in this way.

          Perhaps what you are doing here is expanding a yearly dataset in preparation for a merge with a monthly dataset, but merge will do that automatically for you.
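
          A minimal sketch of such a merge, assuming hypothetical file names: monthly.dta has one row per gvkey, year and month, and yearly.dta has one row per gvkey and year.

          ```stata
          * Hypothetical file names; adjust to the actual datasets.
          use monthly, clear

          * Many monthly rows match one yearly row on gvkey and year,
          * so the yearly values are copied to every matching month.
          merge m:1 gvkey year using yearly

          * Inspect how many observations matched.
          tab _merge
          ```

          With this approach the month variable comes from the monthly dataset itself, so no numbering of duplicates is needed.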



          • #6
            Hi, I already have the merged dataset; that's why I have 12 duplicates for every year. Everything seems to work fine.
