Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Converting string variable into month, quarter and year

    Hello,

    Can you help me with the following issues:

    I have a dataset with a variable called "date" that has monthy values for several years by country (the dataset has 71 countries). The aforementioned "date" is a string variable in the following format: 1984_m1, 1984_m2,...1984_m12 (per country).

    I would like to create three variables: one variable only for values of the month, other with the mean values of the corresponding quarter and other with the mean values of the corresponding year. I will call then: "month", "quarter" and "year". I would like it to appear for the "month" the corresponding number of the month: 1 2 3 4 5...12; and for the "quarter" the corresponding numbers of the quarter: 1 2 3 4
    I tried to use the following command but they are not working as I want:

    gen date1=monthly(date,"YM")
    gen date2 = dofm(date1)
    format date2 %d

    (until this step the result of the convertion of "1984_m1" is something like "01jan1984"..in this case I would like it to appear the last day of the month: something like "31jan1984")

    (the other steps don´t give me the desired results):

    gen month=month(date2)
    gen yr=year(date2)
    gen quarter=quarter(date2)

    Can you help me with the necessary commands to achieve the results that I want?
    Thank you in advance.

  • #2
    The way to get a daily date variable to show the last day of the month is to calculate dofm() of the next month and then subtract 1 day. In the code below, the variable date3 does this.

    As for your code for variables month, yr, and quarter, I can see nothing wrong with it, and the results in the example data shown below are correct. So, as you did not describe what the difficulty actually is, and I find correct results, I can't advise you on that.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 date
    "1981_m1" 
    "1984_m2" 
    "1984_m12"
    end
    
    gen date1=monthly(date,"YM")
    format date1 %tm
    gen date2 = dofm(date1)
    format date2 %d
    
    gen date3 = dofm(date1+1)-1
    format date3 %td
    
    gen month=month(date2)
    gen yr=year(date2)
    gen quarter=quarter(date2)
    
    list, noobs clean
    In the future, when asking for help with code, post an example data set, and use the -dataex- command to do that. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      Ok, thank you. Next time i will put an example.
      The problem with the codes is that when i format the quarter and year in %tq and %ty ...in appear years and quarters since 1960..in this last case the year is not correct and the quarter also changes and it does not match the real quarter.

      Comment


      • #4
        But you said you want the months to be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12, and the quarters to be 1, 2, 3, 4. That's what they are. Don't apply any display format to them.

        If you apply the %tm format to the month variable, then the number 1 is interpreted as 1 month after January 1960, so it reads as February 1960. Similarly, if you apply %tq format to the quarter variable, then 3 becomes 3 quarters after 1960q1, which is 1960q4.

        You said that you aren't looking to create a variable that shows the quarter and year: you just want the quarter, 1, 2, 3, or 4. The code you used gives you that--but by applying formats to them you made it appear to be something else.

        Last edited by Clyde Schechter; 02 May 2019, 19:56.

        Comment


        • #5
          In short,

          monthly date = months before or after January 1960; month of year = 1(1)12.

          quarterly date = quarters before or after 1960q1; quarter of year = 1(1)4.

          Comment


          • #6
            Hello I have a similar problem I tried doing
            gen date1=monthly(date,"YM"), but it is not working. Therefore I wanted to ask if you could also help me out: my variable for month is in the following format: 1901m12, 1902m1 and so on. I want to separate this and create a monthly variable where January=1, February=2 and so on. What is the command that I could use

            . My data on the dataex command is as follows


            11 GID_3 int month str25 NAME_3 double SPEI_12 float SPEI_tw "PAK.1.1.1_1" -697 "Bagh" -.3789849877357483 5 "PAK.1.1.1_1" -696 "Bagh" -.9119960069656372 1 "PAK.1.1.1_1" -695 "Bagh" -1.2728400230407715 2 "PAK.1.1.1_1" -694 "Bagh" -1.2467800378799438 2 "PAK.1.1.1_1" -693 "Bagh" -1.2774699926376343 2 "PAK.1.1.1_1" -692 "Bagh" -1.6748000383377075 3 "PAK.1.1.1_1" -691 "Bagh" -1.5109399557113647 3 "PAK.1.1.1_1" -690 "Bagh" -1.586359977722168 3 "PAK.1.1.1_1" -689 "Bagh" -1.460710048675537 2 "PAK.1.1.1_1" -688 "Bagh" -1.7392499446868896 3 "PAK.1.1.1_1" -687 "Bagh" -1.638069987297058 3 "PAK.1.1.1_1" -686 "Bagh" -1.5005600452423096 3 "PAK.1.1.1_1" -685 "Bagh" -1.5524200201034546 3 "PAK.1.1.1_1" -684 "Bagh" -1.058109998703003 2 "PAK.1.1.1_1" -683 "Bagh" -1.0765899419784546 2 "PAK.1.1.1_1" -682 "Bagh" -.7318099737167358 1 "PAK.1.1.1_1" -681 "Bagh" -.7395870089530945 1 "PAK.1.1.1_1" -680 "Bagh" -.5689499974250794 1 "PAK.1.1.1_1" -679 "Bagh" -1.0022900104522705 2 "PAK.1.1.1_1" -678 "Bagh" .253574013710022 5 "PAK.1.1.1_1" -677 "Bagh" .8018379807472229 6 "PAK.1.1.1_1" -676 "Bagh" 1.3044999837875366 7 "PAK.1.1.1_1" -675 "Bagh" 1.2515900135040283 7 "PAK.1.1.1_1" -674 "Bagh" 1.1314300298690796 7 "PAK.1.1.1_1" -673 "Bagh" 1.3179999589920044 7 "PAK.1.1.1_1" -672 "Bagh" 1.6106499433517456 8 "PAK.1.1.1_1" -671 "Bagh" 1.5905200242996216 8 "PAK.1.1.1_1" -670 "Bagh" 1.649090051651001 8 "PAK.1.1.1_1" -669 "Bagh" 1.5620700120925903 8 "PAK.1.1.1_1" -668 "Bagh" 1.4635900259017944 7 "PAK.1.1.1_1" -667 "Bagh" 1.5038299560546875 8 "PAK.1.1.1_1" -666 "Bagh" .6224769949913025 6 "PAK.1.1.1_1" -665 "Bagh" .719307005405426 6 "PAK.1.1.1_1" -664 "Bagh" .23469200730323792 5 "PAK.1.1.1_1" -663 "Bagh" .4397610127925873 5 "PAK.1.1.1_1" -662 "Bagh" .4874730110168457 5 "PAK.1.1.1_1" -661 "Bagh" .5284649729728699 6 "PAK.1.1.1_1" -660 "Bagh" .3280450105667114 5 "PAK.1.1.1_1" -659 "Bagh" .6768720149993897 6

            Comment


            • #7
              #6 is already answered in #2, #4 and #5. If you want to extract month of the year from a monthly date you don't use monthly().

              I can't see a readable data example in #6, but this shows the principles:

              Code:
              . clear
              
              . di ym(2019, 5)
              712
              
              . di %tm ym(2019, 5)
               2019m5
              
              . di month(dofm(ym(2019, 5)))
              5
              
              . set obs 12
              number of observations (_N) was 0, now 12
              
              . gen mdate = ym(2019, _n)
              
              . format mdate %tm
              
              . list
              
                   +---------+
                   |   mdate |
                   |---------|
                1. |  2019m1 |
                2. |  2019m2 |
                3. |  2019m3 |
                4. |  2019m4 |
                5. |  2019m5 |
                   |---------|
                6. |  2019m6 |
                7. |  2019m7 |
                8. |  2019m8 |
                9. |  2019m9 |
               10. | 2019m10 |
                   |---------|
               11. | 2019m11 |
               12. | 2019m12 |
                   +---------+
              
              . gen month = month(dofm(mdate))
              
              . l
              
                   +-----------------+
                   |   mdate   month |
                   |-----------------|
                1. |  2019m1       1 |
                2. |  2019m2       2 |
                3. |  2019m3       3 |
                4. |  2019m4       4 |
                5. |  2019m5       5 |
                   |-----------------|
                6. |  2019m6       6 |
                7. |  2019m7       7 |
                8. |  2019m8       8 |
                9. |  2019m9       9 |
               10. | 2019m10      10 |
                   |-----------------|
               11. | 2019m11      11 |
               12. | 2019m12      12 |
                   +-----------------+
              
              . gen month2 = 1 + mod(mdate, 12)
              
              . l
              
                   +--------------------------+
                   |   mdate   month   month2 |
                   |--------------------------|
                1. |  2019m1       1        1 |
                2. |  2019m2       2        2 |
                3. |  2019m3       3        3 |
                4. |  2019m4       4        4 |
                5. |  2019m5       5        5 |
                   |--------------------------|
                6. |  2019m6       6        6 |
                7. |  2019m7       7        7 |
                8. |  2019m8       8        8 |
                9. |  2019m9       9        9 |
               10. | 2019m10      10       10 |
                   |--------------------------|
               11. | 2019m11      11       11 |
               12. | 2019m12      12       12 |
                   +--------------------------+
              In short you have a monthly date variable say mdate then month of the year 1(1)12 is obtainable by

              Code:
              gen month = month(dofm(mdate))
              or by

              Code:
              gen month2 = 1 + mod(mdate, 12)

              Comment


              • #8
                Hello,

                The problem is that I need to calculate the lagged variables using the variable "quarter". Because it is a number...stata doesn´t allow me to do calculate it.
                I try to use "tsset country quarter"..and it appears an error saying "repeated time values within panel".
                Can you help me with the correct command?
                Thank you in advance.

                Comment


                • #9
                  You can in your case only tsset in terms of an identifier and a monthly date variable. Recall that in #3 you promised to post a data example, so please do that.

                  I don't understand precisely what lagged variables you want to define but you can only do it in terms of the dates you have.

                  Comment


                  • #10

                    copy starting from the next line ------ ----------------
                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input str17 country long country_ float(yr quarter log_claims)
                    "ALBANIA" 1 1991 1 -11.422366
                    "ALBANIA" 1 2008 1 -11.329387
                    "ALBANIA" 1 2007 1   8.510169
                    "ALBANIA" 1 2002 1   9.920148
                    "ALBANIA" 1 1988 1  10.381707
                    "ALBANIA" 1 2003 1  -10.69639
                    "ALBANIA" 1 2013 1   7.386471
                    "ALBANIA" 1 1996 1  -9.377295
                    "ALBANIA" 1 2006 1  -13.01515
                    "ALBANIA" 1 1998 1   7.337588
                    "ALBANIA" 1 1995 1   6.526495
                    "ALBANIA" 1 1999 1   12.24839
                    "ALBANIA" 1 1994 1  -9.808243
                    "ALBANIA" 1 2011 1   8.326275
                    "ALBANIA" 1 1997 1   9.186253
                    "ALBANIA" 1 2005 1  -9.779114
                    "ALBANIA" 1 1989 1   11.77078
                    "ALBANIA" 1 1993 1   5.379897
                    "ALBANIA" 1 1992 1  -9.456654
                    "ALBANIA" 1 2000 1   7.280697
                    "ALBANIA" 1 2001 1  -9.935326
                    "ALBANIA" 1 2009 1 -11.979843
                    "ALBANIA" 1 1986 1   9.994014
                    "ALBANIA" 1 2010 1  11.195127
                    "ALBANIA" 1 2012 1   6.300786
                    "ALBANIA" 1 2004 1  11.780338
                    "ALBANIA" 1 1990 1 -11.838799
                    "ALBANIA" 1 1987 1  -8.280204
                    "ALBANIA" 1 2009 2  11.855351
                    "ALBANIA" 1 2005 2  -7.603898
                    "ALBANIA" 1 1990 2  11.120846
                    "ALBANIA" 1 1986 2  -8.044626
                    "ALBANIA" 1 2006 2 -12.903598
                    "ALBANIA" 1 1988 2  11.099514
                    "ALBANIA" 1 2008 2  11.968335
                    "ALBANIA" 1 1996 2   5.758902
                    "ALBANIA" 1 1993 2  -7.977282
                    "ALBANIA" 1 1989 2  10.774237
                    "ALBANIA" 1 1987 2   7.620215
                    "ALBANIA" 1 1998 2   6.447306
                    "ALBANIA" 1 1991 2 -10.993546
                    "ALBANIA" 1 1999 2 -12.152007
                    "ALBANIA" 1 1992 2   8.863333
                    "ALBANIA" 1 2010 2 -12.412668
                    "ALBANIA" 1 1994 2  -7.220374
                    "ALBANIA" 1 2013 2  11.280224
                    "ALBANIA" 1 2012 2 -12.623806
                    "ALBANIA" 1 2001 2   8.533854
                    "ALBANIA" 1 2004 2   9.754059
                    "ALBANIA" 1 2007 2  12.474793
                    "ALBANIA" 1 1995 2  -8.263848
                    "ALBANIA" 1 2003 2   9.494692
                    "ALBANIA" 1 2000 2  -6.961296
                    "ALBANIA" 1 1997 2  -9.730383
                    "ALBANIA" 1 2011 2  12.097195
                    "ALBANIA" 1 2002 2  -9.267855
                    "ALBANIA" 1 2004 3  -9.202812
                    "ALBANIA" 1 1990 3  12.500401
                    "ALBANIA" 1 1987 3  -8.009363
                    "ALBANIA" 1 1998 3   8.581669
                    "ALBANIA" 1 2005 3  13.018928
                    "ALBANIA" 1 1992 3   8.859079
                    "ALBANIA" 1 2013 3  11.941964
                    "ALBANIA" 1 2002 3  -9.778378
                    "ALBANIA" 1 1988 3   9.757826
                    "ALBANIA" 1 2006 3   8.559678
                    "ALBANIA" 1 2003 3   6.001415
                    "ALBANIA" 1 1989 3  11.323386
                    "ALBANIA" 1 2007 3  10.218152
                    "ALBANIA" 1 2010 3  11.180567
                    "ALBANIA" 1 1999 3  -9.956412
                    "ALBANIA" 1 1996 3  11.304498
                    "ALBANIA" 1 2011 3 -10.686955
                    "ALBANIA" 1 1991 3   8.974238
                    "ALBANIA" 1 2012 3    9.26691
                    "ALBANIA" 1 2000 3   9.273033
                    "ALBANIA" 1 1997 3  -7.920446
                    "ALBANIA" 1 2009 3  12.659524
                    "ALBANIA" 1 2008 3     9.7949
                    "ALBANIA" 1 1995 3 -12.170663
                    "ALBANIA" 1 1993 3  10.021404
                    "ALBANIA" 1 1994 3 -10.468688
                    "ALBANIA" 1 1986 3  -10.00211
                    "ALBANIA" 1 2001 3  -7.087574
                    "ALBANIA" 1 2004 4   -5.31812
                    "ALBANIA" 1 1985 4  -6.985642
                    "ALBANIA" 1 1987 4  11.247448
                    "ALBANIA" 1 1992 4   9.862874
                    "ALBANIA" 1 2010 4 -10.715816
                    "ALBANIA" 1 2009 4  11.508797
                    "ALBANIA" 1 2000 4  -9.267193
                    "ALBANIA" 1 2006 4  10.266115
                    "ALBANIA" 1 1990 4 -12.313065
                    "ALBANIA" 1 1986 4  10.566613
                    "ALBANIA" 1 1989 4   8.366603
                    "ALBANIA" 1 1995 4 -11.190155
                    "ALBANIA" 1 1988 4 -11.200171
                    "ALBANIA" 1 2003 4  -8.794673
                    "ALBANIA" 1 2012 4 -10.344255
                    "ALBANIA" 1 1994 4   9.175231
                    end
                    label values country_ country_
                    label def country_ 1 "ALBANIA", modify

                    Comment


                    • #11
                      Hello,

                      I use the "dataex" command and it resulted like this...I would like to calculate the lagged (quarter) variables of "log_claims". Can you help me with this?
                      Thank you in advance.

                      Comment


                      • #12
                        I can only conclude that you are now talking about a completely different dataset. How did you expect us to know that from #8? From what is posted I guess that you need

                        Code:
                        gen qdate = yq(yr, quarter) 
                        format qdate %tq 
                        tsset country_ qdate
                        after which lags will refer to quarterly dates.

                        Comment


                        • #13
                          Hello,

                          Now it worked! Thank you for your help.

                          Best regards

                          Comment

                          Working...
                          X