Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Dividing series of variables through looping

    Hi, I have variables Jan_13 to Mar_21, referring to months January 2013 to March 2021. I need to create new variables containing the results of current month/previous month. Ex. Feb_13_PR = Feb_13/Jan_13.

    How can I automate this computation for all months through looping? Thank you.

  • #2
    You should reshape long as working rowwise with such data is just going to be a small nightmare.

    Comment


    • #3
      Originally posted by Nick Cox View Post
      You should reshape long as working rowwise with such data is just going to be a small nightmare.
      I have 40,464 observations. So, I'm not sure if that is what I need.

      Comment


      • #4
        This is my formula if I will do it manually.

        gen PR_jan_13 = jan_13/dec_12
        gen PR_feb_13 = feb_13/jan_13
        gen PR_mar_13 = mar_13/feb_13
        gen PR_apr_13 = apr_13/mar_13
        gen PR_may_13 = may_13/apr_13
        gen PR_jun_13 = jun_13/may_13
        gen PR_jul_13 = jul_13/jun_13
        gen PR_aug_13 = aug_13/jul_13
        gen PR_sep_13 = sep_13/aug_13
        gen PR_oct_13 = oct_13/sep_13
        gen PR_nov_13 = nov_13/oct_13
        gen PR_dec_13 = dec_13/nov_13

        But I have to do this for all the months of years 2013 to 2021, that is why it will be so tedious if I will do it this way.

        Comment


        • #5
          We agree: what you imagine is a nightmare.

          No data example here, but your variable names help.


          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input float(id PR_jan_13 PR_feb_13 PR_mar_13 PR_apr_13 PR_may_13 PR_jun_13 PR_jul_13 PR_aug_13 PR_sep_13 PR_oct_13 PR_nov_13 PR_dec_13)
          1 42 43 44 45 46 47 48 49 50 51 52 53
          end
          
          reshape long PR_ , i(id) j(when) string
          . gen mdate = monthly(substr(when, 1, 3) + "20" + substr(when, -2, 2), "MY")
          
          format mdate %tm
          
          xtset id mdate
                 panel variable:  id (strongly balanced)
                  time variable:  mdate, 2013m1 to 2013m12
                          delta:  1 month
          
          list 
          
               +-----------------------------+
               | id     when   PR_     mdate |
               |-----------------------------|
            1. |  1   jan_13    42    2013m1 |
            2. |  1   feb_13    43    2013m2 |
            3. |  1   mar_13    44    2013m3 |
            4. |  1   apr_13    45    2013m4 |
            5. |  1   may_13    46    2013m5 |
               |-----------------------------|
            6. |  1   jun_13    47    2013m6 |
            7. |  1   jul_13    48    2013m7 |
            8. |  1   aug_13    49    2013m8 |
            9. |  1   sep_13    50    2013m9 |
           10. |  1   oct_13    51   2013m10 |
               |-----------------------------|
           11. |  1   nov_13    52   2013m11 |
           12. |  1   dec_13    53   2013m12 |
               +-----------------------------+
          Now instead of several variables, you need only one


          Code:
          gen wanted = PR_/L.PR_
          What could be different?

          1. You have several stubs, not just PR_ but XY_ AB_ or whatever. Fine: include the other stubs in the reshape

          2. Your identifier is a string variable. Use encode to get a numeric equivalent before calling up xtset.

          3. You tell us.

          Comment


          • #6
            I think my dataset is really complicated for a reshape.

            To give you an idea, I have 40,646 observations, which is composed of 81 provinces with around 200 to 700 commodities (products) each.

            Then I have 140 variables, which is composed of the province code, commodity code, commodity description, and the prices of each commodity for the months January 2012 to March 2021.

            If I'll do a reshape and convert the prices for January 2012 to March 2021 as my observations, I'll have to identify it by province and by commodity. This is why I'm working columnwise.
            Last edited by Rachel Chan; 28 Apr 2021, 04:29.

            Comment


            • #7
              Your choice. You have 99 months, so a reshape long would get you about 4 million observations but correspondingly fewer variables. Your way just involves creating just about a hundred new variables every time you do something similar.

              What is your strategy for analysing such variables? Good luck in graphing them. If you don't have a strategy for analysis, you're making a bad situation worse.

              If presented with your dataset, I wouldn't dream of working with it as it is.

              I doubt any experienced users of Stata would advise differently. Ask around!

              Comment


              • #8
                Hi Sir Nick, I tried your suggestion and it worked fine so far. I didn't get the desired output right away, that's why I thought it will not work for my dataset. I have other steps to work on, I hope you could still help me in the future. Thanks again.

                Comment


                • #9
                  Hi, now that I already generated my desired variable (PR) by computing the price of each commodity in the current month over the previous month, my next step is to replace the contents of PR if the year is 2012. Instead of current month/previous month, for the months of year 2012, PR should be current month/average of 2012. I was able to do it using the codes below:

                  gen yr = yofd(dofm(mdate))
                  by panel: egen AvePr_2012 = mean(Price) if yr==2012
                  replace PR = Price/AvePr_2012 if yr==2012

                  I just wonder if there's a way which I do not have to create the variable "AvePR_2012". Thanks.

                  Comment


                  • #10
                    See https://www.stata-journal.com/articl...article=dm0055 for a survey of technique.


                    Something like this, may be closer to what you need:

                    Code:
                    by panel: egen AvePr_2012 = mean(cond(inrange(mdate, ym(2012, 1), ym(2012,12)), Price, .)) 
                    replace PR = Price/AvePr_2012

                    Comment

                    Working...
                    X