Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Adjust for inflation using CPI index

    I'd like to add a list of CPI index values to my panel dataset and use them to adjust the values of certain variables for inflation.

    1) I obtained annual cpi data (index - below) (www.abs.gov.au) and have imported to a dta file. I'd like to adjust a number of variables in my panel data set (in long format) for waves 1 (2001) to wave 18 (2018), I understand I only need to append the second column (the actual index values), but I'm not sure how to do so that the value for 01jun2001 (73.6) aligns with wave 1, etc.
    Code:
    Period    INDEXANNUAL
    01jun2001    73.6
    01jun2002    75.7
    01jun2003    77.975
    01jun2004    79.85
    01jun2005    81.775
    01jun2006    84.4
    01jun2007    86.9
    01jun2008    89.825
    01jun2009    92.625
    01jun2010    94.775
    01jun2011    97.725
    01jun2012    99.975
    01jun2013    102.25
    01jun2014    105.025
    01jun2015    106.825
    01jun2016    108.3
    01jun2017    110.15
    01jun2018    112.275
    01jun2019    114.125
    01jun2020    115.65
    2) As I have a few variables that need to be adjusted for infliation, would it be best to add these to a varlist and apply the adjustment using a loop? And if so, what could this look like? I'm not sure if will affect the code, but for some variables, I only have data in wave 2, 6, 10, 14, 18.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float couple byte wave float(bank super prop)
    3131 2 174510  104000  320000
    2123 2   9000  330000  450000
    4066 2   3750   75000  600000
    1733 2   2220   24500  180000
    7063 2   3660   37000 1050000
    8573 2   4426  405199  280000
    5780 2  22750   99520  250000
    1361 2     30  134000  250000
      42 2   3500   46500  195000
    5478 2  13169  485800  400000
    8999 2   4500       0  520000
    1869 2 180000    5001  480000
    3038 2    700   44000  260000
    8155 2   9200    2200   95000
    5586 2   7500  252000  250000
    4253 2   1920   50000  169706
    1290 2   4647   93000  298000
    6004 2    800   48000  140000
     564 2   4925   22100  230000
    1807 2    226   28000  185000
    1678 2   7000 1361787  400000
    end
    ​​​​​​
    Last edited by Chris Boulis; 18 Jul 2021, 05:17.

  • #2
    I don't see the need for append here. It looks like you'll just want to create a year variable for each dataset and merge on that and then deflate.

    Code:
    clear
    input str9 Period str7 INDEXANNUAL
    "01jun2001" "73.6"  
    "01jun2002" "75.7"  
    "01jun2003" "77.975"
    "01jun2004" "79.85"  
    "01jun2005" "81.775"
    "01jun2006" "84.4"  
    "01jun2007" "86.9"  
    "01jun2008" "89.825"
    "01jun2009" "92.625"
    "01jun2010" "94.775"
    "01jun2011" "97.725"
    "01jun2012" "99.975"
    "01jun2013" "102.25"
    "01jun2014" "105.025"
    "01jun2015" "106.825"
    "01jun2016" "108.3"  
    "01jun2017" "110.15"
    "01jun2018" "112.275"
    "01jun2019" "114.125"
    "01jun2020" "115.65"
    end
    
    gen year = yofd(date(Period, "DMY"))
    
    tempfile cpi
    save `cpi'
    
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float couple byte wave float(bank super prop)
    3131 2 174510  104000  320000
    2123 2   9000  330000  450000
    4066 2   3750   75000  600000
    1733 2   2220   24500  180000
    7063 2   3660   37000 1050000
    8573 2   4426  405199  280000
    5780 2  22750   99520  250000
    1361 2     30  134000  250000
      42 2   3500   46500  195000
    5478 2  13169  485800  400000
    8999 2   4500       0  520000
    1869 2 180000    5001  480000
    3038 2    700   44000  260000
    8155 2   9200    2200   95000
    5586 2   7500  252000  250000
    4253 2   1920   50000  169706
    1290 2   4647   93000  298000
    6004 2    800   48000  140000
     564 2   4925   22100  230000
    1807 2    226   28000  185000
    1678 2   7000 1361787  400000
    end
    
    
    gen year = 2000 + wave
    merge m:1 year using `cpi', keep(matched) assert(matched using) nogen

    Comment


    • #3
      Thank you Justin Niakamal - the year variables merge was successful:
      Code:
      clear
      input str9 Period str7 cpi_index // changed from INDEXANNUAL to cpi_index
      "01jun2001" "73.6"  
      "01jun2002" "75.7"  
      "01jun2003" "77.975"
      "01jun2004" "79.85"  
      "01jun2005" "81.775"
      "01jun2006" "84.4"  
      "01jun2007" "86.9"  
      "01jun2008" "89.825"
      "01jun2009" "92.625"
      "01jun2010" "94.775"
      "01jun2011" "97.725"
      "01jun2012" "99.975"
      "01jun2013" "102.25"
      "01jun2014" "105.025"
      "01jun2015" "106.825"
      "01jun2016" "108.3"  
      "01jun2017" "110.15"
      "01jun2018" "112.275"
      "01jun2019" "114.125"
      "01jun2020" "115.65"
      end
      
      gen year = yofd(date(Period, "DMY"))
      tempfile cpi
      save `cpi'
      
      use "C:/data/fin.dta", clear   
      gen year = 2000 + wave
      merge m:1 year using `cpi', keep(matched) assert(matched using) nogen
      
          Result                           # of obs.
          -----------------------------------------
          not matched                             0
          matched                            92,984  
          -----------------------------------------
      In terms of (2) in #1, my first stab at the code to deflate the respective values received the message "invalid syntax". Suggestions appreciated.
      Code:
      local list bank super equity prop // there are >10 variables so a list would make this easier
      foreach v of list {
      bysort year (wave): egen `v'_r = `v'/cpi_index
      }
      Stata v.15.1. Panel data.

      Comment


      • #4
        UPDATE: Code updated to address error in deflation calculation - Stata still advises "invalid syntax".
        Code:
        local list bank super equity prop
        foreach v of list {
        bysort year (wave): egen `v'_r = `v'/(cpi_index/100)
        }

        Comment


        • #5
          You have misspecified the foreach declaration.

          Comment


          • #6
            Thanks Leonardo Guizzetti for picking up that I mispecifyied the foreach statement.
            Code:
            local list asset fin bank super prop
            foreach v of local list {
            bysort year (wave): egen `v'_r = `v'/(cpi_index/100)
            There appears to be an error in my last line as Stata displayed
            Code:
            unknown egen function asset/()
            r(133);
            I thought that as I have panel data, I should sort by "wave", but is this an issue given I am sorting on year to pick up my cpi_index (as in #2)? That said, I tested the code without wave and was given the same message. I then tried "bys couple (year)" (couple_id) and received the same message.
            Help appreciated.
            Last edited by Chris Boulis; 19 Jul 2021, 02:30.

            Comment


            • #7
              You meant to use gen but typed egen instead. Try it with gen, or explain why you feel you need need to use egen.

              Comment


              • #8
                Hi William Lisowski. I didn't realise I should use 'gen' instead of 'egen' thank you. As the values I want to deflate are at the household level, do you think it makes sense to sort on couple id (couple) in the bysort command (e.g. bysort couple (year): gen `v' = `v'/(cpi_index/100), I then obtained the message
                Code:
                . type mismatch
                r(109);
                I finally realised the cpi index values were a string so I made them numeric using "destring" and the code ran. After inputting the cpi data as in #3, my code is now:
                Code:
                destring cpi_index, replace
                gen year = yofd(date(Period, "DMY"))
                tempfile cpi
                save `cpi'
                
                use "`savingdir'/`filename'", clear
                gen year = 2000 + wave
                merge m:1 year using `cpi', keep(matched) assert(matched using) nogen
                save "`savingdir'/`filename'", replace
                  
                gen cpi = cpi_index/100   // I thought it would be tidier to generate this calculation separately
                  
                local list asset fin bank super prop
                foreach v of local list {
                bysort couple (year): gen `v'r = `v'/cpi  
                }
                I just want to know if I have done this correctly. It would make sense that the deflated values are all higher than the non-deflated values for all years up to 2012 (when the index == 100 (rounded)) and all years since, the deflated values would be less than the inflated values - does that sound right? I've attached a sample:
                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input float couple byte wave float(bank bankr super superr prop propr)
                4066 2  3750  4953.765  75000   99075.3  600000  792602.4
                7643 2   509   672.391 180000  237780.7  110000 145310.44
                4634 2  2706 3574.6365 241227  318661.8  500000  660501.9
                2886 2   935 1235.1387  20850  27542.93  150000  198150.6
                4986 2  2000  2642.008  80000 105680.31   90000 118890.35
                7554 2  6750  8916.776  29800  39365.92  315000  416116.2
                2476 2  8000  10568.03  18460  24385.73  115000 151915.45
                1195 2  2063  2725.231  15000  19815.06  270000  356671.1
                8108 2 45000  59445.18 480000  634081.9  660000  871862.6
                 340 2  3025  3996.037  24000 31704.094  120000 158520.47
                2131 2 58950  77873.18     15  19.81506 1000000   1321004
                6648 2  1700 2245.7065  21000  27741.08  170000 224570.67
                 775 2 36600  48348.74  31900  42140.02  235000  310435.9
                8539 2     0         0   2300  3038.309  170000 224570.67
                2127 2   250   330.251   8901 11758.256  250000    330251
                6878 2  3790  5006.605  25200 33289.297   12561  16593.13
                5780 2 22750  30052.84  99520 131466.31  250000    330251
                4860 2  5300  7001.321  20000  26420.08  150000  198150.6
                9127 2  2756  3640.687  83949 110896.96  360000  475561.4
                1486 2 25125 33190.223  25500   33685.6  160000 211360.63
                 564 2  4925  6505.944  22100 29194.186  230000  303830.9
                5586 2  7500  9907.529 252000    332893  250000    330251
                5093 2  8555 11301.188  75000   99075.3  200000 264200.78
                6178 2  5500  7265.521  83000 109643.33  250000    330251
                7063 2  3660 4834.8745  37000  48877.14 1050000   1387054
                6865 2 22000 29062.086 360000  475561.4  650000  858652.6
                2190 2 27504  36332.89 230050 303896.94 1110000 1466314.4
                end
                Update: In a total sense, I find the deflated totals are higher than the inflated totals, and this may be due to there being 12 years of values which after dividing by the index leads to higher values, and only six years of values in which deflating them leads to lower values.
                Last edited by Chris Boulis; 20 Jul 2021, 07:10.

                Comment


                • #9
                  I didn't realise I should use 'gen' instead of 'egen'
                  When you receive the error message
                  Code:
                  unknown egen function asset/()
                  r(133);
                  it suggests looking at the output of help egen as a starting point to finding out what the problem is, and that will show you that the syntax you used is not acceptable for the egen command, which does not accept an expression on the right hand side, only functions such as the ones documented in the help egen output.
                  Code:
                  Syntax
                  
                          egen [type] newvar = fcn(arguments) [if] [in] [, options]
                  do you think it makes sense to sort on couple id (couple) in the bysort command
                  On closer review of your code, the bysort command is unnecessary, because the results depend only on one observation at a time, not on all the observations for a couple. Your final loop could be
                  Code:
                  local list asset fin bank super prop
                  foreach v of local list {
                      gen `v'r = `v'/cpi  
                  }

                  Comment


                  • #10
                    Hi William Lisowski. I appreciate your advice/guidance. On further reading of the help file, I understand what you mean now - thank you.

                    As suggested, I removed the bysort command and ran the code without issue. The results appear the same as those using the bysort command I used in #8, but I guess that is your point, it is not necessary - thanks.

                    Thank you very much for all help provided in helping me sort through this issue. Regards, Chris.
                    Last edited by Chris Boulis; 20 Jul 2021, 18:07.

                    Comment

                    Working...
                    X