  • Cohort Analysis: Plotting Variance over time by age groups or education groups

    Hello everyone,

    I am working with a synthetic panel, so I have collapsed the data by cbin (cohort bin for age) and education group (edu_group). I want to plot the variance of hour_wage over time, from 2004 to 2014, with a two-year gap between rounds. Then I would like to plot the variance of hour_wage and ae_consumption over time by cohort group and, similarly, by edu_group. The data are below.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float cbin byte edu_group float(ae_consumption hour_wage c_age year)
    3 1  440.6964 20.011213 27 2004
    3 2  423.9432  29.47328 27 2004
    3 3  428.8206  29.82299 27 2004
    4 1  425.2382   24.2419 30 2004
    4 2  420.3893  29.41757 30 2004
    4 3  493.2135  47.94204 30 2004
    5 1  432.2753 25.253265 35 2004
    5 2  452.3809 32.823265 35 2004
    5 3 465.53345  43.67646 35 2004
    6 1   421.073 28.098703 40 2004
    6 2  442.3377 37.223385 40 2004
    6 3  500.1619  52.70139 40 2004
    7 1  397.6652  28.42537 45 2004
    7 2  437.4603   39.7994 45 2004
    7 3  481.4493  57.02699 45 2004
    8 1  408.3766  26.16772 50 2004
    8 2  413.7261  39.21839 50 2004
    8 3  492.6645 71.293304 50 2004
    3 1  393.6165 19.816727 29 2006
    3 2  424.2228 25.147636 29 2006
    3 3  435.8747  41.67512 29 2006
    4 1    399.66 24.987286 34 2006
    4 2 422.47015  31.23556 34 2006
    4 3  491.8668  48.40184 34 2006
    5 1  414.2401  25.99362 39 2006
    5 2  430.1524 33.815296 39 2006
    5 3  455.8836  55.81765 39 2006
    6 1  411.6692  27.97255 44 2006
    6 2  426.3551  40.35061 44 2006
    6 3 485.83725    66.982 44 2006
    7 1  424.8087  28.63389 49 2006
    7 2 443.21085  40.12785 49 2006
    7 3 508.20355  72.61096 49 2006
    2 1 560.89124  22.78973 27 2008
    2 2  526.1003 22.583317 27 2008
    2 3     525.5  76.65598 27 2008
    3 1  542.5067 21.873344 30 2008
    3 2  629.8838 33.068836 30 2008
    3 3  705.8958   53.8134 30 2008
    4 1  637.6502  29.20734 35 2008
    4 2   669.334 36.602306 35 2008
    4 3  771.1184  49.41074 35 2008
    5 1  602.5778  26.90465 40 2008
    5 2  658.8535  39.76491 40 2008
    5 3  707.9938  55.90078 40 2008
    6 1  633.4527  27.88918 45 2008
    6 2  661.4158  41.89492 45 2008
    6 3  730.1896  72.87195 45 2008
    7 1  599.2916  24.39447 50 2008
    7 2  659.2191  44.84071 50 2008
    7 3  714.4351  40.63889 50 2008
    2 1   912.151 35.102135 28 2010
    2 2  910.3687  45.17526 28 2010
    2 3  1037.072  60.86653 28 2010
    3 1  899.1672 36.898376 32 2010
    3 2  965.7375  49.24127 32 2010
    3 3 1077.8995  90.85306 32 2010
    4 1  973.3187  41.19779 36 2010
    4 2  1025.908  54.57022 36 2010
    4 3 1061.3828  95.40849 36 2010
    5 1  952.3712  41.21731 41 2010
    5 2 1095.9092 69.342545 41 2010
    5 3 1197.1328 111.52193 41 2010
    6 1  986.2845  40.08693 46 2010
    6 2 1027.2869 69.145584 46 2010
    6 3  1223.803 115.39976 46 2010
    7 1  911.8214   39.0071 50 2010
    7 2 1099.9957  62.53564 50 2010
    7 3 1207.9797 126.21684 50 2010
    1 1  979.1874  48.36358 26 2012
    1 2 1001.3148   43.3796 26 2012
    1 3 1118.0404   96.3141 26 2012
    2 1  974.6545  42.25338 30 2012
    2 2 1019.4177  59.96129 30 2012
    2 3  1126.359  94.42162 30 2012
    3 1  972.1918  51.99603 35 2012
    3 2 1067.8253   66.5342 35 2012
    3 3 1211.1705  104.8941 35 2012
    4 1 1044.3345  58.05062 40 2012
    4 2 1104.1293  73.66813 40 2012
    4 3 1227.3896  116.4934 40 2012
    5 1 1039.6989  61.18449 45 2012
    5 2 1120.9481  87.70039 45 2012
    5 3  1243.785 130.04353 45 2012
    6 1 1062.4795  63.65802 49 2012
    6 2 1123.9371  91.64032 49 2012
    6 3  1296.465 154.17195 49 2012
    1 1 1050.5168  49.67293 27 2014
    1 2 1099.7997  59.52926 27 2014
    1 3 1225.2017  99.27192 27 2014
    2 1  1086.622  58.78579 31 2014
    2 2 1149.1056  71.10374 31 2014
    2 3 1268.1682 120.85491 31 2014
    3 1 1127.1864  61.70181 36 2014
    3 2 1148.7606  82.30375 36 2014
    3 3 1325.0287 134.49023 36 2014
    4 1  1174.124  69.92044 41 2014
    4 2 1222.4196  100.5159 41 2014
    4 3 1332.8146 147.88669 41 2014
    5 1 1177.6598 74.841034 46 2014
    end
    Please tell me if there is any efficient way of doing it. The section in bold is the main issue for me.

  • #2
    I wouldn't plot variance, but SD: you and your readers, I guess, would find SD in the same units (currency) as wage easier to think about than variance in their square.

    The specification "over time" is not very precise, but in the first instance I suggest collapse followed by twoway line or twoway connected.
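
    A minimal sketch of that suggestion, assuming the original individual-level data (before any collapsing) are in memory and use the variable names from #1. The new names sd_wage and sd_cons are illustrative only, and the `///` line continuation assumes the code is run from a do-file.

    ```stata
    * Sketch only: run on the ORIGINAL data, not on data already collapsed to means.
    * sd_wage and sd_cons are hypothetical names chosen for this example.
    collapse (sd) sd_wage = hour_wage sd_cons = ae_consumption, by(year cbin)

    * One small panel per cohort bin; sort makes each line run in year order.
    twoway connected sd_wage year, by(cbin) sort xlabel(2004(2)2014) ///
        ytitle("SD of hourly wage") xtitle("Year")
    ```

    Replacing cbin with edu_group in both by() options would give the same plot by education group, and plotting sd_cons instead of sd_wage would do the same for consumption.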


    • #3
      Agreed on SD; I have already collapsed the data. But I am confused about how to plot the SD of ae_consumption and hour_wage over time, by education group or by cbin. By "over time" I mean against time: the x-axis would show the survey years (2004, 2006, 2008, 2010, 2012 and 2014).


      • #4
        You are not showing any of the code you have tried. I tried

        Code:
        collapse (sd) hour_wage ae_consumption, by(year cbin edu_group)
        on your example data, but there is only one observation for each combination specified by by(), so the SD is undefined.
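
        The point can be checked on a small extract. This sketch uses the first three rows of the example data in #1; sd_wage and n are illustrative names. Note that the second collapse only shows when an SD becomes defined: it is the SD of three cell means, not the SD across individuals, which needs the original data.

        ```stata
        clear
        input year cbin edu_group hour_wage
        2004 3 1 20.011213
        2004 3 2 29.47328
        2004 3 3 29.82299
        end

        * One observation per year-cbin-edu_group cell: n is 1 and sd_wage is missing.
        preserve
        collapse (sd) sd_wage = hour_wage (count) n = hour_wage, by(year cbin edu_group)
        list
        restore

        * Dropping edu_group from by() leaves three observations in the cell,
        * so an SD can be computed (here, an SD of means, not of individual wages).
        collapse (sd) sd_wage = hour_wage (count) n = hour_wage, by(year cbin)
        list
        ```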


        • #5
          Why do I need to collapse when the data are already collapsed? I had to collapse the data to make a synthetic panel. The data above are already collapsed by cbin and edu_group. The command you mention would work on uncollapsed data. Yes, I didn't post any code because I don't know how to plot this, especially the SD of hour_wage over time by cbin or edu_group.
          Last edited by Raza Jafri; 17 Jul 2017, 06:22.


          • #6
            I have applied your code to the uncollapsed data; here is the new data for the issue. So hour_wage and ae_consumption are now SDs.


            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(year cbin) byte edu_group float(hour_wage ae_consumption)
            2004 3 1  9.198604 181.17894
            2004 3 2  24.06547 167.32167
            2004 3 3 18.756506 138.90439
            2004 4 1 16.769852 132.28737
            2004 4 2   23.0308 150.95128
            2004 4 3 35.189304  158.1243
            2004 5 1 13.131018 144.36212
            2004 5 2 21.714367 151.68411
            2004 5 3  25.47673 153.15536
            2004 6 1 18.151947  134.7916
            2004 6 2 24.451214  156.2335
            2004 6 3  31.67564 147.67294
            2004 7 1  20.18608 132.84694
            2004 7 2 26.444883 153.65346
            2004 7 3 37.503246  161.4292
            2004 8 1 16.787403  142.2849
            2004 8 2 30.229767 142.50476
            2004 8 3  45.13843 143.52515
            2006 3 1  8.975469 121.26255
            2006 3 2 15.393648 153.16737
            2006 3 3 26.780804 123.27008
            2006 4 1 20.421463 120.53606
            2006 4 2  23.31569  144.4429
            2006 4 3 31.392437 152.58656
            2006 5 1   17.4715  132.0433
            2006 5 2 19.674114  145.9066
            2006 5 3 36.860306 143.28853
            2006 6 1 17.079065  129.4971
            2006 6 2  25.88773 134.01862
            2006 6 3  40.41895 150.87633
            2006 7 1  19.73844 136.32036
            2006 7 2  28.71191  154.7265
            2006 7 3   44.7657 154.09235
            2008 2 1   13.7826 171.69467
            2008 2 2  8.308348 148.21573
            2008 2 3  53.26749 136.17967
            2008 3 1 12.123176  184.3892
            2008 3 2 14.170585  238.2856
            2008 3 3 23.739925  250.3253
            2008 4 1 14.031183 211.59775
            2008 4 2  16.07625  221.4025
            2008 4 3 26.371553 228.64455
            2008 5 1  25.40319 210.04654
            2008 5 2 24.093496  194.0629
            2008 5 3  28.89851 248.11374
            2008 6 1 21.935324 246.83635
            2008 6 2  26.70273  215.1757
            2008 6 3  44.35886 251.63666
            2008 7 1  25.95437 163.05244
            2008 7 2 34.165882  229.6174
            2008 7 3  42.38051 275.00946
            2010 2 1 14.165462 268.72116
            2010 2 2 17.757486 289.11227
            2010 2 3  27.47088  224.3783
            2010 3 1 17.888157  310.4362
            2010 3 2  29.79294  324.7763
            2010 3 3  50.02966  345.0162
            2010 4 1 29.703896 272.21866
            2010 4 2 35.991783  324.8092
            2010 4 3  56.93126  330.7441
            2010 5 1  29.29613 279.26028
            2010 5 2  48.40393  325.7744
            2010 5 3  48.82058  313.3741
            2010 6 1  23.43356  351.9747
            2010 6 2  38.91204  282.5165
            2010 6 3  64.68961  347.8223
            2010 7 1  27.52885 292.24268
            2010 7 2  41.29923  373.0884
            2010 7 3  44.80334 315.30695
            2012 1 1  33.61745  279.9091
            2012 1 2   21.6374 297.37952
            2012 1 3  49.07546  533.6936
            2012 2 1  20.17887 285.82233
            2012 2 2 38.110058  316.3071
            2012 2 3  60.42827 366.54135
            2012 3 1 28.602547 291.86584
            2012 3 2  39.02489  321.0946
            2012 3 3  57.00327  374.4373
            2012 4 1 30.596247 304.39758
            2012 4 2  44.56256  330.2934
            2012 4 3  56.77213  373.8436
            2012 5 1  38.83683  311.8416
            2012 5 2  48.62352  335.8161
            2012 5 3  58.58937 351.48715
            2012 6 1  42.68064  328.4412
            2012 6 2  58.06293  348.7668
            2012 6 3  65.19149  384.2078
            2014 1 1 19.181744  343.2126
            2014 1 2  36.65955  367.6964
            2014 1 3  51.71466  439.5841
            2014 2 1  35.19518   347.967
            2014 2 2 37.887463  351.6572
            2014 2 3  82.25789   364.844
            2014 3 1  35.27935 361.47205
            2014 3 2  50.85758  361.6234
            2014 3 3  77.98834  422.4134
            2014 4 1  41.38225  369.5024
            2014 4 2 68.391045  392.5099
            2014 4 3  89.53647  408.6586
            2014 5 1   46.0382  390.0795
            end


            • #7
              Sorry, #4 is not needed, as you say you have already collapsed.

              But then precisely how did you collapse? I suggested in #2 that you collapse to SDs. Is that what you did?

              Once more, I underline that you have not shown us any code whatsoever.

              Why not use line or tsline or xtline?



              • #8
                I collapsed the data to means, but I realize that is incorrect if I want to plot SDs. The code in #4 is correct because I have to plot the SD of hour_wage and ae_consumption. But the command to plot the SD of hour_wage and ae_consumption over time, by cohort group, is what I am looking for. If I already knew it, I wouldn't have posted on the Stata forum.


                • #9
                  Now if I use line, there are multiple values for each year. How do I declare the time variable so that I can plot the SD of hour_wage over time?
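
                  One possible way around the repeated years, sketched under the assumption that the SD data from #6 are in memory (year, cbin, edu_group, hour_wage, ae_consumption): each cbin and edu_group combination is treated as its own panel, so every year occurs once within a panel. The name panel_id is hypothetical.

                  ```stata
                  * Sketch only: assumes the data posted in #6 are in memory.
                  * panel_id is an illustrative name for the cbin x edu_group identifier.
                  egen panel_id = group(cbin edu_group), label

                  * Panel variable first, time variable second; rounds are two years apart.
                  xtset panel_id year, delta(2)

                  * Overlaying every panel would be crowded, so show one education group.
                  xtline hour_wage if edu_group == 1, overlay

                  * Alternative without xtset: one small panel per cohort bin.
                  twoway connected hour_wage year if edu_group == 1, by(cbin) sort
                  ```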


                  • #10
                    Raza: I am bowing out of this. If you won't show us code you're trying, I can't suggest modifications to your code.

                    Also, I already posted code in #4 showing how to collapse by SD. You must go back to the original data.

                    Anyone else is welcome to take this forward, but making the same requests again and again is not an interesting use of my time.
                    Last edited by Nick Cox; 17 Jul 2017, 06:48.


                    • #11

                      Sorry for the disturbance; maybe I am unable to explain this well. Here are my code and data one more time.

                      I have collapsed the data to SDs of hour_wage and ae_consumption by cbin (cohort bin for age). Here it is:

                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input float(cbin ae_consumption hour_wage year)
                      3  432.1184 25.237505 2004
                      4 437.92435 31.217815 2004
                      5  446.9146  31.96328 2004
                      6  444.8031 36.298504 2004
                      7 429.28195  38.47503 2004
                      8  430.7834  40.97292 2004
                      3  411.3532 24.489355 2006
                      4  426.5524 32.008793 2006
                      5 429.60745 35.681225 2006
                      6  433.0114  40.95407 2006
                      7  451.3104  42.87645 2006
                      2 518.66034 29.406195 2008
                      3  594.9459  31.35742 2008
                      4  615.9071 34.835896 2008
                      5  622.4978  39.14019 2008
                      6  626.2902   41.5099 2008
                      7  612.4688 32.551098 2008
                      2  941.0918  40.26697 2010
                      3  962.5244  48.83459 2010
                      4 1028.3856  61.44976 2010
                      5 1035.9739  61.48063 2010
                      6 1024.6589  60.80338 2010
                      7 1084.3801  66.30714 2010
                      1  994.4923  48.68895 2012
                      2 1007.7833  54.71659 2012
                      3 1046.2838 65.496506 2012
                      4 1110.2407  77.86418 2012
                      5  1120.174  87.81827 2012
                      6 1141.0681  95.65371 2012
                      1  1090.667  59.39747 2014
                      2 1137.7675  72.69834 2014
                      3 1170.2358  82.62764 2014
                      4 1224.2488  97.36292 2014
                      5  1245.381 109.10123 2014
                      6 1258.4246  112.3734 2014
                      1  487.4936  79.80643 2016
                      2  534.3436  95.84991 2016
                      3  561.3955  110.2328 2016
                      4 584.83575 120.21163 2016
                      5  610.2727 127.37865 2016
                      end
                      Now I want to plot hour_wage over time by cbin. For this purpose, I am using this code:

                      Code:
                      xtset year cbin
                      line hour_wage year if cbin==4, recast (connected)
                      This graph is for cbin==4, and I think I have to repeat the same procedure for every cbin. But there is an issue with the x-axis: can you explain how I can set the range from 2004 to 2016? In the graph the labels are at five-year intervals. I hope it is clear now. Please check the graph as well.
                      [Attached graph: collapse1.png]
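
                      A hedged sketch of one way to handle both points, assuming the data from #11 are in memory: the xlabel() option with a numlist such as 2004(2)2016 places a label at every survey round, and xtset conventionally takes the panel variable (cbin) before the time variable (year), which then lets xtline draw all cohort bins without repeating the command.

                      ```stata
                      * Sketch only: assumes the #11 data are in memory (cbin, hour_wage, year).
                      * Panel variable first, time variable second; rounds are two years apart.
                      xtset cbin year, delta(2)

                      * All cohort bins in one graph, labelled at every round from 2004 to 2016.
                      xtline hour_wage, overlay xlabel(2004(2)2016)

                      * Single cohort bin, as in the original attempt, with the same axis labels.
                      twoway connected hour_wage year if cbin == 4, sort xlabel(2004(2)2016)

                      * One small panel per cohort bin; angled labels reduce overlap.
                      twoway connected hour_wage year, by(cbin) sort xlabel(2004(2)2016, angle(45))
                      ```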
