  • Common Sample

    Hi,

    I have the following dataset and would like to create a common sample (i.e., the same countries and the same industries throughout my unbalanced panel). My end goal is to compare the volatility of value added per worker by decade across a common sample of industries (isic) and countries. However, I do not know how to approach this in terms of syntax. I do not want to compare volatilities by group if the countries and industries differ over time. Thanks!

    Code:
     * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int country str2 isic float tech_intensity int year float(val_per_worker ly)
    894 "15" 1 1999         .         .
    120 "15" 1 1993  9574.547  9.166863
    340 "15" 1 1999         .         .
    410 "15" 1 2004  96024.93 11.472363
    204 "15" 1 1981  9178.527  9.124622
    887 "15" 1 2009  9329.836  9.140973
     56 "15" 1 1993         .         .
    554 "15" 1 1988         .         .
    410 "15" 1 2010         .         .
     51 "15" 1 1988         .         .
    894 "15" 1 1986         .         .
     56 "15" 1 2008 106133.56 11.572453
    208 "15" 1 2005     77704 11.260662
    398 "15" 1 2006         .         .
    780 "15" 1 1989  9008.899  9.105968
    858 "15" 1 2008 33943.656 10.432457
    356 "15" 1 1982 1063.4724  6.969295
    356 "15" 1 2005 4569.7397  8.427212
    288 "15" 1 2002         .         .
    388 "15" 1 1994  9703.523  9.180244
    554 "15" 1 2001 35710.977 10.483213
     68 "15" 1 1988  16495.49  9.710842
    428 "15" 1 1994  7786.376  8.960131
    188 "15" 1 1980         .         .
    170 "15" 1 1985 23876.004  10.08063
    450 "15" 1 1998         .         .
    454 "15" 1 1982  2709.464  7.904506
    788 "15" 1 1989         .         .
    788 "15" 1 1994 14590.942  9.588157
    894 "15" 1 1988         .         .
    242 "15" 1 1985   6796.64  8.824183
    414 "15" 1 1995 18211.824  9.809826
    246 "15" 1 2003  61583.58  11.02815
    196 "15" 1 1981  12115.39  9.402232
    144 "15" 1 1987  5292.756  8.574095
    434 "15" 1 2009         .         .
    434 "15" 1 2006         .         .
    630 "15" 1 1990  99810.48 11.511028
    516 "15" 1 1982         .         .
    352 "15" 1 1993  40615.22 10.611898
    392 "15" 1 2004  90146.76 11.409194
    788 "15" 1 2010         .         .
    528 "15" 1 1994  73991.79  11.21171
    756 "15" 1 2010  114571.5 11.648954
    470 "15" 1 1981 14979.714  9.614452
    762 "15" 1 2008         .         .
    400 "15" 1 1992  8861.312   9.08945
    300 "15" 1 1989 23550.227  10.06689
    724 "15" 1 1991  49845.12 10.816676
    764 "15" 1 2005  9452.918 9.1540785
    703 "15" 1 2003  7702.973  8.949362
    516 "15" 1 1993         .         .
    300 "15" 1 2004  56680.21  10.94518
    214 "15" 1 2008         .         .
    454 "15" 1 2004 10494.156  9.258574
    288 "15" 1 2001         .         .
    158 "15" 1 2010  49238.38  10.80443
    100 "15" 1 1985         .         .
    705 "15" 1 2004         .         .
    516 "15" 1 1999         .         .
     96 "15" 1 2002         .         .
     96 "15" 1 2005         .         .
     96 "15" 1 2006         .         .
     96 "15" 1 2008         .         .
     96 "15" 1 2010         .         .
     56 "15" 1 1982  38303.53 10.553297
    458 "15" 1 1997 22002.035   9.99889
    140 "15" 1 1987  21987.09  9.998211
    191 "15" 1 2007         .         .
    578 "15" 1 2007 105570.44 11.567134
    887 "15" 1 1999 3524.1804  8.167403
    250 "15" 1 2007  71816.05 11.181863
     70 "15" 1 1991 18458.133   9.82326
    348 "15" 1 1998  9240.286  9.131329
    724 "15" 1 2004  59024.43 10.985706
    784 "15" 1 2010         .         .
    352 "15" 1 1987  26591.48 10.188346
    300 "15" 1 1998  39237.49 10.577388
    533 "15" 1 1995         .         .
    578 "15" 1 1999  50568.43 10.831082
    800 "15" 1 2001         .         .
    688 "15" 1 2008         .         .
    616 "15" 1 2004 21374.203   9.96994
    364 "15" 1 2002 11422.383   9.34333
    191 "15" 1 1993         .         .
    344 "15" 1 2000 28505.215 10.257842
    422 "15" 1 1998 23595.736  10.06882
     60 "15" 1 1992         .         .
    428 "15" 1 2003  9231.256   9.13035
    428 "15" 1 1988         .         .
    752 "15" 1 1996  65950.37 11.096658
    112 "15" 1 2008         .         .
    752 "15" 1 1988   58649.9 10.979342
    724 "15" 1 2005  62259.05  11.03906
    442 "15" 1 2007  70717.02 11.166442
    222 "15" 1 1997  7243.638  8.887878
    242 "15" 1 1996         .         .
    858 "15" 1 1984  8215.469  9.013774
    858 "15" 1 1985  8657.342  9.066163
     12 "15" 1 2005         .         .
    end
    Code:
    * List of all industries
    tab isic
    
           isic |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             15 |      6,832        4.17        4.17
             16 |      6,822        4.16        8.33
             17 |      6,822        4.16       12.49
             18 |      6,832        4.17       16.66
             19 |      6,832        4.17       20.82
             20 |      6,832        4.17       24.99
             21 |      6,832        4.17       29.16
             22 |      6,832        4.17       33.33
             23 |      6,832        4.17       37.49
             24 |      6,832        4.17       41.66
             25 |      6,832        4.17       45.83
             26 |      6,832        4.17       49.99
             27 |      6,832        4.17       54.16
             28 |      6,832        4.17       58.33
             29 |      6,832        4.17       62.50
             30 |      6,832        4.17       66.66
             31 |      6,832        4.17       70.83
             32 |      6,832        4.17       75.00
             33 |      6,832        4.17       79.16
             34 |      6,832        4.17       83.33
             35 |      6,832        4.17       87.50
             36 |      6,832        4.17       91.67
             37 |      6,832        4.17       95.83
              D |      6,832        4.17      100.00
    ------------+-----------------------------------
          Total |    163,948      100.00
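
One way to flag such a common sample might look like the sketch below. It is not taken from the thread: the toy data are made up, the helper names (first_isic, first_year, n_ok, common) are hypothetical, and it assumes one row per country-industry-year and that "common" means a country has a non-missing val_per_worker for every industry in every year.

```stata
* Toy data, made up for illustration: 2 industries, 2 years.
* Country 56 is complete; country 120 lacks a value for isic 16 in 2001.
clear
input int country str2 isic int year float val_per_worker
 56 "15" 2000 100
 56 "15" 2001 110
 56 "16" 2000 200
 56 "16" 2001 210
120 "15" 2000  50
120 "15" 2001  55
120 "16" 2000  80
120 "16" 2001   .
end

* The counting below assumes one row per country-industry-year
isid country isic year

* Number of distinct industries and years in the data
egen byte first_isic = tag(isic)
egen byte first_year = tag(year)
count if first_isic
local n_isic = r(N)
count if first_year
local n_year = r(N)

* Non-missing industry-year cells per country
bysort country: egen long n_ok = total(!missing(val_per_worker))

* A country is in the common sample only if every cell is filled
gen byte common = n_ok == `n_isic' * `n_year'

tabulate country common
```

To apply the same idea to a window of years, the data could first be restricted with something like keep if inrange(year, 2000, 2010) before counting. Because of the local macros, the block should be run as a whole from a do-file.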

  • #2
    Hugo:
    I'm not clear on what you mean by creating a "common sample".
    The usual way to create groups of observations is the -group()- function of -egen-.
    Kind regards,
    Carlo
    (Stata 19.0)
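
As a minimal sketch of that suggestion, assuming the panel unit is a country-industry pair: the toy data are made up, and although the name country_industry matches the identifier that appears later in the thread, how it was actually built there is not shown, so this construction is an assumption.

```stata
* Toy data, made up for illustration
clear
input int country str2 isic int year float val_per_worker
 56 "15" 2000 100
 56 "15" 2001 110
 56 "16" 2000 200
120 "15" 2000  50
120 "15" 2001  55
end

* One integer identifier per country-industry pair
egen long country_industry = group(country isic)

* Declare the panel so that time-series operators such as L. can be used
xtset country_industry year

* Example: one-year lag of log value added per worker within each panel
gen ly = log(val_per_worker)
gen ly_lag1 = L1.ly
list, sepby(country_industry)
```

Note that group() only labels the pairs; it does not by itself restrict the data to pairs that are observed in every year.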



    • #3
      Yes, thank you very much. I got help on that. My goal was to select the group of countries (country) that have all industries in my data (isic) in every year from 2000 to 2020. The results I get after that are a little puzzling. I construct two variables, ly and ly0.

      Code:
       gen ly= log(val_per_worker)
      Code:
       xtset country_industry year
      Code:
       gen ly0= L10.ly
      Then my end goal is to compare the standard deviation by type of industry over time. For instance,

      Code:
       tabstat  ly0 ly if year==2005 & ly0!=. & ly!=., by(tech_intensity) statistics(sd)
      
       Summary statistics: sd
        by categories of: tech_intensity
      
      tech_intensity |       ly0        ly
      ---------------+--------------------
                   0 |  1.155845  1.075978
                   1 |  1.238137  1.199836
                   2 |  1.307299  1.326163
                   3 |  1.166958   .965853
      ---------------+--------------------
               Total |  1.245615  1.194803
      Code:
       tabstat  ly0 ly if year==2015 & ly0!=. & ly!=., by(tech_intensity) statistics(sd)
      
       Summary statistics: sd
        by categories of: tech_intensity
      
      tech_intensity |       ly0        ly
      ---------------+--------------------
                   0 |  1.075115  .8698506
                   1 |   1.23723  1.023555
                   2 |  1.344891  1.050074
                   3 |  1.112831  .9552974
      ---------------+--------------------
               Total |  1.259314  1.036577
      Shouldn't ly0 for year==2015 equal ly for year==2005?

      Last edited by Hugo Rocha; 15 Jun 2022, 14:36.
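
One candidate explanation, offered here as an assumption rather than a confirmed diagnosis, is that the two conditions select different panels. The 2005 condition keeps panels with ly observed in 1995 and 2005, while the 2015 condition keeps panels with ly observed in 2005 and 2015. The toy data below are made up to show the effect, and id is a hypothetical panel identifier.

```stata
* Toy data, made up for illustration: 4 panels, 3 benchmark years
clear
input int id int year float ly
1 1995  9.0
1 2005  9.5
1 2015    .
2 1995    .
2 2005 10.0
2 2015 10.5
3 1995  8.0
3 2005  8.5
3 2015  9.0
4 1995  7.0
4 2005  7.5
4 2015  8.0
end
xtset id year
gen ly0 = L10.ly

* The 2005 condition needs ly in 1995 and 2005: panels 1, 3, 4
list id ly if year == 2005 & !missing(ly0, ly)

* The 2015 condition needs ly in 2005 and 2015: panels 2, 3, 4
list id ly0 if year == 2015 & !missing(ly0, ly)

* Same quantity (ly in 2005) but different panels, so the sd differs
summarize ly if year == 2005 & !missing(ly0, ly)
display r(sd)
summarize ly0 if year == 2015 & !missing(ly0, ly)
display r(sd)
```

If the two standard deviations differ on the real data while tech_intensity is constant within panels, comparing the counts returned by the two conditions would be a quick way to check whether sample composition is responsible.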



      • #4
        Hi Hugo
        You're right, it should, assuming that tech_intensity has remained constant over time.
        I suspect that in some cases it has not.

        Unless you have too many observations, I would do:
        list ly0 ly year tech_intensity if inlist(year,2015,2005) & ly0!=. & ly!=.
        F



        • #5
          Originally posted by FernandoRios View Post
          Hi Hugo
          You're right, it should, assuming that tech_intensity has remained constant over time.
          I suspect that in some cases it has not.

          Unless you have too many observations, I would do:
          list ly0 ly year tech_intensity if inlist(year,2015,2005) & ly0!=. & ly!=.
          F
          Hi Fernando, I have a substantial number of observations in the sample (64,272), so listing them is a little hard. The tech_intensity categories (1, 2, and 3) are the same across all years. Is that what you mean by tech_intensity remaining constant? The values of ly and ly0 are not constant, naturally.

          Thanks
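
With that many observations, a check that avoids listing everything might be easier. The sketch below reads "constant" as "each country-industry panel keeps the same tech_intensity in every year", which is an interpretation of the earlier reply rather than something the thread states outright. The toy data are made up, and ti_changes and one_per_panel are hypothetical names.

```stata
* Toy data, made up for illustration: panel 2 switches category
clear
input int country_industry int year byte tech_intensity
1 2005 1
1 2015 1
2 2005 2
2 2015 3
end

* Within each panel, sort by tech_intensity and compare first and last.
* Missing values sort last, so a panel with some missing categories is flagged too.
bysort country_industry (tech_intensity): gen byte ti_changes = tech_intensity[1] != tech_intensity[_N]

* Count panels (not observations) whose category changes
egen byte one_per_panel = tag(country_industry)
count if ti_changes & one_per_panel

* Inspect only the offending panels
sort country_industry year
list country_industry year tech_intensity if ti_changes, sepby(country_industry)
```

A count of zero would suggest that changing categories are not the source of the discrepancy.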



          • #6
            Originally posted by Hugo Rocha View Post

            Hi Fernando, I have a substantial number of observations in the sample (64,272), so listing them is a little hard. The tech_intensity categories (1, 2, and 3) are the same across all years. Is that what you mean by tech_intensity remaining constant? The values of ly and ly0 are not constant, naturally.

            Thanks
            I ran the list, and the values of ly0 and ly seem very consistent, though each tech_intensity category is a group of industries (isic).
            Attached Files
            Last edited by Hugo Rocha; 15 Jun 2022, 16:51.



            • #7
              I think I found the problem, though I am not sure how to fix it.

              Some industries (isic), particularly those with tech_intensity==3, have fewer observations over time (even though I chose the group of countries (country) that have all industries in my data (isic) in every year from 2000 to 2020). One example is isic==33.

              Code:
               * (this seems to change ly0 and ly depending on the year)
               tab ly0 ly if year==2015 & ly0!=. & ly!=. & isic=="33"
              
                         |     ly
                     ly0 |  9.514457 |     Total
              -----------+-----------+----------
                9.541218 |         1 |         1 
              -----------+-----------+----------
                   Total |         1 |         1
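
A possible fix, sketched under the assumption that the comparison should use only panels with a non-missing ly in all three benchmark years (1995, 2005 and 2015): flag those panels once and add the flag to every condition. The toy data are made up, and id, n_ok and common are hypothetical names.

```stata
* Toy data, made up for illustration: only panels 3 and 4 are complete
clear
input int id int year float ly
1 1995  9.0
1 2005  9.5
1 2015    .
2 1995    .
2 2005 10.0
2 2015 10.5
3 1995  8.0
3 2005  8.5
3 2015  9.0
4 1995  7.0
4 2005  7.5
4 2015  8.0
end
xtset id year
gen ly0 = L10.ly

* Number of benchmark years in which each panel has a non-missing ly
bysort id (year): egen byte n_ok = total(inlist(year, 1995, 2005, 2015) & !missing(ly))

* Keep the comparison to panels observed in all three years
gen byte common = n_ok == 3

* Both statistics now rest on the same panels and the same 2005 values
summarize ly if year == 2005 & common
summarize ly0 if year == 2015 & common
```

On the thread's data this would amount to adding & common to the conditions of both tabstat calls. The price is a smaller sample: cells such as isic 33, with a single usable observation, would still not support a standard deviation.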
