Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problem with generating state level variable in individual level dataset

    Hello,

    I need to conduct state level analysis using individual level data. In order to achieve this I need to create a state-level, continuous variable that contains the number of observations by state and year. I also need to calculate a growth variable at the state level between two time points (1970) and (1980). But, I am struggling to create a state level variable in an individual-level dataset.

    I am using Stata 16.1 on a Mac OS Catalina version. 10.15.5
    I am using IPUMS microdata provided by IPUMS, which are individual level data. IPUMS provides one percent samples of the U.S. Census which is collected every ten years. My current dataset includes the years 1940, 1960, 1970, 1980, 2000, and 2010. My research is about Black educators, as such my dataset is restricted to observations identified as “Black” as defined by the census year and respondents with the occ1950 occupation code for “teacher”, and I further restricted it to individuals aged 20 or above.

    I have tried generating new variables for each state that contain the correct number of teachers with such code as:

    Gen NY1970 = NY if year==1970.

    Where NY is a dummy variable. All observations in the dataset are black teachers.


    I have dummy variables for years and states, but this code seems to generate a case of the variable for each observation as opposed to a state level variable.

    I also have failed to generate a growth variable with such code as:

    gen growth = 100 * (teacher70 - teacher80/teacher70)

    I used a dummy variable for “teacher (which is everyone in the dataset) and then created dummy variables “teacher70” and “teacher80” for teachers in appropriate years.
    For instance,
    gen teacher70 = teacher if Y1970==1

    However, the growth variable counted every observation in the dataset as missing data. I think it didn’t work because the dummy variables canceled each other out.

    I then tried to create continuous variables to measure teachers with the code:
    gen teacher70 = 2455 if year==1970
    gen teacher80 = 3913 if Y1980=1

    I've also looked through quite a lot of the data documentation on generating variables and combining variables, but I'm not sure what terminology applies to my problem, which makes it difficult to find the solution.

    But that also resulted in all observations counted as missing. How can I generate the necessary state level variables? Do I just have to generate those variables in a separate file and merge to two files? This is my first time trying to accomplish such a task and I'm unsure of the procedure. I appreciate any suggestions and guidance. I will provide any additional information that might be helpful or clarifying. Below is a sample of my data. Please let me know if I can provide a better data sample.

    Thank you!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year long(sample serial cbserial) int hhwt double cluster byte(region statefip metro pernum) int perwt byte(sex age race) int(raced occ1950) long incwage float(teacher teacher70 teacher80 teacher801 teacher701 growth)
    1940 194001  1110 . 100 1940000011101 33 40 1 6 100 2 50 2 200 93  640 1 . . . . .
    1940 194001  1316 . 100 1940000013161 32 28 1 1 100 1 60 2 200 93  190 1 . . . . .
    1940 194001  2922 . 100 1940000029221 31 37 1 3 100 2 22 2 200 93  740 1 . . . . .
    1940 194001  3751 . 100 1940000037511 31 13 1 2 100 2 38 2 200 93  500 1 . . . . .
    1940 194001  4075 . 100 1940000040751 22 29 1 2 100 2 28 2 200 93  405 1 . . . . .
    1940 194001  4430 . 100 1940000044301 31 12 1 2 100 2 34 2 200 93  560 1 . . . . .
    1940 194001  5247 . 100 1940000052471 33 48 1 2 100 2 37 2 200 93  816 1 . . . . .
    1940 194001  5266 . 100 1940000052661 31 45 1 1 100 1 55 2 200 93  385 1 . . . . .
    1940 194001  5266 . 100 1940000052661 31 45 1 2 100 2 53 2 200 93  350 1 . . . . .
    1940 194001  5266 . 100 1940000052661 31 45 1 3 100 1 24 2 200 93  110 1 . . . . .
    1940 194001  5266 . 100 1940000052661 31 45 1 4 100 2 21 2 200 93  110 1 . . . . .
    1940 194001  6115 . 100 1940000061151 32  1 3 5 100 2 27 2 200 93  315 1 . . . . .
    1940 194001  6182 . 100 1940000061821 22 29 3 1 100 2 46 2 200 93 2000 1 . . . . .
    1940 194001  7141 .  77 1940000071411 12 36 3 2  77 2 50 2 200 93 2000 1 . . . . .
    1940 194001  7641 . 100 1940000076411 31 37 1 1 100 1 30 2 200 93  750 1 . . . . .
    1940 194001  7808 . 100 1940000078081 31 13 1 3 100 2 20 2 200 93  385 1 . . . . .
    1940 194001  8206 . 100 1940000082061 33 48 1 3 100 1 24 2 200 93  675 1 . . . . .
    1940 194001  8921 . 100 1940000089211 31 13 1 1 100 2 32 2 200 93  280 1 . . . . .
    1940 194001  9193 . 100 1940000091931 21 39 3 1 100 1 39 2 200 93 1280 1 . . . . .
    1940 194001 10479 . 100 1940000104791 33 48 1 1 100 1 48 2 200 93 1320 1 . . . . .
    1940 194001 11556 .  87 1940000115561 31 51 1 2  87 2 31 2 200 93  405 1 . . . . .
    1940 194001 11847 .  53 1940000118471 32 28 1 3  53 2 23 2 200 93  120 1 . . . . .
    1940 194001 12360 . 100 1940000123601 21 39 3 3 100 2 25 2 200 93 1200 1 . . . . .
    1940 194001 12937 . 100 1940000129371 33 48 1 2 100 2 25 2 200 93  250 1 . . . . .
    1940 194001 12938 . 100 1940000129381 33  5 1 2 100 2 26 2 200 93  390 1 . . . . .
    1940 194001 13696 . 100 1940000136961 31 37 1 2 100 2 33 2 200 93  637 1 . . . . .
    1940 194001 13697 . 100 1940000136971 31 37 1 1 100 2 38 2 200 93  680 1 . . . . .
    1940 194001 13697 . 100 1940000136971 31 37 1 2 100 2 38 2 200 93 1008 1 . . . . .
    1940 194001 13853 .  78 1940000138531 21 39 3 2  78 1 25 2 200 93  852 1 . . . . .
    1940 194001 14975 . 100 1940000149751 31 45 4 3 100 2 23 2 200 93  365 1 . . . . .
    1940 194001 14989 . 100 1940000149891 31 45 4 2 100 2 36 2 200 93 1200 1 . . . . .
    1940 194001 15489 . 100 1940000154891 31 45 4 1 100 2 46 2 200 93  754 1 . . . . .
    1940 194001 15489 . 100 1940000154891 31 45 4 2 100 2 28 2 200 93  450 1 . . . . .
    1940 194001 15986 .  78 1940000159861 31 24 3 3  78 2 24 2 200 93  200 1 . . . . .
    1940 194001 16297 . 100 1940000162971 33 48 3 1 100 2 22 2 200 93 1225 1 . . . . .
    1940 194001 16926 . 100 1940000169261 21 39 3 1 100 1 40 2 200 93    0 1 . . . . .
    1940 194001 17323 . 100 1940000173231 33 48 3 2 100 2 45 2 200 93 1400 1 . . . . .
    1940 194001 18244 . 100 1940000182441 33 48 1 3 100 1 27 2 200 93  540 1 . . . . .
    1940 194001 18881 . 100 1940000188811 31 24 3 2 100 2 39 2 200 93 1500 1 . . . . .
    1940 194001 21284 . 100 1940000212841 31 37 1 3 100 2 29 2 200 93  800 1 . . . . .
    1940 194001 21284 . 100 1940000212841 31 37 1 6 100 2 29 2 200 93  660 1 . . . . .
    1940 194001 21284 . 100 1940000212841 31 37 1 7 100 2 24 2 200 93  660 1 . . . . .
    1940 194001 21811 . 100 1940000218111 33 40 1 5 100 1 38 2 200 93  800 1 . . . . .
    1940 194001 21811 . 100 1940000218111 33 40 1 6 100 2 33 2 200 93  900 1 . . . . .
    1940 194001 21854 . 100 1940000218541 31 12 1 2 100 2 30 2 200 93  540 1 . . . . .
    1940 194001 22108 . 100 1940000221081 31 51 1 2 100 2 32 2 200 93  464 1 . . . . .
    1940 194001 23183 . 100 1940000231831 31 11 3 4 100 2 41 2 200 93 1512 1 . . . . .
    1940 194001 23476 . 100 1940000234761 32 28 4 1 100 2 36 2 200 93  320 1 . . . . .
    1940 194001 23582 . 100 1940000235821 32  1 1 1 100 1 32 2 200 93  875 1 . . . . .
    1940 194001 25609 . 100 1940000256091 31 13 1 3 100 2 27 2 200 93  225 1 . . . . .
    1940 194001 26085 . 100 1940000260851 31 12 1 2 100 2 43 2 200 93  600 1 . . . . .
    1940 194001 27371 . 100 1940000273711 32  1 1 2 100 2 29 2 200 93  350 1 . . . . .
    1940 194001 28031 . 100 1940000280311 21 26 3 2 100 2 36 2 200 93 2360 1 . . . . .
    1940 194001 28211 . 100 1940000282111 31 11 3 3 100 2 30 2 200 93 1700 1 . . . . .
    1940 194001 28369 . 100 1940000283691 32 21 1 2 100 2 46 2 200 93  720 1 . . . . .
    1940 194001 29237 . 100 1940000292371 12 42 3 2 100 2 25 2 200 93  500 1 . . . . .
    1940 194001 29779 . 100 1940000297791 21 18 3 1 100 1 56 2 200 93 3000 1 . . . . .
    1940 194001 29779 . 100 1940000297791 21 18 3 2 100 2 39 2 200 93 1200 1 . . . . .
    1940 194001 30337 . 100 1940000303371 33 40 1 6 100 2 24 2 200 93  900 1 . . . . .
    1940 194001 30681 .  70 1940000306811 22 29 3 2  70 2 40 2 200 93 1300 1 . . . . .
    1940 194001 30770 . 100 1940000307701 31 37 1 2 100 1 37 2 200 93  500 1 . . . . .
    1940 194001 31141 . 100 1940000311411 31 13 1 4 100 2 63 2 200 93  190 1 . . . . .
    1940 194001 31361 . 100 1940000313611 12 34 3 4 100 1 32 2 200 93  800 1 . . . . .
    1940 194001 32392 . 100 1940000323921 33 48 1 1 100 2 48 2 200 93  720 1 . . . . .
    1940 194001 33355 . 100 1940000333551 33 48 1 2 100 2 27 2 200 93  100 1 . . . . .
    1940 194001 33638 . 100 1940000336381 12 42 3 1 100 2 35 2 200 93 2200 1 . . . . .
    1940 194001 35486 .  88 1940000354861 31 45 1 3  88 2 30 2 200 93  400 1 . . . . .
    1940 194001 35486 .  88 1940000354861 31 45 1 4  88 1 26 2 200 93  399 1 . . . . .
    1940 194001 35486 .  88 1940000354861 31 45 1 8  88 2 20 2 200 93  400 1 . . . . .
    1940 194001 35721 . 100 1940000357211 32  1 3 3 100 2 21 2 200 93 1125 1 . . . . .
    1940 194001 36040 . 100 1940000360401 33 48 3 2 100 2 27 2 200 93  540 1 . . . . .
    1940 194001 36040 . 100 1940000360401 33 48 3 4 100 2 25 2 200 93  540 1 . . . . .
    1940 194001 36646 . 100 1940000366461 31 11 3 1 100 2 56 2 200 93 3000 1 . . . . .
    1940 194001 37106 . 100 1940000371061 33  5 1 1 100 1 37 2 200 93  240 1 . . . . .
    1940 194001 38383 . 100 1940000383831 31 12 3 1 100 2 25 2 200 93  600 1 . . . . .
    1940 194001 38641 . 100 1940000386411 21 26 1 1 100 1 50 2 200 93 2000 1 . . . . .
    1940 194001 39198 . 100 1940000391981 33 48 1 2 100 2 45 2 200 93  585 1 . . . . .
    1940 194001 40822 . 100 1940000408221 32 28 1 2 100 2 35 2 200 93  150 1 . . . . .
    1940 194001 40822 . 100 1940000408221 32 28 1 7 100 1 28 2 200 93  600 1 . . . . .
    1940 194001 41482 . 100 1940000414821 33 48 1 3 100 1 27 2 200 93 1500 1 . . . . .
    1940 194001 41482 . 100 1940000414821 33 48 1 4 100 2 23 2 200 93 1500 1 . . . . .
    1940 194001 41524 . 100 1940000415241 31 24 3 3 100 2 40 2 200 93 2000 1 . . . . .
    1940 194001 44711 . 100 1940000447111 31 37 1 1 100 1 36 2 200 93 1140 1 . . . . .
    1940 194001 44711 . 100 1940000447111 31 37 1 2 100 2 34 2 200 93  600 1 . . . . .
    1940 194001 44843 . 100 1940000448431 33 48 3 2 100 2 32 2 200 93  360 1 . . . . .
    1940 194001 44952 . 100 1940000449521 33 48 1 1 100 2 51 2 200 93  118 1 . . . . .
    1940 194001 45019 .  63 1940000450191 32  1 3 8  63 2 33 2 200 93 1400 1 . . . . .
    1940 194001 45148 . 100 1940000451481 31 37 1 3 100 2 32 2 200 93  720 1 . . . . .
    1940 194001 45987 . 100 1940000459871 33 22 1 1 100 1 35 2 200 93  400 1 . . . . .
    1940 194001 45987 . 100 1940000459871 33 22 1 2 100 2 32 2 200 93  280 1 . . . . .
    1940 194001 47140 . 100 1940000471401 31 37 1 2 100 2 28 2 200 93    0 1 . . . . .
    1940 194001 48640 . 100 1940000486401 33 48 1 2 100 2 28 2 200 93   94 1 . . . . .
    1940 194001 50986 . 100 1940000509861 33 48 1 2 100 2 23 2 200 93  540 1 . . . . .
    1940 194001 51946 . 100 1940000519461 32 28 1 1 100 2 40 2 200 93  216 1 . . . . .
    1940 194001 53157 . 100 1940000531571 33 22 1 6 100 2 26 2 200 93    0 1 . . . . .
    1940 194001 53185 . 100 1940000531851 31 37 1 1 100 1 33 2 200 93  736 1 . . . . .
    1940 194001 53834 . 100 1940000538341 33 40 1 1 100 1 56 2 200 93  900 1 . . . . .
    1940 194001 54040 .  87 1940000540401 33 22 4 5  87 2 26 2 200 93 1152 1 . . . . .
    1940 194001 54040 .  87 1940000540401 33 22 4 7  87 2 21 2 200 93   24 1 . . . . .
    1940 194001 54097 . 100 1940000540971 33 48 3 1 100 1 45 2 200 93 1500 1 . . . . .
    end
    label values year YEAR
    label def YEAR 1940 "1940", modify
    label values sample SAMPLE
    label def SAMPLE 194001 "1940 1%", modify
    label values region REGION
    label def REGION 12 "middle atlantic division", modify
    label def REGION 21 "east north central div", modify
    label def REGION 22 "west north central div", modify
    label def REGION 31 "south atlantic division", modify
    label def REGION 32 "east south central div", modify
    label def REGION 33 "west south central div", modify
    label values statefip STATEFIP
    label def STATEFIP 1 "alabama", modify
    label def STATEFIP 5 "arkansas", modify
    label def STATEFIP 11 "district of columbia", modify
    label def STATEFIP 12 "florida", modify
    label def STATEFIP 13 "georgia", modify
    label def STATEFIP 18 "indiana", modify
    label def STATEFIP 21 "kentucky", modify
    label def STATEFIP 22 "louisiana", modify
    label def STATEFIP 24 "maryland", modify
    label def STATEFIP 26 "michigan", modify
    label def STATEFIP 28 "mississippi", modify
    label def STATEFIP 29 "missouri", modify
    label def STATEFIP 34 "new jersey", modify
    label def STATEFIP 36 "new york", modify
    label def STATEFIP 37 "north carolina", modify
    label def STATEFIP 39 "ohio", modify
    label def STATEFIP 40 "oklahoma", modify
    label def STATEFIP 42 "pennsylvania", modify
    label def STATEFIP 45 "south carolina", modify
    label def STATEFIP 48 "texas", modify
    label def STATEFIP 51 "virginia", modify
    label values metro METRO
    label def METRO 1 "not in metropolitan area", modify
    label def METRO 3 "in metropolitan area: not in central/principal city", modify
    label def METRO 4 "in metropolitan area: central/principal city status indeterminable (mixed)", modify
    label values sex SEX
    label def SEX 1 "male", modify
    label def SEX 2 "female", modify
    label values age AGE
    label def AGE 20 "20", modify
    label def AGE 21 "21", modify
    label def AGE 22 "22", modify
    label def AGE 23 "23", modify
    label def AGE 24 "24", modify
    label def AGE 25 "25", modify
    label def AGE 26 "26", modify
    label def AGE 27 "27", modify
    label def AGE 28 "28", modify
    label def AGE 29 "29", modify
    label def AGE 30 "30", modify
    label def AGE 31 "31", modify
    label def AGE 32 "32", modify
    label def AGE 33 "33", modify
    label def AGE 34 "34", modify
    label def AGE 35 "35", modify
    label def AGE 36 "36", modify
    label def AGE 37 "37", modify
    label def AGE 38 "38", modify
    label def AGE 39 "39", modify
    label def AGE 40 "40", modify
    label def AGE 41 "41", modify
    label def AGE 43 "43", modify
    label def AGE 45 "45", modify
    label def AGE 46 "46", modify
    label def AGE 48 "48", modify
    label def AGE 50 "50", modify
    label def AGE 51 "51", modify
    label def AGE 53 "53", modify
    label def AGE 55 "55", modify
    label def AGE 56 "56", modify
    label def AGE 60 "60", modify
    label def AGE 63 "63", modify
    label values race RACE
    label def RACE 2 "black/african american/negro", modify
    label values raced RACED
    label def RACED 200 "black/african american/negro", modify
    label values occ1950 OCC1950
    label def OCC1950 93 "teachers (n.e.c.)", modify
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 24737 observations
    Use the count() option to list more
    Last edited by Jennifer Hearst; 25 Oct 2021, 09:26.

  • #2
    Jennifer:
    welcome to this forum.
    I reply to your first question only here (as year=1940 throughout all your data excerp):
    Code:
    bysort statefip: egen wanted=total( teacher)
    Second thought: I've replaced 1940 with 1941 in _n==100; ths way, I can reply to your second question too:
    Code:
    . replace new_year = 1941 in 100
    
    
    . forvalues i = 1940(1)1941 {
      2. egen teacher`i'=total(teacher) if new_year==`i'
      3.  }
    Last edited by Carlo Lazzaro; 25 Oct 2021, 09:54.
    Kind regards,
    Carlo
    (Stata 18.0 SE)

    Comment


    • #3
      Carlo,

      Thank you so much for your assistance. Please let me make sure that I understand how the code works.
      bysort statefip: egen wanted=total( teacher)

      This code has sorted the dataset by statefip ID which will allow for level two unit analysis. It has also generated a variable for that contains the total number of teachers by state and includes all years. Is it possible to also sort by year? Would that code like this:

      bysort statefip year: egen wanted=total( teacher)


      I’m sorry the data example only included the year 1940. I’m a little confused by the second set of code. To use a different set of years I would modify the code like this, correct?

      forvalues i = 1970(1)1980 {
      egen teacher`i'=total(teacher) if 1980==`i'
      }

      Is that correct? When I ran this code it created 11 new variables “teacher1970” – “teacher1980” , but only “teacher1980” contained any visible value. Teacher1980 seems to contain the total number of observations in the dataset (24737). Is that correct?

      To my understanding forvalues creates a loop of dummy variables but only 1980 is defined. And the second part of the code with egen stores the total teacher count if 1980 is a specific value. Is that correct? So, if I ran that process for another set of years would I then be able to calculate the growth variable. For instance,

      forvalues i = 1960(1)1970 {
      egen teachers`i'=total(teacher) if 1970==`i'
      }

      gen growth = 100 * (teachers1970 – teacher1980/teachers1970)

      When I ran this code I got a variable with the value of 2473600. So, it seems to have multiplied the total number of observations in the dataset by 100. Did I go wrong somewhere?

      Below is a sample of the new variables from the dataset.

      Thank you again for your help. I greatly appreciate this opportunity to learn more about Stata.

      Thank you,

      Jennifer

      . dataex year sample serial cbserial region statefip teacher wanted teacher1970 teacher1980 teachers1960 teachers1970 growth

      ----------------------- copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int year long(sample serial cbserial) byte(region statefip) float(teacher wanted teacher1970 teacher1980 teachers1960 teachers1970 growth)
      2010 201001  11569  709406 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  73782       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  20361       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007   3861       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  98209       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  12740  781204 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  33492       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  11331       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001   6486  402518 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  83954       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 479210       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  26834       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001   9284  571518 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 480002       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  47424       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  13996  862256 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  13088  801820 32 1 1 1015 . 24737 . 24737 2473600
      1940 194001 389339       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 479701       . 32 1 1 1015 . 24737 . 24737 2473600
      1940 194001 306678       . 32 1 1 1015 . 24737 . 24737 2473600
      1940 194001 184165       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007    348       . 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  61172       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  13077  801229 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  19083       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 267892       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 267897       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 269321       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 269730       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 269899       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  18830 1163795 32 1 1 1015 . 24737 . 24737 2473600
      1940 194001 339643       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  16903 1045121 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 272021       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 272349       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 272401       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 272799       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 272904       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007   2418       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  16725       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 273248       . 32 1 1 1015 . 24737 . 24737 2473600
      1940 194001 311454       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  61139       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 273512       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 273888       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 274206       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  18854 1165483 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 269822       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 275170       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 275265       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  16021       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 276343       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  25569       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  11850       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007   8031       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 479911       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 479217       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  15268  943279 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 476186       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  44969       . 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  58187       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007   5113       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 270530       . 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  47752       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  11999       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  12713  779554 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 269981       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 480143       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  10514       . 32 1 1 1015 . 24737 . 24737 2473600
      1950 195001 155401       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 274340       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001   5196  322758 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  20362       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 481841       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  69734       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  56259       . 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  41047       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007   9384       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 473261       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  15286       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  73089       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 475994       . 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  39622       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  19329       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 274815       . 32 1 1 1015 . 24737 . 24737 2473600
      1950 195001 409079       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  29924       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  65549       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007    515       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  52449       . 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  73737       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001  16414 1014699 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  44369       . 32 1 1 1015 . 24737 . 24737 2473600
      2000 200007  13937       . 32 1 1 1015 . 24737 . 24737 2473600
      1990 199001  47424       . 32 1 1 1015 . 24737 . 24737 2473600
      1970 197001 474010       . 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  61862       . 32 1 1 1015 . 24737 . 24737 2473600
      1960 196001 273330       . 32 1 1 1015 . 24737 . 24737 2473600
      2010 201001   8568  528422 32 1 1 1015 . 24737 . 24737 2473600
      1980 198001  55547       . 32 1 1 1015 . 24737 . 24737 2473600
      end
      label values year YEAR
      label def YEAR 1940 "1940", modify
      label def YEAR 1950 "1950", modify
      label def YEAR 1960 "1960", modify
      label def YEAR 1970 "1970", modify
      label def YEAR 1980 "1980", modify
      label def YEAR 1990 "1990", modify
      label def YEAR 2000 "2000", modify
      label def YEAR 2010 "2010", modify
      label values sample SAMPLE
      label def SAMPLE 194001 "1940 1%", modify
      label def SAMPLE 195001 "1950 1%", modify
      label def SAMPLE 196001 "1960 1%", modify
      label def SAMPLE 197001 "1970 form 1 state", modify
      label def SAMPLE 198001 "1980 5%", modify
      label def SAMPLE 199001 "1990 5%", modify
      label def SAMPLE 200007 "2000 1%", modify
      label def SAMPLE 201001 "2010 acs", modify
      label values region REGION
      label def REGION 32 "east south central div", modify
      label values statefip STATEFIP
      label def STATEFIP 1 "alabama", modify
      ------------------ copy up to and including the previous line ------------------

      Listed 100 out of 24737 observations
      Use the count() option to list more

      Comment


      • #4
        I think you will be better served by backing up and explaining what you want to do with this data.

        I need to conduct state level analysis using individual level data. In order to achieve this I need to create a state-level, continuous variable that contains the number of observations by state and year. I also need to calculate a growth variable at the state level between two time points (1970) and (1980). But, I am struggling to create a state level variable in an individual-level dataset.
        Do you just need one variable that counts the number of observations in a given state and year? Secondly, calculate the growth rate of what? The count variable?

        Comment


        • #5
          Jennifer:
          while I do share Andrew's wise advice, please find below some codes that might be useful:
          Code:
          . drop wanted*
          
          . bysort statefip year: egen wanted=total(teacher)
          
          . bysort statefip (year): egen wanted2=total(teacher)
          
          . drop teacher*
          
          . forvalues i = 1940(10)2010 {
          
          .  egen teacher`i'=total(teacher) if year==`i'
           
          .  }
          As an aside, -gen growth...- can't work since when a variable has observed values, the other one has missing ones.
          Kind regards,
          Carlo
          (Stata 18.0 SE)

          Comment


          • #6
            Hello Andrew, thank you for your advice.

            Yes, I need a variable that counts the number of observations in a given state and year. I think I have achieved that with this combination

            For the growth variable, I need to calculate the change of teachers in each state between two time points. For instance, the change in teachers in New York between 1970 and 1980 (all observations are teachers) and I need those values stored in a variable that I can use as a dependent variable in a regression looking at variation in black teachers across states (again, all observations in the dataset are black teachers at this time). Later on, I might be need to do the same at the regional level, but hopefully once I get a handle on the code then I’ll be able to apply it to different scenarios as needed in the future. For instance, I will need to use a similar variable in a difference in difference regression for another part of the project.

            Comment


            • #7
              Carlo,

              Thank you so much for your assistance and help with the code. That certainly would explain the trouble that I’m having with the growth variable. I will work with the code you suggested today. I greatly appreciate your assistance. I do enjoy working with Stata and learning new code.

              Best,

              Jennifer

              Comment


              • #8
                Here is an alternative way to get what you want:


                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input int year long(sample serial cbserial) byte(region statefip) float(teacher wanted teacher1970 teacher1980 teachers1960 teachers1970 growth)
                2010 201001  11569  709406 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  73782       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  20361       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007   3861       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  98209       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  12740  781204 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  33492       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  11331       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001   6486  402518 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  83954       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 479210       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  26834       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001   9284  571518 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 480002       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  47424       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  13996  862256 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  13088  801820 32 1 1 1015 . 24737 . 24737 2473600
                1940 194001 389339       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 479701       . 32 1 1 1015 . 24737 . 24737 2473600
                1940 194001 306678       . 32 1 1 1015 . 24737 . 24737 2473600
                1940 194001 184165       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007    348       . 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  61172       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  13077  801229 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  19083       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 267892       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 267897       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 269321       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 269730       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 269899       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  18830 1163795 32 1 1 1015 . 24737 . 24737 2473600
                1940 194001 339643       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  16903 1045121 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 272021       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 272349       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 272401       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 272799       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 272904       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007   2418       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  16725       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 273248       . 32 1 1 1015 . 24737 . 24737 2473600
                1940 194001 311454       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  61139       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 273512       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 273888       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 274206       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  18854 1165483 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 269822       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 275170       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 275265       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  16021       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 276343       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  25569       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  11850       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007   8031       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 479911       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 479217       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  15268  943279 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 476186       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  44969       . 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  58187       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007   5113       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 270530       . 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  47752       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  11999       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  12713  779554 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 269981       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 480143       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  10514       . 32 1 1 1015 . 24737 . 24737 2473600
                1950 195001 155401       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 274340       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001   5196  322758 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  20362       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 481841       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  69734       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  56259       . 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  41047       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007   9384       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 473261       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  15286       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  73089       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 475994       . 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  39622       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  19329       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 274815       . 32 1 1 1015 . 24737 . 24737 2473600
                1950 195001 409079       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  29924       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  65549       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007    515       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  52449       . 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  73737       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001  16414 1014699 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  44369       . 32 1 1 1015 . 24737 . 24737 2473600
                2000 200007  13937       . 32 1 1 1015 . 24737 . 24737 2473600
                1990 199001  47424       . 32 1 1 1015 . 24737 . 24737 2473600
                1970 197001 474010       . 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  61862       . 32 1 1 1015 . 24737 . 24737 2473600
                1960 196001 273330       . 32 1 1 1015 . 24737 . 24737 2473600
                2010 201001   8568  528422 32 1 1 1015 . 24737 . 24737 2473600
                1980 198001  55547       . 32 1 1 1015 . 24737 . 24737 2473600
                end
                label values year YEAR
                label def YEAR 1940 "1940", modify
                label def YEAR 1950 "1950", modify
                label def YEAR 1960 "1960", modify
                label def YEAR 1970 "1970", modify
                label def YEAR 1980 "1980", modify
                label def YEAR 1990 "1990", modify
                label def YEAR 2000 "2000", modify
                label def YEAR 2010 "2010", modify
                label values sample SAMPLE
                label def SAMPLE 194001 "1940 1%", modify
                label def SAMPLE 195001 "1950 1%", modify
                label def SAMPLE 196001 "1960 1%", modify
                label def SAMPLE 197001 "1970 form 1 state", modify
                label def SAMPLE 198001 "1980 5%", modify
                label def SAMPLE 199001 "1990 5%", modify
                label def SAMPLE 200007 "2000 1%", modify
                label def SAMPLE 201001 "2010 acs", modify
                label values region REGION
                label def REGION 32 "east south central div", modify
                label values statefip STATEFIP
                label def STATEFIP 1 "alabama", modify
                
                frame put statefip year, into(teachers)
                frame teachers{
                    contract statefip year, freq(frequency)
                    xtset statefip year, delta(10)
                    gen pchange= ((frequency-l.frequency)/l.frequency)*100
                }
                frlink m:1 state year, frame(teachers)
                frget frequency pchange, from(teachers)
                frame drop teachers
                Res.:

                Code:
                . sort statefip year
                . l statefip year frequency pchange, sepby(statefip year)
                
                     +----------------------------------------+
                     | statefip   year   freque~y     pchange |
                     |----------------------------------------|
                  1. |  alabama   1940          5           . |
                  2. |  alabama   1940          5           . |
                  3. |  alabama   1940          5           . |
                  4. |  alabama   1940          5           . |
                  5. |  alabama   1940          5           . |
                     |----------------------------------------|
                  6. |  alabama   1950          2         -60 |
                  7. |  alabama   1950          2         -60 |
                     |----------------------------------------|
                  8. |  alabama   1960         23        1050 |
                  9. |  alabama   1960         23        1050 |
                 10. |  alabama   1960         23        1050 |
                 11. |  alabama   1960         23        1050 |
                 12. |  alabama   1960         23        1050 |
                 13. |  alabama   1960         23        1050 |
                 14. |  alabama   1960         23        1050 |
                 15. |  alabama   1960         23        1050 |
                 16. |  alabama   1960         23        1050 |
                 17. |  alabama   1960         23        1050 |
                 18. |  alabama   1960         23        1050 |
                 19. |  alabama   1960         23        1050 |
                 20. |  alabama   1960         23        1050 |
                 21. |  alabama   1960         23        1050 |
                 22. |  alabama   1960         23        1050 |
                 23. |  alabama   1960         23        1050 |
                 24. |  alabama   1960         23        1050 |
                 25. |  alabama   1960         23        1050 |
                 26. |  alabama   1960         23        1050 |
                 27. |  alabama   1960         23        1050 |
                 28. |  alabama   1960         23        1050 |
                 29. |  alabama   1960         23        1050 |
                 30. |  alabama   1960         23        1050 |
                     |----------------------------------------|
                 31. |  alabama   1970         11   -52.17391 |
                 32. |  alabama   1970         11   -52.17391 |
                 33. |  alabama   1970         11   -52.17391 |
                 34. |  alabama   1970         11   -52.17391 |
                 35. |  alabama   1970         11   -52.17391 |
                 36. |  alabama   1970         11   -52.17391 |
                 37. |  alabama   1970         11   -52.17391 |
                 38. |  alabama   1970         11   -52.17391 |
                 39. |  alabama   1970         11   -52.17391 |
                 40. |  alabama   1970         11   -52.17391 |
                 41. |  alabama   1970         11   -52.17391 |
                     |----------------------------------------|
                 42. |  alabama   1980         10   -9.090909 |
                 43. |  alabama   1980         10   -9.090909 |
                 44. |  alabama   1980         10   -9.090909 |
                 45. |  alabama   1980         10   -9.090909 |
                 46. |  alabama   1980         10   -9.090909 |
                 47. |  alabama   1980         10   -9.090909 |
                 48. |  alabama   1980         10   -9.090909 |
                 49. |  alabama   1980         10   -9.090909 |
                 50. |  alabama   1980         10   -9.090909 |
                 51. |  alabama   1980         10   -9.090909 |
                     |----------------------------------------|
                 52. |  alabama   1990         15          50 |
                 53. |  alabama   1990         15          50 |
                 54. |  alabama   1990         15          50 |
                 55. |  alabama   1990         15          50 |
                 56. |  alabama   1990         15          50 |
                 57. |  alabama   1990         15          50 |
                 58. |  alabama   1990         15          50 |
                 59. |  alabama   1990         15          50 |
                 60. |  alabama   1990         15          50 |
                 61. |  alabama   1990         15          50 |
                 62. |  alabama   1990         15          50 |
                 63. |  alabama   1990         15          50 |
                 64. |  alabama   1990         15          50 |
                 65. |  alabama   1990         15          50 |
                 66. |  alabama   1990         15          50 |
                     |----------------------------------------|
                 67. |  alabama   2000         19    26.66667 |
                 68. |  alabama   2000         19    26.66667 |
                 69. |  alabama   2000         19    26.66667 |
                 70. |  alabama   2000         19    26.66667 |
                 71. |  alabama   2000         19    26.66667 |
                 72. |  alabama   2000         19    26.66667 |
                 73. |  alabama   2000         19    26.66667 |
                 74. |  alabama   2000         19    26.66667 |
                 75. |  alabama   2000         19    26.66667 |
                 76. |  alabama   2000         19    26.66667 |
                 77. |  alabama   2000         19    26.66667 |
                 78. |  alabama   2000         19    26.66667 |
                 79. |  alabama   2000         19    26.66667 |
                 80. |  alabama   2000         19    26.66667 |
                 81. |  alabama   2000         19    26.66667 |
                 82. |  alabama   2000         19    26.66667 |
                 83. |  alabama   2000         19    26.66667 |
                 84. |  alabama   2000         19    26.66667 |
                 85. |  alabama   2000         19    26.66667 |
                     |----------------------------------------|
                 86. |  alabama   2010         15   -21.05263 |
                 87. |  alabama   2010         15   -21.05263 |
                 88. |  alabama   2010         15   -21.05263 |
                 89. |  alabama   2010         15   -21.05263 |
                 90. |  alabama   2010         15   -21.05263 |
                 91. |  alabama   2010         15   -21.05263 |
                 92. |  alabama   2010         15   -21.05263 |
                 93. |  alabama   2010         15   -21.05263 |
                 94. |  alabama   2010         15   -21.05263 |
                 95. |  alabama   2010         15   -21.05263 |
                 96. |  alabama   2010         15   -21.05263 |
                 97. |  alabama   2010         15   -21.05263 |
                 98. |  alabama   2010         15   -21.05263 |
                 99. |  alabama   2010         15   -21.05263 |
                100. |  alabama   2010         15   -21.05263 |
                     +----------------------------------------+
                
                .

                Comment


                • #9
                  Hello Andrew,

                  That is brilliant! Thank you so much for showing me this alternative. I believe it does indeed achieve my goal. I've only ever used preserve and restore. This is my first time learning about frames, but I can see how frames are much more versatile. Thank you so much for your assistance and advice. I have learned a great deal.

                  Best,
                  Jennifer

                  Comment

                  Working...
                  X