Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Exporting Results of a tab command

    Hi

    I run a "by year: tab command" to get the number of households from the IPMUS CPS dataset using the household weights. I have been manually copying and pasting the tables in excel and have not be successful in exporting them using the put excel command. I tried looking up the command online but none of the responses seem to serve my purpose. To be clear, I run the following command:
    by year: tab age_new race_new [iw=hwtsupp] if pernum==1 & gq==1

    which generates the total number of households by age , race and by year. I have data for years from 1994-2017 and need to export them to excel. I can do a manual copy and paste but my manager wants me to automate the whole thing so that there is no chance of an error creeping in.

    The results of the command above are as below (Please excuse me as I cannot install dataex on the stata here as I have it on my office laptop which has high security, therefore pasting the results here).:

    by year: tab age_new race_new [iw=hwtsupp] if pernum==1 & gq==1


    -> year = 1994
    race_new
    age_new H NHB NHO NHW Total
    20 53,218 86,063 23,129 376,499 538,910
    75 366,961 678,916 110,745 8,010,804 9,167,426
    2024 616,148 651,337 185,311 3,273,077 4,725,873
    2529 1,045,268 1,173,399 311,658 5,942,035 8,472,359
    3034 1,280,701 1,513,661 426,486 8,024,158 11,245,005
    3539 1,071,539 1,465,635 455,131 8,803,026 11,795,331
    4044 979,417 1,292,771 376,040 7,849,920 10,498,147
    4549 789,793 1,034,206 302,652 7,094,251 9,220,902
    5054 584,439 775,019 265,277 5,990,874 7,615,609
    5559 459,107 668,473 155,091 4,951,795 6,234,466
    6064 410,656 614,688 106,759 4,821,587 5,953,690
    6569 339,182 567,250 107,665 5,014,310 6,028,408
    7074 286,588 487,201 91,233 4,745,538 5,610,561
    Total 8,283,016 11,008,619 2,917,177 74,897,874 97,106,687

    I have similar results for all years. I want to automatically export them to excel. Can someone help me please?

    Thanks a ton in advance.

    Best
    Rama

  • #2
    You want the tabulation results as a new dataset. I would use contract for that purpose. I don't know why you are using importance weights, but I guess that frequency weights would work fine for your kind of data. What is given by a by: prefix just becomes an extra categorical variable with contract.

    I am not familiar with your data and can't even tell you what IPUMS means, but a guess at your syntax is

    Code:
    contract age_new race_new year [fw=hwtsupp] if pernum==1 & gq==1

    Comment


    • #3
      Thanks Nick. The data I am using is the Current Population Survey by the US Census Bureau. The data contains only a sample of the population. So we have to apply the weights to get the full population data. And these weights are noninteger weights. So I was using iw = hwtsupp. With the contract command, it is not allowing the use of non integer weights. Any other solution you could recommend.

      Comment


      • #4
        Something like

        Code:
        gen freq = 1 
        
        collapse (sum) freq [iw=hwtsupp] if pernum==1 & gq==1, by(age_new race_new year)

        Comment


        • #5
          That worked.. exactly what I was looking for.. thank you so much Nick. Really appreciate your help.

          Comment


          • #6
            I may be able to shed some light on the issue of importance vs probability weights. I used to intern at a think tank in Washington, DC, and they frequently did tabulations of survey data. I was actually instructed to use importance weights when I was there. I believe they work the same way as probability weights (i.e. pweight) for simple tabulations.
            Rama Goda That said, I would recommend using the pweight option, because it is known to be correct. Any statistical analysis on the CPS, or any similar survey, will require you to use pweight. For example, see this page by the University of North Carolina's population center.

            If you tabulated the weight variable, you'd probably see values like, for example, 3000.75 - that means the observation had a 1 / 3000.75 probability of being sampled, or equivalently that it represents 3000.75 persons. pweight means the inverse of the probability of being sampled. The CPS, like many publicly available surveys, has a complex sampling scheme, so each person has a different probability of being sampled.

            fweight does literally mean how many people the observation represents, but it's restricted to integer values. pweights do not take on integer values. What Nick's code did was sum up the individual observations over your by groups, weighted by the pweight. After collapsing like that, there are no longer any weights, and the -contract- command works correctly.
            Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

            When presenting code or results, please use the code delimiters format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

            Comment

            Working...
            X