  • A question on sum and by

    Hello,

    I have a question on how to use r(sum) with by. I am working with SCF data and I am trying to compute the number of individuals in the top decile income group, weighted with the sample weights, for each survey year.

    I tried doing it this way:
    Code:
    gen howmany=1
    bys year: sum(howmany) [aw=wgtI95W95] if incgroup==10
    I need to create a variable that would contain the number of observations for each year. However, I am not sure how to save the number of observations when using bys.

    Would appreciate any help.




  • #2
    "However, I am not sure how to save the number of observations when using bys."
    The short answer is, you can't. That's the wrong tool for this job.

    First, let me parse your request. On the one hand, you say you want a count of the number of observations. But you are also using weights in your -summarize- command, so it isn't really that. What your -summarize- command will actually calculate is the total of wgtI95W95 in each year, restricted to observations with incgroup == 10. Assuming this is what you want, the way to get it for each group is:

    Code:
    by year, sort: egen wanted = total(cond(incgroup == 10, wgtI95W95, .))
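    To see what that command produces, here is a minimal sketch on made-up data. The variable name wgt and all values are hypothetical stand-ins for wgtI95W95 and the real survey records, and the tag variable is added only to list one row per year.

```stata
* Toy data: hypothetical values, not taken from the survey in the question
clear
input year incgroup wgt
1995 10 2.5
1995 10 1.5
1995  3 4.0
1998 10 3.0
1998  7 1.0
end

* cond() passes the weight through for top-decile rows and missing otherwise;
* egen's total() ignores missing values, so only incgroup == 10 rows contribute.
* The result is repeated on every observation of the same year.
by year, sort: egen wanted = total(cond(incgroup == 10, wgt, .))

* Mark one observation per year so each yearly total is listed once
egen tag = tag(year)
list year wanted if tag, noobs
```

    With these toy values, wanted is 4 on every 1995 row and 3 on every 1998 row, including the rows where incgroup is not 10.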
    "I am working with SCF data"
    Statalist is a multidisciplinary, international forum. It may well be that everyone in your circle knows what SCF means, but it is a fair bet that a substantial number of people on this forum do not. It doesn't ring a bell with me. The harm of using unfamiliar abbreviations is that somebody who might have answered your question before me may have seen SCF and decided to skip the question without reading further, assuming that it's going to depend on things he or she doesn't know. The only common knowledge here is a bit of statistics and a bit of Stata, plus anything that any college-educated person around the world would know. Everything else, and anything you are in doubt about, should be spelled out.



    • #3
      Replying to Clyde Schechter (#2):
      Thank you very much for your prompt response and for the clarification! I apologize for not being more specific; I will not use unfamiliar abbreviations in the future. Thanks again!
