
  • Different number of observations when applying aweight

    Would anyone be able to explain why I get a different number of observations when applying aweight in Stata?

    Unweighted, the -summarize- command reports the following number of observations:


    sum expshare if wave == 6 & expshare > 0 & expshare < 1 & expshare0==0

        Variable |        Obs        Mean   Std. Dev.        Min        Max
    -------------+---------------------------------------------------------
        expshare |      6,843    .0416952    .0670938    .000657   .9936517


    When I apply aweight, the number of observations decreases to 6,785:

    . sum expshare if wave== 6 & expshare > 0 & expshare < 1 & expshare0==0 [aweight=hhwth]

        Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
    -------------+-----------------------------------------------------------------
        expshare |   6,785  7406469.13    .0390872    .0634303   .000657   .9936517

    Many thanks for your help!

    Ewa

  • #2
    In nearly all Stata commands, there is listwise deletion of any observations that have a missing value for any variable mentioned in the command. You don't show example data illustrating the problem, but I'm pretty confident that this is happening because you have observations with missing values of hhwth that are included in the unweighted -summ- but are excluded from the weighted version. The other possibility is that there are observations with hhwth == 0. In weighted commands, observations with weight zero are excluded.
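
    To see the mechanism in isolation, here is a minimal sketch on the auto data set that ships with Stata. The weight variable w is made up purely for illustration (it is not part of that data set): it is set to missing for three observations and to zero for two more. If the explanation above is right, the drop in r(N) between the two -summarize- runs should equal the final count.

    ```stata
    * Toy data: the auto data set shipped with Stata (price has no missing values)
    sysuse auto, clear

    * Hypothetical weight variable, invented for this illustration only
    generate double w = 1
    replace w = . in 1/3      // three observations with a missing weight
    replace w = 0 in 4/5      // two observations with a zero weight

    * Unweighted: every observation with nonmissing price is used
    summarize price
    scalar n_unw = r(N)

    * Weighted: observations with missing or zero weight are left out
    summarize price [aweight = w]
    scalar n_w = r(N)

    * Compare the drop in sample size with the number of unusable weights
    count if missing(w) | w == 0
    display "difference in N = " n_unw - n_w
    ```

    The same comparison carries over to your data by swapping in expshare, hhwth and your -if- condition.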

    You can quickly check my hypothesis with:
    Code:
    count  if wave == 6 & expshare > 0 & expshare < 1 & expshare0==0 & (missing(hhwth) | hhwth == 0)
    assert r(N) == 6843 - 6785
    If I am right, the -assert- command will give no output. If it says the assertion is false, then something else may be going on. In that case, please post back using the -dataex- command to provide example data that illustrates the same phenomenon. (Not the whole data set: just enough observations to show that the weighted and unweighted commands produce different sample sizes.)
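
    If the -assert- passes and you want to know which of the two causes is at work, one possible extension is to count them separately. This is only a sketch using the variable names from the original post; insample is a hypothetical helper variable introduced here for convenience.

    ```stata
    * Flag the estimation sample used in the -summarize- commands
    * (insample is a hypothetical helper variable, not in the original data)
    generate byte insample = wave == 6 & expshare > 0 & expshare < 1 & expshare0 == 0

    * Observations lost because the weight is missing
    count if insample & missing(hhwth)

    * Observations lost because the weight is exactly zero
    count if insample & hhwth == 0
    ```

    The two conditions are mutually exclusive, so if the hypothesis holds, the two counts should sum to 6843 - 6785 = 58.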

    If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Last edited by Clyde Schechter; 05 Mar 2023, 17:16.



    • #3
      That works. It makes perfect sense. Thank you very much, Clyde!
