  • SVY and Cluster Standard Errors

    Dear Statalist,

    I have just started working with the svy commands in Stata and have the following question. I am using the World Bank Enterprise Survey database, which consists of different companies surveyed in different countries and in different years. For example, in 2017 the survey was run in Argentina on X companies, in 2020 in Brazil, and so on. Below is a brief example of my data.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str28 country_01 double(idstd wt strata)
    "Afghanistan2014" 533626                  1  46
    "Afghanistan2014" 533670                2.7  16
    "Afghanistan2014" 533494  52.76923076923077   6
    "Afghanistan2014" 533350                  9  72
    "Afghanistan2014" 533706                2.5  59
    "Afghanistan2014" 533362                  1  31
    "Afghanistan2014" 533663                  1  17
    "Afghanistan2014" 533319                  9  72
    "Afghanistan2014" 533667                2.7  16
    "Afghanistan2014" 533339               22.1  15
    "Afghanistan2014" 533699                 12  57
    "Afghanistan2014" 533361                  1  32
    "Afghanistan2014" 533318                  1  74
    "Afghanistan2014" 533597                  9  71
    "Afghanistan2014" 533662                2.7  16
    "Afghanistan2014" 533623 2.1666666666666665  44
    "Afghanistan2014" 533629 27.666666666666668   9
    "Afghanistan2014" 533668                2.7  16
    "Afghanistan2014" 533683  83.66666666666667  25
    "Afghanistan2014" 533701                 12  57
    "Afghanistan2014" 533627               22.1  14
    "Afghanistan2014" 533687  83.66666666666667  25
    "Afghanistan2014" 533697                2.5  27
    "Afghanistan2014" 533406 1.1666666666666667  63
    "Afghanistan2014" 533680                  1  17
    "Afghanistan2014" 533355 2.1666666666666665  45
    "Afghanistan2014" 533661                  1  17
    "Afghanistan2014" 533705                2.5  59
    "Afghanistan2014" 533607                  1  73
    "Afghanistan2014" 533702                 12  57
    "Afghanistan2014" 533328              48.75  40
    "Afghanistan2014" 533664                2.7  16
    "Afghanistan2014" 533684               13.4  29
    "Afghanistan2014" 533332                 26  43
    "Afghanistan2014" 533708                  2  56
    "Afghanistan2014" 533337                  1  31
    "Afghanistan2014" 533641               22.1  14
    "Afghanistan2014" 533660 27.666666666666668   9
    "Afghanistan2014" 533598                  9  71
    "Afghanistan2014" 533672                2.7  16
    "Afghanistan2014" 533700                 12  57
    "Afghanistan2014" 533679 2.6666666666666665  11
    "Afghanistan2014" 533637               22.1  14
    "Afghanistan2014" 533624 2.1666666666666665  44
    "Afghanistan2014" 533665                2.7  16
    "Afghanistan2014" 533669                2.7  16
    "Afghanistan2014" 533331                 26  43
    "Afghanistan2014" 533640               22.1  14
    "Afghanistan2014" 533314                2.5  60
    "Afghanistan2014" 533334                 26  43
    "Afghanistan2014" 533320                  9  72
    "Afghanistan2014" 533600                  9  71
    "Afghanistan2014" 533682               13.4  29
    "Afghanistan2014" 533357                 26  43
    "Afghanistan2014" 533651 27.666666666666668   9
    "Afghanistan2014" 533636               22.1  14
    "Afghanistan2014" 533622 2.1666666666666665  44
    "Afghanistan2014" 533477 1.1666666666666667  63
    "Afghanistan2014" 533621 2.1666666666666665  44
    "Afghanistan2014" 533671                2.7  16
    "Afghanistan2014" 533693                  1  30
    "Afghanistan2014" 533371 25.529411764705884  61
    "Afghanistan2014" 533608                  1  73
    "Afghanistan2014" 533634               22.1  14
    "Afghanistan2014" 533349                  1  74
    "Afghanistan2014" 533645 27.666666666666668   9
    "Afghanistan2014" 533666                2.7  16
    "Afghanistan2014" 533689                  1  30
    "Afghanistan2014" 533612                  1  73
    "Afghanistan2014" 533639               22.1  14
    "Afghanistan2014" 533659 27.666666666666668   9
    "Afghanistan2014" 533356 2.1666666666666665  45
    "Argentina2010"   495153 24.657625198364258  21
    "Argentina2010"   495049 14.333484649658203 126
    "Argentina2010"   494598  7.238102436065674  93
    "Argentina2010"   495015  8.942412376403809  23
    "Argentina2010"   494794 42.291297912597656  19
    "Argentina2010"   494774  7.743322372436523  17
    "Argentina2010"   494680 13.792357444763184  61
    "Argentina2010"   494423 14.333484649658203 127
    "Argentina2010"   494634 14.588567733764648  91
    "Argentina2010"   494456                  1  92
    "Argentina2010"   495048 21.137067794799805 129
    "Argentina2010"   495386  4.113980770111084  66
    "Argentina2010"   495382                  1  22
    "Argentina2010"   495190 42.291297912597656  19
    "Argentina2010"   495396  3.437615394592285  22
    "Argentina2010"   494620 2.0523295402526855  92
    "Argentina2010"   495361 42.291297912597656  19
    "Argentina2010"   495174 2.2152016162872314  24
    "Argentina2010"   495047  3.437615394592285  22
    "Argentina2010"   494780  8.942412376403809  23
    "Argentina2010"   494650                  1  62
    "Argentina2010"   494398  43.27082061767578 127
    "Argentina2010"   494854                  1  22
    "Argentina2010"   494797  5.139439582824707  20
    "Argentina2010"   494992 24.657625198364258  21
    "Argentina2010"   494963  2.739043712615967  24
    "Argentina2010"   494649                  1  62
    "Argentina2010"   494687                  1  70
    end
    label values idstd IDSTD
    label values strata STRATA

    With this in hand, I first set my survey structure like this:

    Code:
    svyset idstd [pweight=wt], strata(strata) singleunit(scaled)
    After that, I want to run a logit regression with standard errors clustered at the country level, because there might be correlation within a country. So I type this code:

    Code:
    svy: logit collateral n_outcome age lnemployees i.ownership, vce(cluster country)
    However, Stata tells me: option vce() of logit is not allowed with the svy prefix. So, given the design of the survey and therefore my svyset command, am I already accounting for standard errors clustered at the country level, so that adding vce(cluster ...) makes no sense, or do I have to specify it with another command?

    Thank you in advance!
    Last edited by Ibai Ostolozaga Falcon; 24 Apr 2024, 07:36.

  • #2
    it appears that you have multilevel data so why not use, e.g., -melogit- as your svy estimation option? see
    Code:
    help svy_estimation


    • #3
      Unless I'm mistaken, if country is part of the survey sampling design structure, the svy command is already accounting for clustering automatically. There should be survey documentation that explains the sampling design and what the survey weighting is able to accurately produce.
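
      To make the distinction concrete, here is a minimal sketch, not a definitive recommendation. It assumes the variable names from the original post; country_id and strata_id are hypothetical helper variables that are not in the posted data. With idstd declared as the PSU in svyset, each firm is its own sampling unit, so the design-based standard errors reflect the weights and the stratification but are not clustered by country. Because vce() is not allowed with the svy prefix, clustering by country would have to be requested outside of svy, for example with pweights and vce(cluster) on logit directly. The strata codes in the example also seem to repeat across country surveys (stratum 17 appears under both Afghanistan2014 and Argentina2010), so the sketch builds strata identifiers that are unique across surveys; if the codes are already unique, that step is harmless.

      ```stata
      * Sketch only: country_id and strata_id are hypothetical helper variables.
      * Numeric identifier for each country-year survey
      egen country_id = group(country_01)
      * Strata identifiers that are unique across country-year surveys
      egen strata_id = group(country_01 strata)

      * Option A: design-based standard errors.
      * Each firm (idstd) is its own PSU within its stratum; no country-level clustering.
      svyset idstd [pweight=wt], strata(strata_id) singleunit(scaled)
      svy: logit collateral n_outcome age lnemployees i.ownership

      * Option B: weighted logit with cluster-robust standard errors by country.
      * This uses the weights but ignores the stratification.
      logit collateral n_outcome age lnemployees i.ownership [pweight=wt], vce(cluster country_id)
      ```

      Which option is appropriate depends on the sampling design described in the survey documentation. Note also that cluster-robust standard errors can be unreliable when the number of clusters (here, country surveys) is small.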


      • #4
        Originally posted by Erik Reinbergs
        Unless I'm mistaken, if country is part of the survey sampling design structure, the svy command is already accounting for clustering automatically. There should be survey documentation that explains the sampling design and what the survey weighting is able to accurately produce.
        Thank you for your answer, Erik. According to the documentation, the sample weights are constructed for each country so as to produce population-representative statistics. These weights capture the probability of an enterprise in a given country being eligible, and are based on the different strata within the country. The strata are defined by the size, industry, and location of the enterprise.
