  • All standard errors missing in output table for mlogit using survey data

    I am using Stata 16 on both Mac and Windows (the problem occurs on both).

    DATA
    My dataset looks like this (these are fake data made to look like the proprietary data I'm working with):

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(identity age empl sex marstat urban educ) long country double wgt int WP12258 double(WP12258A WP12259)
    2    17    5    2    4    3    1    9    0.918483471    1    2017100001    7
    1    21    6    2    2    2    2    9    1.063579888    7    2017100007    7
    2    22    6    1    2    2    2    9    0.337893372    1    2017100001    10
    1    23    6    2    2    2    2    9    1.567769236    18    2017100018    10
    1    27    2    1    2    6    1    9    0.475945213    11    2017100011    16
    1    28    6    2    2    2    3    9    1.056964223    19    2017100019    16
    2    31    1    1    2    2    2    9    0.265830759    7    2017100007    24
    2    31    1    2    1    6    1    9    1.072376993    5    2017100005    25
    2    32    6    2    3    1    2    9    0.458895125    6    2017100006    33
    2    35    6    2    2    3    1    9    1.478014368    1    2017100001    44
    1    36    2    1    2    6    1    9    0.43331561    15    2017100015    45
    1    37    6    2    1    2    2    9    1.008920356    8    2017100008    53
    3    39    1    1    2    2    1    9    0.293730707    1    2017100001    61
    1    41    2    1    2    2    2    9    1.298432642    10    2017100010    61
    1    41    1    1    2    1    2    9    1.511035436    9    2017100009    80
    2    45    2    2    2    1    2    9    0.398746139    1    2017100001    81
    1    60    1    1    2    6    1    9    0.553875541    14    2017100014    87
    1    60    1    1    1    2    2    9    0.280035342    9    2017100009    91
    2    61    2    1    2    2    2    9    2.464917433    5    2017100005    92
    1    67    6    1    1    6    1    9    1.056964223    19    2017100019    96
    
    
    end
    label values identity WP22091
    label def WP22091 1 "Being a part of the city or area where you live", modify
    label def WP22091 2 "Being a part of this country", modify
    label def WP22091 3 "Being a part of the world", modify
    label values age WP1220
    label values empl EMP_2010
    label def EMP_2010 1 "Employed full time for an employer", modify
    label def EMP_2010 2 "Employed full time for self", modify
    label def EMP_2010 5 "Employed part time want full time", modify
    label def EMP_2010 6 "Out of workforce", modify
    label values sex WP1219
    label def WP1219 1 "Male", modify
    label def WP1219 2 "Female", modify
    label values marstat WP1223
    label def WP1223 1 "Single/Never been married", modify
    label def WP1223 2 "Married", modify
    label def WP1223 3 "Separated", modify
    label def WP1223 4 "Divorced", modify
    label values urban WP14
    label def WP14 1 "A rural area or on a farm", modify
    label def WP14 2 "A small town or village", modify
    label def WP14 3 "A large city", modify
    label def WP14 6 "A suburb of a large city", modify
    label values educ WP3117
    label def WP3117 1 "Completed elementary education or less (up to 8 years of basic education)", modify
    label def WP3117 2 "Secondary - 3 year TertiarySecondary education and some education beyond secondary education (9-15 years of educatio", modify
    label def WP3117 3 "Completed four years of education beyond high school and/or received a 4-year college degree.", modify
    label values country country
    label def country 9 "Andorra", modify
    The variables of interest are:
    - Identity: the place that respondents associate themselves with (e.g., city, country, world)
    - Age: in years
    - Empl: employment status, categorical
    - Sex: male/female
    - Marstat: marital status, categorical
    - Educ: educational status, categorical
    - Country: country name

    These are the variables I used to declare my survey design:

    Code:
    . codebook wgt WP12258 WP12258A WP12259
    
    -----------------------------------------------------------------------------
    wgt                                                                    Weight
    -----------------------------------------------------------------------------
    
                      type:  numeric (double)
    
                     range:  [.16155396,5.6427716]        units:  1.000e-11
             unique values:  32,654                   missing .:  0/58,146
    
                      mean:         1
                  std. dev:   .716748
    
               percentiles:        10%       25%       50%       75%       90%
                               .299751    .48121    .81263   1.30616     1.967
    
    -----------------------------------------------------------------------------
    WP12258                                               Sampling Stratification
    -----------------------------------------------------------------------------
    
                      type:  numeric (int)
    
                     range:  [1,9902]                     units:  1
             unique values:  198                      missing .:  0/58,146
    
                      mean:   400.069
                  std. dev:   1175.67
    
               percentiles:        10%       25%       50%       75%       90%
                                     3         6        21       131       901
    
    -----------------------------------------------------------------------------
    WP12258A                                            Sampling Stratification 2
    -----------------------------------------------------------------------------
    
                      type:  numeric (double)
    
                     range:  [1.017e+09,1.970e+11]        units:  10
             unique values:  852                      missing .:  0/58,146
    
                      mean:   6.0e+10
                  std. dev:   5.0e+10
    
               percentiles:        10%       25%       50%       75%       90%
                               8.0e+09   2.4e+10   4.8e+10   7.9e+10   1.5e+11
    
    -----------------------------------------------------------------------------
    WP12259                                                Sampling Stage 1 (PSU)
    -----------------------------------------------------------------------------
    
                      type:  numeric (double)
    
                     range:  [1,1.721e+13]                units:  1
             unique values:  10,023                   missing .:  0/58,146
    
                      mean:   2.7e+12
                  std. dev:   6.2e+12
    
               percentiles:        10%       25%       50%       75%       90%
                                    12        29        62        96   1.7e+13
    Note: WP12258 has strata IDs unique to each country and WP12258A has IDs for the same strata that are unique globally.

    The survey design then is:

    Code:
    svyset [pweight = wgt], strata(WP12258A) psu(WP12259)

    PROBLEM
    I run the following model:

    Code:
    svy: mlogit identity age i.empl i.sex i.marstat i.urban i.educ i.country
    (The country dummies are supposed to be country fixed effects.)

    The output includes the coefficients but no standard errors for any of the independent variables. I know this can happen when the model fits the data perfectly, as in this thread: https://www.statalist.org/forums/for...-in-regression. I doubt that's the case here. I also know that standard errors can be missing when the variance matrix is nonsymmetric or highly singular, as here: https://www.stata.com/statalist/arch.../msg00980.html. But I do not get an error message about the variance matrix. Just in case, I tried to locate sparse indicators by dropping each one in turn and re-running the model, and that did not fix the problem.

    I noticed that if I omit declaring strata or psu in svyset, then I do get standard errors in output. I know it's not a solution but perhaps it will help locate the problem.
    Last edited by Rouslan Karimov; 11 Apr 2023, 12:35.

  • #2
    You are probably fitting too many parameters relative to the number of observations. You can verify this by excluding some indicators. If you want to estimate a fixed effects multinomial logit model, see

    Code:
    help xtmlogit
    introduced in Stata 17.


    • #3
      Thanks so much, Andrew. There are 58,146 observations in this dataset and I'm using, by my count, 68 indicators (if we count each category from i.var as an indicator, minus 1 for baseline). I don't have access to Stata 17, unfortunately, but isn't what I'm doing here essentially equivalent to what xtmlogit would do? Thanks again.
      Last edited by Rouslan Karimov; 11 Apr 2023, 13:32.


      • #4
        In survey analysis with stratification, the degrees of freedom can differ significantly from the actual number of observations in the dataset. See https://notstatschat.rbind.io/2019/0...om-brief-note/.

        "but isn't what I'm doing here essentially equivalent to what xtmlogit would do"
        No, the xtmlogit FE estimator is a conditional maximum likelihood estimator: the fixed effects are conditioned out of the likelihood rather than explicitly estimated.
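
        A minimal sketch of what that call would look like, assuming Stata 17 or later (so it cannot be run in the Stata 16 used in #1) and treating country as the panel variable. It deliberately leaves out the weights, strata, and PSUs, so it illustrates only the conditional fixed-effects estimator, not a survey-adjusted analysis.

        ```stata
        * Requires Stata 17 or later
        * Declare country as the panel (group) variable
        xtset country

        * Fixed-effects multinomial logit: the country effects are conditioned
        * out of the likelihood, so there is no i.country term in the model
        xtmlogit identity age i.empl i.sex i.marstat i.urban i.educ, fe
        ```

        By contrast, the model in #1 estimates one coefficient per country and outcome explicitly through i.country.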


        • #5
          Thanks again, Andrew. My design degrees of freedom are 13,433. I ran a bunch of univariate regressions for each indicator in turn but the problem persisted. I also ran the model without predictors; just the DV. Same problem. What do you think?
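
          For reference, Stata's default design degrees of freedom are the number of PSUs minus the number of strata, and both counts are stored after any svy estimation command. A quick sketch of how to check them, assuming the svyset from #1 is in effect (svy: mean age is only a convenient way to populate e(); any svy estimator should do):

          ```stata
          * Any svy estimation command stores the design counts in e()
          svy: mean age

          display "PSUs:      " e(N_psu)
          display "Strata:    " e(N_strata)
          display "Design df: " e(N_psu) - e(N_strata)

          * The degrees of freedom Stata actually uses for inference
          display "e(df_r):   " e(df_r)
          ```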


          • #6
            When I run your code using the example data, I do replicate your problem. However, underneath the output, it also says "Note: Missing standard errors because of stratum with single sampling unit." Do you receive that same message? If so, the solution is to identify all strata with a single sampling unit and then merge those strata with other strata that are, with respect to things that matter for your problem, as similar as possible.

            If you do not receive any notes or warnings from Stata, then the problem is more obscure.
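
            One way to find and merge the offending strata, sketched under the assumption that the svyset from #1 is already in effect. svydescribe with the single option lists only the strata that contain one sampling unit, and the manual count below it does the same check by hand. The names psu_tag, n_psu, and strata_m are made up for illustration, and folding 2017100018 into 2017100019 (IDs taken from the example data in #1) is an arbitrary choice that shows the mechanics, not a recommendation about which strata are actually similar.

            ```stata
            * List only the strata that contain a single PSU
            svydescribe, single

            * Same check by hand: count distinct PSUs within each stratum
            format WP12258A %15.0f
            egen byte psu_tag = tag(WP12258A WP12259)
            bysort WP12258A: egen n_psu = total(psu_tag)
            list country WP12258A WP12259 if psu_tag & n_psu == 1, noobs

            * Illustrative merge: fold singleton stratum 2017100018 into 2017100019
            * (which strata to combine is a substantive judgment, not a coding one)
            generate double strata_m = WP12258A
            replace strata_m = 2017100019 if WP12258A == 2017100018

            * Re-declare the design with the merged strata and check again
            svyset WP12259 [pweight = wgt], strata(strata_m)
            svydescribe, single
            ```

            Repeat the merge step until svydescribe, single no longer lists any strata, then re-run the svy: mlogit command.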

            Added: Had you, in #1, followed the guidance in the FAQ to show the exact complete output of commands that need troubleshooting, you probably would have had your problem resolved in a matter of minutes rather than hours.
            Last edited by Clyde Schechter; 11 Apr 2023, 17:41.


            • #7
              Thanks so much, Clyde. This is my first posting; I tried to follow the FAQ but obviously missed a very important part. I do get the same message about single sampling units, so now I know what the problem is. Thanks again to you and Andrew for your time and guidance.
