  • Output from -swaic- differs from output when I run the recommended model

    Using Stata/MP 17.0, I ran the following using xi: logistic and the swaic command and got the following output:
    Code:
     xi: logistic under18SW age country i.q0_c literate i.q1_27 i.employed rapedasminor everpregnant kids q10_22 q10_27 q10_26 bcpill implant malecondom anybc stisymptoms q8_27 q8_81
    i.q0_c            _Iq0_c_1-4          (naturally coded; _Iq0_c_1 omitted)
    i.q1_27           _Iq1_27_1-5         (naturally coded; _Iq1_27_1 omitted)
    i.employed        _Iemployed_1-3      (naturally coded; _Iemployed_1 omitted)
    
    Logistic regression                                     Number of obs =    459
                                                            LR chi2(25)   = 170.38
                                                            Prob > chi2   = 0.0000
    Log likelihood = -180.58356                             Pseudo R2     = 0.3205
    
    ------------------------------------------------------------------------------
       under18SW | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |   .8068689   .0344535    -5.03   0.000     .7420898    .8773028
         country |    5.16843   4.340895     1.96   0.051      .996428    26.80843
        _Iq0_c_2 |   .8566898   .4172976    -0.32   0.751      .329761    2.225604
        _Iq0_c_3 |   .8453531   .3933348    -0.36   0.718     .3396113    2.104235
        _Iq0_c_4 |   4.882497   2.780546     2.78   0.005     1.599149    14.90717
        literate |    .626898   .2815075    -1.04   0.298     .2599957    1.511567
       _Iq1_27_2 |   8.227339   5.233809     3.31   0.001     2.364661    28.62529
       _Iq1_27_3 |    .755541   .5434995    -0.39   0.697     .1844782    3.094361
       _Iq1_27_4 |    .570819   .6225295    -0.51   0.607     .0673266     4.83961
       _Iq1_27_5 |   .2721441    .332142    -1.07   0.286      .024885    2.976183
    _Iemployed_2 |   .9248885   .3140542    -0.23   0.818     .4753968    1.799378
    _Iemployed_3 |   .8502925   .2975205    -0.46   0.643      .428282    1.688134
    rapedasminor |   6.407848    4.00002     2.98   0.003     1.885212    21.78032
    everpregnant |   .4024507    .151292    -2.42   0.015     .1926296    .8408184
            kids |   1.038197   .1937045     0.20   0.841     .7202193    1.496564
          q10_22 |   .3100108   .1044891    -3.47   0.001     .1601325    .6001699
          q10_27 |   .7528313   .5496197    -0.39   0.697     .1799952    3.148724
          q10_26 |   .8189497   .7416964    -0.22   0.825     .1387878    4.832402
          bcpill |   1.530976   .9696693     0.67   0.501      .442431    5.297746
         implant |   .4567191   .1877271    -1.91   0.057     .2040687    1.022167
      malecondom |   1.075177   .4430269     0.18   0.860     .4794491    2.411114
           anybc |   .9240853   .5042079    -0.14   0.885     .3171551     2.69248
     stisymptoms |   1.778505   .5658449     1.81   0.070     .9533207    3.317962
           q8_27 |   .4237403   .1716452    -2.12   0.034     .1915588    .9373406
           q8_81 |    .120456   .1060314    -2.40   0.016     .0214565     .676236
           _cons |   921.5155   1567.715     4.01   0.000     32.84037    25858.14
    ------------------------------------------------------------------------------
    Note: _cons estimates baseline odds.
    
    . 
    end of do-file
    
    . do "C:\Users\Ashley\AppData\Local\Temp\STD8a74_000000.tmp"
    
    . swaic, model
    Stepwise Model Selection by AIC
    logistic regression. 
    number of obs = 459
    ------------------------------------------------------------------------------
    under18SW           |  Df     Chi2     P>Chi2  -2*ll    Df Res.  AIC
    --------------------+---------------------------------------------------------
    Null Model          |                          531.55   458      533.55 
    Step 1:age          |  1      50.84    1.0e-12 480.71   457      484.71 
    Step 2:q10_22       |  1      38.015   7.0e-10 442.7    456      448.7  
    Step 3:q8_81        |  1      18.401   1.8e-05 424.29   455      432.29 
    Step 4:implant      |  1      9.5924   .002    414.7    454      424.7  
    Step 5:rapedasminor |  1      8.6041   .0034   406.1    453      418.1  
    Step 6:_Iq1*        |  4      14.079   .007    392.02   449      412.02 
    Step 7:everpregnant |  1      4.7411   .0295   387.28   448      409.28 
    Step 8:_Iq0*        |  3      10.929   .0121   376.35   445      404.35 
    Step 9:q8_27        |  1      5.0783   .0242   371.27   444      401.27 
    Step 10:stisymptoms  | 1      3.9949   .0456   367.28   443      399.28 
    Step 11:country      | 1      3.821    .0506   363.46   442      397.46 
    Step 12:literate     | 1      1.177    .278    362.28   441      398.28 
    Step 13:q10_27       | 1      .32802   .5668   361.95   440      399.95 
    Step 14:bcpill       | 1      .4189    .5175   361.53   439      401.53 
    Step 15:q10_26       | 1      .04794   .8267   361.48   438      403.48 
    Step 16:kids         | 1      .03902   .8434   361.44   437      405.44 
    Step 17:malecondom   | 1      .03609   .8493   361.41   436      407.41 
    Step 18:anybc        | 1      .02188   .8824   361.39   435      409.39 
    Step 19:_Iemployed*  | 2      .21983   .8959   361.17   433      413.17 
    ------------------------------------------------------------------------------
    
    Logistic regression                               Number of obs   =        459
                                                      LR chi2(16)     =     168.10
                                                      Prob > chi2     =     0.0000
    Log likelihood = -181.72788                       Pseudo R2       =     0.3162
    
    ------------------------------------------------------------------------------
       under18SW | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |   .8197639   .0315347    -5.17   0.000     .7602296    .8839605
          q10_22 |   .3037018   .0934923    -3.87   0.000      .166116    .5552434
           q8_81 |   .1194191   .0974738    -2.60   0.009     .0241155    .5913603
         implant |   .4261108   .1281858    -2.84   0.005     .2362968    .7683997
    rapedasminor |   6.347597   3.856351     3.04   0.002     1.929654    20.88042
       _Iq1_27_2 |   8.368875   5.109157     3.48   0.001     2.529372     27.6899
       _Iq1_27_3 |   .8229422   .5662495    -0.28   0.777     .2136364    3.170031
       _Iq1_27_4 |   .6491777   .6760554    -0.41   0.678      .084318    4.998125
       _Iq1_27_5 |   .2905738   .3426013    -1.05   0.295     .0288168    2.929992
    everpregnant |   .4210521   .1344656    -2.71   0.007     .2251642    .7873583
        _Iq0_c_2 |   .8320195   .3900762    -0.39   0.695     .3319435    2.085465
        _Iq0_c_3 |   .7314145   .2911038    -0.79   0.432     .3352621    1.595669
        _Iq0_c_4 |   6.483417   3.229525     3.75   0.000      2.44232    17.21097
           q8_27 |   .4408596   .1716653    -2.10   0.035     .2055194    .9456876
     stisymptoms |   1.854351   .5460797     2.10   0.036     1.041179     3.30262
         country |    4.19649   3.466901     1.74   0.083      .831136    21.18851
    ------------------------------------------------------------------------------
    
    minimun AIC =  397.456;  model: age q10_22 q8_81 implant rapedasminor _Iq1* everpregnant _Iq0* q8_27 stisymptoms country
    When I try to run the recommended model, the odds ratios and p values are different:
    Code:
    logistic under18SW age q10_22 q8_81 implant rapedasminor _Iq1* everpregnant _Iq0* q8_27 stisymptoms country
    
    Logistic regression                                     Number of obs =    468
                                                            LR chi2(16)   = 170.00
                                                            Prob > chi2   = 0.0000
    Log likelihood = -186.60462                             Pseudo R2     = 0.3129
    
    ------------------------------------------------------------------------------
       under18SW | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |   .8218847   .0312799    -5.15   0.000     .7628081    .8855366
          q10_22 |   .3060227   .0926306    -3.91   0.000     .1690838     .553867
           q8_81 |   .1189295   .0968038    -2.62   0.009     .0241238    .5863182
         implant |   .4295743   .1272993    -2.85   0.004     .2403227    .7678597
    rapedasminor |   6.439381   3.914557     3.06   0.002       1.9561    21.19811
       _Iq1_27_2 |   8.347417   5.097669     3.47   0.001     2.521932    27.62936
       _Iq1_27_3 |   .7923137   .5435732    -0.34   0.734     .2065016    3.039981
       _Iq1_27_4 |   .6814448   .7080105    -0.37   0.712     .0889295    5.221746
       _Iq1_27_5 |   .2942724   .3476964    -1.04   0.301     .0290413    2.981834
    everpregnant |    .400074   .1255705    -2.92   0.004     .2162595    .7401255
        _Iq0_c_2 |    .860965   .3835546    -0.34   0.737     .3595675    2.061534
        _Iq0_c_3 |   .7399185   .2904172    -0.77   0.443     .3428374    1.596907
        _Iq0_c_4 |   6.564508   3.257736     3.79   0.000     2.481879    17.36296
           q8_27 |   .4362714   .1691763    -2.14   0.032     .2040233    .9328968
     stisymptoms |   1.714255   .4959215     1.86   0.062     .9723638     3.02219
         country |   4.089606   3.378917     1.70   0.088     .8098437    20.65198
           _cons |   296.9568   400.0529     4.23   0.000     21.18242    4163.043
    ------------------------------------------------------------------------------
    Note: _cons estimates baseline odds.
    What could explain the discrepancy? Which output should I use? I would appreciate any guidance.

    Many thanks,
    Ashley Grosso

  • #2
    The models aren't fitted to the same subset. Look at the number of observations: 459 in the swaic output against 468 when you run the recommended model yourself. This is presumably a side-effect of omissions due to missing values. That is, if you don't use certain predictors, the number of observations available is greater. Or so I guess.
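
    One way to check this guess is to refit the recommended model on exactly the observations the full model used. The following is only a sketch for a do-file, using the variables from post #1; the name fullsample is made up for illustration. If missing values are indeed the cause, the second model should report 459 observations and agree with the table printed by swaic.

    ```stata
    * Refit the full model so that e(sample) marks the observations it used
    xi: logistic under18SW age country i.q0_c literate i.q1_27 i.employed ///
        rapedasminor everpregnant kids q10_22 q10_27 q10_26 bcpill implant ///
        malecondom anybc stisymptoms q8_27 q8_81

    * fullsample is a hypothetical name: 1 if the observation was used above
    generate byte fullsample = e(sample)

    * Refit the model suggested by swaic on those observations only
    logistic under18SW age q10_22 q8_81 implant rapedasminor _Iq1* ///
        everpregnant _Iq0* q8_27 stisymptoms country if fullsample

    * Report the log likelihood and AIC for comparison with the swaic table
    estat ic
    ```

    AIC values are only comparable between models fitted to the same observations, which is a reason to hold the sample fixed while comparing candidate models.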

    By the way:

    * swaic from the Stata Journal is community-contributed, as you are asked to explain (FAQ Advice #12).

    * in Stata 17, xi: is essentially redundant, and indeed using it can inhibit some useful stuff. See the help for xi:.

    * many statistical people will tell you that stepwise selection of predictors is evil, although they don't usually agree on what is both simple and definitely better. Frank Harrell's Regression Modeling Strategies (Springer 2015) is one source.


    • #3
      Thank you for the reply. I incorrectly assumed that the output from swaic was based only on the observations available for the recommended model (which has fewer missing values because it uses fewer variables), when it is actually based on the observations used by the full model with all the variables.

      I appreciate the clarification that I should have explained that the command is community-contributed, and I will make sure to do so in the future.

      I used xi because otherwise the command swaic gives the error message "factor-variable operators not allowed r(101);". But I am reviewing the help for xi: as you suggested.
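
      Since swaic itself needs the xi: dummies, one possible compromise is to keep xi: for the selection step only and to refit the chosen model with factor-variable notation. This is a sketch, assuming the variables from post #1; it has no sample restriction, so it uses every observation available for these variables.

      ```stata
      * Chosen model with factor-variable notation instead of the _I* dummies
      * (the lowest level is the default base, matching the omitted xi: dummies)
      logistic under18SW age q10_22 q8_81 implant rapedasminor i.q1_27 ///
          everpregnant i.q0_c q8_27 stisymptoms country

      * Joint test of all levels of q1_27
      testparm i.q1_27

      * Average predicted probability at each level of q0_c
      margins q0_c
      ```

      Postestimation commands such as margins recognize q0_c as categorical only when the model was specified with factor-variable notation.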

      When I completed my doctoral studies in public administration I was taught not to use stepwise regression, but I work in public health now and it is more commonly used by my collaborators. I will take a look at the source you suggested. I tend not to use stepwise selection if I am interested in the relationship between two specific variables and instead just control for things that are significantly related to both the dependent variable and the main independent variable of interest. But if I'm interested in multiple factors related to one variable (in this case various health and social outcomes related to initiation of selling sex as a minor among sex workers) I sometimes use stepwise selection of predictors.


      • #4
        Good point about swaic. It makes sense that it needs old-style output to function at all.
