
  • Missing values in usespss

    Hi Statalist!

    I have a question about missing values when using the (SSC) command usespss. When I use usespss to import a .sav file into Stata, the missing values come in as ".a", whereas import spss in Stata 16 imports them as ".".

    I was wondering whether this is the intended behavior and whether there is any way to avoid it. Not everyone on my research team has Stata 16, and we need the import to be done directly in Stata, not through third-party software such as Stat/Transfer. Hence we need to use this command.

    Regards

  • #2
    Do you have a replicable example, i.e. an SPSS file others can import that shows how the missing data (MD) get handled differently?

    I can't tell how import spss handles missing data. It may be that usespss does a better job than import spss does, e.g. .a is a better code than . is. The way usespss handles MD is described at

    http://www.radyakin.org/transfer/use...q/info/faq.txt

    If possible, look at the coding in the original spss file, and determine if usespss is handling MD better.

    In any event, it would be easy enough to recode .a to ., so it doesn't seem like much of a problem.
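
    A minimal sketch of that recode, assuming you do not need to keep the user-missing categories apart. It loops over the numeric variables in memory and sets every extended missing value (.a through .z) to system missing. It relies on Stata's sort order, in which . is smaller than .a, .b, and so on:

    ```stata
    * Collapse extended missing values (.a-.z) into system missing (.)
    ds, has(type numeric)
    local numvars `r(varlist)'
    foreach v of local numvars {
        * anything greater than . is an extended missing value
        replace `v' = . if `v' > .
    }
    ```

    Run it right after usespss and before saving. String variables are skipped because they cannot hold numeric missing codes.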

    If cost is the problem, the free rioweb may meet your needs:

    https://gallery.shinyapps.io/rioweb/
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      From my reading of the SPSS documentation, SPSS does not have the extended missing values that Stata provides (.a through .z) in addition to system missing (.). Instead, SPSS distinguishes system-missing and user-missing values.

      Richard's helpful link indicates that user-missing values are imported as extended missing values to preserve the "priority" of missing values. That document also describes a simple way to recode all of those missing values to system missing if you don't care about the distinction.

      All in all, it seems like your dataset contains user missing values, -usespss- recognized and preserved this information in a way that is understood by Stata, and you can deal with that straightforwardly.
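
      To check whether a variable actually carries user-missing codes after import, something like the following may help. It is only a sketch; myvar is a placeholder name, not a variable from any particular file:

      ```stata
      * myvar is a hypothetical variable name; substitute your own
      count if myvar == .        // system missing only
      count if myvar > .         // extended missing (.a through .z)
      misstable summarize myvar  // reports Obs=. and Obs>. side by side
      ```

      A nonzero second count means the import kept at least one user-missing category distinct from system missing.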



      • #4
        This is a replicable example. It shows that import spss changes all missing values to ".", whereas usespss converts user-missing values to .a, .b, .c, etc. So, from the standpoint of preserving missing values, usespss seems to be better than import spss. There may be other ways in which import spss is superior, e.g. it may be able to handle more complicated .sav files (but I don't know that).

        Maybe it is in the documentation and I am missing it, but if not, I would change the import spss documentation to make clear that all SPSS missing values are converted to . in Stata. Or better yet, improve import spss so that it preserves user-missing values like usespss does.

        Code:
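        * download the example SPSS file to the working directory
        copy https://www3.nd.edu/~rwilliam/spssfiles/missing.sav tempmissing.sav, replace
        * official importer (Stata 16 or later)
        import spss tempmissing.sav, clear
        tab1 RINCOME, missing
        * community-contributed importer from SSC, same file
        usespss tempmissing.sav, clear
        tab1 RINCOME, missing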
        Selected output:

        Code:
        . import spss tempmissing.sav, clear
        (6 vars, 1,517 obs)
        
        . tab1 RINCOME, missing
        
        -> tabulation of RINCOME  
        
           RESPONDENTS |
                INCOME |      Freq.     Percent        Cum.
        ---------------+-----------------------------------
              LT $1000 |         36        2.37        2.37
         $1000 TO 2999 |         34        2.24        4.61
         $3000 TO 3999 |         35        2.31        6.92
         $4000 TO 4999 |         29        1.91        8.83
         $5000 TO 5999 |         35        2.31       11.14
         $6000 TO 6999 |         16        1.05       12.20
         $7000 TO 7999 |         14        0.92       13.12
         $8000 TO 9999 |         41        2.70       15.82
        $10000 - 14999 |        119        7.84       23.67
        $15000 - 19999 |        127        8.37       32.04
        $20000 - 24999 |        105        6.92       38.96
        $25000 OR MORE |        321       21.16       60.12
               REFUSED |         40        2.64       62.76
                    . |        565       37.24      100.00
        ---------------+-----------------------------------
                 Total |      1,517      100.00
        
        . usespss tempmissing.sav, clear
        
        . tab1 RINCOME, missing
        
        -> tabulation of RINCOME  
        
           RESPONDENTS |
                INCOME |      Freq.     Percent        Cum.
        ---------------+-----------------------------------
              LT $1000 |         36        2.37        2.37
         $1000 TO 2999 |         34        2.24        4.61
         $3000 TO 3999 |         35        2.31        6.92
         $4000 TO 4999 |         29        1.91        8.83
         $5000 TO 5999 |         35        2.31       11.14
         $6000 TO 6999 |         16        1.05       12.20
         $7000 TO 7999 |         14        0.92       13.12
         $8000 TO 9999 |         41        2.70       15.82
        $10000 - 14999 |        119        7.84       23.67
        $15000 - 19999 |        127        8.37       32.04
        $20000 - 24999 |        105        6.92       38.96
        $25000 OR MORE |        321       21.16       60.12
               REFUSED |         40        2.64       62.76
               [0]NAP |        463       30.52       93.28
                [98]DK |          7        0.46       93.74
                [99]NA |         95        6.26      100.00
        ---------------+-----------------------------------
                 Total |      1,517      100.00
        Last edited by Richard Williams; 24 Sep 2021, 07:22.
