
  • SPSS to Stata

    Hi,
    I have a syntax file in SPSS that I would like to replicate in Stata so that I can do some imputation. I think the syntax looks fairly simple, but I'm still not able to get the same results in Stata.
    The problem is that I get a different N in SPSS and in Stata when I run the analyses.

    The whole syntax in SPSS looks like this:


    DATASET ACTIVATE DataSet3.
    WEIGHT BY weight.

    REGRESSION
    /DESCRIPTIVES MEAN STDDEV CORR SIG N
    /MISSING LISTWISE
    /STATISTICS COEFF OUTS R ANOVA CHANGE
    /CRITERIA=PIN(.05) POUT(.10)
    /NOORIGIN
    /DEPENDENT Direct_Violence
    /METHOD=ENTER gender threat political edu1 edu2 edu3 inc1 inc2 year03 year04 rel1 rel2 rel3 age.

    I suspect it's either the MISSING LISTWISE or the PIN/POUT criteria that gives me different results. Does anyone know what the equivalents of these commands are in Stata?

    Thankful for all the help I can get.
    -Sophie

  • #2
    Sophie: Please read and act on

    https://www.statalist.org/forums/help#realnames

    https://www.statalist.org/forums/help#adviceextras #3

    and register with a family name too.

    I've not used SPSS for some decades and have forgotten all I ever knew. But the claim of different results is hard to discuss even by those fluent in Stata and SPSS without a data example and Stata syntax too.

    That said, stepwise regression has no visible fans here. To https://www.stata.com/support/faqs/s...sion-problems/ I would add the discussion in

    Harrell, F.E. 2015. Regression Modeling Strategies. Cham: Springer.



    • #3
      I would suspect the weight.
      Show us your equivalent Stata syntax?

      It looks like you are using the SPSS GUI to generate syntax, and that this is a single model, not a stepwise model (METHOD=ENTER), so the default PIN and POUT criteria have no effect. Listwise deletion of missing values is also the default in Stata.

      Another possible culprit would be differing variable lists.

      Are you using factor variable notation in Stata? Generating indicator variables in Stata? Or have you simply transferred the data from SPSS to Stata? Another possibility is that you have generated the indicator variables differently in each program.
      Doug Hemken
      SSCC, Univ. of Wisc.-Madison



      • #4
        Without code and data this is all just a guessing game. But another guess is that missing data isn't coded right in Stata. SPSS data sets often use user-defined codes like 99 and 999 for missing data. In Stata you need to change these to ., .a, .b, etc. Otherwise the 99s and 999s will be treated as legitimate non-missing values.

        These handouts show how to do OLS regression in SPSS and in Stata:

        https://www3.nd.edu/~rwilliam/stats1/OLS-SPSS.pdf

        https://www3.nd.edu/~rwilliam/stats2/OLS-Stata9.pdf

        This is a handy guide for translating SPSS commands into Stata:

        https://stats.idre.ucla.edu/stata/fa...-sas-and-spss/
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        WWW: https://www3.nd.edu/~rwilliam
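
        A minimal sketch of the recoding described in #4, assuming the .dta file still carries the SPSS numeric missing codes as ordinary values. The code 98 for political is the one reported in #8; the codes for other variables are not given in this thread and would have to be read from the SPSS variable definitions.

        ```stata
        * Inspect the raw codes first; 98 shows up as an ordinary value if it was not converted
        tabulate political, missing

        * Turn the user-defined missing code into a Stata extended missing value
        mvdecode political, mv(98=.a)

        * Confirm that 98 is gone and that .a now holds those cases
        tabulate political, missing
        ```

        Using an extended missing value such as .a keeps the "user-defined" cases distinguishable from system-missing (.) ones, which may matter later for imputation.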



        • #5
          If all the previous suggestions and advice are not enough to solve your doubt, maybe you could share a data set (a toy example will do fine) - as pointed out in #2, #3 and #4 - as well as the output in both SPSS and Stata.
          Best regards,

          Marcos



          • #6
            Thank you so much for all your answers! I'm extremely grateful for them all!
            The SPSS syntax file and data file can be found in the link below. If the link does not work, they can also be found at https://www.prio.org/JPR/Datasets/ (2017, vol54, number 6, Lihi Ben Shitrit++).
            In Stata I have been using mdesc to see the number of missing values, and in SPSS I have been using the FREQUENCIES command. Doing this, I get different numbers of missing values on several variables in the two programs.
            What I intend to do with the data is an imputation in Stata, and that is why I first need to replicate the study already done by someone else before I can start the imputation. What I am wondering now is why the number of missing values varies from SPSS to Stata.
            * Weights: I have tried different weights in Stata. I've tried pweight, aweight and iweight, and I've also tried multiplying each variable by the weight variable, but the N in Stata and SPSS is still not the same. I am not sure how the weights were used in the original study, apart from the fact that there is a variable called "weight" in the dataset and that the command WEIGHT BY weight has been used.
            * Data: I've simply opened the data file in SPSS and saved it in the dta format. I'm afraid I'm not able to upload the Stata file itself here. If anyone knows how to upload a Stata data file from my computer to the forum, I would be more than happy to share it!

            Best regards,
            Sophie
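
            A hedged sketch of how two of the weight types mentioned above differ for this model. As far as the arithmetic of weighted least squares goes, aweights and iweights give the same coefficients in regress; what differs is the reported N and therefore the standard errors: aweights are rescaled to sum to the number of complete cases, while with iweights N is the sum of the weights. Multiplying each variable by the weight is not equivalent to either. Whether a given weight type matches SPSS's WEIGHT BY is something to check against the SPSS output, not to assume.

            ```stata
            * Analytic weights: N is the unweighted count of complete cases
            regress Direct_Violence gender threat political edu1 edu2 edu3 ///
                inc1 inc2 year03 year04 rel1 rel2 rel3 age [aweight=weight]
            display e(N)

            * Importance weights: N is based on the sum of the weights
            regress Direct_Violence gender threat political edu1 edu2 edu3 ///
                inc1 inc2 year03 year04 rel1 rel2 rel3 age [iweight=weight]
            display e(N)
            ```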



            • #7
              Also, it's the data file entitled 2003-2005 that I've been working with, but the same problem applies to the 2007 data file. Thanks again!

              Sophie



              • #8
                I think Richard has nailed it - in SPSS there are "user defined" missing values. For example in the variable "political" there are people with a value of 98 - in SPSS this is defined as missing. These user defined missing values affect some of your variables but not all of them.
                Doug Hemken
                SSCC, Univ. of Wisc.-Madison
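
                To compare against the SPSS FREQUENCIES output variable by variable, something like the following may help. It is only a sketch and assumes the data are already loaded in Stata; misstable and count are built-in commands, so nothing user-written is needed. By default misstable summarize lists only the variables that have missing values.

                ```stata
                * Missing-value counts for the model variables
                misstable summarize Direct_Violence gender threat political edu1 edu2 edu3 ///
                    inc1 inc2 year03 year04 rel1 rel2 rel3 age

                * Cases still carrying the SPSS user-defined missing code as a real value
                count if political == 98

                * Cases Stata already treats as missing (., .a, .b, ...)
                count if missing(political)
                ```

                If the first count is nonzero, those cases are being treated as valid in Stata but as missing in SPSS, which would explain a larger N in Stata.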



                • #9
                  Originally posted by Doug Hemken
                  I think Richard has nailed it - in SPSS there are "user defined" missing values. For example in the variable "political" there are people with a value of 98 - in SPSS this is defined as missing. These user defined missing values affect some of your variables but not all of them.
                  Using Stat/Transfer, I did not get any user-defined missings. But then I realized Stat/Transfer was converting them to Stata missing values. So, for political, I wound up getting the following (with and without iweights):

                  Code:
                  . tab political, m
                  
                  polilicalpo |
                       sition |      Freq.     Percent        Cum.
                  ------------+-----------------------------------
                    ext right |        111        4.73        4.73
                        right |        824       35.08       39.80
                       center |        718       30.57       70.37
                         left |        313       13.32       83.70
                     ext left |         20        0.85       84.55
                            . |        261       11.11       95.66
                           .a |        102        4.34      100.00
                  ------------+-----------------------------------
                        Total |      2,349      100.00
                  
                  . tab political [iw=weight], m
                  
                  polilicalpo |
                       sition |      Freq.     Percent        Cum.
                  ------------+-----------------------------------
                    ext right |    102.444        4.42        4.42
                        right |    724.148       31.23       35.64
                       center |    751.498       32.40       68.05
                         left |     365.95       15.78       83.83
                     ext left |     20.258        0.87       84.70
                            . |    251.644       10.85       95.55
                           .a |    103.184        4.45      100.00
                  ------------+-----------------------------------
                        Total |  2,319.126      100.00

                  We still don't know if the problem has been solved, because Sophie never told us what the SPSS answer is. If problems persist, get back to us with specifics.
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology
                  Stata Version: 17.0 MP (2 processor)

                  WWW: https://www3.nd.edu/~rwilliam



                  • #10
                    I have SPSS, so I estimated the model with both.

                    Code:
                    . recode political (98=.a)
                    (political: 102 changes made)
                    
                    . summarize political
                    
                        Variable |        Obs        Mean    Std. Dev.       Min        Max
                    -------------+---------------------------------------------------------
                       political |      1,986    2.651057    .8455493          1          5
                    
                    .
                    . regress Direct_Violence gender threat political edu1 edu2 edu3 ///
                    >   inc1 inc2 year03 year04 rel1 rel2 rel3 age [iw=weight]
                    
                          Source |       SS           df       MS      Number of obs   =     1,733
                    -------------+----------------------------------   F(14, 1718)     =     14.90
                           Model |  183.811059        14  13.1293614   Prob > F        =    0.0000
                        Residual |  1514.74108     1,718  .881688638   R-squared       =    0.1082
                    -------------+----------------------------------   Adj R-squared   =    0.1012
                           Total |  1698.55214     1,732  .980688302   Root MSE        =    .93884
                    
                    *** Table of coefficients snipped ***
                    
                    . display e(F)
                    14.895627
                    
                    . display e(rmse)
                    .93884179
                    
                    .
                    . * From SPSS:  F(14, 1718) = 14.895627.  
                    . * From SPSS:  RMSE = 0.938842.
                    One additional point: In the SPSS output, the Model Summary table reports denominator df as 1718, but the ANOVA table shows it as 1719.

                    --
                    Bruce Weaver
                    Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
                    Version: Stata/MP 18.0 (Windows)



                    • #11
                      Much depends on how you transferred the data from SPSS to Stata. Stat/Transfer will do a better job than using SPSS itself.
                      Doug Hemken
                      SSCC, Univ. of Wisc.-Madison
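
                      If Stat/Transfer is not available, Stata 16 and later can read .sav files directly with import spss. A sketch under that assumption; the filename here is hypothetical. How the user-defined missing codes arrive should be checked rather than assumed, which is what the second command is for.

                      ```stata
                      * Read the SPSS file directly (requires Stata 16 or later)
                      import spss using "jpr_2003_2005.sav", clear

                      * See whether 98 survived as a value or was converted to missing
                      tabulate political, missing
                      ```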



                      • #12
                        Thanks to the help from all of you, I managed to get (pretty much) the same results in SPSS and Stata. In a couple of the analyses the N was one lower in Stata than in SPSS (N=1733 vs N=1734), but I assume this might be because of different rounding in the two programs. In any case, I managed to obtain the same coefficients and p-values, and I'm ready to start the imputation now. I really appreciate all the help you gave me!

                        - Sophie



                        • #13
                          The user-written usespss seems to read the SPSS file OK and convert the user-specified missing values into Stata missing values.

                          I think rounding may get handled differently by different programs when using fractional weights. Stata says the residual df are 1718 in the above example. PSPP (a freebie semi-clone of SPSS) says 1718.516.
                          -------------------------------------------
                          Richard Williams, Notre Dame Dept of Sociology
                          Stata Version: 17.0 MP (2 processor)

                          WWW: https://www3.nd.edu/~rwilliam
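
                          The off-by-one N reported in #12 fits this explanation. A back-of-the-envelope check, assuming PSPP's 1718.516 is the sum of the weights minus the 15 estimated parameters (14 predictors plus the constant), and assuming one program truncates the weighted N while the other rounds it. Neither behaviour is documented in this thread, so treat this as a plausibility check only.

                          ```stata
                          * Weighted N implied by PSPP's fractional residual df
                          scalar wN = 1718.516 + 15

                          * Truncating gives 1733, the N Stata reports; 1733 - 15 = 1718 residual df
                          display floor(wN)

                          * Rounding gives 1734, the N seen in SPSS; 1734 - 15 = 1719, as in its ANOVA table
                          display round(wN)
                          ```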
