split-half reliability - similar to SPSS

Rich Goldstein

Join Date: Mar 2014

Posts: 4458
#1

02 Aug 2018, 14:02

A client previously calculated split-half reliability using SPSS, and I would like, if possible, to "match" their results in Stata. The scale I am investigating has 17 elements, each scored from 1 to 6. The client provided the following description of what they did in SPSS (in case that helps anyone); note that they were investigating a different scale:

RELIABILITY
This is from choosing "scale" and "reliability analysis"

/VARIABLES=VAR00001 VAR00002 VAR00003 VAR00004 VAR00005 VAR00006 VAR00007 VAR00008 VAR00009
VAR00010 VAR00011 VAR00012 VAR00013 VAR00014 VAR00015 VAR00016 VAR00017 VAR00018 VAR00019 VAR00020
VAR00021 VAR00022 VAR00023 VAR00024 VAR00025

/SCALE('ALL VARIABLES')
ALL

/MODEL=SPLIT

/ICC=MODEL(MIXED) TYPE(CONSISTENCY) CIN=95 TESTVAL=0.
Here, in the statistics options, I checked the box for "intraclass correlation
coefficient", and I chose the "two-way mixed method" in the model choice, and
"consistency" for the type choice.

****end of quoted material

I have seen Chuck Huber's users group paper from 2012, but he uses the Spearman-Brown prophecy formula and I would like to get an ICC if possible.

I have looked at the -icc- command but the language on "target" and "rater" is confusing to me (not my area) and my attempts give a very low icc (<.2) which I don't believe to be even close to accurate

So, suggestions welcome (and help with translating the SPSS code above particularly welcome)

added in edit: each patient is given this scale on admission to the facility; I have over 800 facilities and over 600k patients; I do not know who collected the data but all collectors are, allegedly, trained in the use of the scale (in case this helps anyone figure out what "target" and "rater" are)

Last edited by Rich Goldstein; 02 Aug 2018, 14:05.
Richard Williams

Join Date: Apr 2014

Posts: 4983
#2

02 Aug 2018, 15:06

Can you give us a reproducible example and tell us what the SPSS answers are? It would help to know what the "correct" answer is.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Rich Goldstein

Join Date: Mar 2014

Posts: 4458
#3

02 Aug 2018, 18:47

sorry, but I don't have the data they used and I don't have SPSS (and they don't have the data that I do)

Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#4

02 Aug 2018, 19:33

I created some data in Stata, opened them in SPSS, and followed the point-and-click instructions provided. Hopefully the input/output is helpful for someone here. The SPSS output isn't pretty, but I couldn't find an obvious way to include it.

Code:

*stata 14.2
* simulate 1000 respondents x 25 items, each a uniform integer from 1 to 6
clear
set seed 76543
set obs 1000
forval i=1/25 {
    generate v`i' = floor((6)*runiform() + 1)
}

Code:

RELIABILITY 
  /VARIABLES=v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 v15 v16 v17 v18 v19 v20 v21 v22 v23 v24 
    v25 
  /SCALE('ALL VARIABLES') ALL 
  /MODEL=SPLIT 
  /STATISTICS=SCALE 
  /ICC=MODEL(MIXED) TYPE(CONSISTENCY) CIN=95 TESTVAL=0.
Reliability
Notes
Output Created                                                 02-AUG-2018 20:18:52
Comments
Input                    Active Dataset                        DataSet1
                         Filter                                <none>
                         Weight                                <none>
                         Split File                            <none>
                         N of Rows in Working Data File        1000
                         Matrix Input
Missing Value Handling   Definition of Missing                 User-defined missing values are treated as missing.
                         Cases Used                            Statistics are based on all cases with valid data for all variables in the procedure.
Syntax                   RELIABILITY
                           /VARIABLES=v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 v15 v16 v17 v18 v19 v20 v21 v22 v23 v24
                             v25
                           /SCALE('ALL VARIABLES') ALL
                           /MODEL=SPLIT
                           /STATISTICS=SCALE
                           /ICC=MODEL(MIXED) TYPE(CONSISTENCY) CIN=95 TESTVAL=0.
Resources                Processor Time                        00:00:00.02
                         Elapsed Time                          00:00:00.01



Scale: ALL VARIABLES
Case Processing Summary
                           N        %
Cases   Valid           1000    100.0
        Excluded (a)       0       .0
        Total           1000    100.0
a. Listwise deletion based on all variables in the procedure.



Reliability Statistics
Cronbach's Alpha                 Part 1   Value          -.056 (a)
                                          N of Items        13 (b)
                                 Part 2   Value          -.005 (a)
                                          N of Items        12 (c)
                                 Total N of Items           25
Correlation Between Forms                                 .002
Spearman-Brown Coefficient       Equal Length             .005
                                 Unequal Length           .005
Guttman Split-Half Coefficient                            .005
a. The value is negative due to a negative average covariance among items. This violates reliability model assumptions. You may want to check item codings.
b. The items are: v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13.
c. The items are: v13, v14, v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25.



Scale Statistics
               Mean   Variance   Std. Deviation   N of Items
Part 1        45.54     36.313            6.026       13 (a)
Part 2        42.38     35.066            5.922       12 (b)
Both Parts    87.91     71.554            8.459       25
a. The items are: v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13.
b. The items are: v14, v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25.



Intraclass Correlation Coefficient
                   Intraclass         95% Confidence Interval      F Test with True Value 0
                   Correlation (b)    Lower Bound   Upper Bound    Value    df1     df2    Sig
Single Measures    -.001 (a)                -.004          .003     .973    999   23976   .717
Average Measures   -.027 (c)                -.122          .062     .973    999   23976   .717
Two-way mixed effects model where people effects are random and measures effects are fixed.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type C intraclass correlation coefficients using a consistency definition. The between-measure variance is excluded from the denominator variance.
c. This estimate is computed assuming the interaction effect is absent, because it is not estimable otherwise.

Stata/MP 14.1 (64-bit x86-64)
Revision 19 May 2016
Win 8.1


Bruce Weaver

Join Date: May 2014

Posts: 1130
#5

02 Aug 2018, 19:41

Rich, I have SPSS on my office computer, but am away from the office until Tuesday, so cannot work up an example right now. Meanwhile, you might be able to find the info you need in the SPSS Command Syntax Reference manual entry for the RELIABILITY command.
https://www.ibm.com/support/knowledg...liability.html

Cheers,
Bruce

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
Bruce Weaver

Join Date: May 2014

Posts: 1130
#6

02 Aug 2018, 19:49

My post in #5 crossed with Carole's in #4.

Carole, to produce cleaner SPSS output for sharing here, you could export all visible output to PDF, or if people are comfortable with it, to Excel. Instructions and details are available here.

Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#7

02 Aug 2018, 20:04

Thanks, Bruce. Here's the pdf of the output posted in #4.
Attached Files

OUTPUT.pdf (13.8 KB, 1 view)

Andrew Lover

Join Date: Apr 2014

Posts: 182
#8

02 Aug 2018, 20:13

Hi Rich, it looks like the SPSS split-half reliability is a Spearman-Brown coefficient: https://www.stata.com/statalist/arch.../msg00357.html

maybe https://ideas.repec.org/c/boc/bocode/s456795.html (-sbri-) is useful?
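For reference, the equal-length Spearman-Brown coefficient can also be computed by hand from the correlation between the two half-scores. A minimal sketch, assuming the wide-form simulated data from #4 (v1-v25, created in that order) are in memory and using the 13/12 split that SPSS reports. The names half1, half2 and r12 are made up for illustration:

```stata
* Sketch only, not a replacement for -sbri-.
* Assumes v1-v25 exist in wide form, in that variable order.
* Note: rowtotal() treats missing item scores as 0.
egen double half1 = rowtotal(v1-v13)
egen double half2 = rowtotal(v14-v25)

* correlation between the two half-scores
correlate half1 half2
scalar r12 = r(rho)

* Spearman-Brown (equal length): 2r / (1 + r)
display "Spearman-Brown (equal length) = " 2*r12/(1 + r12)
```

This covers only the equal-length version; the unequal-length variant that SPSS also prints is not reproduced here.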

____________________________________________________
Assistant Professor, Department of Biostatistics and Epidemiology
School of Public Health and Health Sciences
University of Massachusetts- Amherst

Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#9

02 Aug 2018, 21:08

A little detective work shows that Stata's -icc- does produce ICCs consistent with SPSS's.

-icc- requires the data in long form. Here the 25 items play the role of the judges (raters) and the 1000 respondents are the targets, so:

Code:

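* simulate the same data as in #4
clear
set seed 76543
set obs 1000
forval i=1/25 {
    generate v`i' = floor((6)*runiform() + 1)
}

* one row per respondent-item pair: target = respondent, j = item ("rater")
gen target=_n
reshape long v, i(target) j(j)

* one-way random-effects ICC (raters not modelled)
icc v target
* two-way random-effects ICC, absolute agreement (the default with a rater variable)
icc v target j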

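The first command reproduces, to rounding, the ICC estimates and confidence limits on the last page of the SPSS output pdf (only the F test's denominator df differ: 24000 here versus 23976 in SPSS):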

Code:

. icc v target

Intraclass correlations
One-way random-effects model
Absolute agreement

Random effects: target           Number of targets =      1000
                                 Number of raters  =        25

--------------------------------------------------------------
                     v |        ICC       [95% Conf. Interval]
-----------------------+--------------------------------------
            Individual |  -.0010677      -.0043537    .0026532
               Average |  -.0273955       -.121541    .0623585
--------------------------------------------------------------
F test that
  ICC=0.00: F(999.0, 24000.0) = 0.97          Prob > F = 0.717

Note: ICCs estimate correlations between individual measurements
      and between average measurements made on the same target.
r; t=0.08 21:47:16

If you need to use the Spearman-Brown formula to calculate additional statistics, it is in the manual for [R] icc.
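The SPSS run requested MODEL(MIXED) TYPE(CONSISTENCY), and its denominator df of 23976 equals 999 x 24, which is what a two-way model gives. So the closest -icc- analogue is probably the two-way mixed-effects model with consistency of agreement, rather than the one-way model shown above. A minimal sketch, assuming the long-form data from the code above (v, target, j) are still in memory:

```stata
* Sketch only: assumes long-form data with
*   v      = item score
*   target = respondent id
*   j      = item number (playing the role of "rater")
* Two-way mixed-effects model, consistency of agreement; intended to
* correspond to SPSS /ICC=MODEL(MIXED) TYPE(CONSISTENCY).
icc v target j, mixed consistency
```

As far as I know, the two-way models in -icc- need every target to have a score on every item, so respondents with missing items may have to be dropped first.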


daniel klein

Join Date: Mar 2014

Posts: 3843
#10

03 Aug 2018, 00:17

For Guttman's split-half reliability, see guttmanl (SSC). To match SPSS results you would need to look up how SPSS splits the test, then

Code:

guttmanl v1-v9 || v10-v17

assuming 17 items in the scale. The command is likely to freeze when you do not specify a specific split with many items (and observations) but see the help file.
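If guttmanl is too slow, the Guttman split-half coefficient for one fixed split can be checked by hand from three variances, using 2 * (1 - (var(part 1) + var(part 2)) / var(total)). A minimal sketch, assuming the wide-form simulated items v1-v25 from #4 and the 13/12 split that SPSS reports. The names part1, part2, total, var_p1, var_p2 and var_tot are made up for illustration:

```stata
* Sketch only: assumes v1-v25 exist in wide form, in that variable order.
* Note: rowtotal() treats missing item scores as 0.
egen double part1 = rowtotal(v1-v13)
egen double part2 = rowtotal(v14-v25)
generate double total = part1 + part2

* variances of the two half-scores and of the total score
quietly summarize part1
scalar var_p1 = r(Var)
quietly summarize part2
scalar var_p2 = r(Var)
quietly summarize total
scalar var_tot = r(Var)

* Guttman split-half: 2 * (1 - (var(part1) + var(part2)) / var(total))
display "Guttman split-half = " 2*(1 - (var_p1 + var_p2)/var_tot)
```

Plugging in the variances from the Scale Statistics table in #4 gives 2 * (1 - (36.313 + 35.066)/71.554), which is about .005 and agrees with the Guttman Split-Half Coefficient SPSS reports there.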

The kappaetc (SSC) command can estimate ICCs with data in wide form. This would probably be something like

Code:

kappaetc v1-v17 , icc(random)

Neither guttmanl nor kappaetc is written with large datasets in mind, so they might run quite slowly.

I wonder whether the client has taken the nested data structure (patients in facilities) into account and, if so, how.

Best
Daniel

Last edited by daniel klein; 03 Aug 2018, 00:35.
Rich Goldstein

Join Date: Mar 2014

Posts: 4458
#11

03 Aug 2018, 05:23

thank you all very much - I will be working on this over the weekend and hope to follow-up next week
Richard Williams

Join Date: Apr 2014

Posts: 4983
#12

03 Aug 2018, 09:39

There is a poor person's free semi-clone of SPSS at

https://www.gnu.org/software/pspp/

I recommend it to people who have $0 to buy anything else and who don't want to learn R. Many SPSS features are not supported, but for features that are, perhaps it would sometimes be helpful for those who are trying to replicate some SPSS output.

I actually can get SPSS for free but do not have any great desire to do so! But maybe I'll break down and get it someday. (Although I knock SPSS now, I must admit that the 1975 maroon manual forever changed my life as it introduced me to quantitative research. I think I still have 3 or copies of it sitting around.)
