Creating an Index

Marie Sofie

Join Date: Jan 2017
Posts: 79

18 May 2017, 14:38

Hello, I have four variables that supposedly measure four dimensions of the same concept. I want to create an index in Stata, but I have no idea what the best way of doing it is. The response scale is the same for all four, and because the data are experimental, only people in the third experimental group answer these questions. Thank you.

----------------------- copy starting from the next line -----------------------

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(ener_sec4 ener_sec3 ener_sec2 ener_sec1)
. . . .
. . . .
. . . .
. . . .
. . . .
5 4 2 2
5 4 3 3
5 5 4 4
. . . .
. . . .
5 5 5 5
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
5 5 4 4
5 5 5 5
. . . .
. . . .
. . . .
. . . .
. . . .
2 5 2 1
5 5 4 4


end
label values ener_sec4 P46_3_4
label def P46_3_4 1 "Sin importancia en absoluto", modify
label def P46_3_4 2 "Algo sin importancia", modify
label def P46_3_4 3 "Ni importante ni sin importancia", modify
label def P46_3_4 4 "Algo importante", modify
label def P46_3_4 5 "Extremadamente importante", modify
label values ener_sec3 P46_3_3
label def P46_3_3 1 "Sin importancia en absoluto", modify
label def P46_3_3 2 "Algo sin importancia", modify
label def P46_3_3 3 "Ni importante ni sin importancia", modify
label def P46_3_3 4 "Algo importante", modify
label def P46_3_3 5 "Extremadamente importante", modify
label values ener_sec2 P46_3_2
label def P46_3_2 1 "Sin importancia en absoluto", modify
label def P46_3_2 2 "Algo sin importancia", modify
label def P46_3_2 3 "Ni importante ni sin importancia", modify
label def P46_3_2 4 "Algo importante", modify
label def P46_3_2 5 "Extremadamente importante", modify
label values ener_sec1 P46_3_1
label def P46_3_1 1 "Sin importancia en absoluto", modify
label def P46_3_1 2 "Algo sin importancia", modify
label def P46_3_1 3 "Ni importante ni sin importancia", modify
label def P46_3_1 4 "Algo importante", modify
label def P46_3_1 5 "Extremadamente importante", modify

------------------ copy up to and including the previous line ------------------


Alan Neustadtl

Join Date: Mar 2014

Posts: 107
#2

18 May 2017, 20:00

Here is one solution:

Code:

generate index=(ener_sec4 + ener_sec3 + ener_sec2 + ener_sec1)-3
tab index
Marie Sofie

Join Date: Jan 2017

Posts: 79
#3

19 May 2017, 01:40

Thank you, Alan. Fantastic. But how would I do a factor analysis to see if they load onto the same concept? I tried to do it with sem in Stata 13, but I could not. I have no experience with factor analysis whatsoever. Thank you.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

19 May 2017, 01:49

Marie:
see -help factor- and -help pca-.

Kind regards,
Carlo
(Stata 19.0)
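
For orientation, the two commands named above could be called roughly as follows. This is only a minimal sketch: the rows are the complete cases from the -dataex- example in #1 plus one all-missing row, and the choice of the pf (principal factors) option is an assumption for illustration, not a recommendation. With so few observations the numbers themselves mean nothing; the point is the syntax.

```stata
* Complete cases from the -dataex- listing in #1, plus one all-missing row
clear
input double(ener_sec4 ener_sec3 ener_sec2 ener_sec1)
5 4 2 2
5 4 3 3
5 5 4 4
5 5 5 5
5 5 4 4
5 5 5 5
2 5 2 1
5 5 4 4
. . . .
end

* Principal-factor analysis of the four items (pf is Stata's default method)
factor ener_sec1 ener_sec2 ener_sec3 ener_sec4, pf

* Principal component analysis of the same items
pca ener_sec1 ener_sec2 ener_sec3 ener_sec4

* Both commands drop observations with missing items, so N is 8, not 9
display e(N)
```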
Marie Sofie

Join Date: Jan 2017

Posts: 79
#5

19 May 2017, 02:03

Dear Carlo. Thank you so much. I will try them!
Nick Cox

Join Date: Mar 2014

Posts: 35724
#6

19 May 2017, 02:09

http://www.statalist.org/forums/foru...a-new-variable seems an obvious (even essential) cross-reference.

I'd add to the discussion Mosteller, F. and Tukey, J.W. 1977. Data analysis and regression: A second course in statistics. Reading, MA: Addison-Wesley for brief but highly pertinent remarks on judgement-based composites as serious candidates in this territory.
Marie Sofie

Join Date: Jan 2017

Posts: 79
#7

19 May 2017, 13:05

Thanks a million, Nick. I will look for this book when I am at work again on Monday. I am still finding it hard to understand how to interpret a factor analysis, and what to look at in it. Running the factor analysis in Stata gave me this output:

factor ener_sec1 ener_sec2 ener_sec3 ener_sec4
(obs=339)

Factor analysis/correlation                      Number of obs    =        339
    Method: principal factors                    Retained factors =          2
    Rotation: (unrotated)                        Number of params =          6

    --------------------------------------------------------------------------
         Factor  |   Eigenvalue   Difference        Proportion   Cumulative
    -------------+------------------------------------------------------------
        Factor1  |      1.88919      1.75367            1.1416       1.1416
        Factor2  |      0.13552      0.29712            0.0819       1.2235
        Factor3  |     -0.16160      0.04672           -0.0977       1.1259
        Factor4  |     -0.20832            .           -0.1259       1.0000
    --------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(6)  =  426.50 Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

    -------------------------------------------------
        Variable |  Factor1   Factor2 |   Uniqueness
    -------------+--------------------+--------------
       ener_sec1 |   0.5993    0.2188 |      0.5930
       ener_sec2 |   0.6871    0.1710 |      0.4987
       ener_sec3 |   0.7207   -0.1717 |      0.4511
       ener_sec4 |   0.7339   -0.1701 |      0.4324
    -------------------------------------------------

What does it mean? Do the variables load well? Do they load on one factor, or on two, as the retained factors suggest? Does it mean that I cannot make an index out of the 4 variables because they do not load well? What indicates a good loading? I really do not understand, and I need to for tonight, as I have to create an index and use it as an IV. Thank you so much.
Marie Sofie

Join Date: Jan 2017

Posts: 79
#8

19 May 2017, 13:07

Another problem is that when I do create an index and tab it, it gives me the scale of

       indx |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          1        0.29        0.29
          7 |          1        0.29        0.59
          8 |          2        0.59        1.18
          9 |         14        4.13        5.31
         10 |          7        2.06        7.37
         11 |          7        2.06        9.44
         12 |         10        2.95       12.39
         13 |         35       10.32       22.71
         14 |         38       11.21       33.92
         15 |         87       25.66       59.59
         16 |         38       11.21       70.80
         17 |         99       29.20      100.00
------------+-----------------------------------
      Total |        339      100.00

I have no idea how to treat it, because the original scale was from 1 to 5.
Alan Neustadtl

Join Date: Mar 2014

Posts: 107
#9

19 May 2017, 19:51

I showed one way of creating an additive index that accounts for the expanded range. Summing four items that each run from 1 to 5 gives a range of 4 to 20, and the "-3" in the code I showed you shifts that range to 1 to 17. So any observation with a value of 1 on all four variables is assigned a score of 1, and any observation with a value of 5 on all four variables receives a score of 17.
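
The arithmetic can be checked on a few rows. This is a small sketch, not part of the original posts: the first row (all 1s) is a hypothetical respondent added to show the lower end of the scale, while the other two rows appear in the -dataex- listing in #1.

```stata
clear
input double(ener_sec4 ener_sec3 ener_sec2 ener_sec1)
1 1 1 1
5 5 5 5
2 5 2 1
end

* Raw sum runs from 4 (all 1s) to 20 (all 5s)
generate rawsum = ener_sec4 + ener_sec3 + ener_sec2 + ener_sec1

* Subtracting 3 shifts that range to 1 through 17
generate index = rawsum - 3

list rawsum index
```

The three rows give raw sums of 4, 20 and 10, and index values of 1, 17 and 7.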

If you prefer a scale of 1 to 5 you could sum and then average. Something like this:

Code:

generate index=(ener_sec4 + ener_sec3 + ener_sec2 + ener_sec1)/4
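
One caveat with the averaging line above is how it treats missing values: an expression built with + is missing whenever any one item is missing. A possible alternative, sketched here under assumptions, is egen's rowmean() function, which averages over whichever items were answered. The variable names index_avg, index_rowmean and n_answered are made up for this illustration, and the third row (one item missing) is hypothetical, since the example data in #1 contain only complete or entirely missing rows.

```stata
clear
input double(ener_sec4 ener_sec3 ener_sec2 ener_sec1)
5 4 2 2
2 5 2 1
5 5 . 4
. . . .
end

* Arithmetic with + returns missing if any single item is missing
generate index_avg = (ener_sec4 + ener_sec3 + ener_sec2 + ener_sec1)/4

* rowmean() averages over the nonmissing items in each row
egen index_rowmean = rowmean(ener_sec4 ener_sec3 ener_sec2 ener_sec1)

* rownonmiss() counts answered items, so partial responders can be flagged
egen n_answered = rownonmiss(ener_sec4 ener_sec3 ener_sec2 ener_sec1)

list
```

Whether a respondent who answered only some of the items should receive a score at all is a judgement call; n_answered makes it possible to restrict the index, for example to rows where all four items were answered.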
Marie Sofie

Join Date: Jan 2017

Posts: 79
#10

20 May 2017, 11:24

Dear Alan, thank you so much for your explanation! I get it now. I think I am all right with the one you recommended; I only needed to understand the scale. Thanks a million!