Creating an index

Katja Mennel

Join Date: Aug 2019

Posts: 2
#1

15 Sep 2019, 05:37

Hey
I want to create an index out of 10 variables, each of which has values on a scale from 1 to 5.
I know that I have to use:
gen index = (AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL), but I guess that's not complete, or is it?
I found an old do-file with an index (on other variables) that I created two years ago, which was: generate index = ((V50+V51+V52+V53)-4)/12
So I guess I also have to subtract and divide by some number. I was trying to derive the numbers from the old index I found, but I really have no idea which numbers I have to choose.
Can someone possibly help me with this? That would be amazing.
Best wishes
Katja
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

15 Sep 2019, 06:53

The ‘formula’ for the index is up to you to decide. Logically, it should be backed up by the literature.

That said, egen with the rowmean() function may be what you’re looking for.

Best regards,

Marcos
Katja Mennel

Join Date: Aug 2019

Posts: 2
#3

15 Sep 2019, 07:01

Hey
Thank you for your answer.
What do you mean by saying it should be backed up by the literature?
When I create it the way I tried, my values no longer range from 1 to 5, and I have no idea how to change that.
Best wishes,
Katja
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

15 Sep 2019, 07:05

By ‘literature’, I meant: the background (scientific) literature for the given index.

The command I suggested follows your first example and gives the mean value. You can also use egen's rowtotal() function to get the total.
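
As a rough sketch of both egen functions mentioned here, assuming the ten items AC to AL from post #1 are in memory (index_avg and index_sum are just illustrative names):

Code:

* Mean of the items in each observation; missing items are ignored.
egen index_avg = rowmean(AC AD AE AF AG AH AI AJ AK AL)
* Sum of the items in each observation; rowtotal() counts a missing item as 0.
egen index_sum = rowtotal(AC AD AE AF AG AH AI AJ AK AL)
* Check the resulting ranges.
summarize index_avg index_sum

With complete data on 1-to-5 items, index_avg stays between 1 and 5 and index_sum runs from 10 to 50.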

Best regards,

Marcos
Red Owl

Join Date: Nov 2016

Posts: 127
#5

15 Sep 2019, 08:46

Katja,

Whatever approach you choose for creating the index, you will need to decide how missing data are handled if any of your cases have missing values on the 10 variables.

Red Owl
Stata/IC 16.0 (Windows 10, 64-bit)
Alan Neustadtl

Join Date: Mar 2014

Posts: 107
#6

15 Sep 2019, 09:41

Leaving aside the missing value issue, the index that you created ranges from 10 to 50. It can be rescaled in different ways. For example:

Code:

gen index = (AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL)-9

This creates an index that ranges from 1 to 41 instead of 10 to 50.

Another method is to create an average. Again, assuming that you have no missing values you could use:

Code:

gen index = (AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL)/10

The lowest value would be 10/10 = 1 and the greatest value would be 50/10 = 5.
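
The old do-file formula from post #1, ((V50+V51+V52+V53)-4)/12, looks like a third kind of rescaling: subtract the smallest possible sum and divide by the range of the sum, so that the index runs from 0 to 1. That reading assumes the four old items were each coded 1 to 4, which the thread does not confirm. A sketch of the same idea for ten items coded 1 to 5, again assuming no missing values (index01 is an illustrative name):

Code:

* Min-max rescaling of the 10-item sum to the 0-1 range.
* Assumes every item is coded 1 to 5 and none is missing.
gen index01 = ((AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL) - 10) / 40
summarize index01

Here 10 is the smallest possible sum (ten items times 1) and 40 is the distance between the largest and smallest possible sums (50 - 10).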

Missing values can complicate the issue in a couple of ways. But, if you choose to average your index you may decide that your index can tolerate some missing data. Maybe you could accept averages based on 8 or 9 or all of the 10 variables. In that case, you would need to write some Stata code to adjust the denominator for each observation in your data.
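
One way to sketch that adjustment without hand-written denominators is to let egen do the counting. This assumes a cutoff of at least 8 answered items, taken from the example above rather than from any rule; n_answered and index_tol are illustrative names:

Code:

* Number of non-missing items in each observation.
egen n_answered = rownonmiss(AC AD AE AF AG AH AI AJ AK AL)
* Mean over the non-missing items only, so the denominator adjusts per observation.
egen index_tol = rowmean(AC AD AE AF AG AH AI AJ AK AL)
* Discard the index where fewer than 8 of the 10 items were answered.
replace index_tol = . if n_answered < 8

Whether 8 out of 10 is an acceptable threshold is a substantive decision, not something the code can settle.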

You also might consider the command alpha and do something like this:

Code:

alpha AC AD AE AF AG AH AI AJ AK AL, item detail generate(index)

Best,
Alan
Red Owl

Join Date: Nov 2016

Posts: 127
#7

15 Sep 2019, 09:48

Katja,

Another issue you need to consider is whether any of your index variables need to be reverse-scored.

If you use Alan's suggestion of -alpha- that is handled automatically (unless you use the -asis- option). Otherwise, you'll need to do some data manipulation to reverse-score variables that are reverse-coded.
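
A minimal sketch of the manual route, assuming purely for illustration that AE and AJ are the reverse-coded items (the thread does not say which, if any, are) and that every item is coded 1 to 5:

Code:

* Hypothetical: AE and AJ stand in for whichever items are reverse-coded.
foreach v of varlist AE AJ {
    * On a 1-5 scale, 6 minus the value swaps 1 with 5 and 2 with 4; 3 stays 3.
    gen `v'_r = 6 - `v'
}
* Build the index from the reversed copies plus the untouched items.
egen index_rev = rowmean(AC AD AE_r AF AG AH AI AJ_r AK AL)

Missing values stay missing after the subtraction, so the missing-data decision discussed earlier still applies.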

Red Owl
Stata/IC 16.0 (Windows 10, 64-bit)
Alan Neustadtl

Join Date: Mar 2014

Posts: 107
#8

16 Sep 2019, 18:04

Red Owl makes a good point that I didn't address. If the whole index needs to run in the opposite direction, a simple change would do this easily:

Code:

gen index = 51-(AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL)

For example, if your index measured tolerance, do larger values mean more tolerance or less tolerance? Finally, the code above assumes that all of the variables are coded in the same direction. Sometimes surveys change the direction in a series of variables to minimize respondent response sets.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#9

17 Sep 2019, 01:50

Sometimes surveys change the direction in a series of variables to minimize respondent response sets.

I've never understood this. If a respondent is skimming the questionnaire and giving the same grade regardless, reversing the meaning just mangles the data.

Reminds me obliquely of a respondent on a television show transiently famous for repeatedly saying

Oi'll give it foive

better explained at https://en.wikipedia.org/wiki/Thank_...rs_(TV_series). I am a fair mimic but not from the same part of Britain and in any case audio effects don't seem possible on Statalist.
Alan Neustadtl

Join Date: Mar 2014

Posts: 107
#10

18 Sep 2019, 11:58

I think it is a way to detect a response set, not to rectify it.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#11

18 Sep 2019, 12:36

That makes sense, but it wasn't the understanding or the explanation of the social scientists (no names here) who first explained this to me.