  • Creating an index

    Hey
    I want to create an index out of 10 variables, each of them has values on a scale from 1 to 5.
    I know that I have to use:
    gen index = (AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL), but I guess that's not complete, or is it?
    I found an old do-file with an index (on other variables) that I created two years ago, which was: generate index = ((V50+V51+V52+V53)-4)/12
    So I guess I also have to subtract and divide by some number. I was trying to derive the numbers from the old index I found, but I really have no idea what numbers I have to choose.
    Can someone possibly help me with this? That would be amazing.
    Best wishes
    Katja

  • #2
    The ‘formula’ for the index is up to you to decide. Logically, it should be backed up by the literature.

    That said, egen with the rowmean() function may be what you’re looking for.
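    A minimal sketch of that suggestion, using the ten variable names from the question. rowmean() averages over whichever items are nonmissing in each observation, so the result stays on the 1 to 5 scale. The name index_mean is only illustrative.
    Code:
    * row-wise mean of the ten items (missing items are ignored within a row)
    egen index_mean = rowmean(AC AD AE AF AG AH AI AJ AK AL)
    * check that the result runs between 1 and 5
    summarize index_mean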
    Best regards,

    Marcos



    • #3
      Hey
      Thank you for your answer.
      What do you mean by saying it should be backed up by the literature?
      When I create it like I tried, my values are not going from 1 to 5 anymore. And I have no idea how to change that.
      Best wishes,
      Katja



      • #4
        By ‘literature’, I meant: the background (scientific) literature for the given index.

        The command I suggested was in line with the first example and gives the mean value. You may use egen's rowtotal() function to get the total as well.
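        As a rough sketch of the total version, again with the variable names from the question (index_total is an illustrative name). By default rowtotal() treats a missing item as 0, which silently lowers the sum; the missing option only sets the result to missing when every item is missing.
        Code:
        * row-wise sum; a missing item counts as 0 unless all ten are missing
        egen index_total = rowtotal(AC AD AE AF AG AH AI AJ AK AL), missing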
        Best regards,

        Marcos



        • #5
          Katja,

          Whatever approach you choose for creating the index, you will need to decide how missing data are handled if any of your cases have missing values on the 10 variables.

          Red Owl
          Stata/IC 16.0 (Windows 10, 64-bit)



          • #6
            Leaving aside the missing value issue, the index that you created ranges from 10 to 50. It can be rescaled in different ways. For example:
            Code:
            gen index = (AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL)-9
            This creates an index that ranges from 1 to 41 instead of 10 to 50.

            Another method is to create an average. Again, assuming that you have no missing values you could use:
            Code:
            gen index = (AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL)/10
            The lowest value would be equal to 10/10 or 1 and the greatest value would be equal to 50/10 = 5.
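
            A third possibility, sketched here as an illustration, is to rescale to the range 0 to 1. In general, for k items each scored from a to b, (sum - k*a)/(k*(b - a)) runs from 0 to 1; with 10 items scored 1 to 5 that is (sum - 10)/40. The old formula quoted in the question, ((V50+V51+V52+V53)-4)/12, has the same shape and would give 0 to 1 if those four items were scored 1 to 4, which is an assumption since the thread does not say. The name index01 is illustrative.
            Code:
            * subtract the minimum possible sum (10) and divide by the range of the sum (50-10)
            gen index01 = ((AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL) - 10)/40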

            Missing values can complicate the issue in a couple of ways. But, if you choose to average your index you may decide that your index can tolerate some missing data. Maybe you could accept averages based on 8 or 9 or all of the 10 variables. In that case, you would need to write some Stata code to adjust the denominator for each observation in your data.
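
            One way this could look, as a sketch only: egen's rowmean() already divides by the number of nonmissing items in each observation, and rownonmiss() counts them so that observations with too few answers can be set to missing. The threshold of 8 follows the example above; the names n_answered and index_avg are illustrative.
            Code:
            * count how many of the ten items are nonmissing in each observation
            egen n_answered = rownonmiss(AC AD AE AF AG AH AI AJ AK AL)
            * mean over the nonmissing items only
            egen index_avg = rowmean(AC AD AE AF AG AH AI AJ AK AL)
            * keep the average only when at least 8 items were answered
            replace index_avg = . if n_answered < 8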

            You also might consider the command alpha and do something like this:
            Code:
            alpha AC AD AE AF AG AH AI AJ AK AL, item detail generate(index)
            Best,
            Alan



            • #7
              Katja,

              Another issue you need to consider is whether any of your index variables need to be reverse-scored.

              If you use Alan's suggestion of -alpha- that is handled automatically (unless you use the -asis- option). Otherwise, you'll need to do some data manipulation to reverse-score variables that are reverse-coded.
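
              A sketch of the manual route, for items scored 1 to 5: subtracting the item from 6 maps 1 to 5, 2 to 4, and so on, and leaves missing values missing. Which items are reverse-coded is not stated in this thread, so AD and AH below are purely hypothetical, as are the new variable names.
              Code:
              * hypothetical: suppose AD and AH are the reverse-coded items
              gen AD_r = 6 - AD
              gen AH_r = 6 - AH
              * build the index from the re-scored versions of those items
              egen index_rev = rowmean(AC AD_r AE AF AG AH_r AI AJ AK AL)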

              Red Owl
              Stata/IC 16.0 (Windows 10, 64-bit)



              • #8
                Red Owl makes a good point that I didn't address. But a simple change to the code would reverse the direction of the index:
                Code:
                gen index = 51-(AC+AD+AE+AF+AG+AH+AI+AJ+AK+AL)
                For example, if your index measured tolerance, do larger values mean more tolerance or less tolerance? Finally, the code above assumes that all of the variables are coded in the same direction. Sometimes surveys change the direction in a series of variables to minimize respondent response sets.



                • #9
                  "Sometimes surveys change the direction in a series of variables to minimize respondent response sets."
                  I've never understood this. If a respondent is skimming the questionnaire and giving the same grade regardless, reversing the meaning just mangles the data.

                  Reminds me obliquely of a respondent on a television show transiently famous for repeatedly saying

                  Oi'll give it foive
                  better explained at https://en.wikipedia.org/wiki/Thank_...rs_(TV_series). I am a fair mimic but not from the same part of Britain and in any case audio effects don't seem possible on Statalist.



                  • #10
                    I think it is a way to detect a response set, not to rectify it.



                    • #11
                      That makes sense, but it wasn't the understanding or the explanation of the social scientists (no names here) who first explained this to me.
