  • Creating a composite index with dummy variables and regular variables

    As the title says, when I tried to make a composite index from a mix of dummy variables (each showing whether a country is under sanctions in a certain sector) and regular indicators, I ended up with many missing values, probably because of the dummy variables.

    I have also been reconsidering whether to build a composite index at all, based on a thread I came across here a while ago; if I remember correctly, one of the users said you'd be losing information that way.


    For some context: I am trying to capture the effect of geoeconomic fragmentation for the 11 countries I'm studying. Unfortunately, because my data are very limited (the baseline model covers 1996-2020), I'm forced to use a fixed-effects model and can't use more dynamic models. I also have a number of controls for the dependent variable (FDI).

    I would be grateful for any advice or pointers on the composite index issue.

  • #2
    Missing values won't arise from manipulating indicator variables (aka dummy variables) unless those indicators themselves have missing values -- or you divide by zero.

    It's quite common to see people generating indicators that are 1 or missing, and these can be problematic. If that's what's happening, use 1 or 0 indicators.

    See also for example https://journals.sagepub.com/doi/pdf...36867X19830921
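
    A minimal sketch of the difference between the two codings, using a made-up count variable `n_sanctions` and made-up indicator names (none of these are in the poster's data):

    ```stata
    * Toy data: a hypothetical count of sanctions, with one genuinely missing value
    clear
    input byte n_sanctions
    0
    2
    .
    1
    end

    * Problematic coding: 1 or missing (zero counts become missing, not 0)
    generate byte ind_bad = 1 if n_sanctions > 0 & !missing(n_sanctions)

    * Preferred coding: 1 or 0, missing only where the source is missing
    * (the -if- qualifier matters because missing counts as larger than any number)
    generate byte ind_good = n_sanctions > 0 if !missing(n_sanctions)

    * Repairing an existing 1/missing indicator, assuming missing really means "none"
    replace ind_bad = 0 if missing(ind_bad) & !missing(n_sanctions)

    list
    ```

    The repair step is only safe when a missing indicator is known to mean "no sanction" rather than "not recorded".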


    • #3
      Originally posted by Nick Cox
      Missing values won't arise from manipulating indicator variables (aka dummy variables) unless those indicators themselves have missing values -- or you divide by zero.

      It's quite common to see people generating indicators that are 1 or missing, and these can be problematic. If that's what's happening, use 1 or 0 indicators.

      See also for example https://journals.sagepub.com/doi/pdf...36867X19830921

      Thank you for sharing the paper. It's a good read, but I'm not sure it applies to my case. I already have the indicator (dummy) variables; what I want is to combine them with other indices or measures.


      I should probably have mentioned that I am using PCA to build the composite indicator, so maybe that is why I got missing values.

      Sorry for not being very thorough at the start; I thought I had shared the necessary information.

      Here is what I have when I try applying PCA:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str3 CountryCode double Year float GPR double WUI float kaopen byte(military trade financial travel other) float gef
      "AUS" 1996  .05041408  .10670312500000001  .9401681 0 0 0 0 0          .
      "AUS" 1997  .04831508          .125325075  .8803361 0 0 0 0 0 -3.1586065
      "AUS" 1998  .06631457           .03769885  .8205042 0 0 0 0 0  -1.917529
      "AUS" 1999  .08513333           .03474485  .7606722 0 0 0 0 0  -1.848865
      "AUS" 2000  .04517946  .16145779999999998  .7008403 0 0 0 0 0  -1.303708
      "AUS" 2001   .1130074          .087171975  .7008403 0 0 0 0 0  -1.264625
      "AUS" 2002  .13527791           .07341065  .7008403 1 0 0 0 0  -.9041352
      "AUS" 2003  .18588236          .308238375  .7008403 1 0 0 0 0  -1.466168
      "AUS" 2004  .10759059          .031713725  .7008403 1 0 0 0 0 -1.4289196
      "AUS" 2005  .09751298  .07910982500000001  .7008403 1 0 0 0 0 -1.4429926
      "AUS" 2006  .08831813          .068764825  .7008403 1 0 0 0 0 -1.4436494
      "AUS" 2007   .0824838           .01910585  .7008403 1 0 0 0 0 -1.4342687
      "AUS" 2008  .06123541            .2299376  .7008403 1 0 0 0 0 -1.4891534
      "AUS" 2009  .05658427           .07698375  .7008403 0 0 0 0 0 -1.8059756
      "AUS" 2010  .06562497           .17976685  .7008403 0 0 0 0 0 -1.8263845
      "AUS" 2011  .05599247  .24364802500000002  .7008403 0 0 0 0 0 -2.9974916
      "AUS" 2012  .06192498          .132407025  .7606722 0 0 0 0 0 -3.0320945
      "AUS" 2013  .05854882          .034845475  .8205042 0 0 0 0 0  -3.072853
      "AUS" 2014   .1242697            .1185409  .8803361 0 1 0 0 0  -2.859292
      "AUS" 2015  .08203483  .17426439999999999  .9401681 0 1 0 0 0  -2.947615
      "AUS" 2016  .07304541            .2490718         1 0 1 0 0 0  -3.029398
      "AUS" 2017  .11699405          .307399925         1 0 1 0 0 0  -2.402123
      "AUS" 2018  .09539042          .172188025         1 0 1 0 0 0  -2.378452
      "AUS" 2019  .12962687  .21579435000000002         1 0 1 0 0 0  -2.377163
      "AUS" 2020  .12218092            .3143915         1 0 1 0 0 0          .
      "CHN" 1996   .2561134             .108939 .16294755 1 0 1 0 0          .
      "CHN" 1997   .2077677          .051352225 .16294755 1 0 1 0 0   2.852751
      "CHN" 1998   .3124568          .081483525 .16294755 1 0 0 0 0          .
      "CHN" 1999   .3886796            .0310752 .16294755 1 0 0 0 0          .
      "CHN" 2000  .24345416 .047521549999999996 .16294755 1 0 0 0 0   3.594932
      "CHN" 2001   .3796609            .1213891 .16294755 1 0 0 0 0   3.622708
      "CHN" 2002   .3378655          .057847675 .16294755 1 0 0 0 0  3.6234775
      "CHN" 2003  .54733706          .088951575 .16294755 1 0 0 0 0   3.684949
      "CHN" 2004   .3740352          .042381625 .16294755 1 0 0 0 0   3.638829
      "CHN" 2005   .4370224                   0 .16294755 1 0 0 0 0   3.669075
      "CHN" 2006   .5760256                   0 .16294755 1 0 0 0 0  3.7145596
      "CHN" 2007   .4216573          .013048025 .16294755 1 0 0 0 0   3.661081
      "CHN" 2008   .3497178                   0 .16294755 1 0 0 0 0   3.640507
      "CHN" 2009   .3957576            .0610071 .16294755 1 0 0 0 0   3.641703
      "CHN" 2010   .4341458          .240712175 .16294755 1 0 0 0 0   3.613409
      "CHN" 2011   .3831185                   0 .16294755 1 0 0 0 0  3.6514366
      "CHN" 2012  .48966545  .21257702499999998 .16294755 1 0 0 0 0   3.637972
      "CHN" 2013   .3884187          .096474525 .16294755 1 0 0 0 0   2.586444
      "CHN" 2014   .3794429  .12708165000000002 .16294755 1 0 0 0 0  1.4450743
      "CHN" 2015  .41240865          .096949875 .16294755 1 0 0 0 0  1.4627117
      "CHN" 2016   .4496553            .0812062 .16294755 1 0 0 0 0   2.563445
      "CHN" 2017   .8143435           .09842055 .16294755 1 1 1 0 0  3.3656125
      "CHN" 2018   .9167054          .165574075 .16294755 1 1 1 0 0    3.38384
      "CHN" 2019   .8645223  .28920429999999997 .16294755 1 1 1 1 0   3.795875
      "CHN" 2020   .7542645          .354247475 .16294755 1 1 1 1 0          .
      "HKG" 1996  .03853131          .268645825         1 0 0 0 0 0          .
      "HKG" 1997  .03864578           .12235965         1 0 0 0 0 0 -4.1941657
      "HKG" 1998  .02722985  .11422120000000001         1 0 0 0 0 0  -4.196051
      "HKG" 1999 .034179375           .07152995         1 0 0 0 0 0 -4.1840715
      "HKG" 2000 .017073793          .087299825         1 0 0 0 0 0  -4.193254
      "HKG" 2001  .05670529  .08465837500000001         1 0 0 0 0 0  -4.179685
      "HKG" 2002  .04062532           .06298645         1 0 0 0 0 0   -4.18002
      "HKG" 2003  .08187662          .249121525         1 0 0 0 0 0 -4.2088385
      "HKG" 2004  .03542534 .034250525000000004         1 0 0 0 0 0 -4.1751885
      "HKG" 2005 .035413742           .13801695         1 0 0 0 0 0  -4.198783
      "HKG" 2006  .02977114          .019233725         1 0 0 0 0 0 -4.1736245
      "HKG" 2007 .027093435            .0203285         1 0 0 0 0 0 -4.1747494
      "HKG" 2008 .027385253           .11399655         1 0 0 0 0 0  -4.195949
      "HKG" 2009 .021914136  .21327975000000002         1 0 0 0 0 0 -3.6739585
      "HKG" 2010 .026896216          .232643525         1 0 0 0 0 0  -4.223083
      "HKG" 2011  .02157022          .100817075         1 0 0 0 0 0 -4.1948557
      "HKG" 2012  .02078211          .167178975         1 0 0 0 0 0  -4.210201
      "HKG" 2013 .024923297          .109808875         1 0 0 0 0 0 -4.1958027
      "HKG" 2014  .03556912           .04052145         1 0 0 0 0 0  -3.594372
      "HKG" 2015  .02443643            .0705611         1 0 0 0 0 0  -3.604844
      "HKG" 2016 .029536044            .0637433         1 0 0 0 0 0  -3.601625
      "HKG" 2017  .05237663                   0         1 0 0 0 0 0   -3.57966
      "HKG" 2018  .07984947          .021175675         1 0 0 0 0 0  -3.575484
      "HKG" 2019  .13819487          .062693925         1 0 0 0 0 0 -3.5658314
      "HKG" 2020   .2082471          .308808525         1 0 1 1 0 0          .
      "IDN" 1996   .0393579            .3364794  .9401681 0 0 1 0 0 -.10229006
      "IDN" 1997  .03581119           .26244455  .8803361 0 0 0 0 0  -.4386145
      "IDN" 1998  .06665757  .21717204999999998  .6575567 0 0 0 0 0 -.18799324
      "IDN" 1999   .0987041  .27104585000000003  .7606722 1 1 0 0 0   .3266703
      "IDN" 2000  .04922607          .449852325  .7008403 1 1 0 0 0   .3316634
      "IDN" 2001  .08832517   .5489590249999999  .7008403 1 0 0 0 0  .04900777
      "IDN" 2002  .12382384  .21622304999999997  .7008403 1 0 0 0 0   .1362699
      "IDN" 2003  .12254238  .31063430000000003  .7008403 1 0 0 0 0   .1290672
      "IDN" 2004  .09204476          .322989825  .7008403 1 0 0 0 0   .1162788
      "IDN" 2005  .08675057          .264383575  .7008403 1 0 0 0 0  .12787037
      "IDN" 2006  .06395322  .12682717500000001  .7008403 1 0 0 0 0  .15168357
      "IDN" 2007  .04392685            .1999754  .7008403 1 0 0 0 0  .12850058
      "IDN" 2008  .04147832          .249248525  .7008403 1 0 0 0 0   .1164973
      "IDN" 2009  .04596439          .068499975  .7008403 1 0 0 0 0  .25301528
      "IDN" 2010   .0331261           .07024945  .7008403 1 0 0 0 0   .2484166
      "IDN" 2011   .0342506          .102765225  .4172374 0 1 1 0 0   .8711606
      "IDN" 2012  .02131285           .32225955  .4172374 0 1 1 0 0   .8170259
      "IDN" 2013 .015832543  .09163945000000001  .4172374 0 1 1 0 0   .8676633
      "IDN" 2014 .031659987          .164782925  .4172374 0 1 1 0 0   .8562134
      "IDN" 2015  .01936742            .1107355  .4172374 0 1 1 0 0   1.396676
      "IDN" 2016  .03759759          .129507475  .4172374 0 1 1 0 0  1.3983735
      "IDN" 2017 .036192313          .036368925  .4172374 0 1 1 0 0  1.4190884
      "IDN" 2018 .031972926           .01740825  .4172374 0 1 1 0 0  1.4220184
      "IDN" 2019  .03615004          .270875675  .4172374 0 1 1 0 0  1.3657603
      "IDN" 2020  .02362779  .11192912499999999  .4172374 0 1 1 0 0          .
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 100 out of 275 observations
      Use the count() option to list more
      Last edited by Nour Mohamed; 06 Sep 2024, 16:57.


      • #4
        The problem doesn't lie in PCA either. PCA will ignore observations with missing values on any of the variables you use. How could it be otherwise?
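
        One way to see which observations are affected is to count missing values across the PCA inputs before running `pca`. This is only a sketch: it assumes the data from the -dataex- listing are in memory, and the variable list is a guess at what was passed to `pca`, so substitute the actual list. `nmiss` is a made-up variable name.

        ```stata
        * Hypothetical variable list: replace with the variables actually given to -pca-
        local pcavars GPR WUI kaopen military trade financial travel other

        * Number of missing values per observation across those variables
        egen byte nmiss = rowmiss(`pcavars')

        * Observations with nmiss > 0 are left out by -pca-, and scores
        * from -predict- after -pca- are missing for them as well
        tabulate nmiss
        list CountryCode Year `pcavars' if nmiss > 0, sepby(CountryCode)

        * Count of missing values variable by variable
        misstable summarize `pcavars'
        ```

        Because the local macro is defined on the first line, run the block as a whole from a do-file rather than line by line.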


        • #5
          Originally posted by Nick Cox
          The problem doesn't lie in PCA either. PCA will ignore observations with missing values on any of the variables you use. How could it be otherwise?

          So is PCA treating the dummy variables, when absent, as missing values then?

          Is there any alternative I can pursue to create a composite index that uses dummy variables?


          • #6
            PCA is entirely unfazed by zero values, if that is what you mean by "absent" here.

            There are many ways to get composite indexes from a set of indicator variables, such as getting their sum or their mean. Usually that works well only if in some sense all the indicators run in the same direction.
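
            A minimal sketch of the sum and the mean, assuming the five sanction indicators from the -dataex- listing are in memory and all coded 1 for "sanction in force". The new variable names `sanction_count` and `sanction_share` are made up for illustration.

            ```stata
            * Number of sanction types in force in each country-year (0 to 5)
            * Note: rowtotal() treats missing values as 0 unless -missing- is specified
            egen byte sanction_count = rowtotal(military trade financial travel other)

            * Share of sanction types in force (0 to 1)
            * Note: rowmean() averages over the nonmissing indicators only
            egen float sanction_share = rowmean(military trade financial travel other)

            * Quick check of the resulting distributions
            tabulate sanction_count
            summarize sanction_share
            ```

            Either variable could then be used on its own or alongside the continuous measures, rather than being folded into a single PCA score.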
