  • Principal component analysis for alternative proxies

    I have panel data with three alternative proxies (IndVar1, IndVar2, IndVar3) that basically measure the same concept. How can I use principal component analysis to combine these three proxies into a single proxy, so that the subsequent analysis and tables are cleaner?

    Main model in my research: reghdfe DepVar IndVar Controls, absorb() cluster()

    See below for the data on DepVar and IndVar. I assume no control variables will be used for the principal component analysis, correct? Thank you in advance for the help!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long gvkey double(fyear DepVar) float IndVar1 double IndVar2 float IndVar3
    12994 1992  -.56    .00540844                   .  .6532383
    12994 1993  -.28   .004935834                   .  .6034814
    12994 1994  -.33   .004775097                   .  .6251687
    12994 1995   -.2   .004983093                   .  .6231105
    12994 1996  -.15   .004461602                   .  .6325336
    12994 1997  -.24   .003900709                   .  .6226345
    12994 1998  -.16  .0029155216                   .   .578259
    12994 1999  -.23   .003016023                   . .58972865
    12994 2000   -.2  .0030279916                   . .59530157
    12994 2001     .   .002729808                   .  .4891296
    12994 2002  -.64  .0021383914                   . .58651394
    12994 2003  -.22   .002304107  .07442265412013015  .6755146
    12994 2004  -.79  .0018747656 .062054743157105366   .664766
    12994 2005  -.84   .001843272 .055298161709759375   .489827
    12994 2006 -1.18  .0017245542 .058979753733659865  .4455284
    12994 2007  -.63  .0013577248 .043039874231813637  .4127912
    12994 2008  -.11  .0022961204   .0738202692003167  .3696525
    12994 2009  -.63  .0022685607  .07338793706063122  .4201931
    12994 2010  -.55   .002429222  .07591318393406761  .4039361
    12994 2011  -.57   .001742565  .06081552044609666  .4516295
    12994 2012  -.53  .0015599926  .04406979048788129   .476481
    12994 2013  -.95  .0016679754 .044451545599926624 .44745025
    12994 2014  -.26  .0011239594 .025345286241558036  .5163922
    12994 2015  -.22  .0009625988 .027145287003981236  .4973588
    12994 2016  -.19  .0009307549  .02531653253342371 .51517814
    19049 1992     .            .                   .         .
    19049 1993     .            .                   .         .
    19049 1994     .            .                   .         .
    19049 1995     .            .                   .         .
    19049 1996     .            .                   .         .
    19049 1997     .            .                   .         .
    19049 1998     .            .                   .         .
    19049 1999     .  .0005279381                   .         .
    19049 2000     .  .0004776801                   . .52466106
    19049 2001     . .00044878325                   . .54433024
    19049 2002     .  .0005047504                   .         .
    19049 2003     .  .0006503357                   .  .6144845
    19049 2004     .  .0003528647                   .         .
    19049 2005     . .00033243175                   .  .4410313
    19049 2006     . .00032095055                   .         .
    19049 2007     . .00050686934                   .  .6006066
    19049 2008     .  .0002772205                   .         .
    19049 2009     . .00027057092                   .         .
    19049 2010     . .00028201297                   . .54251695
    19049 2011     . .00025021454                   .  .7300749
    16950 1994     .            .                   .         .
    16739 1992     .            .                   .         .
    16739 1993     .            .                   .         .
    16739 1994     .            .                   .         .
    16739 1995     .            .                   .         .
    16739 1996     .            .                   .         .
    16739 1997     .            .                   .         .
    16739 1998     .            .                   .         .
    16739 1999     .  .0003397757                   .         .
     4641 1999  -.75  .0039060125  .01679585385129795  .6458617
     4641 2000  -.48     .0045971 .013331589564619169  .6057627
    11908 1992   -.2   .015568634   .1712549838617809  .8920395
    11908 1993  -.13   .015002538  .11551954946221668  .9201134
    11908 1994  -.09   .014272025  .11274899586111282  .8750779
    11908 1995  -.06    .01522486  .12788881948782013   .861227
    11908 1996   -.1    .01369816  .08492859833567344  .8307174
    11908 1997  -.09    .01317675  .10014329715217499  .8125359
    11908 1998  -.08    .02582409  .17818623727783686  .7449759
    11908 1999  -.18   .006290726 .035857137614871276  .9388587
    11908 2000     .   .007939343  .03255130800682783  .8582537
    13073 1992     .  .0005427379                   . .54051685
    12115 1992 -1.58  .0027985075 .024067164179104476  .4264534
    12115 1993 -1.89   .002607562 .021382007822685786  .6258829
    12115 1994     .   .003222836  .02707182320441989  .4209149
    12825 1992  -.09   .009577443   .2767881086465157  .7387033
    12825 1993  -.06   .009337733   .2960061184380197  .6722038
    12825 1994  -.07    .01399246  .37359869623191266  .7162285
    12825 1995  -.11   .011777083   .3627341676742616  .7804794
    12825 1996  -.13   .012164562  .34304065396686373   .759369
    12825 1997  -.14    .01380677    .443197323110909  .7223861
    12825 1998  -.12   .013886148  .42075029051283735  .6724367
    12825 1999  -.12   .009612634    .290301555371118  .6642201
    12825 2000   -.3    .01029577  .26974915761886936  .6841277
    12825 2001  -.32   .009998227   .1909661469103205  .6845236
    12825 2002  -.31   .010245856  .20081877753573799  .6544336
    12825 2003  -.18   .010452835   .1604510171062917  .6696278
    12825 2004   -.1    .00971764  .14965164375979684  .6465802
    12825 2005   -.2   .008470789  .12452058942268873  .6323833
    12825 2006  -.17   .007518207  .11765994168486289   .611187
    12825 2007  -.06   .007628929  .09993896856881294  .6062316
    12825 2008  -.15    .00736284  .08651337933355771  .6536069
    12825 2009  -.08   .006611036  .08991008991008992  .5995871
    12825 2010  -.09   .006138777  .08103186025414538 .55019194
    12825 2011  -.09   .006405038   .0925528002413253 .55718964
    12825 2012  -.12   .005809563   .0720385754952652  .5646573
    12825 2013  -.08   .005638063 .058072047810773346  .5540121
    12825 2014  -.13   .005265921  .05055284429445174 .53680456
    12825 2015  -.11   .004824248  .06247401645636254  .5315951
    12825 2016  -.09   .004845646  .04942558904884008  .5211627
    12825 2017  -.05    .00531318                   .  .5373825
    12096 1992     .    .00945335 .014180024660912454 1.0209436
    12096 1993 -1.16    .01277275 .015327301756253327  .6785985
    12096 1994  -.61   .009256304  .02036386849664858  .7435537
    12096 1995  -.74    .00730953 .012426201855496203  .8320454
    12096 1996  -.33   .004839531 .010163015792154865  .9697902
    end

  • #2
    You could use factor analysis, principal components analysis, or even the alpha procedure to generate your scale. Note that almost all of these techniques will result in missing values for every observation that lacks a value on any of the three variables. Looking at your data, this may substantially reduce your sample size.
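
    A minimal sketch of the PCA route, assuming the variable names from the -dataex- example above; the names n_complete-style checks and pc1 are only illustrative. It first counts how many observations have all three proxies, since that is the sample -pca- will actually use, and then saves the first component's score.

    Code:
    * how many observations have all three proxies non-missing?
    count if !missing(IndVar1, IndVar2, IndVar3)

    * PCA on the correlation matrix (the default), using complete cases only
    pca IndVar1 IndVar2 IndVar3

    * score on the first principal component; pc1 is a made-up name
    * (missing wherever any of the three proxies is missing)
    predict pc1, score

    The score pc1 could then take the place of IndVar in the main regression, on whatever sample survives the complete-case restriction.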

    It is also not unheard of simply to standardize the three indicators and add them up or take the mean.
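
    A sketch of that simpler approach, again assuming the variable names from the example data; the z_ prefix and idx_mean are made-up names. Note one assumption built in here: -egen, rowmean()- averages over whichever standardized proxies are non-missing, so an observation with only one or two proxies still gets an index value. Whether that is appropriate is a judgment call.

    Code:
    * standardize each proxy to mean 0, sd 1 over the whole sample
    foreach v of varlist IndVar1 IndVar2 IndVar3 {
        egen z_`v' = std(`v')
    }

    * average the standardized proxies that are available in each row
    egen idx_mean = rowmean(z_IndVar1 z_IndVar2 z_IndVar3)

    To require all three proxies instead, one option would be to set idx_mean to missing where any proxy is missing.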

    One issue you may need to worry about is how highly correlated these proxies are. If they are not highly correlated, it is quite likely that they are actually measuring different constructs.
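
    A quick way to check this before building any index, assuming the same variable names as in the example data:

    Code:
    * pairwise correlations, with the number of observations and
    * the significance level shown for each pair
    pwcorr IndVar1 IndVar2 IndVar3, obs sig

    Because IndVar2 is missing for many firm-years in the example, the obs option is useful for seeing how many observations each correlation rests on.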
