  • Double respondent firm surveys and analysis - averaging vs clustering

    I have come across an issue that I can't find any clear directions about. I have a dataset in which two respondents per firm fill out the same survey on their perceptions of firm performance, competitive advantage, and other perceptual variables. It seems to me that there are two possibilities for the analysis (OLS).

    1. Use an average of the two responses to bring the data back to a firm level.
    2. Cluster the standard errors on a firm level and thus have twice the amount of observations.

    The prior research I can find has usually gone with the first option (but it cites papers from the 70s and 80s to justify this, and those original papers do not justify it from a stats point of view). Can anybody with greater knowledge than me help shed light on this issue from a stats point of view?
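
    To make the two options concrete, here is a minimal sketch on simulated data. The variable names (perf, compadv, firm, resp) and the data-generating process are assumptions for illustration only, not the actual survey.

    ```stata
    * Simulated data: 100 firms, 2 respondents each (hypothetical names and values)
    clear
    set seed 12345
    set obs 100
    gen firm = _n
    gen u = rnormal()                     // firm-level component of perf
    gen v = rnormal()                     // firm-level component of compadv
    expand 2
    bysort firm: gen resp = _n            // respondent 1 or 2 within firm
    gen compadv = v + rnormal()
    gen perf = 0.5*compadv + u + rnormal()

    * Option 1: average the two responses per firm, then OLS on 100 firms
    preserve
    collapse (mean) perf compadv, by(firm)
    regress perf compadv
    restore

    * Option 2: keep all 200 responses, cluster standard errors on firm
    regress perf compadv, vce(cluster firm)
    ```

    Option 1 runs on 100 firm-level observations. Option 2 runs on 200 respondent-level observations, with standard errors adjusted for the 100 firm clusters.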

  • #2
    You should be more precise about your research goal: What effect are you interested in? What is your dependent variable? What time span do your data cover? And in your particular case: did both employees fill out the forms simultaneously?


    • #3
      Collapsing to averages is a very bad idea if you want to estimate any descriptive statistic besides the arithmetic mean:
      Code:
      clear
      set obs 100
      set seed 4389128
      gen firm = _n
      expand 2
      bys firm: gen id =_n
      
      gen x = exp(rnormal(0,2))
      
      
      tempfile orig means
      
      /* Original Data */
      svyset firm
      save `orig', replace
      
      /* Collapsed to firm means */
      collapse (mean) x, by(firm)
      svyset firm
      save `means', replace
      
      /* Estimate arithmetic mean */
      use `orig', clear
      svy: mean x
      use `means', clear
      svy: mean x
      
      /* Estimate Median (epctile is a user-written command and must be installed) */
      use `orig', clear
      epctile x, p(50) svy
      use `means' , clear
      epctile x, p(50) svy
      Results:

      Code:
      . use `orig', clear
      . svy: mean x
      --------------------------------------------------------------
                   |             Linearized
                   |       Mean   Std. Err.     [95% Conf. Interval]
      -------------+------------------------------------------------
                 x |    5.59676   1.157261      3.300503    7.893017
      --------------------------------------------------------------
      
      . use `means', clear
      . svy: mean x
      --------------------------------------------------------------
                   |             Linearized
                   |       Mean   Std. Err.     [95% Conf. Interval]
      -------------+------------------------------------------------
                 x |    5.59676   1.157261      3.300503    7.893017
      --------------------------------------------------------------
      .
      . /* Median */
      . use `orig', clear
      . epctile x, p(50) svy
      Percentile estimation
      ------------------------------------------------------------------------------
                   |             Linearized
                 x |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               p50 |   .8407342   .1801065     4.67   0.000     .4877319    1.193736
      ------------------------------------------------------------------------------
      
      . use `means' , clear
      . epctile x, p(50) svy
      Percentile estimation
      ------------------------------------------------------------------------------
                   |             Linearized
                 x |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               p50 |   1.731446   .3305758     5.24   0.000     1.083529    2.379363
      ------------------------------------------------------------------------------
      Steve Samuels
      Statistical Consulting
      [email protected]

      Stata 14.2


      • #4
        How were the two employees selected? At random from among a pool of those defined as "eligible" for the survey?
        Steve Samuels
        Statistical Consulting
        [email protected]

        Stata 14.2


        • #5
          A little experimentation shows that with much larger sample sizes, the differences between the cluster and average versions diminish. However, I asked the question about sampling of respondents in order to find out whether there might be useful information in differences between respondents from the same firm. How firms and respondents were selected would help to answer that question.
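
          One possible way to look at differences between respondents from the same firm, sketched on simulated data. The variable names (perf, firm, id) and the simulated values are hypothetical; this is only an illustration of the idea, not an analysis of the actual survey.

          ```stata
          * Simulated data: 100 firms, 2 respondents each (hypothetical names and values)
          clear
          set seed 12345
          set obs 100
          gen firm = _n
          gen u = rnormal()                 // component shared by both respondents
          expand 2
          bysort firm: gen id = _n          // respondent 1 or 2 within firm
          gen perf = u + rnormal()

          * Intraclass correlation: how much of the variance in perf lies between firms
          loneway perf firm

          * Within-firm difference between respondent 1 and respondent 2
          keep firm id perf
          reshape wide perf, i(firm) j(id)
          gen diff = perf1 - perf2
          summarize diff, detail
          ```

          A low intraclass correlation or a wide spread in diff would suggest that the two respondents often disagree, which averaging would hide.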
          Steve Samuels
          Statistical Consulting
          [email protected]

          Stata 14.2
