Double respondent firm surveys and analysis - averaging vs clustering

Tim Hasso

Join Date: Jul 2015

Posts: 1
#1

03 Jul 2015, 06:46

I have come across an issue that I can't find any clear directions about. I have a dataset where two respondents per firm filled out the same survey on their perception of firm performance, competitive advantage and other perceptual variables. It seems to me that there are two possibilities for the analysis (OLS):

1. Use an average of the two responses to bring the data back to a firm level.
2. Cluster the standard errors on a firm level and thus have twice the amount of observations.
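
To make the two options concrete, here is a minimal sketch on simulated data. The variable names (perf, adv, firm), the seed and the data-generating process are assumptions made up for illustration; they are not the actual survey.

```stata
* Hypothetical data: 100 firms, 2 respondents each
clear
set seed 12345
set obs 100
gen firm = _n
gen u = rnormal()                    // firm-level component shared by both respondents
expand 2                             // two rows (respondents) per firm
gen adv  = rnormal()                 // hypothetical perceptual predictor
gen perf = 0.5*adv + u + rnormal()   // hypothetical perceived performance

* Option 2: keep both respondents, cluster the standard errors by firm
regress perf adv, vce(cluster firm)

* Option 1: average the two responses so there is one row per firm
collapse (mean) perf adv, by(firm)
regress perf adv
```

Option 2 runs on 200 observations in 100 clusters, while option 1 runs on 100 firm-level averages.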

The prior research I can find has usually gone with the first option (but the authors cite papers from the 70s and 80s to justify this, and those original papers do not justify it from a statistical point of view). Can anybody with more knowledge than me help shed light on this issue from a statistical point of view?
Christian Agethen

Join Date: Nov 2014

Posts: 27
#2

08 Jul 2015, 15:25

You should be more precise about your research goal: What effect are you interested in? What is your dependent variable? What time span do your data cover? And in your particular case: did both employees fill out the forms simultaneously?

Steve Samuels

Join Date: Mar 2014
Posts: 1786
#3

08 Jul 2015, 20:26

Collapsing to averages is a very bad idea if you want to estimate any descriptive statistic besides the arithmetic mean:

Code:

set obs 100
set seed 4389128
gen firm = _n
expand 2                      // two respondents per firm
bys firm: gen id =_n

gen x = exp(rnormal(0,2))     // skewed (lognormal) response


tempfile orig means

/* Original Data */
svyset firm                   // firm is the cluster (PSU)
save `orig', replace

/* Collapsed to firm means */
collapse (mean) x, by(firm)   // one row per firm: the mean of its two responses
svyset firm
save `means', replace

/* Estimate arithmetic mean */
use `orig', clear
svy: mean x
use `means', clear
svy: mean x

/* Estimate Median (epctile is a user-written command; install it first, e.g. via -findit epctile-) */
use `orig', clear
epctile x, p(50) svy
use `means' , clear
epctile x, p(50) svy

Results:

Code:

. use `orig', clear
. svy: mean x
--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           x |    5.59676   1.157261      3.300503    7.893017
--------------------------------------------------------------

. use `means', clear
. svy: mean x
--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           x |    5.59676   1.157261      3.300503    7.893017
--------------------------------------------------------------
.
. /* Median */
. use `orig', clear
. epctile x, p(50) svy
Percentile estimation
------------------------------------------------------------------------------
             |             Linearized
           x |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         p50 |   .8407342   .1801065     4.67   0.000     .4877319    1.193736
------------------------------------------------------------------------------

. use `means' , clear
. epctile x, p(50) svy
Percentile estimation
------------------------------------------------------------------------------
             |             Linearized
           x |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         p50 |   1.731446   .3305758     5.24   0.000     1.083529    2.379363
------------------------------------------------------------------------------
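
A short derivation may help explain why only the mean survives the collapse. The notation is introduced here for illustration: x_{ij} is the response of respondent j in firm i, with n firms and two respondents per firm.

```latex
\bar{x}_i = \tfrac{1}{2}\,(x_{i1} + x_{i2}),
\qquad
\frac{1}{n}\sum_{i=1}^{n} \bar{x}_i
  = \frac{1}{2n}\sum_{i=1}^{n}\sum_{j=1}^{2} x_{ij}
```

So the mean of the firm means is the mean of all responses, which matches the two identical svy: mean tables. The median is not a linear function of the data, so the median of the firm means need not equal the median of the individual responses, as the 1.73 versus 0.84 estimates show.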

Steve Samuels
Statistical Consulting

Stata 14.2

Steve Samuels

Join Date: Mar 2014

Posts: 1786
#4

09 Jul 2015, 16:14

How were the two employees selected? At random from among a pool of those defined as "eligible" for the survey?

Steve Samuels
Statistical Consulting

Stata 14.2
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#5

14 Jul 2015, 22:21

A little experimentation shows that with much larger sample sizes, the differences between the cluster and average versions diminish. However, I asked the question about sampling of respondents in order to find out if there might be useful information in differences between respondents from the same firm. Knowing how firms and respondents were selected would help to answer that question.

Steve Samuels
Statistical Consulting

Stata 14.2