Running dominance analysis on datasets with bootstrap weights

Stephanie Cheng

#1

18 May 2023, 10:42

Dear all,

A dataset I work on has both a survey weight and bootstrap weights.

Here is an excerpt of my dataset. The variable 'svywgt' is the survey weight, and 'bs1' to 'bs3' are bootstrap weights. My goal is to assess the relative importance of 'space', 'room', 'afford', and 'condition' (independent variables) for 'rating' (dependent variable).

Code:

clear
input double(rating svywgt bs1 bs2 bs3 space room afford condition)
 6  162.7     .5  151.9     .5 3 5 4 3
 9 2401.8    2.1    2.1    2.2 5 5 2 3
10   49.2      2    2.4    1.3 3 2 4 2
10   26.9     50    1.8   49.4 5 5 5 5
 5    9.2   23.5     .5     .8 1 1 4 2
 8   48.6   85.6   39.9  130.5 4 4 3 4
 8 1102.6 1067.9 1418.2 4567.2 2 1 2 5
10 2259.2     .5 4589.1     .6 4 4 3 1
 4   83.2   81.3  151.6   93.4 3 2 5 4
 7  112.5    2.1  245.5  371.3 5 5 2 5
end

Normally, before running a regression, I declare the survey weight and bootstrap weights in the survey design:

Code:

svyset [pweight=svywgt], brrweight(bs1-bs3) vce(brr) mse

However, when I applied the -svy:- or -svy brr:- prefix to the -domin- command, an error message popped up:

Code:

. svy brr: domin rating space room afford condition, reg(regress) fitstat(e(r2))
domin is not supported by svy with vce(brr); see help svy estimation for a list of Stata estimation commands that are supported by svy
r(322);

Here are some other things I found while exploring possible solutions:
I am able to use only the survey weight with -domin-, i.e.

Code:

domin rating space room afford condition [pweight=svywgt], reg(regress) fitstat(e(r2))

The -bootstrap:- prefix is compatible with the -domin- command, but it does not use the bootstrap weights in the dataset.

Code:

bootstrap: domin rating space room afford condition, reg(regress) fitstat(e(r2))

My questions are:
1. Is there any quick way to run -domin- command with bootstrap weights that I am not aware of?

2. If I run -domin- but replace [pweight=svywgt] with [pweight=bs1], and repeat this 1000 times (the real dataset has 1000 bootstrap weights), are the results equivalent to running -domin- with bootstrap weights? How should I automate this process and summarize the results?

3. Is there any other way that allows me to run dominance analysis with bootstrap weights, for example, writing my own programs in between commands?

4. After running -domin-, how can I obtain standard errors or confidence intervals for each independent variable?

Many thanks!

Joseph Luchman

#2

22 May 2023, 07:35

Hi Stephanie,

-domin-'s helpfile discusses a few of these points. For example, it notes that -domin- does not produce standard errors for any analysis without bootstrapping (or similar methods), which can be of more or less value depending on what the researcher believes the main sources of instability in the model are.

The same reasoning extends to support for the -svy- prefix: most of what -svy- does serves to obtain accurate sampling variances, whereas the sampling weights are what matter most for -domin-, as the helpfile also discusses.

To your question #1, I'm not familiar with research that has attempted to devise a method that incorporates bootstrap weights from a complex survey design into dominance statistics/Shapley values and there's no canned/pre-developed method to incorporate bootstrap weights in the program. It seems possible to do, but would run well out ahead of established practice.

To #2, this seems like the most feasible way to accomplish this. I will admit, though, that my understanding of bootstrap weights with complex surveys is cursory, as is my grasp of the nuances of how to get a CI with them, if these differ from standard bootstrapping. One option is to generate a 'container' matrix and fill it in with results as the loop runs. Something like the below, which provides the (unadjusted) 95% CI for the 'age' variable in the example. I use Mata here, but this could also be done with Stata matrices.

Code:

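webuse nmihs_bs

* full-sample run with the survey weight
domin bwgrp age vagbleed multiple highbp childsex [pw=finwgt], reg(regress) fitstat(e(r2))

* container: one row per bootstrap weight, one column per predictor
mata: res_mat = J(1000, 5, .)

forvalues rep = 1/1000 {
    quietly domin bwgrp age vagbleed multiple highbp childsex [pw=bsrw`rep'], reg(regress) fitstat(e(r2))
    mata: res_mat[`rep', ] = st_matrix("e(b)")
}

* 25th and 975th sorted values of column 1 (age): unadjusted 95% CI
mata: sort(res_mat[,1], 1)[(25,975),1]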
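
As noted, the same loop can be written with a Stata matrix instead of Mata. The following is a minimal sketch rather than a tested recipe. It assumes -domin- is installed, that e(b) returns the five general dominance statistics in the order the predictors are listed, and that percentile intervals are an acceptable summary. The names `preds` and `res` are introduced here only for illustration. It reports an unadjusted 95% percentile interval for every predictor, not just 'age'.

```stata
* Sketch only: names preds and res are illustrative, not part of -domin-
webuse nmihs_bs, clear
local preds age vagbleed multiple highbp childsex

* container: 1000 bootstrap-weight runs by 5 predictors
matrix res = J(1000, 5, .)

forvalues rep = 1/1000 {
    quietly domin bwgrp `preds' [pw=bsrw`rep'], reg(regress) fitstat(e(r2))
    * place the 1 x 5 row vector e(b) with its upper-left corner at (rep, 1)
    matrix res[`rep', 1] = e(b)
}
matrix colnames res = `preds'

* turn the results into variables and summarize each column
preserve
clear
svmat res, names(col)
foreach v of local preds {
    _pctile `v', percentiles(2.5 97.5)
    display "`v': " %6.4f r(r1) " to " %6.4f r(r2)
}
restore
```

Using -svmat- with -_pctile- avoids hand-picking row positions such as 25 and 975, so the same code works if the number of bootstrap weights changes, provided the loop bound and the matrix dimensions are changed to match.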

#3 - Possibly. The helpfile discusses the structure a command needs in order to be wrapped into -domin-. So long as the command is 'wrap-able', it should work with -domin-. That said, -domin- will still not accept -svy- or other prefixes it is not designed to accept. This may be a challenging programming problem to solve.

#4 - Back to #2: I am not entirely certain, beyond the standard bootstrap practice of sorting the results and taking the lower and upper bounds of the middle 95% of estimates as a 95% CI. If there are complex survey details in addition to this, I am not aware of what they may be.
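
One possible alternative to the percentile interval is a standard error built from squared deviations of the replicate estimates around the full-sample estimate. This is the form Stata's replicate-weight variance estimators take when the mse option is set (as in the -svyset- call in #1), here with no Fay adjustment and the default bsn(1). Whether this carries over sensibly to dominance statistics is an assumption, not established practice. The sketch below also assumes that `res_mat` from the loop above is still in Mata's memory; the names `full`, `dev`, and `se` are illustrative.

```stata
* Sketch only: requires res_mat (1000 x 5) from the loop above
quietly domin bwgrp age vagbleed multiple highbp childsex [pw=finwgt], reg(regress) fitstat(e(r2))

mata:
full = st_matrix("e(b)")                      // full-sample general dominance statistics (1 x 5)
dev  = res_mat :- full                        // each replicate's deviation from the full-sample estimate
se   = sqrt(colsum(dev:^2) :/ rows(res_mat))  // MSE-form standard error per predictor
(full \ se)'                                  // column 1: estimate, column 2: standard error
end
```

If the bootstrap weights were produced by a mean bootstrap, the divisor would need the corresponding scaling factor (the bsn() value in -svyset-), which should be checked against the survey's documentation.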

In the end, this is an interesting question, but I would urge you to consider whether just running -domin- with the survey weight could provide what you need. What role is the standard error playing in the result, and how does it meaningfully improve understanding beyond that of the standard errors estimated for the base -regress- model? I have always found that, practically speaking, the results from the complete and conditional dominance statistics provide as much information as the standard errors of the general dominance statistics, if not more, and do not require thousands of re-runs of the dominance analysis to obtain.

- joe

Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
----
Research Fellow
Fors Marsh
----
Version 18.0 MP