  • Constructing confidence intervals by inversion using permute?

    I would like to construct confidence intervals by test inversion. My first hunch was to use the Stata command permute and run it for a wide range of null hypotheses, and, based on this, determine the appropriate confidence interval. I do not know, however, how to properly test a null hypothesis other than zero using permute. The basic code is this:
    Code:
    webuse lifeexp
    permute safewater _b, reps(1000): reg lexp safewater, r
    Which prints:

    Code:
    Monte Carlo permutation results                  Number of observations =    40
    Permutation variable: safewater                  Number of permutations = 1,000
    
          Command: regress lexp safewater, r
            _pm_1: _b[safewater]
    
    -------------------------------------------------------------------------------
                 |                                               Monte Carlo error
                 |                                              -------------------
               T |    T(obs)       Test       c       n      p  SE(p)   [95% CI(p)]
    -------------+-----------------------------------------------------------------
           _pm_1 |   .238561      lower    1000    1000 1.0000  .0000  .9963 1.0000
                 |                upper       0    1000  .0000  .0000  .0000  .0037
                 |            two-sided                  .0000  .0000      .      .
    -------------------------------------------------------------------------------
    Notes: For lower one-sided test, c = #{T <= T(obs)} and p = p_lower = c/n.
           For upper one-sided test, c = #{T >= T(obs)} and p = p_upper = c/n.
           For two-sided test, p = 2*min(p_lower, p_upper); SE and CI approximate.

    Now, I wish to run something similar for other hypotheses, e.g., not only that _b[safewater] = .238561, so that I can construct a confidence interval for the point estimate.

    My first thought was to do something like this:
    Code:
    permute safewater (_b[safewater]-0.2), reps(1000): reg lexp safewater, r
    However, even for extreme values, it reports exactly the same p-values for the permutation test, making me doubt whether this is a correct approach.

    Code:
    Monte Carlo permutation results                  Number of observations =    40
    Permutation variable: safewater                  Number of permutations = 1,000
    
          Command: regress lexp safewater, r
            _pm_1: _b[safewater]-0.2
    
    -------------------------------------------------------------------------------
                 |                                               Monte Carlo error
                 |                                              -------------------
               T |    T(obs)       Test       c       n      p  SE(p)   [95% CI(p)]
    -------------+-----------------------------------------------------------------
           _pm_1 |   .038561      lower    1000    1000 1.0000  .0000  .9963 1.0000
                 |                upper       0    1000  .0000  .0000  .0000  .0037
                 |            two-sided                  .0000  .0000      .      .
    -------------------------------------------------------------------------------
    Notes: For lower one-sided test, c = #{T <= T(obs)} and p = p_lower = c/n.
           For upper one-sided test, c = #{T >= T(obs)} and p = p_upper = c/n.
           For two-sided test, p = 2*min(p_lower, p_upper); SE and CI approximate.
    Any suggestions on how to do this using permute or any other Stata command would be appreciated!

  • #2
    Setting aside any possible oddities you get in the -permute- output: Do you have a reason for not using -bootstrap-? It's intended for getting CIs via resampling, as it simulates a sampling distribution in which the alternative (non-null) hypothesis is true. -permute- simulates a sampling distribution in which the null hypothesis is true, which is not what you'd want for confidence intervals.
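
    If it helps, here is a minimal sketch of what I mean, built on your example (untested; the rep count and seed are arbitrary, and the percentile CI is just one of the intervals -estat bootstrap- reports):
    Code:
    webuse lifeexp, clear
    bootstrap _b, reps(1000) seed(12345): reg lexp safewater, r
    estat bootstrap, percentile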

    • #3
      Thank you for your reply. I think my confusion may have come exactly from my misunderstanding of what permute actually does. Ultimately, what I want to do is to implement randomization inference, which as I understand it involves testing the sharp null hypothesis of no effect. The idea then would be to construct a confidence interval by test inversion, utilizing the duality between confidence intervals and hypothesis tests, i.e., to find the set of all null hypotheses that would not be rejected.
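
      To make this concrete, here is a rough, untested sketch of the grid search I have in mind (the grid bounds, step size, and rep count are arbitrary; for each candidate effect b0, I impose the sharp null by subtracting b0*safewater from the outcome and then run the usual no-effect permutation test; I am assuming the saving() dataset names the statistic _pm_1, as in the output above):
      Code:
      webuse lifeexp, clear
      forvalues i = 0/14 {
          local b0 = -0.1 + 0.05*`i'
          quietly gen double lexp_adj = lexp - `b0'*safewater   // impose H0: b = b0
          quietly reg lexp_adj safewater, r
          local tobs = _b[safewater]                            // observed statistic under H0
          quietly permute safewater _b[safewater], reps(200) nodots ///
              saving(perm_tmp, replace): reg lexp_adj safewater, r
          preserve
          quietly use perm_tmp, clear
          quietly count if abs(_pm_1) >= abs(`tobs')            // two-sided count
          display "b0 = " %6.3f `b0' "   p = " %6.4f r(N)/_N
          restore
          drop lexp_adj
      }
      * the 95% CI is (approximately) the set of b0 values with p >= .05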

      • #4
        I still think that the most helpful answer I can offer would be "use -bootstrap-, not -permute-," so I'd ask my question somewhat more pointedly: Is there some reason you would think that -bootstrap- is not valid in your situation? To the best of my understanding, such situations may occur, but they are not common. Knowing *why* you don't want to use a bootstrap procedure here might be helpful. Perhaps you are following some source that advises this combination of inversion/permutation? I have not seen that before.

        Now, it *may* be possible that inverting a permutation test will give a valid confidence interval here, perhaps the same one a bootstrap procedure would give. That's an interesting question in itself: under what circumstances would those two procedures produce the same result?

        But let's assume provisionally that both approaches, bootstrap and permute/test inversion, are valid and would give the same results. I'd think that your approach would be more compute-intensive, as (to the best of my understanding) test inversion would require that you iterate on the value of the confidence limits, performing a permutation test repeatedly. If it took (say) 10 iterations to arrive at the upper CI limit, and the same for the lower limit, and supposing that (say) 1,000 permutation repetitions are used for each iteration, that would take quite a while. I'd rather devote that computation time to one bootstrap procedure with (say) 20,000 repetitions.

        As to your wondering about "what permute actually does:" Per -help permute-, it randomly shuffles the values of the permutation variable across observations, runs the estimation command each time, and collects the results. Another description would be that -permute- repeatedly runs the estimation command using data sets formed by sampling values of the permutation variable *without replacement* from the observed data.
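
        For intuition, one such draw could be coded by hand as something like this (untested sketch; the newpos trick just gives each observation the safewater value from a randomly chosen other observation):
        Code:
        webuse lifeexp, clear
        set seed 12345
        gen long id = _n
        gen double u = runiform()
        sort u
        gen long newpos = _n                     // a random permutation of 1..N
        sort id
        gen double sw_perm = safewater[newpos]   // shuffled copy of safewater
        reg lexp sw_perm, r                      // one draw from the permutation distribution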

        I might be missing something here, so I'd encourage anyone else to enter in as appropriate.

        • #5
          Dear Mike Lacy, I see how bootstrap is very similar to what I wish to do, but there seem to be some conceptual differences. I found this discussion of the similarities and differences between bootstrapping and randomization inference very helpful: https://jasonkerwin.com/nonparibus/2017/09/25/randomization-inference-vs-bootstrapping-p-values/ . I must admit it's not fully clear to me how different the results would actually be. Ideally, I would then want to compare both approaches. Your point about computational intensity is definitely valid, as test inversion would indeed require iterating over a specified grid.

          • #6
            OK, I see that the material you cite argues for a different kind of justification for resampling methods, but I'm not sure the resulting p-values estimate something different from what one would get from a conventional normal-theory p-value when the requisite assumptions are met.

            I did do a little messing around with trying to use -permute- to create a p-value for a test against something other than a zero null value, and I don't think Stata's -permute- will do it as you would like simply by using a different expression. I think you need to save the results from
            Code:
            permute safewater (_b[safewater]), reps(1000): reg lexp safewater, r
            and then count the permuted results that exceed (_b[safewater] - 0.2) yourself. (Here, 0.2 is just a stand-in for whatever parameter value you want to test.)
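
            Something along these lines, perhaps (untested; I'm assuming -permute-'s saving() dataset names the statistic _pm_1, as in your output above):
            Code:
            webuse lifeexp, clear
            quietly reg lexp safewater, r
            local tobs = _b[safewater]        // observed coefficient, about .2386
            permute safewater _b[safewater], reps(1000) nodots ///
                saving(perms, replace): reg lexp safewater, r
            preserve
            use perms, clear
            count if _pm_1 >= `tobs' - 0.2    // upper-tail count against the shifted value
            display "p_upper = " r(N)/_N
            restore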

            • #7
              Thank you, Mike. I did a little more digging and found the user-written command ritest (available from SSC), which extends permute by allowing the user to specify the null hypothesis.
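
              The basic usage mirrors permute; see -help ritest- for the options that handle nulls other than zero (untested sketch):
              Code:
              ssc install ritest
              ritest safewater _b[safewater], reps(1000): reg lexp safewater, r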

              • #8
                This puzzles me: In my experience, permutation tests *always* use a null hypothesis of "no effect." (For example, if you add a constant to the variable lexp in your example, the null value, i.e., the expected value of the parameter in the permutation sampling distribution, will nevertheless be 0.) I will have to look at -ritest- and see what the story is. If you have a nice example to post here, please do.
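
                A quick, untested way to see the point about the constant:
                Code:
                webuse lifeexp, clear
                replace lexp = lexp + 10    // shift the outcome by a constant
                permute safewater _b[safewater], reps(1000): reg lexp safewater, r
                * T(obs) and the p-values are unchanged; the permutation
                * distribution of _b[safewater] is still centered near 0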
