Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Omnibus Test using Randomization Inference

    Dear Community

    I'm currently trying to do an balance check for experimental data, in which I use randomization inference to calculate a p-value for an omnibus test. I am currently using the code below. However, I'm not able to reproduce these p-values, even though I have set a seed beforehand. The changes are quite large when re-running the code (range between 0.998 and 0.020). What am I missing here? Do I need to set anything else for this to output me the same value each time? Thank you so much for the help and inputs!

    Code:
    snapshot restore 1 
    
    estpost sum $testvar1 if treat==0 & empty!=.
    matrix cgmean=e(mean)
    matrix cgsd=e(sd)
    local cgN = e(N)
    
    estpost sum $testvar1 if treat==1 & empty!=.
    matrix tgmean=e(mean)
    matrix tgsd=e(sd)
    local tgN = e(N)
    
    
    * RI omnibus test
    regress treat $testvar1 if empty != ., vce(cluster cluster)
    test $testvar1
    scalar F_obs = r(F)
    
    local reps = 1000
    tempname Fvals
    matrix `Fvals' = J(`reps', 1, .)
    set seed 210725
    
    forvalues i = 1/`reps' {
        preserve
        gen treat_perm = treat
        gen rand = runiform()
        sort rand
        replace treat_perm = treat[_n]
        quietly regress treat_perm $testvar1 if empty!= ., vce(cluster cluster)
        quietly test $testvar1
        matrix `Fvals'[`i',1] = r(F)
        restore
    }
    
    local count = 0
    forvalues i = 1/`reps' {
        if `Fvals'[`i',1] >= F_obs {
            local count = `count' + 1
        }
    }
    local pRI = `count' / `reps'
    scalar pvalue = int(`pRI' * 1000) / 1000
    di as result "  RI p-value     : " %6.3f pvalue
    
    matrix pvalues = J(1, `: word count $testvar1', .)
    local counter = 1
    
    gen     cg = treat==0 
    replace cg = .     if treat==.
    foreach var of global testvar1 {
        
        if strpos("$nonbin", "`var'") {
            ranksum `var' if empty != ., by(cg)
            local pvalue = r(p)
        }
        else {
            if strpos("$bin", "`var'") {
                tabulate `var' cg if empty != ., chi2
                local pvalue = r(p)
            }
        }
    
        matrix pvalues[1, `counter'] = `pvalue'
        local counter = `counter' + 1
    }
    matrix colnames pvalues = $testvar1
    matrix list pvalues

  • #2
    try moving the seed before snapshot

    Comment


    • #3
      Hi Georg! Thanks for the input. Unfortunately, this does not solve my problem. Any other suggestions?

      Comment


      • #4
        You can try
        Code:
        sort rand, stable
        You can also add some random numbers to display as a debugging help, like:
        Code:
        di runiform()
        As soon as this number changes in each run, you know the problem must occur before.
        Best wishes

        Stata 18.0 MP | ORCID | Google Scholar

        Comment


        • #5
          Hi Felix, thanks for the help. Interestingly, when I add the display code, I can relatively clearly see that the numbers are stable, so the seed seems to work for the runiform. However, I'm not adding any other RNG in the code for generating the p-value for the f-stat or am I completely missing something?

          Code:
          set seed 031192
          snapshot restore 1 
          
          estpost sum $testvar1 if lao_index!=.
          matrix mean=e(mean)
          matrix sd=e(sd)
          local N = e(N)
          
          estpost sum $testvar1 if treat==0 & lao_index!=.
          matrix cgmean=e(mean)
          matrix cgsd=e(sd)
          local cgN = e(N)
          
          estpost sum $testvar1 if treat==1 & lao_index!=.
          matrix tgmean=e(mean)
          matrix tgsd=e(sd)
          local tgN = e(N)
          
          * RI omnibus test
          regress treat $testvar1 if lao_index != ., vce(cluster cluster)
          test $testvar1
          scalar F_obs = r(F)
          
          local reps = 1000
          tempname Fvals
          matrix `Fvals' = J(`reps', 1, .)
          
          forvalues i = 1/`reps' {
              preserve
              gen treat_perm = treat
              gen rand = runiform()
              di runiform()
              sort rand, stable
              replace treat_perm = treat[_n]
              quietly regress treat_perm $testvar1 if lao_index != ., vce(cluster cluster)
              quietly test $testvar1
              matrix `Fvals'[`i',1] = r(F)
              restore
          }
          
          local count = 0
          forvalues i = 1/`reps' {
              if `Fvals'[`i',1] >= F_obs {
                  local count = `count' + 1
              }
          }
          
          local p1 : display %4.3f `count' / `reps'
          scalar pvalue = int(`p1' * 1000) / 1000
          
          matrix pvalues = J(1, `: word count $testvar1', .)
          local counter = 1
          
          gen     cg = treat==0 
          replace cg = .     if treat==.
          foreach var of global testvar1 {
              if strpos("$nonbin", "`var'") {
                  ranksum `var' if lao_index!=., by(cg) 
                  local pvalue = r(p)
              }
              else if strpos("$bin", "`var'") {
                  tabulate `var' cg if lao_index!=., chi2
                  local pvalue = r(p)
              }
              matrix pvalues[1, `counter'] = `pvalue'
              local counter = `counter' + 1
          }
          matrix colnames pvalues = $testvar1
          matrix list pvalues

          Comment


          • #6
            So the matrix is always the same? After the first loop you can add:
            Code:
            mat list  `Fvals'
            And check
            Best wishes

            Stata 18.0 MP | ORCID | Google Scholar

            Comment


            • #7
              Yes, the matrix shows me the same value 1000 times (1.0486942) and this number is consistent over runs. But the p-values are still not the same (e.g., 1st run: 0.767, 2nd run: 0.770)
              Last edited by Arto Arman; 30 Jul 2025, 03:40.

              Comment


              • #8
                Now I am quite confused. Why would the values of the matrix be constant? I mean, you want the matrix to have the same results when running the dofile multiple times, but of course, the matrix itself should not only have a single value present. I fear without sharing a data example this could take longer to solve.
                Best wishes

                Stata 18.0 MP | ORCID | Google Scholar

                Comment


                • #9
                  Sorry, I forgot to provide it before. Here is my data example for some of the variables stored in the local "testvar1" and the treatment indicator "treat". Do you think you need anything else?

                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input float treat byte(lao_1 wb_1 fair_1 trust_1 criminal_1 risk_1) float age
                  1 5  7 5  9 0  5 39
                  1 6  8 6  8 0  7 59
                  1 5  5 3  4 2  8 42
                  1 5  8 5  8 2  7 19
                  1 .  . .  . .  . 25
                  1 3  6 5  7 0  3 22
                  1 3  3 4  8 2  3 52
                  1 5  5 3  7 1  4 33
                  1 5  7 4  6 2  2 43
                  0 5  1 4  8 0  2 28
                  1 4  4 5  8 1  6 25
                  1 3  2 5  9 1  3 23
                  1 4  2 3  3 2  8 25
                  1 6  3 5  9 0  5 34
                  1 5  6 5  8 1  8 59
                  1 3  7 6  8 3  2 27
                  0 4  3 5  8 3  4 55
                  0 0  0 2  1 5  5 27
                  1 .  . .  . .  . 33
                  1 6 10 6  7 1  4 54
                  1 1  0 6 10 4  2 52
                  1 6  1 5  9 0  3 24
                  1 5  1 3  7 2  6 46
                  0 6  5 5 10 0  7 65
                  1 2  7 2  6 5  5 34
                  1 5  5 4  7 1  8 42
                  1 5  7 5  9 1  6 39
                  0 2  2 3  3 3  5 33
                  1 5  8 5  9 0  2 48
                  1 5  9 5  8 0  7 30
                  1 4  3 5  9 3  8 34
                  1 3  1 5  8 3  4 35
                  1 .  . .  . .  . 38
                  1 4  7 5  9 1  5 52
                  1 3  2 5  7 1  7 38
                  1 6  2 6  9 0  1 36
                  1 6  3 5  9 0  2 34
                  1 6  1 5  9 0  4 24
                  1 6  8 5  7 1  7 55
                  1 6  8 5  8 0  3 30
                  0 5  5 5  7 0  3 64
                  0 6  5 4  8 0  7 31
                  1 6  6 5  9 1  5 47
                  1 5  4 5  9 1  3 34
                  0 5  2 4  8 2  2 48
                  1 .  . .  . .  . 38
                  1 5  5 5  9 0  7 53
                  1 5  5 5  9 0  5 48
                  1 2  2 4  6 1  4 31
                  1 5  3 4  7 2  7 50
                  1 5  3 5  8 3  6 35
                  1 4  6 5  8 2 10 21
                  1 6  6 6  8 0  3 28
                  1 5  2 6  9 3  2 34
                  0 5  5 5  8 0  2 37
                  0 3  5 2  6 2  4 29
                  0 6  4 2  6 0  6 43
                  1 3  5 4  9 1  2 55
                  1 6  1 4  7 4  1 33
                  1 3  8 6 10 2  4 70
                  1 6  7 6 10 0  7 60
                  1 3  0 4  6 0  2 28
                  1 4  0 5  7 1  7 27
                  0 6  5 5  7 2  6 63
                  1 5  3 5 10 0  6 58
                  1 3  6 6 10 1  6 31
                  1 5  5 4  7 4  5 18
                  1 5  4 5  8 0  8 31
                  1 2  3 5  8 0  3 30
                  1 5  4 3  5 2  6 19
                  1 6  6 5  9 2  4 24
                  1 5  2 5  9 3  5 24
                  1 6  2 4  7 0  7 52
                  1 2  8 5  8 3  6 29
                  1 5  3 4  8 2  3 28
                  1 3  4 4  6 2  7 40
                  1 3  7 6  9 3  7 52
                  0 4  6 4  6 2  5 44
                  1 4  5 4  7 4  5 39
                  1 4  5 4  4 2  3 33
                  1 2  7 4  8 1  7 34
                  1 5  6 5  9 1  3 43
                  1 4  2 4  8 1  7 34
                  1 6  8 4  7 1  2 29
                  1 3  3 3  7 3  4 26
                  1 2  2 6  9 1  2 30
                  1 6  7 6 10 0  2 44
                  1 4  5 5  9 1  6 40
                  1 3  2 5  8 1  2 39
                  1 3  1 5  8 1  7 43
                  1 4  2 3  7 3  4 36
                  1 .  . .  . .  . 33
                  1 6  7 5 10 2  3 21
                  1 3  2 5  7 0  5 38
                  1 5  0 4  9 2  4 42
                  1 2  7 5  8 1  3 28
                  1 4  3 6 10 0  3 25
                  1 4  4 5  9 3  7 50
                  0 1  3 5  8 2  9 25
                  0 3  3 3  3 2  2 39
                  end

                  Comment


                  • #10
                    Please make sure that your example code runs with your example data. For example, the code contains a global, which is never defined in your code.
                    Best wishes

                    Stata 18.0 MP | ORCID | Google Scholar

                    Comment

                    Working...
                    X