
  • Multiple tests adjusting p-values

    Suppose I have an X vector (for simplicity containing 3 age categories and a male dummy) and a group dummy ("foreign"). I would like to test the following:
    • whether the mean of each x variable is different for the foreign=0 and foreign=1 people.
    • a joint test of whether the mean of at least one of the x variables is different for the foreign=0 and foreign=1 people.
    In doing so, I would like to adjust the p-values for multiple testing (e.g. with Šidák correction).
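For reference, with m tests in the family and unadjusted p-values p_1, ..., p_m, the Šidák-adjusted p-value is usually written as follows (the symbols are notation introduced here, not taken from the post):

```latex
\tilde{p}_j = 1 - (1 - p_j)^{m}, \qquad j = 1, \dots, m
```

Each adjusted value is then compared with the nominal level \alpha. The formula is derived for independent tests; for comparison, the Bonferroni adjustment uses \min(1, m\,p_j), which is slightly more conservative.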

    Here's a minimal working example that prepares a fake dataset and performs the first type of test mentioned above (but without correcting for multiple hypotheses):
    Code:
    set more off
    sysuse auto, clear
    
    
    // Create fake discrete variables
    
    * Age group
    gen age_group = rep78
    recode age_group 4=1 5=2 .=1
    assert inlist(age_group, 1, 2, 3)
    
    * Male
    gen male = gear_ratio
    recode male 0/2.5=0 2.5/.=1
    assert inlist(male, 0, 1)
    
    
    // T-tests "X = foreign" (not corrected for multiple testing)
    
    ttest male, by(foreign) unequal
    
    tab age_group, gen(age_group_)
    foreach c in age_group_1 age_group_2 age_group_3 {
        ttest `c', by(foreign) unequal
    }
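A minimal sketch of how the Šidák adjustment could be applied by hand to the tests above. It relies on r(p), the two-sided p-value that -ttest- leaves behind, and assumes that all four tests form one family (m = 4). The local macro names are illustrative, and male is built with a logical expression that is assumed to give the same split as the recode above. The adjustment is derived for independent tests, which the mutually exclusive age dummies are not, so the adjusted values are best read as approximate.

```stata
sysuse auto, clear

* Rebuild the toy variables from the example above
gen age_group = rep78
recode age_group 4=1 5=2 .=1
gen male = gear_ratio > 2.5          // assumed equivalent to the recode of gear_ratio
quietly tab age_group, gen(age_group_)

* Family of tests (assumption: all four tests count toward m)
local xvars male age_group_1 age_group_2 age_group_3
local m : word count `xvars'

foreach v of local xvars {
    quietly ttest `v', by(foreign) unequal
    local p = r(p)                    // two-sided p-value stored by -ttest-
    * Sidak-adjusted p-value: 1 - (1 - p)^m
    local p_sidak = 1 - (1 - `p')^`m'
    display as text "`v'" _col(16) "p = " as result %6.4f `p' ///
        as text "   Sidak p = " as result %6.4f `p_sidak'
}
```

Locals are used instead of scalars here because a scalar named p could be confused with an abbreviation of the variable price in expressions.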
    Thanks for any comments on this.
    Last edited by Stefano Lombardi; 13 Jun 2022, 05:47.

  • #2
    Stefano:
    the first comment is that you're seemingly planning to apply -ttest- to a discrete variable. Is there any rationale behind that?
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Originally posted by Carlo Lazzaro
      Stefano:
      the first comment is that you're seemingly planning to apply -ttest- to a discrete variable. Is there any rationale behind that?
      Thanks for the reply. The goal is to compare averages in two groups defined by the foreign indicator variable.
      It can be with -ttest- or other commands/approaches.

      For instance, the following commands return the same p-value (using my toy dataset):
      Code:
      qui reg male foreign
      lincom foreign
      ttest male, by(foreign)
      I am not entirely sure I understand your question. If it has to do with the fact that male is discrete, -ttest- can still be applied; see example 2 under "View complete PDF manual entry" in:
      Code:
      help ttest

      S.
      Last edited by Stefano Lombardi; 13 Jun 2022, 08:55.



      • #4
        Stefano:
        in -ttest- Example #2 -mpg- is continuous, not discrete.
        Elaborating on your first code, I would go:
        Code:
        logit male i.foreign
        as in the following toy-example:
        Code:
        . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
        (1978 automobile data)
        
        . logit foreign i.rep78
        
        note: 1.rep78 != 0 predicts failure perfectly;
              1.rep78 omitted and 2 obs not used.
        
        note: 2.rep78 != 0 predicts failure perfectly;
              2.rep78 omitted and 8 obs not used.
        
        note: 5.rep78 omitted because of collinearity.
        Iteration 0:   log likelihood = -38.411464 
        Iteration 1:   log likelihood = -27.676628 
        Iteration 2:   log likelihood = -27.446054 
        Iteration 3:   log likelihood = -27.444671 
        Iteration 4:   log likelihood = -27.444671 
        
        Logistic regression                                     Number of obs =     59
                                                                LR chi2(2)    =  21.93
                                                                Prob > chi2   = 0.0000
        Log likelihood = -27.444671                             Pseudo R2     = 0.2855
        
        ------------------------------------------------------------------------------
             foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
               rep78 |
                  1  |          0  (empty)
                  2  |          0  (empty)
                  3  |  -3.701302   .9906975    -3.74   0.000    -5.643033   -1.759571
                  4  |  -1.504077   .9128709    -1.65   0.099    -3.293271    .2851168
                  5  |          0  (omitted)
                     |
               _cons |   1.504077    .781736     1.92   0.054    -.0280969    3.036252
        ------------------------------------------------------------------------------
        
        . mat list e(b)
        
        e(b)[1,6]
               foreign:    foreign:    foreign:    foreign:    foreign:    foreign:
                    1b.         2o.          3.          4.         5o.           
                 rep78       rep78       rep78       rep78       rep78       _cons
        y1           0           0  -3.7013019  -1.5040774           0   1.5040774
        
        . test 3.rep78=4.rep78=_cons
        
         ( 1)  [foreign]3.rep78 - [foreign]4.rep78 = 0
         ( 2)  [foreign]3.rep78 - [foreign]_cons = 0
        
                   chi2(  2) =   13.83
                 Prob > chi2 =    0.0010
        
        .
        Things may be different if you're planning to run a linear probability model.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Originally posted by Carlo Lazzaro
          Stefano:
          in -ttest- Example #2 -mpg- is continuous, not discrete.
          Elaborating on your first code, I would go:

          [...]
          You're right about the example, apologies for misreading.
          Thank you for the feedback.

          Another option, partly based on past Statalist examples, would be to estimate multiple -reg- (or, I presume, -probit-) models, and then use -suest- followed by -test-, which can be used to test parameter restrictions across models.
          That is:

          Code:
          reg male foreign    
          est store m1
          reg age_group_1 foreign
          est store m2
          reg age_group_2 foreign
          est store m3
          reg age_group_3 foreign
          est store m4
          
          suest m1 m2 m3 m4, noomitted
          
          test [m1_mean]foreign [m2_mean]foreign [m3_mean]foreign, mtest(sidak)
          I am not sure, however, what assumptions all this relies on.
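A self-contained sketch of the -suest- approach above, showing where the results end up. With the mtest() option, -test- reports each hypothesis separately with an adjusted p-value, followed by the joint test; the per-hypothesis results are stored in r(mtest) and the joint p-value in r(p). The loop and the macro names are illustrative, and male is built with a logical expression assumed to match the original recode.

```stata
sysuse auto, clear

* Rebuild the toy variables from the first post
gen age_group = rep78
recode age_group 4=1 5=2 .=1
gen male = gear_ratio > 2.5          // assumed equivalent to the recode of gear_ratio
quietly tab age_group, gen(age_group_)

* One regression per x variable, stored as m1-m4 for -suest-
local i = 0
foreach v in male age_group_1 age_group_2 age_group_3 {
    local ++i
    quietly regress `v' foreign
    estimates store m`i'
}

quietly suest m1 m2 m3 m4

* m4 (age_group_3) is left out of the test: the three age dummies sum to
* one, so their coefficients on foreign sum to zero and a fourth
* restriction would be redundant
test [m1_mean]foreign [m2_mean]foreign [m3_mean]foreign, mtest(sidak)
local p_joint = r(p)                 // p-value of the joint test

matrix list r(mtest)                 // per-hypothesis results, including adjusted p-values
display "Joint test p-value: " %6.4f `p_joint'
```

-suest- combines the models using a robust (sandwich) variance estimator, so the individual p-values need not match those from the separate -reg- or -ttest- runs exactly.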
          Last edited by Stefano Lombardi; 13 Jun 2022, 09:56.



          • #6
            Stefano:
            I would go:
            Code:
             
            mlogit age_group i.foreign i.male
            and then follow the syntax of -mlogit postestimation-, Example #5.
            Kind regards,
            Carlo
            (Stata 19.0)

