
  • Test for differences in mean binary variable

    Hi all,

    I have a rather general and potentially quite basic question. I want to first test whether a binary variable differs significantly between two samples, and then whether it differs between two points in time for one sample. The two samples are rather large.

    Is a ttest or prtest appropriate in such a case, or is something different required?

    Thanks in advance,
    felix

  • #2
    Felix:
    you might be interested in something along the following lines (with or without interactions; -baselevels- estimation option added to ease output interpretation):
    Code:
    . use http://www.stata-press.com/data/r14/union.dta
    (NLS Women 14-24 in 1968)
    
    . xtset idcode year
           panel variable:  idcode (unbalanced)
            time variable:  year, 70 to 88, but with gaps
                    delta:  1 unit
    
    
    . xtlogit union i.not_smsa##i.year if year<=71, basel
    
    Fitting comparison model:
    
    (omitted)
    
    Random-effects logistic regression              Number of obs     =      3,451
    Group variable: idcode                          Number of groups  =      2,288
    
    Random effects u_i ~ Gaussian                   Obs per group:
                                                                  min =          1
                                                                  avg =        1.5
                                                                  max =          2
    
    Integration method: mvaghermite                 Integration pts.  =         12
    
                                                    Wald chi2(3)      =      16.18
    Log likelihood  = -1563.1617                    Prob > chi2       =     0.0010
    
    -------------------------------------------------------------------------------
            union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
         not_smsa |
               0  |          0  (base)
               1  |   -1.41537    .357029    -3.96   0.000    -2.115134   -.7156057
                  |
             year |
              70  |          0  (base)
              71  |  -.0783275   .1779788    -0.44   0.660    -.4271596    .2705047
                  |
    not_smsa#year |
            1 71  |   .7671836   .3812472     2.01   0.044     .0199528    1.514414
                  |
            _cons |  -4.298953   .1978051   -21.73   0.000    -4.686644   -3.911262
    --------------+----------------------------------------------------------------
         /lnsig2u |   3.277113   .0931235                      3.094594    3.459632
    --------------+----------------------------------------------------------------
          sigma_u |   5.147734   .2396874                      4.698753    5.639615
              rho |   .8895611   .0091486                       .870315    .9062586
    -------------------------------------------------------------------------------
    LR test of rho=0: chibar2(01) = 420.66                 Prob >= chibar2 = 0.000
    
    . xtlogit union i.not_smsa i.year if year<=71, basel
    
    Fitting comparison model:
    
    (omitted)
    
    Random-effects logistic regression              Number of obs     =      3,451
    Group variable: idcode                          Number of groups  =      2,288
    
    Random effects u_i ~ Gaussian                   Obs per group:
                                                                  min =          1
                                                                  avg =        1.5
                                                                  max =          2
    
    Integration method: mvaghermite                 Integration pts.  =         12
    
                                                    Wald chi2(2)      =      12.92
    Log likelihood  = -1565.1904                    Prob > chi2       =     0.0016
    
    ------------------------------------------------------------------------------
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        not_smsa |
              0  |          0  (base)
              1  |  -.9988445    .281969    -3.54   0.000    -1.551494   -.4461954
                 |
            year |
             70  |          0  (base)
             71  |   .0934561   .1558475     0.60   0.549    -.2119993    .3989116
                 |
           _cons |  -4.360673   .1947524   -22.39   0.000    -4.742381   -3.978965
    -------------+----------------------------------------------------------------
        /lnsig2u |   3.262381   .0934132                      3.079295    3.445468
    -------------+----------------------------------------------------------------
         sigma_u |   5.109956   .2386687                      4.662946    5.599817
             rho |   .8881055   .0092829                      .8685784    .9050483
    ------------------------------------------------------------------------------
    LR test of rho=0: chibar2(01) = 419.12                 Prob >= chibar2 = 0.000
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Carlo gave a general solution, capable of tackling a complex scenario.

      I just wish to underline that, should you only want "to test whether a binary variable differs significantly between two samples and then whether the binary differs over time two points in time for one sample", you could perform a chi-square test for the first scenario and McNemar's test for the second. Should you have more than two time points, you will need to perform Cochran's Q test.
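
      As a rough sketch of the two simpler tests, the following reuses the union.dta example from #2, restricted to 1970 and 1971. Treating not_smsa as the "two samples" and 1970 vs. 1971 as the "two points in time" is only an assumption made for illustration; the variable names union70 and union71 are created by -reshape- and are not in the original data. Cochran's Q is not covered here.

      ```stata
      * Illustrative only: same dataset as in #2, two years kept
      use http://www.stata-press.com/data/r14/union.dta, clear
      keep if inlist(year, 70, 71)

      * Scenario 1: binary outcome compared across two independent groups
      * (here: union membership by not_smsa, in 1970 only)
      tabulate union not_smsa if year == 70, chi2
      prtest union if year == 70, by(not_smsa)

      * Scenario 2: same individuals at two time points (paired data)
      * Reshape to one row per woman, then run McNemar's test
      keep idcode year union
      reshape wide union, i(idcode) j(year)
      mcc union71 union70
      ```

      Women observed in only one of the two years have a missing value in union70 or union71 after the reshape, so they drop out of the paired comparison.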
      Best regards,

      Marcos



      • #4
        Marcos Almeida, thanks a lot again.
        Last edited by Felix Stein; 16 Nov 2016, 13:26.

