Test for differences in mean binary variable

Felix Stein

Join Date: May 2015

Posts: 25
#1

15 Nov 2016, 19:14

Hi all,

I have a rather general and potentially quite banal question. I want to test first whether a binary variable differs significantly between two samples, and then whether it differs between two points in time within one sample. The two samples are rather large.

In such a case, is -ttest- or -prtest- appropriate, or is something different required?

Thanks in advance,
Felix

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

16 Nov 2016, 00:11

Felix:
you might be interested in something along the following lines (with or without interactions; -baselevels- estimation option added to ease output interpretation):

Code:

. use http://www.stata-press.com/data/r14/union.dta
(NLS Women 14-24 in 1968)

. xtset idcode year
       panel variable:  idcode (unbalanced)
        time variable:  year, 70 to 88, but with gaps
                delta:  1 unit


. xtlogit union i.not_smsa##i.year if year<=71, basel

Fitting comparison model:

(omitted)

Random-effects logistic regression              Number of obs     =      3,451
Group variable: idcode                          Number of groups  =      2,288

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =        1.5
                                                              max =          2

Integration method: mvaghermite                 Integration pts.  =         12

                                                Wald chi2(3)      =      16.18
Log likelihood  = -1563.1617                    Prob > chi2       =     0.0010

-------------------------------------------------------------------------------
        union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
     not_smsa |
           0  |          0  (base)
           1  |   -1.41537    .357029    -3.96   0.000    -2.115134   -.7156057
              |
         year |
          70  |          0  (base)
          71  |  -.0783275   .1779788    -0.44   0.660    -.4271596    .2705047
              |
not_smsa#year |
        1 71  |   .7671836   .3812472     2.01   0.044     .0199528    1.514414
              |
        _cons |  -4.298953   .1978051   -21.73   0.000    -4.686644   -3.911262
--------------+----------------------------------------------------------------
     /lnsig2u |   3.277113   .0931235                      3.094594    3.459632
--------------+----------------------------------------------------------------
      sigma_u |   5.147734   .2396874                      4.698753    5.639615
          rho |   .8895611   .0091486                       .870315    .9062586
-------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 420.66                 Prob >= chibar2 = 0.000

. xtlogit union i.not_smsa i.year if year<=71, basel

Fitting comparison model:

(omitted)

Random-effects logistic regression              Number of obs     =      3,451
Group variable: idcode                          Number of groups  =      2,288

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =        1.5
                                                              max =          2

Integration method: mvaghermite                 Integration pts.  =         12

                                                Wald chi2(2)      =      12.92
Log likelihood  = -1565.1904                    Prob > chi2       =     0.0016

------------------------------------------------------------------------------
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    not_smsa |
          0  |          0  (base)
          1  |  -.9988445    .281969    -3.54   0.000    -1.551494   -.4461954
             |
        year |
         70  |          0  (base)
         71  |   .0934561   .1558475     0.60   0.549    -.2119993    .3989116
             |
       _cons |  -4.360673   .1947524   -22.39   0.000    -4.742381   -3.978965
-------------+----------------------------------------------------------------
    /lnsig2u |   3.262381   .0934132                      3.079295    3.445468
-------------+----------------------------------------------------------------
     sigma_u |   5.109956   .2386687                      4.662946    5.599817
         rho |   .8881055   .0092829                      .8685784    .9050483
------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 419.12                 Prob >= chibar2 = 0.000
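
A minimal sketch of one way to test the interaction formally, that is, to compare the two specifications shown above. It assumes both models are fitted on the same estimation sample; the names full and reduced are arbitrary labels introduced here for illustration only.

```stata
use http://www.stata-press.com/data/r14/union.dta, clear
xtset idcode year

* Model with the not_smsa-by-year interaction
quietly xtlogit union i.not_smsa##i.year if year<=71
estimates store full

* Wald test of the interaction term
testparm i.not_smsa#i.year

* Model without the interaction
quietly xtlogit union i.not_smsa i.year if year<=71
estimates store reduced

* Likelihood-ratio test: does the interaction improve the fit?
lrtest full reduced
```

Going by the log likelihoods reported above, the likelihood-ratio statistic should be about 2 x (1565.1904 - 1563.1617) = 4.06 on 1 degree of freedom, which is in line with the Wald z test on the interaction coefficient.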

Kind regards,
Carlo
(Stata 19.0)

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#3

16 Nov 2016, 07:10

Carlo gave an overarching solution, capable of tackling a complex scenario.

I just wish to underline that, should you only want "to test whether a binary variable differs significantly between two samples and then whether the binary differs over time two points in time for one sample", you could perform a chi-square test for the first scenario (two independent samples) and McNemar's test for the second one (paired observations on the same sample). Should you have more than two time points, you will need Cochran's Q test.
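
As a rough sketch of the two simpler tests, here is how they might look on the same union.dta used earlier in the thread. Treating not_smsa as the variable defining the "two samples", and 70 and 71 as the two time points, are assumptions made purely for illustration; substitute your own grouping variable and periods.

```stata
use http://www.stata-press.com/data/r14/union.dta, clear

* Scenario 1: two independent groups at a single time point
* Pearson chi-square test on the 2x2 table
tabulate union not_smsa if year==70, chi2
* Two-sample test of proportions (large-sample z test)
prtest union if year==70, by(not_smsa)

* Scenario 2: the same individuals at two time points (paired data)
keep if inlist(year, 70, 71)
keep idcode year union
* One row per person, with union70 and union71 side by side
reshape wide union, i(idcode) j(year)
* McNemar's test on the paired binary outcomes
mcc union70 union71
```

Note that -mcc- uses only individuals observed at both time points, so in an unbalanced panel the paired test runs on fewer people than the cross-sectional comparison.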

Best regards,

Marcos
Felix Stein

Join Date: May 2015

Posts: 25
#4

16 Nov 2016, 13:18

Marcos Almeida, thanks a lot again.

Last edited by Felix Stein; 16 Nov 2016, 13:26.