  • Comparing proportions by two different variables

    Hello there,

    I want to compare two proportions of a categorical variable across two different explanatory variables. My categorical variable is edu (1 = literate; 0 = no edu), and my two explanatory variables are sex and region. How can I compute a test statistic to see whether the share of literate people among males differs significantly from the share among those who live in two of the regions (1 and 3), as shown in the example below?
    Code:
    clear
    input sex region edu weight
    1 2 0 230
    1 3 1 200
    0 3 0 240
    1 1 1 250
    1 2 1 300
    0 3 0 280
    0 1 0 180
    1 1 1 340
    1 2 0 270
    0 1 1 210
    end
    lab def  sex   0 "female"  1 "male"
    lab val sex sex
    lab def region 1 "sari" 2 "mexi" 3 "bata"
    lab val region region
    lab def edu 0 "no edu"  1 "literate"
    lab val edu edu
    sum edu if (region==1 | region==3)
    sum edu if sex ==1
    
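    . sum edu if (region==1 | region==3)

        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             edu |          7    .5714286    .5345225          0          1

    . sum edu if sex ==1

        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             edu |          6    .6666667    .5163978          0          1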
    So, what kind of test should I use to test the difference between 0.5714286 and 0.6666667? As the example above shows, I also have a sampling weight variable. How can I apply this weight in the test?

  • #2
    It's a strange hypothesis to test, but you could use logistic regression and set up a linear contrast using lincom.

    But first, check your data entry and make the necessary corrections.
    Code:
    sort region sex edu
    list edu sex region weight, noobs sepby(sex region)
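
    The logistic-regression-plus-lincom suggestion could be sketched as follows. This is only one reading of the hypothesis: reg13 is a hypothetical indicator for living in region 1 or 3, weight is assumed to be a sampling weight (hence pweight), and the contrast compares the male and reg13 coefficients on the log-odds scale rather than the two raw proportions. On a data set as small as the example, the fit may fail to converge.

    Code:
    * Hypothetical indicator: 1 if region is sari (1) or bata (3)
    generate byte reg13 = inlist(region, 1, 3)
    * Weighted logit; pweight assumes weight is a sampling weight
    logit edu i.sex i.reg13 [pweight = weight]
    * Linear contrast: male coefficient minus region-1/3 coefficient
    lincom 1.sex - 1.reg13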


    • #3
      Logit does not converge on this example data, but probit does:

      Code:
      . probit edu i.sex i.region [fw=weight], nolog
      
      Probit regression                               Number of obs     =      2,500
                                                      LR chi2(3)        =    1864.89
                                                      Prob > chi2       =     0.0000
      Log likelihood = -798.42301                     Pseudo R2         =     0.5387
      
      ------------------------------------------------------------------------------
               edu |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               sex |
             male  |   11.68689   263.5739     0.04   0.965    -504.9085    528.2822
                   |
            region |
             mexi  |  -12.10214   263.5739    -0.05   0.963    -528.6975    504.4932
             bata  |  -6.012014   179.8201    -0.03   0.973    -358.4529    346.4289
                   |
             _cons |   .0965473   .0635716     1.52   0.129    -.0280508    .2211454
      ------------------------------------------------------------------------------
      Note: 0 failures and 590 successes completely determined.
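
      The note about 590 completely determined successes appears to correspond to the males in sari: the two such rows carry weights 250 + 340 = 590, and both are literate. A quick way to see which sex-by-region cells have no variation in edu, using the same frequency weights as the probit call above, might be:

      Code:
      * Weighted tabulation of edu within each sex-by-region cell
      bysort sex region: tabulate edu [fweight = weight]

      Cells in which edu takes only one value are the usual source of very large coefficients and standard errors like those in the output above.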


      • #4
        Thank you Joro Kolev and Joseph Coveney for your useful suggestions!
