  • How to count the number (and percentage) of observations of var1 by var2?

    Hi all,

    Description of my data set:
    Var 1 is education, with values indicating the level of education.
    Var 2 is gender, with values indicating gender.

    What I want to do:
    Compare education levels between males and females. I want to count the number of people at each education level, separately for males and females (ideally also showing the percentage at each education level within each gender group).

    Example data are below for your reference. I would really appreciate your help!
    Thanks.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 name str15 education str6 gender
    "A" "primary school"  "female"
    "B" "primary school"  "male"  
    "C" "master degree"   "male"  
    "D" "bachelor degree" "female"
    "E" "bachelor degree" "female"
    "F" "master degree"   "male"  
    end

  • #2
    Icey:
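    welcome to this forum.
    You can address the issue in two different ways. The first is a simple tabulation by gender (gender_num and education_num are labelled numeric versions of your string variables; the code that creates them is in the second block below):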
    Code:
    . bysort gender_num: tab education_num
    
    -------------------------------------------------------------------------------------------------------------------
    -> gender_num = female
    
      education_num |      Freq.     Percent        Cum.
    ----------------+-----------------------------------
     primary school |          1       33.33       33.33
    bachelor degree |          2       66.67      100.00
    ----------------+-----------------------------------
              Total |          3      100.00
    
    -------------------------------------------------------------------------------------------------------------------
    -> gender_num = male
    
      education_num |      Freq.     Percent        Cum.
    ----------------+-----------------------------------
     primary school |          1       33.33       33.33
      master degree |          2       66.67      100.00
    ----------------+-----------------------------------
              Total |          3      100.00
    
    
    .
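    A two-way table is another option. The line below is a minimal sketch that assumes the original string variables education and gender from post #1 are in memory; the -column- option of -tabulate- adds, under each frequency, the percentage of that education level within each gender.
    Code:
    * two-way table of education by gender, with column (within-gender) percentages
    tabulate education gender, column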
    If you want to spice things up with a bit of inferential flavour, you can consider -ologit-:
    Code:
    . g id=_n
    
    . g education_num=0 if education=="primary school"
    (4 missing values generated)
    
    . replace education_num=1 if education=="bachelor degree"
    (2 real changes made)
    
    . replace education_num=2 if education=="master degree"
    (2 real changes made)
    
    . label define education_num 0 "primary school" 1 "bachelor degree" 2 "master degree"
    
    . label val education_num education_num
    
    . gen gender_num=0 if gender=="female"
    (3 missing values generated)
    
    . replace gender_num=1 if gender=="male"
    (3 real changes made)
    
    . label define gender_num 0 "female" 1 "male"
    
    . label val gender_num gender_num
    
    . ologit education_num i.gender_num
    
    Iteration 0:   log likelihood = -6.5916737  
    Iteration 1:   log likelihood =  -6.037985  
    Iteration 2:   log likelihood = -6.0328898  
    Iteration 3:   log likelihood = -6.0328733  
    Iteration 4:   log likelihood = -6.0328733  
    
    Ordered logistic regression                     Number of obs     =          6
                                                    LR chi2(1)        =       1.12
                                                    Prob > chi2       =     0.2904
    Log likelihood = -6.0328733                     Pseudo R2         =     0.0848
    
    -------------------------------------------------------------------------------
    education_num |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
       gender_num |
            male  |   1.723757   1.701913     1.01   0.311    -1.611931    5.059446
    --------------+----------------------------------------------------------------
            /cut1 |   -.164153   1.021501                     -2.166259    1.837953
            /cut2 |    1.45926   1.223355                      -.938472    3.856993
    -------------------------------------------------------------------------------
    
    
    .
    That said, the regression model above omits a relevant predictor: individual ability to achieve higher educational goals, regardless of the student's gender.
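    As a possible shortcut for the manual recoding steps above, -encode- can build labelled numeric variables directly from the strings. This is only a sketch: the names education_num2, gender_num2 and edu_lbl are made up for illustration, and the codes 1/2/3 are an assumed ordering. Defining the value label first matters, because -encode- otherwise numbers the categories alphabetically ("bachelor degree" before "primary school"), which is not the ordering -ologit- should see.
    Code:
    * define the intended ordering of education levels (assumed codes 1/2/3)
    label define edu_lbl 1 "primary school" 2 "bachelor degree" 3 "master degree"
    * reuse that label when encoding; -noextend- stops if an unlisted value appears
    encode education, generate(education_num2) label(edu_lbl) noextend
    * gender is nominal, so the default alphabetical coding (female=1, male=2) is fine
    encode gender, generate(gender_num2)
    * same model as above, fitted on the encoded variables
    ologit education_num2 i.gender_num2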
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Carlo, I really appreciate your help! That works.
