  • Correlation by gender in a Panel Data

    Hi, I have an artificial panel dataset in which the panel identifier is the household ID and the "time" variable is the individual's gender, so each household contributes one couple: one male and one female. I have several explanatory variables, such as years of education, age, nationality, religiosity and so on. I want to create a matrix with the correlation between males and females for each of my explanatory variables. I tried to use:

    correlate varlist, by (gender)

    But this option is not allowed. Any suggestion?

    Thank you,
    Daniel
    Last edited by Daniel Lerner; 18 Jun 2023, 08:57. Reason: Panel Data, Correlation Matrix, Family Economics

  • #2
    Daniel:
    welcome to this forum.
    You may want to test something like:
    Code:
    bysort gender: correlate <varlist>
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hi, when I use bysort it just repeats the command separately for males and females. What I want to test is the correlation between the two groups for each explanatory variable.
      Let's say:

      correlation between male's years of education and female's years of education


      • #4
        Daniel:
        correct, as this is what you asked for.
        That said, if you want to -correlate- your variables as you reported in your #3:
        Code:
        . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
        (1978 automobile data)
        
        . g price_1=price if foreign==0
        (22 missing values generated)
        
        . g price_2=price if foreign==1
        (52 missing values generated)
        
        . corr price_1 price_2
        no observations
        r(2000);
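        This fails because price_1 and price_2 are never both non-missing in the same observation, so -correlate- has no complete cases to work with. Conversely, if you're interested in the correlation of regression coefficients, things are easier: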
        Code:
        . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
        . regress price i.foreign##c.trunk
        
              Source |       SS           df       MS      Number of obs   =        74
        -------------+----------------------------------   F(3, 70)        =      3.45
               Model |  81822937.2         3  27274312.4   Prob > F        =    0.0211
            Residual |   553242459        70   7903463.7   R-squared       =    0.1288
        -------------+----------------------------------   Adj R-squared   =    0.0915
               Total |   635065396        73  8699525.97   Root MSE        =    2811.3
        
        ---------------------------------------------------------------------------------
                  price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
        ----------------+----------------------------------------------------------------
                foreign |
               Foreign  |   1114.855    2657.69     0.42   0.676    -4185.741    6415.451
                  trunk |   261.6024   91.41565     2.86   0.006     79.27966    443.9252
                        |
        foreign#c.trunk |
               Foreign  |   6.257639   211.4828     0.03   0.976    -415.5316    428.0468
                        |
                  _cons |   2213.787    1403.61     1.58   0.119    -585.6244    5013.199
        ---------------------------------------------------------------------------------
        
        . estat vce, corr
        
        Correlation matrix of coefficients of regress model
        
                     |        1.           1.fore~n#          
                e(V) |  foreign     trunk   c.trunk     _cons 
        -------------+----------------------------------------
           1.foreign |   1.0000                               
               trunk |   0.5074    1.0000                     
           1.foreign#|                                        
             c.trunk |  -0.9575   -0.4323    1.0000           
               _cons |  -0.5281   -0.9607    0.4153    1.0000 
        
        .
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Let's suppose gender is coded 0 for male and 1 for female. (By the way, a really good convention, which should be a rule for researchers, is to name an indicator after the condition coded 1: with 0 for male and 1 for female, the variable name female would be more transparent.)

          Then what you want is a generalisation of


          Code:
          bysort id (gender) : gen whatever_female = whatever[2] if _n == 1 & _N == 2 
          
          corr whatever whatever_female
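
          One way the generalisation to several variables might look is a loop over the explanatory variables. This is only a sketch on made-up data: the names hhid, female, educ and age are hypothetical stand-ins for the household identifier, the gender indicator (0 = male, 1 = female) and two explanatory variables.

          ```stata
          * Toy data, invented purely for illustration
          clear
          input hhid female educ age
          1 0 12 40
          1 1 14 38
          2 0 16 51
          2 1 16 47
          3 0 10 33
          3 1  9 35
          4 0 18 60
          4 1 15 58
          end

          * For each variable, copy the female's value onto the male's row
          * (first row within household after sorting on female), then correlate.
          foreach v of varlist educ age {
              bysort hhid (female) : gen `v'_female = `v'[2] if _n == 1 & _N == 2
              correlate `v' `v'_female
          }
          ```

          Because the new variable is non-missing only on the male's row of a two-person household, each correlate call uses one observation per couple.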


          • #6
            Second thoughts are that this might, unusually, be a situation where a reshape wide is a really good idea. Given a data example that is real or realistic I would be happy to show token code.
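
            In the absence of a data example, a rough sketch of the reshape wide route on made-up data might look like this. The names hhid, female, educ and age are hypothetical, with female coded 0 for male and 1 for female, so after the reshape the suffix 0 holds the male's value and the suffix 1 the female's.

            ```stata
            * Toy data, invented purely for illustration
            clear
            input hhid female educ age
            1 0 12 40
            1 1 14 38
            2 0 16 51
            2 1 16 47
            3 0 10 33
            3 1  9 35
            4 0 18 60
            4 1 15 58
            end

            * One row per household: educ0 age0 (male), educ1 age1 (female)
            reshape wide educ age, i(hhid) j(female)

            * Male-female correlation for each variable, read from r(rho)
            foreach v in educ age {
                quietly correlate `v'0 `v'1
                display "`v': rho = " %6.4f r(rho)
            }
            ```

            With the data in wide layout, each couple is a single observation, so the male and female values sit side by side and correlate can be applied directly.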
