Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Using fixed effect model for panel data or pooled OLS

    Hi everyone,

    I have been trying to run regression analysis on the effect of ethnicity and gender on future employment, without much success. None of the p-value is significant (please see below). I have considered treating it as panel data as I have observations from 3 years and conduct a fixed effect model, to single out the unobserved heterogeneities however, I am confused as to which one I should set as panel variable in the "xtset" command, I have previously set salary as the panel var, however, it is the independent variable that cannot be the panel var.

    Futhermore, many of my controlled variables are string, do I need to generate new variables and replace them with numeric variables one by one before conducting "xtreg" ?


    Please also see below an extract of my data, open to any suggestion on how I can improve the current regression analysis. Thanks!

    Code:
    reg salary year genderx mature
    
          Source |       SS           df       MS      Number of obs   =     3,426
    -------------+----------------------------------   F(3, 3422)      =      0.51
           Model |  1.4826e+13         3  4.9419e+12   Prob > F        =    0.6750
        Residual |  3.3125e+16     3,422  9.6801e+12   R-squared       =    0.0004
    -------------+----------------------------------   Adj R-squared   =   -0.0004
           Total |  3.3140e+16     3,425  9.6759e+12   Root MSE        =    3.1e+06
    
    ------------------------------------------------------------------------------
          salary | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            year |  -6990.206   65826.15    -0.11   0.915    -136052.7    122072.3
         genderx |  -128063.3   106348.3    -1.20   0.229    -336575.9    80449.25
          mature |  -70051.69   238444.9    -0.29   0.769    -537560.4      397457
           _cons |   314171.5   210461.1     1.49   0.136    -98470.57    726813.5
    ------------------------------------------------------------------------------

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long salary str36 ethnicity str6 domicile float year
       0 "Chinese"                            "UK"     1
       0 "Chinese"                            "UK"     1
       0 "White"                              "UK"     1
       0 "White"                              "UK"     1
       0 "White"                              "UK"     1
       0 "Mixed - White and Asian"            "UK"     1
       0 "White"                              "UK"     1
       0 "Black or Black British - African"   "UK"     1
       0 "White"                              "UK"     1
       0 "White"                              "UK"     1
       0 "White"                              "UK"     1
       0 "White"                              "EU"     1
       0 "White"                              "UK"     1
       0 "White"                              "UK"     1
       0 "White"                              "UK"     2
       0 "White"                              "UK"     2
       0 "White"                              "UK"     2
       0 "White"                              "UK"     2
       0 "White"                              "UK"     2
       0 "Chinese"                            "non-EU" 2
       0 "White"                              "UK"     2
       0 "White"                              "UK"     2
       0 "Black or Black British - Caribbean" "UK"     2
       0 "White"                              "UK"     3
       0 "Black or Black British - African"   "UK"     3
       0 "White"                              "UK"     3
       0 "White"                              "UK"     3
       0 "Other Asian background"             "UK"     3
       0 "Other Asian background"             "UK"     3
       0 "White"                              "UK"     3
       0 "Information refused"                "UK"     3
       1 "White"                              "UK"     3
       1 "Mixed - White and Asian"            "UK"     3
       1 "Mixed - White and Black Caribbean"  "non-EU" 3
      10 "White"                              "UK"     1
      14 "White"                              "UK"     1
      17 "White"                              "UK"     1
      23 "White"                              "UK"     1
      26 "White"                              "UK"     2
      28 "White"                              "UK"     1
      30 "Chinese"                            "non-EU" 1
      40 "White"                              "EU"     1
      40 "White"                              "UK"     1
      40 "White"                              "UK"     1
      40 "White"                              "UK"     2
      45 "White"                              "UK"     1
      48 "Asian or Asian British - Indian"    "UK"     1
      50 "White"                              "UK"     1
      50 "White"                              "UK"     1
      50 "White"                              "UK"     1
      91 "White"                              "UK"     1
     100 "White"                              "UK"     3
     120 "Asian or Asian British - Indian"    "non-EU" 1
     200 "White"                              "UK"     3
     300 "Chinese"                            "non-EU" 1
     520 "White"                              "UK"     2
     600 "White"                              "UK"     3
     684 "White"                              "UK"     3
     800 "Asian or Asian British - Pakistani" "UK"     2
     850 "White"                              "UK"     3
     900 "White"                              "UK"     2
    1000 "White"                              "UK"     2
    1000 "White"                              "UK"     3
    1000 "Mixed - White and Asian"            "UK"     3
    1200 "Mixed - White and Black African"    "UK"     2
    1200 "White"                              "UK"     2
    1200 "White"                              "UK"     2
    1200 "White"                              "UK"     3
    1350 "White"                              "non-EU" 2
    1500 "White"                              "UK"     1
    1500 "White"                              "UK"     1
    1500 "Arab"                               "UK"     1
    1500 "White"                              "UK"     2
    1500 "White"                              "UK"     2
    1680 "White"                              "EU"     1
    1800 "White"                              "UK"     3
    2000 "Mixed - White and Black African"    "UK"     1
    2000 "White"                              "UK"     1
    2000 "White"                              "UK"     1
    2000 "White"                              "UK"     2
    2000 "White"                              "UK"     2
    2000 "White"                              "UK"     2
    2500 "White"                              "UK"     2
    2600 "White"                              "UK"     3
    3000 "Mixed - White and Asian"            "UK"     1
    3000 "White"                              "UK"     1
    3000 "White"                              "UK"     1
    3000 "White"                              "EU"     1
    3000 "Mixed - White and Black Caribbean"  "UK"     1
    3000 "White"                              "UK"     1
    3000 "White"                              "UK"     2
    3000 "Asian or Asian British - Indian"    "non-EU" 2
    3000 "White"                              "UK"     2
    3000 "White"                              "UK"     2
    3000 "White"                              "non-EU" 3
    3252 "White"                              "UK"     3
    3300 "White"                              "UK"     3
    3423 "White"                              "UK"     2
    3500 "White"                              "EU"     2
    3500 "White"                              "UK"     2
    end

  • #2
    Lynn:
    if you want to go panel, your dataset need a bit of surgery first:
    Code:
    . encode domicile, g(panelid)
    
    . encode ethnicity, g(new_ethnicity)
    
    . xtset panelid year
    repeated time values within panel
    r(451);
    
    . xtset panelid
    
    Panel variable: panelid (unbalanced)
    
    . xtreg salary i.new_ethnicity i.year, fe
    
    Fixed-effects (within) regression               Number of obs     =        100
    Group variable: panelid                         Number of groups  =          3
    
    R-squared:                                      Obs per group:
         Within  = 0.1417                                         min =          5
         Between = 0.0286                                         avg =       33.3
         Overall = 0.1316                                         max =         87
    
                                                    F(13,84)          =       1.07
    corr(u_i, Xb) = -0.0748                         Prob > F          =     0.3984
    
    -----------------------------------------------------------------------------------------------------
                                 salary | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    ------------------------------------+----------------------------------------------------------------
                          new_ethnicity |
       Asian or Asian British - Indian  |  -845.0559   1413.716    -0.60   0.552    -3656.384    1966.273
    Asian or Asian British - Pakistani  |  -1379.231   1692.302    -0.82   0.417     -4744.56    1986.098
      Black or Black British - African  |  -1620.522   1453.028    -1.12   0.268    -4510.027    1268.982
    Black or Black British - Caribbean  |  -2179.231   1692.302    -1.29   0.201     -5544.56    1186.098
                               Chinese  |  -1727.027   1335.931    -1.29   0.200    -4383.673    929.6181
                   Information refused  |  -1741.045    1699.76    -1.02   0.309    -5121.204    1639.115
               Mixed - White and Asian  |  -620.2723   1328.172    -0.47   0.642    -3261.488    2020.943
       Mixed - White and Black African  |  -239.6154    1450.85    -0.17   0.869    -3124.789    2645.558
     Mixed - White and Black Caribbean  |  -251.0066   1475.821    -0.17   0.865    -3185.839    2683.826
                Other Asian background  |  -1741.045   1481.448    -1.18   0.243    -4687.066    1204.977
                                 White  |  -846.1647   1196.777    -0.71   0.482    -3226.088    1533.759
                                        |
                                   year |
                                     2  |   679.2307   293.0891     2.32   0.023     96.39093     1262.07
                                     3  |   241.0446   333.4646     0.72   0.472    -422.0863    904.1754
                                        |
                                  _cons |   1563.673   1179.785     1.33   0.189    -782.4592    3909.806
    ------------------------------------+----------------------------------------------------------------
                                sigma_u |  437.67687
                                sigma_e |  1178.5555
                                    rho |  .12119866   (fraction of variance due to u_i)
    -----------------------------------------------------------------------------------------------------
    F test that all u_i=0: F(2, 84) = 1.29                       Prob > F = 0.2801
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Thanks Carlo, that helped a lot!

      Since the p-values are still largely insignificant, I think I could try 2 things to minimize the bias in the data:

      1. I would need to adjust salary based on working location (eg. people working in London naturally have higher wages), currency, as well as accounting for inflation over the 3 years that the data covers, including salary in all currencies.
      2. Only include salaries in UK pounds and exclusing salaries that are below the national minimum wage, which is what the "TRUE" in "salaryinclusionflag".

      To keep the useful data, I believe I cannot simply "drop", but must duplicate salary first? I would really use your help, thank you!


      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str26 employercity str28 salarycurrency long salary str5 salaryinclusionflag
      "St Albans"        "United Kingdom Pounds"    1 "FALSE"
      "London"           "United Kingdom Pounds"    1 "FALSE"
      "London"           "United Kingdom Pounds"    1 "FALSE"
      ""                 "United Kingdom Pounds"   10 "FALSE"
      ""                 "United Kingdom Pounds"   14 "FALSE"
      "Woking"           "United Kingdom Pounds"   17 "FALSE"
      "London"           "United Kingdom Pounds"   23 "FALSE"
      "Bristol"          "United Kingdom Pounds"   26 "FALSE"
      "Pershore"         "Other Not known"         28 "FALSE"
      "Cambridge"        "Other Not known"         30 "FALSE"
      ""                 "Europe Euros"            40 "FALSE"
      "London"           "United Kingdom Pounds"   40 "FALSE"
      ""                 "United Kingdom Pounds"   40 "FALSE"
      ""                 "United Kingdom Pounds"   40 "FALSE"
      "Manchester"       "United Kingdom Pounds"   45 "FALSE"
      ""                 "Other Not known"         48 "FALSE"
      "Winchester"       "United Kingdom Pounds"   50 "FALSE"
      "Stockport"        "United Kingdom Pounds"   50 "FALSE"
      "Oxford"           "United Kingdom Pounds"   50 "FALSE"
      "London"           "United Kingdom Pounds"   91 "FALSE"
      ""                 "United Kingdom Pounds"  100 "FALSE"
      "London"           "United Kingdom Pounds"  120 "FALSE"
      ""                 "United Kingdom Pounds"  200 "FALSE"
      ""                 "China Yuan Renminbi"    300 "FALSE"
      "Edinburgh"        "United Kingdom Pounds"  520 "FALSE"
      "Oxford"           "United Kingdom Pounds"  600 "FALSE"
      "Manchester"       "United Kingdom Pounds"  684 "FALSE"
      "Tottenham"        "United Kingdom Pounds"  800 "FALSE"
      "Henley on Thames" "United Kingdom Pounds"  850 "FALSE"
      ""                 "United Kingdom Pounds"  900 "FALSE"
      ""                 "United Kingdom Pounds" 1000 "FALSE"
      ""                 "United Kingdom Pounds" 1000 "FALSE"
      ""                 "United Kingdom Pounds" 1000 "FALSE"
      ""                 "United Kingdom Pounds" 1200 "FALSE"
      "Cambridge"        "United Kingdom Pounds" 1200 "FALSE"
      ""                 "Europe Euros"          1200 "FALSE"
      "Cambridge"        "United Kingdom Pounds" 1200 "FALSE"
      ""                 "Europe Euros"          1350 "FALSE"
      ""                 "USA Dollars"           1500 "FALSE"
      "Tring"            "United Kingdom Pounds" 1500 "FALSE"
      "London"           "United Kingdom Pounds" 1500 "FALSE"
      "London"           "United Kingdom Pounds" 1500 "FALSE"
      "London"           "United Kingdom Pounds" 1500 "FALSE"
      ""                 "Hungary Forint"        1680 "FALSE"
      ""                 "United Kingdom Pounds" 1800 "FALSE"
      "Broadstairs"      "United Kingdom Pounds" 2000 "FALSE"
      ""                 "United Kingdom Pounds" 2000 "FALSE"
      "London"           "United Kingdom Pounds" 2000 "FALSE"
      "Cambridge"        "United Kingdom Pounds" 2000 "FALSE"
      "Bradford"         "United Kingdom Pounds" 2000 "FALSE"
      ""                 "United Kingdom Pounds" 2000 "FALSE"
      "London"           "United Kingdom Pounds" 2500 "FALSE"
      "London"           "United Kingdom Pounds" 2600 "FALSE"
      "London"           "United Kingdom Pounds" 3000 "FALSE"
      "Bath"             "United Kingdom Pounds" 3000 "FALSE"
      "Bournemouth"      "United Kingdom Pounds" 3000 "FALSE"
      ""                 "Europe Euros"          3000 "FALSE"
      "Lancaster"        "United Kingdom Pounds" 3000 "FALSE"
      ""                 "United Kingdom Pounds" 3000 "FALSE"
      "London"           "United Kingdom Pounds" 3000 "FALSE"
      ""                 "Singapore Dollars"     3000 "FALSE"
      ""                 "United Kingdom Pounds" 3000 "FALSE"
      "London"           "United Kingdom Pounds" 3000 "FALSE"
      "Cambridge"        "United Kingdom Pounds" 3000 "FALSE"
      "Bristol"          "United Kingdom Pounds" 3252 "FALSE"
      "Birmingham"       "United Kingdom Pounds" 3300 "FALSE"
      "Colchester"       "United Kingdom Pounds" 3423 "FALSE"
      ""                 "Europe Euros"          3500 "FALSE"
      ""                 "United Kingdom Pounds" 3500 "FALSE"
      ""                 "Switzerland Francs"    3600 "FALSE"
      "Belfast"          "United Kingdom Pounds" 3624 "FALSE"
      ""                 "Switzerland Francs"    3800 "FALSE"
      ""                 "Europe Euros"          3840 "FALSE"
      "London"           "United Kingdom Pounds" 4000 "FALSE"
      ""                 "United Kingdom Pounds" 4000 "FALSE"
      ""                 "Korea (South) Won"     4000 "FALSE"
      "London"           "United Kingdom Pounds" 4000 "FALSE"
      ""                 "Singapore Dollars"     4000 "FALSE"
      ""                 "United Kingdom Pounds" 4000 "FALSE"
      "Guildford"        "United Kingdom Pounds" 4000 "FALSE"
      ""                 "United Kingdom Pounds" 4000 "FALSE"
      "London"           "United Kingdom Pounds" 4000 "FALSE"
      "Cambridge"        "United Kingdom Pounds" 4320 "FALSE"
      "Leeds"            "United Kingdom Pounds" 4500 "FALSE"
      ""                 "United Kingdom Pounds" 4500 "FALSE"
      "Remote Working"   "United Kingdom Pounds" 4800 "FALSE"
      "Cambridge"        "United Kingdom Pounds" 4814 "FALSE"
      ""                 "United Kingdom Pounds" 5000 "FALSE"
      ""                 "USA Dollars"           5000 "FALSE"
      ""                 "United Kingdom Pounds" 5000 "FALSE"
      "York"             "United Kingdom Pounds" 5000 "FALSE"
      ""                 "USA Dollars"           5000 "FALSE"
      "London"           "United Kingdom Pounds" 5000 "FALSE"
      ""                 "United Kingdom Pounds" 5000 "FALSE"
      ""                 "United Kingdom Pounds" 5000 "FALSE"
      "Manchester"       "United Kingdom Pounds" 5000 "FALSE"
      ""                 "United Kingdom Pounds" 5000 "FALSE"
      ""                 "United Kingdom Pounds" 5000 "FALSE"
      ""                 "Malaysia Ringgits"     5000 "FALSE"
      ""                 "United Kingdom Pounds" 5000 "FALSE"
      end

      Comment


      • #4
        Lynn:
        set aside inflation and the like, that you can adjust for in your dataset, if you want to keep in your panel data regression the "TRUE" flagged observations only, you may want to go:
        Code:
         
         xtreg salary i.new_ethnicity i.year if salaryinclusionflag=="TRUE", fe
        Personally, I would convert your string variable(s) in numeric format via -encode-.
        Kind regards,
        Carlo
        (Stata 19.0)

        Comment

        Working...
        X