  • Panel data - constant dependent variable across individuals not time

    Hello,

    I would like to ask for advice, as I have run into a number of problems with my analysis and have become really confused.
    I am analysing the impact of 5 technology-related variables on employment in 6 different job categories.
    I have panel data for 2006-2016 across 10 regions: xtset regions year

    When I run the following regression:
    xtreg employment_category_1 5 different tech variables degree high education no qualifications, fe
    Unfortunately, all of my variables come out as insignificant.

    Hence I created a new dependent variable, total employment in year t, and ran the same regression on it: total employment in year t on the 5 different tech variables, degree, high education and no qualifications, with fe.
    The technology and education variables do vary across regions; however, total employment now varies only across time...
    Now more coefficients are significant.

    Hence my question is: can I run a regression like this?

    I would be very thankful for any advice

  • #2
    Julia:
    welcome to this forum.
    I think that your question is ill-posed: the aim of any regression model is not to obtain the highest number of statistically significant coefficients, but to give a fair and true view of the data generating process.
    That said, your chances of receiving more positive replies are conditional on posting what you typed and what Stata gave you back (via CODE delimiters, please) and/or sharing an example/excerpt of your data via -dataex- (as per FAQ). Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Julia:
      as per FAQ, please do not post screenshots, as they are impossible to elaborate on; use -dataex- to post an example/excerpt of your data. Thanks.
      Kind regards,
      Carlo
      (Stata 19.0)


      • #4
        Hi Julia,

        I don't see the problem here. So some of your explanatory (independent) variables only vary within the panels (within effect), not across panels (between effect). Why is that creating problems for you? I guess I really don't understand what model you're trying to estimate, how you're going about it, and what problems you are running into in your analysis. Please expound.
        Alfonso Sanchez-Penalver


        • #5
          Alfonso Sánchez-Peñalver Hello Alfonso,
          My problem is that I set up my panel data by region and time. My dependent variables vary across both regions and time; however, the explanatory variables which I want to test vary only across time (not regions). I am not sure if I can still use fixed effects for this, as the Hausman test suggested.


          • #6
            Carlo Lazzaro Of course, I am sorry about this. This should be okay:
            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input year regions total_hm_eachyear total_low_eachyear total_inetrmediate high_manegerial lower_managerial intermediate access_internet rd_expenditure_by_business double gva degree
            2006  1 4180195 8152769 1951024 469902  879281 187878 75.5 3650 109887  288847
            2006  2 4180195 8152769 1951024 337430  599284 129886 58.1  985  75166  194967
            2006  3 4180195 8152769 1951024 781737 1309888 302902 70.1  882 287020 1204794
            2006  4 4180195 8152769 1951024 143100  343195  84898   63  292  42274  145178
            2006  5 4180195 8152769 1951024 419238 1011823 274734   62 1623 128335  359653
            2006  6 4180195 8152769 1951024 822526 1361677 296340 67.2 3347 191672  578124
            2006  7 4180195 8152769 1951024 403224  799651 190716 64.9 1232  95187  305392
            2006  8 4180195 8152769 1951024 146241  409064 110338 56.2  216  45837  242546
            2006  9 4180195 8152769 1951024 355053  734918 190960   67  915  93755  322383
            2006 10 4180195 8152769 1951024 301744  703988 182372 60.5  384  92369  391383
            2007  1 4186417 8031418 1887084 470912  858142 171708 72.7 3992 114161  545314
            2007  2 4186417 8031418 1887084 311560  596660 128768 73.8 1062  78249  396978
            2007  3 4186417 8031418 1887084 789711 1286675 292654 69.7 1067 309828 1290992
            2007  4 4186417 8031418 1887084 142036  334088  86640 45.6  331  43515  201714
            2007  5 4186417 8031418 1887084 405416  986973 263950 59.6 2021 133878  535430
            2007  6 4186417 8031418 1887084 848654 1363345 297036 74.7 3515 201410  990251
            2007  7 4186417 8031418 1887084 393937  770679 186106 74.2 1229  99399  500275
            2007  8 4186417 8031418 1887084 156482  398800 110510 64.5  308  47862  250701
            2007  9 4186417 8031418 1887084 367230  739131 184842 71.7  995  96529  474357
            2007 10 4186417 8031418 1887084 300479  696925 164870 53.4  436  98014  428109
            2008  1 4232327 8203301 1915338 480635  911238 192288 75.6 4182 118349  566936
            2008  2 4232327 8203301 1915338 292542  623342 145744 68.3  976  80011  413092
            2008  3 4232327 8203301 1915338 841138 1311291 312368 77.1 1109 317266 1383294
            2008  4 4232327 8203301 1915338 134960  342957  88068 64.3  318  44143  227896
            2008  5 4232327 8203301 1915338 451921  986320 246370 64.3 2130 136146  545302
            2008  6 4232327 8203301 1915338 809674 1389293 293970 78.1 3466 212264 1019875
            2008  7 4232327 8203301 1915338 371183  766467 177000 72.1 1345 103606  549388
            2008  8 4232327 8203301 1915338 174829  396966  97954   73  243  47416  275802
            2008  9 4232327 8203301 1915338 360058  783242 172318 66.9  886  98699  489892
            2008 10 4232327 8203301 1915338 315387  692185 189258 68.6  433  97211  456818
            2009  1 4383729 8335256 1929204 486185  916857 177870 80.2 3812 115391  588436
            2009  2 4383729 8335256 1929204 304538  653617 141898 71.9  992  78992  408604
            2009  3 4383729 8335256 1929204 860409 1353701 307066 83.8  907 310719 1467978
            2009  4 4383729 8335256 1929204 145922  326534  92418 71.4  315  44309  219077
            2009  5 4383729 8335256 1929204 476038  948635 249672 73.8 1926 136055  537647
            2009  6 4383729 8335256 1929204 813567 1409177 318112 81.2 3758 208692 1063420
            2009  7 4383729 8335256 1929204 409599  813396 182346 76.9 1349 102325  535448
            2009  8 4383729 8335256 1929204 174542  412807  99570 72.6  243  47046  293567
            2009  9 4383729 8335256 1929204 363584  776756 176726 74.4  848  95329  496662
            2009 10 4383729 8335256 1929204 349345  723776 183526 73.2  454  95122  509913
            2010  1 4416277 8445118 1940628 498486  907704 193400 80.8 3846 116343  613767
            2010  2 4416277 8445118 1940628 308811  667877 142370 79.9 1137  81492  429493
            2010  3 4416277 8445118 1940628 845303 1389991 293936 86.7  877 317481 1537961
            2010  4 4416277 8445118 1940628 145232  358559 101490 66.7  308  44457  228388
            2010  5 4416277 8445118 1940628 480392 1005476 265442 77.7 2074 140108  585354
            2010  6 4416277 8445118 1940628 819670 1420863 292034 83.4 3798 212908 1114057
            2010  7 4416277 8445118 1940628 424160  827610 180584 75.2 1454 104827  558084
            2010  8 4416277 8445118 1940628 187543  412378 104630 83.2  234  48973  309035
            2010  9 4416277 8445118 1940628 355675  740063 205008 78.3  886  99576  506846
            2010 10 4416277 8445118 1940628 351005  714597 161734 74.9  488  95555  532949
            2011  1 4711213 8116903 2461184 517747  882572 258920 82.1 3639 117549  646389
            2011  2 4711213 8116903 2461184 350838  629078 164270 82.1 1146  83631  456840
            2011  3 4711213 8116903 2461184 899300 1427022 390600   86 1118 329724 1646891
            2011  4 4711213 8116903 2461184 181802  318054 119652 72.3  259  45313  243366
            2011  5 4711213 8116903 2461184 487441  935147 350764 82.9 2220 142082  638970
            2011  6 4711213 8116903 2461184 873235 1374493 400510 85.2 4579 218763 1150259
            2011  7 4711213 8116903 2461184 452272  780934 194084 84.2 1359 106738  630952
            2011  8 4711213 8116903 2461184 193650  414238 137832 73.6  252  51131  327229
            2011  9 4711213 8116903 2461184 394840  698203 225988 77.5 1281 103594  520865
            2011 10 4711213 8116903 2461184 360088  657162 218564 77.2  550  97024  538772
            2012  1 4465149 8303603 2585208 505492  917864 258524 85.6 3606 121477  685737
            2012  2 4465149 8303603 2585208 306605  630218 178168 86.8 1218  86189  477137
            2012  3 4465149 8303603 2585208 893257 1371151 418292 90.5 1570 345407 1811656
            2012  4 4465149 8303603 2585208 149618  332835 106660 69.5  282  46705  247691
            2012  5 4465149 8303603 2585208 459626  977940 368962 84.9 1781 146154  640533
            2012  6 4465149 8303603 2585208 866477 1421894 414906 87.6 4133 225381 1275835
            2012  7 4465149 8303603 2585208 419558  779452 218326 88.6 1367 110561  664600
            2012  8 4465149 8303603 2585208 174899  413843 144526 74.6  268  53093  336222
            2012  9 4465149 8303603 2585208 346730  780161 262542 80.2 1460 106276  546361
            2012 10 4465149 8303603 2585208 342887  678245 214302 84.2  600  99230  558781
            2013  1 4657847 8632346 2414086 551185  958358 240620 87.7 4137 127279  782194
            2013  2 4657847 8632346 2414086 330120  655570 151026   88 1340  89764  501852
            2013  3 4657847 8632346 2414086 937934 1491902 401920 94.3 1291 362754 1924117
            2013  4 4657847 8632346 2414086 125747  357811 121514 76.9  323  46956  256778
            2013  5 4657847 8632346 2414086 511113  967187 345024 84.4 1835 150385  697618
            2013  6 4657847 8632346 2414086 848155 1463328 357326 90.8 4288 234106 1305326
            2013  7 4657847 8632346 2414086 420991  824553 180864 89.2 1450 113675  693036
            2013  8 4657847 8632346 2414086 169278  446256 141414   82  367  54847  348343
            2013  9 4657847 8632346 2414086 391175  759431 229786   83 1684 111832  592840
            2013 10 4657847 8632346 2414086 372149  707950 244592 85.3  660 101163  607541
            2014  1 4914490 8559271 2512036 553520  951813 251864   88 4094 134141  806614
            2014  2 4914490 8559271 2512036 365923  622804 213508   87 1473  93032  535331
            2014  3 4914490 8559271 2512036 960895 1485180 423372 93.8 1724 385331 2007381
            2014  4 4914490 8559271 2512036 128580  333526 136862 84.7  282  48583  271982
            2014  5 4914490 8559271 2512036 531307  973682 330398 88.5 1913 156404  713764
            2014  6 4914490 8559271 2512036 940857 1494319 325490 90.1 4609 246263 1388151
            2014  7 4914490 8559271 2512036 490160  789347 177062 89.2 1561 119388  704330
            2014  8 4914490 8559271 2512036 170157  440826 130702 86.6  386  56347  371666
            2014  9 4914490 8559271 2512036 406247  746906 265784 88.8 1924 117867  616841
            2014 10 4914490 8559271 2512036 366844  720868 256994 86.4  700 104512  640100
            2015  1 4880272 8753235 2542906 522184  966837 227230 92.3 4200 139103  817219
            2015  2 4880272 8753235 2542906 365655  667837 224442 88.7 1531  96582  573658
            2015  3 4880272 8753235 2542906 959113 1467676 417408 93.2 1892 397898 2100711
            2015  4 4880272 8753235 2542906 138793  355883 124138 87.5  306  50157  279424
            2015  5 4880272 8753235 2542906 527905 1015707 331162 88.5 2116 163645  731215
            2015  6 4880272 8753235 2542906 920785 1543806 389780 93.5 4765 254297 1452274
            2015  7 4880272 8753235 2542906 479595  837440 220270 90.8 1476 121157  776539
            2015  8 4880272 8753235 2542906 181705  419289 131244 88.6  368  57942  384489
            2015  9 4880272 8753235 2542906 411614  769154 260964 88.2 2159 121167  669121
            2015 10 4880272 8753235 2542906 372923  709606 216268   89  769 108710  646951
            end
            label values regions regions
            label def regions 1 "East", modify
            label def regions 2 "East Midlands", modify
            label def regions 3 "London", modify
            label def regions 4 "North East", modify
            label def regions 5 "North West", modify
            label def regions 6 "South East", modify
            label def regions 7 "South West", modify
            label def regions 8 "Wales", modify
            label def regions 9 "West Midlands", modify
            label def regions 10 "Yorkshire and The Humber", modify
            My problem is that I would like to use xtreg with either total_hm_eachyear, total_low_eachyear or total_inetrmediate as my dependent variable, but they do not vary across regions, only across time.
            My independent variables are the last 4 columns, which vary across both regions and time.

            Another possibility is to use high_manegerial, lower_managerial or intermediate, which do vary across time, but I feel that the results I get do not show anything because there are no fixed effects for which I can control (I could not think of any time-invariant variables for regions).
            I am really not sure which model I should use to analyze this relationship.
            Last edited by Julia Raciniewska; 17 Feb 2019, 12:02.


            • #7
              But I don't see a problem with an independent variable only varying across time. These are variables that are common to all the regions. I have done the following estimations with your data

              Code:
              . xtset regions year
                     panel variable:  regions (strongly balanced)
                      time variable:  year, 2006 to 2015
                              delta:  1 unit
              
              . 
              . xtreg gva total_* high_manegerial lower_managerial intermediate, fe
              
              Fixed-effects (within) regression               Number of obs     =        100
              Group variable: regions                         Number of groups  =         10
              
              R-sq:                                           Obs per group:
                   within  = 0.7996                                         min =         10
                   between = 0.8702                                         avg =       10.0
                   overall = 0.8674                                         max =         10
              
                                                              F(6,84)           =      55.85
              corr(u_i, Xb)  = -0.5349                        Prob > F          =     0.0000
              
              ------------------------------------------------------------------------------------
                             gva |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------------+----------------------------------------------------------------
               total_hm_eachyear |  -.0035832   .0070132    -0.51   0.611    -.0175297    .0103633
              total_low_eachyear |   .0053217   .0052471     1.01   0.313    -.0051128    .0157562
              total_inetrmediate |   .0017711   .0057685     0.31   0.760    -.0097002    .0132424
                 high_manegerial |   .1742765    .034422     5.06   0.000     .1058246    .2427284
                lower_managerial |   .1406151   .0270051     5.21   0.000     .0869125    .1943177
                    intermediate |   .0951422   .0375012     2.54   0.013     .0205668    .1697175
                           _cons |  -117872.8   26985.13    -4.37   0.000    -171535.7   -64209.86
              -------------------+----------------------------------------------------------------
                         sigma_u |  37343.545
                         sigma_e |  6767.3828
                             rho |  .96820366   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------------
              F test that all u_i=0: F(9, 84) = 146.76                     Prob > F = 0.0000
              
              . est store fe
              
              . xtreg gva total_* high_manegerial lower_managerial intermediate, re
              
              Random-effects GLS regression                   Number of obs     =        100
              Group variable: regions                         Number of groups  =         10
              
              R-sq:                                           Obs per group:
                   within  = 0.7922                                         min =         10
                   between = 0.8749                                         avg =       10.0
                   overall = 0.8725                                         max =         10
              
                                                              Wald chi2(6)      =     454.32
              corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
              
              ------------------------------------------------------------------------------------
                             gva |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------------+----------------------------------------------------------------
               total_hm_eachyear |  -.0033033   .0082197    -0.40   0.688    -.0194136    .0128071
              total_low_eachyear |   .0099656   .0060258     1.65   0.098    -.0018449     .021776
              total_inetrmediate |   .0011757   .0068874     0.17   0.864    -.0123234    .0146749
                 high_manegerial |   .1714768   .0365534     4.69   0.000     .0998335    .2431201
                lower_managerial |    .094177   .0263209     3.58   0.000      .042589     .145765
                    intermediate |    .101096   .0442108     2.29   0.022     .0144445    .1877475
                           _cons |  -117872.8   32909.08    -3.58   0.000    -182373.4   -53372.14
              -------------------+----------------------------------------------------------------
                         sigma_u |  13343.821
                         sigma_e |  6767.3828
                             rho |   .7954146   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------------
              
              . 
              . hausman . fe
              
                               ---- Coefficients ----
                           |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                           |       .            fe         Difference          S.E.
              -------------+----------------------------------------------------------------
              total_hm_e~r |   -.0033033    -.0035832          .00028        .0042871
              total_low_~r |    .0099656     .0053217        .0046438        .0029628
              total_inet~e |    .0011757     .0017711       -.0005954        .0037632
              high_maneg~l |    .1714768     .1742765       -.0027997        .0122994
              lower_mana~l |     .094177     .1406151       -.0464381               .
              intermediate |     .101096     .0951422        .0059538        .0234147
              ------------------------------------------------------------------------------
                                         b = consistent under Ho and Ha; obtained from xtreg
                          B = inconsistent under Ha, efficient under Ho; obtained from xtreg
              
                  Test:  Ho:  difference in coefficients not systematic
              
                                chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                        =    -2.01    chi2<0 ==> model fitted on these
                                                      data fails to meet the asymptotic
                                                      assumptions of the Hausman test;
                                                      see suest for a generalized test
              You can see there is no problem in the estimations. You say the Hausman test suggests using fixed effects, but the Hausman test we did here suggests that the model fails to meet the assumptions of the Hausman test. This is not the same as fixed effects being right. Is this also the case with the whole data or just a case of this reduced sample?
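
              On the negative chi2, here is a minimal sketch of one thing worth trying, assuming the same variable names as above. -hausman- expects the always-consistent estimator (here fe) first and the efficient one (here re) second, and its sigmamore option bases both covariance matrices on the disturbance variance estimate from the efficient model, which makes a negative statistic less likely. The stored name re is my own choice, and I have not checked what this returns on the full data.

              Code:
              * Refit both models quietly and store them under explicit names
              quietly xtreg gva total_* high_manegerial lower_managerial intermediate, fe
              estimates store fe
              quietly xtreg gva total_* high_manegerial lower_managerial intermediate, re
              estimates store re

              * Consistent estimator (fe) first, efficient estimator (re) second
              hausman fe re, sigmamore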

              Now, the fact that you have variables that vary only within panels doesn't make fixed effects wrong either. Notice that in fixed effects you are demeaning all variables within the panels. Variables that vary only across time will have the same values in each panel but different values across time, and after demeaning they will be centered around zero. There is nothing wrong with that. I believe that the problems you are running into may be due to other reasons.
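
              To make the demeaning concrete, here is a small sketch that applies the within transformation by hand to one of the year-only variables, assuming the -dataex- example from #6 is in memory. The names mean_hm and dm_hm are made up for illustration.

              Code:
              * Region-specific mean of the variable that varies only by year
              bysort regions: egen double mean_hm = mean(total_hm_eachyear)

              * Within-transformed (demeaned) value
              generate double dm_hm = total_hm_eachyear - mean_hm

              * In this balanced panel the demeaned value is identical across regions in a given year
              bysort year (regions): assert dm_hm == dm_hm[1]

              If the assert passes silently, every region carries the same demeaned series, so the variable still has within variation but contributes nothing that distinguishes one region from another.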

              Alfonso.
              Alfonso Sanchez-Penalver


              • #8
                Julia:
                Alfonso gave helpful insights.
                I would only add that you should probably skim through the literature of your research field and see what others did in the past (in terms of model, regressand and predictors) when presented with the same research issue.
                Kind regards,
                Carlo
                (Stata 19.0)


                • #9
                  Alfonso Sánchez-Peñalver Thank you for your time, Sir. I don't think I expressed my problem clearly: GVA and the last 4 columns are my independent variables, and it is total_hm which I wanted to use as the dependent variable (and then run other regressions replacing total_hm with total_low, etc.). That is why I am very confused, as this is the variable which is constant across regions but varies across time, while gva and the other independent variables vary across both regions and time.
                  When I ran the same tests you did above, but with total_hm as my dependent variable over the whole sample, I got the same result as you, though: the model failed to meet the assumptions of the Hausman test, and chi2 was -35.56.
                  What do you think?

                  Thank you very much for your support - it is very difficult to find answer to my problem anywhere else


                  • #10
                    Hi Julia,

                    well, there I think you have a problem, because you only have a very limited number of observations you can really use: those over which the dependent variable varies, i.e. the number of years. How many years are in your data?
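
                    A quick way to see this in the data, sketched under the assumption that the -dataex- example from #6 is loaded. The variable name one_per_year is made up for illustration.

                    Code:
                    xtset regions year

                    * The between standard deviation is zero for a variable that is identical across regions
                    xtsum total_hm_eachyear high_manegerial

                    * Flag one row per year and count them: that is how many distinct values the national total can take
                    egen byte one_per_year = tag(year)
                    count if one_per_year
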
                    Alfonso Sanchez-Penalver


                    • #11
                      Alfonso Sánchez-Peñalver Carlo Lazzaro
                      There are 11, that is, data for 2006-2016. I am writing my dissertation on the impact of technology on employment in different job categories. As the paper which inspired me did this for years before 2003, I am trying to check whether technology had the same impact after that year. My regression is:
                      xtreg total_employment_higher_managerial = access to internet + total r&d by businesses+ total r&d by higher education + GVA + degree + higher education + other qualifications + no qualifications, fe

                      All independent variables are measured by region and time. I tried changing total_employment_higher_managerial to employment in this category not summed up for the whole country, but varying across regions; however, the Breusch-Pagan test showed that the variance is not constant (heteroscedasticity) and the RESET test showed that I have omitted variables, hence I thought about applying fixed effects. However, I get very weird results: not only are some of them inconsistent with logic, but most of my coefficients are also insignificant.
                      Hence, when I changed the dependent variable to "total_employment_higher_managerial", that is, for the whole country (varying only across time), the results became not only understandable but also significant.

                      That is why I am seeking advice: I do not want to take these results and present them, as I am unsure if I can run a regression like this.

                      Thank you very much
                      Last edited by Julia Raciniewska; 17 Feb 2019, 13:04.


                      • #12
                        The problem is that you're trying to explain an aggregate dependent variable with several disaggregated variables. To keep things simple, this is like trying to explain the UK's GDP with the regional level population. The truth is that you only have 11 observations of the dependent variable. A different matter would be if you had the regional data for your dependent variable, e.g. the GDP of each specific region. You see, to explain national level GDP, you need national level variables, not regional level variables. At least I think this is where the problem lies. You mention previous work done on this for years before 2003. Did they also have aggregate data as the dependent variable? Or was the dependent variable at the regional level?

                        If you get data at the regional level for your dependent variables you would have no issue. Otherwise I believe the specification of the model is wrong. My advice is that you sit down and discuss this with your dissertation advisor, since (s)he should be able to advise you better.
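
                        The two consistent set-ups could look roughly like this. It is a sketch only, using the variable names from the -dataex- example in #6, and it should be run from a do-file because of the /// line continuation. The aggregation rules in the second route (unweighted mean for access_internet, sums for the rest) are assumptions of mine, not something the data documentation confirms.

                        Code:
                        * Route 1: regional dependent variable with regional regressors
                        xtset regions year
                        xtreg high_manegerial access_internet rd_expenditure_by_business gva degree, fe vce(robust)

                        * Route 2: national dependent variable, so aggregate the regressors to one row per year
                        preserve
                        collapse (first) total_hm_eachyear (mean) access_internet ///
                            (sum) rd_expenditure_by_business gva degree, by(year)
                        regress total_hm_eachyear access_internet rd_expenditure_by_business gva degree
                        restore

                        With -xtreg, fe- the vce(robust) option gives standard errors clustered on the panel variable, which should be read cautiously with only 10 regions. The second route leaves just one observation per year, so four regressors plus a constant is already a lot to ask of it.
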
                        Alfonso Sanchez-Penalver
