
  • Importance of misspecification test vs. R-sq. and consequences of xtsktest

    Dear Statalist Members,

    I am analyzing a balanced panel of around 2,400 firms over 12 years (Stata 13). The output I present here is based on test data, as I am not allowed (or able) to extract the original files. The only differences are the number of firms, which is higher in the original dataset, and that most of my explanatory variables turn out significant there, unlike in this sample data. In the original, the F-statistic is F(11, 13432) with Prob > F = 0.0000, and the overall R-sq. is 0.9639.

    My goal is to analyze the effect of investments in computers (investict, dummy 0-1) and of product and process innovations (dummies 0-1) on the demand for high-skilled workers. Controls include the size of the firm in terms of employees (total), the industry, a dummy for West Germany (west), a dummy for a collective bargaining agreement (collective), the state of the art of the production equipment (tech), whether the firm engages in R&D (rnd, dummy), and some more.

    I have used -xtserial- and -xttest3-, which led me to include cluster-robust standard errors. -xtoverid- made me decide to use fixed effects, and -testparm- led me to include year fixed effects. So my regression is now:

    Code:
      xtreg highskill investict product_inno process_inno total west industry collective exportshare investment turnover rnd tech i.year, fe vce(cluster idnum)
    note: west omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs      =      4344
    Group variable: idnum                           Number of groups   =       498
    
    R-sq:  within  = 0.1005                         Obs per group: min =         1
           between = 0.5034                                        avg =       8.7
           overall = 0.4393                                        max =        11
    
                                                    F(21,497)          =      2.60
    corr(u_i, Xb)  = 0.3892                         Prob > F           =    0.0001
    
                                    (Std. Err. adjusted for 498 clusters in idnum)
    ------------------------------------------------------------------------------
                 |               Robust
       highskill |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       investict |   .7032893   .2711382     2.59   0.010      .170571    1.236008
    product_inno |   .2723859   .6988765     0.39   0.697    -1.100731    1.645503
    process_inno |  -.3938082   .4501978    -0.87   0.382    -1.278334    .4907173
           total |    .101938   .0245108     4.16   0.000     .0537805    .1500954
            west |          0  (omitted)
        industry |   .1624997   .1911486     0.85   0.396    -.2130592    .5380586
      collective |  -.2838042   .5861356    -0.48   0.628    -1.435413    .8678049
     exportshare |   .8483747   2.351452     0.36   0.718    -3.771638    5.468387
      investment |   1.44e-06   5.98e-07     2.41   0.016     2.68e-07    2.62e-06
        turnover |  -1.99e-07   1.39e-07    -1.43   0.153    -4.73e-07    7.46e-08
             rnd |  -1.103514   .9824249    -1.12   0.262    -3.033732    .8267042
            tech |  -.6756037   .2828397    -2.39   0.017    -1.231313   -.1198947
                 |
            year |
           2008  |   .0310991   .3815399     0.08   0.935    -.7185309    .7807291
           2009  |   .4981931   .3197414     1.56   0.120    -.1300184    1.126405
           2010  |   .7890588   .4913133     1.61   0.109    -.1762483    1.754366
           2011  |   1.109093   .5630923     1.97   0.049     .0027585    2.215428
           2012  |   1.189345   .5407669     2.20   0.028      .126874    2.251816
           2013  |   .0965383   .7094676     0.14   0.892    -1.297387    1.490464
           2014  |   .4120097   .6609871     0.62   0.533    -.8866637    1.710683
           2015  |  -.1867301   .7267681    -0.26   0.797    -1.614647    1.241187
           2016  |   .1137137   .5447759     0.21   0.835     -.956634    1.184061
           2017  |  -.4267298   .7349041    -0.58   0.562    -1.870632    1.017172
                 |
           _cons |   4.706464   2.350515     2.00   0.046     .0882924    9.324636
    -------------+----------------------------------------------------------------
         sigma_u |  22.632204
         sigma_e |  7.5596268
             rho |  .89962854   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
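
    (For readers who want to retrace the diagnostics mentioned above, here is a sketch of that sequence. -xtserial-, -xttest3- and -xtoverid- are user-written commands installable from SSC, and the regressor list is abbreviated for readability, so adapt before running:)

    ```
    * Wooldridge test for serial correlation in the idiosyncratic errors
    xtserial highskill investict product_inno process_inno total tech

    * modified Wald test for groupwise heteroskedasticity after -xtreg, fe-
    xtreg highskill investict product_inno process_inno total tech, fe
    xttest3

    * cluster-robust alternative to -hausman- (FE vs. RE)
    xtreg highskill investict product_inno process_inno total tech, re vce(cluster idnum)
    xtoverid

    * joint significance of the year dummies
    xtreg highskill investict product_inno process_inno total tech i.year, fe vce(cluster idnum)
    testparm i.year
    ```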

    I originally intended to use the share of high-skilled employees as my dependent variable, but after reading Kronmal (1993) and several posts in this forum on the problems with ratio variables, I switched to the absolute number of high-skilled employees (highskill) and include the total number of employees as a control. This increased my R-squared by a lot (it was only 0.016 before).

    On the other hand, I tested my model specification using:

    Code:
     predict fitted, xb
    g sq_fitted=fitted^2
    xtreg highskill fitted sq_fitted
    test sq_fitted
    The p-value was 0.8 before, when using the share; now it is significant (0.0000), telling me my model is misspecified. My question is whether this is the right test for misspecification here, and if so, what else I can do about my specification. Or is a high R-sq. enough to argue that my model fits?
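
    (For reference, a variant of this RESET-style check that keeps the options of the original model. Whether the auxiliary regression should itself use -fe- with clustered errors is a judgment call, so treat this as a sketch rather than the one right way:)

    ```
    xtreg highskill investict product_inno process_inno total west industry collective ///
        exportshare investment turnover rnd tech i.year, fe vce(cluster idnum)
    predict fitted, xb              // linear prediction x*b from the model above
    gen sq_fitted = fitted^2        // squared prediction, as in -linktest-
    xtreg highskill fitted sq_fitted, fe vce(cluster idnum)
    test sq_fitted                  // H0: no neglected nonlinearity
    drop fitted sq_fitted
    ```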

    Also, I do not understand why the dummy for West Germany (west) is omitted; none of the regressors are highly correlated.

    I have read many posts in this forum and run several tests that led me to this fixed-effects regression model, so I am confused by the result of the specification test. I have also tried -areg, absorb(idnum) vce(cluster idnum)-, which gives slightly different coefficients and a higher R-sq. (as is normal) than -xtreg, fe-, but the misspecification test gives the same result.

    Testing for normality using
    Code:
     xtreg highskill investict product_inno process_inno total west industry collective exportshare investment turnover rnd tech, re vce(cluster idnum)
    (re, because -xtsktest- is not possible after -fe-) and then running -xtsktest- has given me the following:

    Code:
       xtsktest
    (running _xtsktest_calculations on estimation sample)
    
    Bootstrap replications (50)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ..................................................    50
    
    Tests for skewness and kurtosis                 Number of obs      =      4344
                                                    Replications       =        50
    
                                     (Replications based on 498 clusters in idnum)
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      Skewness_e |  -1805.438   1230.613    -1.47   0.142    -4217.396    606.5195
      Kurtosis_e |   456552.4   194447.7     2.35   0.019     75441.97    837662.8
      Skewness_u |    12182.3   2960.393     4.12   0.000     6380.038    17984.56
      Kurtosis_u |    1510700   274557.2     5.50   0.000     972577.4     2048822
    ------------------------------------------------------------------------------
    Joint test for Normality on e:        chi2(2) =   7.67    Prob > chi2 = 0.0217
    Joint test for Normality on u:        chi2(2) =  47.21    Prob > chi2 = 0.0000
    ------------------------------------------------------------------------------
    Could this mean I should transform my data using logs, as there are issues with normality? Or what are the consequences?

    I appreciate any input on my issues, thanks in advance,

    Helen

    Last edited by Helen Hickmann; 30 Jun 2020, 04:30.

  • #2
    Helen:
    the first thing I would look at (via -estat vce, corr-) is a possible quasi-extreme multicollinearity issue in your original code (where most of the coefficients do not reach statistical significance despite a respectable within R-sq).
    The misspecification test you ran actually tests the correctness of the regressand's functional form (see the -linktest- entry in the Stata .pdf manual for more details, as -linktest- uses the same machinery), although it is oftentimes (also) interpreted as a test of misspecification of the predictors: that said, it may be that some of your regressors have a non-linear relationship with the regressand.
    Finally, I do not think that striving for normality is relevant, as normality is a (weak) requirement for the (idiosyncratic) residual distribution only, and with a large sample even minimal departures from normality set off the alarm (I would visually inspect the whole matter instead).
    Last edited by Carlo Lazzaro; 30 Jun 2020, 06:46.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      Thank you for your reply. I was maybe a little unclear above: in my original data, most of my regressors actually are significant, and the R-sq, as stated above, is much higher than here in the sample data. I have run -estat vce, corr- on my original data, and the only high correlation is between turnover and total (almost 1), which is logical considering that bigger firms (in terms of employees) have a higher turnover (or sales volume). I was afraid to exclude one of the two since both are highly significant and Stata did not throw them out... or is it better to drop one? All other correlations between the independent variables are lower than 0.1.

      When it comes to normality, I did inspect my variables graphically, and they do not seem to be normally distributed, not least because they have no negative values. For variables like the total number of employees, the number of high-skilled workers, the turnover and investment amounts, and the export shares, the distributions are all concentrated at the lower (left) end, as I have more small and medium-sized firms in the panel (which, for Germany, is more or less representative).

      I have therefore tried
      Code:
       xtreg lnhighskill investict product_inno process_inno lntotal west collective lnexportshare lninvestment lnturnover rnd tech i.year, fe vce(cluster idnum)
      and now my model seems to be correctly specified according to the same test as above (0.5594), but my R-sq. is only half the size (from 0.43 above to 0.21 with the logs). Is this log transformation therefore a better fit for my data? -linktest- does not seem to be applicable after -xtreg-.

      Considering linearity, it is indeed the case that, judging from scatter plots, the relationship between highskill and the regressors is not very linear. Most of them are dummies anyway, but even the continuous ones do not seem linear.

      As I am new to data analysis, what I would think of doing next is to create squared versions of each non-dummy, non-categorical regressor and include them in the regression to see if any squared term is significant. Or is there a more efficient way to do that? Or might the above log transformation do the job, as the misspecification test went better this time, regardless of the lower R-sq.?

      Best,

      Helen



      • #4
        Helen:
        1) thanks for clarifying the difference between your original dataset and the excerpt that you shared with the list;
        2) as far as the R-sq is concerned, you should look at the within one, as you're using the -fe- estimator (the overall R-sq is less meaningful in this respect);
        3) a correlation of about 1 between two variables seems really high. I would try including -turnover- only;
        4) squaring continuous predictors (e.g., investment), looking for possible turning points, is worth exploring.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Dear Carlo,

          Thanks for the information on the appropriate R-sq and on dropping one of the strongly correlated regressors. I have included investment squared and it is significant. As turnover was no longer significant but total was, I have decided to keep only total employees.

          Code:
           xtreg highskill investict product_inno process_inno total west industry collective exportshare investment investment_sq rnd tech i.year, fe vce(cluster idnum)
          note: west omitted because of collinearity
          
          Fixed-effects (within) regression               Number of obs      =      6402
          Group variable: idnum                           Number of groups   =       657
          
          R-sq:  within  = 0.1029                         Obs per group: min =         1
                 between = 0.5260                                        avg =       9.7
                 overall = 0.4988                                        max =        11
          
                                                          F(20,656)          =         .
          corr(u_i, Xb)  = 0.4542                         Prob > F           =         .
          
                                           (Std. Err. adjusted for 657 clusters in idnum)
          -------------------------------------------------------------------------------
                        |               Robust
              highskill |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          --------------+----------------------------------------------------------------
              investict |   .6781545   .2696692     2.51   0.012     .1486355    1.207673
           product_inno |   .2201725   .6099916     0.36   0.718     -.977599    1.417944
           process_inno |  -.2346403   .3503019    -0.67   0.503    -.9224886    .4532079
                  total |   .0968363   .0205078     4.72   0.000     .0565675    .1371052
                   west |          0  (omitted)
               industry |   .0944955   .1731603     0.55   0.585    -.2455199    .4345109
             collective |  -.4380177   .5091838    -0.86   0.390    -1.437844    .5618089
            exportshare |   .5127848   2.028985     0.25   0.801    -3.471304    4.496873
             investment |   1.31e-07   4.31e-07     0.30   0.762    -7.15e-07    9.77e-07
          investment_sq |   4.96e-14   1.31e-14     3.78   0.000     2.39e-14    7.54e-14
                    rnd |  -1.466807   .8043435    -1.82   0.069    -3.046206     .112591
                   tech |  -.5374448   .2554219    -2.10   0.036    -1.038988   -.0359017
                        |
                   year |
                  2008  |   .0244317   .2850558     0.09   0.932    -.5353002    .5841636
                  2009  |   .5002077   .2657605     1.88   0.060    -.0216361    1.022051
                  2010  |   .5927328   .4292332     1.38   0.168     -.250104     1.43557
                  2011  |   .9605033   .4526084     2.12   0.034     .0717674    1.849239
                  2012  |   1.033362   .4275999     2.42   0.016     .1937323    1.872992
                  2013  |   .2971908   .5621296     0.53   0.597    -.8065995    1.400981
                  2014  |  -.0327852   .5648572    -0.06   0.954    -1.141931    1.076361
                  2015  |  -.0607577     .55997    -0.11   0.914    -1.160307    1.038792
                  2016  |   .1996527   .4519628     0.44   0.659    -.6878155    1.087121
                  2017  |  -.1020679   .5694883    -0.18   0.858    -1.220308    1.016172
                        |
                  _cons |   4.985016   2.240526     2.22   0.026     .5855491    9.384482
          --------------+----------------------------------------------------------------
                sigma_u |   24.22636
                sigma_e |  7.4350321
                    rho |  .91392085   (fraction of variance due to u_i)
          -------------------------------------------------------------------------------
          But the test
          Code:
            predict fitted, xb
          g sq_fitted=fitted^2
          xtreg highskill fitted sq_fitted
          again shows misspecification
          Code:
          . test sq_fitted  
          
           ( 1)  sq_fitted = 0
          
                     chi2(  1) =   92.14
                   Prob > chi2 =    0.0000
          If I take a logarithmic transformation of my dependent variable as well as of my continuous variables, I get the following:

          Code:
           xtreg lnhighskill investict product_inno process_inno lntotal west collective lnexportshare lninvestment rnd tech industry i.year, fe vce(cluster idnum)
          note: west omitted because of collinearity
          
          Fixed-effects (within) regression               Number of obs      =      1043
          Group variable: idnum                           Number of groups   =       198
          
          R-sq:  within  = 0.1016                         Obs per group: min =         1
                 between = 0.4145                                        avg =       5.3
                 overall = 0.4122                                        max =        11
          
                                                          F(20,197)          =      2.85
          corr(u_i, Xb)  = 0.3630                         Prob > F           =    0.0001
          
                                           (Std. Err. adjusted for 198 clusters in idnum)
          -------------------------------------------------------------------------------
                        |               Robust
            lnhighskill |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          --------------+----------------------------------------------------------------
              investict |    .112618   .0413907     2.72   0.007     .0309923    .1942437
           product_inno |   .0483549   .0393228     1.23   0.220    -.0291928    .1259026
           process_inno |  -.0030902    .045346    -0.07   0.946    -.0925162    .0863357
                lntotal |   .5126123   .1167062     4.39   0.000     .2824584    .7427663
                   west |          0  (omitted)
             collective |  -.0049241   .0681019    -0.07   0.942    -.1392264    .1293782
          lnexportshare |   .0548722   .0338597     1.62   0.107    -.0119019    .1216462
           lninvestment |    .005109   .0188699     0.27   0.787    -.0321039    .0423219
                    rnd |   -.057603   .0670661    -0.86   0.391    -.1898627    .0746566
                   tech |  -.0533592   .0265694    -2.01   0.046    -.1057562   -.0009623
               industry |   .0341748   .0256811     1.33   0.185    -.0164704      .08482
                        |
                   year |
                  2008  |   .0244506    .053138     0.46   0.646    -.0803417    .1292429
                  2009  |  -.0230317   .0615113    -0.37   0.708    -.1443369    .0982734
                  2010  |  -.0086223   .0552691    -0.16   0.876    -.1176173    .1003727
                  2011  |   .0435392   .0711816     0.61   0.541    -.0968365    .1839149
                  2012  |   .0858316   .0760463     1.13   0.260    -.0641377    .2358009
                  2013  |   .0200317   .0731886     0.27   0.785     -.124302    .1643654
                  2014  |   .0327455   .0719557     0.46   0.650    -.1091569    .1746478
                  2015  |   .1280943   .0697003     1.84   0.068    -.0093602    .2655487
                  2016  |   .0727282   .0732099     0.99   0.322    -.0716475     .217104
                  2017  |    .054778   .0704262     0.78   0.438     -.084108    .1936641
                        |
                  _cons |  -.3458275   .6061856    -0.57   0.569    -1.541273    .8496184
          --------------+----------------------------------------------------------------
                sigma_u |  1.2328732
                sigma_e |  .38180985
                    rho |  .91248488   (fraction of variance due to u_i)
          -------------------------------------------------------------------------------
          and further
          Code:
            test sq_fitted
          
           ( 1)  sq_fitted = 0
          
                     chi2(  1) =    0.20
                   Prob > chi2 =    0.6529

          So to me it seems like I should go for the log-transformed model, even though it costs me many observations due to zero values for companies that have no high-skilled employees or made no investment in the previous year. But this is not entirely convincing, having just seen that the squared term of investment was significant. Or is it possible to mix the two, i.e., take logs of some variables and a squared term of investment?

          Best,

          Helen



          Last edited by Helen Hickmann; 30 Jun 2020, 08:58.



          • #6
            Helen:
            as per your data, I would go logged, even though it means working with about 20% of the original sample.
            It might be that the (seemingly substantial) number of companies with 0 values causes you problems in your original code.
            As an aside, why create interactions (and/or categorical variables) by hand when -fvvarlist- notation can do it for you?
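
            (A sketch of what -fvvarlist- notation does, using variable names from this thread; these shortened specifications are illustrative only, not taken from the posts:)

            ```
            * squared term without creating investment_sq by hand
            xtreg highskill c.investment##c.investment total i.year, fe vce(cluster idnum)

            * tech entered as a set of category indicators rather than a score
            xtreg highskill i.tech total i.year, fe vce(cluster idnum)
            ```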
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Dear Carlo,

              Thank you very much, I will go for the log-transformed model. I will research whether there is a way to do a log transformation that does not make me lose all firms that did not invest in the previous year and have a zero value for 'investment', as that would surely bias my results.

              As for -fvvarlist-, I am not sure what you refer to? I have no interaction terms in my model so far, and my categorical variable is 'tech', which is like that due to the survey design. Or do you mean I should use -fvvarlist- to build categories out of my continuous variables like investment or total employees? But then I cannot log them, right? I am a little confused, sorry.

              best
              Last edited by Helen Hickmann; 30 Jun 2020, 16:41.



              • #8
                Helen:
                1) unfortunately, I do not think that you can avoid getting rid of those firms with a 0 value for investment, if you log. On second thought, it may well be that the data collection (i.e., mixing together firms that invested in the previous year and firms that did not) affects the feasibility of the data analysis, as it seems that, in order to work with a model that is not misspecified, you should limit the -e(sample)- to those firms that did invest (and I do not think you can do anything about that, if logging is actually the way to go). Conversely, you may want to investigate what happens if you omit the variables with 0 values and then log (provided that this second approach gives a fair and true view of the data-generating process).
                2) in one of your previous posts, you stated
                I have included investment squared...
                Doesn't a squared term represent an interaction of the linear term with itself?:
                Code:
                c.investment##c.investment   // expands to investment plus c.investment#c.investment
                Last edited by Carlo Lazzaro; 01 Jul 2020, 01:07.
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Dear Carlo,

                  yes, I had investment squared to test its significance in the original model (before logging), and it was significant. But then I went for the log model instead. When I use the log model, can I still keep investment un-logged, entering it in levels and/or as a squared term? If that were possible, I would not have to cut my sample, as I could avoid logging investment...

                  Helen



                  • #10
                    Helen:
                    yes, the squared investment term was in your first code. I took it as an example to switch to -fvvarlist- notation.
                    You may want to investigate what happens when you do not log investment in your logged model. Actually, it is not mandatory to log all the continuous predictors (in addition to the logged regressand). Obviously, the more you mix logged and non-logged predictors, the more difficult interpreting your results can become.
                    As an aside, you can also square a logged term (although, taking a look at the 95% CI of -lninvestment-, I doubt that a squared term would be helpful).
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Carlo,

                      That’s good to know. Thank you so much for all the advice! I will try the different versions with the original data set and see what fits best.

                      have a great day,

                      Helen



                      • #12
                        Thanks, you too.
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #13
                          I'll try to respond more fully later, but you should not drop the zeros. That is selecting your sample on the basis of y. You need to have a very good reason to do that.

                          You can always use a linear model to start with, even if your outcome variable is a fraction. On top of that, you might try the fractional correlated random effects approach in Papke and Wooldridge (2008, Journal of Econometrics). It is better to use a fraction in a linear model, or a CRE model, than to drop any observation with y = 0. And R-squared has nothing to do with this: you can't compare R-squareds across different dependent variables.

                          You might even try xtpoisson with the fe and vce(robust) options to allow the zeros, with lntotal as an explanatory variable. This is fully robust, as has been discussed on this site many times. The exponential mean is not ideal because it doesn't impose that skilled workers <= total workers, but if the difference is usually large, it shouldn't matter much. And your linear model doesn't impose that, either.
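
                          (A sketch of this suggestion with the variable names used earlier in the thread; the regressor list is abbreviated, so adapt before running:)

                          ```
                          gen lntotal = ln(total)    // total > 0 for every firm, so the log exists
                          * conditional (fixed-effects) Poisson with fully robust standard errors
                          xtpoisson highskill investict product_inno process_inno lntotal i.year, fe vce(robust)
                          ```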

                          BTW, the reason "west" drops out is because it doesn't vary over time. You don't need it in the model when you do FE, whether you use a linear model or xtpoisson.

                          As Carlo said, you shouldn't be testing for normality or heteroskedasticity or even serial correlation. Cluster your standard errors. Your N is plenty large for the asymptotics.

                          JW



                          • #14
                            Dear Jeff,

                            thank you very much for the extensive and very helpful reply! I will research the literature and approaches you suggest and apply them to my data.

                            best,

                            Helen



                            • #15
                              Dear Jeff,

                              I have now done some further research. Many people facing zero values in a variable to be log-transformed seem to simply add a small constant (mostly 1) so as to end up with log(1) = 0, but I have also read that this can bias the coefficients, which is why I am not eager to do it. People are also undecided between adding one only in the cases where the value is zero and adding one to all values. As the zero-valued variables I am using are my DV highskill (number of high-skilled employees) as well as two of my IVs, investment (in EUR) and exportshare (between zero and 1), I do not see big issues with adding 1 EUR etc. to my observations, but I am not an expert and would therefore prefer a more accepted solution.
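
                              (The two variants described above can be sketched as follows; the variable name suffixes are made up for illustration, and the arbitrariness of the added constant is one source of the bias concern:)

                              ```
                              * variant 1: add 1 to every observation before logging
                              gen lninvestment1 = ln(investment + 1)

                              * variant 2: log where positive, set ln(1) = 0 only where investment is zero
                              gen lninvestment2 = cond(investment == 0, 0, ln(investment))
                              ```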

                              I have followed your advice, trying -xtpoisson, fe vce(robust)-, and have received the following result (I have replaced i.year with i.industry due to the results of -testparm- for both):

                              Code:
                               xtpoisson highskill investict product_inno process_inno total i.industry collective exportshare investment rnd tech, fe vce(robust)
                              note: 12 groups (12 obs) dropped because of only one obs per group
                              note: 111 groups (1143 obs) dropped because of all zero outcomes
                              
                              Iteration 0:   log pseudolikelihood = -11853.823  
                              Iteration 1:   log pseudolikelihood = -11548.262  
                              Iteration 2:   log pseudolikelihood = -11547.692  
                              Iteration 3:   log pseudolikelihood = -11547.691  
                              
                              Conditional fixed-effects Poisson regression    Number of obs      =      5247
                              Group variable: idnum                           Number of groups   =       534
                              
                                                                              Obs per group: min =         2
                                                                                             avg =       9.8
                                                                                             max =        11
                              
                                                                              Wald chi2(18)      =     65.87
                              Log pseudolikelihood  = -11547.691              Prob > chi2        =    0.0000
                              
                                                                                                         (Std. Err. adjusted for clustering on idnum)
                              -----------------------------------------------------------------------------------------------------------------------
                                                                                    |               Robust
                                                                          highskill |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              ------------------------------------------------------+----------------------------------------------------------------
                                                                          investict |    .060396   .0261166     2.31   0.021     .0092085    .1115835
                                                                       product_inno |   .0268025   .0478654     0.56   0.576    -.0670118    .1206169
                                                                       process_inno |  -.0248983   .0242651    -1.03   0.305    -.0724571    .0226604
                                                                              total |   .0013654   .0003804     3.59   0.000     .0006198     .002111
                                                                                    |
                                                                           industry |
                              Mining & quarrying, electricity, gas and water sup..  |  -.7953135   .3598805    -2.21   0.027    -1.500666   -.0899607
                                                                     Manufacturing  |  -.7077856   .4969981    -1.42   0.154    -1.681884    .2663128
                                                                      Construction  |  -.1095389   .6873729    -0.16   0.873    -1.456765    1.237687
                                                                             Trade  |  -.8627884   .2321539    -3.72   0.000    -1.317802   -.4077751
                                                                         Transport  |  -.2272573    .286954    -0.79   0.428    -.7896769    .3351622
                                                     Information and Communication  |  -.5411883   .2737929    -1.98   0.048    -1.077812    -.004564
                                                                Financial services  |          0  (omitted)
                                                                    Other services  |  -.2763747   .2497819    -1.11   0.269    -.7659382    .2131888
                                                 Education, Health and Social Work  |  -.2910454   .3001761    -0.97   0.332    -.8793797    .2972889
                                                                     Public sector  |  -.3389178    .290392    -1.17   0.243    -.9080756    .2302399
                                                                                    |
                                                                         collective |  -.0291069   .0453624    -0.64   0.521    -.1180156    .0598018
                                                                        exportshare |  -.0539176   .0734882    -0.73   0.463    -.1979519    .0901167
                                                                         investment |   9.57e-09   4.03e-09     2.37   0.018     1.67e-09    1.75e-08
                                                                                rnd |   -.072425   .0405402    -1.79   0.074    -.1518824    .0070324
                                                                               tech |  -.0569154   .0234294    -2.43   0.015    -.1028362   -.0109946
                              -----------------------------------------------------------------------------------------------------------------------
                              The results look okay, but I do not understand why so many of my observations get dropped here as well? It is half of my total observations. I would really appreciate advice on how to handle this, or on where my mistake in the application lies. I am also unsure about the interpretation of the results (or is it the same as for a linear regression, comparing units?), but I guess I will be able to find that out using Google. Also, the previously discussed misspecification test using fitted values again turns out 0.000, indicating misspecification, but I am not sure it is applicable to the xtpoisson regression...

                              Helen
                              Last edited by Helen Hickmann; 03 Jul 2020, 04:12.

