  • Interpretation of Stata results

    Hello,

    I'm doing a master's thesis on the effect of CSR contracting on CSR performance. My model is a multiple regression estimated by OLS.

    I have regressed CSR level on CSR contracting with several control variables. As a result, the CSR contracting coefficient is significant at p<0.1.

    Afterwards, I added industry as a control, creating a dummy for each industry via the i.industries factor notation in Stata.

    After controlling for industry, the CSR contracting variable (my independent variable) is more significant than in the first regression: it is now significant at p<0.01. I'm struggling to understand the reasoning behind this big change in results. Moreover, the coefficient on CSR contracting is now twice as large as in the first regression.

    Could someone help me interpret why adding industry as a control variable has such a big effect on the coefficient and its significance?

    Thank you very much

    Jamo
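
    For readers outside Stata: the pattern Jamo describes can arise when the added control is correlated with both the regressor of interest and the outcome, so omitting it biases the coefficient toward zero (a suppression effect). A minimal Python sketch with made-up variables (not Jamo's data) illustrating the mechanics:

```python
# Illustration only: adding a control correlated with both the regressor
# and the outcome can make the coefficient larger and more significant.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

industry = rng.integers(0, 2, n)            # binary "industry" stand-in
x = rng.normal(size=n) - 0.8 * industry     # regressor correlated with industry
y = 2.0 * x + 3.0 * industry + rng.normal(size=n)

def ols(X, y):
    """Ordinary least squares coefficients via least-squares solve."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_short = ols(np.column_stack([ones, x]), y)[1]            # omits industry
b_long = ols(np.column_stack([ones, x, industry]), y)[1]   # controls for it

# Here the true slope is 2.0: b_long recovers it, while b_short is
# pulled toward zero by the omitted industry effect.
print(b_short, b_long)
```

    Whether this is what happened in Jamo's data depends on how industry correlates with CSR contracting and CSR scores, which the thread's output later makes it possible to discuss.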

  • #2
    Abi:
    in https://www.statalist.org/forums/for...trol-variables you were kindly asked to follow the FAQ recommendations about posting what you typed and what Stata gave you back.
    Following the current road, your queries are at high risk of being left unreplied, since:
    - you performed two different regression models and you do not provide any detail about them (hence, no guesswork can support my reply); nobody with a decent smattering of statistics can be surprised that your results differ.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hello Carlo,

      Thank you for your quick answer, and sorry for my inaccurate post.
      I will describe my case more precisely here.
      I'm doing a master's thesis on the effect of CSR contracting on CSR performance. My model is a multiple regression estimated by OLS.
      I have regressed CSR level (CSRScore2017 in the regression) on CSR contracting (CSRContrat in the regression) with several control variables commonly used in the literature.
      My command was: "reg CSRScore2017 CSRContrat CSRScore2016 Endettement LogVentes RDVentes ROA"
      and I got these results:
      Code:
      Linear regression
      CSRScore2017    Coef.   St.Err.  t-value  p-value  [95% Conf  Interval]  Sig
      CSRContrat      2.365    1.220     1.94    0.057     -0.068      4.798    *
      CSRScore2016    0.770    0.053    14.52    0.000      0.664      0.876  ***
      Endettement     7.522    4.549     1.65    0.103     -1.549     16.593
      LogVentes       0.324    0.439     0.74    0.463     -0.552      1.201
      RDVentes        9.607   15.689     0.61    0.542    -21.676     40.890
      ROA             0.103    0.129     0.80    0.426     -0.154      0.361
      Constant        6.452    9.265     0.70    0.488    -12.022     24.927
      Mean dependent var   69.765   SD dependent var      11.356
      R-squared             0.848   Number of obs         78.000
      F-test               65.976   Prob > F               0.000
      Akaike crit. (AIC)  466.492   Bayesian crit. (BIC) 482.989
      *** p<0.01, ** p<0.05, * p<0.1
      Afterwards I ran the same regression, this time also controlling for industry.
      The command I used was: "reg CSRScore2017 CSRContrat CSRScore2016 Endettement LogVentes RDVentes ROA i.Industrie"
      And I got the following results:
      Code:
      Linear regression
      CSRScore2017    Coef.   St.Err.  t-value  p-value  [95% Conf  Interval]  Sig
      CSRContrat      3.851    1.209     3.19    0.002      1.435      6.267  ***
      CSRScore2016    0.772    0.051    15.04    0.000      0.670      0.875  ***
      Endettement     6.685    4.573     1.46    0.149     -2.454     15.823
      LogVentes       0.249    0.449     0.55    0.582     -0.649      1.146
      RDVentes      -11.747   17.042    -0.69    0.493    -45.803     22.310
      ROA             0.026    0.146     0.18    0.860     -0.266      0.318
      1b.Industrie    0.000        .        .        .          .          .
      2.Industrie    -0.115    2.856    -0.04    0.968     -5.821      5.591
      3.Industrie     1.964    2.375     0.83    0.411     -2.782      6.711
      4.Industrie     2.699    2.534     1.06    0.291     -2.366      7.764
      5.Industrie     9.385    3.297     2.85    0.006      2.795     15.974  ***
      6.Industrie     5.273    2.396     2.20    0.031      0.485     10.061   **
      7.Industrie    -2.135    3.356    -0.64    0.527     -8.842      4.572
      8.Industrie     3.831    3.053     1.25    0.214     -2.270      9.932
      9.Industrie     5.045    2.758     1.83    0.072     -0.467     10.557    *
      Constant        5.010    9.062     0.55    0.582    -13.099     23.119
      Mean dependent var   69.765   SD dependent var      11.356
      R-squared             0.888   Number of obs         78.000
      F-test               35.751   Prob > F               0.000
      Akaike crit. (AIC)  458.490   Bayesian crit. (BIC) 493.841
      *** p<0.01, ** p<0.05, * p<0.1
      Thus, my concern is that I'm not sure how to interpret these results, especially why the coefficient of my independent
      variable CSRContrat (CSR contracting in English) has increased, and the reasoning behind the increase in its significance level.

      Thank you very much,

      Abi



      • #4
        Abi:
        in order to make your posts more readable, please use CODE delimiters when reporting your Stata output (see the FAQ for more details). Thanks.
        I do not consider the increase in CSRContrat's significance particularly relevant, as it was already nearly significant (p = 0.057) in your first regression model.
        That said:
        - have you tested the correctness of your functional form (see -estat ovtest-)?
        - have you tested for heteroskedasticity in the residual distribution (see -estat hettest-)?
        - have you tested the joint significance of -i.Industrie- (see -testparm-)?
        - what does the adjusted R-squared of your models tell you?
        Kind regards,
        Carlo
        (Stata 19.0)
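
        For readers outside Stata: the -testparm- step Carlo suggests is a joint F-test comparing the model with and without the industry dummies. A Python sketch of those mechanics on simulated data (not Abi's dataset; names are hypothetical):

```python
# Joint F-test of a block of regressors (what -testparm- reports),
# via restricted vs. unrestricted residual sums of squares.
import numpy as np

rng = np.random.default_rng(1)
n, k_extra = 200, 3
X_base = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
D = rng.normal(size=(n, k_extra))          # stand-in for industry dummies
y = (X_base @ np.array([1.0, 2.0, -1.0])
     + D @ np.array([0.5, 0.0, 0.3])
     + rng.normal(size=n))

def rss(Z, y):
    """Residual sum of squares of an OLS fit of y on Z."""
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ b
    return r @ r

X_full = np.column_stack([X_base, D])
rss_r, rss_u = rss(X_base, y), rss(X_full, y)
df_num, df_den = k_extra, n - X_full.shape[1]
F = ((rss_r - rss_u) / df_num) / (rss_u / df_den)

# A large F (against F(df_num, df_den)) means the block is jointly significant.
print(F)
```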



        • #5
          Hello Carlo,
          Thanks for your answer!
          So, to test for heteroskedasticity, I did it via an F-test and entered the following commands into Stata after my regression:
          Code:
          predict e, residuals
          gen e2 = e^2
          reg e2 CSRContrat CSRScore2016 Endettement LogVentes RDVentes ROA i.Industrie
          and obtained these results:
          Code:
           Source |       SS           df       MS      Number of obs   =        78
          -------------+----------------------------------   F(14, 63)       =      1.06
                 Model |  7806.40347        14  557.600248   Prob > F        =    0.4090
              Residual |  33115.8267        63  525.648043   R-squared       =    0.1908
          -------------+----------------------------------   Adj R-squared   =    0.0109
                 Total |  40922.2302        77  531.457535   Root MSE        =    22.927
          So I infer that my model does not suffer from heteroskedasticity. Is it right to do it this way?
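
          For readers outside Stata: the manual check above (regressing the squared residuals on the regressors and looking at the overall test) is essentially a Breusch-Pagan/White-type test. A Python sketch of the LM form on simulated homoskedastic data (not Abi's dataset):

```python
# Breusch-Pagan-style heteroskedasticity check: regress squared OLS
# residuals on the regressors; LM = n * R^2 of that auxiliary regression.
import numpy as np

rng = np.random.default_rng(2)
n = 78
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 0.5, -0.2, 0.8]) + rng.normal(size=n)  # constant variance

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ beta) ** 2                       # squared residuals

g, *_ = np.linalg.lstsq(X, e2, rcond=None)     # auxiliary regression
fitted = X @ g
r2_aux = 1 - ((e2 - fitted) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
lm = n * r2_aux                                # compare to chi2 with 3 df

# Under homoskedasticity, lm should be small relative to chi2(3) quantiles.
print(lm)
```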

          Then, yes, I tested the joint significance of i.Industrie using -testparm-. Thank you.



          • #6
            Abi:
            I think you would be more comfortable with -estat hettest- to investigate possible heteroskedasticity in your residual distribution.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Carlo:
              I have done the heteroskedasticity test using -estat hettest- and found no heteroskedasticity.
              Is that enough to infer that my model is robust, or is there any other check I should make?
              Thank you,
              Abi



              • #8
                Abi:
                then you can exclude heteroskedasticity issues.
                That said, what about -estat ovtest-?
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Carlo:
                  About -estat ovtest-: I cannot reject the null hypothesis, so according to the test the model has no omitted-variable bias.
                  After these two tests (no heteroskedasticity and no omitted-variable bias), is it possible to infer that my model is robust?
                  Thank you
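
                  For readers outside Stata: -estat ovtest- is Ramsey's RESET test, which adds powers of the fitted values to the regression and F-tests their joint significance. A Python sketch on simulated, correctly specified data (not Abi's dataset):

```python
# Ramsey RESET: augment the regression with yhat^2, yhat^3, yhat^4 and
# jointly F-test those added terms; a small F means no evidence of
# omitted nonlinearity in the functional form.
import numpy as np

rng = np.random.default_rng(3)
n = 78
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)   # truly linear model

def rss(Z, y):
    """Residual sum of squares of an OLS fit of y on Z."""
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ b
    return r @ r

yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
Z = np.column_stack([X, yhat**2, yhat**3, yhat**4])       # RESET augmentation
rss_r, rss_u = rss(X, y), rss(Z, y)
F = ((rss_r - rss_u) / 3) / (rss_u / (n - Z.shape[1]))

print(F)  # compare to F(3, n - 6)
```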



                  • #10
                    Abi:
                    whenever we talk about robustness, we should first answer the following question: robust to what?
                    As per your test outcomes, you can only state that there is no evidence that your regression suffers from heteroskedasticity or misspecification of the functional form; obviously, both results are really good (in many cases the results of the postestimation tests you performed are a source of concern).
                    Besides, you were brave enough to perform postestimation tests: it is often the case that, even in articles published in top technical journals, you have to take the results of regressions for granted.
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Carlo:
                      Thank you very much for your help, it helped me a lot.
                      Concerning robustness, I was referring to a concept I have seen called robust regression, an alternative to a "normal" regression that deals with samples involving outliers. I am not sure I have understood the concept. Are you familiar with it?
                      Best,
                      Abi



                      • #12
                        Abi:
                        Considering a 10-year timespan, the reduction of posts on -rreg- on this forum makes me think that it has been progressively side-tracked. All in all, unless they are the result of a mistaken data entry, outliers are a matter of fact and sometimes they reflect our imprecise knowledge of the data generating process.
                        That said, I would stick with OLS.
                        Kind regards,
                        Carlo
                        (Stata 19.0)
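
                        For context: robust regression in the -rreg- sense amounts, roughly, to iteratively reweighted least squares with a weight function that downweights large residuals (Stata's -rreg- combines Huber and biweight steps; the sketch below uses Huber weights only, on simulated data):

```python
# Minimal Huber-weight IRLS sketch of robust regression: outliers get
# weights below 1, pulling the fit back toward the bulk of the data.
import numpy as np

def huber_irls(X, y, c=1.345, iters=30):
    """Iteratively reweighted least squares with Huber weights."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)        # start from OLS
    for _ in range(iters):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12       # robust scale (MAD)
        w = np.minimum(1.0, c / (np.abs(r) / s + 1e-12))
        WX = X * w[:, None]                             # weighted design
        beta = np.linalg.solve(WX.T @ X, WX.T @ y)
    return beta

rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[:5] += 25                                             # plant a few outliers
X = np.column_stack([np.ones(n), x])

b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
b_rob = huber_irls(X, y)

# The outliers drag the OLS intercept upward; the robust fit stays near 1.0.
print(b_ols[0], b_rob[0])
```

                        This illustrates the mechanics, not a recommendation; as Carlo says, with outliers that reflect genuine features of the data-generating process, sticking with OLS is a defensible choice.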



                        • #13
                          Carlo:
                          Yes, -rreg- does seem to have been side-tracked, judging by the discussions on this topic.
                          I have a last question, because I'm now considering the potential limitations of my model.
                          One such limitation is, for example, my limited sample (78 observations), which reduces both internal and external validity.
                          I was also considering discussing the fact that my study only takes into account data for one year (effect of CSR contracting in 2016 on CSR level in 2017) and thus can be affected by an unobservable time effect. Do you think this is a relevant point to discuss in the limitations of my model?
                          Best,
                          Abi



                          • #14
                            Abi:
                            internal validity of a given study usually implies randomization, something that is hardly possible outside the medical field.
                            In your case, I would test whether your model is good enough at predicting the observed values and, potentially, how good it can be for out-of-sample prediction (something that is often hard to forecast with reasonable precision).
                            As far as the limitations of your research are concerned, it is obviously wise to highlight that, due to your limited sample size, more research is needed (which may well include panel data regression) to confirm your findings (however, this drawback is partially counterbalanced by the absence of evidence that your regression suffers from heteroskedasticity or misspecification of the functional form).
                            Kind regards,
                            Carlo
                            (Stata 19.0)



                            • #15
                              All right, thank you very much.
                              Furthermore, as I'm "lagging" my dependent variable (CSRScore2017) into a control variable one year earlier (CSRScore2016), can I infer that doing so allows me to control for time-invariant company fixed effects?
