  • Pooled Regression with time variable

    Hi All,

    I have annual data and want to run a pooled OLS regression that includes a time variable among the independent variables, because the dependent variable may increase or decrease over time. Will the following command suffice?

    reg dependent independent1 independent2 c.f_year##c.f_year

    (Here f_year is the financial year.)

  • #2
    Nihar:
    I assume that you have a cross-sectional dataset.
    Therefore, you probably meant OLS (pooled OLS deals with panel datasets).
    That said, your code explores non-linearities between -f_year- and the regressand.
    That makes sense, as the relationship between the two might be quadratic (that is, showing a turning point at some value of -f_year-).
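
    For readers unfamiliar with factor-variable notation, here is a minimal sketch of what -c.f_year##c.f_year- expands to. The variable names are the placeholders from #1, not real data, so this is illustrative only:

    ```stata
    * ## adds the main effect and the interaction; for a variable interacted
    * with itself that means f_year and f_year squared
    regress dependent independent1 independent2 c.f_year##c.f_year

    * Equivalent specification, with the two terms spelled out
    regress dependent independent1 independent2 c.f_year c.f_year#c.f_year
    ```

    Both commands should return the same coefficients; the ## form is simply shorter and lets post-estimation commands such as -margins- know that the two terms belong together.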
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Yes, I meant OLS only, as I have a cross-sectional dataset. The coefficient of f_year is significant, along with all the other independent variables. So, how can we interpret the results?



      • #4
        Nihar:
        how could interested listers ever reply helpfully to your query without seeing what you typed and what Stata gave you back (as per the FAQ)? Thanks.
        Last edited by Carlo Lazzaro; 06 Apr 2023, 11:21.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Carlo-

          Here are the command and the result:
          reg yearly_firm_specific_sd beta_med lev_med c.f_year##c.f_year, robust cluster(co_code)

          Linear regression                               Number of obs     =     19,962
                                                          F(3, 1338)        =          .
                                                          Prob > F          =          .
                                                          R-squared         =     0.1214
                                                          Root MSE          =     .41838

                                           (Std. Err. adjusted for 1,339 clusters in co_code)
          -----------------------------------------------------------------------------------
                            |               Robust
          yearly_firm_spe~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ------------------+----------------------------------------------------------------
                   beta_med |   .1569812   .0118052    13.30   0.000     .1338225    .1801399
                    lev_med |   .1690556   .0147589    11.45   0.000     .1401024    .1980087
                     f_year |  -11.42906    .733983   -15.57   0.000    -12.86895    -9.98918
                            |
          c.f_year#c.f_year |   .0028363   .0001823    15.56   0.000     .0024787    .0031939
                            |
                      _cons |   11515.87   738.8172    15.59   0.000      10066.5    12965.24
          -----------------------------------------------------------------------------------



          • #6
            Originally posted by Nihar Singh
            Yes, I meant OLS only, as I have a cross-sectional dataset. The coefficient of f_year is significant, along with all the other independent variables. So, how can we interpret the results?
            Sorry, I got confused. I do not have cross-sectional data. I have pooled data.



            • #7
              Nihar:
              whenever I hear about pooled data without further details, I consider the risk of ecological regression bias (https://academic.oup.com/ije/article/30/6/1343/651788).
              Obviously, I cannot say whether or not this is the case of your regression.
              That said, there seems to be evidence of a quadratic relationship between -f_year- and the regressand.
              The turning point can be calculated as:
              Code:
              . di 11.42906/(2*.0028363)
              2014.7833
              As an aside, please use CODE delimiters to post what you typed and what Stata gave you back. Thanks.
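
              The turning point follows from setting the derivative of b1*f_year + b2*f_year^2 to zero, which gives f_year = -b1/(2*b2). As a hedged sketch, the same value could be taken from the stored estimates instead of being typed by hand; this assumes the model from #5 is re-run first, and uses -nlcom-, which also reports a delta-method standard error and confidence interval:

              ```stata
              * Re-fit the model from #5 so that its estimates are the active results
              * (vce(cluster co_code) is the current spelling of "robust cluster(co_code)")
              regress yearly_firm_specific_sd beta_med lev_med c.f_year##c.f_year, vce(cluster co_code)

              * Turning point of the quadratic in f_year: -b1/(2*b2)
              nlcom (turning_point: -_b[f_year] / (2*_b[c.f_year#c.f_year]))
              ```

              The confidence interval is useful here because it shows how precisely the turning year is estimated, which a hand calculation with -di- cannot.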
              Last edited by Carlo Lazzaro; 07 Apr 2023, 00:04.
              Kind regards,
              Carlo
              (Stata 19.0)



              • #8
                Carlo: Here is the data example:

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input long co_code str8 date float(f_year yearly_firm_specific_sd lev_med promoters_pct_med beta_med)
                 363 "20060125" 2006  3.062336 1 1 1
                 363 "20070122" 2007 2.7243526 1 1 1
                 363 "20080125" 2008  2.888042 1 1 1
                 363 "20090114" 2009 3.2448754 1 1 1
                 363 "20100108" 2010  3.001153 1 1 0
                 363 "20110117" 2011  2.570871 1 1 1
                 363 "20120117" 2012  2.563627 1 1 1
                 363 "20130122" 2013  2.636189 1 1 1
                 363 "20140101" 2014 2.7847264 1 1 0
                 363 "20150130" 2015   3.44346 1 1 1
                 363 "20160125" 2016 3.1907215 1 1 1
                 363 "20170119" 2017  3.074483 1 1 1
                 363 "20180122" 2018 3.1581826 1 1 1
                 363 "20190111" 2019 3.3134716 0 1 1
                 363 "20200131" 2020  2.863771 0 1 0
                 415 "20100125" 2010  3.195892 0 1 1
                 415 "20110107" 2011  3.100225 0 1 1
                 415 "20120120" 2012  2.545047 0 1 1
                 415 "20130130" 2013 2.7539606 0 1 1
                 415 "20140123" 2014  3.167217 0 1 0
                 415 "20150102" 2015  2.899632 0 1 1
                 415 "20160114" 2016  2.895793 0 1 1
                 415 "20170125" 2017  2.639007 0 1 1
                 415 "20190116" 2019 2.9509964 0 1 1
                 415 "20200122" 2020 3.3354754 0 1 1
                 415 "20210112" 2021  3.163728 0 1 1
                 783 "20040112" 2004  3.142742 0 0 1
                 783 "20050106" 2005  2.834996 0 0 1
                 783 "20060112" 2006  3.020261 1 0 1
                 783 "20070103" 2007  3.110961 0 0 0
                 783 "20080109" 2008 3.1880925 0 0 0
                 783 "20090129" 2009  3.328319 0 0 0
                 783 "20100115" 2010 3.2471824 0 0 1
                 783 "20110131" 2011  2.784959 0 0 1
                 783 "20120106" 2012  2.742421 0 0 1
                 783 "20130107" 2013 2.6407325 0 0 0
                 783 "20140120" 2014  2.555225 0 0 0
                 783 "20150129" 2015  2.726333 0 0 1
                 783 "20160128" 2016   2.60269 0 0 1
                 783 "20170110" 2017  2.661177 0 1 1
                 783 "20180131" 2018   2.75258 0 0 1
                 783 "20190107" 2019  2.474093 0 0 0
                 783 "20200113" 2020 2.3380747 0 0 0
                 783 "20210107" 2021  2.574968 0 0 0
                1120 "20120113" 2012 2.1070302 0 1 0
                1120 "20130116" 2013  2.009795 0 1 0
                1120 "20140131" 2014  2.357927 0 1 0
                1120 "20150105" 2015 2.3628206 0 1 0
                1120 "20160106" 2016 2.2069402 0 1 0
                1120 "20170119" 2017 2.1189635 0 1 0
                1120 "20180130" 2018 1.9640605 0 1 0
                1120 "20190129" 2019 2.0260704 0 1 0
                1120 "20200115" 2020 2.3112993 0 1 0
                1120 "20210115" 2021  2.386069 0 1 0
                2717 "20130103" 2013 2.1386871 1 1 0
                2717 "20140109" 2014  2.161491 1 1 0
                2717 "20150102" 2015 2.1866627 1 1 0
                2717 "20160108" 2016 2.3691595 0 1 0
                2717 "20170117" 2017  2.197305 0 1 0
                2717 "20180122" 2018 2.3840895 0 0 0
                2717 "20190124" 2019  3.058293 0 0 0
                2717 "20200131" 2020  3.560856 0 0 0
                2717 "20210107" 2021  2.941291 0 0 0
                2842 "20050112" 2005 4.6062675 1 1 1
                3335 "20130125" 2013 2.5157194 0 1 0
                3335 "20140123" 2014 2.5657785 0 1 0
                3335 "20150113" 2015 2.9098105 0 1 0
                3335 "20160119" 2016 2.5949035 0 1 1
                3335 "20180130" 2018  2.468572 0 1 1
                3335 "20190108" 2019 2.1381102 1 1 1
                3335 "20200101" 2020  2.780363 0 1 1
                3889 "20170119" 2017  2.762818 0 1 1
                3889 "20180111" 2018 2.9973826 0 0 1
                3889 "20190109" 2019 2.9543955 0 0 1
                3889 "20200131" 2020  3.137166 0 0 1
                3889 "20210105" 2021 3.1984675 0 0 1
                3990 "20050106" 2005 2.6449234 1 1 0
                3990 "20060105" 2006  2.446327 1 1 0
                3990 "20070122" 2007 2.1911693 1 1 0
                3990 "20080107" 2008  2.948964 1 0 0
                3990 "20090112" 2009 2.3711774 1 1 0
                3990 "20100127" 2010 2.8231556 1 1 0
                3990 "20110107" 2011 2.2977307 1 1 0
                3990 "20120117" 2012  2.429293 1 1 0
                3990 "20130103" 2013  2.591575 1 1 0
                3990 "20140107" 2014    2.4969 1 1 1
                3990 "20150128" 2015   2.62027 1 1 1
                3990 "20160118" 2016  2.512878 1 1 1
                3990 "20170119" 2017 2.0524552 1 1 0
                3990 "20180129" 2018 2.0375447 1 1 0
                3990 "20190103" 2019  2.215372 1 1 0
                3990 "20200123" 2020 2.3274434 1 1 0
                3990 "20210127" 2021  3.001243 1 1 0
                3998 "20040116" 2004  2.765778 1 1 1
                3998 "20050106" 2005  2.452104 1 0 1
                3998 "20060116" 2006 2.3279567 1 0 1
                3998 "20070131" 2007   2.48019 1 0 1
                3998 "20080125" 2008  3.192807 1 0 1
                3998 "20090116" 2009 2.3718643 1 0 1
                3998 "20100129" 2010  2.599503 1 0 1
                end
                lev_med, promoters_pct_med, and beta_med are dummy variables.

                I want to run the following regression:
                $$\text{FIRMSPECIFIC}_{it} = \beta_0 + \beta_1 \text{Time}_t + \beta_2 X_{it} + \varepsilon_{it}$$

                I have used the following command:
                Code:
                 reg yearly_firm_specific_sd beta_med promoters_pct_med lev_med c.f_year##c.f_year, robust cluster(co_code)
                The output:

                Linear regression                               Number of obs     =     15,914
                                                                F(4, 1338)        =          .
                                                                Prob > F          =          .
                                                                R-squared         =     0.1323
                                                                Root MSE          =     .42053

                                                 (Std. Err. adjusted for 1,339 clusters in co_code)
                -----------------------------------------------------------------------------------
                                  |               Robust
                yearly_firm_spe~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                ------------------+----------------------------------------------------------------
                          lev_med |    .169065    .014026    12.05   0.000     .1415497    .1965802
                promoters_pct_med |  -.0135231    .014984    -0.90   0.367    -.0429178    .0158717
                         beta_med |   .1440992   .0101884    14.14   0.000     .1241121    .1640863
                           f_year |  -11.45913   .6711533   -17.07   0.000    -12.77576   -10.14251
                                  |
                c.f_year#c.f_year |   .0028434   .0001667    17.06   0.000     .0025164    .0031704
                                  |
                            _cons |   11547.75   675.6166    17.09   0.000     10222.37    12873.14
                -----------------------------------------------------------------------------------


                I have the following questions:

                1. Does my code measure what I intend?
                2. What value should I report for $\beta_1$ (the coefficient of $\text{Time}_t$): the coefficient of f_year or that of c.f_year#c.f_year?



                • #9
                  Nihar:
                  1) the correct way to investigate what you're after rests on -xtreg,fe-:
                  Code:
                  . xtreg yearly_firm_specific_sd beta_med promoters_pct_med lev_med c.f_year##c.f_year, fe cluster(co_code)
                  
                  Fixed-effects (within) regression               Number of obs     =        100
                  Group variable: co_code                         Number of groups  =         10
                  
                  R-squared:                                      Obs per group:
                       Within  = 0.2378                                         min =          1
                       Between = 0.0137                                         avg =       10.0
                       Overall = 0.1068                                         max =         18
                  
                                                                  F(5,9)            =      10.95
                  corr(u_i, Xb) = -0.3475                         Prob > F          =     0.0013
                  
                                                      (Std. err. adjusted for 10 clusters in co_code)
                  -----------------------------------------------------------------------------------
                                    |               Robust
                  yearly_firm_spe~d | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  ------------------+----------------------------------------------------------------
                           beta_med |   .0766676   .0527856     1.45   0.180    -.0427416    .1960769
                  promoters_pct_med |  -.3471832     .11361    -3.06   0.014    -.6041868   -.0901796
                            lev_med |    -.25986   .0764639    -3.40   0.008    -.4328333   -.0868867
                             f_year |  -11.65301   3.832994    -3.04   0.014    -20.32384    -2.98217
                                    |
                  c.f_year#c.f_year |   .0028913   .0009525     3.04   0.014     .0007367     .005046
                                    |
                              _cons |   11744.33   3856.272     3.05   0.014     3020.839    20467.83
                  ------------------+----------------------------------------------------------------
                            sigma_u |  .66928402
                            sigma_e |  .26280095
                                rho |  .86641467   (fraction of variance due to u_i)
                  -----------------------------------------------------------------------------------
                  
                  . di 11.65301/(2*.0028913)
                  2015.1852
                  The interaction of -timevar- with itself shows a quadratic relationship between -f_year- and the regressand.
                  There's an absolute minimum at about 2015: before the minimum the function slopes downwards; after the minimum it slopes upwards.
                  As -sigma_u- > -sigma_e-, there's evidence of a panel-wise effect;
                  2) you should report both the linear and quadratic coefficients for -timevar-;
                  3) I'd test the correct specification of the functional form of the regressand via:
                  Code:
                  . predict fitted, xb
                  
                  . gen sq_fitted=fitted^2
                  
                  . xtreg yearly_firm_specific_sd fitted sq_fitted , fe cluster(co_code)
                  
                  Fixed-effects (within) regression               Number of obs     =        100
                  Group variable: co_code                         Number of groups  =         10
                  
                  R-squared:                                      Obs per group:
                       Within  = 0.2378                                         min =          1
                       Between = 0.0138                                         avg =       10.0
                       Overall = 0.1068                                         max =         18
                  
                                                                  F(2,9)            =      15.59
                  corr(u_i, Xb) = -0.3473                         Prob > F          =     0.0012
                  
                                                 (Std. err. adjusted for 10 clusters in co_code)
                  ------------------------------------------------------------------------------
                               |               Robust
                  yearly_fir~d | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  -------------+----------------------------------------------------------------
                        fitted |   1.049569   1.595373     0.66   0.527    -2.559416    4.658554
                     sq_fitted |  -.0090364   .2913892    -0.03   0.976    -.6682046    .6501318
                         _cons |  -.0673745   2.191324    -0.03   0.976    -5.024493    4.889744
                  -------------+----------------------------------------------------------------
                       sigma_u |  .66913623
                       sigma_e |  .25828202
                           rho |  .87032904   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------
                  
                  .
                  As -sq_fitted- fails to reach statistical significance, there is no evidence of model misspecification;
                  4) Finally, as an aside:
                  your panel regression equation should be written as:
                  Code:
                  $$\text{FIRMSPECIFIC}_{it} = \beta_0 + \beta_1 \text{Time}_t + \beta_2 X_{it} + \varepsilon_{it} + u_i$$
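
                  A minimal sketch of how points 1) and 2) could be reproduced and interpreted, assuming the -dataex- example from #8 is in memory and that each firm appears at most once per year. -xtset- declares the panel structure that -xtreg- needs; -margins- is one way to see the slope of -f_year- change sign around the turning point. The grid of years in at() is an arbitrary choice for illustration:

                  ```stata
                  * Declare the panel structure: firm identifier and yearly time variable
                  xtset co_code f_year

                  * Fixed-effects model with linear and quadratic time terms
                  xtreg yearly_firm_specific_sd beta_med promoters_pct_med lev_med c.f_year##c.f_year, fe vce(cluster co_code)

                  * Slope of the regressand with respect to f_year at selected years;
                  * with a minimum near 2015 it should be negative before and positive after
                  margins, dydx(f_year) at(f_year = (2005(5)2020))
                  ```

                  Reporting the slope at a few years is often easier to read than the two raw coefficients, because the linear coefficient alone is the slope at f_year = 0, far outside the sample.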
                  Kind regards,
                  Carlo
                  (Stata 19.0)



                  • #10
                    Carlo-

                    When I run the regression on the full sample, I get the following results:

                    -sigma_u- < -sigma_e-

                    -sq_fitted- is statistically significant at the 1% level. (Does this mean my model is not correctly specified?)


                    What is the rationale behind including u_i?

                    When I try to use the outreg command, the following issue comes up:
                    Code:
                    Adjusted R-squared (e(r2_a)) not defined; cannot use adjr2 option



                    • #11
                      Nihar:
                      1) you might not have a panel-wise effect with the full sample. Re-run -xtreg- with default standard errors (that do not affect the sigmas) and see whether the F-test at the bottom of the output table reaches statistical significance (panel-wise effect) or not;
                      2) -sq_fitted- (highly) statistically significant = model misspecification (the right-hand side of your regression equation may need more predictors and/or interactions);
                      3) the panel error is composed of epsilon (which varies across panel units and time) and u_i (which varies across panels only; that is, all observations belonging to the same panel share the very same u_i value).
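
                      A hedged sketch of the check described in 1), assuming the full sample is in memory and that co_code and f_year uniquely identify observations. With default standard errors, -xtreg, fe- prints an "F test that all u_i=0" line below the coefficient table; the -display- line simply retrieves that statistic from the stored results:

                      ```stata
                      xtset co_code f_year

                      * Default (non-clustered) standard errors, so the F test of u_i = 0 is reported
                      xtreg yearly_firm_specific_sd beta_med promoters_pct_med lev_med c.f_year##c.f_year, fe

                      * Same test from the stored results: F statistic and its p-value
                      display "F(u_i = 0) = " e(F_f) "   p = " Ftail(e(df_a), e(df_r), e(F_f))
                      ```

                      A small p-value points to a panel-wise effect, in which case the fixed-effects model is preferable to pooled OLS.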
                      Kind regards,
                      Carlo
                      (Stata 19.0)



                      • #12
                        Okay, I will try with default standard errors.

                        Thank you, Carlo, for all the help.



                        • #13
                          Carlo:

                          One more question regarding the regression.

                          Code:
                          $$\text{FIRMSPECIFIC}_{it} = \beta_0 + \beta_1 \text{Time}_t + \beta_2 X_{it} + \varepsilon_{it}$$
                          Shall I include f_year as the independent variable for time, or the interaction of f_year with itself?



                          • #14
                            Nihar:
                            as the interaction is another predictor, you should include both terms in the right-hand side of your regression equation: the linear term for -f_year- and its interaction with itself (the squared term), each with its own beta.
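
                            Written out for the specification fitted in this thread, the equation would look roughly as follows. This is a sketch in the notation of #8 and #9; renumbering the coefficient on X to $\beta_3$ is an assumption made here to make room for the squared term:

                            ```latex
                            \text{FIRMSPECIFIC}_{it} = \beta_0 + \beta_1 \text{Time}_t + \beta_2 \text{Time}_t^2 + \beta_3 X_{it} + u_i + \varepsilon_{it},
                            \qquad
                            \frac{\partial\, \text{FIRMSPECIFIC}_{it}}{\partial\, \text{Time}_t} = \beta_1 + 2\beta_2 \text{Time}_t = 0
                            \;\Rightarrow\;
                            \text{Time}^{*} = -\frac{\beta_1}{2\beta_2}
                            ```

                            With the estimates from #9 ($\beta_1 = -11.65301$, $\beta_2 = .0028913$) this reproduces the turning point of about 2015.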
                            Kind regards,
                            Carlo
                            (Stata 19.0)



                            • #15
                              Okay. Thank you once again, Carlo!
