  • Box Cox regression (Lambda model) result shows 0.000 standard errors for estimated coefficients and Sigma constant

    Hello all,

    I am Florence Ijagbone. I am working on the social welfare function as part of my PhD programme on income inequality in Nigeria.

    I want to know whether it is normal for a Box-Cox regression (lambda model) to report 0.000 standard errors for the coefficients and the sigma constant.

    Also, how do I interpret the sigma constant in the result?

    The regression result is shown below in the original output and tabular form:

                                                            Number of obs =     66
                                                            LR chi2(2)    =  16.99
    Log likelihood = -270.79042                             Prob > chi2   =  0.000

    ------------------------------------------------------------------------------
    socialwelf~e | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
         /lambda |   .3419019   .3273188     1.04   0.296    -.2996312    .9834351
    ------------------------------------------------------------------------------

    Estimates of scale-variant parameters
    ----------------------------
                 | Coefficient
    -------------+--------------
    Notrans      |
           _cons |   6.590964
    -------------+--------------
    Trans        |
      meanincome |   .0389765
    ginicomple~t |   .1702596
    -------------+--------------
          /sigma |   .9716801
    ----------------------------

    ---------------------------------------------------------
       Test         Restricted     LR statistic
        H0:       log likelihood       chi2       Prob > chi2
    ---------------------------------------------------------
    lambda = -1     -283.81911        26.06           0.000
    lambda =  0      -271.4806         1.38           0.240
    lambda =  1      -271.8932         2.21           0.138
    ---------------------------------------------------------


    Social Welfare Function
    ------------------------------------
    socialwelfare
    ------------------------------------
    Intercept 6.591
    (0.000)
    Mean Income 0.039
    (0.000)
    Gini Complement 0.170
    (0.000)
    Intercept 0.342
    (0.327)
    Intercept 0.972
    (0.000)
    Number of observations 66
    ------------------------------------

    Thank you for your kind contribution.

    Florence Ijagbone

  • #2
    Please post your data and the complete code you're using. You are using at least one and possibly several commands here that aren't boxcox.


    • #3
      Thank you very much.

      The data is declared as panel data: xtset Regnam year

      The code is: boxcox socialwelfare meanincome ginicomplement, model(lambda)

      Here is the data:
      year socialwelfare meanincome ginicomplement Regnam
      2010 53.1 7027.711 10 1
      2011 53.1 7027.711 10 1
      2012 43.7 4282.343 10 1
      2013 43.7 4282.343 10 1
      2014 43.7 4282.343 10 1
      2015 57.3 2521.568 10 1
      2016 57.3 2521.568 10 1
      2017 57.3 2521.568 10 1
      2018 60.2 3809.644 10 1
      2019 60.2 3809.644 10 1
      2020 60.2 3809.644 10 1
      2010 37.1 1862.475 10.31117 2
      2011 37.1 1862.475 10.31117 2
      2012 41.9 2040.182 10 2
      2013 41.9 2040.182 10 2
      2014 41.9 2040.182 10 2
      2015 49.1 1820.654 10 2
      2016 49.1 1820.654 10 2
      2017 49.1 1820.654 10 2
      2018 49.9 1162.753 10 2
      2019 49.9 1162.753 10 2
      2020 49.9 1162.753 10 2
      2010 27.9 2522.379 11.04641 3
      2011 27.9 2522.379 11.04641 3
      2012 46.3 2142.13 10.01859 3
      2013 46.3 2142.13 10.01859 3
      2014 46.3 2142.13 10.01859 3
      2015 55.1 1412.37 10 3
      2016 55.1 1412.37 10 3
      2017 55.1 1412.37 10 3
      2018 60.4 1660.765 10 3
      2019 60.4 1660.765 10 3
      2020 60.4 1660.765 10 3
      2010 79.5 4238.538 19.35187 4
      2011 79.5 4238.538 19.35187 4
      2012 92 5505.879 16.91941 4
      2013 92 5505.879 16.91941 4
      2014 92 5505.879 16.91941 4
      2015 80.8 2599.007 10 4
      2016 80.8 2599.007 10 4
      2017 80.8 2599.007 10 4
      2018 76.9 3915.357 10 4
      2019 76.9 3915.357 10 4
      2020 76.9 3915.357 10 4
      2010 76.3 20877.88 12.94279 5
      2011 76.3 20877.88 12.94279 5
      2012 63.7 10970.05 17.11206 5
      2013 63.7 10970.05 17.11206 5
      2014 63.7 10970.05 17.11206 5
      2015 81.2 5628.83 10 5
      2016 81.2 5628.83 10 5
      2017 81.2 5628.83 10 5
      2018 76.5 3793.816 10 5
      2019 76.5 3793.816 10 5
      2020 76.5 3793.816 10 5
      2010 74.8 8561.696 16.6276 6
      2011 74.8 8561.696 16.6276 6
      2012 75.7 16078.71 19.8012 6
      2013 75.7 16078.71 19.8012 6
      2014 75.7 16078.71 19.8012 6
      2015 83.9 5609.677 10 6
      2016 83.9 5609.677 10 6
      2017 83.9 5609.677 10 6
      2018 82.7 7187.508 10.91311 6
      2019 82.7 7187.508 10.91311 6
      2020 82.7 7187.508 10.91311 6


      • #4
        Thanks for the detail. I was wrong in #2 to guess there is more here than boxcox results. Sorry about that.

        Now the question seems to be whether the model you're testing makes economic as well as statistical sense, and I yield to economists on that one.


        • #5
          Thanks Cox.

          There is nothing wrong with the model. Agreed, the variables are not normally distributed, hence the use of the Box-Cox (lambda) regression.

          I only want to know:
          (1) why the regression returns 0.000 for the standard errors of the variables, and whether that has anything to do with normalization by the Box-Cox regression technique; and
          (2) how to interpret the sigma coefficient of 0.972 in the result.

          I repost the regression below as it appeared in the Stata Results window:

          boxcox socialwelfare meanincome ginicomplement, model(lambda)
          Fitting comparison model

          Iteration 0: log likelihood = -279.397
          Iteration 1: log likelihood = -279.28442
          Iteration 2: log likelihood = -279.28434
          Iteration 3: log likelihood = -279.28434

          Fitting full model

          Iteration 0: log likelihood = -271.8932
          Iteration 1: log likelihood = -270.80988
          Iteration 2: log likelihood = -270.79061
          Iteration 3: log likelihood = -270.79042
          Iteration 4: log likelihood = -270.79042

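                                                                  Number of obs =     66
                                                                  LR chi2(2)    =  16.99
          Log likelihood = -270.79042                             Prob > chi2   =  0.000

          ------------------------------------------------------------------------------
          socialwelf~e | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
               /lambda |   .3419019   .3273188     1.04   0.296    -.2996312    .9834351
          ------------------------------------------------------------------------------

          Estimates of scale-variant parameters
          ----------------------------
                       | Coefficient
          -------------+--------------
          Notrans      |
                 _cons |   6.590964
          -------------+--------------
          Trans        |
            meanincome |   .0389765
          ginicomple~t |   .1702596
          -------------+--------------
                /sigma |   .9716801
          ----------------------------

          ---------------------------------------------------------
             Test         Restricted     LR statistic
              H0:       log likelihood       chi2       Prob > chi2
          ---------------------------------------------------------
          lambda = -1     -283.81911        26.06           0.000
          lambda =  0      -271.4806         1.38           0.240
          lambda =  1      -271.8932         2.21           0.138
          ---------------------------------------------------------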


          I would appreciate any input on these questions.

          Thank you.


          • #6
            You have 66 total observations, and those aren't independent because you only have 6 cross-sectional units and 11 time periods. You shouldn't be using a statistical procedure that is only ever justified with asymptotics. My recommendation is to use the log of the welfare measure and use a linear model. Unfortunately, with N = 6, T = 11, adjusting your standard errors for serial correlation isn't really possible. I'd probably use user-written xtscc with one lag as an attempt. But when you have a small data set you need to restrain yourself from doing anything fancy.
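
            A minimal sketch of the log-linear alternative suggested here, using the variable names from #3. The name lnwelfare is hypothetical, leaving the regressors in levels is an assumption (the post does not say how they should enter), and xtscc is user-written, so it has to be installed first (for example with ssc install xtscc).

            ```stata
            * Declare the panel structure, as in #3
            xtset Regnam year

            * Log of the welfare measure (lnwelfare is a hypothetical name)
            generate double lnwelfare = ln(socialwelfare)

            * Plain linear model on the log scale; regressors in levels is an assumption
            regress lnwelfare meanincome ginicomplement

            * Driscoll-Kraay standard errors with one lag (user-written xtscc)
            xtscc lnwelfare meanincome ginicomplement, lag(1)
            ```

            With only 6 units and 11 periods, the xtscc standard errors are at best a rough attempt, as noted above.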


            • #7
              Note that the 0.000 outputs are P-values, not standard errors.


              • #8
                1. In Stata I have a dataset of 500 records named "delivery kit". It has three forms: the "enrollment form", the "follow-up form", and the "delivery form". Each enrollment should have three follow-ups: the first on the 7th day after enrollment, the second on the 14th day, and the third on the 28th day. I want to find, for every enrollment, how many follow-ups were done and on which day after enrollment. What command should I use in Stata?

                2. In the same data of 500 records, one variable is "date of last menstrual period" and another is "date of outcome or date of delivery". I want to compute the difference between the date of delivery and the date of the last menstrual period and store it in a new variable. What command should I use in Stata?


                • #9
                  Amna Ghaffar: Please start a new thread.


                  • #10
                    Thanks Jeff and Nick.

                    Agreed that the size of the data set leaves little room for manipulation, which is a major challenge. The log-linear regressions I have already tried produced residuals that are not normally distributed, hence the Box-Cox technique. The Shapiro-Wilk W test (swilk) of the residuals is good with the result of the Box-Cox technique.

                    However, when the result of the Box-Cox regression is tabulated for export to MS Word or Excel at the 95% confidence level, the estimated coefficients come with (0.000) below them, and earlier tabulations came with (.). I want to know what they, (0.000) or (.), signify in the tabulated result. I also want to know what the sigma coefficient in the Box-Cox regression result measures. I repost the tabulated result below:


                    Social Welfare Function
                    ------------------------------------
                    socialwelfare
                    ------------------------------------
                    Intercept 6.591
                    (0.000)
                    Mean Income 0.039
                    (0.000)
                    Gini Complement 0.170
                    (0.000)
                    Intercept 0.342
                    (0.327)
                    Intercept 0.972 This is the sigma coefficient
                    (0.000)
                    Number of observations 66






                    • #11
                      The logic of preferring Box-Cox because the residuals in the log(Y) regression don't look normally distributed makes little sense. All of the Box-Cox statistics are based on asymptotic analysis. It's a nonlinear MLE. So, if you're willing to rely on asymptotic approximations for the Box-Cox estimation, then why not for a standard regression with log(Y) as the dependent variable? We know normality of the errors is only needed for exact inference. At least you have a chance of unbiasedness with log(Y) as the dependent variable, and the usual standard errors might be approximately correct. Who knows with Box-Cox applied to N = 6, T = 11?

                      Having said that, your estimate of lambda is .342 with a 95% CI (believing all of the model assumptions and the asymptotics) from -.2996312 to .9834351. The p-value for testing the null that lambda is zero is 0.296. In other words, H0: lambda = 0 cannot be rejected at anything close to the 5% level. So there's no good argument for rejecting log(Y) based on the formal tests.
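
                      The numbers quoted above can be reproduced by hand from the output in #5 with Stata's display command. This is only an arithmetic check under the same asymptotic assumptions, not an additional test.

                      ```stata
                      * Wald z statistic and two-sided p-value for H0: lambda = 0
                      display .3419019/.3273188
                      display 2*(1 - normal(.3419019/.3273188))

                      * 95% confidence limits for lambda
                      display .3419019 - invnormal(0.975)*.3273188
                      display .3419019 + invnormal(0.975)*.3273188

                      * LR test of lambda = 0: twice the gap between the full and
                      * restricted log likelihoods, referred to chi-squared with 1 df
                      display 2*(-270.79042 - (-271.4806))
                      display chi2tail(1, 2*(-270.79042 - (-271.4806)))
                      ```

                      The Wald p-value (0.296) and the LR p-value for lambda = 0 (0.240) differ because they are different tests of the same null; neither comes close to rejecting it.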

                      In addition, you're using the version of Box-Cox that transforms the x variables using the same lambda as the dependent variable. Is that what you want? The estimated coefficients will be very difficult to interpret.
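
                      To make the point concrete, here is a sketch of what model(lambda) does, holding lambda fixed at the estimate reported in #5. It assumes all three variables are strictly positive (they are in the data posted in #3); the scalar lam and the bc_* names are hypothetical. With lambda held fixed, OLS on the transformed variables should give coefficients close to the scale-variant estimates, and /sigma should be close to the estimated standard deviation of the error on the transformed scale. The standard errors that regress prints here ignore the fact that lambda was estimated, so they should not be used for inference.

                      ```stata
                      * Box-Cox transform (x^lambda - 1)/lambda at the reported lambda
                      scalar lam = .3419019
                      generate double bc_welfare = (socialwelfare^scalar(lam) - 1)/scalar(lam)
                      generate double bc_income  = (meanincome^scalar(lam) - 1)/scalar(lam)
                      generate double bc_gini    = (ginicomplement^scalar(lam) - 1)/scalar(lam)

                      * OLS on the transformed variables, lambda treated as known
                      regress bc_welfare bc_income bc_gini

                      * Error SD on the transformed scale, sqrt(RSS/N); compare with /sigma
                      display sqrt(e(rss)/e(N))
                      ```

                      Each slope is then the change in transformed welfare per unit change in a transformed regressor, which is why the coefficients are hard to read on the original scale.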


                      • #12
                        I had initially used xtscc (the Driscoll-Kraay estimator), which displays SEs clearly, but I eventually opted for boxcox because the Shapiro-Wilk W test (swilk) showed that the variables are not normally distributed. I hope it is safe to infer that the boxcox command in Stata does not show SEs for the estimated coefficients of the variables.

                        Thank you so much for the robust contributions and suggestions.


                        • #13
                          #12 It's not safe to assume or infer any such thing. If you want to continue your current question, I suggest you start a new thread about what the method you used to export results does with boxcox results.

                          In essence, however, if you are opting for boxcox you're going against well-founded advice, so that's what it is.
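
                          On the export question, it may help to look first at what boxcox itself stores. A sketch, using the command from #3: list the coefficient vector and the variance matrix, then refit with the lrtest option to request likelihood-ratio tests for the regressors. If the variance entries for the scale-variant parameters turn out to be zero or missing, an export command that builds standard errors from e(V) would be expected to print (0.000) or (.) for them. That is a conjecture to check against your own output, not a confirmed diagnosis of the table in #10.

                          ```stata
                          * Refit the model from #3
                          boxcox socialwelfare meanincome ginicomplement, model(lambda)

                          * Stored coefficient vector and variance-covariance matrix
                          matrix list e(b)
                          matrix list e(V)

                          * Likelihood-ratio tests for the regressors
                          boxcox socialwelfare meanincome ginicomplement, model(lambda) lrtest
                          ```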


                          • #14
                            Thanks Nick.

                            I need to know how to interpret the 0.000 attached to the estimated coefficients from the boxcox regression in Stata. Whether or not boxcox is appropriate does not remove the need to be able to interpret the regression output, which is my primary concern here. From #7 I understand that it is a p-value, not an SE, which is not the usual way of presenting regression output.

                            Your suggestion is well-taken.
