
  • Estimating NLS

    Dear All,
    What could be the reason(s) for this undesired result? (Please see the code and results below.) In the model, y is the dependent variable and the only control is its lag, L.y. The parameter "g" is the coefficient on the endogenous regressor, and I am raising it to the power of a variable called 'gap'. This variable is the number of years separating consecutive observed years in an irregularly spaced panel, e.g. between 1992 and 1996, gap=4; between 1996 and 2004, gap=8; and so on. "m" indexes the observed periods. For instance, if between 1970 and 2010 data were collected in only 19 of the years, then m=1, ..., 19 while the actual T=40. In this case, data are missing for all units in 40-19 = 21 years. If the gaps are ignored, the panel will have T=19, but ignoring them will bias the results, which I want to avoid. It is for this reason that I am accounting for the gaps by trying to use an NLS fixed-effects estimator.
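
    For readers following along, here is a minimal sketch of how variables like these might be built. The names id and year are assumptions for illustration; only m, gap and y are described in this post.

    ```stata
    * Assumed names: id = panel identifier, year = calendar year of the survey
    bysort id (year): gen m = _n                   // index of the observed period within each panel
    bysort id (year): gen gap = year - year[_n-1]  // years since the previous observed survey (missing when m == 1)
    xtset id m                                     // with m as the time index, L.y is the previous *observed* period
    ```

    Declaring m rather than year as the time variable is what makes L.y refer to the previous survey instead of the previous calendar year.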

    Help will be much appreciated.

    Thanks,
    Dapel

    Code:
    . nl (y=l.y*{g}^gap) if m>1
    (obs = 432)
    
    Iteration 0:  residual SS =  1.64e+08
    
          Source |       SS       df       MS
    -------------+------------------------------         Number of obs =       432
           Model |  -141461844    -1   141461844         R-squared     =   -6.2283
        Residual |   164174641   432   380033.89         Adj R-squared =   -6.2116
    -------------+------------------------------         Root MSE      =  616.4689
           Total |  22712796.1   431  52697.9028         Res. dev.     =  6776.306
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              /g |          0  (constrained)
    ------------------------------------------------------------------------------
      Parameter g taken as constant term in model & ANOVA table
    Last edited by Zuhumnan Dapel; 15 Nov 2014, 14:40.

  • #2
    The initial value for any parameter in -nl- that is not user-specified is zero. I think that g=0 is a really bad starting point for this model. You'd probably do much better initializing g to 1. Just a guess.
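
    A minimal sketch of what I mean, using the -nl- syntax for starting values. With g=0 the term g^gap and its derivative gap*g^(gap-1) are both zero for any observation with gap > 1, so the optimizer has little to work with; I have not run this on your data.

    ```stata
    * Start g at 1 instead of the default 0
    nl (y = L.y*{g=1}^gap) if m>1

    * Same starting value, supplied through the initial() option instead
    nl (y = L.y*{g}^gap) if m>1, initial(g 1)
    ```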

    That said, what about a different approach? It looks like you are trying to fit what is, in effect, a compound interest model to the data. So why not log transform everything? Then it becomes a linear model: log(y) = gap*log(g) + log(L.y)

    Code:
    gen log_y = log(y)
    gen log_lag_y = log(L.y)
    regress log_y gap log_lag_y
    test log_lag_y = 1, coef
    The -test- command will report the regression coefficients with the coefficient of log_lag_y constrained to 1. Then you just have to exponentiate the coefficient of gap to get your estimate of g.



    • #3
      Dear Clyde, What a great relief this is to me, thanks a million! However, this is the equation I have been wanting to estimate: y_{im} = g^{gap} * y_{i,m-1} + e_{im}, i = 1, ..., N, m = 1, ..., M, where M = 19 is the total number of observed years between 1970 and 2010. The parameter of interest is the coefficient of the lagged variable, y_{i,m-1}, which is subject to a non-linear restriction through the gap size. The reason is to account for the intervals between the surveys. In the log-transformed version you suggested, the lagged variable seems to be unassociated with a parameter.
      Last edited by Zuhumnan Dapel; 16 Nov 2014, 17:40.



      • #4
        Yes, the lagged variable is unassociated with a parameter in the log-transformed version. Ignore the error terms and take logarithms of both sides. Then log(g) becomes the coefficient of gap, and L.y becomes an added term with (implicit) coefficient 1. That is precisely the model that will be estimated by the code I gave you. The difference from your original model is that the error terms are additive on the log scale, and multiplicative in the original metric. But given the nature of your model, I would think that would be a feature, not a problem.
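
        Spelled out, as a sketch that assumes the error enters multiplicatively (the symbol \varepsilon_{im} is my notation here, not something from your post):

        ```latex
        y_{im} = g^{\,gap_{im}} \, y_{i,m-1} \, \varepsilon_{im}
        \quad\Longrightarrow\quad
        \log y_{im} = \underbrace{\log(g)}_{\text{coef. of gap}} \cdot gap_{im}
                    + \underbrace{1}_{\text{fixed}} \cdot \log y_{i,m-1}
                    + \log \varepsilon_{im}
        ```

        There is no intercept on the right-hand side, and the only free parameter is log(g).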



        • #5
          Thanks! I'm trying to grasp what this means: "the error terms are additive on the log scale, and multiplicative in the original metric". And can we use the estimated log(g) as a speed-of-convergence parameter (or autoregressive parameter) in measuring the impact of previous log(y) on current log(y)?
          Last edited by Zuhumnan Dapel; 17 Nov 2014, 03:48.



          • #6
            "the error terms are additive on the log scale, and multiplicative in the original metric"

            This only means that when you take logs, you are assuming y = x*eps, so that ln(y) = ln(x) + ln(eps),
            instead of the original metric, y = x + eps, for which ln(y) = ln(x + eps).

            BTW, since you are doing autoregressive modelling, you should also check the stationarity condition using a unit-root test such as the Phillips-Perron (PP), Dickey-Fuller, augmented Dickey-Fuller (ADF), or DF-GLS test.
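
            A rough sketch of the corresponding Stata commands. The lag choices are placeholders, and the single-series tests need a regularly spaced series, so they may not apply directly to data with gaps like those in this thread.

            ```stata
            * Single, regularly spaced series (assumes the data are already -tsset-)
            dfuller y, lags(1)     // augmented Dickey-Fuller with one lag
            pperron y              // Phillips-Perron
            dfgls y                // DF-GLS

            * Panel data (assumes the data are -xtset-): Fisher-type test built on ADF
            xtunitroot fisher y, dfuller lags(0)
            ```

            With only a handful of observations per panel, none of these tests will have much power.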


            • #7
              Thanks Lui



              • #8
                Dear Clyde,
                This is what I got with some of the codes you gave.
                Code:
                . gen log_y = log(y)
                
                . gen log_lag_y = log(L.y)
                (108 missing values generated)
                
                . regress log_y gap log_lag_y
                
                      Source |       SS       df       MS              Number of obs =     432
                -------------+------------------------------           F(  2,   429) =  123.24
                       Model |  22.9499511     2  11.4749756           Prob > F      =  0.0000
                    Residual |  39.9449164   429  .093111693           R-squared     =  0.3649
                -------------+------------------------------           Adj R-squared =  0.3619
                       Total |  62.8948675   431  .145927767           Root MSE      =  .30514
                
                ------------------------------------------------------------------------------
                       log_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         gap |   .1381267   .0108975    12.68   0.000     .1167075     .159546
                   log_lag_y |   .4509374   .0310385    14.53   0.000      .389931    .5119439
                       _cons |   2.507871    .243472    10.30   0.000     2.029324    2.986417
                ------------------------------------------------------------------------------
                Can we refer to the coefficient of gap, 0.138, as the speed of convergence or the autoregressive parameter? Otherwise I may have to go back to the initial -nl- estimation, using 1 as the initial value. What do you think? What I observe is that the transformation dampens the value of the parameter.

                That said, one may already be working with the natural log of consumption. In that case, would one have to take the log of a log? How plausible is that? It is also the case that when the autoregressive parameter is estimated using non-logged values it is <1, but when logged it becomes >=1.



                • #9
                  Those are not the results you want. You forgot to run the last, crucial, line of code:

                  Code:
                  test log_lag_y = 1, coef
                  That line will re-evaluate the regression constraining the coefficient of log_lag_y to 1. Without that constraint, the coefficient of gap is not what you are looking for. The output of the test command will include a new regression table, and the coefficient of gap in that table will be what you want.
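
                  If it helps, here is a sketch of two other ways to impose the same constraint. The variable name dlog_y is made up for illustration, and while the point estimate for gap should agree with the -test, coef- table, the reported standard errors may differ slightly.

                  ```stata
                  * Impose the constraint directly at estimation time
                  constraint define 1 log_lag_y = 1
                  cnsreg log_y gap log_lag_y, constraints(1)

                  * Equivalent point estimate: move the constrained term to the left-hand side
                  gen double dlog_y = log_y - log_lag_y
                  regress dlog_y gap
                  ```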



                  • #10
                    Thanks. I did run it; I just did not paste it here because there is not much difference between the values.
                    However, here it is:
                    Code:
                    . test log_lag_y = 1, coef
                    
                     ( 1)  log_lag_y = 1
                    
                           F(  1,   429) =  312.93
                                Prob > F =    0.0000
                    
                    
                    Constrained coefficients
                    
                    ------------------------------------------------------------------------------
                                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             gap |   .2390388   .0092852    25.74   0.000     .2208402    .2572374
                       log_lag_y |          1          .        .       .            .           .
                           _cons |  -1.676763    .057613   -29.10   0.000    -1.789682   -1.563843
                    ------------------------------------------------------------------------------



                    • #11
                      So your estimate of the parameter you referred to as g in your original post is exp(0.2390388) = 1.27 (to 2 decimal places.) If you want a confidence interval around it, you can just exponentiate the confidence interval for the gap coefficient itself.
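
                      As a quick sketch, using the numbers from the constrained table in #10:

                      ```stata
                      display exp(.2390388)   // point estimate of g
                      display exp(.2208402)   // lower 95% confidence limit
                      display exp(.2572374)   // upper 95% confidence limit
                      ```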



                      • #12
                        Thanks. You are making it clearer for me. In dynamic panel data analysis, the coefficient of the endogenous covariate is <1. How is it that the result we have just got, 1.27, is >1?



                        • #13
                          Hmm. I see I made an error in translating between the algebraic equation log(y) = gap*log(g) + log(L.y) and the regression code. I should have specified that there be no constant term in the model. So the code should have been

                          Code:
                          regress log_y gap log_lag_y, nocons
                          test log_lag_y = 1, coef
                          Sorry for the confusion.

                          There is another issue here, also not dealt with by this code. At the start of this thread, there was no mention of panel data. And I had in mind that this was just a single time series with gaps. If this is panel data, then we can't use -regress- here because it will fail to capture the within-panel-unit nature of what we are trying to do. Then we need -xtreg, fe-. It then also gets a little more complicated because -xtreg, fe- does not allow the -nocons- option. So if we are in a panel data situation it has to be:

                          Code:
                          xtreg log_y gap log_lag_y, fe
                          test (_cons = 0) (log_lag_y = 1), coef



                          • #14
                            Thanks for such a humble response. Here is what I got
                            Code:
                            . xtreg log_y gap log_lag_y, fe
                            
                            Fixed-effects (within) regression               Number of obs      =       432
                            Group variable: id                              Number of groups   =       108
                            
                            R-sq:  within  = 0.3431                         Obs per group: min =         4
                                   between = 0.4441                                        avg =       4.0
                                   overall = 0.3646                                        max =         4
                            
                                                                            F(2,322)           =     84.09
                            corr(u_i, Xb)  = 0.0946                         Prob > F           =    0.0000
                            
                            ------------------------------------------------------------------------------
                                   log_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                     gap |   .1321363   .0114847    11.51   0.000     .1095418    .1547308
                               log_lag_y |   .4183435    .036233    11.55   0.000     .3470602    .4896267
                                   _cons |   2.756283   .2821837     9.77   0.000     2.201126    3.311439
                            -------------+----------------------------------------------------------------
                                 sigma_u |  .14976866
                                 sigma_e |  .30750083
                                     rho |   .1917356   (fraction of variance due to u_i)
                            ------------------------------------------------------------------------------
                            F test that all u_i=0:     F(107, 322) =     0.94            Prob > F = 0.6449
                            
                            . test (_cons = 0) (log_lag_y = 1), coef
                            
                             ( 1)  _cons = 0
                             ( 2)  log_lag_y = 1
                            
                                   F(  2,   322) =  545.90
                                        Prob > F =    0.0000
                            Warning:  variance matrix is nonsymmetric or highly singular
                            
                            
                            Constrained coefficients
                            
                            ------------------------------------------------------------------------------
                                         |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                     gap |  -.0222749          .        .       .            .           .
                               log_lag_y |          1          .        .       .            .           .
                                   _cons |   1.95e-14          .        .       .            .           .
                            ------------------------------------------------------------------------------
                            
                            .
                            end of do-file
                            Now we have a coefficient <1, i.e. exp(-.0222749) = 0.98 (to 2 decimal places), but no standard error or CI.
                            Is there no way to estimate NLS with panel fixed effects based on y_{im} = g^{gap} * y_{i,m-1} + e_{im}, i = 1, ..., N, m = 1, ..., M?
                            Last edited by Zuhumnan Dapel; 17 Nov 2014, 18:20.



                            • #15
                              Well, that didn't work very well! And as I think about it more deeply, simply setting the constant term in the fixed effects regression to zero does not also set the fixed effects themselves to zero, which is what is required here. So that model was also mis-specified. Second helping of humble pie.

                              I don't know of any way to do non-linear estimation with fixed effects in Stata.

                              I do have one more approach to doing it with the log-transformed data. If we de-mean all the variables involved and then use -regress-, then we are capturing the within-panel-unit variation. So give this a try:

                              Code:
                              // DE-MEAN THE VARIABLES
                              drop if missing(log_y, gap, log_lag_y) // RESTRICT TO ESTIMATION SAMPLE
                              foreach v of varlist log_y gap log_lag_y {
                                  egen mean_`v' = mean(`v'), by(panel_var)
                                  gen dm_`v' = `v' - mean_`v'
                              }
                              
                              // NOW -regress, nocons- GIVES A WITHIN panel_var REGRESSION
                              regress dm_log_y dm_gap dm_log_lag_y, nocons
                              test dm_log_lag_y = 1, coef
                              And, again, when you're done, the exponential of the coefficient of dm_gap shown in the -test- output is your estimate of {g} in the original problem.
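
                              A sketch of an equivalent formulation that moves the constrained term to the left-hand side, so no -test- step is needed. The variable names dm_diff and diff are made up for illustration, and I am assuming the panel identifier is id, as in your -xtreg- output. Keep in mind that -regress- on de-meaned data does not know that one mean per panel was removed, so its standard errors will be somewhat too small; the -xtreg, fe- version makes that degrees-of-freedom adjustment.

                              ```stata
                              * De-meaned version: the coefficient of dm_gap is the estimate of log(g)
                              gen double dm_diff = dm_log_y - dm_log_lag_y
                              regress dm_diff dm_gap, nocons
                              display "g = " exp(_b[dm_gap])

                              * Same within estimate, with standard errors adjusted for the panel means
                              gen double diff = log_y - log_lag_y
                              xtreg diff gap, fe
                              display "g = " exp(_b[gap])
                              ```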
