
  • Questions about type mismatch error

    Hello Statalists,

    I am using Stata 16.1 on Windows 10. I am confused by a problem: when I run a while loop in Stata, I get a type mismatch error.
    What I want to do is run a three-year rolling-window regression.
    A random sample of my raw data is shown below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(LPERMNO fyear) float(eps bps acps) double PRC float(lbps lPRC feps1)
    10031 1986  .29556468   2.779724          .               4.75         .       . -.19567648
    10031 1987 -.19567648   2.848677  .31047335               3.25  2.779724    4.75  -2.921357
    10031 1988  -2.921357 -.07230712 -2.1386507              1.125  2.848677    3.25          .
    54594 1986  1.4579537  12.747722          .              36.25         .       .  1.3379332
    54594 1987  1.3379332   9.483112  1.3898554                 26 12.747722   36.25   1.550003
    54594 1988   1.550003   10.71772  1.5441314             36.125  9.483112      26  1.5952618
    54594 1989  1.5952618  11.786345   .3907474              12.25  10.71772  36.125   .9314077
    54594 1990   .9314077  12.194198  -1.390095              13.75 11.786345   12.25   .6302283
    54594 1991   .6302283  12.374174  .07585382              11.75 12.194198   13.75 .017797623
    54594 1992 .017797623   11.89963 -1.0391171             12.875 12.374174   11.75    .596253
    54594 1993    .596253   11.91299   .1752169                 13  11.89963  12.875   .6555353
    54594 1994   .6555353   12.35004  -.3002318              18.25  11.91299      13  1.0008751
    54594 1995  1.0008751  12.791286  -.5468184             23.125  12.35004   18.25   1.264832
    54594 1996   1.264832    14.7912   .7412657             33.375 12.791286  23.125  1.2870705
    54594 1997  1.2870705  10.859443   .4632544             19.625   14.7912  33.375  1.5218947
    54594 1998  1.5218947  11.907345  .48011395                 18 10.859443  19.625  1.3088777
    54594 1999  1.3088777  12.637818   .9347478            11.4375 11.907345      18   .6879385
    54594 2000   .6879385  12.629914 -1.0232023  8.010000228881836 12.637818 11.4375 -1.8493568
    54594 2001 -1.8493568   9.734389  -.8040164               4.75 12.629914    8.01  -.3896267
    54594 2002  -.3896267   9.261498 -1.4801105  8.020000457763672  9.734389    4.75    .108668
    54594 2003    .108668   9.355993   -.343247 12.449999809265137  9.261498    8.02    .569938
    54594 2004    .569938   9.658872 -1.0889646  17.18000030517578  9.355993   12.45   .9593223
    54594 2005   .9593223   11.53263  2.0637584  23.84000015258789  9.658872   17.18  1.5756315
    54594 2006  1.5756315  13.099817   2.117708  30.34000015258789  11.53263   23.84    1.95355
    54594 2007    1.95355  15.094396  1.5015087  16.59000015258789 13.099817   30.34   2.072832
    54594 2008   2.072832   16.89371  .40127045 21.940000534057617 15.094396   16.59  1.1302806
    54594 2009  1.1302806  18.916676  -2.748658  18.65999984741211  16.89371   21.94   1.838541
    54594 2010   1.838541   21.01116  -.9746361 16.670000076293945 18.916676   18.66   1.681598
    54594 2011   1.681598  21.469694  -.6578601 16.420000076293945  21.01116   16.67   1.396577
    54594 2012   1.396577  23.325377 -2.7398305 27.329999923706055 21.469694   16.42  1.8427705
    end
    Before running the while loop, I initialized all the variables that I might replace in the loop as numeric (float by default):
    Code:
    gen beta0_const = .
    gen beta1_eps = .
    gen beta2_PRC = .
    gen beta3_lPRC = .
    gen beta4_bps = .
    gen beta5_lbps = .
    gen beta6_acps = .
    gen beta0_tv = .
    gen beta1_tv= .
    gen beta2_tv =.
    gen beta3_tv =.
    gen beta4_tv = .
    gen beta5_tv= .
    gen beta6_tv = .
    gen r2_adj =.
    gen nobs_yr = .
    The full code of my while loop is shown below:
    Code:
    scalar t0 = 1986
    scalar T = 2020
    local j =t0
    while `j' <= T{
    quietly reg feps1 eps PRC lPRC bps lbps acps if fyear >=`j'-3 & fyear <= `j', vce(robust)
    matrix coe = e(b)
    quietly replace beta0_const = el(coe,1,7) if fyear == `j' 
    quietly replace beta1_eps = el(coe,1,1) if fyear == `j' 
    quietly replace beta2_PRC = el(coe,1,2) if fyear == `j' 
    quietly replace beta3_lPRC = el(coe,1,3) if fyear == `j'
    quietly replace beta4_bps = el(coe,1,4) if fyear == `j'
    quietly replace beta5_lbps = el(coe,1,5) if fyear == `j'
    quietly replace beta6_acps = el(coe,1,6) if fyear == `j'
    quietly replace beta0_tv = _b[_cons]/_se[_cons] if fyear == `j'
    quietly replace beta1_tv= _b[eps]/_se[eps] if fyear == `j'
    quietly replace beta2_tv = _b[PRC]/_se[PRC] if fyear == `j'
    quietly replace beta3_tv = _b[lPRC]/_se[lPRC] if fyear == `j'
    quietly replace beta4_tv = _b[bps]/_se[bps] if fyear == `j'
    quietly replace beta5_tv= _b[lbps]/_se[lbps] if fyear == `j'
    quietly replace beta6_tv = _b[acps/_se[acps] if fyear == `j'
    quietly replace r2_adj = e(r2_a) if fyear == `j'
    quietly replace nobs_yr = e(N) if fyear == `j'
    local j = `j'+ 1
    }
    After I run the code, I receive:
    type mismatch
    r(109);

    I have read several solutions posted earlier, and I have checked that all the variables in my loop are numeric, so I do not believe I have a destring problem. Could you please help me with this problem? I would be very grateful!

  • #2
    The first step in debugging this is to remove the quietly prefix from your regress and replace commands to see where the error occurs, and if perhaps the preceding regression failed, or dropped a collinear variable, or did something else that led to the type mismatch.
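
    A minimal sketch of a complementary approach, assuming the loop is run from a do-file: -set trace on- (a standard Stata setting, not something used elsewhere in this thread) echoes each command as it executes, so the last command shown before the error message is the one that failed.
    Code:
    set trace on
    set tracedepth 1
    * run the while loop from post #1 here, without the quietly prefixes
    set trace off
    If the do-file stops on the error, the final line is never reached, so type set trace off in the Command window afterwards.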



    • #3
      Thank you for your response!

      First, after I removed the quietly prefix, the error still occurred at the very end of the loop, so I thought the error must be somewhere inside the loop.

      Second, I ran the regression on its own, outside the while loop, and it works fine.
      Code:
      . regress feps1 eps PRC bps acps lPRC lbps
      
            Source |       SS           df       MS      Number of obs   =   115,080
      -------------+----------------------------------   F(6, 115073)    =  20453.77
             Model |   144211.26         6    24035.21   Prob > F        =    0.0000
          Residual |  135222.177   115,073  1.17509908   R-squared       =    0.5161
      -------------+----------------------------------   Adj R-squared   =    0.5161
             Total |  279433.437   115,079  2.42818791   Root MSE        =     1.084
      
      ------------------------------------------------------------------------------
             feps1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               eps |   .5712763   .0030202   189.15   0.000     .5653567    .5771958
               PRC |   .0316592   .0003524    89.85   0.000     .0309686    .0323499
               bps |  -.0450768   .0014994   -30.06   0.000    -.0480156    -.042138
              acps |  -.0245667   .0010808   -22.73   0.000    -.0266851   -.0224483
              lPRC |  -.0155908   .0003705   -42.08   0.000    -.0163169   -.0148646
              lbps |   .0574809   .0014691    39.13   0.000     .0546014    .0603603
             _cons |  -.1441011   .0055684   -25.88   0.000    -.1550151    -.133187
      ------------------------------------------------------------------------------
      Third, I believe it might be caused by collinear variables, as you said, because when I pick one row at random and fix the firm ID:
      Code:
      regress feps1 eps PRC bps acps lPRC lbps if gvkey == gvkey[20] & inrange(fyear, fyear[20]-3,fyear[20])
      three variables are omitted from the regression because of collinearity:
      Code:
      note: eps omitted because of collinearity
      note: bps omitted because of collinearity
      note: lbps omitted because of collinearity
      
            Source |       SS           df       MS      Number of obs   =         4
      -------------+----------------------------------   F(3, 0)         =         .
             Model |  7.22601836         3  2.40867279   Prob > F        =         .
          Residual |           0         0           .   R-squared       =    1.0000
      -------------+----------------------------------   Adj R-squared   =         .
             Total |  7.22601836         3  2.40867279   Root MSE        =         0
      
      ------------------------------------------------------------------------------
             feps1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               eps |          0  (omitted)
               PRC |   .1656679          .        .       .            .           .
               bps |          0  (omitted)
              acps |   1.015408          .        .       .            .           .
              lPRC |  -.0028417          .        .       .            .           .
              lbps |          0  (omitted)
             _cons |  -2.104887          .        .       .            .           .
      ------------------------------------------------------------------------------
      Could you explain why collinear variables could cause this issue? I still do not understand it.

      P.S. I was able to finish the rest of my work with the rangestat command, which is really useful. Thank you for your previous post as well!



      • #4
        You can't fit a 7-predictor model to 4 data points except vacuously. 4 parameters are enough to define a perfect fit.
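
        One way to see this is to compare the number of observations with the number of coefficients actually estimated. The sketch below assumes the example data from post #1 are in memory; the year range is chosen only for illustration.
        Code:
        * fit the model on a short window, then inspect what was actually estimated
        regress feps1 eps PRC lPRC bps lbps acps if inrange(fyear, 1987, 1990)
        display "N = " e(N) "  rank = " e(rank) "  residual df = " e(df_r)
        When the residual degrees of freedom are 0, the fit is exact and no standard errors can be computed, which is what the output in #3 shows.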

        I can't see any grounds for type mismatch here. Type mismatch is when the code needs numeric but you supply string, or conversely. Collinearity is nothing to do with type mismatch.
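
        For illustration, a minimal sketch that provokes r(109) on purpose with made-up variables (s and x are hypothetical names, not variables from this thread), followed by two checks for string variables:
        Code:
        clear
        set obs 3
        generate str3 s = "abc"
        generate x = .
        capture noisily replace x = s   // numeric variable, string expression: type mismatch
        display _rc                     // return code of the failed command
        ds, has(type string)            // list any string variables in memory
        confirm numeric variable x      // errors if x is not numeric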



        • #5
          The code you show us in post #1 differs in at least one critical way from the code you actually ran.

          When I run the following code based on your post #1
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input double(LPERMNO fyear) float(eps bps acps) double PRC float(lbps lPRC feps1)
          10031 1986  .29556468   2.779724          .               4.75         .       . -.19567648
          10031 1987 -.19567648   2.848677  .31047335               3.25  2.779724    4.75  -2.921357
          10031 1988  -2.921357 -.07230712 -2.1386507              1.125  2.848677    3.25          .
          54594 1986  1.4579537  12.747722          .              36.25         .       .  1.3379332
          54594 1987  1.3379332   9.483112  1.3898554                 26 12.747722   36.25   1.550003
          54594 1988   1.550003   10.71772  1.5441314             36.125  9.483112      26  1.5952618
          54594 1989  1.5952618  11.786345   .3907474              12.25  10.71772  36.125   .9314077
          54594 1990   .9314077  12.194198  -1.390095              13.75 11.786345   12.25   .6302283
          54594 1991   .6302283  12.374174  .07585382              11.75 12.194198   13.75 .017797623
          54594 1992 .017797623   11.89963 -1.0391171             12.875 12.374174   11.75    .596253
          54594 1993    .596253   11.91299   .1752169                 13  11.89963  12.875   .6555353
          54594 1994   .6555353   12.35004  -.3002318              18.25  11.91299      13  1.0008751
          54594 1995  1.0008751  12.791286  -.5468184             23.125  12.35004   18.25   1.264832
          54594 1996   1.264832    14.7912   .7412657             33.375 12.791286  23.125  1.2870705
          54594 1997  1.2870705  10.859443   .4632544             19.625   14.7912  33.375  1.5218947
          54594 1998  1.5218947  11.907345  .48011395                 18 10.859443  19.625  1.3088777
          54594 1999  1.3088777  12.637818   .9347478            11.4375 11.907345      18   .6879385
          54594 2000   .6879385  12.629914 -1.0232023  8.010000228881836 12.637818 11.4375 -1.8493568
          54594 2001 -1.8493568   9.734389  -.8040164               4.75 12.629914    8.01  -.3896267
          54594 2002  -.3896267   9.261498 -1.4801105  8.020000457763672  9.734389    4.75    .108668
          54594 2003    .108668   9.355993   -.343247 12.449999809265137  9.261498    8.02    .569938
          54594 2004    .569938   9.658872 -1.0889646  17.18000030517578  9.355993   12.45   .9593223
          54594 2005   .9593223   11.53263  2.0637584  23.84000015258789  9.658872   17.18  1.5756315
          54594 2006  1.5756315  13.099817   2.117708  30.34000015258789  11.53263   23.84    1.95355
          54594 2007    1.95355  15.094396  1.5015087  16.59000015258789 13.099817   30.34   2.072832
          54594 2008   2.072832   16.89371  .40127045 21.940000534057617 15.094396   16.59  1.1302806
          54594 2009  1.1302806  18.916676  -2.748658  18.65999984741211  16.89371   21.94   1.838541
          54594 2010   1.838541   21.01116  -.9746361 16.670000076293945 18.916676   18.66   1.681598
          54594 2011   1.681598  21.469694  -.6578601 16.420000076293945  21.01116   16.67   1.396577
          54594 2012   1.396577  23.325377 -2.7398305 27.329999923706055 21.469694   16.42  1.8427705
          end
          
          gen beta0_const = .
          gen beta1_eps = .
          gen beta2_PRC = .
          gen beta3_lPRC = .
          gen beta4_bps = .
          gen beta5_lbps = .
          gen beta6_acps = .
          gen beta0_tv = .
          gen beta1_tv= .
          gen beta2_tv =.
          gen beta3_tv =.
          gen beta4_tv = .
          gen beta5_tv= .
          gen beta6_tv = .
          gen r2_adj =.
          gen nobs_yr = .
          
          expand 10 // make enough observations to do a regression
          
          scalar t0 = 1990 // was 1986
          scalar T = 2020
          local j =t0
          while `j' <= T{
          reg feps1 eps PRC lPRC bps lbps acps if fyear >=`j'-3 & fyear <= `j', vce(robust)
          matrix coe = e(b)
          replace beta0_const = el(coe,1,7) if fyear == `j' 
          replace beta1_eps = el(coe,1,1) if fyear == `j' 
          replace beta2_PRC = el(coe,1,2) if fyear == `j' 
          replace beta3_lPRC = el(coe,1,3) if fyear == `j'
          replace beta4_bps = el(coe,1,4) if fyear == `j'
          replace beta5_lbps = el(coe,1,5) if fyear == `j'
          replace beta6_acps = el(coe,1,6) if fyear == `j'
          replace beta0_tv = _b[_cons]/_se[_cons] if fyear == `j'
          replace beta1_tv= _b[eps]/_se[eps] if fyear == `j'
          replace beta2_tv = _b[PRC]/_se[PRC] if fyear == `j'
          replace beta3_tv = _b[lPRC]/_se[lPRC] if fyear == `j'
          replace beta4_tv = _b[bps]/_se[bps] if fyear == `j'
          replace beta5_tv= _b[lbps]/_se[lbps] if fyear == `j'
          replace beta6_tv = _b[acps/_se[acps] if fyear == `j'
          replace r2_adj = e(r2_a) if fyear == `j'
          replace nobs_yr = e(N) if fyear == `j'
          local j = `j'+ 1
          }
          I get the following results
          Code:
          note: eps omitted because of collinearity
          note: acps omitted because of collinearity
          
          Linear regression                               Number of obs     =         50
                                                          F(0, 45)          =          .
                                                          Prob > F          =          .
                                                          R-squared         =     1.0000
                                                          Root MSE          =          0
          
          ------------------------------------------------------------------------------
                       |               Robust
                 feps1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   eps |          0  (omitted)
                   PRC |   .0529958   1.15e-16  4.6e+14   0.000     .0529958    .0529958
                  lPRC |   .0249317   4.81e-17  5.2e+14   0.000     .0249317    .0249317
                   bps |   .1692075   5.55e-16  3.0e+14   0.000     .1692075    .1692075
                  lbps |    .136212   6.40e-16  2.1e+14   0.000      .136212     .136212
                  acps |          0  (omitted)
                 _cons |  -4.072668   4.44e-16 -9.2e+15   0.000    -4.072668   -4.072668
          ------------------------------------------------------------------------------
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          (0 real changes made)
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          (10 real changes made)
          too few ')' or ']'
          r(132);
          which is not the error you reported.

          Careful inspection shows that
          Code:
          replace beta6_tv = _b[acps/_se[acps] if fyear == `j'
          is missing a closing ] before the /.

          After correcting that, the code runs to completion without error.
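
          For reference, this is the line with the closing ] restored after acps. It belongs inside the while loop, where the local macro j is defined.
          Code:
          replace beta6_tv = _b[acps]/_se[acps] if fyear == `j'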

          The problem lies, as Nick suggested, in code you don't show us, or with data that differs from the data you showed us.
