  • Different results for different seeds with qregpd

    Dear all,

    I'm working with a firm-level data set to study barriers to innovation, and I am using qregpd to run a quantile regression with panel data to assess the effect of these barriers at different points of the distribution of the variable of interest (log labor productivity in this case). I've decided to go with the MCMC optimization method because, even when specifying a seed, I can't get the same result twice in a row with the default Nelder-Mead method. Here's the output using the Nelder-Mead algorithm.
    Code:
    set seed 2003080
      qregpd lnprod $barreras  exporter2_ lneduc lnedad lnsize if relevant4==1, id(correlativo) fix(periodo) q(0.10)
    
    Quantile Regression for Panel Data (QRPD)
         Number of obs:              1941
         Number of groups:            504
         Min obs per group:             3
         Max obs per group:             4
    ------------------------------------------------------------------------------
          lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           cost_ |   -.021396   .1942694    -0.11   0.912     -.402157     .359365
      knowledge_ |  -.0164477   .1071803    -0.15   0.878    -.2265173    .1936219
         market_ |   .0021737   .2487624     0.01   0.993    -.4853917    .4897391
            reg_ |    .011481     .23233     0.05   0.961    -.4438775    .4668395
      exporter2_ |   .0027602   .0048785     0.57   0.572    -.0068016     .012322
          lneduc |   .1743128   .2203763     0.79   0.429    -.2576168    .6062424
          lnedad |   9.628899   17.29708     0.56   0.578    -24.27275    43.53055
          lnsize |   .2744016   .9562956     0.29   0.774    -1.599903    2.148707
    ------------------------------------------------------------------------------
    No excluded instruments - standard QRPD estimation.
    
    
    set seed 2003080
      qregpd lnprod $barreras  exporter2_ lneduc lnedad lnsize if relevant4==1, id(correlativo) fix(periodo) q(0.10)
    
    Quantile Regression for Panel Data (QRPD)
         Number of obs:              1941
         Number of groups:            504
         Min obs per group:             3
         Max obs per group:             4
    ------------------------------------------------------------------------------
          lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           cost_ |  -.0210913   .0918877    -0.23   0.818    -.2011879    .1590053
      knowledge_ |  -.0158104   .0920948    -0.17   0.864    -.1963129    .1646922
         market_ |   .0027017   .1191696     0.02   0.982    -.2308663    .2362698
            reg_ |   .0100679   .1204593     0.08   0.933    -.2260281    .2461638
      exporter2_ |   .0028732   .0032588     0.88   0.378    -.0035141    .0092604
          lneduc |   .1740414   .1397286     1.25   0.213    -.0998215    .4479044
          lnedad |   9.628627   1.908467     5.05   0.000     5.888101    13.36915
          lnsize |   .2744354   .2059562     1.33   0.183    -.1292313    .6781022
    ------------------------------------------------------------------------------
    No excluded instruments - standard QRPD estimation.
    Given these different results with this method I switched to MCMC in the belief that in this case the seed would work. However, using this method I have the problem that different seeds return different regression results!
    Code:
    . set seed 2003080
    
    .   qregpd lnprod $barreras sNI_ exporter2_ lneduc lnedad lnsize  if relevant4==1, id(correlativo) fix(periodo) q(0.1) optimize(mcmc) 
    > draws(1000) burn(100)
    Adaptive MCMC optimization
    
    
    Quantile Regression for Panel Data (QRPD)
         Number of obs:              1941
         Number of groups:            504
         Min obs per group:             3
         Max obs per group:             4
    ------------------------------------------------------------------------------
          lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           cost_ |   -.022124   .0353219    -0.63   0.531    -.0913537    .0471057
      knowledge_ |  -.0123532   .0376526    -0.33   0.743    -.0861508    .0614445
         market_ |   .0285633    .037407     0.76   0.445    -.0447531    .1018797
            reg_ |  -.0183071   .0387575    -0.47   0.637    -.0942704    .0576562
            sNI_ |  -.0879371    .042883    -2.05   0.040    -.1719862   -.0038879
      exporter2_ |   .0007951   .0031179     0.26   0.799    -.0053159    .0069062
          lneduc |   .0563408   .0421624     1.34   0.181     -.026296    .1389776
          lnedad |   .9033293   .0846333    10.67   0.000     .7374509    1.069208
          lnsize |  -.3157141   .0479072    -6.59   0.000    -.4096104   -.2218178
    ------------------------------------------------------------------------------
    No excluded instruments - standard QRPD estimation.
    
    
    MCMC diagonstics:
         Mean acceptance rate:      0.222
         Total draws:                1000
         Burn-in draws:               100
         Draws retained:              900
         Value of objective function:   
                 Mean:              -7.7385
                 Min:              -16.7594
                 Max:               -3.2065
    MCMC notes:
         *Point estimates correspond to mean of draws.
         *Standard errors are derived from variance of draws.
    
    . set seed 2003081
    
    .   qregpd lnprod $barreras sNI_ exporter2_ lneduc lnedad lnsize  if relevant4==1, id(correlativo) fix(periodo) q(0.1) optimize(mcmc) 
    > draws(1000) burn(100)
    Adaptive MCMC optimization
    
    
    Quantile Regression for Panel Data (QRPD)
         Number of obs:              1941
         Number of groups:            504
         Min obs per group:             3
         Max obs per group:             4
    ------------------------------------------------------------------------------
          lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           cost_ |  -.0294718   .0269462    -1.09   0.274    -.0822853    .0233417
      knowledge_ |   .0092274   .0537253     0.17   0.864    -.0960723    .1145271
         market_ |   .0100021   .0320289     0.31   0.755    -.0527734    .0727776
            reg_ |    .103807   .0354037     2.93   0.003     .0344171    .1731968
            sNI_ |    .065963   .0420626     1.57   0.117    -.0164782    .1484042
      exporter2_ |   .0095051   .0021575     4.41   0.000     .0052765    .0137337
          lneduc |   .1172342   .0172016     6.82   0.000     .0835196    .1509487
          lnedad |   .8451934   .0497182    17.00   0.000     .7477474    .9426393
          lnsize |  -.3524431   .0246028   -14.33   0.000    -.4006638   -.3042224
    ------------------------------------------------------------------------------
    No excluded instruments - standard QRPD estimation.
    
    
    MCMC diagonstics:
         Mean acceptance rate:      0.251
         Total draws:                1000
         Burn-in draws:               100
         Draws retained:              900
         Value of objective function:   
                 Mean:             -15.9054
                 Min:              -23.2636
                 Max:               -7.5322
    MCMC notes:
         *Point estimates correspond to mean of draws.
         *Standard errors are derived from variance of draws.
    
    . set seed 2003082
    
    .   qregpd lnprod $barreras sNI_ exporter2_ lneduc lnedad lnsize  if relevant4==1, id(correlativo) fix(periodo) q(0.1) optimize(mcmc) 
    > draws(1000) burn(100)
    Adaptive MCMC optimization
    
    
    Quantile Regression for Panel Data (QRPD)
         Number of obs:              1941
         Number of groups:            504
         Min obs per group:             3
         Max obs per group:             4
    ------------------------------------------------------------------------------
          lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           cost_ |  -.0710054   .0614428    -1.16   0.248     -.191431    .0494203
      knowledge_ |  -.0234029   .0287051    -0.82   0.415    -.0796638     .032858
         market_ |  -.1040086   .0441818    -2.35   0.019    -.1906034   -.0174139
            reg_ |   .0412002   .0355504     1.16   0.246    -.0284772    .1108777
            sNI_ |   .0229435    .044611     0.51   0.607    -.0644924    .1103793
      exporter2_ |   .0112458   .0010849    10.37   0.000     .0091195    .0133722
          lneduc |    .130359   .0106865    12.20   0.000      .109414    .1513041
          lnedad |   .6781942   .0353308    19.20   0.000      .608947    .7474414
          lnsize |  -.3386571   .0225564   -15.01   0.000    -.3828668   -.2944474
    ------------------------------------------------------------------------------
    No excluded instruments - standard QRPD estimation.
    
    
    MCMC diagonstics:
         Mean acceptance rate:      0.222
         Total draws:                1000
         Burn-in draws:               100
         Draws retained:              900
         Value of objective function:   
                 Mean:             -20.2498
                 Min:              -35.5141
                 Max:              -14.9407
    MCMC notes:
         *Point estimates correspond to mean of draws.
         *Standard errors are derived from variance of draws.
    I'm getting different point estimates and different statistical significance depending on the seed used. Does anybody know how to resolve this?

    Thank you.

    Daniel

  • #2
    In reply to the opening post by Daniel Bukstein:
    Hi Daniel,
    Maybe you can increase the number of MCMC draws using the draws() option.

    Best,
    wanhaiyou


    • #3
      Hello Daniel,

      I also work with panel data and use quantile regression (QR). I am curious, though: what is your rationale for choosing -qregpd-? Since there is a lack of consensus among researchers on the performance of these new panel QR estimators (e.g., -qregpd-, -xtqreg-, etc.), I am cautious about using them and instead employ a "classic" pooled QR (with additional robustness tests). I'd appreciate your comment on this.

      Sincerely,
      Anton


      • #4
        Thank you, Wanhaiyou, for the response. I tried that, and with 5000 draws I'm still getting this kind of result. I'll try with a larger number.


        • #5
          Hello Anton.

          I chose to work with the qregpd estimator because it is a kind of unconditional QR estimator implemented for panel data. Given the known problem that in QR one cannot move from conditional to unconditional effects the way one moves from E[Y|X] to E[Y], and the development of new unconditional QR methods such as Firpo et al. (2009), I thought qregpd was a good option because of its straightforward interpretation and ease of use. May I ask which commands you are using and which robustness checks you are implementing with pooled QR for panel data? I think I will go with that option if I cannot get around this issue.

          References: Sergio Firpo, Nicole Fortin and Thomas Lemieux. (2009). Unconditional Quantile Regressions. Econometrica. 77(3) 953–973.


          • #6
            Daniel,

            I sincerely appreciate your comment. One of the reasons I asked is that I recently got a paper back from review at a top-tier MIS journal, and the last comment explicitly stated that the Stata package we used (i.e., -qregpd-) is quite new, so there is limited material for reviewers to refer to in order to comment more carefully on the model assumptions and estimation procedure.

            To answer your question on the commands I am using, please refer to my recent post on this topic: http://www.statalist.org/forums/foru...sqreg-vs-lqreg

            In addition to the above, I use the new -krls- module by Ferwerda et al. (2015), which also allows estimation across quantiles.

            Reference: Ferwerda, J., Hainmueller, J., & Hazlett, C. (2015). KRLS: A Stata package for kernel-based regularized least squares. Chicago
            Last edited by Anton Ivanov; 25 Jul 2017, 10:16.


            • #7
              Anton, thank you for the references. I also discovered the xtrifreg command, which is an addition to the rifreg package that is based on the Firpo, Fortin and Lemieux paper mentioned above. Given the variability found in my results I would avoid using the qregpd command until there is further evidence about its performance.


              • #8
                You are getting different results for the same run of qregpd probably because of the randomness in sort. qregpd calls sort, and sort randomly breaks any ties in the key values. To reproduce exactly the same results, besides set seed #, try set sortseed #. The latter specifies the seed of the random-number generator that breaks ties in sort.
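
                A minimal sketch of this, reusing the MCMC command from the opening post. The seed values are arbitrary, and the /// line continuation assumes the commands are run from a do-file rather than typed interactively:

                ```stata
                * Sketch only: fix both random-number streams before the run.
                * set seed controls the draws used by the estimator itself;
                * set sortseed controls how sort breaks ties in the key values.
                set seed 2003080
                set sortseed 2003080

                * Command copied from the opening post (variables and $barreras are Daniel's).
                qregpd lnprod $barreras sNI_ exporter2_ lneduc lnedad lnsize if relevant4==1, ///
                    id(correlativo) fix(periodo) q(0.1) optimize(mcmc) draws(1000) burn(100)
                ```

                Running the whole block twice should then print the same table both times. It only makes a given run reproducible; a different seed value can still lead to different estimates.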

                I hope this solves the problem.

                -- Kreshna


                • #9
                  Dear Kreshna, that worked well. Thank you very much for the help.


                  • #10
                    Dear All,

                    I am running a quantile panel estimation with -qregpd- and facing the exact same issue as Daniel.
                    When point estimates differ so much both in terms of signs and significance only because of the seed, they can't be considered as reliable...
                    While replicability can be achieved, is there a way to get more stable results no matter the seed?

                    Sincerely,
                    Anastasia


                    • #11
                      Dear All,
                      I am also facing the same issue as Daniel and Anastasia. The coefficients depend heavily on the seed value. Does this indicate that the method is not robust?

                      Xiaoshi
