Different results for different seeds with qregpd

Daniel Bukstein

Join Date: Apr 2017
Posts: 7

24 Jul 2017, 13:54

Dear all,

I'm working with a data set of firms to study barriers to innovation, and I am using qregpd to run a quantile regression with panel data to assess the effect of barriers at different points of the distribution of the variable of interest (ln of labor productivity in this case). I've decided to go with the MCMC optimization method because, even when specifying a seed, I can't seem to get the same result twice in a row using the default Nelder-Mead method. Here's the output from two identical runs using the Nelder-Mead algorithm.

Code:

set seed 2003080
  qregpd lnprod $barreras  exporter2_ lneduc lnedad lnsize if relevant4==1, id(correlativo) fix(periodo) q(0.10)

Quantile Regression for Panel Data (QRPD)
     Number of obs:              1941
     Number of groups:            504
     Min obs per group:             3
     Max obs per group:             4
------------------------------------------------------------------------------
      lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cost_ |   -.021396   .1942694    -0.11   0.912     -.402157     .359365
  knowledge_ |  -.0164477   .1071803    -0.15   0.878    -.2265173    .1936219
     market_ |   .0021737   .2487624     0.01   0.993    -.4853917    .4897391
        reg_ |    .011481     .23233     0.05   0.961    -.4438775    .4668395
  exporter2_ |   .0027602   .0048785     0.57   0.572    -.0068016     .012322
      lneduc |   .1743128   .2203763     0.79   0.429    -.2576168    .6062424
      lnedad |   9.628899   17.29708     0.56   0.578    -24.27275    43.53055
      lnsize |   .2744016   .9562956     0.29   0.774    -1.599903    2.148707
------------------------------------------------------------------------------
No excluded instruments - standard QRPD estimation.


set seed 2003080
  qregpd lnprod $barreras  exporter2_ lneduc lnedad lnsize if relevant4==1, id(correlativo) fix(periodo) q(0.10)

Quantile Regression for Panel Data (QRPD)
     Number of obs:              1941
     Number of groups:            504
     Min obs per group:             3
     Max obs per group:             4
------------------------------------------------------------------------------
      lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cost_ |  -.0210913   .0918877    -0.23   0.818    -.2011879    .1590053
  knowledge_ |  -.0158104   .0920948    -0.17   0.864    -.1963129    .1646922
     market_ |   .0027017   .1191696     0.02   0.982    -.2308663    .2362698
        reg_ |   .0100679   .1204593     0.08   0.933    -.2260281    .2461638
  exporter2_ |   .0028732   .0032588     0.88   0.378    -.0035141    .0092604
      lneduc |   .1740414   .1397286     1.25   0.213    -.0998215    .4479044
      lnedad |   9.628627   1.908467     5.05   0.000     5.888101    13.36915
      lnsize |   .2744354   .2059562     1.33   0.183    -.1292313    .6781022
------------------------------------------------------------------------------
No excluded instruments - standard QRPD estimation.

Given these different results, I switched to MCMC in the belief that the seed would take effect in that case. However, with this method different seeds return different regression results!

Code:

. set seed 2003080

.   qregpd lnprod $barreras sNI_ exporter2_ lneduc lnedad lnsize  if relevant4==1, id(correlativo) fix(periodo) q(0.1) optimize(mcmc) 
> draws(1000) burn(100)
Adaptive MCMC optimization


Quantile Regression for Panel Data (QRPD)
     Number of obs:              1941
     Number of groups:            504
     Min obs per group:             3
     Max obs per group:             4
------------------------------------------------------------------------------
      lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cost_ |   -.022124   .0353219    -0.63   0.531    -.0913537    .0471057
  knowledge_ |  -.0123532   .0376526    -0.33   0.743    -.0861508    .0614445
     market_ |   .0285633    .037407     0.76   0.445    -.0447531    .1018797
        reg_ |  -.0183071   .0387575    -0.47   0.637    -.0942704    .0576562
        sNI_ |  -.0879371    .042883    -2.05   0.040    -.1719862   -.0038879
  exporter2_ |   .0007951   .0031179     0.26   0.799    -.0053159    .0069062
      lneduc |   .0563408   .0421624     1.34   0.181     -.026296    .1389776
      lnedad |   .9033293   .0846333    10.67   0.000     .7374509    1.069208
      lnsize |  -.3157141   .0479072    -6.59   0.000    -.4096104   -.2218178
------------------------------------------------------------------------------
No excluded instruments - standard QRPD estimation.


MCMC diagonstics:
     Mean acceptance rate:      0.222
     Total draws:                1000
     Burn-in draws:               100
     Draws retained:              900
     Value of objective function:   
             Mean:              -7.7385
             Min:              -16.7594
             Max:               -3.2065
MCMC notes:
     *Point estimates correspond to mean of draws.
     *Standard errors are derived from variance of draws.

. set seed 2003081

.   qregpd lnprod $barreras sNI_ exporter2_ lneduc lnedad lnsize  if relevant4==1, id(correlativo) fix(periodo) q(0.1) optimize(mcmc) 
> draws(1000) burn(100)
Adaptive MCMC optimization


Quantile Regression for Panel Data (QRPD)
     Number of obs:              1941
     Number of groups:            504
     Min obs per group:             3
     Max obs per group:             4
------------------------------------------------------------------------------
      lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cost_ |  -.0294718   .0269462    -1.09   0.274    -.0822853    .0233417
  knowledge_ |   .0092274   .0537253     0.17   0.864    -.0960723    .1145271
     market_ |   .0100021   .0320289     0.31   0.755    -.0527734    .0727776
        reg_ |    .103807   .0354037     2.93   0.003     .0344171    .1731968
        sNI_ |    .065963   .0420626     1.57   0.117    -.0164782    .1484042
  exporter2_ |   .0095051   .0021575     4.41   0.000     .0052765    .0137337
      lneduc |   .1172342   .0172016     6.82   0.000     .0835196    .1509487
      lnedad |   .8451934   .0497182    17.00   0.000     .7477474    .9426393
      lnsize |  -.3524431   .0246028   -14.33   0.000    -.4006638   -.3042224
------------------------------------------------------------------------------
No excluded instruments - standard QRPD estimation.


MCMC diagonstics:
     Mean acceptance rate:      0.251
     Total draws:                1000
     Burn-in draws:               100
     Draws retained:              900
     Value of objective function:   
             Mean:             -15.9054
             Min:              -23.2636
             Max:               -7.5322
MCMC notes:
     *Point estimates correspond to mean of draws.
     *Standard errors are derived from variance of draws.

. set seed 2003082

.   qregpd lnprod $barreras sNI_ exporter2_ lneduc lnedad lnsize  if relevant4==1, id(correlativo) fix(periodo) q(0.1) optimize(mcmc) 
> draws(1000) burn(100)
Adaptive MCMC optimization


Quantile Regression for Panel Data (QRPD)
     Number of obs:              1941
     Number of groups:            504
     Min obs per group:             3
     Max obs per group:             4
------------------------------------------------------------------------------
      lnprod |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cost_ |  -.0710054   .0614428    -1.16   0.248     -.191431    .0494203
  knowledge_ |  -.0234029   .0287051    -0.82   0.415    -.0796638     .032858
     market_ |  -.1040086   .0441818    -2.35   0.019    -.1906034   -.0174139
        reg_ |   .0412002   .0355504     1.16   0.246    -.0284772    .1108777
        sNI_ |   .0229435    .044611     0.51   0.607    -.0644924    .1103793
  exporter2_ |   .0112458   .0010849    10.37   0.000     .0091195    .0133722
      lneduc |    .130359   .0106865    12.20   0.000      .109414    .1513041
      lnedad |   .6781942   .0353308    19.20   0.000      .608947    .7474414
      lnsize |  -.3386571   .0225564   -15.01   0.000    -.3828668   -.2944474
------------------------------------------------------------------------------
No excluded instruments - standard QRPD estimation.


MCMC diagonstics:
     Mean acceptance rate:      0.222
     Total draws:                1000
     Burn-in draws:               100
     Draws retained:              900
     Value of objective function:   
             Mean:             -20.2498
             Min:              -35.5141
             Max:              -14.9407
MCMC notes:
     *Point estimates correspond to mean of draws.
     *Standard errors are derived from variance of draws.

I'm getting different point estimates and different statistical significance depending on the seed used. Does anybody know how to work this out?

Thank you.

Daniel


wanhaiyou

Join Date: May 2014
Posts: 130

24 Jul 2017, 18:36

Hi Daniel,
Maybe you can increase the number of MCMC draws using the option 'draws'.

Best,
wanhaiyou


Anton Ivanov

Join Date: Sep 2014

Posts: 267
#3

24 Jul 2017, 22:55

Hello Daniel,

I also work with panel data and use quantile regression (QR). I am curious, though: what is your rationale for choosing -qregpd-? Since there is a lack of consensus among researchers on the performance of these new panel QR estimators (e.g., -qregpd-, -xtqreg-, etc.), I am cautious about using them and instead employ a "classic" pooled QR (with additional robustness tests). I'd appreciate your comment on this.

Sincerely,
Anton

Daniel Bukstein

Join Date: Apr 2017

Posts: 7
#4

25 Jul 2017, 05:37

Thank you, Wanhaiyou, for the response. I tried that, and with 5000 draws I'm still getting this kind of result. I'll try a larger number.

Daniel Bukstein

Join Date: Apr 2017

Posts: 7
#5

25 Jul 2017, 05:50

Hello Anton.

I chose to work with the qregpd estimator because it is a kind of unconditional QR estimator implemented for panel data. Given the known problem that in QR one cannot move from E[Y|X] to E[Y], and the development of new unconditional QR methods such as Firpo et al. (2009), I thought qregpd was a good option because of its straightforward interpretation and ease of use. Can I ask which commands you are using and which robustness checks you are implementing with pooled QR for panel data? I think I will go with that option if I cannot get around this issue.

Reference: Firpo, S., Fortin, N., & Lemieux, T. (2009). Unconditional quantile regressions. Econometrica, 77(3), 953–973.

Anton Ivanov

Join Date: Sep 2014

Posts: 267
#6

25 Jul 2017, 10:14

Daniel,

I sincerely appreciate your comment. One of the reasons I asked is that I recently got a paper back from review at a top-tier MIS journal, and the last comment explicitly stated that the Stata package we used (i.e., -qregpd-) is quite new, so there are limited materials for reviewers to refer to when commenting more carefully on the model assumptions and estimation procedure.

To answer your question on the commands I am using, please refer to my recent post on this topic: http://www.statalist.org/forums/foru...sqreg-vs-lqreg

In addition to the above, I use the new -krls- module by Ferwerda et al. (2015), which also allows estimation across quantiles.

Reference: Ferwerda, J., Hainmueller, J., & Hazlett, C. (2015). KRLS: A Stata package for kernel-based regularized least squares. Chicago

Last edited by Anton Ivanov; 25 Jul 2017, 10:16.

Daniel Bukstein

Join Date: Apr 2017

Posts: 7
#7

25 Jul 2017, 11:22

Anton, thank you for the references. I also discovered the xtrifreg command, which is an addition to the rifreg package that is based on the Firpo, Fortin and Lemieux paper mentioned above. Given the variability found in my results I would avoid using the qregpd command until there is further evidence about its performance.

Kreshna Gopal (StataCorp)

StataCorp Employee

Join Date: Apr 2014

Posts: 43
#8

26 Jul 2017, 15:11

You are getting different results for the same run of qregpd probably because of the randomness in sort. qregpd calls sort, and sort randomly breaks any ties in the key values. To reproduce exactly the same results, besides set seed #, try set sortseed #. The latter specifies the seed of the random-number generator that breaks ties in sort.
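A minimal sketch of the tie-breaking behavior described above, using Stata's bundled auto dataset rather than the data from this thread. The helper variables orig, pos1, and pos2 are illustrative names, and the seed value is arbitrary. The sketch sorts on a key with many ties twice from the same starting order; resetting sortseed to the same value before each sort should give the same row order.

```stata
* Illustration only: auto.dta stands in for the real panel data
sysuse auto, clear
generate long orig = _n        // unique key to restore the starting order

set sortseed 2003080           // fixes how sort breaks ties
sort rep78                     // rep78 has many tied values
generate long pos1 = _n        // row position after the first sort

sort orig                      // back to the starting order (orig has no ties)
set sortseed 2003080           // same sort seed as before
sort rep78
generate long pos2 = _n        // row position after the second sort

assert pos1 == pos2            // same order when sortseed is reset
```

In the setting of this thread, the two lines set seed # and set sortseed # would go immediately before each qregpd call.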

I hope this solves the problem.

-- Kreshna

Daniel Bukstein

Join Date: Apr 2017

Posts: 7
#9

27 Jul 2017, 08:38

Dear Kreshna, that worked well. Thank you very much for the help.

Anastasia Arabadzhyan

Join Date: Sep 2017

Posts: 2
#10

26 Jan 2018, 08:08

Dear All,

I am running a quantile panel estimation with -qregpd- and facing the exact same issue as Daniel.
When point estimates differ so much in both sign and significance only because of the seed, they can't be considered reliable...
While replicability can be achieved, is there a way to get more stable results no matter the seed?

Sincerely,
Anastasia

xiaoshi zhou

Join Date: Mar 2016

Posts: 5
#11

22 Jun 2018, 03:20

Dear All,
I am also facing the same issue as Daniel and Anastasia. The coefficients depend heavily on the seed value. Does this indicate that the method is not robust?

Xiaoshi

Devashish Singh

Join Date: Aug 2022

Posts: 27
#12

17 Jan 2024, 04:58

Hello all. Is there any update on this problem?

FernandoRios

Join Date: Apr 2014

Posts: 2497
#13

17 Jan 2024, 05:56

It is not a problem but a feature of the method.
One option may be to run a larger set of repetitions.
You may also want to explore rqreg. It produces similar results to qregpd in the sense of obtaining unconditional regressions.
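
A rough way to see how much the MCMC estimates move across seeds, and whether more draws help, is to loop over a few seeds and compare one coefficient. The sketch below rests on several assumptions: qregpd and its dependencies are already installed, the nlswork example data stands in for the data in this thread, and qregpd posts its estimates in e(b) like a standard estimation command so that _b[] works. The seed values, draw counts, and variables are arbitrary choices, and the larger runs may take a while.

```stata
* Sketch only: compare one coefficient across seeds and draw counts
webuse nlswork, clear
foreach d in 1000 5000 {
    foreach s in 101 102 103 {
        set seed `s'           // seed for the MCMC draws
        set sortseed `s'       // seed for tie-breaking in sort
        quietly qregpd ln_wage tenure union, id(idcode) fix(year) ///
            q(0.1) optimize(mcmc) draws(`d') burn(100)
        display "draws = `d', seed = `s', b[tenure] = " %9.6f _b[tenure]
    }
}
```

If the displayed coefficients still differ widely across seeds at the larger draw count, that suggests the chain has not settled, rather than a reproducibility problem.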