  • Quantile Regressions

    Hi everyone,

    I am working with a large data set (3 million observations). I am trying to run quantile regressions and have tried both qreg and qreg2. It has been running for around 30 hours and has not finished. Is there any way to speed the process up, or is this simply the cost of working with such a large data set? (I am working in Stata 15 MP, 2 cores.)

    Any help will be much appreciated!


  • #2
    Dear Siddharth,

    I am afraid I do not have an answer to your question, but I note that the default options for -qreg- make the computation of the covariance matrix very slow; that does not affect -qreg2-. In any case, I do not think that these commands benefit much from parallelization, so with such a large sample the estimation is going to be slow. By the way, how many regressors does the model have?

    Best wishes,

    Joao



    • #3
      Hi Joao,

      Thanks for your message. The regression contains 3 sets of geographical and time fixed effects, as well as some time trends. So even simple regressions take time. But I switched to using the reghdfe command to speed up the (non-quantile) regressions. I guess there is no such alternative available for quantile regressions?



      • #4
        Take a look at the -laplacereg- command here: http://www.imm.ki.se/biostatistics/laplace/ (note: the -laplacereg- command was formerly named -laplace-).

        You can find additional info about Laplace regression here: http://www.stata-journal.com/article...article=st0294 and here https://www.ncbi.nlm.nih.gov/pubmed/20680972

        Code:
        clear
        set seed 1150
        set obs 500000
        gen x1 = rnormal()
        gen x2 = runiform() < 0.3
        gen y = 3 + 1.2*x1 - 0.5*x2 + rnormal()*2
        
        timer clear 1
        timer on 1
        qreg y x1 x2
        timer off 1
        
        timer clear 2
        timer on 2
        laplace y x1 x2
        timer off 2
        
        timer list 1
        timer list 2
        Code:
        Iteration  1:  WLS sum of weighted deviations =  399574.51
        
        Iteration  1: sum of abs. weighted deviations =  399574.51
        Iteration  2: sum of abs. weighted deviations =  399574.22
        Iteration  3: sum of abs. weighted deviations =     399574
        Iteration  4: sum of abs. weighted deviations =  399573.74
        Iteration  5: sum of abs. weighted deviations =  399573.63
        Iteration  6: sum of abs. weighted deviations =  399573.61
        Iteration  7: sum of abs. weighted deviations =  399573.59
        Iteration  8: sum of abs. weighted deviations =  399573.59
        Iteration  9: sum of abs. weighted deviations =  399573.58
        Iteration 10: sum of abs. weighted deviations =  399573.57
        Iteration 11: sum of abs. weighted deviations =  399573.57
        Iteration 12: sum of abs. weighted deviations =  399573.57
        Iteration 13: sum of abs. weighted deviations =  399573.57
        Iteration 14: sum of abs. weighted deviations =  399573.57
        
        Median regression                                   Number of obs =    500,000
          Raw sum of deviations 467320.1 (about 2.8569386)
          Min sum of deviations 399573.6                    Pseudo R2     =     0.1450
        
        ------------------------------------------------------------------------------
                   y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                  x1 |   1.193373   .0035891   332.50   0.000     1.186339    1.200408
                  x2 |  -.4968251   .0078248   -63.49   0.000    -.5121615   -.4814887
               _cons |   2.998858   .0042868   699.56   0.000     2.990456     3.00726
        ------------------------------------------------------------------------------
        
        . timer off 1
        
        . 
        . timer clear 2
        
        . timer on 2
        
        . laplace y x1 x2
        
        Laplace regression                               No. of subjects  =     500000
        ------------------------------------------------------------------------------
                     |               Robust
                   y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        q50          |
                  x1 |   1.193381   .0035556   335.63   0.000     1.186412    1.200349
                  x2 |  -.4968426   .0078298   -63.46   0.000    -.5121888   -.4814965
               _cons |   2.998821   .0042401   707.25   0.000     2.990511    3.007132
        ------------------------------------------------------------------------------
        
        . timer off 2
        
        . 
        . timer list 1
           1:      8.43 /        1 =       8.4300
        
        . timer list 2
           2:      2.64 /        1 =       2.6420
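
        To put the two timings side by side, a minimal sketch follows. It assumes it is run in the same session, right after the code above, so that timers 1 and 2 still hold their values; it relies on -timer list- saving each timer's elapsed seconds in r(t#). With the run shown here (8.43 s versus 2.642 s) the ratio works out to roughly 3.2, but the figure will vary with machine and data.

        ```stata
        * Sketch: compare the elapsed times of the two timers set above.
        * -timer list- leaves the elapsed seconds of timer # in r(t#).
        quietly timer list
        display "qreg (timer 1):    " %8.2f r(t1) " seconds"
        display "laplace (timer 2): " %8.2f r(t2) " seconds"
        display "ratio qreg/laplace:" %8.2f r(t1)/r(t2)
        ```

        A single run on simulated data is only indicative; how the gap behaves with millions of observations and many fixed effects is not something this example establishes.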



        • #5
          Hello! Sorry, I am new to Statalist, so I am not sure if this is the right place to post, but I have a question related to the one above. How can I get exponentiated coefficients for a quantile regression? With regress we can use eform(b) to get exponentiated betas; is it possible to do something similar with the qreg command? Thanks,

          Shukrullah
          Intermediate to advance user
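
          One possible route, offered as a hedged sketch rather than a reply from the thread: exponentiate the coefficient after estimation with -lincom, eform- or -nlcom-. The auto data, the log-price outcome, and the choice of regressors are assumptions made purely for illustration; they are not from the posts above.

          ```stata
          * Sketch on Stata's bundled auto data; variables are illustrative only.
          sysuse auto, clear
          generate lnprice = ln(price)

          * Median regression of log price
          qreg lnprice weight foreign

          * Exponentiated coefficient on foreign, with a confidence interval
          lincom foreign, eform

          * Same point estimate via nlcom, with delta-method standard errors
          nlcom (exp_foreign: exp(_b[foreign]))
          ```

          Whether exp(b) has a useful interpretation depends on the model; with a logged outcome it can be read as a multiplicative effect on the conditional quantile of the original variable.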
