
  • Using svmat to make quantile graphs

    Hello, I am using the IHDS dataset to construct a quantile graph like the one shown in the image. Following is the code I am running:

    Code:
    gen inc_fem = incph if sex == 0
    gen loginc_fem = log(inc_fem)
    gen inc_male = incph if sex == 1
    gen loginc_male = log(inc_male)
    pctile qinc = logincph, nq(100) genp(percent)
    pctile qincf = loginc_fem, nquantiles(100) genp(percent1)
    pctile qincm = loginc_male, nquantiles(100) genp(percent2)
    gen log_gap = (qincm - qincf)/qincf
    
    forvalues q=10/90{
    quiet qreg logincph i.edu sex age public married religion i.caste i.state i.industry i.occupation, q(`q')
    mat qr1=nullmat(qr1)\_b[sex]
    mat quantile=nullmat(quantile)\\`q'
    }
    svmat quantile
    svmat qr1
    quiet reg logincph sex i.edu age public married religion i.caste i.state i.industry i.occupation if sex == 1
    local ols1 = _b[sex]
    quiet reg logincph sex i.edu age public married religion i.caste i.state i.industry i.occupation if sex == 0
    local ols0 = _b[sex]
    twoway (line qr11 quantile1, lcolor(red)), yline(`ols1', lcolor(red)) yline(`ols0', lcolor(black)) yline('log_gap', lcolor(green))
    Stata returns an error saying 'something missing'.

    Any help on what I am doing wrong here is greatly appreciated.

    Thanks,

    Kusha


    [Image: Screen Shot 2017-08-22 at 12.17.52.png]


  • #2
    1. After which line do you get this error message?
    2. This part
    Code:
     
     yline('log_gap', lcolor(green))
    looks suspicious. First, if you want to use log_gap as a local macro, then you are using the wrong quotation marks: a local macro is referenced with a left quote before the name and an apostrophe after it. Second, log_gap is a variable in your dataset, not a local macro.
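
    To illustrate the distinction, here is a minimal sketch using the auto data shipped with Stata rather than the IHDS data. The names cutoff and meanmpg are made up for the example. It relies on yline() taking numbers, so a single summary of a variable is copied into a local macro before it is used there.

    ```stata
    sysuse auto, clear

    * A local macro is referenced with a left quote (`) before the name
    * and an apostrophe (') after it
    local cutoff = 25
    display `cutoff'

    * A variable is referenced by its bare name, with no quotes at all
    summarize mpg

    * yline() expects numbers, so copy one summary value into a local macro
    local meanmpg = r(mean)
    twoway (scatter mpg weight), yline(`meanmpg', lcolor(green))
    ```

    A whole variable such as log_gap cannot go inside yline() this way; it would have to be drawn as its own plot.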



    • #3
      Cross-posted at https://stackoverflow.com/questions/...uantile-graphs

      Please note our policy on cross-posting, which is explicit in the FAQ every poster is asked to read before posting.

      https://www.statalist.org/forums/help#crossposting
      8. May I cross-post to other forums?

      People posting on Statalist may also post the same question on other listservers or in web forums. There is absolutely no rule against doing that.

      But if you do post elsewhere, we ask that you provide cross-references in URL form to searchable archives. That way, people interested in your question can quickly check what has been said elsewhere and avoid posting similar comments. Being open about cross-posting saves everyone time.

      If your question was answered well elsewhere, please post a cross-reference to that answer on Statalist.



      • #4
        Over on Stack Overflow, a key document urges people to post minimal, complete, verifiable examples

        -- see https://stackoverflow.com/help/mcve --

        which is in line with our advice to post data examples together with code sufficient to show your problem. We are some way here from such an MCVE. Perhaps 1% of readers here know what the IHDS data are and there is no source provided for us to see or to sample.

        Nevertheless, let's break this into blocks and see what can be said:

        Code:
        gen inc_fem = incph if sex == 0
        gen loginc_fem = log(inc_fem)
        gen inc_male = incph if sex == 1
        gen loginc_male = log(inc_male)
        pctile qinc = logincph, nq(100) genp(percent)
        pctile qincf = loginc_fem, nquantiles(100) genp(percent1)
        pctile qincm = loginc_male, nquantiles(100) genp(percent2)
        gen log_gap = (qincm - qincf)/qincf
        A small comment is that this can be slimmed down:

        Code:
        gen loginc_fem = log(incph) if sex == 0
        gen loginc_male = log(incph) if sex == 1
        pctile qinc = logincph, nq(100) genp(percent)
        pctile qincf = loginc_fem, nquantiles(100) genp(percent1)
        pctile qincm = loginc_male, nquantiles(100) genp(percent2)
        gen log_gap = (qincm - qincf)/qincf
        Now your quantile regressions:

        Code:
        forvalues q=10/90{
        quiet qreg logincph i.edu sex age public married religion i.caste i.state i.industry i.occupation, q(`q')
        mat qr1=nullmat(qr1)\_b[sex]
        mat quantile=nullmat(quantile)\\`q'
        }
        svmat quantile
        svmat qr1
        This too can be slimmed down. As a purely stylistic point, I put quietly on a shorter line of code where there is a choice:

        Code:
        quietly forvalues q=10/90{
            qreg logincph i.edu sex age public married religion i.caste i.state i.industry i.occupation, q(`q')
            mat results = nullmat(results) \ _b[sex], `q'
        }
        svmat results
        Now we get to a block that looks problematic to me.

        Code:
        quiet reg logincph sex i.edu age public married religion i.caste i.state i.industry i.occupation if sex == 1
        local ols1 = _b[sex]
        quiet reg logincph sex i.edu age public married religion i.caste i.state i.industry i.occupation if sex == 0
        local ols0 = _b[sex]
        You're running regressions separately for males and females. But if sex is a constant in the data fed to the model, it can't be a predictor. Constant predictors drop out of regressions. Stata will save a coefficient for you, but it's necessarily zero:

        Code:
        . sysuse auto, clear
        
        . regress mpg foreign if foreign == 1
        note: foreign omitted because of collinearity
        
              Source |       SS           df       MS      Number of obs   =        22
        -------------+----------------------------------   F(0, 21)        =      0.00
               Model |           0         0           .   Prob > F        =         .
            Residual |  917.863636        21  43.7077922   R-squared       =    0.0000
        -------------+----------------------------------   Adj R-squared   =    0.0000
               Total |  917.863636        21  43.7077922   Root MSE        =    6.6112
        
        ------------------------------------------------------------------------------
                 mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             foreign |          0  (omitted)
               _cons |   24.77273    1.40951    17.58   0.000     21.84149    27.70396
        ------------------------------------------------------------------------------
        
        . di _b[foreign]
        0
        Also, separate regressions for males and females don't seem comparable with the quantile regressions for both, but you should know what you want to do.
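
        If a single OLS benchmark comparable with the pooled quantile regressions is what is wanted (an assumption, since only the poster knows the aim), one option is to fit the regression on both groups together. A minimal sketch with the auto data, where foreign stands in for sex and weight for the other covariates:

        ```stata
        sysuse auto, clear

        * Pooled regression: foreign varies within the estimation sample,
        * so its coefficient is estimated rather than omitted
        quietly regress mpg foreign weight
        local ols = _b[foreign]
        display `ols'
        ```

        Here the local macro ols holds one number that could be passed to yline().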

        Finally, Blaise has already commented on this line, which seems to be the immediate problem.

        Code:
        twoway (line qr11 quantile1, lcolor(red)), yline(`ols1', lcolor(red)) yline(`ols0', lcolor(black)) yline('log_gap', lcolor(green))
        You created log_gap as a variable earlier, so the punctuation around it is unneeded as well as illegal.

        Also, there's an alignment problem. Your first block of code will populate the first 99 observations (100 - 1), but svmat will populate only the first 81 (one for each of the quantiles 10 to 90), so the two sets of results are not aligned: your first percentile is at 1% and your first quantile regression coefficient is at 10%.
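
        One way to keep everything aligned is to compute the raw percentiles and the quantile regressions at exactly the same set of percentiles, so that observation k refers to the same percentile in every new variable. The following is only a sketch under assumptions: it uses the auto data, with foreign standing in for sex and mpg for log income, works at deciles to keep it quick, and takes the gap as a plain difference purely for illustration. The names q0, q1, gap and results are made up for the example. The /// continuation lines need to be run from a do-file.

        ```stata
        sysuse auto, clear

        * Raw gap at the 10th, 20th, ..., 90th percentiles, one pass per group
        pctile q0 = mpg if foreign == 0, nquantiles(10)
        pctile q1 = mpg if foreign == 1, nquantiles(10)
        gen gap = q1 - q0

        * Quantile regressions at exactly the same percentiles
        capture matrix drop results
        quietly forvalues q = 10(10)90 {
            qreg mpg foreign weight, quantile(`q')
            matrix results = nullmat(results) \ (_b[foreign], `q')
        }
        svmat results

        * Observation k now holds percentile 10*k for gap, results1 and results2
        quietly regress mpg foreign weight
        local ols = _b[foreign]
        twoway (line results1 results2, lcolor(red))   ///
               (line gap results2, lcolor(green)),     ///
               yline(`ols', lcolor(black))
        ```

        The variable gap is drawn as its own line plot, and only the single OLS number goes into yline().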
        Last edited by Nick Cox; 22 Aug 2017, 02:37.



        • #5
          Dear Nick, I want to plot the lower and upper confidence limits along with the coefficients. For the coefficients, the code is:
          mat qr1=nullmat(qr1)\_b[sex]
          Could you please provide code for obtaining the lower and upper confidence limits? For instance, _b gives the coefficients; what is the equivalent for the lower and upper confidence limits?
