
  • Using semipar together with margins and marginsplot

    Hi, I am trying to make a margins plot after running the semipar command.
    After npregress, for example, margins and marginsplot can be used to plot the results. An example can be found here: https://www.stata.com/new-in-stata/n...ic-regression/
    However, I could not do the same after semipar.
    Is there a way to use marginsplot with semipar?

  • #2
    I don't think there is a straightforward way to do that with semipar.
    The command is relatively outdated and, at its core, it is based on a simple regression on the conditional residuals of all the variables in your model.
    May I ask what type of "margins" plot you have in mind?
    Fernando
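
    For readers who want the "regression on conditional residuals" spelled out, it is usually written as Robinson's double-residual estimator. The notation (y for the outcome, x for the linear covariates, z for the nonparametric covariate, m for the unknown function) is assumed here for illustration and is not taken from the semipar documentation:

    ```latex
    \begin{align}
    y_i &= x_i'\beta + m(z_i) + \varepsilon_i, \qquad E[\varepsilon_i \mid x_i, z_i] = 0 \\
    E[y_i \mid z_i] &= E[x_i \mid z_i]'\beta + m(z_i) \\
    y_i - E[y_i \mid z_i] &= \bigl(x_i - E[x_i \mid z_i]\bigr)'\beta + \varepsilon_i
    \end{align}
    ```

    The conditional expectations are replaced by nonparametric estimates, beta is estimated by least squares on the last line, and m is then recovered from a nonparametric regression of y - x'beta_hat on z.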



    • #3
      Hi, I'm planning to use margins and marginsplot to plot the result of semipar, as the plot given by semipar is too clustered.
      At the moment, I'm trying to replicate the result of semipar with npregress, since margins works after npregress.
      Is it possible to replicate semipar's result with npregress? Specifically, are the calculations behind the two commands the same?
      Using a Gaussian kernel for both, and the local-linear estimator for npregress, I get results very close to semipar's. But I'd like to clarify whether the underlying calculations are in fact the same.

      Detailed commands:
      semipar Y X1 X2 X3 , nonpar(X_nonpar) kernel(gaussian) test(2) robust partial(nonlinear_component) generate(semipar_result)
      npregress kernel nonlinear_component X_nonpar, estimator(linear) kernel(gaussian) meanbwid(676.19, copy) predict(npreg_result) noderivatives

      where:
      • X_nonpar is the predictor with the nonlinear effect.
      • nonlinear_component stores the partialled-out result from semipar, which is then used as the outcome in npregress.
      • npregress uses the bandwidth 676.19, which is the one chosen by semipar.
      • semipar_result and npreg_result are quite close, but I don't know whether that is a coincidence or whether they come from the same calculation.
      Lastly, I would appreciate it if anyone could explain how to get semipar's actual bandwidth. The bandwidth shown by semipar (676.19) is rounded to two decimal places, and the unrounded value isn't reported.
      Last edited by Li Han; 16 May 2019, 00:31.



      • #4
        Hi Li Han,
        So, if I understand you correctly, you want to replicate only the nonparametric part of the semipar graph with npregress, is that correct?
        If so, it is possible, but it is not as simple as what you are trying to do.
        What you need from semipar is its estimate of the model residuals, which is used to obtain the nonparametric component.
        Below is a replication of the semipar results, producing the figure for the nonparametric component with semipar, lpoly, and npregress.
        HPRICE3 is the dataset used in the semipar help file.

        Code:
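        * Load the data used in the semipar help file
        use ".\HPRICE3.DTA", clear
        * Fit the model without the graph and store its prediction
        semipar lprice ldist larea lland rooms bath age, nonpar(linst) nograph
        predict lprice_hat,
        * Center the prediction, then form the residual used for the nonparametric part
        sum lprice_hat,
        replace lprice_hat=lprice_hat-r(mean)
        gen lprice_res=lprice-lprice_hat
        * m1: the graph produced by semipar itself
        semipar lprice ldist larea lland rooms bath age, nonpar(linst)
        graph save m1
        * m2: local-linear fit of the residual on linst with lpoly
        lpoly lprice_res  linst, degree(1) kernel(gaussian)
        graph save m2
        *to check the bandwidth
        return list
        * m3: same fit with npregress, with the bandwidth copied from the return list output
        npregress kernel lprice_res  linst, kernel(gaussian) bwidth( .2368461925982401, copy) noderiv
        two scatter lprice_res  linst || line _Mean_lprice_res linst, sort
        graph save m3
        * Compare the three graphs side by side
        graph combine m1.gph m2.gph m3.gph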
        Hope this helps
        Fernando
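
        The replication above can probably be tightened in two ways: by reading the bandwidth from lpoly's stored result r(bwidth) instead of typing it in, and by following npregress with margins and marginsplot, which was the original goal. This is only a sketch. It assumes HPRICE3.DTA is in the working directory, that semipar is installed, and that margins accepts an at() grid after npregress kernel with noderiv. The 10-step grid and the macro names bw, lo, hi, and step are illustrative choices.

        ```stata
        * Assumption: HPRICE3.DTA is in the working directory and semipar is installed
        use ".\HPRICE3.DTA", clear
        semipar lprice ldist larea lland rooms bath age, nonpar(linst) nograph
        predict lprice_hat
        summarize lprice_hat
        replace lprice_hat = lprice_hat - r(mean)
        generate lprice_res = lprice - lprice_hat

        * lpoly leaves the bandwidth it used in r(bwidth); keep it in a local macro
        lpoly lprice_res linst, degree(1) kernel(gaussian) nograph
        local bw = r(bwidth)
        display "bandwidth used by lpoly: `bw'"

        * Same npregress call as in the post, with the macro in place of the typed number
        npregress kernel lprice_res linst, kernel(gaussian) bwidth(`bw', copy) noderiv

        * Build an evenly spaced grid over the observed range of linst (10 steps is arbitrary)
        summarize linst
        local lo = r(min)
        local hi = r(max)
        local step = (`hi' - `lo')/10

        * Predicted mean of the residual at each grid point, then the plot
        margins, at(linst=(`lo'(`step')`hi'))
        marginsplot
        ```

        The local macro holds the bandwidth only to the precision Stata keeps when it stores a number in a macro, so tiny differences from lpoly's fit are possible. Standard errors for margins after npregress rely on the bootstrap, so check the npregress postestimation documentation before interpreting any confidence intervals.
        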



        • #5
          Hi, thank you very much for your help. It's exactly what I wanted.
          One more question: how can I run the Hardle and Mammen test (a test of the difference between the parametric and nonparametric fits) without relying on semipar? I want to run the test with different bandwidths, but semipar doesn't let me choose the bandwidth, so I need another way to do it.



          • #6
            Well, I haven't applied that test by hand before. However, if you look into semipar.ado and the authors' Stata paper, you can follow exactly the procedure they use.
            This will also help you see where my replication comes from.
            In any case, since you seem to be replicating Robinson's estimator by hand, I would recommend doing the whole procedure with npregress instead of lpoly.
            lpoly uses a plug-in procedure for the bandwidth, which may not perform well in all cases.
            npregress instead uses cross-validation to select the bandwidth.
            Furthermore, unless the nonparametric component is highly nonlinear, you may benefit from modelling it with other approaches, such as splines, polynomials, or fractional polynomials.
            HTH
            Fernando
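
            As one illustration of the polynomial alternative, the sketch below enters linst as a cubic through factor-variable notation, so that margins and marginsplot work directly after regress. It assumes the same HPRICE3.DTA file in the working directory. The cubic order and the 10-step grid are arbitrary choices for illustration, not a recommendation from this thread.

            ```stata
            * Assumption: HPRICE3.DTA is in the working directory
            use ".\HPRICE3.DTA", clear

            * Cubic in linst via factor-variable notation, so margins knows the terms are linked
            regress lprice ldist larea lland rooms bath age c.linst##c.linst##c.linst

            * Evenly spaced grid over the observed range of linst (10 steps is arbitrary)
            summarize linst
            local lo = r(min)
            local hi = r(max)
            local step = (`hi' - `lo')/10

            * Predicted lprice over the grid, averaging over the other covariates
            margins, at(linst=(`lo'(`step')`hi'))
            marginsplot
            ```

            Because the polynomial terms are written with c.linst## rather than generated by hand, margins changes all three terms together when it moves linst across the grid.
            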

