
  • Unable to run an out-of-sample prediction ['predict'] with the npregress kernel function

    I have been running nonparametric regressions with in-command predictions for months, with ease. However, I just started attempting to run the following:
    FOR EXAMPLE: npregress kernel y x1 x2 i.x3 if x4 < 100, imaic vce(bootstrap, reps(150) seed(123) dots(1))
    TO THEN RUN: predict yhat
    However, predict only produces values for the estimation sample and ignores the out-of-sample observations.

    Note: this is not time-series data

    How am I failing to utilise the 'predict' command correctly?
    Thank you!
    Last edited by Kent Bhupathi; 03 Oct 2019, 14:42.

  • #2
    I have the exact same problem. It seems that predict after npregress kernel does not, by default, extend to out-of-sample observations. Please correct me if I am wrong; otherwise I would be happy if this could be fixed!



    • #3
      The help file says:
      "These statistics are available both in and out of sample; type predict ... if e(sample) ... if wanted only for the estimation sample."
      So you should contact tech support.



      • #4
        Hi Stefan
        I think the reason npregress does not make out-of-sample predictions (even when the out-of-sample observations fall within the support of the original independent variables) is that, unlike in parametric models, there are no "betas" from which to compute the predictions. The only parameter that plays a similar role is the bandwidth.
        Because the model is "local" in nature, one would need to re-estimate the model at each "out of sample" observation, using the observed outcomes of the "in sample" data.
        You could do it manually, but I do not think npregress supports that yet.
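        To make "doing it manually" concrete, here is a sketch of the local-linear estimator for a single covariate. The notation is assumed for illustration and is not taken from the npregress documentation: x_0 is the point of prediction, h the bandwidth, K the kernel, and (y_i, x_i) the in-sample data. The prediction at x_0 is the intercept of a kernel-weighted least-squares fit centered at x_0:

```latex
\[
\bigl(\hat\alpha(x_0),\,\hat\beta(x_0)\bigr)
  = \arg\min_{a,\,b}\; \sum_{i=1}^{n}
    K\!\left(\frac{x_i - x_0}{h}\right)
    \bigl(y_i - a - b\,(x_i - x_0)\bigr)^{2},
\qquad
\hat m(x_0) = \hat\alpha(x_0).
\]
```

        With a Gaussian kernel, K((x_i - x_0)/h) is proportional to normalden(x_i - x_0, 0, h), so each prediction amounts to one weighted regress of the outcome on x - x_0, with the constant as the predicted mean.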
        See below for an example:

        Code:
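        webuse dui, clear
        set seed 1
        * local-linear fit with a Gaussian kernel on the 500 original observations
        npregress kernel citations fines , kernel(gaussian)
        * keep the estimated bandwidth; the global bw expands to the matrix element bw[1,1]
        matrix bw=e(bwidth)
        global bw bw[1,1]
        predict mean_insample
        
        ** out of sample prediction
        ** method 1: refit the kernel-weighted regression centered at each grid point
        
        gen dfines=.
        local Nob=500
        forvalues fin=5(.1)15 {
            local Nob=`Nob'+1
            * center fines at the grid point and fit on the original 500 observations only
            qui:replace dfines=fines-`fin'
            qui:regress citations dfines if _n<=500 [w=normalden(dfines,0,$bw) ]
            * append one observation holding the grid point and its prediction (the intercept)
            qui:set obs `Nob'
            qui:replace mean_insample=_b[_cons] in `Nob'
            qui:replace fines        =`fin' in `Nob'
        }
        ** method 2: extrapolate linearly from the fits centered at fines=7.4 and fines=12
        gen mean_insample2=mean_insample
        qui:replace dfines=fines-7.4
        qui:regress citations dfines if _n<=500 [w=normalden(dfines,0,$bw) ]
        predict xb
        replace mean_insample2=xb if fines<=7.4
        qui:replace dfines=fines-12
        drop xb
        qui:regress citations dfines if _n<=500 [w=normalden(dfines,0,$bw) ]
        predict xb
        replace mean_insample2=xb if fines>=12
        * compare both sets of predictions with the in-sample fit and the distribution of fines
        two scatter mean_insample fines , msize(tiny) || scatter mean_insample2 fines , msize(tiny)  || scatter mean_insample fines if _n<=500 || histogram fines if _n<=500, yaxis(2) fcolor(%40)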
        As you can see, I was able to produce "out of sample" predictions, which I compare with the in-sample predictions from npregress.
        For the observations within the support, the predictions fall within what you would expect.
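        The same idea can be applied to actual held-out observations rather than a grid of values. The following is a minimal sketch under stated assumptions, not something npregress provides: the 400/100 split and the names insample, h, and yhat_oos are hypothetical, the model has a single continuous covariate, and the kernel is Gaussian. It does not cover factor variables such as i.x3 in the original question.

```stata
webuse dui, clear

* Hypothetical split for illustration: rows 1-400 are the estimation sample,
* rows 401-500 play the role of the out-of-sample observations
gen byte insample = _n <= 400

npregress kernel citations fines if insample, kernel(gaussian)
matrix bw = e(bwidth)
scalar h = bw[1,1]

gen double yhat_oos = .
gen double dfines   = .

forvalues i = 401/500 {
    * center the regressor at the value of fines in observation i
    quietly replace dfines = fines - fines[`i']
    * kernel-weighted linear fit using the estimation sample only
    quietly regress citations dfines if insample [aw = normalden(dfines, 0, scalar(h))]
    * the intercept is the local-linear prediction at that point
    quietly replace yhat_oos = _b[_cons] in `i'
}

summarize yhat_oos if !insample
```

        Each held-out observation costs one weighted regression, so this is only practical for a moderate number of out-of-sample points.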
        The problem lies with the predictions outside the original support. One option is to keep using the local estimator, taking the point outside the original support as the "point of reference" for the prediction (the first method above). These are the blue dots in the figure. You can see some curvature at the right of the figure, but there is nothing that would explain it.
        The alternative is to use the last "original" point of reference and take the linear prediction from it (this only works for the local linear model). The predictions for points outside the original support are then straight lines (red dots).
        I do not know whether one is preferable to the other.
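        For reference, the two extrapolation rules can be written side by side. This is a sketch in assumed notation: \hat\alpha(x) and \hat\beta(x) are the intercept and slope of the kernel-weighted linear fit centered at x, and x_b is the boundary reference point used in the code (7.4 on the left, 12 on the right).

```latex
\[
\hat m_{1}(x_0) = \hat\alpha(x_0),
\qquad
\hat m_{2}(x_0) = \hat\alpha(x_b) + \hat\beta(x_b)\,(x_0 - x_b),
\qquad
x_b =
\begin{cases}
7.4 & \text{if } x_0 \le 7.4,\\
12  & \text{if } x_0 \ge 12.
\end{cases}
\]
```

        The first rule refits the local regression at every x_0, including points with no nearby data; the second holds the boundary fit fixed, which is why it traces a straight line.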
        HTH

        [Figure: Graph.png]

