Model equation from Nonparametric kernel regression analysis

John Tetteh

Join Date: Aug 2024

Posts: 3
#1

15 Aug 2024, 06:00

Dear all,
I performed a nonparametric kernel regression analysis to develop a diagnostic model for predicting HRQoL, a highly skewed continuous variable. This approach demonstrated superior predictive performance compared to other distributional analysis techniques.

My current challenge is presenting the model equation for external validation. Although I understand the theory, I'm having difficulty programming it in Stata for this purpose. Any assistance would be greatly appreciated.

. npregress kernel qol i.sex age i.work c.comorb i.diabetes c.newimd c.newtrigs c.newurea c.neweGFRMDRD c.newwaist newbmi

Computing mean function

Minimizing cross-validation function:

Iteration 0: Cross-validation criterion = 30.474236
Iteration 1: Cross-validation criterion = 30.351633
Iteration 2: Cross-validation criterion = 30.351633
Iteration 3: Cross-validation criterion = 30.351633
Iteration 4: Cross-validation criterion = 30.351633

warning: 213 observations were not used to compute the mean function because they violated the model
identification assumptions. These observations are marked as 1 in the system variable
_unident_sample. You may use the unidentsample() option to use a different variable name.

Computing optimal derivative bandwidth

Iteration 0: Cross-validation criterion = 1.0019674
Iteration 1: Cross-validation criterion = 1.0019674
Iteration 2: Cross-validation criterion = 1.0019386
Iteration 3: Cross-validation criterion = 1.0018571
Iteration 4: Cross-validation criterion = 1.0018571

Bandwidth
-----------------------------------
             |      Mean     Effect
-------------+---------------------
         sex |        .5         .5
         age |  5.397418   5.919251
        work |        .5         .5
      comorb |     .5638   .6183092
    diabetes |        .5         .5
      newimd |  6.627235   7.267969
    newtrigs |  .4576943   .5019451
     newurea |  .7612126    .834808
 neweGFRMDRD |  6.707786   7.356307
    newwaist |  7.322567   8.030526
      newbmi |  2.701951    2.96318
-----------------------------------

Local-linear regression                    Number of obs      =      3,940
Continuous kernel : epanechnikov           E(Kernel obs)      =      3,940
Discrete kernel   : liracine               R-squared          =     0.8250
Bandwidth         : cross-validation
--------------------------------------------------
                            qol |   Estimate
--------------------------------+-----------------
Mean                            |
                            qol |   .8561769
--------------------------------+-----------------
Effect                          |
                            age |  -.0003027
                         comorb |  -.0171372
                         newimd |  -.0020548
                       newtrigs |  -.0168828
                        newurea |  -.0030568
                    neweGFRMDRD |  -.0004259
                       newwaist |  -.0021735
                         newbmi |   .0003762
                                |
                            sex |
               (Female vs Male) |  -.0551113
                                |
                           work |
 (Retired vs Currently working) |  -.0555846
   (Other vs Currently working) |  -.1171366
                                |
                       diabetes |
                    (Yes vs No) |  -.0015465
--------------------------------------------------
Note: Effect estimates are averages of derivatives for continuous covariates and averages of contrasts for
factor covariates.
Note: You may compute standard errors using vce(bootstrap) or reps().

Nick Cox

Join Date: Mar 2014

Posts: 35636
#2

15 Aug 2024, 06:12

There isn't an equation you can use. Although produced by an objective algorithm, the fitted values aren't defined by an equation (or at least not one that would be of use or interest).

In high school you may have drawn by hand or eye a smooth curve through a set of points on a scatter plot. The method you've used in Stata is similar in spirit although not in substance; it's a kind of smoothing operation.
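
To make the smoothing point concrete, here is a minimal Python sketch of a local-linear fit with one continuous covariate. It is not what npregress runs internally: it assumes the textbook Epanechnikov kernel and a single bandwidth (Stata's kernel scaling and its handling of factor covariates may differ), and the names local_linear, x_train, y_train, x0 and h are made up for illustration. What it shows is that each fitted value is the intercept of a separate weighted regression centred on the target point, so there is no single set of coefficients to write down.

```python
def epanechnikov(u):
    """Textbook Epanechnikov kernel: 0.75 * (1 - u^2) on |u| < 1, else 0."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def local_linear(x_train, y_train, x0, h):
    """Return (mean, slope) of the kernel-weighted line fitted at x0.

    Solves, for this particular x0,
        min over (a, b) of sum_i K((x_i - x0) / h) * (y_i - a - b * (x_i - x0))^2
    so that a estimates E[y | x = x0] and b estimates its derivative there.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x_train, y_train):
        d = xi - x0
        w = epanechnikov(d / h)   # weight shrinks to 0 as xi moves away from x0
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    # Normal equations: [[s0, s1], [s1, s2]] @ [a, b] = [t0, t1]
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        raise ValueError("too few training points within the bandwidth of x0")
    a = (s2 * t0 - s1 * t1) / det
    b = (s0 * t1 - s1 * t0) / det
    return a, b


if __name__ == "__main__":
    # Hypothetical training data; a new prediction reuses all of it.
    x_train = [i * 0.5 for i in range(21)]
    y_train = [2.0 + 3.0 * x for x in x_train]
    mean_hat, slope_hat = local_linear(x_train, y_train, x0=4.3, h=2.0)
    print(mean_hat, slope_hat)
```

If this picture carries over to the multivariate case, predicting for a new observation needs the original estimation data and the bandwidths, because the weighted regression has to be solved again at every new point.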
John Tetteh

Join Date: Aug 2024

Posts: 3
#3

15 Aug 2024, 06:34

Thank you so much @Nick.
The predicted values are highly correlated with the observed values. Now is there an approach to interpret the mean derivates of the predictors?
Nick Cox

Join Date: Mar 2014

Posts: 35636
#4

15 Aug 2024, 07:37

Other than guessing that you mean derivatives, I regret that I don't know what your question means in general or with respect to this command. I hope someone else can help. Many of your predictors are categorical in any case.
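
One possible reading follows from the note printed under the output table in #1, which says the effect estimates are averages of derivatives for continuous covariates and averages of contrasts for factor covariates. Written out, that would look something like the expressions below. The notation is an assumption of mine, not Stata's: \hat{m} is the estimated mean function, x_i holds the covariate values of observation i, n is the number of observations averaged over, and x_{i,-d} means all covariates of observation i except the factor d.

```latex
% Continuous covariate x_k: average of the pointwise derivatives
\hat{\theta}_k = \frac{1}{n} \sum_{i=1}^{n}
    \frac{\partial \hat{m}(x_i)}{\partial x_k}

% Factor covariate d, level j versus base level 0: average contrast
\hat{\delta}_j = \frac{1}{n} \sum_{i=1}^{n}
    \left[ \hat{m}(x_{i,-d},\, d = j) - \hat{m}(x_{i,-d},\, d = 0) \right]
```

On that reading, the age estimate of -.0003027 would mean that, averaged over the sample, predicted qol changes by about -0.0003 per additional year of age, and the sex estimate of -.0551113 would mean predicted qol is about 0.055 lower on average for Female than for Male with the other covariates left at their observed values. These are averages of local slopes and contrasts, not coefficients of a single equation.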