Interval constraint on dependent variable, constrained least squares

Matteo Bagnara

Join Date: Jun 2018
Posts: 27


13 Aug 2018, 09:04

Hi everybody,
I have the following dataset:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(Q MeR RV qrt_aday qrt_nday qrt_var lnRV)
16  .0008290323  4.736642e-06  -.005230409    .05646858  4.608747e-06 -12.260182
17    .00044375 .000012509194  .0091804005    .01881856 .000012538238 -11.289046
18     .0004625  .00001069046  .0012458297   .028011275 .000010679361 -11.446158
19 .00010158731 .000011364647 -.0008584664   .006905314 .000011389733 -11.385003
20  .0004209678 .000010257266   .008079512     .0177161  9.809267e-06 -11.487524
21  -.000515873  .00003718756   .017037444   -.05071094  .00003758523 -10.199536
22     .0010875 .000012483018    .01383731    .05533044 .000012532302 -11.291142
23 .00056666665  9.596032e-06 -.0020172177    .03741645  9.378113e-06  -11.55416
24 -.0004698412 .000023483357 -.0017103762   -.02861211  .00002310704  -10.65922
25 -.0008047619  .00004745881    .00752774   -.05969866  .00004681504  -9.955648
26 -.0016609374   .0000869344   -.02177702   -.08736214  .00008737257  -9.350357
27   .000885484  .00005871836   .029210404    .02385477  .00005932969  -9.742758
28  .0020370968 .000021779417   .011997658    .11352037  .00002142717 -10.734545
29 .00028906247  .00004278279  -.018141773   .035276785  .00004326642 -10.059375
30  .0010571429  .00001257287   .007371168    .05881207 .000012314264  -11.28397
31 .00005161291   .0000289419   .003370562 -.0010431414 .000028606475  -10.45022
32 -.0012209677  .00004922577   .004821284    -.0820741  .00004940278  -9.919093
33  .0019661018  .00003825759   .014042603    .10072759  .00003842825 -10.171168
34 .00051730766 .000028131086 -.0038036145    .02997416  .00002835944 -10.478636
35  .0003320755 .000016813141  -.003634345    .02080315 .000016484906  -10.99335
end
format %tq Q

I need to estimate the following model: RV = B0 + B1*qrt_aday + B2*qrt_nday + B3*qrt_var
with the constraint RV >= 0.

I know that cnsreg cannot be used with interval constraints, so I turned to the nl command. My idea was to estimate the model with exp(lnRV) as the dependent variable, where lnRV = ln(RV). With the code:

nl (exp(lnRV) = {a}*qrt_aday + {b}*qrt_nday + {c}*qrt_var + {c}), nolog

I get "exp: operator not valid". If I instead generate the exponential of lnRV as a new variable before calling nl, I obviously get the same results as an unconstrained regression. Does anybody have an idea?
Thank you in advance


Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#2

13 Aug 2018, 09:37

I think what you want to do here is estimate a model of ln RV, and then use the exponential of those estimates as your estimates of RV.
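A minimal sketch of this suggestion, assuming the -dataex- example from the first post is in memory. The names lnRV_hat and RV_hat are made up for illustration. Note that exponentiating a fitted log gives a positive value by construction, but it is an estimate on the geometric-mean scale rather than of the arithmetic mean of RV.

```stata
* Sketch only: assumes the example data from post #1 are loaded
regress lnRV qrt_aday qrt_nday qrt_var

* Fitted values on the log scale (lnRV_hat is a hypothetical name)
predict double lnRV_hat, xb

* Back-transform: exp() of any real number is positive
generate double RV_hat = exp(lnRV_hat)

* Compare observed and back-transformed fitted values
summarize RV RV_hat
```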

Nick Cox

Join Date: Mar 2014
Posts: 35642

13 Aug 2018, 09:43

Alternatively, a log link and your original scale. Here is an example -- indicative and not definitive --

Code:

. glm RV qrt_aday qrt_nday qrt_var, link(log) 

Iteration 0:   log likelihood =  209.96982  
Iteration 1:   log likelihood =  213.38108  
Iteration 2:   log likelihood =  214.17374  
Iteration 3:   log likelihood =   214.1738  
Iteration 4:   log likelihood =   214.1738  

Generalized linear models                         No. of obs      =         20
Optimization     : ML                             Residual df     =         16
                                                  Scale parameter =   3.66e-11
Deviance         =  5.84932e-10                   (1/df) Deviance =   3.66e-11
Pearson          =  5.84932e-10                   (1/df) Pearson  =   3.66e-11

Variance function: V(u) = 1                       [Gaussian]
Link function    : g(u) = ln(u)                   [Log]

                                                  AIC             =  -21.01738
Log likelihood   =   214.173803                   BIC             =  -47.93172

------------------------------------------------------------------------------
             |                 OIM
          RV |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    qrt_aday |   6.622073   2.793225     2.37   0.018     1.147453    12.09669
    qrt_nday |  -.4441017   1.039341    -0.43   0.669    -2.481173     1.59297
     qrt_var |   24062.36   2081.401    11.56   0.000     19982.89    28141.83
       _cons |  -11.29677   .1112656  -101.53   0.000    -11.51485    -11.0787
------------------------------------------------------------------------------
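As a follow-up to the output above, the fitted means can be obtained with predict after glm. This is only a sketch, assuming the same example data; RV_mu is a hypothetical variable name. With a log link the fitted mean is exp(xb), so it cannot be zero or negative.

```stata
* Sketch only: refit the log-link model shown above
glm RV qrt_aday qrt_nday qrt_var, link(log)

* Predicted mean on the original scale of RV (RV_mu is a made-up name)
predict double RV_mu, mu

* Check that no fitted mean is zero or negative
count if RV_mu <= 0

* Compare observed values with fitted means
list Q RV RV_mu, noobs
```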


Matteo Bagnara

Join Date: Jun 2018

Posts: 27
#4

13 Aug 2018, 10:09

Clyde and Nick,
thank you for your help! I am a bit confused, though. If I estimate a model with the ln of the dependent variable and the predictors in natural levels, what about the coefficient estimates? My final goal is to get the coefficients of the model presented above, so that I can show how to build a measure of the dependent variable as a linear combination of the right-hand variables.
Nick Cox

Join Date: Mar 2014

Posts: 35642
#5

13 Aug 2018, 10:16

A linear combination of the right-hand variables will go negative at some point in your data space.
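Whether that happens within the sample at hand can be checked directly. A minimal sketch, assuming the example data from post #1 are in memory; RV_lin is a hypothetical variable name. A count of zero here says nothing about predictor values outside the sample.

```stata
* Sketch only: plain unconstrained linear regression on the original scale
regress RV qrt_aday qrt_nday qrt_var

* Linear prediction (RV_lin is a made-up name)
predict double RV_lin, xb

* How many in-sample fitted values are negative?
count if RV_lin < 0

* Range of the fitted values, for reference
summarize RV_lin
```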
Matteo Bagnara

Join Date: Jun 2018

Posts: 27
#6

13 Aug 2018, 10:31

Yes, I understand that. But isn't that just the purpose of using constrained least squares? I just want to rule out negative fitted values and get the coefficients of such a regression.
Nick Cox

Join Date: Mar 2014

Posts: 35642
#7

13 Aug 2018, 10:38

You can't directly constrain predictions except by working in a space where impossible predictions are indeed impossible. By the way, you say that RV can be zero, but is that really true? Note that a log link with a generalised linear model implies merely that predicted means should be positive; it says nothing about the data as such.

Tell us more (than zero) about what your variables mean, as it might inform suggestions about plausible models.
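For those who prefer nl, one way to work in such a space is to put the exponential on the right-hand side, since nl appears to expect a plain variable name on the left. This is a sketch under assumptions: the parameter names b0 to b3 and the variable RV_nl are made up, and the starting value -11 for b0 is only a guess based on the scale of lnRV in the example data. Convergence is not guaranteed. Least squares on exp(xb) targets the same mean function as the Gaussian log-link glm in #3.

```stata
* Sketch only: nonlinear least squares with a strictly positive mean function
* {b0=-11} sets an assumed starting value; other parameters start at 0
nl (RV = exp({b0=-11} + {b1}*qrt_aday + {b2}*qrt_nday + {b3}*qrt_var)), nolog

* Fitted values are exp(linear index), so they are positive by construction
predict double RV_nl, yhat

* Confirm that none are zero or negative
count if RV_nl <= 0
```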
Matteo Bagnara

Join Date: Jun 2018

Posts: 27
#8

13 Aug 2018, 10:51

RV stands for realized variance (in my model: at t+1); it is simply the quarterly realized variance of the ln of MeR, so it can't be negative. That is the point of the constraint. The right-hand variables are predictors, i.e. quarterly returns (qrt_aday, qrt_nday) and the quarterly realized variance qrt_var. So qrt_var is the lagged value of RV, and the other predictors are also lagged by one period with respect to RV.
Nick Cox

Join Date: Mar 2014

Posts: 35642
#9

13 Aug 2018, 11:01

So, in principle you expect the mean response to be positive.

In my experience, economists [you?] don't know generalized linear models as a family despite being familiar with particular members of the family.

https://onlinelibrary.wiley.com/doi/...9.2002.00440.x is a rather good focused introduction. The examples from soil science aren't hard to understand.

BTW, the sample data suggest that a very simple model will suffice:

Last edited by Nick Cox; 13 Aug 2018, 11:18.
Matteo Bagnara

Join Date: Jun 2018

Posts: 27
#10

13 Aug 2018, 12:04

I do not deny my ignorance in statistical methods.
But I understand that a simple linear model produces only positive fitted values, because I already tried it on my own. In the current situation I am trying to replicate certain results, and with the same data the unconstrained model produces strongly different results. The only information I have is that the desired results are obtained by forcing the fitted values to be non-negative. That's why I thought there was a different way to do this. But probably a simple linear model is already fine given the dataset.