  • Using coefficients from regress command to generate a new variable

    Good evening everyone,

    I am using Stata 16.1. I have to run a pooled OLS on panel data and then use the estimated coefficients to generate a new variable:

    1. The model I'm trying to implement is: Δk_{i,t} = (λ_0 + Λ Z_{i,t-1}) Gap_{i,t-1} + η_{i,t}
    2. I use the following regress command after declaring the dataset as a panel with xtset:


    regress actual_tier1_gap l.tier1_gap l.((c.tier1_gap)#(i.state1nonstate0 c.size c.return_on_equity_w ///
    i.below_tier1 c.provforNPA_to_net_advances_w i.Listeddummy1iflisted c.GdpGrowthRate c.inflation)), noconst

    (I apologise for the variable names being unwieldy.)

    3. I have run into a few issues:

    a. My first question is whether the command is appropriate for the model I am trying to implement.

    b. The estimate of λ_0 (the coefficient on Gap_{i,t-1}) stays small when I include only a few controls, but increases considerably when I include all of the controls above.

    c. Finally, in the second stage I have to construct a variable Λ Z_{i,t-1} from the estimated coefficients of equation 1, and I cannot work out how to do it.

    Any help would be much appreciated.

    regards,
    Gagan

  • George Ford
    replied
    I don't think you need to include a ratio, but I'm not sure exactly what you're after. You have a linear model with a lot of interactions. Not sure where the ratio comes into play if you want a prediction from that regression. If the theory calls for it, then sure.
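
    One way to see why no ratio is needed is to multiply out the model from the first post. This is only a sketch of the algebra, assuming that equation is the one being estimated and that Z has J elements (J and the index j are notation introduced here, not taken from the thread):

    ```latex
    \begin{aligned}
    \Delta k_{i,t} &= \left(\lambda_0 + \Lambda Z_{i,t-1}\right) Gap_{i,t-1} + \eta_{i,t} \\
                   &= \lambda_0\, Gap_{i,t-1} + \sum_{j=1}^{J} \Lambda_j \left(Z_{j,i,t-1} \times Gap_{i,t-1}\right) + \eta_{i,t}
    \end{aligned}
    ```

    Under that reading, the coefficient on each Z × Gap interaction is already the corresponding element of Λ, and the coefficient on the lagged gap alone is λ_0. Λhat·Z would then be the sum of the interaction coefficients times the Z's, with no division by the gap coefficient.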


  • gagandeep sharma
    replied
    Originally posted by George Ford View Post
    Use all the variables you want to include in generate X to get a single X.
    Thanks George,

    Could you please comment on this:

    Since each coefficient in the regression belongs to a product of two variables (variable*gap), is the true Λhat for each variable in the vector Z_{i,t-1} given by [_b(l.variable*l.gap)/_b(l.gap)]?

    If so, I could use:

    generate X = ((_b[cL.tier1_gap1#cL.size] / (_b[L.tier1_gap1])) * L.size) --------- (for one of the variables in vector z, and so on adding all the variables)


  • George Ford
    replied
    Use all the variables you want to include in generate X to get a single X.


  • gagandeep sharma
    replied
    Hi George,

    Terribly sorry to bother you again. If possible, could you address my original query? I have tried multiple iterations, but I keep getting results that aren't theoretically possible.

    Here is the regression command:

    regress actual_tier1_gap l.tier1_gap1 l.((c.tier1_gap1)#(i.state1nonstate0 c.size c.return_on_equity_w ///
    c.provforNPA_to_net_advances_w i.Listeddummy1iflisted c.GdpGrowthRate c.inflation)), noconst

    I need to construct a new variable VarX = Λhat Z_{i,t-1}, where vector Z is (i.state1nonstate0 c.size c.return_on_equity_w i.below_tier1 c.provforNPA_to_net_advances_w i.Listeddummy1iflisted c.GdpGrowthRate c.inflation).
    Since all these variables are interacted with Gap_{i,t-1} in the regression, is the following approach the correct way to recover the original coefficients in vector Z?

    generate X = ((_b[cL.tier1_gap1#cL.size] / (_b[L.tier1_gap1])) * L.size) --------- (for one of the variables in vector z, and so on adding all the variables)

    In short, is the true Λhat for each variable in the vector Z_{i,t-1} given by [_b(l.variable*l.gap)/_b(l.gap)]?
    Last edited by gagandeep sharma; 10 Nov 2021, 16:35.


  • Carlo Lazzaro
    replied
    Gagandeep:
    another source for grasping the building blocks of George's helpful advice is Example 3 in the -bootstrap- entry of the Stata .pdf manual.


  • George Ford
    replied
    I ripped the general setup from https://www.schmidheiny.name/teaching/bootstrap2up.pdf
    This bootstraps the critical t-stats for each variable in the second-stage equation (since the nominal ones are presumed incorrect). You'd compare the t-stats from the regression to these values rather than to the usual table. The first set of values is asymmetric; the second is symmetric. I think the former is probably better, but others may disagree.

    You can fancy it up where it makes a nice table if that's your desire.

    The main thing is that both equations are part of the bootstrap procedure (each round generates a new generated regressor gr).


  • gagandeep sharma
    replied
    Thanks George,

    With my limited coding abilities I think it will take me a day or two to understand this. I'm grateful that you spent so much time to help me out.




  • George Ford
    replied
    A start (thoughts welcome).

    Code:
    sysuse auto, clear

    * myprog runs both stages and returns t-statistics centered on the
    * full-sample estimates (scalars b_trunk, b_gr, b_cons, defined below)
    capture program drop myprog
    program define myprog, rclass
        * stage 1
        reg mpg weight c.weight#(c.length c.foreign), noconstant
        capture drop gr
        * generated regressor built from the interaction terms only
        gen gr = _b[c.weight#c.length]*weight*length + _b[c.weight#c.foreign]*weight*foreign
        * stage 2
        reg price trunk gr
        return scalar t_trunk = (_b[trunk]-b_trunk)/_se[trunk]
        return scalar t_gr = (_b[gr]-b_gr)/_se[gr]
        return scalar t_cons = (_b[_cons]-b_cons)/_se[_cons]
    end

    * full-sample estimates, stored as scalars for use inside myprog
    reg mpg weight c.weight#(c.length c.foreign), noconstant
    capture drop gr
    gen gr = _b[c.weight#c.length]*weight*length + _b[c.weight#c.foreign]*weight*foreign
    reg price trunk gr
    scalar b_trunk = _b[trunk]
    scalar b_gr = _b[gr]
    scalar b_cons = _b[_cons]

    * each replication reruns both stages on a resampled dataset
    bootstrap t_trunk=r(t_trunk) t_gr=r(t_gr) t_cons=r(t_cons), reps(100) seed(12345) saving(bs_t, replace): myprog
    preserve
    * -use- takes clear, not replace
    use bs_t, clear
    * asymmetric critical values
    centile t_trunk, centile(2.5, 97.5)
    centile t_gr, centile(2.5, 97.5)
    centile t_cons, centile(2.5, 97.5)
    * symmetric critical values
    gen t_abs_trunk = abs(t_trunk)
    gen t_abs_gr = abs(t_gr)
    gen t_abs_cons = abs(t_cons)
    centile t_abs_trunk t_abs_gr t_abs_cons, centile(95)
    restore


  • George Ford
    replied
    cancel that. error.
    Last edited by George Ford; 04 Nov 2021, 13:04. Reason: error/deleted


  • George Ford
    replied
    Code:
    sysuse auto, clear
    reg price mpg c.mpg#(c.weight c.length c.foreign) , noconstant
    predict pfit , xb
    gen pfitalt = _b[mpg]*mpg + _b[c.mpg#c.weight]*mpg*weight+_b[c.mpg#c.length]*mpg*length 
    gen newvar = _b[c.mpg#c.weight]*mpg*weight+_b[c.mpg#c.length]*mpg*length 
    gen newvaralt = pfit-_b[mpg]*mpg
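
    A quick check of the same idea, offered as a sketch rather than a definitive recipe. Here mpg stands in for the lagged gap and weight and length stand in for the Z's (an analogy assumed for illustration; the variable names lz and check are made up). If the interaction coefficients are the Λ's themselves, then (λ_0 + ΛZ) × gap should reproduce the fitted values:

    ```stata
    sysuse auto, clear
    reg price mpg c.mpg#(c.weight c.length), noconstant

    * Lambda-hat * Z: interaction coefficients times the Z's only
    * (no multiplication by mpg, no division by _b[mpg])
    generate double lz = _b[c.mpg#c.weight]*weight + _b[c.mpg#c.length]*length

    * (lambda0-hat + Lambda-hat*Z) * gap should equal the linear prediction
    predict double pfit, xb
    generate double check = (_b[mpg] + lz)*mpg

    * stops with an error if the two ever differ beyond rounding
    assert reldif(pfit, check) < 1e-8
    ```

    If the final assert passes silently, lz is the ΛZ piece on its own; multiplying it by mpg gives the same thing as newvar above, up to the dropped foreign term.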


  • George Ford
    replied
    Code:
    sysuse auto, clear
    reg price mpg weight length foreign
    * use this if you want the prediction of the regression
    predict pfit , xb
    * use this if you want parts of the regression (2 ways to get to the same result)
    gen newvar = _b[_cons] + _b[mpg]*mpg + _b[weight]*weight +_b[length]*length 
    gen newvaralt = pfit-_b[foreign]*foreign
    * newvar = newvaralt
    You've got ugly variable names. If you have trouble matching up the variable names with the coefficients (_b[x]), then

    Code:
    matrix list e(b)
    to see what Stata sees.
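
    Another way to look up the names, shown as a small sketch on the auto data (the model is only for illustration, not the one from the thread). Replaying the regression with the coeflegend option lists the _b[] expression for every coefficient, interactions included:

    ```stata
    sysuse auto, clear
    reg price mpg c.mpg#(c.weight c.length), noconstant

    * replay the last results, showing the _b[] name of each coefficient
    regress, coeflegend

    * a name copied from the legend can be used directly in an expression
    display _b[c.mpg#c.weight]
    ```

    The names in the legend are the same ones that appear as column names in e(b).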





  • gagandeep sharma
    replied
    Originally posted by Carlo Lazzaro View Post
    Gagandeep:
    1) you do not need to -xtset- your data first if you go with pooled OLS (BTW: pooled OLS would not be my first choice for panel data regression). With a bit of guesswork, your supervisor may be interested in the within R-sq (if -fe-) or the between R-sq (if -re-); both are produced by -xtreg- (with a bit of guesswork again, I assume that your regressand is continuous);
    2) OK. I assume that the literature in your research field supports your approach;
    3) about your question 3b (and with no other information from your side), given that it is not clear what you're controlling for, if the coefficient you're concerned about varies widely, I would check your model specification, just to be sure that you're on the right track.
    Hi Carlo,

    First of all, apologies for addressing you as Carlos.

    I have a panel dataset, and the variable Gap in my equation above is actually a predicted variable from a system GMM equation, so the data are already set as a panel. Nevertheless, I take your point about pooled OLS not requiring it.

    The model that I am using is pretty standard and comes from Jiang, C., Liu, H., & Molyneux, P. (2019). Do different forms of government ownership matter for bank capital behavior? Evidence from China. Journal of Financial Stability, 40, 38–49. https://doi.org/10.1016/j.jfs.2018.11.005

    P.S. Sorry for spamming. I thought maybe you missed my post. I'm at my wits' end and would appreciate any help possible.

    Thanks and regards.


    I need to construct a new variable VarX = Λhat Z_{i,t-1}, where vector Z is (i.state1nonstate0 c.size c.return_on_equity_w i.below_tier1 c.provforNPA_to_net_advances_w i.Listeddummy1iflisted c.GdpGrowthRate c.inflation).
    But as I mentioned, all these variables are interacted with Gap_{i,t-1}, so how can I use the estimated coefficients?

    Thanks and regards


  • gagandeep sharma
    replied
    George, thanks a ton for going through the model. My supervisor is also of the view that the process is not very sound econometrically, which I guess is your point too.

    "(That being so, the generated regressor is just the prediction from S2). If otherwise, then you can just multiply coefficients*Z's to get the prediction. Another generated regressor."

    Could you be a bit clearer and comment on the following:

    I need to construct a new variable VarX = Λhat Z_{i,t-1}, where vector Z is (i.state1nonstate0 c.size c.return_on_equity_w i.below_tier1 c.provforNPA_to_net_advances_w i.Listeddummy1iflisted c.GdpGrowthRate c.inflation).
    But as I mentioned, all these variables are interacted with Gap_{i,t-1}, so how can I use the estimated coefficients? For example, my results show a coefficient for L.state1nonstate0#cL.tier1_gap1. Should I divide this coefficient by gapl1 and then multiply by the dummy values of state1nonstate0 to get the estimates (and so on for all the variables in Z)?

    I will email the authors, but honestly I have had very little success in getting researchers to part with their code.

    Thanks and regards.



  • George Ford
    replied
    Looks like voodoo, but what do I know?

    Here's what a few minutes of review does for you (by someone unfamiliar with this literature).

    This is a three-stage model with two generated regressors: S1 to S2, and S2 to S3. These are unaccounted for, so the hypothesis tests are invalid (an error of unknown magnitude, though bootstrapping usually increases the SEs).

    In any case, in Step 2 they take a portion of the prediction from Step 1 to craft a new variable (eq 3). The adjustment λ is assumed constant. This can be done by multiplying Beta*Z's. The goal is to get a mean prediction and then create a new variable equal to the difference between the actual value and that mean (the gap, i.e., the true value minus the predicted value). I'm not sure taking a portion of the regression makes sense due to scaling (the model includes year dummies and a lagged DV). Insert that generated regressor into Step 2 (which kinda looks like your equation), then take the prediction of S2 and insert it into Step 3, which makes another generated regressor.

    Your model does not match the ones in that paper. There is no λ_0, just Λ Z_{i,t-1}, and it appears λ is a predetermined constant in this paper. (That being so, the generated regressor is just the prediction from S2). If otherwise, then you can just multiply coefficients*Z's to get the prediction. Another generated regressor.

    You'll need to bootstrap all 3 stages simultaneously for hypothesis testing. You'll have to code it using bsample.
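
    A rough sketch of what a hand-coded bsample loop could look like. It uses the auto data with two stages standing in for the three, so the regressions, the generated regressor gr, and the 200 replications are all placeholders assumed for illustration. With panel data one would presumably resample whole panels through bsample's cluster() and idcluster() options instead of single observations:

    ```stata
    sysuse auto, clear

    * full-sample estimate from the whole multi-step procedure
    regress mpg weight c.weight#c.length, noconstant
    generate double gr = _b[c.weight#c.length]*length
    regress price trunk gr
    scalar b_gr = _b[gr]
    drop gr

    * resample, redo every step, and store the centered t-statistic
    tempname memhold
    tempfile results
    postfile `memhold' double t_gr using `results'
    set seed 12345
    forvalues r = 1/200 {
        preserve
        bsample
        quietly regress mpg weight c.weight#c.length, noconstant
        generate double gr = _b[c.weight#c.length]*length
        quietly regress price trunk gr
        post `memhold' ((_b[gr] - b_gr)/_se[gr])
        restore
    }
    postclose `memhold'

    * bootstrap critical values for the t-statistic on gr
    use `results', clear
    centile t_gr, centile(2.5 97.5)
    ```

    The point of the loop is that every stage sits between bsample and post, so each replication regenerates the generated regressor before the final regression is run.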

    I wonder if they are using predictions of the DV for all the generated regressors? That seems to be what they are after, but it is unclear.

    I'd ask the authors for their code, or find a better approach.
