
  • cherry singhal
    started a topic mixed model postestimation using predict

    mixed model postestimation using predict

    Hello,

    I am using Stata 14. I have a panel dataset of firms and years, and I am running a mixed model that includes both fixed and random effects.
    My total sample size is 52,801. The dataset has missing values, so not all of it is used when I run the mixed model: the output shows that the number of observations used is 40,595.

    I ran the command "predict ebX1,reffects" to obtain BLUPs for the random component of the independent variable X1. Since the mixed model analysis used 40,595 of the 52,801 observations, I assumed that BLUPs would be created for exactly 40,595 observations.
    However, when I summarize ebX1 (command used: summarize ebX1), the summary table shows 51,032 observations, which I do not understand. If I instead run "summ ebX1 if e(sample)", I get summary statistics for the 40,595 observations. So I am able to get the correct summary statistics, but why were BLUPs calculated for 51,032 (> 40,595) observations in the first place?
    Also, if I want to plot a histogram of the BLUPs using the command "hist ebX1, freq", the graph is plotted for 51,032 observations, which is incorrect for my purposes. And I cannot use "if e(sample)" with the histogram command. I can create an indicator variable to mark the 40,595 observations and go from there, but I am curious why BLUPs were calculated for more observations than that.

    thanks.


  • Jing Tong
    replied
    Originally posted by Leonardo Guizzetti:
    "Nothing more to add on the stats side of things. As a general and unsolicited piece of advice: do not use GPT-4 (or similar AI bots) for code. It is no substitute for learned knowledge and skills."

    Hi Leonardo, thank you for your helpful suggestions! I see. I'll keep focusing on learning the statistics itself!



  • Leonardo Guizzetti
    replied
    Originally posted by Jing Tong:
    "Hi Clyde, I am also analyzing the same model using the same code. I'm curious about the order of "randslope randint" in this code. I asked GPT-4, and it told me that "randslope" actually refers to the random intercept, while "randint" actually refers to the random slope. I am confused and couldn't find any similar examples in the manual or other materials. Could you please give me some suggestions? Thanks!"

    Nothing more to add on the stats side of things. As a general and unsolicited piece of advice: do not use GPT-4 (or similar AI bots) for code. It is no substitute for learned knowledge and skills.



  • Jing Tong
    replied
    Haha, yes! Clyde wins. And also thank you, Joseph!



  • Joseph Coveney
    replied
    Originally posted by Jing Tong:
    "I'm curious about the order of "randslope randint" in this code. I asked GPT-4, and it told me that "randslope" actually refers to the random intercept, while "randint" actually refers to the random slope. I am confused and couldn't find any similar examples in the manual or other materials. Could you please give me some suggestions?"

    Just check what the variable labels say.

    .
    . version 18.0

    .
    . clear *

    .
    . local line_size `c(linesize)'

    . set linesize 80

    .
    . // seedem
    . set seed 2088152884

    .
    . // Random effects (intercepts and slopes)
    . quietly drawnorm i s, double corr(1 -0.1 \ -0.1 1) sd(2 1) n(300)

    . generate int pid = _n

    .
    . // Repeated measurements
    . quietly expand 10

    . bysort pid: generate byte tim = _n

    .
    . // Outcomes
    . generate double out = ///
    >         100 + i + ///
    >         (tim - 5.5) * (10 + s) + ///
    >                 runiform(0, 15)

    .
    . mixed out c.tim || pid: tim, covariance(unstructured) nolrtest nolog

    Mixed-effects ML regression                        Number of obs    =    3,000
    Group variable: pid                                Number of groups =      300
                                                       Obs per group:
                                                                    min =       10
                                                                    avg =     10.0
                                                                    max =       10
                                                       Wald chi2(1)     = 25027.67
    Log likelihood = -9088.4726                        Prob > chi2      =   0.0000

    ------------------------------------------------------------------------------
             out | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             tim |   9.987772   .0631333   158.20   0.000     9.864033    10.11151
           _cons |   52.62343    .381409   137.97   0.000     51.87589    53.37098
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
    pid: Unstructured            |
                        var(tim) |   .9630695   .0978628       .789155    1.175311
                      var(_cons) |   34.68389   3.572712       28.3431    42.44321
                  cov(tim,_cons) |  -5.485814   .5726141     -6.608117   -4.363511
    -----------------------------+------------------------------------------------
                   var(Residual) |   19.19564   .5541302      18.13972    20.31303
    ------------------------------------------------------------------------------

    .
    . predict double randslope randint, reffects

    . describe rand*

    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    --------------------------------------------------------------------------------
    randslope       double  %10.0g                BLUP r.e. for pid: tim
    randint         float   %9.0g                 BLUP r.e. for pid: _cons

    .
    . set linesize `line_size'

    .
    . exit

    end of do-file


    .


    The score so far:
    Clyde 1
    Chatbox 0



  • Jing Tong
    replied
    Hi Clyde, thank you so much for your reply and for sharing the example! I also found a link that suggests the random slope is listed first. I will follow your suggestions!
    https://campusguides.lib.utah.edu/c....0853&p=1054177



  • Clyde Schechter
    replied
    I am unable to find any documentation of this. My past experience using -mixed- and -predict, reffects- has taught me that the random slope is listed first. Here's an example:
    Code:
    . webuse pig, clear
    (Longitudinal analysis of pig weights)
    
    .
    . mixed weight week || id: week
    
    Performing EM optimization ...
    
    Performing gradient-based optimization:
    Iteration 0:  Log likelihood = -869.03825  
    Iteration 1:  Log likelihood = -869.03825  
    
    Computing standard errors ...
    
    Mixed-effects ML regression                         Number of obs    =     432
    Group variable: id                                  Number of groups =      48
                                                        Obs per group:
                                                                     min =       9
                                                                     avg =     9.0
                                                                     max =       9
                                                        Wald chi2(1)     = 4689.51
    Log likelihood = -869.03825                         Prob > chi2      =  0.0000
    
    ------------------------------------------------------------------------------
          weight | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            week |   6.209896   .0906819    68.48   0.000     6.032163    6.387629
           _cons |   19.35561   .3979159    48.64   0.000     18.57571    20.13551
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
    id: Independent              |
                       var(week) |   .3680668   .0801181      .2402389    .5639103
                      var(_cons) |   6.756364   1.543503      4.317721    10.57235
    -----------------------------+------------------------------------------------
                   var(Residual) |   1.598811   .1233988      1.374358     1.85992
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(2) = 764.42                Prob > chi2 = 0.0000
    
    Note: LR test is conservative and provided only for reference.
    
    . predict prediction*, reffects
    
    . des
    
    Contains data from https://www.stata-press.com/data/r18/pig.dta
     Observations:           432                  Longitudinal analysis of pig weights
        Variables:             5                  25 May 2022 11:06
                                                  (_dta has notes)
    ----------------------------------------------------------------------------------------------------------------------------------------------
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    ----------------------------------------------------------------------------------------------------------------------------------------------
    id              float   %9.0g                 
    week            float   %9.0g                 
    weight          float   %9.0g                 
    prediction1     float   %9.0g                 BLUP r.e. for id: week
    prediction2     float   %9.0g                 BLUP r.e. for id: _cons
    ----------------------------------------------------------------------------------------------------------------------------------------------
    Sorted by:
         Note: Dataset has changed since last saved.
    Note that when left to its own devices, -predict- created the random slope for the first prediction and the intercept for the second. I believe this is true for the -me- suite of estimations in general. It is, of course, safest, to just specify a single stub, as was done in this example. Stata will then tell you, in the variable label, which predicted variable corresponds to which model parameter.
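
    The advice above (use a single stub, then read the variable labels) can also be scripted. The following is only a minimal sketch and was not part of the original reply: it reuses the pig dataset from the example, and the stub re as well as the final names randslope and randint are illustrative choices. Each variable is renamed according to what its label says, so the result does not rely on the order in which -predict- creates the variables.

    ```stata
    webuse pig, clear
    mixed weight week || id: week, nolog

    * Let -predict- name the new variables from a stub (re1, re2)
    predict re*, reffects

    * Inspect the labels that -predict- attached, e.g. "BLUP r.e. for id: week"
    foreach v of varlist re1 re2 {
        local lbl : variable label `v'
        display "`v': `lbl'"
        * Rename by label content, not by position
        if strpos("`lbl'", "_cons") > 0 {
            rename `v' randint
        }
        else if strpos("`lbl'", "week") > 0 {
            rename `v' randslope
        }
    }

    describe randslope randint
    ```

    In a model with a differently named slope variable, "week" in the -strpos()- call would need to be replaced by that variable's name.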



  • Jing Tong
    replied
    Originally posted by Clyde Schechter:
    Code:
    predict randslope randint ,reffects
    is correct.

    Hi Clyde, I am also analyzing the same model using the same code. I'm curious about the order of "randslope randint" in this code. I asked GPT-4, and it told me that "randslope" actually refers to the random intercept, while "randint" actually refers to the random slope. I am confused and couldn't find any similar examples in the manual or other materials. Could you please give me some suggestions? Thanks!



  • Clyde Schechter
    replied
    Code:
    predict randslope randint ,reffects
    is correct.



  • Beakal Zinab
    replied
    Hello,
    I am running a mixed model that includes both fixed and random effects, using the command "xtmixed length age_months||id:age_months", and I want to obtain predictions of the random effects. But I couldn't figure out which Stata command to use. Is "predict randslope randint ,reffects" correct, or do I have to look for another option?



  • cherry singhal
    replied
    Thank you, Clyde, for the detailed explanation!
    Thank you, Joseph, for the hands-on example; it helps a lot!



  • Joseph Coveney
    replied
    Take a look at the patterns of missing values in your dataset. (help missing and then scroll down to "Useful commands".) You can still get a BLUP under some patterns of missing data for a cluster. See the illustration of that below, which shows examples of circumstances under which you will and under which you won't get a prediction.

    . version 14.2

    .
    . clear *

    . set more off

    . set seed 1372490

    .
    . quietly set obs 200

    . generate int pid = _n

    . generate double u = rnormal()

    .
    . quietly expand 2

    . bysort pid: generate byte tim = _n

    .
    . generate double rsp = u + rnormal()

    .
    . quietly replace rsp = . in 1

    . quietly replace tim = . in 3

    . quietly replace tim = . in 5/6

    .
    . mixed rsp i.tim || pid: , nolrtest nolog

    Mixed-effects ML regression                     Number of obs     =        396
    Group variable: pid                             Number of groups  =        199

                                                    Obs per group:
                                                                  min =          1
                                                                  avg =        2.0
                                                                  max =          2

                                                    Wald chi2(1)      =       5.56
    Log likelihood = -675.86184                     Prob > chi2       =     0.0184

    ------------------------------------------------------------------------------
             rsp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           2.tim |   .2412021   .1023253     2.36   0.018     .0406481    .4417561
           _cons |   .0155886   .1017302     0.15   0.878    -.1837989    .2149761
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    pid: Identity                |
                      var(_cons) |   1.009803   .1621702      .7371185    1.383363
    -----------------------------+------------------------------------------------
                   var(Residual) |    1.03397   .1041779      .8486823    1.259711
    ------------------------------------------------------------------------------

    .
    . predict double ebe, reffects
    (2 missing values generated)

    .
    . list in 1/6, noobs sepby(pid)

      +-------------------------------------------------+
      | pid            u   tim          rsp         ebe |
      |-------------------------------------------------|
      |   1    .41822853     1            .   1.0644975 |
      |   1    .41822853     2    2.4112619   1.0644975 |
      |-------------------------------------------------|
      |   2   -.32213358     .   -.80480298    .0886456 |
      |   2   -.32213358     2    .43620344    .0886456 |
      |-------------------------------------------------|
      |   3     .6285637     .     1.388014           . |
      |   3     .6285637     .    .43723975           . |
      +-------------------------------------------------+

    .
    . exit

    end of do-file


    .
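
    To see how many observations carry a BLUP without belonging to the estimation sample, the two can be cross-tabulated. This is a minimal sketch, not part of the original reply; it assumes it is run right after the -mixed- and -predict- commands above, while the estimation results are still in memory. The variable names insample and has_blup are made up for illustration.

    ```stata
    * e(sample) is 1 for observations used by the last estimation command
    generate byte insample = e(sample)
    generate byte has_blup = !missing(ebe)

    * Rows: used in estimation or not; columns: BLUP present or not
    tabulate insample has_blup

    * Observations that have a BLUP although they were dropped from estimation
    list pid tim rsp ebe if has_blup & !insample, noobs
    ```

    In the listing above, these are the first observation of pid 1 (missing rsp) and the first observation of pid 2 (missing tim). Both observations of pid 3 are excluded from estimation, so pid 3 gets no BLUP at all.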



  • Clyde Schechter
    replied
    Well, you don't say which variable defines your higher level in the model, but let's assume for the sake of discussion that it's firm.

    Remember that the random effect at the firm level is a constant attribute of the firm; it does not vary among observations within firm. Now, if a firm has any observations in the estimation sample at all, a firm-level random effect can be calculated for that firm. And Stata does that and records that same value in every observation for that firm, even if that observation is not, itself, part of the estimation sample. So the only circumstance under which Stata will not calculate the random effect in an observation is if that observation's firm has no observations in the estimation sample. I'm guessing that this doesn't happen very often in your data (although apparently it does happen occasionally since you don't get 52,801 observations for ebX1.)

    Actually, if you want to do a histogram of ebX1, even restricting to e(sample) is not appropriate because each firm is then counted as many times as it appears in the estimation sample. The actual distribution of ebX1 has to be viewed at the firm level. So something like this:

    Code:
    egen flag = tag(firm)
    histogram ebX1 if flag
    In fact pretty much anything you do with ebX1 should be conditioned on -if flag- so that each firm is counted (at most) once.
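
    The behavior described above can be checked on simulated data. The following is a hedged sketch rather than the poster's own analysis: the data are made up, and the names firm, year, x1, y, eb, and flag are hypothetical stand-ins for the real variables. It verifies that the predicted random effect is constant within firm (or missing for the whole firm), and then tags one observation per firm that has a BLUP before plotting.

    ```stata
    clear
    set seed 12345
    set obs 50
    generate int firm = _n
    generate double u = rnormal()            // true firm-level random intercept
    expand 4
    bysort firm: generate byte year = _n
    generate double x1 = rnormal()
    generate double y = 1 + 0.5*x1 + u + rnormal()
    replace y = . if runiform() < 0.2        // some rows drop out of the estimation sample

    mixed y x1 || firm:, nolog
    predict double eb, reffects

    * The BLUP is a firm-level constant: identical in every row of a firm,
    * including rows that were not used in estimation
    bysort firm (eb): assert eb[1] == eb[_N]

    * Tag one observation per firm among the firms that have a BLUP
    egen byte flag = tag(firm) if !missing(eb)
    count if flag

    * Firm-level distribution: each firm contributes exactly once
    histogram eb if flag, frequency
    ```

    Adding the -if !missing(eb)- qualifier to -egen- is a design choice for this sketch; it makes -count if flag- report the number of firms with a BLUP rather than the number of firms in the data.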


