"factor-variable and time-series operators not allowed r101;" when using community contributed command: sml

Sam Stargardt

Join Date: Sep 2022
Posts: 9

18 Sep 2022, 10:35

I'm using the sml command from the user-written package st0144 (SJ8-2; Stata Journal, volume 8, number 2). I am using Stata version 17.0.

In what follows Degree is a binary {0,1} variable. If I write the following code:

Code:

 sml Degree i.Female i.WeeklyFamInc35to49 i.WeeklyFamInc50to99 i.WeeklyFamInc100to149 i.WeeklyFamInc150to199 i.WeeklyFamInc200to249 i.WeeklyFamIncmore250 i.FatDeg i.MotDeg i.FathWorkingClassWork i.StateBens12mnths i.School pred_MathScore, offset(pred_MathScore) nolog

I get the following error

Code:

  factor-variable and time-series operators not allowed
r(101);

All of my categorical variables are dummy variables coded {0,1} and this doesn't happen when I just run:

Code:

sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore) nolog

But I want to estimate:

Code:

 margins School

And to do this I need School to be recognised as a factor variable in the original estimation.

I've tried:

Code:

 xi: sml Degree i.Female i.WeeklyFamInc35to49 i.WeeklyFamInc50to99 i.WeeklyFamInc100to149 i.WeeklyFamInc150to199 i.WeeklyFamInc200to249 i.WeeklyFamIncmore250 i.FatDeg i.MotDeg i.FathWorkingClassWork i.StateBens12mnths i.School pred_MathScore, offset(pred_MathScore) nolog

But then I get:

Code:

i.Female          _IFemale_0-1        (naturally coded; _IFemale_0 omitted)
i.WeeklyFam~o49   _IWeeklyFam_0-1     (naturally coded; _IWeeklyFam_0 omitted)
i.WeeklyFam~o99   _IWeeklyFama0-1     (naturally coded; _IWeeklyFama0 omitted)
i.WeeklyFam~149   _IWeeklyFamb0-1     (naturally coded; _IWeeklyFamb0 omitted)
i.WeeklyFam~199   _IWeeklyFamc0-1     (naturally coded; _IWeeklyFamc0 omitted)
i.WeeklyFam~249   _IWeeklyFamd0-1     (naturally coded; _IWeeklyFamd0 omitted)
i.WeeklyFam~250   _IWeeklyFame0-1     (naturally coded; _IWeeklyFame0 omitted)
i.FatDeg          _IFatDeg_0-1        (naturally coded; _IFatDeg_0 omitted)
i.MotDeg          _IMotDeg_0-1        (naturally coded; _IMotDeg_0 omitted)
i.FathWorking~k   _IFathWorki_0-1     (naturally coded; _IFathWorki_0 omitted)
i.StateBens12~s   _IStateBens_0-1     (naturally coded; _IStateBens_0 omitted)
i.School          _ISchool_0-1        (naturally coded; _ISchool_0 omitted)

I couldn't really find anything about this "naturally coded; ... omitted" message. Is this a problem? Or should I proceed with the estimation? I should reiterate that I only care about coding my variables as factors so that I can then run margins. If there is another way to calculate the margins afterwards, then it's fine not to code them as factors.

Tags: factor variables, r(101), user-written commands

Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#2

18 Sep 2022, 11:39

Well, you can't take full advantage of -margins- because, apparently, -sml- does not allow factor-variable notation and -margins- requires it.

The output you got with -xi- looks appropriate, or at least it is appropriate if each of these variables is a dichotomous 0/1 variable. Nothing to worry about here.

-margins- will not work in the usual way with these variables, however. So you may as well not bother with -xi- for this purpose.

If I am correct that all of these variables are dichotomous, then you can "trick" -margins- into giving you the predicted margins. Using School as an example:

Code:

sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 ///
    FatDeg MotDeg FathWorkingClassWork StateBens12mnths School pred_MathScore, offset(pred_MathScore) nolog
margins, at(School == (0 1))

This will give you the same results as -margins School- would have given you if -sml- allowed factor variable notation.

Do not apply this trick incautiously, however. It only works because School is purely dichotomous and there are no interactions involving it in your model.
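The equivalence claimed above can be checked on a command that does accept factor-variable notation. The following is a minimal sketch on simulated data, with -logit- standing in for -sml- only so that both forms of -margins- can be run side by side; the data-generating process and the matrix names b_fv and b_at are assumptions made for illustration.

```stata
* Simulated stand-in data (not the poster's data)
clear
set seed 2022
set obs 1000
generate double x1 = rnormal()
generate byte School = runiform() < 0.4
generate byte Degree = (0.3*x1 + 0.7*School + rnormal()) > 0

* With factor-variable notation, -margins School- works directly
logit Degree x1 i.School, nolog
margins School
matrix b_fv = r(b)

* Without it, fix School at 0 and then at 1 for every observation via at()
logit Degree x1 School, nolog
margins, at(School = (0 1))
matrix b_at = r(b)

* The two sets of predictive margins should agree up to numerical noise
display "max relative difference = " mreldif(b_fv, b_at)
```

If School entered an interaction that had been built by hand as a separate product variable, at() would change School but not the product, which is why the trick is limited to models without such terms.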

Sam Stargardt

Join Date: Sep 2022
Posts: 9

18 Sep 2022, 15:10

Thank you so much for your response Clyde!

I tried what you suggested:

Code:

sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore) nolog

followed by:

Code:

 margins, at(School == (0 1))

However, Stata then output:

Code:

Predictive margins                                       Number of obs = 2,907
Model VCE: OIM

Expression: Linear prediction, predict()
1._at: School = 0
2._at: School = 1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .9210996   .4861446     1.89   0.058    -.0317263    1.873926
          2  |   1.605186     .58807     2.73   0.006     .4525898    2.757782

As these are meant to be fitted probabilities, it seems strange that the margin for School = 1 is above 1. The plot of the local polynomial smooth (my estimated cdf graphed as a function of my single index) also does not exceed 1, so I'm not sure how this can be.

When it says

Code:

 Expression: Linear prediction, predict()

Does this potentially mean it's using a linear probability model which might explain the greater than 1 estimate? Is there a way around this to get margins for the model I estimated? Thanks for your help!


Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#4

18 Sep 2022, 16:41

"Does this potentially mean it's using a linear probability model which might explain the greater than 1 estimate?"

I am not familiar with -sml-, and I don't even know what it does, let alone how it does it. Suffice it to say, however, that "Expression: Linear prediction, predict()" means that -margins- is, indeed, working with the -xb- result of whatever model -sml- is estimating here. Whether that is a linear probability model or a linear estimate of something more indirectly related to your outcome variable, I wouldn't know.

I didn't anticipate that problem when I made my suggestion. If you know how -sml- transforms -xb- to get to its final results, then you can force -margins- to do the same thing by adding an appropriately written -expression()- option to your -margins- command.
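To illustrate the mechanics of the -expression()- option, here is a minimal sketch using -probit-, where the transformation from -xb- to a probability is known to be the standard normal cdf. For -sml- the corresponding function is estimated non-parametrically, so there is no closed form to type in; this only shows how the option is written. The simulated data and the matrix names b_pr and b_ex are assumptions for illustration.

```stata
* Simulated stand-in data (not the poster's data)
clear
set seed 2022
set obs 1000
generate double x1 = rnormal()
generate byte School = runiform() < 0.4
generate byte Degree = (0.3*x1 + 0.7*School + rnormal()) > 0

probit Degree x1 School, nolog

* Default prediction after -probit-: probability of a positive outcome
margins, at(School = (0 1))
matrix b_pr = r(b)

* The same quantity built by hand from the linear index
margins, at(School = (0 1)) expression(normal(predict(xb)))
matrix b_ex = r(b)

display "max relative difference = " mreldif(b_pr, b_ex)
```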
Sam Stargardt

Join Date: Sep 2022

Posts: 9
#5

19 Sep 2022, 06:29

-sml- here is a single-index semi-parametric maximum likelihood estimator, so it estimates the coefficients and non-parametrically estimates the cdf.

If -margins- is using the single index, i.e. the linear part, to estimate marginal probabilities rather than using the cdf estimated by -sml- at xb, do you know of a way that I can ask it to use the cdf?

For clarity, I want to average G(xb) over the observed values of all variables except School, fixing School = 1 and then School = 0, where G(.) has been estimated by -sml-. I guess this is similar to what it would do if you used -margins- after estimating a logit model (except with my cdf in the place of the logistic cdf). Thanks for your help!
Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#6

19 Sep 2022, 10:06

If the -predict- command following -sml- has an option to predict the cdf value, then you could do it with:

Code:

margins, at(School = (0 1)) predict(the_cdf_option_for_-predict-)

But if not, I don't see any straightforward way to do this. I think you would have to write your own code to predict the CDF and then emulate what -margins- does using that.

Sam Stargardt

Join Date: Sep 2022
Posts: 9

19 Sep 2022, 10:41

-sml- has a command -F- which I think is the CDF.
For example I plot the CDF after -sml- using:

Code:

sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore)
matrix B=e(b)
matrix V=e(V)
predict Xb
lpoly Degree Xb, gen(F) at(Xb) gaussian

Where F is the CDF. Is there a way I can use this in the syntax you're suggesting?


Clyde Schechter

Join Date: Apr 2014
Posts: 30187

19 Sep 2022, 10:47

So, it would go something like this:

Code:

sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore)
clonevar School_orig = School
forvalues s = 0/1 {
    replace School = `s'
    predict Xb
    lpoly Degree Xb, gen(F) at(Xb) gaussian
    summ F
    display "Predicted Margin of F when S = `s' is `r(mean)'"
    drop F Xb    // drop both so the next pass can create them again
}
replace School = School_orig
drop School_orig

That will give you the predicted margins. The standard errors, however, I do not know how to calculate.


Sam Stargardt

Join Date: Sep 2022
Posts: 9

19 Sep 2022, 15:43

Thanks so much Clyde, that almost works, except that I only get the estimated margin for School = 0.

I tried

Code:

snp Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths StdBehvScore StdMalScore School, offset(pred_MathScore)

clonevar School_orig = School
forvalues s = 0/1 {
    replace School = `s'
    predict Xb
    lpoly Degree Xb, gen(F) at(Xb) gaussian
    summ F
    display "Predicted Margin of F when S = 0 is `r(mean)'"
    drop F
    drop Xb
}
replace School =  School_orig
drop School_orig

and I get

Code:


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
           F |      2,653    .2036038    .2066127          0    .997744
Predicted Margin of F when S = 0 is .2036037860124303
(2,653 real changes made)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
           F |      2,653    .2036038    .2066127          0    .997744
Predicted Margin of F when S = 0 is .2036037858916236

I.e. the predicted margins for School=0 in both cases.


Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#10

19 Sep 2022, 18:40

Um, are you using -lpoly- correctly? Try listing some values of F during each iteration: I suspect they're not changing. This would suggest to me that your use of -lpoly- is not what it needs to be if I'm right. But as I'm not really familiar with -lpoly-, I can't guide you on how to fix that.

Added: Also check that Xb changes with each iteration. It may be that -predict- doesn't play well with -sml-. User-written commands don't always do what they need to do for Stata's post-estimation commands to produce appropriate results. (If that's the case, I think your problem is truly unsolvable. And if that's the case, you might want to contact the author(s) of -sml- to see if they have any advice for you.)
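A possible explanation for the identical means, assuming -predict- returns the linear index: inside the loop School is constant, so the index for School = 1 is just the index for School = 0 plus a constant (the School coefficient). -lpoly- refits the smooth of Degree on whatever index it is given, and a kernel smooth is unchanged when the x-variable is shifted by a constant, so F comes out the same on both passes. (The "S = 0" label in the display line is also hard-coded, so it prints on both passes.) One way around this is to estimate the smooth on the observed index and only evaluate it at the counterfactual index through at(). The sketch below illustrates the idea on simulated data, with -probit- standing in for -sml-/-snp- purely so that it runs without the community-contributed package. The data, the scalar names (bS, naive0, fixed0, and so on), and the use of _b[School] to build the counterfactual index are all assumptions for illustration, not something taken from the posts above.

```stata
* Simulated stand-in data; -probit- replaces -sml- only so the sketch runs
clear
set seed 12345
set obs 2000
generate double x1 = rnormal()
generate byte School = runiform() < 0.5
generate byte Degree = (0.5*x1 + 0.8*School + rnormal()) > 0

probit Degree x1 School, nolog
predict double Xb, xb            // index at the observed data
scalar bS = _b[School]           // assumed name of the School coefficient

forvalues s = 0/1 {
    * Counterfactual index with School fixed at `s' for everyone
    generate double Xb_cf = Xb + bS*(`s' - School)

    * Naive: refit the smooth on the shifted index, as in the loop above
    quietly lpoly Degree Xb_cf, generate(F_naive) at(Xb_cf) kernel(gaussian) nograph

    * Alternative: smooth Degree on the observed index, evaluate at Xb_cf
    quietly lpoly Degree Xb, generate(F_fixed) at(Xb_cf) kernel(gaussian) nograph

    quietly summarize F_naive, meanonly
    scalar naive`s' = r(mean)
    quietly summarize F_fixed, meanonly
    scalar fixed`s' = r(mean)
    display "School = `s': naive = " naive`s' "   fixed smooth = " fixed`s'
    drop Xb_cf F_naive F_fixed
}
```

In this sketch the naive means should coincide across the two passes while the fixed-smooth means differ. Note that the counterfactual index can fall outside the range of the observed index, where the smooth is an extrapolation, and that no standard errors are produced.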

Last edited by Clyde Schechter; 19 Sep 2022, 18:46.
Sam Stargardt

Join Date: Sep 2022

Posts: 9
#11

20 Sep 2022, 07:02

By the way in these examples I'm using -snp- because it's also a semi-parametric single-index estimator but works much much faster than -sml- (it's just less efficient if the outcome is binary).

Yeah I've tried looking at the means of Xb after predict for School =1 and School =0, like:

Code:

snp Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore)
replace School = 1
predict Xb
summ Xb

And they don't change for School = 0 or School = 1. Are you saying, then, that -predict- doesn't work properly after -sml-/-snp- and I won't be able to estimate marginal effects in Stata? Thanks for your time with this problem. I appreciate it a lot.
Sam Stargardt

Join Date: Sep 2022

Posts: 9
#12

20 Sep 2022, 09:03

Hi Clyde,

I've found some code for calculating marginal effects of continuous variables in -sml-.

Do you think there is any way to extend it to binary variables like School in my example?

This is what you would do for continuous variables apparently:

Code:

sml y x*
matrix B=e(b)
matrix V=e(V)
predict Xb . lpoly y Xb, gen(F) at(Xb) gaussian
dydx F Xb, gen(f)
local j 0
foreach var of varlist x1 x2 x3 {
local j='j'+1
gen margin'var'=f*B[1,'j'] }
matrix M=J(3,3,0)

Thank you so much for your time!

Clyde Schechter

Join Date: Apr 2014
Posts: 30187

#13

20 Sep 2022, 09:48

I don't know. I don't know what -sml- does, so I cannot deduce how its margins and marginal effects differ between dichotomous and continuous variables. In a simple linear model, it makes no difference at all. But in non-linear models (i.e., models with a non-linear link or transformation), it does.

I will point out that there are some things in the code you show that are syntactically incorrect, and you will need to fix them to get it to run at all:

Code:

sml y x*
matrix B=e(b)
matrix V=e(V) // THIS NEVER GETS USED.  DOESN'T HURT, BUT WHY BOTHER?
predict Xb . lpoly y Xb, gen(F) at(Xb) gaussian // THIS LOOKS LIKE TWO COMMANDS RUN TOGETHER ON THE SAME LINE.  AND WHAT IS THE . ABOUT? SEEMS MISPLACED HERE
dydx F Xb, gen(f)
local j 0
foreach var of varlist x1 x2 x3 { // THERE IS NO } IN THE CODE TO MATCH THIS OPENING BRACE.  NOT SURE WHERE IT SHOULD GO, BUT NEEDS TO BE SOMEWHERE.
local j='j'+1 // REFERENCE TO LOCAL MACRO j SHOULD BE `j', NOT 'j'.
gen margin'var'=f*B[1,'j'] } // DITTO FOR LOCAL MACROS var AND j IN THIS COMMAND.  THE CLOSING BRACE MUST BE ON A SEPARATE LINE.
matrix M=J(3,3,0) // NOT CLEAR WHAT THIS IS FOR
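With the syntax points flagged above applied, the snippet might look like the following. This is a minimal sketch on simulated data, with -probit- standing in for -sml- only so that it runs without the community-contributed package; the data, the xb option on -predict-, and the nograph option are assumptions added for the example, and the unused matrices V and M are left out.

```stata
* Simulated stand-in data (not from the original post)
clear
set seed 2022
set obs 1000
forvalues k = 1/3 {
    generate double x`k' = rnormal()
}
generate byte y = (0.5*x1 - 0.3*x2 + 0.2*x3 + rnormal()) > 0

probit y x1 x2 x3, nolog         // stand-in for: sml y x*
matrix B = e(b)                  // coefficients, in the order x1 x2 x3 _cons
predict double Xb, xb            // linear index
lpoly y Xb, generate(F) at(Xb) kernel(gaussian) nograph   // smooth of y on the index
dydx F Xb, generate(f)           // numerical derivative of F with respect to Xb

local j 0
foreach var of varlist x1 x2 x3 {
    local j = `j' + 1
    generate double margin`var' = f*B[1,`j']   // observation-level marginal effect
}
summarize marginx1 marginx2 marginx3
```

Each margin variable holds the estimated density at the index times the corresponding coefficient, which is the usual form of a marginal effect for a continuous regressor in a single-index model.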

Last edited by Clyde Schechter; 20 Sep 2022, 09:51.


Sam Stargardt

Join Date: Sep 2022

Posts: 9
#14

20 Sep 2022, 10:11

Alright thanks so much for all your help!