  • "factor-variable and time-series operators not allowed r(101);" when using community-contributed command: sml

    I'm using the sml command from the user-written package SJ8-2 st0144 (Stata Journal, volume 8, number 2: st0144). I am using Stata version 17.0.

    In what follows Degree is a binary {0,1} variable. If I write the following code:
    Code:
     sml Degree i.Female i.WeeklyFamInc35to49 i.WeeklyFamInc50to99 i.WeeklyFamInc100to149 i.WeeklyFamInc150to199 i.WeeklyFamInc200to249 i.WeeklyFamIncmore250 i.FatDeg i.MotDeg i.FathWorkingClassWork i.StateBens12mnths i.School pred_MathScore, offset(pred_MathScore) nolog
    I get the following error
    Code:
      factor-variable and time-series operators not allowed
    r(101);
    All of my categorical variables are dummy variables coded {0,1} and this doesn't happen when I just run:

    Code:
    sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore) nolog
    But I want to estimate:

    Code:
     margins School
    To do this, I need School to be recognised as a factor variable in the original estimation.

    I've tried:

    Code:
     xi: sml Degree i.Female i.WeeklyFamInc35to49 i.WeeklyFamInc50to99 i.WeeklyFamInc100to149 i.WeeklyFamInc150to199 i.WeeklyFamInc200to249 i.WeeklyFamIncmore250 i.FatDeg i.MotDeg i.FathWorkingClassWork i.StateBens12mnths i.School pred_MathScore, offset(pred_MathScore) nolog
    But then I get:
    Code:
    i.Female          _IFemale_0-1        (naturally coded; _IFemale_0 omitted)
    i.WeeklyFam~o49   _IWeeklyFam_0-1     (naturally coded; _IWeeklyFam_0 omitted)
    i.WeeklyFam~o99   _IWeeklyFama0-1     (naturally coded; _IWeeklyFama0 omitted)
    i.WeeklyFam~149   _IWeeklyFamb0-1     (naturally coded; _IWeeklyFamb0 omitted)
    i.WeeklyFam~199   _IWeeklyFamc0-1     (naturally coded; _IWeeklyFamc0 omitted)
    i.WeeklyFam~249   _IWeeklyFamd0-1     (naturally coded; _IWeeklyFamd0 omitted)
    i.WeeklyFam~250   _IWeeklyFame0-1     (naturally coded; _IWeeklyFame0 omitted)
    i.FatDeg          _IFatDeg_0-1        (naturally coded; _IFatDeg_0 omitted)
    i.MotDeg          _IMotDeg_0-1        (naturally coded; _IMotDeg_0 omitted)
    i.FathWorking~k   _IFathWorki_0-1     (naturally coded; _IFathWorki_0 omitted)
    i.StateBens12~s   _IStateBens_0-1     (naturally coded; _IStateBens_0 omitted)
    i.School          _ISchool_0-1        (naturally coded; _ISchool_0 omitted)
    I couldn't find anything about these "naturally coded" and "omitted" notes. Is this a problem, or should I proceed with the estimation? To reiterate, I only care about coding my variables as factors so that I can then run margins. If there is another way to calculate the margins afterwards, then it's fine not to code them as factors.

  • #2
    Well, you can't take full advantage of -margins- because, apparently, -sml- does not allow factor-variable notation and -margins- requires it.

    The output you got with -xi- looks appropriate, or at least it is appropriate if each of these variables is a dichotomous 0/1 variable. Nothing to worry about here.

    -margins- will not work in the usual way with these variables, however. So you may as well not bother with -xi- for this purpose.

    If I am correct that all of these variables are dichotomous, then you can "trick" -margins- into giving you the predicted margins. Using School as an example:
    Code:
    sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 ///
        FatDeg MotDeg FathWorkingClassWork StateBens12mnths School pred_MathScore, offset(pred_MathScore) nolog
    margins, at(School == (0 1))
    This will give you the same results as -margins School- would have given you if -sml- allowed factor variable notation.

    Do not apply this trick incautiously, however. It only works because School is purely dichotomous and there are no interactions involving it in your model.
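
    As a quick check of that equivalence, here is a minimal sketch using -logit- as a stand-in, since -logit- accepts factor-variable notation and -sml- apparently does not. The lbw dataset and its variables (low, smoke, age, lwt) are illustrative assumptions and have nothing to do with the data in this thread.

    ```stata
    * Illustrative data only (assumption): Stata's example low-birthweight dataset
    webuse lbw, clear

    * Factor-variable version: the usual predictive margins
    logit low i.smoke age lwt
    margins smoke

    * Plain 0/1 version: the at() trick
    logit low smoke age lwt
    margins, at(smoke = (0 1))
    ```

    The two -margins- tables should report the same predicted probabilities. That holds only because smoke is a 0/1 variable that enters the model without interactions.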


    • #3
      Thank you so much for your response Clyde!

      I tried what you suggested:

      Code:
      sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore) nolog
      followed by:
      Code:
       margins, at(School == (0 1))
      However, Stata then output:
      Code:
      Predictive margins                                       Number of obs = 2,907
      Model VCE: OIM
      
      Expression: Linear prediction, predict()
      1._at: School = 0
      2._at: School = 1
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               _at |
                1  |   .9210996   .4861446     1.89   0.058    -.0317263    1.873926
                2  |   1.605186     .58807     2.73   0.006     .4525898    2.757782
      As these are meant to be fitted probabilities, it seems strange that the margin for School=1 is above 1. The plot of the local polynomial smooth (my estimated cdf graphed as a function of my single index) also does not exceed 1, so I'm not sure how this can be.

      When it says
      Code:
       Expression: Linear prediction, predict()
      Does this potentially mean it's using a linear probability model which might explain the greater than 1 estimate? Is there a way around this to get margins for the model I estimated? Thanks for your help!


      • #4
        "Does this potentially mean it's using a linear probability model which might explain the greater than 1 estimate?"
        I am not familiar with -sml-, and I don't even know what it does, let alone how it does it. Suffice it to say, however, that "Expression: Linear prediction, predict()" means that -margins- is, indeed, working with the -xb- result of whatever model -sml- is estimating here. Whether that is a linear probability model or a linear estimate of something more indirectly related to your outcome variable, I wouldn't know.

        I didn't anticipate that problem when I made my suggestion. If you know how -sml- transforms -xb- to get to its final results, then you can force -margins- to do the same thing by adding an appropriately written -expression()- option to your -margins- command.
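
        To illustrate what an -expression()- option looks like, here is a sketch with -logit- standing in for -sml-, because for -logit- the transformation of xb is known to be the inverse logit. The lbw data and variable names are illustrative assumptions. For -sml- there may be no closed-form function to plug in, so this shows the mechanism only.

        ```stata
        * Illustrative data only (assumption): Stata's example low-birthweight dataset
        webuse lbw, clear
        logit low smoke age lwt

        * Margins of the linear prediction xb (not probabilities)
        margins, at(smoke = (0 1)) predict(xb)

        * Transform xb inside -margins- with expression()
        margins, at(smoke = (0 1)) expression(invlogit(predict(xb)))
        ```

        After -logit-, the second -margins- call should match plain -margins, at(smoke = (0 1))-, because the default prediction there is already invlogit(xb).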


        • #5
          -sml- here is a single-index semi-parametric maximum likelihood estimator, so it estimates the coefficients and non-parametrically estimates the cdf.

          If -margins- is using the single index (i.e. the linear part) to estimate marginal probabilities, rather than the cdf estimated by -sml- evaluated at xb, do you know of a way that I can ask it to use the cdf?

          For clarity, I want to average G(xb) over the observed values of all variables except School, fixing School=1 and School=0 in turn, where G(.) has been estimated by -sml-. I guess this is similar to what -margins- would do after a logit model (except with my cdf in place of the logistic cdf). Thanks for your help!


          • #6
            If the -predict- command following -sml- has an option to predict the cdf value, then you could do it with:
            Code:
            margins, at(School = (0 1)) predict(the_cdf_option_for_-predict-)
            But if not, I don't see any straightforward way to do this. I think you would have to write your own code to predict the CDF and then emulate what -margins- does using that.


            • #7
              -sml- has a command -F- which I think is the CDF.
              For example I plot the CDF after -sml- using:
              Code:
              sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore)
              matrix B=e(b)
              matrix V=e(V)
              predict Xb
              lpoly Degree Xb, gen(F) at(Xb) gaussian
              Where F is the CDF. Is there a way I can use this in the syntax you're suggesting?


              • #8
                So, it would go something like this:
                Code:
                sml Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore)
                clonevar School_orig = School
                forvalues s = 0/1 {
                    replace School = `s'
                    predict Xb
                    lpoly Degree Xb, gen(F) at(Xb) gaussian
                    summ F
                    display "Predicted Margin of F when School = `s' is `r(mean)'"
                    drop F Xb
                }
                replace School =  School_orig
                drop School_orig
                That will give you the predicted margins. The standard errors, however, I do not know how to calculate.
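
                One thing to watch in a loop like the one above: -lpoly- is re-run on the counterfactual index in each pass, so the smooth itself is re-estimated rather than held fixed. A possible alternative, sketched below, is to smooth the outcome on the observed index and only evaluate that smooth at the counterfactual index through -lpoly-'s at() option. This is one reading of "average G(xb) with School fixed", not something taken from the -sml- documentation. -logit- and the lbw data are stand-ins (assumptions) so that the sketch runs without -sml-, and all variable names are illustrative.

                ```stata
                * Stand-in model and data (assumptions); replace with the -sml- fit
                webuse lbw, clear
                logit low smoke age lwt
                predict double xb_obs, xb          // index at the observed data

                clonevar smoke_orig = smoke
                forvalues s = 0/1 {
                    quietly replace smoke = `s'
                    predict double xb_cf, xb       // index with smoke fixed at `s'
                    * Smooth low on the OBSERVED index, evaluate at the counterfactual index
                    lpoly low xb_obs, at(xb_cf) generate(G_cf) kernel(gaussian) nograph
                    summarize G_cf, meanonly
                    display "Average G(xb) with smoke = `s': " r(mean)
                    drop xb_cf G_cf
                }
                quietly replace smoke = smoke_orig
                drop smoke_orig xb_obs
                ```

                Counterfactual index values outside the range of the observed index are extrapolations of the smooth, so those averages should be read with care. This still gives no standard errors.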


                • #9
                  Thanks so much Clyde, that almost works, except that I only get the estimated margin for School=0.

                  I tried
                  Code:
                  snp Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths StdBehvScore StdMalScore School, offset(pred_MathScore)
                  
                  clonevar School_orig = School
                  forvalues s = 0/1 {
                      replace School = `s'
                      predict Xb
                      lpoly Degree Xb, gen(F) at(Xb) gaussian
                      summ F
                      display "Predicted Margin of F when S = 0 is `r(mean)'"
                      drop F
                      drop Xb
                  }
                  replace School =  School_orig
                  drop School_orig
                  and I get
                  Code:
                  
                      Variable |        Obs        Mean    Std. dev.       Min        Max
                  -------------+---------------------------------------------------------
                             F |      2,653    .2036038    .2066127          0    .997744
                  Predicted Margin of F when S = 0 is .2036037860124303
                  (2,653 real changes made)
                  
                      Variable |        Obs        Mean    Std. dev.       Min        Max
                  -------------+---------------------------------------------------------
                             F |      2,653    .2036038    .2066127          0    .997744
                  Predicted Margin of F when S = 0 is .2036037858916236
                  I.e. the predicted margins for School=0 in both cases.


                  • #10
                    Um, are you using -lpoly- correctly? Try listing some values of F during each iteration: I suspect they're not changing. This would suggest to me that your use of -lpoly- is not what it needs to be if I'm right. But as I'm not really familiar with -lpoly-, I can't guide you on how to fix that.

                    Added: Also check that Xb changes with each iteration. It may be that -predict- doesn't play well with -sml-. User-written commands don't always do what they need to do for Stata's post-estimation commands to produce appropriate results. (If that's the case, I think your problem is truly unsolvable. And if that's the case, you might want to contact the author(s) of -sml- to see if they have any advice for you.)
                    Last edited by Clyde Schechter; 19 Sep 2022, 18:46.


                    • #11

                      By the way in these examples I'm using -snp- because it's also a semi-parametric single-index estimator but works much much faster than -sml- (it's just less efficient if the outcome is binary).

                      Yeah I've tried looking at the means of Xb after predict for School =1 and School =0, like:

                      Code:
                      snp Degree Female WeeklyFamInc35to49 WeeklyFamInc50to99 WeeklyFamInc100to149 WeeklyFamInc150to199 WeeklyFamInc200to249 WeeklyFamIncmore250 FatDeg MotDeg FathWorkingClassWork StateBens12mnths School, offset(pred_MathScore)
                      replace School = 1
                      predict Xb
                      summ Xb
                      And they don't change for School = 0 or School = 1. Are you saying then that -predict- doesn't work properly after -sml-/-snp- and I won't be able to estimate marginal effects in Stata? Thanks for your time with this problem. I appreciate it a lot.


                      • #12
                        Hi Clyde,

                        I've found some code for calculating marginal effects of continuous variables in -sml-.

                        Do you think there is any way to extend it to binary variables like School in my example?

                        This is what you would do for continuous variables apparently:

                        Code:
                        sml y x*
                        matrix B=e(b) 
                        matrix V=e(V)
                        predict Xb . lpoly y Xb, gen(F) at(Xb) gaussian
                        dydx F Xb, gen(f)
                        local j 0
                        foreach var of varlist x1 x2 x3 {
                        local j='j'+1
                        gen margin'var'=f*B[1,'j'] }
                        matrix M=J(3,3,0)
                        Thank you so much for your time!


                        • #13
                          I don't know. I don't know what -sml- does, so I cannot deduce how its margins and marginal effects differ between dichotomous and continuous variables. In a simple linear model, it makes no difference at all. But in non-linear models (i.e. models with a non-linear link), it does.

                          I will point out that there are some things in the code you show that are syntactically incorrect, and you will need to fix them to get it to run at all:
                          Code:
                          sml y x*
                          matrix B=e(b)
                          matrix V=e(V) // THIS NEVER GETS USED.  DOESN'T HURT, BUT WHY BOTHER?
                          predict Xb . lpoly y Xb, gen(F) at(Xb) gaussian // THIS LOOKS LIKE TWO COMMANDS RUN TOGETHER ON THE SAME LINE.  AND WHAT IS THE . ABOUT? SEEMS MISPLACED HERE
                          dydx F Xb, gen(f)
                          local j 0
                          foreach var of varlist x1 x2 x3 { // THERE IS NO } IN THE CODE TO MATCH THIS OPENING BRACE.  NOT SURE WHERE IT SHOULD GO, BUT NEEDS TO BE SOMEWHERE.
                          local j='j'+1 // REFERENCE TO LOCAL MACRO j SHOULD BE `j', NOT 'j'.
                          gen margin'var'=f*B[1,'j'] } // DITTO FOR LOCAL MACROS var AND j IN THIS COMMAND.  THE CLOSING BRACE MUST BE ON A SEPARATE LINE.
                          matrix M=J(3,3,0) // NOT CLEAR WHAT THIS IS FOR
                          Last edited by Clyde Schechter; 20 Sep 2022, 09:51.
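
                          With those corrections applied, the snippet might look like the sketch below. It assumes -sml- is installed, that y, x1, x2 and x3 are placeholder variables as in the quoted code, and that the columns of e(b) follow the order of the regressors. It has not been checked against -sml- itself, and the unused V and M matrices are dropped.

                          ```stata
                          * Assumes -sml- is installed; y, x1, x2, x3 are placeholders from the quoted code
                          sml y x1 x2 x3
                          matrix B = e(b)
                          predict Xb                            // single index
                          lpoly y Xb, gen(F) at(Xb) gaussian    // smooth of y on the index, as in the quoted code
                          dydx F Xb, gen(f)                     // numerical derivative of F with respect to Xb
                          local j 0
                          foreach var of varlist x1 x2 x3 {
                              local j = `j' + 1
                              gen margin`var' = f*B[1,`j']      // assumes column j of B belongs to `var'
                          }
                          ```

                          This only addresses the syntax. Whether the same formula is appropriate for a dichotomous variable like School is a separate question.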


                          • #14
                            Alright thanks so much for all your help!
