
  • mimrgns & interaction effects

    This question is mainly directed at Daniel Klein, author of mimrgns (available from SSC) but others are welcome to chime in too.

    A student asked me how to estimate marginal effects when using multiple imputation. That was easy enough: I told him to use Daniel's mimrgns command.

    But, then another complication became apparent. The student also wants to include things like interaction effects and squared terms in his model. According to Allison and others (see http://www.stata.com/statalist/archi.../msg00613.html) interaction terms should be treated like "just another variable." Specifically, Allison says

    In multiple imputation, interactions should be imputed as though they are additional variables, not constructed by multiplying imputed values. The same is true if you have x and x^2 in a model. The x^2 term should be imputed just like any other variable, not constructed by squaring the imputed values of x. While this principle may seem counterintuitive, it is easily demonstrated by simulation that the more "natural" way to do it produces biased estimates.

    So, Allison is saying that if you want age squared in a model, you first do

    gen agesq = age * age

    and then include agesq as a variable to be imputed.

    BUT, that goes counter to the advice given for interaction terms when you want to estimate marginal effects. If, say, you want both age and age squared in your model, then your estimation command should be something like

    reg y age c.age#c.age

    Otherwise margins will not know that if age = 20, age^2 has to equal 400.
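    To make the point concrete, here is a small numeric sketch of what margins does under the hood. It is in Python rather than Stata, purely for illustration, and the data and coefficients are made up. With factor variable notation, the marginal effect of age is the chain-rule derivative b1 + 2*b2*age averaged over the sample; with agesq entered as "just another variable," the derivative holds agesq fixed and reduces to b1 alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
age = rng.uniform(20, 60, n)
y = 1.0 + 0.5 * age - 0.004 * age**2 + rng.normal(0, 1, n)

# OLS on age and age^2 -- the coefficients are identical
# whichever syntax is used to enter the squared term
X = np.column_stack([np.ones(n), age, age**2])
b = np.linalg.lstsq(X, y, rcond=None)[0]  # [b0, b1, b2]

# What -margins, dydx(age)- reports when Stata knows that
# c.age#c.age is the square of age: the chain rule,
# averaged over the estimation sample
me_factor = np.mean(b[1] + 2 * b[2] * age)

# What it reports when agesq is "just another variable":
# the derivative holding agesq fixed, i.e. b[1] alone
me_jav = b[1]

print(me_factor, me_jav)  # the two differ whenever b2 != 0
```

    The gap between the two numbers is exactly 2*b2*mean(age), so the discrepancy grows with the curvature of the fitted relationship.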

    So, if you follow Allison's advice and compute the interaction or squared term yourself, then the eventual mimrgns command will give incorrect estimates of the marginal effects. But, if you don't follow his advice, you will get biased estimates.

    I don't know if there is any established research on this. If I really want the marginal effects, then my inclination is to use factor variables and accept the bias. But maybe you should just say that marginal effects shouldn't be done if you want both multiple imputation and interaction or squared effects.

    My main advice to the student was to see whether the multiple imputation is actually gaining him all that much! If not, it may be better just to use listwise deletion. But as far as I can tell, something has to give somewhere. Any thoughts on how best to handle this would be appreciated.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    Stata Version: 17.0 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam

  • #2
    Richard,

    Let me split your question into (i) a statistical and (ii) a technical part.

    Starting with (ii), mimrgns is basically all about tricking Stata into doing what it would not do otherwise. From this perspective, we can ignore the imputed nature of the dataset for the moment and focus on margins and interaction terms. The advice to use factor variable notation for interaction terms is a purely technical one: Stata needs to know which of the predictors in the model belong together. If we can get this information to margins somehow, then there is nothing wrong with computing the interaction terms by hand. One way of telling Stata is to manipulate the row and column names of the e() matrices. Here is a minimal example illustrating the point.

    Code:
    /*
    We use the auto dataset as an example.
    We run a simple linear model, regressing
    price on mpg, turn and the interaction of
    the two variables. We then estimate the
    marginal effect of mpg.
    */
    
    sysuse auto ,clear
    
    reg price c.mpg##c.turn
    margins ,dydx(mpg)
    
    /*
    We now try to replicate these results,
    calculating the interaction term by hand.
    */
    
    g mpgturn = mpg*turn
    reg price mpg turn mpgturn
    margins ,dydx(mpg)
    
    /*
    The coefficients in the regression model
    are naturally the same. The marginal
    effects, however, get messed up.
    
    Here is how we fix this. We need a little
    program, that changes the row and column
    names of the e() matrices, so they match
    the row and column names after using factor
    variable notation.  
    */
    
    pr bogus ,eclass
        tempname b V sample    
        mat `b' = e(b)
        mat `V' = e(V)
        mat rown `b' = y
        mat coln `b' = mpg turn c.mpg#c.turn _cons
        mat rown `V' = mpg turn c.mpg#c.turn _cons
        mat coln `V' = mpg turn c.mpg#c.turn _cons
        g `sample' = e(sample)
        eret post `b' `V' , e(`sample')
    end
    
    /*
    We can now reestimate our model (though this is
    not necessary at this point), run our -bogus-
    program to trick Stata, and obtain the exact same
    (and correct) marginal effects that we obtained
    from the model with factor variables.
    */
    
    reg price mpg turn mpgturn
    bogus
    margins ,dydx(mpg)
    Now we could, with some effort at generalization, implement this logic in mimrgns, perhaps adding an option that specifies the factor variable equivalent of the variables computed by hand.

    However, at this point we need to consider the statistical part of your question, which is not that clear to me at all. The JAV, i.e. just-another-variable, approach (von Hippel, 2009) has been shown to produce unbiased estimates in regression-type models with multiply imputed data. Whether this is also true for marginal effects - or better: for the way they are calculated with mimrgns - I cannot tell. This question is closely related to (if not the same as) the ones discussed here. Despite the fact that JAV leads to unbiased estimates of combined regression coefficients (across the imputed datasets), one could ask whether this necessarily translates to the regression coefficients within each imputed dataset. This question is crucial, because the marginal effects combined by mimrgns are based on exactly those estimates, i.e. the marginal effects from the estimation command in each imputed dataset. The marginal effects are not based on the combined/final mi estimates/regression coefficients, for which unbiasedness has been demonstrated.
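    For readers unfamiliar with the mechanics: as described above, mimrgns computes the marginal effect separately in each of the M imputed datasets and then combines those M estimates with Rubin's rules. A minimal sketch of that combination step, in Python rather than Stata and with made-up numbers:

```python
import numpy as np

def rubin_pool(est, var):
    """Combine M point estimates and their squared standard
    errors across imputed datasets (Rubin's rules)."""
    est, var = np.asarray(est, float), np.asarray(var, float)
    M = est.size
    qbar = est.mean()               # pooled point estimate
    W = var.mean()                  # within-imputation variance
    B = est.var(ddof=1)             # between-imputation variance
    T = W + (1 + 1 / M) * B         # total variance
    return qbar, T

# e.g. a marginal effect estimated in M = 5 imputed datasets
me  = [0.42, 0.38, 0.45, 0.40, 0.41]       # point estimates
se2 = [0.010, 0.012, 0.011, 0.010, 0.013]  # squared std. errors
qbar, T = rubin_pool(me, se2)
```

    The crucial point in the paragraph above is that the inputs to this pooling step are the dataset-specific marginal effects, not effects derived from the combined coefficients for which unbiasedness has been shown.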

    I am not aware of any research in this specific area, but would be delighted to hear about it. I am sorry I cannot be of more assistance.

    Best
    Daniel


    von Hippel, Paul T. (2009). How to impute interactions, squares, and other transformed variables. Sociological Methodology, 39: 265-291.
    Last edited by daniel klein; 15 Oct 2014, 04:58.



    • #3
      Thanks much Daniel. I like how you trick Stata into thinking factor variables have been used even when they haven't. I thought about doing something similar with gologit2 but then I figured out how to support factor variables directly. Your bogus approach could be handy if you are stuck using an old program that does not support factor variables. Yes, I think there is more work to be done in this area. At least for Stata users, the issue really couldn't even come up until mimrgns came along.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology



      • #4
        Hello Dr. Williams (and other Statalist members)

        I am facing the same issue as the student you mentioned in this post. Specifically, I want to obtain the marginal effects for interaction terms that were imputed as "just another variable." I was wondering what you ended up doing to solve this problem? Also, do you know whether any new methods have been developed to deal with this issue? I have been searching Statalist and Google but have not come across anything.

        Thank you,

        Jon Phillips



        • #5
          Hopefully Daniel Klein will see this and comment. I believe in another thread he recommended against the approach he gave above, but I am not sure whether he had any other preference.

          I might be tempted to try both JAV and the factor variables approach. If the results were very similar for my estimation model, then I would be tempted to use the factor variable approach so I could get the margins. Of course that won't work if the results came out very different.

          When uncertain how to proceed, I think it is good to try a sensitivity analysis in which you do things different ways. If different approaches all yield very similar results, you note this; then, even if you are doing it the wrong way, the costs of the mistake aren't very high. Of course, if the different methods produce very different outcomes, life becomes much more complicated.
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology



          • #6
            Richard has obviously followed a recent discussion that contains a link to another question in which I indeed argue against using the "bogus" approach outlined above. I would like to stress here that I am not an expert or authority - I am really not sure how to best deal with this. The two main issues that I see right now and that need to be considered are the following.
            • We have good reason to believe that JAV gives better (final) MI estimates than any passive imputation approach - especially for linear regression models. We do not know, however, whether the JAV coefficients in the M imputed datasets are in any way valid and/or better than the alternative. Put another way, while we have good reason to trust the combined coefficients obtained from the JAV approach, we do not know whether any (intermediate) calculations, e.g. predictions, should be based on the dataset specific coefficients. Does this imply that we should use the combined coefficients with the dataset specific observed values to get predictions?
            • The better estimates obtained with JAV probably have something to do with its fundamental property that the interaction no longer equals the product of the constituent variables. The margins command, on the other hand, works precisely because the interaction is mechanically related to the constituent variables. We need to decide whether this is a mere technical problem (that could be addressed using the "bogus" approach) or rather some theoretical/statistical inconsistency.
            Even if we believe that the dataset-specific JAV coefficients are valid, and that there is no theoretical/statistical inconsistency in combining the JAV approach with the mechanics implied by factor variable notation, we need to be aware of one more thing: we are manipulating Stata in a way, and on a level, we should not. StataCorp puts great effort into making its code robust to potential user errors; in other words, the way Stata works is often intended to protect us from ourselves. The "bogus" approach undermines this strategy, so before you start writing your own code, make sure you really know what you are doing - and double (triple) check that the results of your manipulation are as intended.
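            The question raised in the first bullet - whether predictions should be based on the combined coefficients or on the dataset-specific ones - can be made concrete. For a linear predictor the two orders of operation coincide, but with a nonlinear link they do not. A small sketch, in Python rather than Stata and with made-up coefficient vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
M, n = 5, 200
x = rng.normal(size=n)

# Hypothetical coefficient vectors [b0, b1] from M imputed datasets
betas = rng.normal(loc=[0.5, 1.0], scale=0.05, size=(M, 2))
bbar = betas.mean(axis=0)  # combined (pooled) point estimates

def xb(b, x):
    return b[0] + b[1] * x

def invlogit(z):
    return 1 / (1 + np.exp(-z))

# Linear model: predicting from the combined coefficients equals
# averaging the dataset-specific predictions (by linearity)
lin_pooled = xb(bbar, x)
lin_avg = np.mean([xb(b, x) for b in betas], axis=0)

# Logit: the two orders of operation no longer agree
p_pooled = invlogit(xb(bbar, x))
p_avg = np.mean([invlogit(xb(b, x)) for b in betas], axis=0)
```

            So for strictly linear quantities the choice is immaterial, which is why the question only starts to bite for nonlinear models and for marginal effects that involve nonlinear transformations.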

            Having said all this, I will now outline the basic steps to combine the "bogus" approach with mimrgns.

            Code:
            /*
                Step 1
                
                run the model using factor variable notation
                save the results in a ster-file */
                
            mi estimate , saving(fv_results) esample(esample) : ...
            
            /*
                Step 2
                
                run the model using the JAV approach
                save the results in a ster-file
                
                Note:   if the interaction involves categorical variables
                        use a (constant) placeholder where the omitted
                        (base) category would appear had factor variables
                        been used */
            
            mi estimate , saving(jav_results) : ...
             
            /*
                Step 3 (could be step 1)
                
                define the bogus program */
                
            capture program drop bogus
            program bogus , eclass
                version 12.1
            
                args m bmat Vmat
                
                tempname b V
                
                estimates use jav_results , number(`m')
                
                matrix `b' = e(b)
                matrix `V' = e(V)
                
                estimates use fv_results , number(`m')
                
                ereturn repost b = `b' V = `V'
            end
            
            /*
                Step 4
                
                cycle through the previously saved datasets */
                
            estimates describe using fv_results
            local nfiles = r(nestresults) - 1
            
            forvalues m = 1/`nfiles' {
                bogus `m'
                
                if (`m' == 1) {
                    estimates save bogus_results , replace
                }
                else {
                    estimates save bogus_results , append
                }
            }
            
            estimates use fv_results , number(`= `nfiles' + 1')
            estimates save bogus_results , append
            
            /*
                Step 5
                
                run -mimrgns- using the bogus ster-files */
                
            mimrgns using bogus_results , esample(esample)
            Here is an example illustrating the note in step 2 and showing how you could use the final JAV estimates for the dataset-specific predictions.

            Code:
            /*
                use a toy dataset */
                
            webuse mheart5s0, clear
            mi convert flong
            
            /*
                create JAV */
                
            generate smokes_bmi = smokes * bmi
            mi register imputed smokes_bmi
            
            /*
                impute the missing values (JAV) */
                
            mi impute chained (pmm , knn(10)) age bmi smokes_bmi = ///
                attack smokes female , add(5)
            
            /*
                Step 1 */
                
            mi estimate , saving(fv_results) esample(esample) : ///
                logit attack i.smokes##c.bmi i.female
                
            /*
                Step 2 */
                
            generate zero = 0 // <- constant to be omitted
            
            mi estimate , saving(jav_results) : ///
                logit attack i.smokes bmi zero smokes_bmi i.female
            
            /*
                Step 2b
                
                get the final/combined MI estimates from JAV */
            
            tempname b_jav V_jav
            matrix `b_jav' = e(b_mi)
            matrix `V_jav' = e(V_mi)
            
            /*
                Step 3a
                
                enhance the bogus program to accept the final coefficients */
                
            capture program drop bogus
            program bogus , eclass
                version 12.1
            
                args m bmat Vmat
                
                tempname b V
                
                if mi("`bmat'`Vmat'") {
                    estimates use jav_results , number(`m')
                    matrix `b' = e(b)
                    matrix `V' = e(V)
                }
                else {
                    matrix `b' = `bmat'
                    matrix `V' = `Vmat'
                }
                
                estimates use fv_results , number(`m')
                
                ereturn repost b = `b' V = `V'
            end
            
            /*
                Step 4 */
                
            estimates describe using fv_results
            local nfiles = r(nestresults) - 1
            
            forvalues m = 1/`nfiles' {
                
                bogus `m' // <- uses dataset-specific jav coefficients
                * bogus `m' `b_jav' `V_jav' // <- uses the final MI coefficients
                
                if (`m' == 1) {
                    estimates save bogus_results , replace
                }
                else {
                    estimates save bogus_results , append
                }
            }
            
            estimates use fv_results , number(`= `nfiles' + 1')
            estimates save bogus_results , append
            
            /*
                Step 5 */
                
            mimrgns using bogus_results , esample(esample)
            Best
            Daniel
            Last edited by daniel klein; 21 Apr 2017, 01:58.



            • #7
              I do not agree that JAV is, in general, the correct way to go for interactions or polynomials; in special cases JAV is "correct" but in general, no; see:

              Seaman, SR, Bartlett, JW and White, IR (2012), "Multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods," BMC Medical Research Methodology, 12, 46

              van Buuren, S (2012), Flexible imputation of missing data, CRC Press, esp. pp. 130-132



              • #8
                I do not have access to Stef van Buuren's book (although I have an older copy), but the article cited seems to investigate the case where you use JAV with multivariate imputation. If I remember correctly, this is also what Paul von Hippel does in his article from 2009. I honestly do not really know which conclusion I would draw from studies comparing a chained-equation-based passive imputation to a multivariate imputation using the JAV approach. You are basically changing two things at once - the approach (JAV vs. passive) and the imputation model. This also raises the question whether JAV would actually yield valid or better results in a chained equation setting, leaving us with yet another uninformed choice to make if there are missing categorical predictors that are part of the interaction effect. Technically speaking, I am not sure whether the "bogus" approach would work for (multivariate normal) imputed categorical variables that contain non-integer values.

                Best
                Daniel



                • #9
                  Here is some discussion on passive vs. JAV from 2009:

                  http://www.stata.com/statalist/archi.../msg00602.html

                  http://www.stata.com/statalist/archi.../msg00613.html

                  Allison favors JAV. There is at least one exception. Suppose you are trying to compute a scale that is the sum of several items. In an email to me, Allison said “It's better, when possible, to impute at the item level rather than the scale level. Otherwise you lose a lot of data. This is one case where JAV doesn't apply.”

                  Another good source:

                  White, Ian R., Patrick Royston, and Angela M. Wood. 2011. "Multiple imputation using chained equations: Issues and guidance for practice." Statistics in Medicine, 30(4): 377-399.

                  They look at various approaches, including JAV and passive, and note that each has potential problems.

                  I'll just add that if you are estimating a linear regression model then consider using sem with full information maximum likelihood (FIML). It is certainly simpler than MI. But, sem does not currently support factor variables, so you would still have problems with margins.
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology



                  • #10
                    Thank you all very much for your guidance on this issue. I did impute scale scores (i.e., means of items), so I need to look into whether I have a large enough sample size to impute individual items instead. If I do, then it sounds like I should passively compute the scale means and enter the interaction terms using the factor variable approach (and then use the mimrgns command to plot the interactions). Am I understanding this correctly?

                    I am worried that I will not have a large enough sample to impute all the items, in which case I may just have to impute scale scores and the interaction terms (i.e., use the JAV approach). If I do this, I will start by taking Richard's advice and conducting the analyses with both the JAV and the factor variable approach to see whether the results differ substantially. Hopefully they will not, and I can just use the factor variable approach and the mimrgns command. If the results from the two approaches do differ, I will try Daniel's bogus program.

                    Thanks again everyone!





                    • #11
                      Hi Everyone,
                      I just got interested (and confused) by this debate, which has been really stalling (what I thought was) a very simple OLS regression with just 5 IVs, 198 cases (which would be only 156 cases with listwise deletion), and an interaction term. The margins command offers the real payoff for the analysis, so I was wondering: do you think MI is still worth it? Can any software other than Stata do MI with interaction effects and/or marginal effects? I would also like to do sem with multiple imputation in my next project, and I read elsewhere here that this is not possible (or legitimate) in Stata either.

