  • Problem regarding imputation of missing value

    Hi,
    I am working with a self-reported cross-sectional dataset. Most of the variables are ordinal. Apart from my dependent variable (which is also ordinal), every independent variable has some missing values (10-15% per variable). I tried to impute them with mi, using ologit. I set M = 20 (as recommended by the Stata code book) and the random-number seed to 1234 (I am still a little confused about the purpose of the random-number seed). After imputing the missing values for one variable, the total number of observations increases: before imputation it is 1953, and after imputation it becomes 6133. If I repeat the procedure for another variable, it increases further. One more important point: in the mi procedure I use only one predictor for the imputed variable, because if I include the other independent variables as predictors, Stata reports an error, since those variables have missing values themselves. Why does the number of observations increase this much, and how can I solve this issue?

  • #2
    Please show the exact code that you have used.

    The code should look something like

    Code:
    set seed 42 // <- any number will do; used for reproducible results
    mi set flong
    mi register imputed varlist
    mi impute chained (ologit) varlist = depvar [ varlist ] , add(20) noisily
    Best
    Daniel


    • #3
      Thanks for your prompt reply, Mr. Daniel Klein. I am using the following code:
      Code:
      mi set mlong
      mi register imputed varlist
      mi impute ologit V109 V22, add(20) rseed(1234)
      I have just tried the command you mentioned in #2. The issue is still the same: the number of observations increased from 1953 to 41063.


      • #4
        Concerning the code, make sure you include your dependent/outcome/response variable as a predictor variable in the imputation model. Actually, (at least) all variables that you use in the analysis later should also be in the imputation model.

        The increase in observations is expected and documented in the manual entries on MI. Start by reading

        Code:
        help mi styles
        which explains what is stored in the extra observations with each mi-style.
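
        As a rough illustration of what those extra observations look like (a sketch, not a quote from the manual entry): in the mlong and flong styles, mi adds a system variable _mi_m that is 0 for the original data and 1, ..., M for the imputations. With flong, every imputation stores a full copy of the data, so the dataset holds N x (M + 1) observations, i.e. 1953 x 21 = 41013 with add(20), which is of the same order as the count reported in #3. With mlong, only the incomplete observations are copied, and 6133 = 1953 + 20 x 209 would correspond to 209 observations with a missing value in V109. The original data (m = 0) keep their missing values, so a "." remains visible there after imputation.

        Code:
        mi query              // reports the current mi style and the number of imputations
        mi describe           // summarizes complete/incomplete observations and registered variables
        tabulate _mi_m        // observations per m (mlong/flong only; _mi_m does not exist in wide)
        count if _mi_m == 0   // the original observations, which still contain the missing values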

        Best
        Daniel


        • #5
          Thanks for your response, Mr. Daniel. As shown in the code in #3, V22 is my dependent variable, and I am using it as the predictor in the imputation model. If I add my other independent variables as predictors in the imputation model, Stata shows an error because those variables also have missing values.


          • #6
            Originally posted by Neeraj Kumar
            If I add my other independent variables as predictors in the imputation model, Stata shows an error because those variables also have missing values.
            That is why I suggested you should use

            Code:
            mi impute chained ...
            where you specify all predictors with no missing values on the right-hand side of the equals sign.

            Say, V109 and V110 were the only two (categorical) predictors with missing values; you would then impute those missing values as

            Code:
            mi impute chained (ologit) V109 V110 = V22 , add(20)
            Do not run a series of single imputation models; instead, use a chained-equations approach and impute missing values in all variables at once.

            Best
            Daniel


            • #7
              Thanks for your patience and prompt reply, Mr. Daniel. After doing what you suggested in #6, the issue is still the same: the number of observations increases from 1953 to 41063. I do not understand whether this is right or wrong. One more thing I noticed in the data browser: I still see some missing values (.).


              • #8
                Please follow my advice in #4 and read

                Code:
                help mi styles
                where you will find the explanation for the extra observations in your dataset after imputing missing values along with an illustrating example. I do not know what to add to this.

                Best
                Daniel
                Last edited by daniel klein; 29 Apr 2019, 23:34.


                • #9
                  Thanks for your advice, Mr. Daniel. Now I know why the number of observations was increasing: I was using the mlong and flong styles. If I use the wide or flongsep style, the number of observations stays the same. But now I am unsure which style to select. With flongsep and add(20), Stata generates 20 additional files; with wide and add(20), it generates 20 additional variables for each imputed variable. So which style should I choose? Can I reduce add(20) to add(2)? Once again, thanks for your helpful advice.


                  • #10
                    Originally posted by Neeraj Kumar
                    So which style should I choose?
                    Choose whichever style is convenient. You will have to use mi estimate to run the analyses and the latter does not care which style you use.
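
                    A minimal sketch of what that looks like in practice, assuming an ordered logit of V22 on V109 and V110 as the analysis model (the variable names come from earlier in the thread; the actual analysis model is not stated there):

                    Code:
                    mi estimate: ologit V22 V109 V110   // fits the model in each imputed dataset and combines the results
                    mi convert mlong, clear             // switches the storage style without re-imputing

                    The same mi estimate line should work unchanged before and after the conversion, which is the sense in which the style is a matter of convenience.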

                    Originally posted by Neeraj Kumar
                    Can I reduce add(20) to add(2)?
                    You can, but why would you want to do that?

                    Best
                    Daniel


                    • #11
                      Thanks for your reply, Mr. Daniel. I have no issue with add(20); I just wanted to know whether I could do that, and what the impact would be if I reduced it to add(2). One more thing: as I mentioned in #1, all of my variables are categorical except one. A few of them are categorical but unordered, so I have to impute their missing values with mlogit. Earlier I used only ordered logit, because those variables have ordered categories. I am sharing my code; please let me know whether it is correct, because it produces an error when I run it.
                      Code:
                      set seed 42
                      mi set wide
                      mi register imputed V108 V109 V110 V111 V112 V113 V126 M127 M128 M129 M130 M132 V133 V134 V135 V136 V137 V138 V139 V140 V141 V142 V143 reli Hth EM SC ED incomescale Sex Marital V237
                      mi register regular V22
                      mi impute chained (ologit) V108 V109 V110 V111 V112 V113 V126 M127 M128 M129 M130 M132 V133 V134 V135 V136 V137 V138 V139 V140 V141 V142 V143 (mlogit) reli Hth EM SC ED incomescale (logit) Sex Marital (regress) V237 = V22, add(20)
                      Thank you so much for your help.


                      • #12
                        Originally posted by Neeraj Kumar
                        If I reduce it to add(2), what will be the impact?
                        You will only have 2 complete datasets. Part of the theory behind MI is based on asymptotics in M, i.e., the number of imputations. There are a couple of rules of thumb on how many imputations you need to get valid results; 2 is most certainly not enough.
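
                        One way to judge whether M is large enough, sketched here under the assumption of an ologit analysis model of V22 on V109 and V110 (an illustration only; the thread does not state the analysis model): mi estimate can report the fraction of missing information (FMI) for each coefficient, and more imputations can be added to an existing mi dataset without starting over.

                        Code:
                        mi estimate, vartable dftable: ologit V22 V109 V110   // vartable reports RVI, FMI, and relative efficiency
                        mi impute chained (ologit) V109 V110 = V22, add(30)   // adds 30 imputations to those already present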

                        Originally posted by Neeraj Kumar
                        I am sharing my code; please let me know whether it is correct, because it produces an error when I run it.
                        Which error? What exactly does Stata do/respond when you issue that syntax?

                        Best
                        Daniel


                        • #13
                          Thanks, Mr. Daniel. It shows error 2000. Instead of imputing everything together, would it be OK to first impute all the ologit variables, then use those variables when imputing the mlogit variables, then use the ologit and mlogit variables to impute the logit variables, and finally use all of them to impute the continuous variable?


                          • #14
                            No. As I mentioned earlier, you cannot run the models separately. The reason is that you cannot include predictors with missing values; however, you must include those predictors to account for their correlations with the variables that you are imputing. Also, specifying separate models will not help with the error message. The error means that there is some model for which not a single observation has non-missing values on all variables. You need to find out where that error comes from. Add the following two options to your mi impute command

                            Code:
                            mi impute chained ... , ... noisily showcommand
                            noisily will show all models that are estimated; showcommand is not documented but will give you an idea which model Stata is trying to run.
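
                            To narrow down where r(2000) might come from, a sketch along the following lines can help before imputing. It is meant for the original data before mi set (after mi set, the mi misstable versions of these commands apply); V108-V113 stand in for the full list of imputation variables, and nmiss is a hypothetical helper variable:

                            Code:
                            misstable summarize V108 V109 V110 V111 V112 V113    // missing-value counts per variable
                            misstable patterns V108 V109 V110 V111 V112 V113     // which variables are missing together
                            egen nmiss = rowmiss(V108 V109 V110 V111 V112 V113)  // number of missing values per observation
                            count if nmiss == 0                                  // observations complete on all listed variables
                            drop nmiss                                           // remove the helper variable again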

                            Best
                            Daniel


                            • #15
                              I used the following code, but it still shows an error: error 498, perfect predictor detected.
                              Code:
                              set seed 42
                              mi set wide
                              mi register imputed V105 V106 V107 V108 V109 V110 V111 V112 V113 V126 V127 V128 V129 V130 V132 V133 V134 V135 V136 V137 V138 V139 V140 V141 V142 V143 V237 religion health em sclass ed incomescale marriage gd
                              mi register regular V22
                              mi impute chained (ologit) V105 V106 V107 V108 V109 V110 V111 V112 V113 V126 V127 V128 V129 V130 V132 V133 V134 V135 V136 V137 V138 V139 V140 V141 V142 V143  health sclass incomescale (mlogit) religion em  ed (logit) marriage gd (regress) V237 = V22, add(20) noisily showcommand
                              Thank you so much for your prompt reply.
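
                              For readers who run into the same r(498): the thread ends here without a reply, so the following is only a hedged pointer, not a confirmed fix. mi impute offers an augment option for the categorical methods (logit, ologit, mlogit), which performs augmented regression when perfect prediction is detected; the noisily output can also be used to identify and drop the offending predictor. A shortened sketch using a subset of the variables above:

                              Code:
                              mi impute chained (ologit) V108 V109 health (mlogit) religion (logit) marriage gd (regress) V237 = V22, add(20) augment noisily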
