  • Error Rc 2000, no observations

    Hello there,

    I'm trying to run a logistic regression on indicators of missingness in my variables, to analyse the missingness mechanism and see whether multiple imputation is needed. I put all the variables in a logistic regression model, but all I get back is:
    no observations
    r(2000);

    I read up on what the error means. I did have a string variable, which I then converted to numeric, but the problem persists. When I run a univariable analysis with the indicator of missingness in the specific variable as the outcome, it works.

    for example:
    logistic miss_aneurysm_size sah1uia0
    -> no problem
    logistic miss_aneurysm_size sah1uia0 ageatdiagnosis aneurysm_loc HTN antiHTN heartdisease PVD DM preICH preSAH preStroke hyperchol statinpre antiplatelet SS HRT OCP postmeno i.premorbiddisabilitymrs i.Smoker i.Drinker fhsahuia druguse clipping coiling noth CVS Collapse consan Warfarin i.activityictus admissiondbp admissionsbp admissioninr
    -> still works
    logistic miss_aneurysm_size sah1uia0 ageatdiagnosis aneurysm_loc HTN antiHTN heartdisease PVD DM preICH preSAH preStroke hyperchol statinpre antiplatelet SS HRT OCP postmeno i.premorbiddisabilitymrs i.Smoker i.Drinker fhsahuia druguse clipping coiling noth CVS Collapse consan Warfarin i.activityictus admissiondbp admissionsbp admissioninr admissionsodiumana angioplast DSA coronaryrep crimatt craniectomy deathfriend durantiHTN i.ECG i.ethn i.fishergradeonadmission FNDad gcsdrop haemevac heada heartrate hemi HCP iapapa ituhduad LOC majillrel menin multipleA paped rebleed seiz stent infarct vomit i.wfnsad
    -> doesn't work

    With this I want to check which variables are related to missingness in the variable with missing values, so that I know which covariates to put into the imputation model. Is there a limit on the number of covariates I can put into the model? Is there a minimum number of observations a variable must have to go into the model? My sample size is 1640, so this should not be a problem.

    Help would be much appreciated.

    Thank you

    Isabel


  • #2
    I can think of two possible explanations: (1) one of the variables you added in the third model is a string variable; use -describe- to check on this; (2) there is a good bit of missing data and every observation has a missing value on at least one variable in the model, so nothing is left after listwise deletion; use -misstable- to look into this.
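
    A minimal sketch of those two checks, assuming a shortened covariate list (the local macro `xvars` and the new variable `nmiss` are hypothetical names; substitute the full list from the failing model, without the i. prefixes):

    ```stata
    * Hypothetical short list; replace with every covariate in the failing model
    local xvars sah1uia0 ageatdiagnosis aneurysm_loc HTN admissionsodiumana

    * (1) String variables show up with a str# storage type
    describe `xvars'

    * (2) Missing-value counts for each variable that has any
    misstable summarize `xvars'

    * Number of covariates missing in each observation (nmiss is a hypothetical name)
    egen nmiss = rowmiss(`xvars')

    * Observations the model could actually use after listwise deletion
    count if nmiss == 0 & !missing(miss_aneurysm_size)
    ```

    If the final count is 0, no observation is complete on all variables, which is consistent with "no observations, r(2000)". The lines should be run together (for example from a do-file), because the local macro does not survive between separately executed selections.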



    • #3
      There are no string variables. One variable was a string, but I noticed that and converted it. Quite a few variables have missing data: one has up to 1200 missing values, most have 400 missing values, and there are 35 variables in total. But only two of the variables with missing data will go into my model of interest; the others are just auxiliary variables. Any ideas on how to solve the problem?



      • #4
        How many observations have missing values on the variables of substantive interest? If fewer than 5%, you can probably ignore the problem; if more, I would turn to MI. With so much missing data in the above models, you can't really get much of interest (particularly from the variable that is missing almost 3/4 of the time).
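
        One way to get those percentages, sketched under the assumption that the missingness indicators named in this thread (miss_aneurysm_size, miss_aneurysm_location) are coded 1 = missing and 0 = observed:

        ```stata
        * The mean of a 0/1 indicator is the proportion of observations coded 1
        foreach v of varlist miss_aneurysm_size miss_aneurysm_location {
            quietly summarize `v'
            display "`v': " %5.2f 100*r(mean) "% missing"
        }
        ```

        If the indicators are coded differently, `misstable summarize` on the underlying variables reports the missing counts directly.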



        • #5
          I don't have any missing values in the outcome variable of my model of interest, so according to some books I could just do a complete case analysis. But I don't think that would be right: 5.9% of values are missing in one variable and 16.83% in the other, so it's not negligible. What would be an option? Should I do univariable analyses with the missingness indicator as the outcome and each auxiliary variable, one by one, as the independent variable? Or should I reduce the number of covariates in the model I use to analyse the missingness mechanism?



          • #6
            I've just attached the table with the missing values


            • #7
              I agree that there is too much missing data for a complete case analysis to be credible; as I said, I would use multiple imputation (MI) with sensitivity analyses.



              • #8
                Yes, that's what I want to do. But to know which auxiliary variables to put in the imputation model, I need to know which variables are associated with the indicators of missingness for the variables in the attached table (miss_aneurysm_size is a binary variable saying whether a person's value is missing in the aneurysm size variable; likewise miss_aneurysm_location). For this I wanted to run a multiple logistic regression with all the variables that could influence missingness as independent variables, but it doesn't work. Is it enough to do univariable analyses and put all the variables that show an association with the outcome variable into the imputation model? Maybe I'm not expressing my question clearly enough. Sorry about that.



                • #9
                  Isabel:
                  if you choose to perform a set of univariable logistic regressions, you lose the correlations between each single predictor and the other independent variables in your model that may be related to the missing values. Hence, I would prefer a multiple logistic regression followed by -mi-, as Rich suggested. However, the main issue is to understand why that approach did not work in your case; in turn, the first step is to investigate what "did not work" means with your data. Things would probably be easier if you also posted what Stata gave you back, using code delimiters.
                  Kind regards,
                  Carlo
                  (Stata 19.0)



                  • #10
                    you mean what STATA shows you in the result window?

                    . logistic miss_aneurysm_size sah1uia0 ageatdiagnosis aneurysm_loc HTN antiHTN heartdisease PVD DM p
                    > reICH preSAH preStroke hyperchol statinpre antiplatelet SS HRT OCP postmeno i.premorbiddisabilitym
                    > rs i.Smoker i.Drinker fhsahuia druguse clipping coiling noth CVS Collapse consan Warfarin i.activ
                    > ityictus admissiondbp admissionsbp admissioninr admissionsodiumana angioplast DSA coronaryrep crim
                    > att craniectomy deathfriend durantiHTN i.ECG i.ethn i.fishergradeonadmission FNDad gcsdrop haemeva
                    > c heada heartrate hemi HCP iapapa ituhduad LOC majillrel menin multipleA paped rebleed seiz stent
                    > infarct vomit i.wfnsad
                    no observations
                    r(2000);

                    I use the do-file to save all the commands, but I don't run them from the do-file; I run them from the Command window directly.

                    Is that what you meant?

                    Thanks again for your help!

                    Kind regards,

                    Isabel



                    • #11
                      Isabel:
                      not quite. Code delimiters can be found by clicking the #-button among the Advanced editor (A-button) options.
                      That said:
                      - are you sure that you need so many predictors in your model?
                      - I would re-run your logistic regression from scratch, adding one predictor at a time to spot where Stata (not STATA, please) chokes.
                      Last edited by Carlo Lazzaro; 27 Jun 2015, 10:22.
                      Kind regards,
                      Carlo
                      (Stata 19.0)
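
                      The "one predictor at a time" idea can be sketched without re-fitting the model at every step, by tracking how many complete observations remain as each variable is added. This is a variation on the suggestion above, not the exact procedure described; the starting list, the candidate list, and the variable name `nmiss_check` are assumptions for illustration:

                      ```stata
                      * Predictors from a model known to run (abbreviated here as an assumption)
                      local model sah1uia0 ageatdiagnosis aneurysm_loc

                      * Hypothetical subset of the added predictors; list them without i. prefixes
                      foreach v of varlist admissionsodiumana angioplast DSA coronaryrep crimatt {
                          local model `model' `v'
                          capture drop nmiss_check
                          * Count missing values across the outcome and the predictors so far
                          egen nmiss_check = rowmiss(miss_aneurysm_size `model')
                          quietly count if nmiss_check == 0
                          display "after adding `v': " r(N) " complete observations"
                      }
                      drop nmiss_check
                      ```

                      The predictor at which the count reaches 0 is where listwise deletion empties the estimation sample. Note that -logistic- can drop further observations because of perfect prediction, which this count does not capture.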



                      • #12
                        Dear Carlo,

                        I removed the three variables with the most missing values and the regression model now runs. If I enter all the variables that are associated with the missingness into my imputation model (do I have to add the ones that were omitted due to perfect prediction of the outcome variable? I guess so, right?), it doesn't work. It only works if I enter just variables with no missing values. If I want to run it, I have to "force" it, and then only part of the missing values are imputed. How can I solve this problem? There are covariates with missing values that I would like to enter, although I don't want to impute them, as they are not going into my model of interest.
                        Also, there are three variables in total that I have to impute. But when I try to analyse the missingness mechanism for the second one, basically all the variables are omitted or empty. Why is that? I've attached a Word file with the output.

                        I guess all these questions are very basic. I'm really trying my best. I started using Stata in November and have taken two courses on missing data so far. The second one was really good, but of course the course material did not have the problems I'm now facing with my own data.
