Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Outputting missing data

    I am trying to create a report to send out to a group of medical centers that details each record that is missing 1 of 31 variables. I have a group of 30+ database managers.

    The report might look like this, with the participating medical center, the record id from the database, the variable of interest, and a variable that details the date. The report would output (for each medical center) all the records missing variable1, then all the records missing variable2, and so forth.

    Medical Center One
    recordid variable1 date

    recordid variable2 date

    Thoughts on how to efficiently create this?

    Thanks in advance,
    Donald Likosky

  • #2
    Loop over the variables of interest and output what you want to excel if that variable is missing. You could use the sheet option in your export statement if you want it all in one file with one excel worksheet for each varaible.

    Code:
    foreach var of varlist variable* {
        export excel ... if mi(`var')
    }
    Last edited by grim; 09 Nov 2014, 15:59.

    Comment


    • #3
      I don't understand your desired output. What does "recordid variable1 date" mean? Do you want to print out that line as a subheader (under the medical center header)? Fine, but if the list consists of precisely those where variable1 is missing, what is the point of listing repeatedly that missing value in the body of the report? Would it make more sense to have the second level header just be the name of the variable in question, and then the third level subheader be "recordid date"? So something like this:

      Code:
      levelsof medical_center, local(mcs)
      
      // NEED TO DEFINE A LOCAL MACRO THE ENUMERATES THE 31 VARIABLES
      // NOT KNOWING THEIR NAMES I CAN'T DO THAT FOR YOU
      // FOR NOW I'LL JUST ASSUME YOU HAVE ALREADY DONE THIS
      // AND THE LOCAL MACRO IS CALLED vbles
      
      foreach m of local medical_center {
          display `"`m'"'
          foreach v of varlist `vbles' {
              display `"`v'"'
              list recordid date if missing(`v'), noobs clean
          }
      }
      will give you the basics. You can add other things to, for example, put some blank lines between medical centers, or between variables. And you could play around with suppressing variables if the particular medical center doesn't have any missings for that one. Also look at the -abbreviate()- option of -list- if you need more of the variable name spelled out than you get by default.

      But this will give you the basic starting structure.

      Comment


      • #4
        Thanks Clyde,

        I have edited the code (below). Although it doesn't retrieve any errors, it doesn't provide any output. Am I missing something?

        Thank you,
        Donny


        Code:
        levelsof Abbreviation, local(mcs)
        
        // NEED TO DEFINE A LOCAL MACRO THE ENUMERATES THE 31 VARIABLES
        // NOT KNOWING THEIR NAMES I CAN'T DO THAT FOR YOU
        // FOR NOW I'LL JUST ASSUME YOU HAVE ALREADY DONE THIS
        // AND THE LOCAL MACRO IS CALLED vbles
        
        foreach m of local Abbreviation {
            local which: label (artpump pulsperf tot_urine phmgmt artfporesz perfustm xclamptm biocoat biotype systype uncardsuc cell totcellv totprimvol nsalbumin25 nsalbumin5 nshetastarch lowcpbhct firstinroom lastprecpbhct firsticuhct firstcpbhct lastcpbhct autobld glucfirst gluchigh gluclast glucinsulindrip inotrope48hrs bypassreturn temparterialhigh lowcore) `vbles'
            display `"`m'"'
            foreach v of varlist `vbles' {
                display `"`v'"'
                list recordid date if missing(`v'), noobs clean
            }
        }

        Comment


        • #5
          Clyde gave you clean code, but not knowing macros, you mangled it. See below, maybe it will clarify.

          Code:
          *========ignore this, prepwork
          clear
          set obs 1000
          
          gen recordid=_n
          gen str20 date="January 1"
          
          *========DO PAY ATTENTION TO THE CREATION OF THE MACRO WITH VARIABLE NAMES!
          local vbles "artpump pulsperf tot_urine phmgmt artfporesz perfustm xclamptm biocoat biotype systype uncardsuc cell totcellv totprimvol nsalbumin25 nsalbumin5 nshetastarch lowcpbhct firstinroom lastprecpbhct firsticuhct firstcpbhct lastcpbhct autobld glucfirst gluchigh gluclast glucinsulindrip inotrope48hrs bypassreturn temparterialhigh lowcore"
          
          *========ignore this, just creating more fake data
          foreach create of local vbles {
                  gen `create'=1 if uniform()<.99
                  }
          
          gen str30 medical_center = ""
          replace medical_center="general hospital" if _n<=250
          replace medical_center="VA hospital" if _n>250
          replace medical_center="Catholic hospital" if _n>500
          replace medical_center="Lutheran hospital" if _n>900
          
                  
          *==========NOW CLYDE'S CODE KICKS IN        
          levelsof medical_center, local(mcs)
          
          // NEED TO DEFINE A LOCAL MACRO THE ENUMERATES THE 31 VARIABLES
          // NOT KNOWING THEIR NAMES I CAN'T DO THAT FOR YOU
          // FOR NOW I'LL JUST ASSUME YOU HAVE ALREADY DONE THIS
          // AND THE LOCAL MACRO IS CALLED vbles
          
          foreach m of local mcs {
              display `"`m'"'
              foreach v of varlist `vbles' {
                  display `"`v'"'
                  list recordid date if missing(`v'), noobs clean
              }
          }

          Comment


          • #6
            BTW -- it is a silly thing about Stata -- if a macro is empty, it just doesn't return anything. Most languages, it at least tells you "empty macro" "out of range at macroname[0]" or something meaningful. Stata is awesome in many ways, but handling empty macros is not one of them. There are a lot of other quirks about when it needs to be wrapped in `' and when "`'" and when you use a macro-name barren of quotes. Lots of people (myself included) often get confused and have to do trial-and-error and/or look for examples. So it's quite understandable you'd be flummoxed by Clyde's advanced usage of multiple macros.
            Last edited by ben earnhart; 09 Nov 2014, 19:45.

            Comment


            • #7
              Thank you Ben.

              Comment


              • #8
                Ben's advice is right if the emphasis is on macros being tricky for many people to learn in so far as they bring different expectations from previous learning of other languages. But they do get easier to use over time.

                The main problem with your last code shown was simple: you made a wild guess at what the syntax might be.

                The syntax

                Code:
                 
                local which:   label { valuelabelname | (varname) } { maxlength | # [#_2] } [, strict]
                is for looking up individual value labels (or alternatively, their maximum length). You didn't actually attempt to use the macro created by the statement defining local which, but from your report the corresponding statement didn't generate an error. My guess is that the statement was so far from the form above that Stata just copied the text. But as said, you never used what you created, which was one problem.

                Comment


                • #9
                  Huh. I re-ran the code, and it doesn't seem to do what it ought to do. For example, on the report for general_hospital, it reports all twelve (based on the seed it happened to have at the time) cases with missing values, instead of the three it should have I fixed it. I added "& medical_center=="`m'"" to the list command. I blindly trusted Clyde's code, which is usually a reasonable thing to do, but in this case, one small oops.

                  Code:
                  *========ignore this, prepwork
                  clear
                  set obs 1000
                  set more off
                  
                  *===to make it replicable
                  set seed 1971
                  
                  gen recordid=_n
                  gen str20 date="January 1"
                  
                  *========DO PAY ATTENTION TO THE CREATION OF THE MACRO WITH VARIABLE NAMES!
                  local vbles "artpump pulsperf tot_urine phmgmt artfporesz perfustm xclamptm biocoat biotype systype uncardsuc cell totcellv totprimvol nsalbumin25 nsalbumin5 nshetastarch lowcpbhct firstinroom lastprecpbhct firsticuhct firstcpbhct lastcpbhct autobld glucfirst gluchigh gluclast glucinsulindrip inotrope48hrs bypassreturn temparterialhigh lowcore"
                  
                  *========ignore this, just creating more fake data
                  foreach create of local vbles {
                          gen `create'=1 if uniform()<.99
                          }
                  
                  sort recordid
                  gen str30 medical_center = ""
                  replace medical_center="general hospital" if _n<=250
                  replace medical_center="VA hospital" if _n>250
                  replace medical_center="Catholic hospital" if _n>500
                  replace medical_center="Lutheran hospital" if _n>900
                  
                          
                  *==========NOW CLYDE'S CODE KICKS IN        
                  levelsof medical_center, local(mcs)
                  
                  // NEED TO DEFINE A LOCAL MACRO THE ENUMERATES THE 31 VARIABLES
                  // NOT KNOWING THEIR NAMES I CAN'T DO THAT FOR YOU
                  // FOR NOW I'LL JUST ASSUME YOU HAVE ALREADY DONE THIS
                  // AND THE LOCAL MACRO IS CALLED vbles
                  
                  foreach m of local mcs {
                      display `"`m'"'
                      foreach v of varlist `vbles' {
                          display `"`v'"'
                          list recordid date if missing(`v') & medical_center=="`m'", noobs clean 
                      }
                  }

                  Comment


                  • #10
                    Ben,

                    You are right. I goofed in that code. Thanks for finding the error and fixing it.

                    Clyde

                    Comment


                    • #11
                      BTW -- I created three errors in my attempt to fix it, not getting my bare macroname, `macroname', and "`macroname'" straight, but finally got it right.

                      Comment


                      • #12
                        ps. Donald -- once you get Stata working again, what was your plan for getting the info out of Stata? Do you know how to use .log files? I've found it amazing how many people who learn Stata w/out utilizing logs.

                        Comment

                        Working...
                        X