  • Changing "NULL" to missing

    I have been given a dataset of people's weights in which missing values (people who were not weighed) are coded as the text "NULL".

    Stata doesn't treat the variable as numeric, so I can't do anything with it.

    When I run replace weight = . if weight == "NULL", I get a "type mismatch" error.

    How do I solve this?


  • #2
    Welcome to Statalist. Please see the FAQ to guide any future posts (and please use your real name).

    As for your error, you are attempting to replace values of weight, a string variable, with the numeric missing value. Instead, try the following:
    Code:
    replace weight = "" if weight == "NULL"
    Josh
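
    Note that the replace above blanks out "NULL" but leaves weight stored as a string. A minimal sketch of finishing the conversion, assuming "NULL" is the only non-numeric value and using a small made-up dataset for illustration:

    ```stata
    * Toy data (an assumption for illustration, not the poster's actual file)
    clear
    input str8 weight
    110
    NULL
    150
    end
    * Blank out the "NULL" placeholder; weight is still a string at this point
    replace weight = "" if weight == "NULL"
    * Only numeric text and empty strings remain, so destring needs no force option
    destring weight, replace
    describe weight
    ```

    Because force is not used here, destring should leave the variable unconverted if some other non-numeric value is still present, rather than silently turning it into missing.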



    • #3
      Your problem is that, because of the text "NULL" in your data, your variable was read into Stata as a string variable rather than as a numeric variable. So you need to convert it to a numeric variable, and you need to treat the "NULL" as missing. Because your variable is a string, and . is a numeric missing value, you got the type mismatch error.

      First, I would check that "NULL" is the only value of weight that isn't numeric. Then I would use destring to do the conversion, with the force option so that the non-numeric values are set to missing.
      Code:
      clear
      input str8 weight
      110
      120
      130
      NULL
      150
      end
      list, clean
      describe weight
      tab weight if real(weight)==.
      destring weight, replace force
      describe weight
      list, clean
      Output:
      . list, clean
      
             weight  
        1.      110  
        2.      120  
        3.      130  
        4.     NULL  
        5.      150  
      
      . describe weight
      
                    storage   display    value
      variable name   type    format     label      variable label
      ------------------------------------------------------------------------------------------------
      weight          str8    %9s                   
      
      . tab weight if real(weight)==.
      
           weight |      Freq.     Percent        Cum.
      ------------+-----------------------------------
             NULL |          1      100.00      100.00
      ------------+-----------------------------------
            Total |          1      100.00
      
      . destring weight, replace force
      weight: contains nonnumeric characters; replaced as int
      (1 missing value generated)
      
      . describe weight
      
                    storage   display    value
      variable name   type    format     label      variable label
      ------------------------------------------------------------------------------------------------
      weight          int     %10.0g                
      
      . list, clean
      
             weight  
        1.      110  
        2.      120  
        3.      130  
        4.        .  
        5.      150
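
      The "check first" step can also be written as a guard that stops the do-file if anything unexpected turns up. This is only a sketch on the same example data; the variable name weight_num is a hypothetical choice, and it builds a numeric copy with real() instead of overwriting weight:

      ```stata
      clear
      input str8 weight
      110
      120
      130
      NULL
      150
      end
      * Stop with an error if any value other than "NULL" fails to parse as a number
      assert weight == "NULL" if missing(real(weight))
      * Build a numeric copy; real() returns missing for non-numeric strings
      * (weight_num is a hypothetical name chosen for this sketch)
      generate weight_num = real(weight)
      list, clean
      ```

      Keeping the original string variable alongside the numeric copy makes it easy to compare the two before dropping anything. Note that empty strings would also trip the assert as written, so the condition may need loosening if blanks are expected.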
