Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to replace observations with non-missing values from other variables but in the order they are encountered?

    Hi everyone,

    I have the following example data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str19 A byte B int C long D byte E long F byte G int H byte I long J byte(K L M) int N byte O str1(total1 aged blind_disab)
    "Total, New York"     .    . 621937 . 137566 .     . . 66822 . . .     . . "" "" ""
    "Albany"              .    .   6192 .    773 .     . .   976 . . .     . . "" "" ""
    "Allegany"            .    .   1461 .    149 .     . .   157 . . .     . . "" "" ""
    "Bronx"               .    .  86131 .  14126 .     . . 12662 . . .     . . "" "" ""
    "Schnectady"          . 3814      . .      . .  3411 .     . . . .  1714 . "" "" ""
    "Schoharie"           .  726      . .      . .   600 .     . . . .   303 . "" "" ""
    "Schuyler"            .  463      . .      . .   382 .     . . . .   183 . "" "" ""
    "Total, Rhode Island" .    .      . .      . . 24084 .     . . . . 12370 . "" "" ""
    "Bristol"             .    .      . .      . .   289 .     . . . .   140 . "" "" ""
    "Kent"                .    .      . .      . .  2605 .     . . . .  1221 . "" "" ""
    "Newport"             .    .      . .      . .  1163 .     . . . .   514 . "" "" ""
    end


    And I want total1 aged blind_disab to look like:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(total1 aged blind_disab)
    621937 137566 662822
      6192    773 976  
      1461    149 157  
     86131  14126 12662
      3814   3411 1714
       726    600 303  
       463    382 183  
     24084  12370 .    
       289    140 .    
      2605   1221 .    
      1163    514 . 
    end

    I tried the following code and get the following error:

    Code:
    clear
    import excel "C:/ssdi01.xlsx", sheet("Sheet3")
    edit
    
    . gen total1 = ""
    (11 missing values generated)
    
    . gen aged = ""
    (11 missing values generated)
    
    . gen blind_disab = ""
    (11 missing values generated)
    
    .
    . foreach j of varlist B-O{
      2.         foreach i in `j'{
      3.                 replace total1 = `i' if `i' != .
      4.                 replace aged = `i' if `i' != .
      5.                 replace blind_disab = `i' `i' != .
      6.         }
      7. }
    type mismatch
    r(109);
    
    end of do-file
    
    r(109);
    Any help is much appreciated!

    Shree Baba

  • #2
    I haven’t looked at all the code, but immediately see one glaring error. When you generate new variable with
    Code:
    gen newvar=“”
    this is creating a string variable, but the replace commands specify a numeric variable. Instead code:
    Code:
    gen newvar=.
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1

    Comment


    • #3
      Thank you, Carole.

      I changed the code as suggested and now I get the following error:

      Code:
      . gen total1 = .
      (11 missing values generated)
      
      . gen aged = .
      (11 missing values generated)
      
      . gen blind_disab = .
      (11 missing values generated)
      
      .
      . foreach j of varlist B-O{
        2.         foreach i in `j'{
        3.                 replace total1 = `i' if `i' != .
        4.                 replace aged = `i' if `i' != .
        5.                 replace blind_disab = `i' `i' != .
        6.         }
        7. }
      (0 real changes made)
      (0 real changes made)
      invalid 'B'
      r(198);

      Comment


      • #4
        There is a missing -if- in line 5 of your loop:

        Code:
        foreach j of varlist B-O{
            foreach i in `j'{
            replace total1 = `i' if `i' != .
            replace aged = `i' if `i' != .
            replace blind_disab = `i' if `i' != .
            }
            }
        You can simplify your code by eliminating the interior loop:
        Code:
        foreach i of varlist B-O{
            replace total1 = `i' if `i' != .
            replace aged = `i' if `i' != .
            replace blind_disab = `i' if `i' != .
            }
        But neither is not going to get you what you want, you'll just end up with all three new variables containing the same values. If you can give some rules for how you select the values going into your new variables, perhaps you can get more feedback.
        Stata/MP 14.1 (64-bit x86-64)
        Revision 19 May 2016
        Win 8.1

        Comment


        • #5
          Thanks again, Carole!

          What I want to do is, grab the first observation from row 1 that is non-missing and use that value as the first observation in var total1. Then, I want to grab second non-missing observation from the same row and put that as the first observation in var aged. Then, I want to grab the third non-missing observation from the same row and put that as the first observation in blind_disab. Then we move to row 2 and do the same.


          Comment


          • #6
            There is probably a more efficient way to do this, but this seems to work:

            Code:
            drop total1 aged blind_disab
            egen working=concat(B-O), punct(" ")           //concatenate all into one string with spaces between
            gen working2=subinstr(working, ".", " ",.)    //get rid of dots
            replace working2=stritrim(strtrim(working2)) //get rid of consecutive and leading/trailing spaces
            split working2, gen(new) destring            //split the string into variables
            rename new1 total1
            rename new2 aged
            rename new3 blind_disab
            drop working working2
            Stata/MP 14.1 (64-bit x86-64)
            Revision 19 May 2016
            Win 8.1

            Comment


            • #7
              I appreciate it, Carole! Thank you. It works in mine!

              Comment

              Working...
              X