  • Repeating same code x number of times:

    I would like to impute missing values repeatedly, first with the past value, then with the future value:
    HTML Code:
        foreach v of varlist drinkl socwk higov {
            // need code to repeat this 4 times in each direction
            bysort ID: replace `v' =`v'[_n-1] if `v'>=.
            bysort ID: replace `v' =`v'[_n-1] if `v'>=.
            bysort ID: replace `v' =`v'[_n-1] if `v'>=.
            bysort ID: replace `v' =`v'[_n-1] if `v'>=.
            bysort ID: replace `v' =`v'[_n+1] if `v'>=.
            bysort ID: replace `v' =`v'[_n+1] if `v'>=.
            bysort ID: replace `v' =`v'[_n+1] if `v'>=.
            bysort ID: replace `v' =`v'[_n+1] if `v'>=.
        }
    However, this code is long because there are 4 survey waves. How is it possible to keep this in 2 lines?

  • #2
    Castor:
    1) are you sure that the last observed carried forward and next observed carried backward are actually the right approaches?
    2) I'm under the (non-tested) impression that your code rewrites all over again every time it loops over the same set of variables.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Carlo,
      1) Yes. These are control variables in my dataset, and it would be a pity to lose observations in the regression due to some missing values in either (not too important) control variable.
      2) I have checked summary statistics before and after the operation. The means are similar; only the number of non-missing observations increases.



      • #4
        Castor:
        1) I realize I was not that clear in my previous reply: usually, LOCF and NOCB are not considered the gold standard for dealing with missing values (and in any decent paper you should justify why you did not use multiple imputation);
        2) could you please share an excerpt/example of your dataset via -dataex-, along with what you typed and what Stata gave you back (as per FAQ)? Thanks.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          I note and agree with @Carlo Lazzaro's reservations. Assuming this is a good idea, so that this is really interpolation not imputation, let's back up here.

          First, copying downwards cascades, so doing it once suffices to replace any block of missing values with the previous non-missing value.

          Here's a demonstration. See also https://www.stata.com/support/faqs/d...issing-values/

          Code:
          . clear 
          
          . set obs 7
          Number of observations (_N) was 0, now 7.
          
          . gen id = 1
          
          . gen whatever = cond(inlist(_n, 1, 7), 42, .)
          (5 missing values generated)
          
          . 
          . list, sep(0)
          
               +---------------+
               | id   whatever |
               |---------------|
            1. |  1         42 |
            2. |  1          . |
            3. |  1          . |
            4. |  1          . |
            5. |  1          . |
            6. |  1          . |
            7. |  1         42 |
               +---------------+
          
          . 
          . replace whatever = whatever[_n-1] if missing(whatever)
          (5 real changes made)
          
          . 
          . list, sep(0)
          
               +---------------+
               | id   whatever |
               |---------------|
            1. |  1         42 |
            2. |  1         42 |
            3. |  1         42 |
            4. |  1         42 |
            5. |  1         42 |
            6. |  1         42 |
            7. |  1         42 |
               +---------------+
          .
          Second, the FAQ linked above explains how to copy backwards by temporarily reversing time.

          Third, mipolate from SSC offers various interpolation choices. See https://www.statalist.org/forums/for...-interpolation
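
As a minimal sketch of the first two points, assuming the panel is identified by ID and wave as in #1: one forward pass and one pass with time reversed are enough, however many waves there are. The toy data and the helper variable negwave are made up for illustration; only drinkl, ID and wave come from the thread.

```stata
clear
input str1 ID byte wave float drinkl
"a" 1 .
"a" 2 .
"a" 3 1
"a" 4 .
"b" 1 0
"b" 2 .
"b" 3 .
"b" 4 .
end

* hypothetical helper: negated wave, so that sorting on it reverses time
gen negwave = -wave

foreach v of varlist drinkl {
    * forward: copy the previous non-missing value; this cascades within ID
    bysort ID (wave): replace `v' = `v'[_n-1] if missing(`v')
    * backward: with time reversed, "previous" means the next wave
    bysort ID (negwave): replace `v' = `v'[_n-1] if missing(`v')
}

drop negwave
sort ID wave
list, sepby(ID)
```

Within by:, the subscript [_n-1] refers to the previous observation of the same ID, so the first observation of each group is never filled from another panel. Add further variables to the varlist as needed.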



          • #6
            Thank you for the helpful responses. Again, assuming my approach is correct, this works fine when using the previous non-missing value.
            HTML Code:
                foreach v of varlist drinkl smoken socwk higov {
                    sort ID wave
                    gen `v'copy = `v'
                    bys ID: replace `v' = `v'[_n-1] if missing(`v')
                    bys ID: replace `v' = `v'[_n+1] if missing(`v')
                }
            It does not, however, always work in the opposite direction, as in the last observation:

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input str12 ID byte wave float(drinkl drinklcopy)
            "010104124001" 4 0 0
            "010104124002" 1 0 .
            "010104124002" 2 0 0
            "010104124002" 3 0 0
            "010104124002" 4 0 0
            "010104125001" 1 0 .
            "010104125001" 2 0 0
            "010104125001" 3 0 0
            "010104125001" 4 0 0
            "010104125002" 1 0 .
            "010104125002" 2 0 0
            "010104125002" 3 1 1
            "010104125002" 4 0 0
            "010104126001" 1 0 .
            "010104126001" 2 0 0
            "010104126001" 3 0 .
            "010104126001" 4 0 0
            "010104126002" 1 . .
            "010104126002" 2 . .
            "010104126002" 3 0 .
            "010104126002" 4 0 0
            end
            label values wave wavel
            label def wavel 1 "Wave 1", modify
            label def wavel 2 "Wave 2", modify
            label def wavel 3 "Wave 3", modify
            label def wavel 4 "Wave 4", modify



            • #7
              Please read #5 again. Replacing works in different ways going forwards and backwards. This is explained in the FAQ cited there (first link).
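
To make the asymmetry concrete, here is a small sketch on made-up data. The variable names t, naive and reversed are hypothetical. Stata carries out replace from the first observation to the last, so with [_n+1] the next value has not yet been filled when it is needed, and only a missing value immediately before a non-missing one gets replaced.

```stata
clear
set obs 7
gen t = _n
gen whatever = cond(_n == 7, 42, .)

* naive backward copy: whatever[_n+1] is still missing when
* observations 1 to 5 are processed, so only observation 6 is filled
gen naive = whatever
replace naive = naive[_n+1] if missing(naive)
count if missing(naive)

* reverse the sort order, then copy downwards so that the fill cascades
gen reversed = whatever
gsort -t
replace reversed = reversed[_n-1] if missing(reversed)
sort t
count if missing(reversed)

list, sep(0)
```

The first count should report 5 remaining missing values and the second none, which matches the pattern in #6, where wave 2 stays missing because wave 3 was itself still missing at the moment it was consulted.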
