  • Replacing variable whose name contains number larger than value of another variable

    Hello community,

    I have the following set-up:

    An indicator (test_1) shows whether some variables (ent_elect_new_1_1-ent_elect_new_1_4) have a "problem" in a given observation. The "problem" is that the number of non-missing values in these variables (ent_elect_count_1) is larger than the value of another variable (enterp_nr_new_1). I would like to set to missing the variables that cause the problem. For example, if test_1==1 because enterp_nr_new_1==2 and ent_elect_count_1==3, I would like ent_elect_new_1_3 to be replaced by "". Note that the variables ent_elect_new_1_1 to ent_elect_new_1_4 are always filled from "left to right", meaning that if ent_elect_new_1_2 is missing, ent_elect_new_1_3 and ent_elect_new_1_4 will be missing, too.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(test_1 ent_elect_count_1 enterp_nr_new_1) str3 ent_elect_new_1_1 str1(ent_elect_new_1_2 ent_elect_new_1_3) str3 ent_elect_new_1_4
    1 4 3 5  5  5 -88
    0 0 . ""  ""  "" ""
    0 0 . ""  ""  "" ""
    0 0 . ""  ""  "" ""
    0 0 . ""  ""  "" ""
    0 0 . ""  ""  "" ""
    0 0 . ""  ""  "" ""
    0 1 1 "0" ""  "" ""
    end
    label values enterp_nr_new_1 enterp_nr_1
    The problem is a bit larger because I have it not just for 1 set-up but for 20, resulting in variables test_1 to test_20, and the number of ent_elect_new_i_j variables (the range of j) differs with i (always below 11). My approach used some if conditions and forvalues and while loops, but it does not work (it does not change the dataset).

    Code:
    forvalues i=1(1)20{
        if test_`i'==1{
            forvalues j=1(1)11{
                if enterp_nr_new_`i'==' `j'{
                    local n=`j'+1
                    while `n' < 12{
                        capture confirm variable ent_elect_new_`i'_`n'
                        if !_rc {
                            replace ent_elect_new_`i'_`n'=""
                            local n= `n'+1
                        }
                    }
                }
            }
        }
    }
    Looking forward to some help.

    Kind regards
    Nina


  • #2
    I'm not 100% sure I understand what you want. But try this:
    Code:
    gen long obs_no = _n
    reshape long test_ ent_elect_new_, i(obs_no) j(i_j) string
    split i_j, parse("_") gen(n) destring
    rename n1 i
    rename n2 j
    
    by obs_no i (j), sort: replace test_ = test_[_N]
    by obs_no i (j): replace ent_elect_new_ = "" if j > test_
    
    drop i j
    reshape wide
    This code does not require any particular limit on the values of i or j: it uses whatever it finds. If this is not what you wanted, please post back with a clearer explanation of what you need, and also indicate in what way the results this produces differ from what you want.

    General point: this is one of the many things that is very difficult to do in Stata with wide data (often, as here, leading to convoluted nested loop structures that are hard to write or understand), but is quite simple with long data. In fact, there are really very few things in Stata that work best with wide data. So unless you know that you will be doing some of those, I recommend that you skip the last two commands above and just keep your data long. In all probability it will simplify your work enormously.



    • #3
      One fairly general point in addition to Clyde Schechter's very general and highly applicable advice:

      I see code like this

      Code:
       
       if test_`i'==1
      followed by a command. Such code almost never does what you want, as Stata looks in the first observation only. See e.g. https://www.stata.com/support/faqs/p...-if-qualifier/ Clyde and I are working on a piece around that issue, but I doubt it will appear before June 2023, and you will want an answer today, so please look at that FAQ if tempted to use the if command in that way. The point is probably not material, as you need a long layout to do best what you want to do.
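
      To make the difference concrete, here is a minimal sketch on toy data. The values are made up for illustration; only the variable names follow the example in #1, and the loop bound of 3 is an assumption that fits this toy data set, not the real one.

      ```stata
      * Toy data (hypothetical values; variable names as in #1)
      clear
      input float(test_1 enterp_nr_new_1) str3(ent_elect_new_1_1 ent_elect_new_1_2 ent_elect_new_1_3)
      0 . ""  ""  ""
      1 2 "5" "5" "5"
      end

      * -if- command: test_1 is evaluated in the first observation only.
      * It is 0 there, so the block is skipped and observation 2 stays unchanged.
      if test_1 == 1 {
          replace ent_elect_new_1_3 = ""
      }

      * -if- qualifier: the condition is checked in every observation.
      * Blank out each variable whose index j exceeds enterp_nr_new_1.
      * A missing enterp_nr_new_1 never triggers a change, because
      * numeric missing is larger than any number in Stata.
      forvalues j = 1/3 {
          replace ent_elect_new_1_`j' = "" if test_1 == 1 & `j' > enterp_nr_new_1
      }

      list
      ```

      Only the second form changes observation 2, and only ent_elect_new_1_3 is blanked there. The same qualifier works unchanged on a long layout, where j would be a variable rather than a loop index.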
