
  • #16
    Thank you Romalpa Akzo. I note I am working with a very large panel dataset with thousands of variables in each wave, which may be one reason I was originally advised to put the code that keeps the wanted variables inside the loop. Given this, do you still consider it better to move the keep-variables code outside the loop? If so, should I change any other code in #8?

    Your suggested code uses a 'foreach' loop. Do you suggest I change from using a 'forvalues' loop given my situation?
    Last edited by Chris Boulis; 10 Feb 2020, 00:12.



    • #17
      1. Putting the code that keeps variables outside the loop makes the logic a little clearer and might reduce the computation (at least, that is my impression). However, if your data are quite large, i.e. with thousands of variables and millions of observations, you could run out of memory. To be "safer", you could then put the code inside the loop. I acknowledge that concern of yours, but Stata's memory limits are not that small, so I suggest you try both solutions, compare the run times, and pick the one that suits your situation.

      2. The -foreach- loop in my code is not the point; it is just an illustration (I was too lazy to repeat your code). Sorry for being unclear. You should keep your -forvalues- loop for your convenience.

      3. All that said, the core suggestion in my code is to use -isvar-, as suggested by Nick Cox sensei, instead of -macro list- as in your original code. This package lets you use wildcards in the list of wanted variables.
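
      Points 1 and 3 above can be sketched together on a toy dataset. This is only an illustration, assuming -isvar- is installed from SSC; the auto dataset, the variable list, and the name notavariable are placeholders that do not come from this thread. The r(badlist) result is how I understand -isvar- to report names it could not match, so please confirm with -help isvar-. The -timer- commands show one way to compare run times when trying the code inside versus outside the loop.
      Code:
      * Sketch only: requires -isvar- (ssc install isvar)
      sysuse auto, clear

      * Wanted list mixing real names, a wildcard, and a name that does not exist
      local variables make price mpg t* notavariable

      * Time the selection step so the two placements can be compared
      timer clear 1
      timer on 1
      isvar `variables'
      local found   `r(varlist)'
      local missing `r(badlist)'
      keep `found'
      timer off 1

      * Report what was kept, what was not found, and the elapsed time
      display "kept:    `found'"
      display "missing: `missing'"
      timer list 1
      Copying r(varlist) into a local straight away is a cautious choice: later r-class commands overwrite r(), so the list stays available even if other commands run before -keep-.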
      Last edited by Romalpa Akzo; 10 Feb 2020, 04:58.



      • #18
        Thank you Romalpa Akzo. A question regarding the structure of my code. Based on your code example in #15, I keep the local macro 'local variables' (which sits above the loop) and list in it the variables to keep. I then have my loop (#8); from that code, I move the keep 'list' to below the loop and add
        Code:
        isvar `variables'
        keep `r(varlist)'
        However, when I ran this code, Stata gave me the message:
        Code:
        . isvar `variables'
        something required
        r(100);
        I still have other local macros within the loop (such as "local allvars `r(varlist)'" - see #8) - is this an issue when using -isvar-?
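
        One possible cause of this message, which nothing in the thread confirms, is that `variables' was empty at the point where -isvar- ran. A local macro exists only within the do-file run or interactive session that defined it, so running the -local- line and the -isvar- line as separate selections leaves -isvar- with no arguments. The sketch below assumes -isvar- is installed from SSC and uses the auto dataset purely as a placeholder.
        Code:
        * Sketch only: one possible cause, not a confirmed diagnosis
        sysuse auto, clear

        * An empty local stands in for a macro that is out of scope
        local variables
        * With nothing passed, -isvar- is expected to stop with an error
        capture noisily isvar `variables'
        display "return code: " _rc

        * Defining the local in the same run as the call avoids the problem
        local variables make price t*
        isvar `variables'
        keep `r(varlist)'
        describe, short
        If this is the cause, running the whole do-file in one go (or selecting the -local- line together with the -isvar- line) should make the error disappear.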



        • #19
          Try the code below. Note that my focus is only to hint at the right position for -isvar- (inside or outside the loop, which is your choice). The other parts, which I have no information to check, are simply copied from your original code in #8.
          Code:
          local variables xwaveid wave hhpxid age sex educ marstat le* cety* losat*
          
          local filename partners
          clear
          save "`savingdir'/`filename'", replace emptyok
          
          forvalues wave = 1/18 {
              local waveprefix = word(c(alpha), `wave')
          
              use "`origdatadir'/Combined_`waveprefix'180c.dta", clear
              rename `waveprefix'* *
              drop if hhpxid==""
          
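              * To trim inside the loop instead, uncomment these two lines
              * and remove the -isvar-/-keep- pair that follows the loop
              *isvar `variables'
              *keep `r(varlist)'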
          
              generate byte wave = `wave'
              append using "`savingdir'/`filename'"
          
              save "`savingdir'/`filename'", replace
          }
          
          isvar `variables'
          keep `r(varlist)'



          • #20
            Thank you Romalpa Akzo for clarifying the code changes needed to adopt -isvar-. This was very helpful. I note that when I added the code outside the loop it created an issue for some post-loop code (something I can look at another time) so I included
            Code:
            isvar `variables'
            keep `r(varlist)'
            inside the loop and it appears to have worked very well. One key benefit of using -isvar- is being able to use a wildcard * which has notably reduced the length of my variable keep list. Being able to use a wildcard has enabled me to incorporate the quarterly variations of a number of variables much more efficiently. Thank you for helping me solve my question in #4.

            Kind regards, Chris
