
  • Foreach loop across a list of strings: import files, run model and save new datasets

    Hi all,

    I am struggling to loop over a large list of strings (not variables) in Stata 17 (macOS).
    I want to import a list of files, run a meintreg model to adjust for batch effects, generate a variable with the model predictions, and store the datasets in separate files.

    My code looks like this, but it already fails on the first element of the list with the error "invalid 'abc'".

    Code:
    foreach X in "abc" "deg" "hij" "klm" {
        use "data_"`X'"", clear
        merge 1:1 pid  using "file_2", update replace
        gen log`X'_low = log10(`X'_lower)
        gen log`X'_upp = log10(`X'_upper)
        * Model to estimate batch effects
        meintreg log`X'_low log`X'_upp i.experiment suppressYears || pid:
        * predictnl calls add to the data set the model results needed to adjust for batch effects
        predictnl adj1 = 0.2*(_b[2.experiment]+_b[3.experiment]+_b[4.experiment]+_b[5.experiment])
        predictnl adj2 = 0.2*(_b[2.experiment]+_b[3.experiment]+_b[4.experiment]+_b[5.experiment]) - _b[2.experiment]
        predictnl adj3 = 0.2*(_b[2.experiment]+_b[3.experiment]+_b[4.experiment]+_b[5.experiment]) - _b[3.experiment]
        predictnl adj4 = 0.2*(_b[2.experiment]+_b[3.experiment]+_b[4.experiment]+_b[5.experiment]) - _b[4.experiment]
        predictnl adj5 = 0.2*(_b[2.experiment]+_b[3.experiment]+_b[4.experiment]+_b[5.experiment]) - _b[5.experiment]
        * generate logged values, with imputation for OOR values
        gen log10`X' = log10(`X'_upper)
        replace log10`X' = log10(`X'_upper/2) if `X'_lower==.
        replace log10`X' = log10(`X'_lower*2) if `X'_upper==.
        * generate values adjusted to average batch effect
        gen log10`X'adj = log10`X' + adj1 if experiment==1
        replace log10`X'adj = log10`X' + adj2 if experiment==2
        replace log10`X'adj = log10`X' + adj3 if experiment==3
        replace log10`X'adj = log10`X' + adj4 if experiment==4
        replace log10`X'adj = log10`X' + adj5 if experiment==5
        drop if log10`X'adj==.
        save "`X'_analysis.dta", replace
    }
    Any help will be appreciated!

  • #2
    The problem is in the first command inside the loop: -use "data_"`X'"", clear-, which is a syntax error. You are using quotes in a way that Stata cannot parse as you intend. I assume that what you want, say, in the first iteration, is to -use data_abc, clear-. But that isn't what that command says to Stata. After the macro is expanded, Stata sees:
    Code:
    use "data_"abc"", clear
    So it thinks that you started to write a -use "data_", clear- command and then stuck abc"" in after it, and the -use- command has no place for that. The correct code would be:
    Code:
    use `"data_`X'"', clear
    More generally, you cannot put one quoted expression inside another when using ordinary quotes, because there is no way to tell whether the second quote encountered closes the first quote or opens an embedded second one. To overcome this limitation of ordinary quotes, Stata also has compound double quotes. See -help quotes##double- for details on using them.
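
    As a small illustration of the difference (a sketch only; the local names fname and msg are made up for this example and are not part of the original code), the following shows how the macro expands under each quoting style. Because the local X holds no quote characters itself, ordinary quotes around the whole file name would also work here; compound quotes become necessary once the text inside contains ordinary double quotes.
    Code:
    * X holds plain text with no quote characters
    local X abc
    * ordinary quotes around the whole name: expands to data_abc
    local fname "data_`X'"
    display "`fname'"
    * compound double quotes: same result, and safe if X ever held quotes
    display `"data_`X'"'
    * compound quotes allow ordinary double quotes inside the string
    local msg `"file "data_`X'" loaded"'
    display `"`msg'"'
    The first two display commands both show data_abc; the last one shows the text with its embedded quotes intact.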
    Last edited by Clyde Schechter; 12 Sep 2022, 12:44.



    • #3
      Furthermore, the way you are using the local X at multiple places in the code suggests to me that your strings do not contain spaces or other complicated things. If so, you can simplify further, and not use double quotes or compound quotes at all.

      Code:
      foreach X in abc deg hij klm {
          use data_`X', clear
      ...
      }
      If on the other hand, the strings do contain spaces or other characters Stata cannot use as part of names, then your code will fail in multiple places, including all the -gen- and -replace- commands.
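
      If you want the loop to guard against both situations, something along these lines might help. This is only a sketch under assumptions: it takes for granted that the files are named data_abc.dta and so on in the current working directory, and the skip-on-missing behavior is my own choice, not something from the original post. -confirm names- stops with an error if an element is not a valid Stata name, and -capture confirm file- lets the loop skip files that are not there.
      Code:
      foreach X in abc deg hij klm {
          * stop with an error if `X' cannot be used in variable names
          confirm names `X'
          * assumed file naming: data_abc.dta etc. in the working directory
          capture confirm file "data_`X'.dta"
          if _rc {
              display as error "data_`X'.dta not found, skipping"
              continue
          }
          use data_`X', clear
          * ... model and adjustment steps from post #1 go here ...
          save `X'_analysis, replace
      }
      Note that a valid name can still be too long once prefixes and suffixes such as log10 and adj are attached, since Stata variable names are limited to 32 characters.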



      • #4
        Thank you very much for your help. It worked!



        • #5
          Originally posted by Hemanshu Kumar
          Good advice. Indeed, it also worked without double quotes. Thanks!
