
  • Question about nested foreach/forvalues statement

    Hello,

    I'm writing with what I'm sure is a naive question that I've struggled with for years in Stata. I've either not found or had a hard time understanding existing help documentation for this specific question, and if someone is able to simply help me with this code, I would be super grateful. I am using Stata BE 17.

    I would like to create 24 new variables, one for each month of a two-year period (January 2022 through December 2023). Ideally, these variables would be named date1 - date24 and labeled Jan2022 through Dec2023.

    I can create 24 new date variables fairly easily as follows:
    Code:
    forvalues i = 1(1)24 {
    gen date`i' = .
    }
    However, when it comes to adding labels, I have a hard time understanding whether it's possible to write another loop for the labels. In the past, because I haven't been able to figure it out, I've simply coded them by hand (which, depending on the number of variables, can take a long time):
    Code:
    label var date1 "Jan2022"
    label var date2 "Feb2022"
    label var date3 "Mar2022"
    etc.

    However, if I write something like:
    Code:
    foreach x in jan feb mar apr may jun jul aug sep oct nov dec {
    forvalues i = 1(1)12 {
    label var date`i' "`x'2022"
    }
    }
    then the program overwrites labels that have already been defined: for every month in the outer loop, the inner loop relabels all twelve variables. Therefore, I end up with a list of dates that all have the same label (the last label in the list).

    Is there a way to assign labels in a 1:1 way, *without* needing to hard-code each label by hand? (I understand this may be obvious to others who are better versed in data management in Stata. This problem has been bugging me for years and I haven't been able to sort it out on my own, so I'm hoping someone can bestow their wisdom on me.)

    (edited for clarity and to amend an error)
    Last edited by Maria Sundaram; 12 Jul 2023, 10:55.

  • #2
    Code:
    forvalues i = 1(1)24 {
        gen date`i' = .
        local lbl: display %tmMonCCYY =tm(2021m12) + `i'
        label var date`i' "`lbl'"
    }
    No need to be apologetic about this question. First of all, we were all beginners once. Nobody, not even Bill Gould, was born proficient in Stata. And beginners, with beginner-level questions, are welcome here. But beyond that, the solution to your problem (or at least the one I've come up with) relies on some relatively little-used features of Stata that many users never encounter. So I would judge that only a reasonably advanced Stata programmer would come up with this. (And I don't think there is any materially easier way to solve this problem.)
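
    To unpack the two less familiar pieces in the loop above: tm() returns a monthly date as a count of months since 1960m1, so adding `i' steps forward one month at a time, and the extended macro function ": display" stores the formatted text in a local. A minimal sketch for checking this in a do-file (the local names first and last are made up for illustration):

    ```stata
    * tm() counts months from 1960m1 (month 0), so tm(2021m12) is 743
    display tm(2021m12)

    * Adding 1 moves to January 2022; the %tm format turns the number into text
    display %tmMonCCYY (tm(2021m12) + 1)

    * ": display" captures the formatted text in a local macro
    local first : display %tmMonCCYY (tm(2021m12) + 1)
    local last : display %tmMonCCYY (tm(2021m12) + 24)
    display "`first' to `last'"
    ```

    Starting from 2021m12 rather than 2022m1 is what lets the loop index run from 1 to 24 without an offset.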


    • #3
      The problem can be tackled in various ways. Here is one.

      Code:
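      * c(Mons) holds "Jan Feb ... Dec"; tokenize puts each word in `1', `2', ...
      tokenize "`c(Mons)'"

      forval j = 1/12 {
          local mon = lower("``j''")
          * months 1-12 belong to 2022
          gen date`j' = .
          label var date`j' "`mon'2022"
          * months 13-24 belong to 2023
          local J = `j' + 12
          gen date`J' = .
          label var date`J' "`mon'2023"
      }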
      That said, it's not obvious why such variables could be useful. Data for different dates is usually best stored in different observations, not different variables.
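
      The layout point can be sketched as follows, assuming an empty dataset is acceptable for the illustration (clear discards whatever is in memory). The variable name mdate is hypothetical. Each month becomes one observation of a single monthly date variable rather than a variable of its own:

      ```stata
      clear
      set obs 24

      * One observation per month, starting at January 2022
      gen mdate = tm(2022m1) + _n - 1

      * Display the numeric month as Jan2022, Feb2022, ...
      format mdate %tmMonCCYY
      list mdate in 1/3
      ```

      With this layout, the month is data that can be sorted, filtered, or passed to tsset, instead of being encoded in variable labels.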


      • #4
        Thank you both very much! This is super helpful.

        Nick, your point/question about why I am doing this in the first place is an excellent one. It's a ham-fisted attempt to turn a longitudinal dataset, in which the same measurements have different variable names at different points in time, into something a little more manageable. I couldn't (I think?) reshape from wide to long for the analysis I wanted to do without creating a set of variables that all had the same stub and a relevant suffix.

        That being said, I figured this example would be the easiest to explain out of several different scenarios where I've needed help on this point. Not all of the other examples have been date- or time-related, and not all have involved labels. If I understand correctly, both of your coding suggestions could be applied to a more general scenario where I want to take two lists and pair them up at specific intervals. Clyde, the "+ `i'" step was, I think, the biggest piece I was missing. I will also be sure to read up on 'tokenize'.
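
        For the record, a minimal sketch of that general case, assuming two lists of equal length whose items contain no spaces. The list contents and the local names stubs, labels, v, and lab are invented for illustration. A single loop index picks the k-th word from each list, so nothing is overwritten the way it is with nested loops:

        ```stata
        clear

        * Two hypothetical parallel lists, paired item by item
        local stubs height weight bmi
        local labels Height_cm Weight_kg BMI

        local n : word count `stubs'
        forvalues k = 1/`n' {
            * Take the k-th word of each list
            local v : word `k' of `stubs'
            local lab : word `k' of `labels'
            gen `v' = .
            label var `v' "`lab'"
        }
        describe
        ```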

        Thanks to you both again!
        Last edited by Maria Sundaram; 13 Jul 2023, 11:03.
