  • If any variable name occurs in global list of strings, run a subset of code

    Dear all,

    I would like to reformat the dates in my data set to a Stata-recognized format. Some of these date variables allow for partial dates, which will need to stay in string format, so I only want to reformat variables that take a full date. Fortunately, I know from my database's metadata which variables take a full date. I have saved these variable names in a global in the meantime. So far so good.

    Code:
    levelsof varname_dates, local(varname_dates) clean
    global varname_dates `varname_dates'
    Now I want to loop through the variable names (not the observations) in my dataset and, if a variable name occurs in my `varname_dates' global, reformat that variable to a Stata-recognized date. I found another post that relates to my question (https://www.statalist.org/forums/forum/general-stata-discussion/general/1466177-if-variable-is-in-varlist), but I can't quite get that code to work in combination with generating a new variable. See below the code I have written and the Stata feedback.

    Code:
        foreach k of varlist _all {
            if `:list `k' in `varname_dates'' gen `k'_new = date(`k', "DMY")
            order `k'_new, after(`k')
            format `k'_new %td
            drop `k'
            rename `k'_new `k'
            }

    invalid syntax
    gen not found


    Would anyone be kind enough to explain what I am overlooking here? Or perhaps offer an alternative solution? I'm not an experienced guest on this forum yet, so if there is any way I can make my post more understandable, please do let me know.

    Thank you and best wishes,

    Moniek
    Last edited by Moniek Bresser; 17 Nov 2020, 08:32.

  • #2
    I think you'll need to offer us more detail about the background of your problem, as from your description, the solution might be simple and quite different from what you show, i.e., something like:

    Code:
    foreach d of global varname_dates {
       format `d' %td  // or whatever format you want
    }
    However, this suggestion might be irrelevant or wrong in the context of your problem.

    In particular, I find it confusing that your code seems to indicate that the names of the date variables of interest are stored in a *variable,* which would be unusual. So, besides explaining what your difficulties are, you'd do well to explain how your list of "full" date variables is stored, and to show us an example of your dataset using -dataex-, as described in the Statalist FAQ.

    Among other things, if you have a list of the relevant variables, as you seem to indicate, your reason for wanting to loop over the entire set of variables is unclear to me.

    Finally, on a smaller note, the global macro as you use it doesn't do anything helpful for you, and globals are in general something to be avoided. In your example code, you already have the names stored in a local, so there's almost certainly no reason to put them into a global.





    • #3
      I think you just need some {}s :

      Code:
         foreach k of varlist _all {
             if `:list `k' in `varname_dates'' {
                 gen `k'_new = date(`k', "DMY")
                 order `k'_new, after(`k')
                 format `k'_new %td
                 drop `k'
                 rename `k'_new `k'
             }
         }
      hth,
      Jeph



      • #4
        Dear Mike,

        I'll try to give a bit more background. I have two files in my possession. One is my main dataset with participant data, see the first table as an example. I have variables that allow a partial date (here trtstdat / treatment start date as an example) and dates where a full date is mandatory to be recorded (visdat / visit date). These full dates do not come out in a Stata recognized format.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str10 subjectid strL(visyn visdat) str3 mhcheckyn str7 trtstdat
        "001" "Yes" "23/03/2020" "yes" "2016"  
        "002" "Yes" "14/07/2020" "yes" "05/2019"
        end

        The second file I have in my possession is a list of the variables (not the observations of these variables) that take a full date (see the example below).

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str7 varname_dates
        "visdat"
        "aestdat"
        "aeendat"
        "dthdat"
        end
        The unique values of varname_dates I saved in a global (but you are right, this could just be a local). And now I want to check in table 1, if any variable names occur in the varname_dates global. In this example, the variable name visdat occurs in this global and therefore it is the visdat variable that I want to reformat to a Stata recognized date with the code described in my first post.

        Any thoughts?
        Last edited by Moniek Bresser; 17 Nov 2020, 09:57.



        • #5
          Thanks for the clear explanation. Jeph may well already have done what you want, but here's what I would do:
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str7 varname_dates
          "visdat"
          "aestdat"
          "aeendat"
          "dthdat"
          end
          //
          // Store the to-be-processed variable names in a local.
          levelsof varname_dates, local(dlist) clean
          //
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str10 subjectid strL(visyn visdat) str3 mhcheckyn str7 trtstdat
          "001" "Yes" "23/03/2020" "yes" "2016"  
          "002" "Yes" "14/07/2020" "yes" "05/2019"
          end
          //
          // Create numeric date variable for each name in the list and format it.
          foreach d of local dlist {
             // Some of your varnames don't exist in this dataset, which is why I used -capture-.
             // This might be unnecessary in your actual data, but not harmful.
             cap confirm variable `d'
             if (_rc == 0) {
               gen double num_`d' = date(`d', "DMY")
               format num_`d' %td
             }
             else {
                di "Variable <`d'> is not present here."
             }
          }
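As a possible alternative to the -capture confirm- check above, the list of names could first be intersected with the variables that actually exist, using the macro list operator `&`. This is only a sketch: the local names `allvars` and `todo` are made up for illustration, and the list of date names is typed in directly rather than read with -levelsof-.

```stata
* Sketch only: same example data as in this thread, entered with plain str types.
clear
input str10 subjectid str3 visyn str10 visdat str3 mhcheckyn str7 trtstdat
"001" "Yes" "23/03/2020" "yes" "2016"
"002" "Yes" "14/07/2020" "yes" "05/2019"
end

* Names of the full-date variables (typed in here; -levelsof- would give the same list).
local dlist aeendat aestdat dthdat visdat

* Expand _all into the names of the variables present in this dataset.
unab allvars : _all

* Keep only the names that occur in both lists (hypothetical local name: todo).
local todo : list dlist & allvars

* Create and format a numeric date for each surviving name.
foreach d of local todo {
    gen double num_`d' = date(`d', "DMY")
    format num_`d' %td
}
```

With this approach no -else- branch is needed, because names that are absent from the dataset never enter the loop.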



          • #6
            Dear Jeph,

            Thank you, this was exactly what I was trying to achieve! I still had to remove the additional quotes I had around varname_dates to get it working.

            For anyone who needs a similar solution, please see the final solution:

            Code:
            foreach k of varlist _all {
                if `: list k in varname_dates' {
                    gen `k'_new = date(`k', "DMY")
                    order `k'_new, after(`k')
                    format `k'_new %td
                    drop `k'
                    rename `k'_new `k'
                }
            }
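For readers who want to try this, here is a self-contained sketch of the same loop run on the example data from #4. It assumes the names are held in a local called varname_dates, as created by -levelsof- in the first post; the macro list operator -in- takes macro names rather than their expanded contents, which is why the extra quotes had to go. For a global, the documented form would be global(varname_dates).

```stata
* Sketch only: example data from this thread, entered with plain str types.
clear
input str10 subjectid str3 visyn str10 visdat str3 mhcheckyn str7 trtstdat
"001" "Yes" "23/03/2020" "yes" "2016"
"002" "Yes" "14/07/2020" "yes" "05/2019"
end

* Names of the variables that take a full date, held in a local.
local varname_dates visdat aestdat aeendat dthdat

foreach k of varlist _all {
    * -in- expects the NAMES of the macros (k, varname_dates), not their contents,
    * so neither name is wrapped in macro quotes inside the list expression.
    if `: list k in varname_dates' {
        gen double `k'_new = date(`k', "DMY")
        order `k'_new, after(`k')
        format `k'_new %td
        drop `k'
        rename `k'_new `k'
    }
}

* visdat is now a numeric %td date; trtstdat stays a string.
describe
```

The sketch uses gen double, as in #5, rather than the default float; daily dates fit in either, so this is a precaution rather than a requirement.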
            Thank you both for being so helpful and have a great Friday,

            Best wishes,

            Moniek

