  • How to split a global string variable into smaller string variables

    Hi,

    I have defined a global variable as below:

    Code:
    global querynames_post "world_covid_3-13 world_covid_3-20 world_covid_20200327 world_covid_20200430 world_covid_20200922 world_covid_20210319 world_covid_20210611 world_covid_20220219  world_covid_20220511b"
    I also have a variable defined as 'query' in the dataset and I want to do the below:

    Code:
    replace datetime = datetime + (diff_real_rep * 60 * 60 * 1000) if inlist(query, "world_covid_3-20", "world_covid_20200327", "world_covid_20200430", "world_covid_20200922")
    But instead of explicitly writing out the criteria in the inlist, I want to use the global querynames_post in the inlist, i.e. split the global into strings separated by commas. How do I do that?
    Last edited by Shreya Dutt; 15 Jul 2022, 12:38. Reason: global

  • #2
    Code:
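    local querynames_post "world_covid_3-13 world_covid_3-20 world_covid_20200327 world_covid_20200430 world_covid_20200922 world_covid_20210319 world_covid_20210611 world_covid_20220219  world_covid_20220511b"
    
    * Collapse runs of blanks so that exactly one space separates the names
    local querynames_post: list retokenize querynames_post
    * Replace each separating space with quote-comma-space-quote, so that every name
    * becomes its own quoted string once the outer quotes are added in inlist() below
    local querynames_post: subinstr local querynames_post " " `"", ""', all
    
    replace datetime = datetime + diff_real_rep*60*60*1000 if inlist(query, "`querynames_post'")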
    Note: I have done this all with a local, rather than a global macro. Changing everything to global, and replacing `' macro references with $ will lead to the same results. I switched over to locals just because global macros are an inherently unsafe programming technique that should be used only as a last resort when nothing else will serve the purpose. (In 27 years of using Stata almost daily, I have only encountered one situation in which I had to use a global macro.) There is no reason to use a global macro just to hold a list like this.

    The -local ...: list retokenize- command is needed because your original list has multiple consecutive blanks at one point, and those would lead to multiple consecutive commas in the result, which would cause a syntax error. -retokenize- removes the surplus blanks.
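
    If the list has to stay in the global from the original question, one possible variant is to copy it into a local and build the inlist() arguments in a loop. This is only a sketch under stated assumptions: the macro names names and args are made up for illustration, the global and the variables query, datetime and diff_real_rep are assumed to exist as in post #1, and the loop is a different technique from the -subinstr- approach above. Because -foreach ... of local- splits on whitespace, the doubled blank in the list does no harm here.

    ```stata
    * Assumes the global querynames_post from post #1 is already defined.
    * "names" and "args" are hypothetical macro names used only in this sketch.
    local names $querynames_post

    * Build the tail of the argument list:  , "name1", "name2", ...
    local args
    foreach q of local names {
        local args `"`args', "`q'""'
    }

    * query`args' expands to:  query, "name1", "name2", ...
    replace datetime = datetime + diff_real_rep*60*60*1000 if inlist(query`args')
    ```

    As far as I know, inlist() accepts at most 10 arguments when they are strings, so with query plus nine names this list sits exactly at the limit; a longer list would need to be split across several inlist() calls joined with |.
    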


    • #3
      global macros are an inherently unsafe programming technique
      I just want to briefly expand upon this excellent point. Global variables are unsafe because it is difficult to predict the value of a global variable at any given point at runtime, particularly if you write to it in a number of different places. In Stata specifically, let's say you want to jump around in your code and run different lines in different places. You might worry that a local variable will disappear, and conclude that you should use a global instead. That way you can run a line anywhere in your do file that uses the global and Stata will remember its value. If the value of the global is constant everywhere in your code, this is probably safe and convenient. But if you write to the global variable at different places in your code, then as you jump around running different lines, you increase the probability that the global will not contain the information you think it does when you think it should. Global variables don't scale well when your code becomes complex, and even in situations where the global is constant and probably safe to use, it is still best to practice using locals wherever possible. The key programming concept here is called "scope" if you want to google around about this.
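
      To make the scope point concrete, here is a small sketch. The program name overwrite_demo and the macro names demo_g and demo_l are invented for illustration, and the whole block is assumed to be run as one do file in a single pass rather than line by line.

      ```stata
      * Hypothetical names throughout; run the whole block in one go.
      capture program drop overwrite_demo
      program define overwrite_demo
          * The global is shared with the caller; the local is private to this program
          global demo_g "set inside the program"
          local demo_l "set inside the program"
      end

      global demo_g "set by the caller"
      local demo_l "set by the caller"

      overwrite_demo

      display "$demo_g"      // set inside the program
      display "`demo_l'"     // set by the caller

      * Remove the illustrative global again
      macro drop demo_g
      ```

      The caller's local survives the call untouched, while the global has been silently changed by code the caller may not even have read.
      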

      Many programming languages make it intentionally unintuitive to globalize data, but statistical software packages are an exception by necessity. Stata does have a global environment: this is where the table containing your data is stored, and it is convenient that you can access the global dataset directly from any do file, from inside a command, or from the console. But Stata also recognizes that things can go wrong with a global environment, so, for example, it throws an error when you try to -generate- a variable that already exists. Stata understands that globalized data is dangerous, so it tries to prevent you from overwriting your data by mistake. Some other statistical languages will absolutely let you mess this up (e.g. R). Caveat emptor.
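
      A quick way to see that safeguard, sketched here with the auto dataset that ships with Stata (the choice of dataset and variable is mine, purely for illustration):

      ```stata
      sysuse auto, clear

      * price already exists, so -generate- refuses to overwrite it;
      * -capture noisily- shows the error message but lets the do file continue
      capture noisily generate price = 0
      display _rc            // 110, the "already defined" return code

      * Overwriting existing values has to be requested explicitly with -replace-
      replace price = 0 in 1
      ```
      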
      Last edited by Daniel Schaefer; 15 Jul 2022, 13:47.
