  • Running parallel list loops for large lists of numerical variables

    Dear Forum,

    I have been closely following the Statalist discussion on how best to run two parallel loops in Stata. The code shown below seems like the way to go if I want a loop that does not run over all 9 possible combinations of two lists of 3 variables each, but only over the corresponding pairs: A-1, B-2, C-3.

    Code:
    local Returns 10 20 30
    local Market_Cap 1 2 3
    local n : word count `marketcaps'
    
    forvalues i = 1/`n' {
        local a : word `i' of `Returns'
        local b : word `i' of `Market_Cap'
        gen TEST_`a'_`b' = `a' + `b'
    }
    Question 1: Is this the correct code for running a parallel loop?
    Question 2: I actually have many more variables that I want to put in the locals Returns and Market_Cap, i.e. more than 1000 each. I don't want to list them all by hand, and a range such as 1 - 1000, which should refer to all variables from 1 to 1000, does not work in local macros. Is there an efficient way to refer to such a long list of variables using local macros, or do I have to alter the parallel loop code above?

    Thanks a lot for any help!!




    Last edited by Philip HU; 16 Nov 2023, 07:59.

  • #2
    I am all in favour of simplified examples that focus on the immediate question, but unfortunately simplification can go too far.

    Here you are creating three new variables that each equal the sum of two supplied constants. Is your real problem really similar but just larger? Telling us more about the real problem would help.

    You seem to be saying that it is about variables.

    Yet again, if you really have about 1000 pairs of variables, and want to create about 1000 more, it seems likely that you need quite a different data layout.

    This could be a terminology clash. In Stata a variable is in other terms a column or field in the dataset. It's not a general word for a constant held in a macro or scalar (or anywhere else). This clash is tough for people used to programming in many languages outside Stata in which something like

    Code:
    foo = 42
    creates a variable, but on Statalist I suggest that Stata terminology is surely the default to follow.

    Note that the code refers to a local macro marketcaps which isn't defined in what you show. If it is visibly defined correctly before the code in question, then the code will run.
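
    To illustrate that last point, here is a minimal sketch, assuming the count was meant to come from Market_Cap. The toy dataset (clear, set obs 5) is an assumption added only so that the block runs on its own. An undefined local macro expands to an empty string, so counting the words of `marketcaps' would give 0 and the loop body would never run.

    ```stata
    * Sketch only: toy dataset so that the block is self-contained
    clear
    set obs 5

    local Returns 10 20 30
    local Market_Cap 1 2 3
    * Count the words of a macro that is actually defined above
    local n : word count `Market_Cap'

    * Walk both lists in step: pair 1 with 1, 2 with 2, 3 with 3
    forvalues i = 1/`n' {
        local a : word `i' of `Returns'
        local b : word `i' of `Market_Cap'
        gen TEST_`a'_`b' = `a' + `b'
    }
    ```

    With these constants the loop creates TEST_10_1, TEST_20_2 and TEST_30_3, one per corresponding pair rather than one per combination.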



    • #3
      Hi,

      sorry for being overly simplistic on this one.

      But yes, indeed: it is about performing calculations with two large groups of variables (roughly 1000 variables in each group) and then storing the results in new variables. I am interested in using corresponding pairs (i.e. not all combinations, but the first pair, the second pair, the third pair, and so on).

      So does that mean that I should not use local macros? What would alternative code look like?

      Thanks a lot!



      • #4
        That's more information -- thanks -- but not enough information for me to be able to add helpfully to what I've said.



        • #5
          Oh sorry, I see why my first post might have been confusing. Hopefully this is clearer: A, B, C and AA, BB, CC are ordinary numeric variables.

          Code:
          local Returns A B C
          local Market_Cap AA BB CC
          local n : word count `Market_Cap'

          forvalues i = 1/`n' {
              local a : word `i' of `Returns'
              local b : word `i' of `Market_Cap'
              gen TEST_`a'_`b' = `a' + `b'
          }
          So what I basically want to know is how to add a large number of variables to my two local macros "Returns" and "Market_Cap", since I do not want to add them by hand. ChatGPT proposed building another loop to add variables to my local macros, but that seems overly complicated to me.

          Thanks a lot!!



          • #6
            OK, but I can't answer your question about adding more variable names to those local macros, beyond the answer you don't want, namely that you could just type them in.

            The answer lies in some criterion or criteria for selecting what goes in which list, on which we have no information that I can see.
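
            As a sketch of what such a criterion could look like: suppose, purely as an assumption, that the variables in each group share a name prefix, here the hypothetical ret1, ret2, ... and mcap1, mcap2, .... Then unab can expand a wildcard into a local macro without any typing. The names and the toy data are invented for illustration and are not taken from the thread.

            ```stata
            * Sketch only: toy data with hypothetical names ret1-ret3 and mcap1-mcap3
            clear
            set obs 5
            forvalues j = 1/3 {
                gen ret`j' = runiform()
                gen mcap`j' = runiform()
            }

            * unab expands a varlist (wildcards, ranges) and stores it in a local macro
            unab Returns : ret*
            unab Market_Cap : mcap*

            * Both lists must have the same length for pairing to make sense
            local n : word count `Returns'
            local m : word count `Market_Cap'
            assert `n' == `m'

            forvalues i = 1/`n' {
                local a : word `i' of `Returns'
                local b : word `i' of `Market_Cap'
                gen TEST_`a'_`b' = `a' + `b'
            }
            ```

            The pairing follows the order in which the variables appear in the dataset, so this only matches the right pairs if both groups are stored in the same order.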

            Or as said earlier you may need a different layout. For example, suppose you have observations by years and a thousand or so panels in different variables. Then every time you calculate something new, you're likely going to get another thousand variables. That's not sustainable. The answer would be a different layout in which the result is just another variable, or a few. The answer is usually reshape long.
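
            A minimal sketch of that different layout, assuming a wide dataset with one observation per year and hypothetical variables ret1-ret3 and mcap1-mcap3 (names, identifiers and data are invented for illustration):

            ```stata
            * Sketch only: toy wide data, one row per year, one column per panel
            clear
            set obs 4
            gen year = 2019 + _n
            forvalues j = 1/3 {
                gen ret`j' = runiform()
                gen mcap`j' = runiform()
            }

            * Long layout: one row per year and panel; the numeric suffix goes into panel
            reshape long ret mcap, i(year) j(panel)

            * One new variable now covers every pair, with no loop and no macros
            gen test = ret + mcap
            ```

            After the reshape each later calculation adds a single variable instead of one per panel, which is why the long layout scales to a thousand or so panels.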

            Last edited by Nick Cox; 16 Nov 2023, 10:00.

