  • Looping through variables which are looked-up from a table or matrix?

    Dear all,
    I am trying to conduct a series of logistic regressions, of the form:

    Code:
    glm outcome age var1a var1b if site ==1, family(binom)
    There are a lot of sites so I want to use loops to automate, something like this:

    Code:
    levelsof site
    foreach g in `r(levels)' {
        glm outcome age var1a var1b if site ==`g', family(binom)
    }
    My problem is that I need to include different independent variables in the model depending on which site I am looking at. (The selection for each site was determined in a previous exercise that identified the best-fitting models associated with break-points in a continuous variable, which vary by site.)

    I feel there must be a way for Stata to lookup the necessary code from a table or similar to do this. If the relevant table looked like this:
    site v1 v2
    1 var1a var2a
    2 var1b var2b
    3 var2a var2b
    the final models would be:
    Code:
    glm outcome age var1a var2a if site ==1, family(binom)
    glm outcome age var1b var2b if site ==2, family(binom)
    glm outcome age var2a var2b if site ==3, family(binom)
    I would like to use loops to automate this in the form:

    glm outcome age "column2" "column3" if site =="column1", family(binom)

    I feel there must be a way to do this, storing the whole table as a macro and then drawing the relevant cells into my model.

    I am using Stata 15.0. Thank you very much for any guidance.

    Josh.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double outcome float(age var1a var1b var2a var2b site)
    1 16.5 2 14.4 3 13.4 1
    1 12.5 2 10.9 3  9.9 1
    1 10.5 2  8.6 3  7.6 1
    1 14.5 2 12.7 3 11.7 1
    1 14.5 2 12.5 3 11.5 1
    1 13.5 2 11.4 3 10.4 1
    1 13.5 2 11.2 3 10.2 1
    0 12.5 2 10.1 3  9.1 1
    1 14.5 2 12.7 3 11.7 1
    1 12.5 2 10.1 3  9.1 2
    1 13.5 2 11.7 3 10.7 2
    0 12.5 2 10.1 3  9.1 2
    0 11.5 2  9.9 3  8.9 2
    1 11.5 2  9.6 3  8.6 2
    1 14.5 2 12.9 3 11.9 2
    0 14.5 2 12.9 3 11.9 3
    1 11.5 2  9.9 3  8.9 3
    0 11.5 2  9.8 3  8.8 3
    1 13.5 2 11.4 3 10.4 3
    1 14.5 2 12.2 3 11.2 3
    0 12.5 2 10.8 3  9.8 3
    end

  • #2
    Your example is the nub of the matter here. I see no rule for which variables are used for which site. That being so, there isn't a way to write a loop with tables off on the side that isn't just a long-winded over-complication and restatement of the code you have already, or know how to write.

    Sometimes a do-file ... really is the least bad solution. It's explicit, it is intelligible by you now and in the future, it should be easy to edit, should be transparent to novice users too, and so on.



    • #3
      Thanks for your comments, Nick, I very much appreciate your time. The "rule", yes, is that there isn't really one, but I have a list and was hoping there was a way to extract the relevant cells from a table systematically. Thanks for confirming that there isn't one. Josh.



      • #4
        I've run into the kind of situation you describe. When I know the "lookup table" will not change, I use a good text editor that allows copying and pasting blocks (instead of lines) and appending/inserting text. It usually takes only a couple of minutes to write any number of such statements to a do-file. When I know the "lookup table" can change, I create a do-file that uses the information in the table to write the statements to a new do-file.

        Let's assume your table is in some format that allows you to import it into Stata (Excel or whatever) and that you have imported it:
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float site str5(v1 v2 v3)
        1 "var1a" "var2a" ""    
        2 "var1b" "var2b" "var2c"
        3 "var2a" "var2b" ""    
        end
        
        **you would not include the info above, instead you would:
        **use my_lookup_table.dta
        **or import your lookup table as needed
        
        * close any handle left open by an earlier failed run
        cap file close models
        file open models using "models.do", write replace
        
        * write one glm command per row of the lookup table
        forvalues i = 1/`=_N' {
            file write models "glm outcome age `=v1[`i']' `=v2[`i']' `=v3[`i']' if site==`=site[`i']' , family(binom)" _n
        }
        
        file close models
        You now have a do-file called "models.do" containing all your models, which looks like the code below. You would re-run the above code any time your table changes.

        Code:
        glm outcome age var1a var2a  if site==1 , family(binom)
        glm outcome age var1b var2b var2c if site==2 , family(binom)
        glm outcome age var2a var2b  if site==3 , family(binom)
        Stata/MP 14.1 (64-bit x86-64)
        Revision 19 May 2016
        Win 8.1



        • #5
          Hi Carole,
          Thanks [again] for your ideas and feedback. I was using a similar approach in Excel, basically copying rows of cells which contain my lines of code, including the relevant "var" for each row, and pasting the resulting text into a do-file. It works pretty well, but I wondered if I could do this directly within Stata. If I could define a local macro such as local b "2 4 5 6 7", it would be convenient, for example, to simply add or remove numbers in the macro and so change the final output.

          Anyway, I'm rambling. Thanks again.

          Josh.



          • #6
            I don't think I understand where you are going with the macro (to what do 2, 4, 5, 6, 7 refer?). If you do have a table in Excel with your variables that looks like the table in #1, you can import it into Stata and run the code in #4.



            • #7
              FWIW, I don't think #3 is quite what I was saying in #2. Carole's approach is systematic, but you need to (1) edit a file with the table, (2) use that file to generate a do-file, and (3) run the do-file. I am suggesting that often you can just do (3), but I don't want to be dogmatic about it. For example, the file in (1) may be close to a list you want to include in a report or thesis, and (3) might not be.



              • #8
                Hi Carole - my numbers refer to site numbers, and they also exist in a column in my table. They are not quite consecutive (we sometimes miss one or two) but that doesn't need to be a problem - I'll just have a few blanks in some places. Thank you for the solution - I have also learned a lot from understanding it.

                And yes, Nick, I share your opinion: sometimes (3) is quicker. Today that is probably the case (I have ~50 commands), but if and when I need to start changing them while minimizing the risk of typing errors, Carole's solution will probably be easier for me, and more systematic. I can also use it as a quality check.

                Thank you both again.

                Josh.
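
                The site-list idea from #5, reading the numbers as site numbers as clarified above, could be combined with the code in #4 roughly as follows. This is a sketch under assumptions: the lookup table from #4 is already in memory, and the macro names keep, s and wanted are made up for illustration. Site numbers in the list that have no row in the table are simply ignored.

                ```stata
                * Sketch only: assumes the lookup table (site v1 v2 v3) is in memory.
                local keep "2 4 5 6 7"      // sites to include; edit this list as needed

                cap file close models
                file open models using "models.do", write replace

                forvalues i = 1/`=_N' {
                    local s = site[`i']
                    local wanted : list s in keep   // 1 if this site is in the list, else 0
                    if `wanted' {
                        file write models "glm outcome age `=v1[`i']' `=v2[`i']' `=v3[`i']' if site==`s', family(binomial)" _n
                    }
                }

                file close models
                ```

                Adding or removing a number in keep and re-running regenerates models.do with only the requested sites.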
