  • simplify generate codes with loops always possible?

    I am using the following code. Would it be possible to simplify it and make it shorter? I am always advised to write the shortest code I can, but sometimes I feel that is not possible.

    gen are = S2_Q2AK * S2_Q2BK
    gen ki = S2_Q3AK * S2_Q3BK

    gen pre = S2_Q2AP * S2_Q2BP
    gen cul = S2_Q3AP * S2_Q3BP

    gen dek = S2_Q2D * S2_Q2BD
    gen wat = S2_Q3AD * S2_Q3BD


    egen own = rowtotal ( are pre dek)

    egen cul = rowtotal (ki cul wat)


    Thank you in advance, and thank you for all the advice I have received here. It has been so helpful with universities all closing and access to computer labs limited.



  • #2
    Short code is not always good code. Much more important is that you can easily read that code. If you can easily read code, then it also becomes easier to spot (and correct) mistakes. If your code is easy to read, then it is easier to communicate to others what you have done. That is the core of what makes work scientific: work does not become scientific because the person who did it wore a lab coat while doing it, or has the right to put two or four extra letters (dr or prof) in front of her or his name. Work becomes scientific by being transparent about each step that led to your conclusion, and writing clear readable code helps with that.

    Sometimes shortening the code helps, sometimes it does not. The code snippet you gave us seems perfectly fine to me, and in this case I suspect that shortening the code would make it harder to read.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Maarten Buis gave a compelling argument for valuing clarity above brevity. Sometimes shorter is simpler, sometimes shorter is more complicated.

      Let me point out that given the way you presented your code in post #1, I can look it over, comparing the various generate commands, and it leads me to notice
      Code:
      gen dek = S2_Q2D * S2_Q2BD
      which - after comparing it to the other commands - I think perhaps should be
      Code:
      gen dek = S2_Q2AD * S2_Q2BD
      If we make that change, we can rewrite your code as - I think, I didn't bother testing this -
      Code:
      foreach L in K P D {
          forvalues n = 2/3 {
              gen x`n'`L' = S2_Q`n'A`L' * S2_Q`n'B`L'
          }
      }
      egen own = rowtotal(x2?)
      egen col = rowtotal(x3?)
      and we have succeeded in reducing 8 lines of clear code to 7 lines of opaque code, and at the same time, had to replace the six meaningful variable names are ... wat with 6 abstract names x2K ... x3D. We could have shortened this by one line if we had replaced the variable names own and col with x2 and x3 and placed a single egen command following the forvalues loop.
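
      For illustration, the one-line-shorter variant just described might look like the sketch below. It is untested, and it assumes two things that are not in the code above: the forvalues loop is moved to the outside so that the egen command can run once per value of n, and the variable S2_Q2AD exists as suggested by the correction.

      ```stata
      * Sketch only: forvalues is now the outer loop (an assumption),
      * so a single egen can follow the inner foreach loop.
      forvalues n = 2/3 {
          foreach L in K P D {
              gen x`n'`L' = S2_Q`n'A`L' * S2_Q`n'B`L'
          }
          * x`n'? matches x2K x2P x2D (or x3K x3P x3D), not x`n' itself
          egen x`n' = rowtotal(x`n'?)
      }
      ```

      The totals are then named x2 and x3 rather than own and col, which makes the code shorter still and harder still to read.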

      I provide this example largely to reinforce Maarten's point by providing shorter code that is not simpler code. But if in fact in your real problem you had, not 2 different numbers in the middle of the variable names, but a much greater variety, then there could be something to be said for using loops.
      Last edited by William Lisowski; 29 May 2020, 08:33.



      • #4
        I would briefly echo Maarten and William's point that code should be written for the human reader as much as it should be written for the machine. However, I would add that loops like in William's example are particularly useful when the goal is code generalizability.

        Say, for example, you wanted to perform the same procedure on some arbitrarily large number of variables, maybe 2000. You could write each line of code, but that would take a long time. You could generate each line of code in Excel, then manually make changes as needed to each line, but this would also take a long time (though not quite so much). In a case like this, one should use a loop. Programmer time is more valuable than machine processing time, so if it takes less time to type out each line than it takes to think through the logic of the loop, then just type out the lines. Otherwise, use a loop.
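
        As a rough sketch of that situation, the loop below collects every variable matching a wildcard and multiplies it by its partner, however many there are. It is untested and rests on assumptions: that the variables follow the S2_Q#AK / S2_Q#BK naming pattern from post #1, and that the prefix prod_ is a free (hypothetical) name for the results.

        ```stata
        * Expand the wildcard into a list of all "AK" variables (assumed naming pattern)
        unab avars : S2_Q*AK

        foreach a of local avars {
            * Derive the partner's name by swapping the first "AK" for "BK"
            local b : subinstr local a "AK" "BK"
            * Stop with an error if the partner variable does not exist
            confirm variable `b'
            * prod_ is a hypothetical prefix chosen for this example
            gen prod_`a' = `a' * `b'
        }
        ```

        The same few lines serve whether the wildcard matches 2 variables or 2000, which is where a loop starts to pay for the thought it takes.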

        As another example, say you want to create a program that can be used in a number of different situations. Like many commands in Stata that take variable lists of any length, perhaps the number of variables that you want to loop over is different depending on the particular task at hand. In this case you can't type out each line separately; you must generate the content of the line dynamically with a loop.
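
        A minimal sketch of such a program follows. The name rowprod_total and its generate() option are hypothetical, invented for this example, and the code is untested. It takes a variable list of any even length, read as pairs a1 b1 a2 b2 ..., and stores the sum of the pairwise products, skipping products that are missing much as rowtotal() treats missing values as zero.

        ```stata
        capture program drop rowprod_total
        program define rowprod_total
            version 16
            syntax varlist(numeric min=2), GENerate(name)
            confirm new variable `generate'

            * The list is read as pairs, so its length must be even
            local k : word count `varlist'
            if mod(`k', 2) {
                display as error "varlist must contain an even number of variables"
                exit 198
            }

            quietly gen double `generate' = 0
            forvalues i = 1(2)`k' {
                local a : word `i' of `varlist'
                local b : word `=`i'+1' of `varlist'
                * Add the product only where both variables are nonmissing
                quietly replace `generate' = `generate' + `a' * `b' if !missing(`a', `b')
            }
        end
        ```

        Assuming the corrected variable names from post #3, a call could look like: rowprod_total S2_Q2AK S2_Q2BK S2_Q2AP S2_Q2BP S2_Q2AD S2_Q2BD, generate(own). The loop is unavoidable here because the number of pairs is not known when the program is written.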

        Of course, thinking through the logic of a loop can be its own reward, particularly if you're the sort of person who likes puzzles. Still, I think Maarten's pragmatic attitude is worth imitating as a general rule.
