simplify generate codes with loops always possible?

Naika Sangroo

Join Date: May 2020

Posts: 78
#1

28 May 2020, 20:09

I am using the following code. Would it be possible to simplify it and make it shorter? I am always advised to write the shortest code possible, but sometimes I feel that is not possible.

gen are = S2_Q2AK * S2_Q2BK
gen ki = S2_Q3AK * S2_Q3BK

gen pre = S2_Q2AP * S2_Q2BP
gen cul = S2_Q3AP * S2_Q3BP

gen dek = S2_Q2D * S2_Q2BD
gen wat = S2_Q3AD * S2_Q3BD

egen own = rowtotal ( are pre dek)

egen cul = rowtotal (ki cul wat)

Thank you in advance for all the advice I have received here. It has been so helpful with the university closing and access to computer labs being limited.
Maarten Buis

Join Date: Mar 2014

Posts: 3445
#2

29 May 2020, 01:30

Short code is not always good code. Much more important is that you can easily read that code. If you can easily read code, then it also becomes easier to spot (and correct) mistakes. If your code is easy to read, then it is easier to communicate to others what you have done. That is the core of what makes work scientific: work does not become scientific because the person who did it wore a lab coat while doing it, or has the right to put two or four extra letters (dr or prof) in front of her or his name. Work becomes scientific by being transparent about each step that led to your conclusion, and writing clear readable code helps with that.

Sometimes shortening the code helps, sometimes it does not. The code snippet you gave us seems perfectly fine to me, and in this case I suspect that shortening the code would make it harder to read.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

29 May 2020, 08:08

Maarten Buis gave a compelling argument for valuing clarity above brevity. Sometimes shorter is simpler, sometimes shorter is more complicated.

Let me point out that given the way you presented your code in post #1, I can look it over, comparing the various generate commands, and it leads me to notice

Code:

gen dek = S2_Q2D * S2_Q2BD

which - after comparing it to the other commands - I think perhaps should be

Code:

gen dek = S2_Q2AD * S2_Q2BD

If we make that change, we can rewrite your code as - I think, I didn't bother testing this -

Code:

foreach L in K P D {
    forvalues n = 2/3 {
        gen x`n'`L' = S2_Q`n'A`L' * S2_Q`n'B`L'
    }
}
egen own = rowtotal(x2?)
egen col = rowtotal(x3?)

and we have succeeded in reducing 8 lines of clear code to 7 lines of opaque code, and at the same time, had to replace the six meaningful variable names are ... wat with 6 abstract names x2K ... x3D. We could have shortened this by one line if we had replaced the variable names own and col with x2 and x3 and placed a single egen command following the forvalues loop.
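
For illustration, the one-line-shorter variant described above might look like the following untested sketch. It assumes the variables follow the S2_Q2AD naming pattern suggested earlier, and it assumes the loops are nested with forvalues outermost, so that a single egen command placed inside it runs once per question number after all three products exist. The wildcard x`n'? matches x2K, x2P and x2D, but not x2 itself, because ? stands for exactly one character.

Code:

forvalues n = 2/3 {
    foreach L in K P D {
        // product of the A and B items for question `n', suffix `L'
        gen x`n'`L' = S2_Q`n'A`L' * S2_Q`n'B`L'
    }
    // row total of the three products for this question number
    egen x`n' = rowtotal(x`n'?)
}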

I provide this example largely to reinforce Maarten's point by providing shorter code that is not simpler code. But if in fact in your real problem you had, not 2 different numbers in the middle of the variable names, but a much greater variety, then there could be something to be said for using loops.

Last edited by William Lisowski; 29 May 2020, 08:33.
Daniel Schaefer

Join Date: Mar 2020

Posts: 814
#4

29 May 2020, 09:15

I would briefly echo Maarten and William's point that code should be written for the human reader as much as it should be written for the machine. However, I would add that loops like in William's example are particularly useful when the goal is code generalizability.

Say for example you wanted to perform the same procedure on some arbitrarily large number of variables - maybe 2000. You could write each line of code, but that would take a long time. You could generate each line of code in Excel, then manually make changes as needed to each line, but this would also take a long time (though not quite so much). In a case like this, one should use a loop. Programmer time is more valuable than machine processing time, so if it takes less time to type out each line than it takes to think through the logic of the loop, then just type out the lines. Otherwise, use a loop.
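
As a minimal sketch of that case, suppose the variables came in pairs named a1, a2, ... and b1, b2, ... (hypothetical names, not taken from this thread). The first loop below only creates toy data so that the example runs on its own; the second loop is the part that replaces the typed-out lines. Changing the local macro npairs is all it takes to scale this up, subject to the variable limit of your edition of Stata.

Code:

clear
set obs 5
local npairs 50

// toy data only: hypothetical variable pairs a1..a50 and b1..b50
forvalues i = 1/`npairs' {
    generate a`i' = runiform()
    generate b`i' = runiform()
}

// one generate command in a loop instead of 50 typed-out lines
forvalues i = 1/`npairs' {
    generate prod`i' = a`i' * b`i'
}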

As another example, say you want to create a program that can be used in a number of different situations. Like many commands in Stata that take variable lists of any length, perhaps the number of variables that you want to loop over is different depending on the particular task at hand. In this case you can't type out each line separately; you must generate the content of the line dynamically with a loop.
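
A small sketch of such a program follows. The program name centervars and the c_ prefix are made up for this illustration. It accepts a numeric variable list of any length and loops over whatever it is given, creating a mean-centered copy of each variable. The auto dataset that ships with Stata is used only to make the example self-contained.

Code:

capture program drop centervars
program define centervars
    // hypothetical helper: accepts a variable list of any length
    syntax varlist(numeric)
    foreach v of local varlist {
        // meanonly stores the mean in r(mean) without printing a table
        quietly summarize `v', meanonly
        generate double c_`v' = `v' - r(mean)
    }
end

sysuse auto, clear
centervars price mpg
centervars weight length turn

The same loop body serves both calls, even though the number of variables differs, which is the situation where typing out each line separately is not an option.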

Of course, thinking through the logic of a loop can be its own reward, particularly if you're the sort of person who likes puzzles. Still, I think Maarten's pragmatic attitude is worth imitating as a general rule.
