
  • Use of rowmean with conditions

    This is something that I have been Googling but couldn't find an answer to.

    I need to generate a new variable with the egen function rowmean, but only for respondents who answered at least 4 of the questions/variables.

    My code is egen newvar = rowmean(var1 var2 var3 var4 var5 var6)
    This is obviously incomplete, because it lacks the condition of taking the mean ONLY if the respondent has an answer to at least 4 of these variables.

    Hope someone can help! Thanks!

  • #2
    Code:
    egen ngood = rownonmiss(var1 var2 var3 var4 var5 var6)
    egen meanif4 = rowmean(var1 var2 var3 var4 var5 var6) if (ngood >= 4)
    drop ngood



    • #3
      Try something like:

      Code:
      egen numnomissing =  rownonmiss(var1 var2 var3 var4 var5 var6)
      
      egen newvar = rowmean(var1 var2 var3 var4 var5 var6) if  numnomissing>3



      • #4
        Originally posted by Mike Lacy View Post
        Code:
        egen ngood = rownonmiss(var1 var2 var3 var4 var5 var6)
        egen meanif4 = rowmean(var1 var2 var3 var4 var5 var6) if (ngood >= 4)
        drop ngood
        I'm wondering what the logic is behind dropping ngood?



        • #5
          Originally posted by Joro Kolev View Post
          Try something like:

          Code:
          egen numnomissing = rownonmiss(var1 var2 var3 var4 var5 var6)
          
          egen newvar = rowmean(var1 var2 var3 var4 var5 var6) if numnomissing>3
          Thanks!
          I'm trying to understand the logic behind the code.
          So the first command is supposed to generate a variable that counts the nonmissing responses,
          and the second generates a new variable holding the mean, but only if the first generated variable (the number of nonmissing responses) is >3 (so essentially at least 4). Is that right?



          • #6
            Yes, the first command counts the number of non-missing per row.

            The second command calculates the row mean, but only if the number of non-missing values is bigger than 3. Otherwise it will generate a missing value for the given observation.
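
            A minimal sketch of those two commands on toy data may make this concrete. The four observations below are made up for illustration, and the range var1-var6 assumes the six variables are adjacent in the dataset:

            ```stata
            clear
            input var1 var2 var3 var4 var5 var6
            1 2 3 4 5 6
            1 2 3 4 . .
            1 2 3 . . .
            . . . . . .
            end

            * count the nonmissing answers in each row
            egen numnonmissing = rownonmiss(var1-var6)

            * row mean only where at least 4 of the 6 answers are present;
            * observations failing the condition are left missing
            egen newvar = rowmean(var1-var6) if numnonmissing > 3

            list numnonmissing newvar
            ```

            The first two observations have 6 and 4 answers, so they get means of 3.5 and 2.5. The last two have only 3 and 0 answers, so newvar stays missing for them.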


            • #7
              The logic can be re-created with your own loop:

              Code:
              gen sum = 0
              gen count = 0
              
              foreach v in var1 var2 var3 var4 var5 var6  {
                  replace sum = sum + `v' if !missing(`v')
                  replace count = count + !missing(`v')
              }
              
              gen wanted = sum/count if count > 3
              I know one very experienced Stata user -- no longer active here -- who almost never uses egen on the grounds that it usually boils down to a few lines like this, and by the time you've checked whether there is an egen function and whether it does exactly what you want, you could have written out such code. This person is, however, very fast at coding and very knowledgeable about Stata generally, but just doesn't want the burden of remembering or checking egen syntax.
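
              As a hedged check that a loop of this kind and the egen route agree, here is a small sketch on made-up data. The names total, n, n_egen and mean_egen are hypothetical, chosen only for this illustration:

              ```stata
              clear
              input var1 var2 var3 var4 var5 var6
              1 2 3 4 5 6
              2 4 6 8 10 .
              1 2 3 4 . .
              1 2 3 . . .
              end

              * hand-rolled version: running total and count of nonmissing values
              gen double total = 0
              gen n = 0
              foreach v of varlist var1-var6 {
                  * add only nonmissing values, otherwise the total would turn missing
                  replace total = total + `v' if !missing(`v')
                  * !missing() evaluates to 1 or 0, so this counts nonmissing values
                  replace n = n + !missing(`v')
              }
              gen double wanted = total/n if n > 3

              * egen version, for comparison
              egen n_egen = rownonmiss(var1-var6)
              egen double mean_egen = rowmean(var1-var6) if n_egen > 3

              * both routes should give the same counts, the same pattern of
              * missing values, and the same means where a mean is defined
              assert n == n_egen
              assert missing(wanted) == missing(mean_egen)
              assert reldif(wanted, mean_egen) < 1e-12 if !missing(wanted)
              ```

              If the assert commands run silently, the two approaches agree on these data.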



              • #8
                Joro Kolev Thanks, that's really helpful!



                • #9
                  You are welcome!

                  I think it is a good investment of your time to read through the help files of

                  1. egen

                  2. user-contributed egenmore
                  Code:
                  . findit egenmore
                  then follow the instructions to install.

                  3. egenmisc
                  Code:
                  . findit egenmisc
                  There is no need to memorise them, as we can always look them up, but reading through them two or three times leaves in the back of your mind what functionality is out there.



                  • #10
                    Originally posted by Nick Cox View Post
                    That is interesting! But in my opinion it is easier to just use egen, especially for non-experts like myself. Maybe with experience people develop certain habits they find faster or more useful.



                    • #11
                      I use egen myself a lot for convenience and indeed first wrote many of the functions in the official command and most of those in egenmore (SSC) and some others.

                      It is not contradictory to underline that many of its functions are just wrappers for a few simple lines of code and that you may need those lines of code when through caprice there isn’t a function to do what you want.

                      StataCorp also has some degree of ambivalence about egen, which on the whole is being maintained rather than enhanced through addition of new functions.



                      • #12
                        There are different levels of Stata mastery, and they make different demands on our time.

                        To understand how loops, macros, and referencing by groups work, one actually needs to read the manual, which is a couple of hundred pages. If one invests the time and understands those, the whole set of egen functions becomes redundant.

                        However, it takes only about 20 minutes to read the egen manual entry, and if one is lucky and does not have to do something too "dynamic", that would be enough.

                        (Just saying in different words what Nick is saying: it does not hurt to know that there are many ways to do one particular task. But different ways demand different investments of our time.)
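
                        As one illustration of the by-group point, here is a sketch of a group mean built from by: and subscripting, checked against egen's mean() function. The data and the variable names group, x, total, n, groupmean and check are made up for this example:

                        ```stata
                        clear
                        input group x
                        1 10
                        1 20
                        1 .
                        2 5
                        2 7
                        end

                        * running sum within group; sum() treats missing values as zero
                        bysort group: gen double total = sum(x)
                        * running count of nonmissing values within group
                        by group: gen n = sum(!missing(x))
                        * the last observation of each group holds the group totals
                        by group: gen double groupmean = total[_N] / n[_N]

                        * the egen equivalent, for comparison
                        egen double check = mean(x), by(group)
                        assert reldif(groupmean, check) < 1e-12
                        ```

                        Here group 1 gets a mean of 15 and group 2 a mean of 6. The three by: lines do the same job as the single egen call, which is the trade-off between the two approaches.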



                        • #13
                          Some time ago I wrote a tutorial on loops, which came out as Stata Journal 2(2): 202-222 (2002).

                          That the volume, issue, page numbers and date lined up in that way was pure happenstance, but still perhaps a little diverting.

                          So that paper remains accessible via https://www.stata-journal.com/articl...article=pr0005

                          Slides from an associated talk are at https://www.stata.com/meeting/8uk/fortitude.pdf (Douglas Adams fans should note the multiple pun on slide 42).

                          Many textbook writers churn out new editions every few years, so I didn't feel too guilty in producing a second edition of that tutorial, which is scheduled for Stata Journal 20(4) later this year. It should come out shorter, especially because I cut the material on -for-, which, qua Stata command, is not even undocumented any more.



                          • #14
                            I have read the original version of "How to face lists with fortitude" multiple times, and it is super good stuff. I am eagerly looking forward to the brand new version.

                            As for the old -for-, which is not even undocumented anymore: We The People want the old -for- back.

                            The old -for- also had some crappy features: it was very slow, it had limitations on how many numbers a numlist could contain, and so on. The use of X and Y as placeholders was also dodgy.

                            Could somebody not write a wrapper/command that behaves as a simple version of the old -for-, but with fast speed (which would be automatic if the new loop commands were acting behind the scenes), without the limitations of the old -for-, and using some funky symbols for X and Y, say @ and #?



                            • #15
                              My guess is that -for- is gone and never coming back. These versions of the command remain accessible as well as completely documented.


                              Code:
                              . search ip8, historical entry
                              
                              Search of official help files, FAQs, Examples, and Stata Journals.
                              STB-30  ip8.1 . . . . . . . . . . . . . . .  An even more enhanced for command
                                      (help for3 if installed)  . . . . . . . . . . . . . . . . . P. Royston
                                      3/96    pp.5--6; STB Reprints Vol 5, pp.65--66
                                      incorporated into improved for command in Stata 5.0
                              
                              STB-26  ip8 . . . . . . . . . . . . . . . . . . . . .  An enhanced for command
                                      (help for2 if installed)  . . . . . . . . . . . . . . . . . P. Royston
                                      7/95    p.12; STB Reprints Vol 5, p.65
                                      incorporated into improved for command in Stata 5.0
