  • Storing scalars with byable programs

    I would like to learn how to retain a scalar for a given statistic for each level of the sort variable in byable programs.

    Please consider the following trivial example in which I might like to save the R-square statistic from each regression when sorting on the variable foreign in the auto.dta data set. Obviously, as written this program will only store the scalar from the final pass of the byable routine.

    Code:
    sysuse auto, clear
    
    cap prog drop bytest
    prog def bytest, rclass byable(recall)
            syntax [varlist] [if] [in]
            marksample touse
            qui regress `varlist' if `touse'
            di "R-square = " %5.4f e(r2) " "_n
            * How do I give the following scalar a distinct name
            * for each level of the byable sort variable?
            return scalar Rsq = e(r2)
    end
    
    bysort foreign: bytest price mpg weight

    I don't know how to give the retained scalar r(Rsq) a distinct name on each pass of the byable routine so that earlier values are not overwritten.

    I could easily do this with a loop because I can use the loop's step value in naming the stored scalar. That approach would avoid the need to use a byable program, but I'd like to learn how to do this in the case of byable programs if that is possible.

    Thanks for any advice you may offer.

    Red Owl

  • #2
    Hello "Red Owl",

    It seems this is your eighth post in the Stata Forum. Also, you seem to have provided interesting topics for discussion so far.

    With regard to the FAQ recommendations on the use of given name and family name, I kindly ask you to act accordingly.

    The reason for that is explained in the FAQ as well.

    You just need to click on the "contact us" button and request the correction.

    Thank you.

    Best,

    Marcos


    • #3
      I don't think there is a good way to do this. (If somebody else knows of one, please do chime in--I would love to learn how!)

      It is, I think, no coincidence that regular Stata programs used with -by:- characteristically leave behind only the results from the final iteration. How, after all, would one make it otherwise? You could use the -_byindex()- function to count each call to the program and append the corresponding number to the name of a new scalar. You couldn't leave those scalars in r(), though, because r() still gets cleared each time the program is called. And even then it probably isn't helpful, because at the end you would have a bunch of scalars and be left to puzzle out which one corresponds to which -by- group.

      It might sound appealing to name the scalars in a way that reflects the values of the variables defining the -by- group. But since -by- takes a varlist, there is no limit to the length of such a name (whereas scalar names do have a length limit), and no limit to the complexity of combining string and integer-valued variables. And God alone knows how one would handle floating-point valued variables.

      There is also the difficulty that the -byable(recall)- machinery does not give you easy access to the current values of the -by- group variables. -marksample- knows and uses them for you: that's what makes -byable(recall)- so easy to write and use. But you would have to get the names of the by variables from local macro _byvars and then write some code to calculate the values of those variables in the observations where `touse' == 1 (which wouldn't be hard for numeric variables, but is a bit more challenging for strings; you would probably need the -_byn1()- function for that). Furthermore, you would have to somehow guard all that machinery against being called upon when the same program is called without -by:-.
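
For readers who want to see the -_byindex()- idea in concrete form, here is a minimal sketch. It is only an illustration of the approach described above, not a recommendation: the program name bytest2 and the scalar prefix Rsq_ are made up for this example, and the results go into ordinary Stata scalars rather than r().

```stata
sysuse auto, clear

capture program drop bytest2
* bytest2 is a hypothetical name used only for this sketch
program define bytest2, byable(recall)
    syntax varlist [if] [in]
    marksample touse
    quietly regress `varlist' if `touse'
    * _byindex() counts the by-groups as they are processed: 1, 2, ...
    local i = _byindex()
    * An ordinary scalar survives after the program ends, unlike r()
    scalar Rsq_`i' = e(r2)
end

bysort foreign: bytest2 price mpg weight
scalar list Rsq_1 Rsq_2
```

The scalars are numbered in the order the by-groups are processed, so the mapping from Rsq_1 and Rsq_2 back to values of foreign still has to be worked out separately, which is exactly the drawback noted above.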

      In short, the challenges seem daunting. Moreover, while I'm guessing the example you show is far simpler than what you really want to do, there is no need to invent any machinery to accomplish your example: the -statsby- command is designed for that. (Note: -statsby- can be very slow and occasionally trips people up in complicated analyses, but you can always overcome that by setting up a -postfile- and looping over the -by()- groups yourself, which is what -statsby- does internally.)
      Last edited by Clyde Schechter; 29 Dec 2016, 11:54. Reason: Correct typo.
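
As a rough illustration of the two alternatives mentioned in post #3, the sketch below first uses -statsby- and then a hand-rolled -postfile- loop for the toy example from the original question. The result variable name Rsq and the loop macro names are assumptions chosen for this example.

```stata
* Alternative 1: let -statsby- collect e(r2) for each level of foreign
sysuse auto, clear
statsby Rsq = e(r2), by(foreign) clear: regress price mpg weight
list foreign Rsq

* Alternative 2: the same thing by hand with -postfile- and a loop
sysuse auto, clear
tempname h
tempfile results
postfile `h' foreign Rsq using `results'
levelsof foreign, local(levels)
foreach l of local levels {
    quietly regress price mpg weight if foreign == `l'
    * one posted row per group: the group value and its R-squared
    post `h' (`l') (e(r2))
}
postclose `h'
use `results', clear
list foreign Rsq
```

Either way the result is a small data set with one observation per group, so each R-squared stays attached to the value of foreign it belongs to.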


      • #4
        Marcos Almeida: Red Owl has been with us a long time, going back even to the listserv version, and this has come up before. Red Owl is his real name!


        • #5
          Clyde Schechter, many thanks for thinking about this and for your thoughtful reply. I didn't have an immediate application in mind and was just puzzling about this as a general programming skills matter. My toy example was just to demonstrate the type of objective I might want to accomplish. You confirmed my expectations about the infeasibility of doing it with byable programming, so I won't need to spend any more time trying to reinvent the wheel when there are workable alternative approaches available. Thanks also for remembering me and my FAQ-evoking name. : )

          Marcos Almeida, I appreciate your contributions to this blog and have benefited from them even before I registered and started posting here recently. I especially appreciate the courtesy that you model in your posts and have noticed that you always seem to find a way to compliment the original poster even when offering a critique.

          For many years on the old Stata Listserv, in which I was pretty active, I used to write "(Yes, my real name)" below my signature with each post. I believe I even appended that to my signature in my first two posts on this blog late last month.

          The confusion over my legal name happens on occasion, and I understand that it may look unusual to those not familiar with American Indian culture. It's ironic that, were I to use a fake name like "John Smith," it would likely go unnoticed.

          Even though using my real name sometimes raises questions on scholarly blogs, I have found that those circumstances often provide an opportunity to make a new friend and colleague, and I hope that happens here.

          Cheers,

          Red Owl


          • #6
            Red Owl: I'm so sorry for the misunderstanding. Since you registered in the Stata Forum last month, I assumed you were a new fellow, so to speak. Great that you came back! Thank you for your kind words about my participation. My sincere apologies again.

            Clyde Schechter: thank you for letting me know.
            Best regards,

            Marcos


            • #7
              Marcos Almeida, as Humphrey Bogart's character, Rick, said in the movie Casablanca, "I think this is the beginning of a beautiful friendship."

              Red Owl


              • #8
                I'd focus on putting the results in a new variable. Even if that means more copies than you need, you can always ignore duplicates.


                • #9
                  Well, again, that is problematic with -byable(recall)-. The first by-group will generate the new variable, but on the second and subsequent calls Stata will balk because the variable already exists. So this kind of thing requires creating the variable before executing the -by varlist: my_program- command and using only -replace-, not -generate-, in the program itself. It also probably requires passing the name of the variable as an argument to the program (or, less desirably, having a fixed variable name hard-coded into the program).
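
A minimal sketch of the create-first, replace-inside pattern described in post #9 follows. The program name bytest3 and the option name into() are hypothetical, invented for this illustration; the option simply carries the name of a variable that must already exist.

```stata
sysuse auto, clear

capture program drop bytest3
* bytest3 and its into() option are hypothetical names for this sketch
program define bytest3, byable(recall)
    * into() must name an existing variable; the program only replaces it
    syntax varlist [if] [in], INTO(varname)
    marksample touse
    quietly regress `varlist' if `touse'
    * fill in the result for the observations of the current by-group
    quietly replace `into' = e(r2) if `touse'
end

* Create the holder variable once, before the -by:- call
generate double Rsq = .
bysort foreign: bytest3 price mpg weight, into(Rsq)
tabulate foreign, summarize(Rsq)
```

Because -marksample- also drops observations with missing values in the varlist, the holder variable is filled only for observations actually used in each regression; in this example none of price, mpg, or weight has missing values, so every observation gets a value.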


                  • #10
                    I didn't give examples of what I've sometimes done, which indeed is to loop over a set of groups and generate a variable before the loop and replace it each time around.

                    The code to moments (SSC) shows an example.

                    The code to lvalues (SSC, now SJ) gives an example of doing it differently.
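
The generate-before-the-loop pattern mentioned in post #10 might look roughly like the sketch below. It is written for the toy example in the original question and is not taken from the code of moments or lvalues; the variable names Rsq and tag are assumptions.

```stata
sysuse auto, clear

* Create the result variable once, before the loop
generate double Rsq = .
levelsof foreign, local(levels)
foreach l of local levels {
    quietly regress price mpg weight if foreign == `l'
    * replace, never generate, inside the loop
    quietly replace Rsq = e(r2) if foreign == `l'
}

* The value is repeated within each group; tag one observation per group
egen byte tag = tag(foreign)
list foreign Rsq if tag
```

The duplicated values within each group are harmless, and tagging one observation per group is one way to ignore them when listing or graphing.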


                    • #11
                      Thanks, Nick Cox. I often review the code in your programs to try to learn efficient approaches to specific objectives. It's very helpful when you point me to specific programs.

                      Like going to the library, I often find treasures in your code other than those for which I was originally searching.

                      Red Owl
