This is a sticky topic.

  • Originally posted by Nick Cox
    I have to agree with Clyde Schechter (#39) about egen.

    Merging generate and egen is unfortunately a non-starter, for more reasons than have been given.

    Here are two more reasons, and I doubt I've recalled everything that could be said.

    Functions like log() can be used outside generate, as with display or in calculating local or global macros. In contrast, what bites is that egen functions can't be used outside egen.

    Function calls outside egen can be nested, and often are. Nesting isn't possible with egen functions.
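
    Both points can be sketched with the auto dataset shipped with Stata. The variable and macro names below are only illustrative, and the last two commands are expected to fail, which is why they are run under capture noisily:

    ```stata
    sysuse auto, clear

    * Ordinary functions work outside generate and can be nested
    display log(10)
    local gm = exp(log(2 * 8) / 2)
    display `gm'
    generate lnprice = log(price)

    * egen functions work only inside egen
    egen meanprice = mean(price)
    capture noisily display mean(price)    // expected to fail: mean() is unknown outside egen

    * One egen function cannot be nested inside another
    capture noisily egen bad = mean(rowtotal(price mpg))    // expected to fail
    ```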

    egen is a series of wrapper functions (in its own sense) for generate. It is not an alternative or complement to generate as written. I know that users may think of it in that way when using it, and I often do too, but what bites for any merger scheme is how egen is implemented.

    egen arouses mixed feelings. I've often seen posts elsewhere asking for the R equivalent of egen, which is striking both ways. Positively, in that people have evidently used egen in Stata, regard it favourably, and would like to learn of something equivalent. Negatively, in that egen has a history, but it doesn't have a distinct rationale otherwise. Very many commands call generate within their code to create new variables, and which kinds of new variables can be created with egen and which with other commands is a matter of caprice.

    As Clyde points out, it is not hard for user-programmers to write their own egen functions. Many of the functions in official egen were originally community-contributed and have since been folded back into the official release. Many others exist outside official Stata.
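
    As a rough sketch of what such a function looks like, here is a hypothetical sumsq() that returns the sum of squares of an expression, optionally by group. It assumes the calling convention followed by older official functions such as mean(), in which egen hands the work to a program named _g plus the function name; the name sumsq and the file name _gsumsq.ado are invented for illustration.

    ```stata
    * Save as _gsumsq.ado somewhere on the adopath (hypothetical function)
    program define _gsumsq
        version 8
        * egen passes: [type] newvar = (expression) [if] [in] [, by(varlist)]
        syntax newvarname =/exp [if] [in] [, BY(varlist)]
        tempvar touse
        quietly {
            * flag the observations selected by if and in
            generate byte `touse' = 1 `if' `in'
            sort `touse' `by'
            * running sum of squares within group, then copy the group total
            by `touse' `by': generate `typlist' `varlist' = sum((`exp')^2) if `touse' == 1
            by `touse' `by': replace `varlist' = `varlist'[_N]
        }
    end
    ```

    It would then be called like any other egen function, for example egen ss = sumsq(price), by(foreign). Note that this sketch changes the sort order of the data.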

    But this kind of programming has long since passed its peak. Just as StataCorp are maintaining but not much extending egen, so also user-programmers are not often now writing new egen functions.

    Why is that? I'll speak for myself only as someone who has written several egen functions in the past. Already the lists of official egen functions and of user-written functions are both quite long. If we keep adding more, such lists could easily seem too long for anybody to want to scan and too much of a rag-bag to be attractive or convenient.

    I'd much rather write a distinct command that allows, or even has as a main aim, the generation of new variables.

    I think there is a core question here: Which egen functions should be rewritten as official function code? Unfortunately, they will usually need new names!
    I've been thinking about this since the mid-2025 messages. Would it be possible to create a new command called compute that, by default, works like gen unless it sees an egen-only function, in which case it works like egen?

    For commands involving arithmetic, it would be great to have, say,
    compute c=a+b
    compute d=rowmean(a b)

    Perhaps it could be set up so it only does arithmetic, and gen and egen could continue existing. Where egen and gen differ, e.g., in the use of -max()-, compute could default to gen.
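
    To make the max() difference concrete, here is a small sketch with made-up data; the variable names rowmax and colmax are only illustrative:

    ```stata
    clear
    input a b
    1 5
    4 2
    3 3
    end

    * generate: max() compares its arguments within each observation
    generate rowmax = max(a, b)    // 5, 4, 3

    * egen: max() takes the maximum of one expression across observations
    egen colmax = max(a)           // 4 in every observation

    list
    ```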



    • Readers may want to look back at https://www.statalist.org/forums/for...stata-20/page3 and see that Chris Martin originally asked a question and I was following a first reply by Clyde Schechter with my own. Understanding here, let alone discussion, is likely to be hard unless we return to the question. Here is Chris's question again, edited slightly to get closer to Stata conventions on showing code. Note, as in #136, that compute is not, as yet, a Stata command.

      It would be great to merge gen and egen so that they're interchangeable. It's fairly easy to write a manual workaround: capture gen ... capture egen ... but that's inelegant.

      Right now, we all have to remember which one to use, even though both are the equivalent of compute, and it taxes my memory. The one thing I miss about SPSS is that it just has compute. R has <-.

      If there's an easy workaround, it would be great. If not, I understand.
      To the new proposal I can only respond with new questions of my own.

      Here is one short answer and one longer answer.

      Short answer. Please tell us more about exactly how this would work.

      Longer answer.

      Who is going to write this? A competent user-programmer could write a command using Stata code (possibly Mata code) but Stata's own developers could write a better one. You could write a simple command that just fails if the syntax doesn't have an easy interpretation. But I fear that would confuse more than it really helps. It would fall over often. A more subtle problem is that it might work in ways you didn't expect or want.
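
      For concreteness, the simplest version of such a command might look like the sketch below. It is only a guess at what the proposal means, not anything StataCorp or anyone in this thread has written: the program tries generate first and passes the same text to egen only if generate fails.

      ```stata
      capture program drop compute
      program define compute
          version 14
          * Hypothetical wrapper: try generate first, fall back to egen
          capture generate `0'
          if _rc {
              egen `0'
          }
      end

      clear
      input a b
      1 5
      4 2
      end

      compute c = a + b           // handled by generate
      compute d = rowmean(a b)    // generate fails, so egen is tried
      ```

      The sketch also shows the weaknesses. Any function name known to both commands is resolved by generate without warning, and when generate fails for an unrelated reason (say, the variable already exists) the user sees egen's error message, not generate's. The by: prefix is not supported at all.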

      If this is a good idea, why hasn't it been done long since? At best it replaces two commands with one -- or rather leaves us with three. There would be the problems that could arise with either existing command PLUS a whole bunch of new problems because of an awkward mix of offlap and overlap: trying egen things that can't be combined with generate things, or vice versa. As Clyde pointed out, some egen functions have names also used in the rest of Stata.

      I can't see that helping anyone, from people new to Stata, who would need to be told about another way of doing something (or not told about it, in some ways better, in some ways not), to people not new to Stata, who have learned their own way of working with generate and egen together. One person extremely competent in Stata told me that they (almost) never used egen because it was faster to write new code using generate than to (re-)discover whether an egen function already existed and to (re-)discover its exact syntax. They had certainly studied egen code; the claim was a little exaggerated to make a point, but it had that point.

      Like all posts in this thread this question is really aimed at StataCorp and it doesn't do to try to guess what they will do, except that it is fun to try.

      1. StataCorp is in business to sell Stata, and they want people to buy Stata. Is this what would help sell Stata? I doubt it, but anyone else's guess is theirs to have.

      2. StataCorp think long-term and change fundamentals very reluctantly. If there is a long-term ambition, I would guess it to be to let some egen functionality be replaced by functionality you can use with generate but at the same time I doubt that is high on their agenda.

      Any language changes over time, natural languages certainly and programming languages too, including Stata. That usually means a profusion of different ways of doing things -- in some circles thought to be a bad idea -- and a mix of things that even people experienced in and dedicated to that language think in retrospect to be bad ideas. If anything is the bad idea here it is egen -- but many readers love it and use it all the time.



      • Originally posted by Nick Cox
        If this is a good idea, why hasn't it been done long since?
        I actually tried to do it more than a decade ago (s457851a). Looking back with the benefit of experience, I now share Nick Cox's view of the concept underlying that piece of software.



        • daniel klein Sorry that I had forgotten about your command.
