
  • In response to #585: I think this is more a problem with how the regex object is set up behind the scenes than with cond() per se. I have seen similar "off by 1" behaviour when using similar functions in loops.



    • Further discussion on the cond() issue is available at this thread: Strange behavior by -cond()- and -ustrregexs()- - Statalist



      • Regarding #585

        It is disconcerting to realize that my intuition on how the cond() function works has been incorrect. I've always casually thought of it as roughly equivalent to if-then-else. Clearly that was poor intuition.

        I wonder if
        Code:
        cond(x,a(z),b(z))
        evaluates both a(z) and b(z) before evaluating x, rather than using x to choose which one of them to evaluate for the current observation? If this is the case, and b(z) is more difficult to evaluate than a(z) while x is usually true, then Stata will work harder to evaluate
        Code:
        generate y = cond(x,a(z),b(z))
        than it would to evaluate
        Code:
        generate y = a(z) if x
        replace y = b(z) if !x
        It also seems to suggest that nested cond() calls may in some cases exact a performance penalty for the same reason: everything to the right of the first argument of the outermost cond() will be evaluated.
        Code:
        cond( x<100, a(z), cond( x<1000, b(z), c(z) ) )



        • Originally posted by William Lisowski
          I wonder if
          Code:
          cond(x,a(z),b(z))
          evaluates both a(z) and b(z) before evaluating x, rather than using x to choose which one of them to evaluate for the current observation?
          cond(x, a, b) evaluates b, then a and then x irrespective of whether x is true or false.
          Code:
          version 17.0
          
          clear *
          
          log close _all
          log using condeval.txt, text nomsg name(lo)
          
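          * One fixed seed, so every block below restarts the same random-number stream
          local seed 1685397315
          * Each condition consumes one runiform() draw: `0' is always false, `1' is always true
          local 0 = "runiform() < 0"
          local 1 = "runiform() > 0"
          local show display in smcl as text
          
          * Reference: the false condition uses one draw; the next two displays show the draws that follow
          set seed `seed'
          `show' `0'
          `show' runiform()
          `show' runiform()
          
          * Reference: the same for the true condition
          set seed `seed'
          `show' `1'
          `show' runiform()
          `show' runiform()
          
          * Reference: displaying -1 uses no draw, so these are draws 1 and 2
          set seed `seed'
          `show' -1
          `show' runiform()
          `show' runiform()
          
          * Reference: as above, but showing draw 1 only
          set seed `seed'
          `show' -1
          * `show' runiform()
          `show' runiform()
          
          * Test: false condition, both branches draw; compare the values with the references above
          set seed `seed'
          `show' cond(`0', runiform(), runiform())
          `show' runiform()
          
          * Test: true condition, both branches draw
          set seed `seed'
          `show' cond(`1', runiform(), runiform())
          `show' runiform()
          
          * Test: true condition, only the second argument draws
          set seed `seed'
          `show' cond(`1', runiform(), -1)
          `show' runiform()
          
          * Test: true condition, only the third argument draws
          set seed `seed'
          `show' cond(`1', -1, runiform())
          `show' runiform()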
          
          quietly log close lo
          
          exit
          I suppose that you could use the same diagnostic approach if you're interested in how the function behaves with its optional (three-valued logic condition) fourth argument.
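The counting idea behind that diagnostic can be sketched outside Stata as well. The following is only a Python analogy, not a statement about how Stata implements cond(): the helper names `cond` and `draws_used` are made up for illustration, and Python's own argument order (left to right) need not match Stata's. It shows that a conditional written as a function call consumes a draw for every argument, while an if/else expression consumes a draw only for the branch it selects.

```python
import random


def cond(x, a, b):
    """Function-style conditional (hypothetical helper).

    By the time this body runs, Python has already evaluated
    both a and b at the call site.
    """
    return a if x else b


def draws_used(expr, seed=1685397315):
    """Count how many random draws `expr` consumes from a freshly seeded stream."""
    rng = random.Random(seed)
    count = 0

    def draw():
        nonlocal count
        count += 1
        return rng.random()

    expr(draw)
    return count


# Function call: both branch arguments are evaluated before cond() runs.
eager = draws_used(lambda draw: cond(True, draw(), draw()))

# Conditional expression: only the selected branch is evaluated.
lazy = draws_used(lambda draw: draw() if True else draw())

print(eager, lazy)
```
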


          • Allow -statsby- to work with pweights. https://www.stata.com/manuals/dstatsby.pdf says
            All weight types supported by command are allowed except pweights;
            I wonder why -statsby- cares, other than a belief that use of pweights should be discouraged.



            • The Methods and Formulas section of [SVY] Subpopulation Estimation suggests that estimating subpopulation characteristics by simply subsetting the data and applying the sampling weights is not appropriate.



              • Re: #590 and #591. Anything you can do with -statsby-, and more, can also be done with -runby-, by Robert Picard and me, available from SSC. -runby- imposes no restrictions at all on the commands it processes. Any command that can run standalone in Stata can be run group-wise using -runby-, and, in large data sets, much faster. The only major limitation of -runby- is that the commands it runs cannot access local macros defined in the calling program (there is no problem with macros defined within the commands being executed by -runby-), so any such information must be passed in some other way.

                So, if you are sure that what you want to do with -pweights- can still produce correct answers when carried out groupwise, then you can do it with -runby-. Of course, you do this at your own peril if your judgment about the suitability of groupwise calculation is ill-founded.



                • Add code block support for magic command %stata in pystata.
                  This is crucial for the complete integration of Python with Stata, i.e., running a large body of mixed Python and Stata code inside a Python IDE (like Spyder or Jupyter) through pystata.

                  Right now, the only workaround for multiline Stata code (e.g., a Stata loop block written in a Python IDE) is the %%stata cell magic or the stata.run command, and either has to be executed separately. Suppose we want to loop over a variable in Python and run a multiline Stata block on each iteration. As far as I can tell, there is no way to do that with %stata. For example, there is no way to make this code work:
                  Code:
                  # loading pystata as usual
                  import stata_setup
                  stata_setup.config("C:/Program Files/Stata17", "mp")
                  from pystata import stata
                  from sfi import Macro
                  
                  for ii in [1,2]:
                      Macro.setLocal('kk', str(ii))
                      %stata   forvalues jj=1/`kk'
                      %stata  {
                      %stata  disp `jj'
                      %stata  }
                  Last edited by John Williamss; 21 Jan 2023, 14:47.
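One possible workaround, offered as a sketch rather than a tested recipe: build each multiline Stata block as a Python string and hand it to stata.run() inside the Python loop. This assumes stata.run() accepts a block of several lines as one string, as the post above indicates. The helper names `build_stata_block` and `run_blocks` are hypothetical, and the loop bound is interpolated in Python instead of being passed through a Stata local macro.

```python
def build_stata_block(kk):
    """Return a multiline Stata forvalues block that displays 1 through kk."""
    return "\n".join([
        f"forvalues jj = 1/{kk} {{",
        "    display `jj'",
        "}",
    ])


def run_blocks(values, runner):
    """Send one Stata block per value to `runner`.

    `runner` is any callable that takes a string of Stata code;
    with pystata configured, that would be stata.run (assumption).
    """
    for ii in values:
        runner(build_stata_block(ii))


# Without Stata available, print the generated blocks to inspect them.
run_blocks([1, 2], print)
```

With pystata configured as in the post above, the call would become `run_blocks([1, 2], stata.run)`.
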



                  • Originally posted by Maarten Buis
                    Chen Samulsion for #565 Would instead a syntax like collapse, force frame(new_frame_name) work better? My idea would be that that would make a new frame (new_frame_name) with the collapsed data in it and change to that frame.
                    Maarten Buis regarding #575

                    One alternative is to use the user-written prefix command frapply available from SSC:

                    Code:
                    ssc install frapply, replace
                    sysuse auto
                    frapply, into(newframe, replace): contract rep78 foreign |> sort foreign rep78 |> list
                    frapply is flexible; one can apply commands (such as collapse) to any frame, replace or change the current frame to the new frame, and daisy-chain commands. See my discussion here: https://www.statalist.org/forums/for...d-s-to-a-frame.



                    • Preview .do and .ado files on macOS by pressing the space bar (Quick Look).



                      • #595 - Stata developers are aware of this problem introduced by macOS Ventura and have said it will be corrected in a forthcoming update to Stata 17.

                        https://www.statalist.org/forums/for...-and-ado-files



                        • Originally posted by William Lisowski
                          The Methods and Formulas section of [SVY] Subpopulation Estimation suggests that estimating subpopulation characteristics by simply subsetting the data and applying the sampling weights is not appropriate.
                          Does -statsby- intend to rule out probability weights when estimating the same regression across several countries? I suppose countries are each subsets of the world, but isn't it ok to have separate regressions by country since variations in the probability of being sampled won't move an observation from one country to another? Isn't -statsby- presuming the weights arise from a particular sample design, when other designs are also possible?



                          • Stata string matrix. One use case is to use a string matrix as a lookup table.



                            • I think that Stata should have a command for generalized additive models (GAM). gamfit.exe is very buggy and I believe it runs only on MS Windows machines. A built-in command would be so much better.



                              • Originally posted by Bjarte Aagnes
                                Stata string matrix. One use case is to use a string matrix as a lookup table.
                                Can't you use frames for this now?
