Wishlist for Stata 16

Clyde Schechter replied

15 Nov 2018, 08:54
Here's one I've been mulling over for a while. The way missing values work with logical expressions is often problematic. A missing value is treated as true. But in most contexts, a missing value on an expression or variable really means "could be true or false, we don't know." Adjusting even simple logical expressions for this quickly gets complicated. When calculating a conjunction (&), we would want 1 & . = ., 0 & . = 0, . & . = .. For disjunction, 1 | . = 1, 0 | . = ., . | . = ..

With existing Stata features you can work around this by recoding . to 0.5, and then using min(a, b) for a & b and max(a, b) for a | b. But if you have a lengthy logical expression with many operators, and perhaps parenthesized expressions nested within, this kind of translation becomes tedious and error-prone. Moreover, the resulting code is as opaque as possible.
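For concreteness, here is a minimal sketch of that workaround. The variables a and b and the toy data are made up for illustration, and it assumes both are 0/1 indicators. Writing negation as 1 - x and mapping 0.5 back to missing at the end are added assumptions, not steps spelled out above.

```stata
* Sketch only: three-valued logic via the 0.5 recoding.
* a and b are hypothetical 0/1 indicators that may be missing.
clear
input a b
1 .
0 .
. .
1 1
1 0
0 0
end

* Working copies with missing recoded to 0.5 ("unknown")
generate double a3 = cond(missing(a), 0.5, a)
generate double b3 = cond(missing(b), 0.5, b)

* Conjunction is min(), disjunction is max(), negation is 1 - x
generate a_and_b = min(a3, b3)
generate a_or_b  = max(a3, b3)
generate not_a   = 1 - a3

* Map "unknown" (0.5) back to missing
foreach v of varlist a_and_b a_or_b not_a {
    replace `v' = . if `v' == 0.5
}

list a b a_and_b a_or_b not_a, noobs
```

The first three rows of the listing are the cases from the truth table earlier in this post. The recoding is needed because min() and max() skip missing arguments when a nonmissing one is present.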

Now, redefining the operation of & and | to do this would break lots of existing code, and would create chaos even if the old behavior were maintained under version control. But why not define new logical operators && and || that would behave this way? It would also be nice to have a negation operator for which the negation of . is .. However, !! would not work as the analogous negation operator, because !! is itself a legitimate expression of double negation (and one that I find helpful and use fairly often), so one would have to find some other expression for it (or perhaps a function). && would never clash with anything else as it is never legal syntax currently. || is "taken" as a separator between the fixed and random components of mixed models, but I think that it would seldom if ever create confusion as that is a highly distinct context that the parser could recognize, though it might require some lookahead. (In linguistic terms, I think the two meanings of || would be in complementary distribution.)
3 likes
Jim Steiner replied

15 Nov 2018, 07:53
A few little things I'd like (or that may already exist without my being aware of them):

1. Find-Replace in do-editor tells you how many instances of the find term were replaced.
2. Option to open multiple do-files from explorer window in one editor window rather than one Stata instance per do-file. I don't really want to have to create .stpr files for that purpose.
3. Option to set do file preferences (e.g. font/colors) permanently (this may already be an option--I just haven't seen it if it exists).
4. An inlist() function that can hold more than 10 string arguments directly rather than having to loop or use multiple or statements.
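For reference, the loop workaround mentioned in item 4 might look like the sketch below. It uses the auto dataset that ships with Stata; the list of 12 values is hypothetical and only meant to resemble entries of the make variable, and the variable name wanted is made up.

```stata
sysuse auto, clear

* Hypothetical list of 12 string values, too many for a single inlist() call
local makes `""AMC Concord" "AMC Pacer" "AMC Spirit" "Buick Century" "Buick Electra" "Buick LeSabre" "Buick Opel" "Buick Regal" "Buick Riviera" "Buick Skylark" "Cad. Deville" "Cad. Eldorado""'

* Flag an observation if make equals any value in the list
generate byte wanted = 0
foreach m of local makes {
    replace wanted = 1 if make == `"`m'"'
}

count if wanted
```

The other route is to chain several calls, as in inlist(make, ...) | inlist(make, ...), which is the "multiple or statements" version.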
Jean-Claude Arbaut replied

26 Oct 2018, 04:44
I would have other ideas, but on the top of my wishlist:

* Ability to call an external DLL from Mata. Might need additional types to help (byte, short...) in writing more or less the equivalent of a Declare in VBA.
(if it's flexible enough, it would open many many manyyyyyyy other possibilities: call libraries for numerical computations and special functions, plotting, multiprecision, OS services, file I/O in other formats...), and of course add Mata functions that are not easy (or not fast enough) to implement in pure Mata code.
* In the preceding, a way to pass directly Mata matrices (or even a dataset) back and forth would be very valuable.
* Ability to call Mata functions (especially user-defined ones) in Stata, in some places where only Stata functions are currently available (maybe with the help of a generic function, e.g. callmata("functionname", arg1, ...)), for instance in gen/egen.
* Ability to plot from Mata, especially to plot data from a vector/matrix, and to update a plot with subsequent Mata code.
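To illustrate the gap behind the callmata() idea: today a user-defined Mata function has to be applied to a variable in a separate step, roughly as sketched here. The function name plus_one_sq and the variables x and y are made up for illustration; st_data() and st_store() are the existing Mata interface functions.

```stata
clear
set obs 5
generate double x = _n

mata:
// Hypothetical user-defined function: elementwise (v + 1)^2
real colvector plus_one_sq(real colvector v)
{
    return((v :+ 1):^2)
}
end

* Cannot write "generate y = plus_one_sq(x)"; instead, create the
* variable first and fill it from Mata
generate double y = .
mata: st_store(., "y", plus_one_sq(st_data(., "x")))

list x y
```

This works, but it does not compose with if/in qualifiers or by: the way a Stata function inside generate would, which is presumably what a generic callmata() would address.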
1 like
Jesse Wursten replied

26 Oct 2018, 03:33
Perhaps an arcane/unfeasible request... but is it possible to split the dofile editor and "main stata" processes? Imagine you are running some heavy regressions, you might still want to work on your dofile while it's running. On modern computers with multiple CPU cores, that's often feasible in theory, given that Stata rarely ever uses all cores to their maximum capacity. In practice, the dofile editor always seems to hang up or be very slow at least.

I know you can edit your dofiles in separate programs, but in the end they are never as integrated with Stata as its own dofile editor, so I'd prefer to keep using it.
1 like
Dave Airey replied

15 Oct 2018, 10:03
How about power and sample size for ROC AUC analysis, maybe similar to power.roc.test() from pROC in R?
1 like
Jesse Wursten replied

15 Oct 2018, 09:42
I wish Stata would stop printing lines to the Results window once it encounters an error inside an if block. Especially in larger code blocks, scrolling up to find where the error actually occurred gets boring really quickly.

Here's an example.

Code

Code:

local this "example"
if "`this'" == "example" {
    di "`value'"
    di "something else"
    error 413
    di "1. I don't want to see this line in the output"
    di "2. I don't want to see this line either"
    di "3. You get the idea by now"
}

Actual output

Code:

. if "`this'" == "example" {
. di "`value'"
. di "something else"
something else
. error 413
r(413);
. di "1. I don't want to see this line in the output"
. di "2. I don't want to see this line either"
. di "3. You get the idea by now"
. }
r(413);
end of do-file
r(413);

What I would like

Code:

. if "`this'" == "example" {
. di "`value'"
. di "something else"
something else
. error 413
r(413);
end of do-file
r(413);

Some context on when this becomes annoying. When I have a dofile that does three separate (but related) things, I enclose each block in an if-condition (e.g. if "$runA" == "1"). At the top of the dofile I can then set global runA to 1 or 0, depending on whether I want to run that part at this point. No hassle with commenting out parts, no issues with common macros not being defined yet, no remembering which lines need and need not be selected to get the thing to run. These blocks can get very long (hundreds of lines is not uncommon). Correspondingly, any small error, or even a manually added stop (/error 1) means I have to scroll all the way up to see where it happened. Yet I've never in my life needed all those printed lines, because they are verbatim copies of my dofile anyway.
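A stripped-down sketch of the do-file layout described above, for readers who have not seen the pattern. The global names runA, runB and runC and the block contents are placeholders, not taken from a real project.

```stata
* Switches at the top of the do-file: 1 = run the block, 0 = skip it
global runA 1
global runB 0
global runC 1

* Macros shared by all blocks are defined unconditionally
local outcome "price"

if "$runA" == "1" {
    sysuse auto, clear
    summarize `outcome'
}

if "$runB" == "1" {
    display "block B would run here"
}

if "$runC" == "1" {
    sysuse auto, clear
    regress `outcome' weight
}
```

In a real do-file each block can be hundreds of lines long, which is why the lines echoed after an error inside a block pile up.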
2 likes
George Hoffman replied

12 Oct 2018, 17:05
Nick Cox , Mauricio Caceres :

thank you for these suggestions.
tabcount works for tab. for other commands, i'm going to play around with varparse. it's not a complete solution but it's a great start.
thanks again.
Nick Cox replied

11 Oct 2018, 02:12
George Hoffman Mauricio Caceres #134ff

See also tabcount from a while back:

SJ-3-4 pr0011 . . . . . . . . Speaking Stata: Problems with tables, Part II
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q4/03 SJ 3(4):420--439 (no commands)
reviews three user-written commands (tabcount, makematrix,
and groups) as different approaches to tabulation problems

tabcount from http://fmwww.bc.edu/RePEc/bocode/t
'TABCOUNT': module to tabulate frequencies, with zeros explicit / tabcount
tabulates frequencies for up to 7 variables. Its main / distinctive
features are that zero frequencies of one or more / specified values or
conditions are always shown in the table / (i.e. entirely empty rows,

Code:

clear
input float(var1 var2)
10 10
20 20
30 30
40 40
50 50
60 60
70 70
80 80
90 90
10 90
20 80
30 70
40 60
50 50
60 40
30 70
20 80
10 90
end

tabcount var1 var2, c1(<=60 >60) c2(<=70 >70)

----------------------
          |    var2
     var1 | <=70   >70
----------+-----------
     <=60 |   11     4
      >60 |    1     2
----------------------
1 like
Mauricio Caceres replied

10 Oct 2018, 23:34
Originally posted by George Hoffman

Nick - no, not that. i was not concise enough in my description of what I was thinking!
i envision a way to generate the indicator variables on the fly.
more generally, the temporary variable need not be an indicator variable.
the syntax engine would evaluate expressions and create a temporary variable from the expression.

example:

Code:

reg y x1 x2 {x3<=5} {1/x4}

thanks

this would regress y against x1, x2, an indicator for x3<=5, and the value of 1/x4

A basic version of this is not too difficult to implement via a separate command, though it would certainly be nice to have such a thing built in and for everything to be correctly labeled as the expression that generated the on-the-fly variable, rather than a temporary variable name. This requires Stata 14+

Code:

program varparse
    _on_colon_parse `0'
    local 1 `s(after)'
    local ix 0
    qui while ustrregexm(`"`1'"', "\{(.+?)\}") {
        tempvar v`++ix'
        local g`ix' = ustrregexs(1)
        gen `v`ix'' = `g`ix''
        label var `v`ix'' `"`=ustrregexs(1)'"'
        local 1 = ustrregexrf(`"`1'"', "\{(.+?)\}", `"`v`ix''"')
    }
    `1'
end

clear
set seed 1729
set obs 100
gen x1 = runiform()
gen x2 = rnormal()
gen x3 = runiform() * 10
gen x4 = rnormal()
gen y = 1 + x1 - x2 + 2 * (x3 <= 5) - 3 / x4 + rnormal() * 2
gen var1 = int(100 * runiform())
gen var2 = int(100 * runiform())

varparse: tab {var1<=60} {var2<=70}
varparse: reg y x1 x2 {x3<=5} {1/x4}

This gives what you want, I think. However, while "tab" uses the variable label, reg does not. Not sure how to make that happen (perhaps swapping the variable names for the labels in regress can be added to the wishlist? It's not so obvious how to do it in esttab etc. either since the variables no longer exist in memory).
1 like
Tom Poulton replied

10 Oct 2018, 19:55
My wish for the next Stata update: I would really appreciate the ability to add headers and footers to documents produced using putpdf.
1 like

George Hoffman replied

08 Oct 2018, 05:55

Nick - no, not that. i was not concise enough in my description of what I was thinking!
i am looking for on-the-fly variable creation.
my "var1<=60" evaluates to 0 or 1 depending on the value of var1. likewise for var2. thus the tab statement that i envisioned would yield a 2x2 table
in the code below, i generated two indicator variables var160 and var270 to demonstrate the desired effect. this particular example takes 3 lines of code to tabulate, but more complex conditions would require more.

Code:

 input var1 var2

          var1       var2
  1. 10 10
  2. 20 20
  3. 30 30
  4. 40 40
  5. 50 50
  6. 60 60
  7. 70 70
  8. 80 80
  9. 90 90
 10. 10 90
 11. 20 80
 12. 30 70
 13. 40 60
 14. 50 50
 15. 60 40
 16. 30 70
 17. 20 80
 18. 10 90
 19. end

. tab var1 var2 if var1<=60 & var2<=70

           |                                     var2
      var1 |        10         20         30         40         50         60         70 |     Total
-----------+-----------------------------------------------------------------------------+----------
        10 |         1          0          0          0          0          0          0 |         1
        20 |         0          1          0          0          0          0          0 |         1
        30 |         0          0          1          0          0          0          2 |         3
        40 |         0          0          0          1          0          1          0 |         2
        50 |         0          0          0          0          2          0          0 |         2
        60 |         0          0          0          1          0          1          0 |         2
-----------+-----------------------------------------------------------------------------+----------
     Total |         1          1          1          2          2          2          2 |        11


. gen var160 = var1<=60

. gen var270 = var2<=70


. tab var160 var270

           |        var270
    var160 |         0          1 |     Total
-----------+----------------------+----------
         0 |         2          1 |         3
         1 |         4         11 |        15
-----------+----------------------+----------
     Total |         6         12 |        18

.

i envision a way to generate the indicator variables on the fly.
more generally, the temporary variable need not be an indicator variable.
the syntax engine would evaluate expressions and create a temporary variable from the expression.

example:

Code:

 reg y x1 x2 {x3<=5} {1/x4}

thanks

this would regress y against x1, x2, an indicator for x3<=5, and the value of 1/x4

Last edited by George Hoffman; 08 Oct 2018, 06:21. Reason: correct typo
