Wishlist for Stata 18

William Lisowski replied

20 Jul 2022, 08:03
In #421 I would simplify the syntax of merge to

Code:

merge 1:1 ID DATE using dataset2, usingkey(id date)

for consistency with the normal syntax.
3 likes
daniel klein replied

20 Jul 2022, 07:16
While I see the more general point of 421, typing

Code:

rename ID DATE , lower
merge 1:1 id date using dataset2

seems to be about the same effort in typing but arguably less effort in reading and understanding syntax.
Ali Atia replied

20 Jul 2022, 06:59
On the point of merging, a very small quality-of-life improvement would be to allow keys with different names in each dataset (as several user-written commands that emulate the behavior of merge do). For instance, something like this would be great:

Code:

use dataset1, clear
merge 1:1 using dataset2, usingkey(id date) masterkey(ID DATE)
5 likes
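
Until something like usingkey()/masterkey() exists, one possible workaround that leaves the master dataset's variable names untouched is to rename the keys in a temporary copy of the using dataset. This is only a sketch, assuming dataset1 holds ID and DATE and dataset2 holds id and date as in the example above; the tempfile name using2 is made up for illustration.

```stata
* Rename the keys in a temporary copy of the using data
* (dataset2.dta on disk is not changed)
use dataset2, clear
rename (id date) (ID DATE)
tempfile using2
save `using2'

* Merge on the master's key names
use dataset1, clear
merge 1:1 ID DATE using `using2'
```

The cost is one extra pass over the using data, which may matter for very large files.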
Nihat Mugurtay replied

19 Jul 2022, 10:59
Nick Cox, @wbuchanan,

I understood #417 on logical operators. Thank you for your comments. #418 has useful remarks, and that is why I sometimes migrate to MySQL for data management/manipulation (multiple merging, more flexible syntax, user interface).
Bruce Weaver replied

19 Jul 2022, 08:30
After reading (and responding to) this discussion from 2017, I suggest that commands like -summarize- and -tabstat- include an option for reporting "descriptive" versions of the variance and SD (i.e., with n rather than n-1 in the denominator). But the default should remain the "inferential" versions that use n-1 in the denominator.
1 like
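
In the meantime, the "descriptive" versions can be computed by hand from the results that -summarize- leaves behind, since the two variances differ only by the factor (n-1)/n. A minimal sketch, using the auto dataset shipped with Stata purely as an example; the scalar names var_n and sd_n are arbitrary.

```stata
sysuse auto, clear
quietly summarize price

* r(Var) uses n-1 in the denominator; rescale it to use n instead
scalar var_n = r(Var) * (r(N) - 1) / r(N)
scalar sd_n  = sqrt(var_n)

display "inferential SD (n-1): " r(sd)
display "descriptive SD (n):   " sd_n
```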
wbuchanan replied

19 Jul 2022, 06:25
Nick Cox
While I agree with you about the semantics/syntax recommendation from Nihat Mugurtay, I think the merge suggestion may have been missing some additional context. I'm assuming that Nihat Mugurtay's reference to things being trivial means that this is a solved problem in the SQL world. Personally, I think a more SQL-esque data management/manipulation language could be useful, but I imagine it would not be terribly easy to implement using the Stata internals, and it would not be nearly as efficient, because some of the structures that enable SQL's efficiencies (e.g., indices, pages, etc.) don't exist in Stata.
1 like
Nick Cox replied

19 Jul 2022, 05:15
"and" and "AND" as logical operators? Please no. One of several considerations is that they are legal variable and scalar names.

Two or more merges at once? That really isn't trivial. Better to write your own loop.
1 like
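
The kind of loop suggested above might look like the following. It is a sketch only: it assumes the master dataset is already in memory, that ccode and year uniquely identify observations in every file, and that the using files are named data1, data2, and data3 as in the proposal quoted below. The names _merge1 to _merge3 are arbitrary.

```stata
* One merge per using file, each with its own match indicator
forvalues i = 1/3 {
    merge 1:1 ccode year using "data`i'", generate(_merge`i')
}
```

Each _merge`i' variable can then be tabulated separately to see which observations failed to match in which file.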
Nihat Mugurtay replied

18 Jul 2022, 06:58
As a newcomer,

It would be great if Stata had a more flexible syntax, like SQL. Especially when it comes to logical operators, syntax flexibility provides a more user-friendly environment. For instance, "and", "AND", or "&" might be used interchangeably. Also, there could be an option for merging more than two datasets, like this (it is trivial, but consider it brainstorming):

Code:

merge 1:1 ccode year using "data1", gen(merge123) &
merge 1:1 ccode year using "data2", gen(merge1234) &
merge 1:1 ccode year using "data3", gen(merge12345)

Each generated "_merge" element would still provide information about what is going wrong.

The third point concerns working with different tabs. Each time, I open a new Stata 17 window and load data. There are always at least three different Stata windows open, which sometimes leaves me confused about which data I am saving or working with.

Best,
John Riveros replied

17 Jul 2022, 08:46
Regarding reg3 (three-stage least squares), it would be great if robust standard errors (sandwich-type, Driscoll-Kraay, and HAC) were implemented, along with one-way and two-way fixed effects as options. This would make reg3 much more useful, particularly in panel-data analysis.
1 like
patricio cuaron replied

16 Jul 2022, 15:03
Originally posted by Nils Enevoldsen

I'd like a native compressed DTA file format.

For my DTAs larger than a couple dozen megabytes, zstd typically compresses by about 90% in a fraction of a second at the standard level. For smaller DTAs, the typical compression I see is about 50%. (This is after gathering the low-hanging fruit of casting variables to appropriate types.) Fancier variations are possible, but it seems pretty straightforward to support a “.zdta” format that is literally just a compressed .dta.

+1 to this, and I cannot believe no one else brought it up. Even with fast NVMe SSDs, it is a pain to work with multiple multi-GB tables, particularly when collaborating in an academic setting (i.e., using Dropbox or something like it).

Other requests:
Enable triple-click selection of whole words in the Results window.

Enable merges and frame links on differently named variables, as many have requested before.

Adopt gtools (https://gtools.readthedocs.io/en/). The syntax for gcollapse, greshape, and gquantiles is so much better (and the latter is so much faster than the egen alternative...!).

Append frames.

Enable multiple browser windows for different frames.

Add a frames list to the right of the Results window, below the variables list. Ideally, it would allow the user to open a frame in a browser window with just a double click.

Adopt/sponsor VS Code with the Stata Enhanced, stataRun, and, most importantly, StataLanguageServer extensions. This would enable a leap in capabilities for code editing at an absolutely minimal investment for StataCorp.

Keep improving performance. Very important. A lot of important commands are still slowwwww (e.g., egen xtile).
2 likes
Clyde Schechter replied

11 Jul 2022, 15:23
I suggest that the error message "too few quotes" be changed to "unmatched quotes" or something like that. Stata issues this message whenever it encounters unbalanced quotes. But it really has no way of knowing if the unmatched quote is missing a mate, or if it is a surplus quote that shouldn't be there in the first place. In my own work, I am more likely to introduce a stray quote when aiming my 5th finger for the "enter" key than I am to forget to provide, or inadvertently delete, a mate to a quote. "Unmatched quotes" would be accurate in either case. Evidently this isn't a big deal, but still.
9 likes
daniel klein replied

07 Jul 2022, 15:28
net query should store results in r().

As a programmer, I want to be able to write code like

Code:

quietly net query
local other "`r(other)'"
net set other PLUS
net get ...
net set other "`other'"

Edit:

Alternatively, net settings could be stored in c().
Last edited by daniel klein; 07 Jul 2022, 15:30.
3 likes
FernandoRios replied

07 Jul 2022, 10:34
Something I have been working with, but that would be really nice to have as a native command, is the ability to draw areas where only the higher, but not the lower, line is visible.
For example, I can do this with the "area" and "rarea" commands:

[attached figure not reproduced]

But what I would like to get with some options in "area" or "rarea" is this:

[attached figure not reproduced]

Yes, I can construct this using area and line, but I would like to have a single line that creates this, if possible.

(I do hope I'm not simply overlooking a currently available option.)
3 likes
daniel klein replied

04 Jul 2022, 03:46
This is (probably) a small one: Could we have a Mata function that directly accesses the dataset label? Something along the lines of st_varlabel(), st_varvaluelabel(), etc. I imagine it is rarely needed but if it is, the workaround through Stata's extended macro functions is slightly inconvenient.
1 like
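
For readers unfamiliar with the workaround mentioned above, it might look roughly like this: the extended macro function copies the dataset label into a local macro, and Mata then reads that local. This is a sketch under the assumption that it is run as one piece (for example, from a do-file), so that the local is still defined when Mata is called; the macro name dlabel is arbitrary, and the auto dataset is used only as an example.

```stata
sysuse auto, clear

* Extended macro function: copy the dataset label into a local macro
local dlabel : data label

* Read that local macro from Mata and display it
mata: st_local("dlabel")
```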
Seyi Soremekun replied

24 Jun 2022, 10:44
reposting from separate thread:

Stata 17 suggests autocompletion options when you type a few letters in a do-file.
I don't want to turn this nice feature off; however, I would like it not to select the first option in the autocomplete list when I press 'enter'. My primary use of 'enter' is to move to the next line; however, when autocomplete is on, the result is random unwanted words autotyped in my do-files.

I have read a few previous threads, and none seems to have a solution to my problem (apologies in advance if I've missed a thread). It seems the only option is to disable autocomplete entirely, but perhaps a better compromise could be implemented in Stata 18.
1 like