  • William Lisowski
    replied
    In #421 I would simplify the syntax of merge to
    Code:
    merge 1:1 ID DATE using dataset2, usingkey(id date)
    for consistency with the normal syntax.


  • daniel klein
    replied
    While I see the more general point of #421, typing

    Code:
    rename ID DATE , lower
    merge 1:1 id date using dataset2
    seems to be about the same effort in typing but arguably less effort in reading and understanding syntax.


  • Ali Atia
    replied
    On the point of merging, a very small quality of life improvement would be to allow keys with different names in each dataset (as several user-written commands which emulate the behavior of merge do). For instance, something like this would be great:

    Code:
    use dataset1, clear
    merge 1:1 using dataset2, usingkey(id date) masterkey(ID DATE)
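
    Until something like that exists, one possible workaround is to rename the keys in a temporary copy of the using dataset. A minimal sketch, assuming the dataset and variable names from the example above (dataset1 holds ID and DATE, dataset2 holds id and date):

    Code:
    * rename the keys in a temporary copy of the using dataset
    use dataset2, clear
    rename (id date) (ID DATE)
    tempfile using2
    save `using2'

    * merge on the now-identical key names
    use dataset1, clear
    merge 1:1 ID DATE using `using2'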


  • Nihat Mugurtay
    replied
    Nick Cox, @wbuchanan,

    I understood #417 on logical operators. Thank you for your comments. #418 has useful remarks, and that is why I sometimes migrate to MySQL for data management/manipulation (multiple merging, more flexible syntax, user interface).


  • Bruce Weaver
    replied
    After reading (and responding to) this discussion from 2017, I suggest that commands like -summarize- and -tabstat- include an option for reporting "descriptive" versions of the variance and SD (i.e., with n rather than n-1 in the denominator). The default should remain the "inferential" versions that use n-1 in the denominator.
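
    For now, the descriptive versions can be recovered from the results that summarize stores. A minimal sketch, using the auto dataset shipped with Stata purely as an illustration:

    Code:
    sysuse auto, clear
    quietly summarize price

    * inferential versions (n-1 in the denominator), as reported by summarize
    display "SD  (n-1): " r(sd)
    display "Var (n-1): " r(Var)

    * descriptive versions (n in the denominator), rescaled by (n-1)/n
    display "SD  (n):   " r(sd) * sqrt((r(N) - 1) / r(N))
    display "Var (n):   " r(Var) * (r(N) - 1) / r(N)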


  • wbuchanan
    replied
    Nick Cox
    While I agree with you about the semantics/syntax recommendation from Nihat Mugurtay, I think the merge suggestion may have been missing some additional context. I'm assuming that Nihat Mugurtay's reference to things being trivial means that this is a solved problem in the SQL world. Personally, I think a more SQL-esque data management/manipulation language could be useful, but I imagine it would not be easy to implement using the Stata internals and would not be nearly as efficient, because some of the structures that enable SQL's efficiencies (e.g., indices, pages, etc.) don't exist in Stata.


  • Nick Cox
    replied
    "and" and "AND" as logical operators? Please no. One of several considerations is that they are legal variable and scalar names.

    Two or more merges at once? That really isn't trivial. Better to write your own loop.
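
    A minimal sketch of such a loop, assuming a master dataset hypothetically named master and using datasets data1, data2 and data3 that all share the keys ccode and year:

    Code:
    use master, clear
    foreach f in data1 data2 data3 {
        * keep one match indicator per using dataset
        merge 1:1 ccode year using "`f'", generate(_merge_`f')
    }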


  • Nihat Mugurtay
    replied
    As a newcomer,

    It would be great if Stata had a more flexible syntax, like SQL. Especially when it comes to logical operators, syntax flexibility provides a more user-friendly environment. For instance, "and", "AND" or "&" might be used interchangeably. Also, there could be an option for merging more than two datasets, like this (it is trivial, but also a bit of brainstorming):

    Code:
    merge 1:1 ccode year using "data1", gen(merge123) & merge 1:1 ccode year using "data2", gen(merge1234) & merge 1:1 ccode year using "data3", gen(merge12345)
    Each generated "_merge" element would still provide information about what is going wrong.

    The third point relates to working with different tabs. Each time, I open a new Stata 17 window and load data. There are always at least three different Stata windows open, which sometimes confuses me about which data I am saving or working with.

    Best,


  • John Riveros
    replied
    Regarding reg3 (three-stage least squares), it would be great if robust standard errors (sandwich-type, Driscoll-Kraay, and HAC) were implemented, along with one-way and two-way fixed effects as options. This would make reg3 much more useful, particularly in panel data analysis.


  • patricio cuaron
    replied
    Originally posted by Nils Enevoldsen
    I'd like a native compressed DTA file format.

    For my DTAs larger than a couple dozen megabytes, zstd typically compresses by about 90% in a fraction of a second at the standard level. For smaller DTAs, the typical compression I see is about 50%. (This is after gathering the low-hanging fruit of casting variables to appropriate types.) Fancier variations are possible, but it seems pretty straightforward to support a “.zdta” format that is literally just a compressed .dta.
    +1 to this, and I cannot believe no one else brought it up. Even with fast NVMe SSDs it is a pain to work with multiple multi-GB tables, particularly when collaborating in an academic setting (i.e., using Dropbox or something like it).
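
    In the meantime, a rough workaround is the built-in compress and zipfile commands. This gives zip rather than zstd compression and is not a native compressed format. A minimal sketch, assuming a file hypothetically named bigdata.dta in the working directory:

    Code:
    * shrink storage types first, then zip the saved file
    use bigdata, clear
    compress
    save bigdata, replace
    zipfile bigdata.dta, saving(bigdata.zip, replace)

    * later, e.g. on a collaborator's machine
    unzipfile bigdata.zip, replace
    use bigdata, clear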

    Other requests:
    1. Enable triple-click selection of whole words in the Results window.
    2. Enable merges and frame links on differently named variables, as many have requested before.
    3. Adopt gtools (https://gtools.readthedocs.io/en/). The syntax for gcollapse, greshape and gquantiles is so much better (and the latter is so much faster than the egen alternative...!).
    4. Append frames.
    5. Enable multiple browser windows for different frames.
    6. Add a frame list to the right of the Results window, below the variables list. Ideally it would let the user open a frame in a browser window with just a double click.
    7. Adopt/sponsor VS Code with the Stata Enhanced, stataRun and, most importantly, StataLanguageServer extensions. This would enable a leap in code-editing capabilities at an absolutely minimal investment for StataCorp.
    8. Keep improving performance. Very important. A lot of important commands are still slowwwww (e.g., egen xtile).


  • Clyde Schechter
    replied
    I suggest that the error message "too few quotes" be changed to "unmatched quotes" or something like that. Stata issues this message whenever it encounters unbalanced quotes. But it really has no way of knowing if the unmatched quote is missing a mate, or if it is a surplus quote that shouldn't be there in the first place. In my own work, I am more likely to introduce a stray quote when aiming my 5th finger for the "enter" key than I am to forget to provide, or inadvertently delete, a mate to a quote. "Unmatched quotes" would be accurate in either case. Evidently this isn't a big deal, but still.


  • daniel klein
    replied
    net query should store results in r().

    As a programmer, I want to be able to write code like

    Code:
    quietly net query
    local other "`r(other)'"
    
    net set other PLUS
    net get ...
    
    net set other "`other'"

    Edit:

    Alternatively, net settings could be stored in c().
    Last edited by daniel klein; 07 Jul 2022, 15:30.


  • FernandoRios
    replied
    Something I have been working with, but that would be really nice to have as a native command, is the ability to draw areas where only the higher, but not the lower, line is visible.
    For example, I can do this with the "area" and "rarea" commands:

    [Image: Graph.png]

    But what I would like to get with some options in "area" or "rarea" is this:
    [Image: Graph1.png]
    Yes, I can construct this using area and line, but would like to have a single line that creates this if possible

    (I do hope I'm not simply ignoring a currently available option)

  • daniel klein
    replied
    This is (probably) a small one: Could we have a Mata function that directly accesses the dataset label? Something along the lines of st_varlabel(), st_varvaluelabel(), etc. I imagine it is rarely needed but if it is, the workaround through Stata's extended macro functions is slightly inconvenient.
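
    For reference, the workaround looks roughly like this. It is a minimal sketch, using the auto dataset shipped with Stata purely as an illustration; dtalabel is a hypothetical macro name, and the lines must run together (e.g., in one do-file) so that Mata sees the local macro:

    Code:
    sysuse auto, clear
    * the extended macro function copies the dataset label into a local macro
    local dtalabel : data label
    * Mata then reads that local macro from the calling do-file
    mata: st_local("dtalabel")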


  • Seyi Soremekun
    replied
    reposting from separate thread:

    Stata 17 suggests autocompletion options when you type a few letters in a do file.
    I don't want to turn this nice feature off; however, I would like it not to select the first option in the autocomplete list when I press 'enter'. My primary use of 'enter' is to move to the next line; however, when autocomplete is on, the result is random unwanted words autotyped in my do-files.

    I have read a few previous threads, but none seems to have a solution to my problem (apologies in advance if I've missed a thread). It seems the only option is to disable autocomplete entirely, but perhaps a better compromise could be implemented in Stata 18.
