  • László Sándor
    replied
    So Stata 14 was announced today. I think last week's Stata Facebook post revealed the main new features. Notice that there are no performance improvements or infrastructure changes for big data:
    The votes are coming in! Just a reminder, go cast your vote for which of the following features you would most like to see in the next version of Stata.
    59.48% Bayesian analysis
    31.90% Panel and multilevel survival models
    28.45% Survey for multilevel models
    24.14% Endogenous treatment effects
    19.83% Treatment effects for survival models
    18.10% Regression models for fractional data
    18.10% Markov-switching models
    15.52% Power and sample size for survival analysis...
    13.79% IRT (item response theory)
    13.79% Unicode
    08.62% Balance diagnostics for treatment effects
    08.62% Satorra-Bentler for SEM
    07.76% Censored Poisson model
    03.45% Small-sample inference for mixed models


  • Jonathan Horowitz
    replied
    Originally posted by Richard Williams
    I would like to see much better support for Full Information Maximum Likelihood (fiml). Some Stata routines, e.g. SEM, provide some support for fiml (which Stata calls mlmv).
    I second this, but would also like to see it implemented for non-SEM routines. I realize this is probably a much bigger challenge than implementing it for SEM, but fiml is often the best way to handle missing data (see: http://www.statisticalhorizons.com/w...ngDataByML.pdf), and it would be great to see it become standard.

    I also second/third/fourth everyone who wants the error message to reference the line in the do file.

    Finally, Satorra-Bentler for -gsem- would be outstanding.


  • Qunyong Wang
    replied
    Stata should strengthen its support for nonparametric and semi-parametric methods, Markov-switching models, and time-varying coefficient models. All of these are widely used in empirical economics.


  • Joseph Coveney
    replied
    A bit of a quibble, but an option
    Code:
    set default_date_display ISO_8601, permanently
    or
    Code:
    set default_date_display "%tdCY-N-D", permanently
    would be welcome.

    It would affect such commands as
    Code:
     di "`c(current_date)'"
    and
    Code:
    update
    and
    Code:
    describe
    and most important
    Code:
    translate , translator(smcl2ps) header(on)
    translate , translator(smcl2pdf) header(on)
    For the first few, either I can write wrapper workarounds or put up with it as I'm the only one typically seeing it.
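    A wrapper of the kind mentioned for the first case might look like the following minimal sketch. It assumes only that c(current_date) comes back in Stata's default day-month-year form; the local name today is hypothetical.
    Code:
    * Convert c(current_date), e.g. "13 Apr 2015", to a daily date value
    local today = date("`c(current_date)'", "DMY")
    * Redisplay it as year-month-day with zero-padded month and day
    display %tdCCYY-NN-DD `today'
    This only covers text I print myself; it does nothing for the dates that -update-, -describe- or -translate- write on their own.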

    But customers often see output, and they've grown to take compliance-to-standards as a given. My option here (header(off)) is to forgo pagination.


  • Sergio Correia
    replied
    Originally posted by Clyde Schechter
    What about relaxing the restriction that factor variables must have non-negative values?
    I completely agree. It's extremely annoying when you have pre/post dummies and end up having to add an arbitrary number to make the values non-negative.
    It's also hard to work around because -fvrevar- is a built-in.


  • Clyde Schechter
    replied
    What about relaxing the restriction that factor variables must have non-negative values? For example, in a clinical trial we might get several pre-randomization observations and then several post-randomization observations. It is natural to designate a time variable with negative numbers for the pre-intervention observations and positive numbers for the post-intervention ones. So, for example, an observation obtained 2 weeks before randomization might have week = -2, and one obtained 3 weeks after might have week = 3. Currently, you can't use i.week in this circumstance. Evidently the workaround is to create a different variable that is re-centered so that 0 corresponds to the lowest value of week, and then slap a value label on that. But it would be more convenient if we could just use i.week for this.
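    The re-centering workaround described above can be sketched as follows. This is only an illustration on made-up data: the variables y and week_fv, the label name week_lbl, and the toy dataset are all hypothetical.
    Code:
    * Toy data: week runs from -2 to 3, y is an arbitrary outcome
    clear
    set obs 60
    set seed 12345
    generate week = mod(_n, 6) - 2
    generate y = rnormal()

    * Shift week so that its lowest value becomes 0
    summarize week, meanonly
    local lo = r(min)
    local hi = r(max)
    generate int week_fv = week - (`lo')

    * Label each shifted value with the original week number
    forvalues w = `lo'/`hi' {
        label define week_lbl `=`w' - (`lo')' "`w'", add
    }
    label values week_fv week_lbl

    * The shifted variable is now acceptable as a factor variable
    regress y i.week_fv
    The value labels keep the original week numbers visible in the output, but the base level and any level references in later commands refer to the shifted values, which is the inconvenience at issue.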


  • daniel klein
    replied
    And, finally, it would be good if -foreach- would echo the Stata commands it's executing in the way that -for:- used to.
    I respectfully disagree here. foreach and forvalues are programming tools and as such heavily used within programs and ado-files, where output is not desirable. In fact, I hardly find myself in a situation where I would like to have all the commands in a loop echoed to the screen - except for debugging, in which case I can always set trace on to figure out what exactly Stata did in each iteration.
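    For the debugging case, a minimal sketch using the auto dataset shipped with Stata (the choice of variables is arbitrary):
    Code:
    sysuse auto, clear
    * Echo each command as it is executed, limiting how deep into called programs the trace goes
    set trace on
    set tracedepth 1
    foreach v of varlist price mpg {
        summarize `v'
    }
    * Switch tracing off again so later output stays clean
    set trace off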

    Best
    Daniel


  • Nick Cox
    replied
    I wish that by understood that sort was meant. I know that bysort exists, but it seems to me to be otiose. by only works when the data are sorted, so it should sort the data when invoked. Since Stata knows the sort order of the data, redundant sorting isn't carried out anyway.
    But that itself would create a downside. The present syntax is not in use merely because StataCorp could not program it otherwise. It's important for many, many users that, as far as possible, they know the current sort order, see when it is changed, and change it only consciously. (I'd go so far as to speculate that panel datasets are by far the most common kind now in use.) Some large fraction of my posts here hinge on showing how subscripting, itself entirely dependent on observation order, is key to many manipulations.

    If this request were implemented, then

    1. It would have to be under version control.

    2. We get a new kind of question on Statalist: why did my sort order change? Or, more likely, why do I get these bizarre results (which turn out to be a consequence of a change in sort order)?

    3. We get a new kind of question on Statalist: why do I need to change my sort order? Or, more likely, why do I get these bizarre error messages (which turn out to be a consequence of programs using the old syntax)?

    I take it Ronán is volunteering to handle all these questions personally!

    More positively, bysort already does what is desired. It's just a strange and ugly name.
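    A small illustration of both points, using the auto dataset shipped with Stata (the new variable names are hypothetical):
    Code:
    sysuse auto, clear
    * bysort sorts on foreign and, within foreign, on price, then runs the command group by group
    bysort foreign (price): generate cheapest = price[1]
    * The data are now sorted by foreign, so plain by works without another sort
    by foreign: generate n_in_group = _N
    Here price[1] means the lowest price in each group only because of the sort that bysort has just performed, which is why a silent change of sort order would matter.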


  • Ronán Conroy
    replied
    I wish that -by- understood that -sort- was meant. I know that -bysort- exists, but it seems to me to be otiose. -by- only works when the data are sorted, so it should sort the data when invoked. Since Stata knows the sort order of the data, redundant sorting isn't carried out anyway.

    And I wish that -by- worked with all Stata commands.

    And, finally, it would be good if -foreach- would echo the Stata commands it's executing in the way that -for:- used to. It can otherwise be difficult to figure out what went on. I like the idea that Stata output should include the precise command that generated each piece of output.


  • Ronán Conroy
    replied
    Something that my students have found a little confusing is that the -over- option can be concealed under names like "categories" in the dialogues. Making sure all dialogues are consistent with Stata syntax and with each other would be helpful.

    I understand that there are plans afoot to revise the epidemiology commands, and I applaud this. The dialogues for some of these commands are bewildering, notably -tabodds- and -mcc-.

    And please, StataCorp, why is it necessary for the -tabulate- dialogue to refer to "within-column relative frequencies"? A relative frequency scaled 0-100 is a percentage. They are column percents, a term that is not only much easier for my poor students but also more precise.


  • Clyde Schechter
    replied
    but perhaps Stata can cover the most common ones.
    That's actually a lot harder than it seems. Stata users are dispersed over numerous disciplines: public health, clinical medicine, biomedicine, econometrics, accounting, finance, sociology, demography, psychology, just to name a few that come quickly to mind. Each discipline has its own journals with their own preferred styles. At most Stata might be able to set up output templates for one or two journals in each of these disciplines, and even that seems unrealistic. This would probably leave pretty much nobody satisfied.

    Stata is a statistics program, not a word processing or document editing program. Trying to give it the features of the latter will inevitably turn it into bloatware. In addition, people wanting to use those features would have to learn new commands or menus to use them--while they still would need to know how to do all the corresponding manipulations in a real word processing/document editing program.

    What I think would be desirable is if the output produced by the ordinary Copy and Copy Table maneuvers were more layout-friendly to word processing programs, formatted so as to make it simple to paste from the Results window into a template table already created in a word processor or spreadsheet. I believe that is the intent of the Copy Table command, but the implementation is flawed, particularly as applied to commands that date back to the earliest versions of Stata.

    Remember, too, that there are several user-written programs that will very flexibly lay out and format the output of estimation commands (outreg2, esttab, estout, etc.). Although I personally don't use them, from the comments seen on this forum, it appears that they can meet most users' needs, though sometimes they fall short or require awkward workarounds for special situations.
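    As a rough sketch of what these user-written commands look like in use, assuming the estout package (which provides esttab) has been installed from SSC; the model names and the output file name results.rtf are hypothetical:
    Code:
    * One-time installation of the user-written package (assumption: SSC is reachable)
    * ssc install estout
    sysuse auto, clear
    regress price mpg
    estimates store m1
    regress price mpg weight
    estimates store m2
    * Write both models side by side, with standard errors, to a file a word processor can open
    esttab m1 m2 using results.rtf, se replace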


  • Navid Asgari
    replied
    Easier export of output to Word would be the best thing Stata could do.
    There are a number of commands to make this easier. But, I hope one day we can export our results more easily through simple commands/menus. Stata can have a few templates for export and styling of the tables according to the formats common among journals. Of course, there are too many possible formats and styles, but perhaps Stata can cover the most common ones.


  • Sergio Correia
    replied
    To illustrate that they work the same, below is a do-file that does a merge where some obs. only exist in master, some only in using, and some in both. Notice how the datasets match in this case:

    Code:
    * Create using dataset
    clear
    set obs 5
    gen foreign = _n
    gen x = runiform()
    tempfile using
    save "`using'"
    
    * Load master dataset
    sysuse auto, clear
    sort price
    
    * Merge and sort the usual way
    merge m:1 foreign using "`using'"
    sort price
    datasignature
    local sig1 = r(datasignature)
    
    * Install sortpreserve
    net from https://raw.githubusercontent.com/sergiocorreia/stata-misc/master/
    net install sortpreserve
    
    * Alternative merge
    sysuse auto, clear
    sort price
    
    sortpreserve: merge m:1 foreign using "`using'"
    
    * Verify datasets match
    datasignature
    assert r(datasignature)=="`sig1'"
    
    exit
    Note: Things are different in the (quite rare) -match update- and -match conflict- cases. In those cases, I would just do it the normal way.


  • Sergio Correia
    replied
    Originally posted by Clyde Schechter

    How would that actually work? If there are observations in the -using- data set that are not matched, where do they go? What does it mean to "preserve" the sort order when the dataset itself is different?
    They would go at the end, which is i) already what sortpreserve does in this particular case, and ii) consistent with what would happen if you were to sort again by the initial sort variables (since the sort variables would be missing for the part of the dataset coming from -using-).
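    Point ii) can be seen without any add-on by saving the original order in a variable and sorting on it after the merge. A minimal sketch; the tempfile name extra and the variable orig_order are hypothetical:
    Code:
    * Small using dataset keyed on foreign; the value 2 has no match in auto
    clear
    set obs 3
    generate foreign = _n - 1
    generate x = runiform()
    tempfile extra
    save "`extra'"

    * Master data in a known order, with that order recorded
    sysuse auto, clear
    sort price
    generate long orig_order = _n

    * Merge, then restore the recorded order
    merge m:1 foreign using "`extra'"
    sort orig_order

    * The using-only observation has missing orig_order, so it sorts last
    list price foreign orig_order _merge in -3/l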



  • Clyde Schechter
    replied
    Speaking of -merge-, could we have a -sortpreserve- option? Most of the time I do merge, it changes the sort order of the data, which I then have to undo afterwards.
    How would that actually work? If there are observations in the -using- data set that are not matched, where do they go? What does it mean to "preserve" the sort order when the dataset itself is different?
