Wishlist for Stata 17

Nick Cox replied

30 Sep 2020, 05:01
John Mullahy: I don't think StataCorp will ever allow if (or in) qualifiers within options, and FWIW I don't support that idea.

sepscatter from SSC -- see https://www.statalist.org/forums/forum/general-stata-discussion/general/3803-sepscatter-available-from-ssc -- helps with some of these problems but does not improve on the first syntax you cite.

John Mullahy replied

29 Sep 2020, 12:54

Right now one can do this:

Code:

twoway (scatter y x if x<5, mcol(blue) msym(d)) (scatter y x if x>=5, mcol(orange) msym(o))

Something like this would be nice:

Code:

twoway scatter y x, mcol(blue if x<5, orange if x>=5) msym(d if x<5, o if x>=5)

Joro Kolev replied

28 Sep 2020, 05:26
Substituting a nonlinear, numerical-optimisation estimator for a linear closed-form estimator does not always work, Sebastian.

This is illustrated in this thread
https://www.statalist.org/forums/for...-is-this-a-bug
to which I never received a satisfactory answer. Stata Technical Support replied:

"-sureg- and -nlsur- use different algorithms for the underlying
computations, and so with borderline cases, for example with models that
are only weakly identified, results may not necessarily be the same. In
the case of your example, it looks like -nlsur- (with default
optimization settings) has problems fitting all the parameters as can be
seen from the missing standard error of the intercept from the first
equation. However, with well identified cases, they should generally
yield equivalent results:" (they proceed with an example where with strong identification, you get the same results).

Originally posted by Sebastian Kripfganz

With the standard weighting matrix, the onestep estimator is simply the 2SLS estimator. I believe the situations where someone needs a different weighting matrix are rather rare. You can always use the gmm command to achieve this task.
Sebastian Kripfganz replied

28 Sep 2020, 04:30
Originally posted by Joro Kolev

Can the linear GMM estimator in -ivregress gmm- be made a bit more flexible? E.g., it would be nice if it:

1. Has -onestep- option.
2. Has the option to calculate the estimator at a user defined weighting matrix.
3. Has the option of whether the first step is done with the identity weighting matrix, invsym(Z'Z), or any user specified matrix.

With the standard weighting matrix, the onestep estimator is simply the 2SLS estimator. I believe the situations where someone needs a different weighting matrix are rather rare. You can always use the gmm command to achieve this task.
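
A minimal sketch of the gmm route, under stated assumptions: the hsng2 dataset and the choice of variables are borrowed from the Stata manuals purely for illustration and are not part of this thread. With the onestep option and the default initial weight matrix, the gmm point estimates should match those from ivregress 2sls; standard errors may differ depending on the vce() in effect. The second call uses winitial(identity) to show a different first-step weighting matrix. As far as I recall, winitial() also accepts the name of a user-supplied matrix, but check the gmm help before relying on that.

```stata
* Illustrative data from the Stata manuals (an assumption, not from the thread)
webuse hsng2, clear

* 2SLS benchmark
ivregress 2sls rent pcturban (hsngval = faminc i.region)

* One-step linear GMM with the default (unadjusted) initial weight matrix
gmm (rent - {xb: hsngval pcturban} - {b0}), ///
    instruments(pcturban faminc i.region) onestep

* One-step GMM evaluated at the identity weight matrix instead
gmm (rent - {xb: hsngval pcturban} - {b0}), ///
    instruments(pcturban faminc i.region) onestep winitial(identity)
```

Because of the /// line continuations, this is meant to be run from a do-file rather than typed interactively.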
Joro Kolev replied

26 Sep 2020, 11:28
Can the linear GMM estimator in -ivregress gmm- be made a bit more flexible? E.g., it would be nice if it:

1. Has -onestep- option.
2. Has the option to calculate the estimator at a user defined weighting matrix.
3. Has the option of whether the first step is done with the identity weighting matrix, invsym(Z'Z), or any user specified matrix.
Adrien Bouguen replied

25 Sep 2020, 16:49
Two annoying things:

- When I run a complex command that takes some time to run, the do-file associated with the output window is frozen. Yet when I open another do-file, I can write in it without restriction. Could Stata 17 make sure that the do-file never freezes, so that I can continue working while my command runs? Since I am able to open a new do-file, I imagine that this is not a computing limitation. Right now I often save my do-file before running a computationally demanding command and then open the same do-file in another Stata window, which complicates my life and can be a source of errors (like erasing part of my code).

- When I try to open an ado-file with the Do-file Editor, Stata does not simply open the file; it treats it as if it were a do-file, which generates an error. It is annoying.
Bert Lloyd replied

24 Sep 2020, 11:30
A request for putdocx and some of the other reproducible research commands:

I wish there were an easy way to tell putdocx "please put whatever the screen output of the following command would be", even if only for simple commands like summarize, codebook, etc. I can usually approximate using putdocx table and the like, but it seems like a lot more effort and my results are (usually) not as nice as what Stata already gives us.

So, something like

putdocx outputof summarize mpg, detail
Sebastian H replied

21 Sep 2020, 03:21
Further expansion of discrete choice capabilities:
- ability to obtain conditional parameters from panel mixed logit models (similar to the mixlogit command from Arne R. Hole)
- ability to use Sobol and MLH draws when estimating panel mixed logit models
- allow integration of latent variable (SEM) in panel mixed logit models
- panel latent class logit model (such as lclogit2) including the possibility to impose class-specific constraints
- hierarchical Bayes estimation of panel mixed logit models
- ability to specify utility function for panel mixed logit models to allow for complex models (e.g. integration of scale heterogeneity)
Clyde Schechter replied

17 Sep 2020, 20:37
Now that we have frames in Stata 16, it would be nice to be able to do a few more things with them. For example, it would be nice if the -saving()- option of any command that has such an option could be expanded to "save" the results in a new frame rather than writing to mass storage. And it would be nice to be able to do -joinby- with a frame. (Even better would be if -rangejoin- worked with a frame as its -using-, though I realize that's not even an official Stata command.)
FernandoRios replied

17 Sep 2020, 12:15
Originally posted by shem shen

I wish xtile and by could be compatible

Hi Shem,
While not exactly what you ask for, there is an egen function extension (ssc install egenmore) that allows you to do something like what you describe:

Code:

sysuse auto
egen mpg5 = xtile(mpg), by(foreign) nq(5)
tab mpg5 foreign
shem shen replied

17 Sep 2020, 10:46
I wish xtile and by could be compatible
John Mullahy replied

17 Sep 2020, 07:06
While with a bit of effort it is possible to trick twoway into doing this, it would be valuable if it could have easily implemented options to produce scatter (and other) graphs where the displayed data can appear in any of four quadrants around specified origins (or x- and y-axis choices) that define the center of the visualization. E.g. familiar pictures of a Cartesian plane centered on (0,0).
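
One way to approximate this today, offered only as a sketch: draw reference lines through the chosen origin with xline() and yline(), and force symmetric axis ranges so that all four quadrants are visible. The simulated data, variable names, and the range of -4 to 4 are illustrative assumptions.

```stata
* Simulated data centred near (0,0); variable names are illustrative
clear
set seed 12345
set obs 100
generate x = rnormal()
generate y = rnormal()

* Reference lines through the origin plus symmetric axis ranges
twoway scatter y x, xline(0) yline(0)          ///
    xscale(range(-4 4)) yscale(range(-4 4))    ///
    xlabel(-4(2)4) ylabel(-4(2)4) aspectratio(1)
```

The axes themselves still sit at the edges of the plot region, which is why a built-in option for axes crossing at a specified origin would still be welcome.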
Fahad Mirza replied

17 Sep 2020, 00:20
It would be nice to have Stata beautify its visual output with a modern default scheme and a diverse color palette and tone range. Another thing I would love to see is Stata making the changes we make in the Graph Editor available as code once we close the window, so that we can save and reuse them later. Similarly, I am not sure whether there are good 3D graph options, but it would be nice to see those as well.
Nick Cox replied

15 Sep 2020, 07:20
#343 Why hsmode reports mode 3 for 5 instances of 3 and 5 instances of 4 is discussed in the help under Ties.
Nick Cox replied

15 Sep 2020, 06:03
hsmode gives you its best guess at the mode.

There is a good reason why Stata doesn't report a mode as a standard summary. What would be a standard summary?

Get 3 statistically-minded people in a room and ask how to determine the mode, and you will get about 12 answers.

1. The mode is the single most common value in a dataset.

2. But if there are ties for that, it is not well determined, so you can't say what it is.

3. No, if there are ties for that you have bi- or multimodality and should report that.

4. No, if there are ties for that, nevertheless sometimes there is an easy answer: you can just average equally common values if they are adjacent.

5. No, in practice looking at frequencies of reported values is a poor method, especially with measured data, so you need to bin data first before looking at a histogram and choosing a modal class.

6. A mode is in practice a pronounced peak. If the leading mode is 10 times more frequent than the second mode, no-one experienced calls that bimodal. (Discuss!) But all that implies is this is a judgment call and not suitable for automated reporting.

7. All these methods are superseded for measured data by density estimation, including kernel density estimation. But best practice is to report those estimates and comment informally on modes if they are evident.

8. Density estimation allows a report of the mode. But it is contingent on which kernel and which bandwidth you use and on any other decisions made before or after density estimation.

9. Yet another approach is to use hsmode which follows a precise algorithm to give an estimate. It doesn't seem well known or widely used, but it is well defined.

I said 12 answers and I am naturally being a little flippant, but I doubt that I have remembered or know all the possible answers here.

A case in point is data 1, 2, 3, 4, 5 where every person who has passed a first course should be able to say that the mean and median are 3, but what is the mode?

I respect any answer that the mode is indeterminate here, so a Stata command should return missing, but I also respect the answer of hsmode (which returns 3!).

Underlying all this is an idea that equal frequencies for most common values are usually a quirk of small samples. Given more data, most distributions turn out to be unimodal with a definite peak. (Discuss!)
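
To make the tie problem concrete, here is a small sketch using the official egen function mode() on the data 1, 2, 3, 4, 5. As I understand the documentation, mode() returns missing (with a warning) when there are ties and no tie-breaking option is given, while minmode and maxmode pick the lowest and highest tied value. The variable names are illustrative, and the hsmode line is commented out because it requires installation from SSC.

```stata
* The example data from the post: every value occurs exactly once
clear
input x
1
2
3
4
5
end

* Frequencies, to see the five-way tie
tabulate x

* Official egen mode(): ties with no tie-breaking rule
egen m_default = mode(x)

* Explicit tie-breaking rules
egen m_min = mode(x), minmode
egen m_max = mode(x), maxmode

list x m_default m_min m_max

* Half-sample mode, if installed (ssc install hsmode)
* hsmode x
```

Each rule gives a different but defensible answer for the same five numbers.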