Wishlist for Stata 18

Chris Stefancik replied

01 Apr 2022, 18:40
How about a merge syntax that lets us specify which variable in the master dataset links to which variable in the using dataset, instead of requiring both variables to have the same name?

Code:

merge 1:1 patientid == patient_num using "demo_data.dta", keep(master match)
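
For reference, the workaround that this proposed syntax would replace can be sketched as follows. The toy datasets are hypothetical stand-ins for the master data and for demo_data.dta; only the variable names patientid and patient_num come from the post. Run it as a single do-file so the temporary file names stay defined.

```stata
* Hypothetical stand-in for demo_data.dta (key is called patient_num)
clear
input patient_num age
1 34
2 51
4 47
end
tempfile demo_data
save `demo_data'

* Hypothetical master dataset (key is called patientid)
clear
input patientid sbp
1 120
2 135
3 110
end

* Workaround: rename the key in a temporary copy of the using data
preserve
use `demo_data', clear
rename patient_num patientid
tempfile demo_renamed
save `demo_renamed'
restore

* Now both datasets share the key name, so the usual syntax works
merge 1:1 patientid using `demo_renamed', keep(master match)
```

The extra preserve, rename, save, restore steps are exactly the overhead the requested syntax would remove.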
3 likes
Phil Bromiley replied

01 Apr 2022, 09:52
1. I often find myself re-running the same section of code when I'm debugging. If it is a large section of code, this can be inconvenient. Being able to set something like bookmarks and run the code between two bookmarks would be nice.
2. While the eregress estimators are great, it would be nice if we had something similar, where possible, for fixed effects.
3. xtregar - could it be extended to (i) handle more than one lag on the serial correlation, (ii) present robust standard errors, or (iii) allow for endogeneity?
Bert Lloyd replied

31 Mar 2022, 09:00
Options for additional "flavors" of Markdown in dyndoc and related commands (e.g., Pandoc Markdown).

Syntax highlighting for dyndoc (i.e. mix of Stata and markdown highlighting) in the do-file editor.
Bruce Weaver replied

31 Mar 2022, 07:02
AFAIK, -predict- still has no option to save (raw) deleted residuals (a.k.a. PRESS residuals) following -regress-. I would like to see that option added. For that matter, an option to report PRESS (the prediction sum of squares) would be nice! At the moment, I think one has to roll one's own using methods like those shown in the Stack Overflow thread below.
https://en.wikipedia.org/wiki/PRESS_statistic

https://online.stat.psu.edu/stat501/lesson/10/10.5

https://stackoverflow.com/questions/...-automatically

Cheers,
Bruce
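
A minimal sketch of rolling one's own, assuming the standard identity that the deleted (PRESS) residual equals the ordinary residual divided by one minus the leverage. The auto dataset and the model are only illustrative, and the variable names e, h, press_resid, and press_sq are made up for this example.

```stata
sysuse auto, clear
regress price mpg weight

* Ordinary residuals and leverage (diagonal of the hat matrix)
predict double e if e(sample), residuals
predict double h if e(sample), hat

* Deleted (PRESS) residual: e_i / (1 - h_ii)
generate double press_resid = e / (1 - h)

* PRESS statistic: sum of squared deleted residuals
generate double press_sq = press_resid^2
quietly summarize press_sq
display "PRESS = " r(sum)
```

Because the leverage lies between 0 and 1, each deleted residual is at least as large in absolute value as the ordinary residual, so PRESS should never fall below the residual sum of squares in e(rss).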
1 like
wbuchanan replied

31 Mar 2022, 06:36
William Lisowski
I like your suggestion for the notational convention. In terms of frequency, effects coding uses negative integers to define levels of factors, and in difference-in-differences models that use relative time, negative integers are the only useful way to encode the timing information in a way that makes sense (e.g., positive integers identify relative time points following the implementation of the intervention and negative integers identify the periods prior to it). I don't think the burden argument is much of an issue, since there is currently a burden to develop alternative coding schemes that do not preserve information in the same way that negative integers do.
Nick Cox replied

31 Mar 2022, 01:53
#317 George Ford semipar is community-contributed, so feature requests should go directly to the authors.
PAN XINFENG replied

30 Mar 2022, 22:43
I hope that when I right-click a command in the History window and choose an option, I can see that command and its result in the Results window.
Richard Williams replied

30 Mar 2022, 19:57
Originally posted by John Mullahy

This recent thread https://www.statalist.org/forums/for...-a-tobit-model raises again the issue of the inability of suest to combine the results of multiple margins commands.

I imagine that if it was easy for Stata developers to allow this they would have already done so. However, the inability of suest to handle the "nonstandard" VCEs produced by margins (as well as by some other commands) somewhat diminishes its value.

Might the v18 developers consider whether suest (or possibly estimates store) could be modified to accommodate results from margins?

This recent paper by Mize, Doan, and Long demonstrates a rather ingenious approach for doing suest with margins (actually, instead of suest, you use gsem):

https://journals.sagepub.com/doi/ful...81175019852763

The coding is a little trickier than I would like, but Mize links to the code and data used at

https://www.trentonmize.com/research

Having said that, yes, it would be nice if suest made this nice and easy.
2 likes
Jared Greathouse replied

30 Mar 2022, 19:05
I know there are user-written commands for this, but two things would be RIDICULOUSLY helpful on the spatial front. One is being able to reproject coordinates from a shapefile. Sometimes they're in Mercator projection, and while I don't mind going to Python's geopandas from within Stata, others may not have Python or want to go learning Python for it.

So reprojection would be awesome. Another thing that would be awesome on a related front is point-in-polygon merges. Presumably the infrastructure to do these (at least the second one) already exists, but it would be nice to have it extended.
John Mullahy replied

30 Mar 2022, 15:29
This recent thread https://www.statalist.org/forums/for...-a-tobit-model raises again the issue of the inability of suest to combine the results of multiple margins commands.

I imagine that if it was easy for Stata developers to allow this they would have already done so. However, the inability of suest to handle the "nonstandard" VCEs produced by margins (as well as by some other commands) somewhat diminishes its value.

Might the v18 developers consider whether suest (or possibly estimates store) could be modified to accommodate results from margins?
William Lisowski replied

30 Mar 2022, 13:00
It could still be easily addressed with a notational convention like -(1.rep78) conveying the negative value of 1.rep78 while -1.rep78 is used to identify the value of rep78 assigned to -1.

We all have peculiarities in our data that we have to program around, and I'm not in favor of introducing a special case into the construction of expressions.

My preference is to not expand factor variables notation to allow negative values.

If they were to be allowed, I would require that the "i" be explicitly included when selecting a negative value, e.g. i-2.fvar, and to continue to treat -2.fvar as -1 * 2.fvar.

That places the burden on the user taking advantage of the (infrequently needed) capability for negative values to remember that they do not have the convenience of omitting the optional i in those cases. The alternative places the burden on the naive user to avoid the well-camouflaged trap where -2.fvar != -1 * 2.fvar in the (frequently seen) case where fvar takes only nonnegative values.

Last edited by William Lisowski; 30 Mar 2022, 13:03.
3 likes
wbuchanan replied

30 Mar 2022, 11:47
daniel klein
It could still be easily addressed with a notational convention like -(1.rep78) conveying the negative value of 1.rep78 while -1.rep78 is used to identify the value of rep78 assigned to -1. Then the two conditions could be combined in an unambiguous way -(-1.rep78) would be the negated value of -1.rep78. Forcing all of the values to be strictly >= 0 is unnecessary.
daniel klein replied

30 Mar 2022, 10:54
Originally posted by wbuchanan

I can't believe I forgot this previously, but go back to allowing factor variables to take negative values. [...]

From https://www.statalist.org/forums/forum/general-stata-discussion/general/1643719-important-bug-using-differencing-and-interactions

Originally posted by Enrique Pinzon (StataCorp)

Negative values are not allowed with 'i.'. Stata cannot support negative values in factor variables because the expanded list of indicator variables for i.fvar

0.fvar
1.fvar
2.fvar
3.fvar

are valid variables you can put in Stata expressions, such as

gen mpg_minus_1rep78 = mpg - 1.rep78

Suppose negative values were allowed. Then

-1.rep78

would be ambiguous, because it might mean (1) the negative of 1.rep78, or mean (2) the indicator for when rep78 takes on the value -1. Since negative values are not allowed, the meaning is unambiguously (1).
2 likes
wbuchanan replied

30 Mar 2022, 10:21
I can't believe I forgot this previously, but go back to allowing factor variables to take negative values. Since the factor variable notation is only creating indicators for the distinct values, the assumption of values >= 0 is unnecessary. There are quite a few use cases where negative integers actually preserve the ordered nature of values and support natural meanings. For example, in the education sector, coding pre-school (-1), kindergarten (0), and the other grades as positive integers both preserves the order in which the grade levels take place over the lifespan and allows for natural mapping of meanings to the majority of the coding (e.g., 1 = 1st grade, 2 = 2nd grade, etc.). I imagine that one potential reason for the restriction is related to identifying base/reference levels, but it seems like it should be easy to add some notational convention to distinguish between positive and negative values that could be used in the factor variable notation to appropriately identify the desired reference level.
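
As a point of comparison, the shift-and-label workaround currently needed for the grade-level example can be sketched like this. The variable names grade, score, and grade_fv and the toy data are hypothetical; only the coding scheme (-1 = pre-school, 0 = kindergarten, 1 = 1st grade, ...) comes from the post.

```stata
* Hypothetical toy data: grade coded -1 (pre-school), 0 (kindergarten), 1, 2
clear
input grade score
-1 40
-1 44
0 50
0 53
1 61
1 58
2 70
2 66
end

* i.grade is not allowed here because grade contains a negative value,
* so shift the codes to start at 0 and carry the meaning in value labels
generate byte grade_fv = grade + 1
label define grade_fv 0 "Pre-school" 1 "Kindergarten" 2 "1st grade" 3 "2nd grade"
label values grade_fv grade_fv

regress score i.grade_fv
```

The order of the levels survives the shift, but the natural mapping (1 = 1st grade) now lives only in the value labels, which is the loss of information the post describes.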
Greg Petroski replied

26 Mar 2022, 10:20
Ramsay & Silverman's approach to functional data analysis. Functional ANOVA.