Wishlist for Stata 18

Lars Egner replied

10 Jan 2023, 01:55
Surely someone else has mentioned this, but I would like to join in because I believe it is an important feature: I would really like full information maximum likelihood (FIML) methods for missing data to be implemented in the reg/logistic/nbreg family of commands. Doing regression modelling "through" the sem/gsem commands is not ideal and does not work for many of the more finicky analyses.

Implementation of "#" and "##" in sem is a close second, including interpreting variable "X" as one variable in "estat teffects" if written as X##X.
Chen Samulsion replied

08 Jan 2023, 08:48
Roger Newson's -xcollapse- and -xcontract- resolve my problem in #559. The two commands are extensions of collapse and contract, respectively. The output data set created by xcollapse and xcontract may be listed to the Stata log, saved to a disk file, or written to memory (overwriting any pre-existing data set).
Julian Reif replied

06 Jan 2023, 07:59
Regarding #567, another option is to use autorename, a user-written command:

Code:

ssc install autorename, replace
sysuse auto, clear
autorename
Ali Atia replied

06 Jan 2023, 07:49
Thanks for sharing. Renvars is a user-written command -- I hope the functionality is implemented natively moving forward.
Igor Paploski replied

05 Jan 2023, 14:47
Regarding #567, I usually use the commands below for this sort of issue:

Code:

ds
capture renvars `r(varlist)', map(strtoname(@[1]))
drop in 1
William Lisowski replied

04 Jan 2023, 16:21
Regarding #567, it's important to note that import delimited includes the varnames() option akin to the firstrow option in import excel.
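As a minimal illustration of that option (the file name mydata.csv is hypothetical, and the header is assumed to sit in row 1 of the file):

```stata
* Read a delimited file, taking variable names from row 1.
* "mydata.csv" is a placeholder file name, not one from this thread.
import delimited using "mydata.csv", varnames(1) clear
```

This only helps when the names are available at import time; it does not cover data that are already in memory.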
Ali Atia replied

04 Jan 2023, 15:12
A single command to promote first row values to variable names would be very helpful. There is an option for this in the import excel command, but Excel imports are not the only instance where this is relevant. Currently, my workaround is to write a simple loop through all variables to rename them with their first value, and place that into a user defined program.

However, it would be nice to have a native Stata command for what seems like basic data-cleaning functionality.
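A loop of the kind described above might look roughly like the following sketch. It assumes every variable is a string variable whose first observation holds the intended name; strtoname() is used to turn each value into a legal Stata name, and renames that fail (for example because of a duplicate name) are simply skipped.

```stata
* Sketch only: promote first-row values to variable names.
* Assumes all variables are strings; numeric variables would need tostring first.
foreach v of varlist _all {
    local newname = strtoname(`v'[1])   // sanitize the first-row value
    capture rename `v' `newname'        // skip on failure (e.g. duplicate name)
}
drop in 1                               // remove the row that held the names
```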

Last edited by Ali Atia; 04 Jan 2023, 15:15.
Jared Greathouse replied

04 Jan 2023, 09:20
wbuchanan wrote: "What use cases are you aware of that would require that large a number of frames to be stored concurrently in memory?"

FernandoRios wrote: "I think your request for more than 100 frames is due to one of your latest projects, am I right?"

Quite right, FernandoRios and wbuchanan. In my edits to scul, I now use frames to store what I call the "testframe", i.e., the reshaped panel dataset where the time variable is the unique ID and the outcomes of each unit are the columns. I originally planned to have one frame for each unit if we were doing in-space placebo studies, and then estimate the model that way, iteratively switching betwixt each frame until all placebos are estimated. But then I realized that we can only have 100 frames at once. Suppose we had 101 units in total... then the frame approach won't work. But, as usual, Mata seems to solve all woes, and I suspect at some point I will integrate this fully into scul.
Chen Samulsion replied

04 Jan 2023, 07:11
Dear wbuchanan, thank you. I only want to be able to use -restore- following the command, because -collapse- and -contract- change the entire structure of the data; in this sense they are not as simple as -generate-. In my experience, -preserve- and -restore- are mostly used with -collapse- and -contract-. Maybe this reason is not enough to persuade most Stata users. Stata will and should stay as laconic as possible. Entities should not be multiplied unnecessarily.
FernandoRios replied

04 Jan 2023, 06:37
Hi Jared Greathouse,
I think your request for more than 100 frames is due to one of your latest projects, am I right?
Something you may consider is not to create frames at all, but to use Mata to store the data you need, as if they were frames. There would be no limit (other than memory) on how many instances of your datasets you could store.
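One possible way to sketch this idea, assuming an associative array keyed by unit number is an acceptable stand-in for frames (the name store, the auto dataset, and the choice of variables are illustrative only):

```stata
sysuse auto, clear
mata:
// Sketch only: hold many copies of a data matrix in Mata instead of frames.
store = asarray_create("real", 1)                    // keys are unit numbers
for (i = 1; i <= 101; i++) {
    asarray(store, i, st_data(., ("price", "mpg")))  // copy data for unit i
}
asarray_elements(store)                              // number of stored matrices
end
```

Here the same two columns are copied 101 times purely to show that the count is not capped at 100; in practice each key would hold a different unit's data.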
Fernando
wbuchanan replied

04 Jan 2023, 06:37
Chen Samulsion in your example:

Code:

collapse, force preserve

What would the behavior be? Would the collapse command just return the results to the result window, would it still have the normal behavior that could be reversed using -restore-, or would it be something else entirely? If you only want to be able to use -restore- following the command, why isn't the current -preserve- command sufficient? Lastly, if someone issues the -preserve- command before/after your example command, how would the two preservations be resolved? Just trying to get a better idea of the use case and whether there might be other ways to solve the problem you are trying to address.
wbuchanan replied

04 Jan 2023, 06:09
Jared Greathouse the number of frames increasing isn't always going to be a software limitation issue. At some point there won't be sufficient RAM on the machine to initialize additional frames. What use cases are you aware of that would require that large a number of frames to be stored concurrently in memory? I imagine it is much easier for the developers to figure out how to prioritize requests when they have some use case explaining the reason for the request (and I am absolutely guilty of pitching ideas/requests without supplying use cases as well).
Jared Greathouse replied

03 Jan 2023, 20:36
I know there are software limitations, but maybe more frames in memory at once? At the moment there's 100. Cool. Why not 1000?
Clyde Schechter replied

03 Jan 2023, 17:20
Re #559: How far would you go with this? Would you want -gen newvar = whatever- to have a -preserve- option? Would it matter if newvar is a tempvar? What about -keep if !missing(x1, x2, x3)-?

We all write commands like this all the time. Sometimes we will feel the need to preserve the data before we do them, but most of the time we won't. So where would you draw the line here? And what do you see as the advantage of having a preserve option in the command instead of writing preserve on a separate line? It's the same number of keystrokes either way, no?
Chen Samulsion replied

03 Jan 2023, 16:32
When a command will obviously change the dataset structure, add a 'preserve' option. For example,

Code:

collapse, force preserve