Wishlist for Stata 18

Rasool Baloch

Join Date: Nov 2016

Posts: 59
#121

27 Sep 2021, 05:09

Dear Statalist,

I would like a form-based data entry feature, like the data form in Excel, rather than entering data row by row. It would make entry easier and reduce the chance of errors.

Best regards,
Rasool Bux

1 like
Comment
Christopher Bratt

Join Date: May 2019

Posts: 144
#122

27 Sep 2021, 09:23

William Lisowski Thanks for the clarification.

(Personally, I’d say it’s interactive use when I try out a few lines of code and happen to move this code to a do file in Stata or a window in an external editor, with or without saving the code. This is not the first time Stata’s terminology is different from what I would think is intuitive, but I agree: Avoiding confusion is important. So thanks for correcting me.)
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#123

27 Sep 2021, 12:12

Rasool Baloch With the understanding that I don't do very much hand entry of data in my particular work (and when I do, I don't use Stata for it), why not just enter the data in Excel and then use the -import excel- command to bring it into Stata? I worry about bulking up a program with things that are readily accomplished with other tools. Everything you add to a program risks introducing bugs, even inducing bugs in other parts of that program. My own philosophy is that programs should focus on one area and do that really well, and have interoperability with other tools that perform related but different tasks.
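As a minimal sketch of that workflow: the example below writes a tiny workbook and reads it straight back, so it runs without any existing file. The file name entry_demo.xlsx and the variables id and age are hypothetical stand-ins for a sheet typed up in Excel.

```stata
* Round-trip sketch; entry_demo.xlsx is a hypothetical file name.
clear
input id age
1 34
2 51
end

* Stand-in for a workbook that was filled in by hand in Excel
export excel using "entry_demo.xlsx", firstrow(variables) replace

* Read it back, treating the first row as variable names
import excel using "entry_demo.xlsx", firstrow clear
list
```

With a real workbook only the import excel line is needed, possibly with the sheet() or cellrange() options to pick out the part that was entered.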
4 likes
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35724
#124

27 Sep 2021, 19:30

#113 Jack Edmondson This is a thoughtful, interesting post.

I would like to suggest that StataCorp reconsider how it abstracts and utilizes different commands and functionalities from a computer science perspective.

(emphasis added)

I am reminded of a very early users' meeting I attended -- perhaps 1995 -- in which I was enthusing about Stata's programmability: if Stata doesn't provide a command, you can program your own! The now senior biostatistician I was talking to riposted: I am a statistician. I don't want to write programs. I just want to do statistics.

Naturally we were both right. But quite often I see the line that Stata is a programming language and therefore should align with specific concepts, even fashions, in programming and computer science (usually in those languages that are the speakers' or writers' own favourites). Again, the only short answer is Yes and No. The programmability of Stata is crucial -- including crucial to users who will never write a new Stata program in the strict sense but who will benefit mightily if other users and Stata's developers do. But what most users want is, in the first instance, a language for doing statistics (including graphics and data management).

What I mean by this is that there are features that have been added long ago that, if one were to redesign today, would likely never implement them in the way they currently exist in Stata, or that there are functionalities that other software/languages have which Stata lacks.

That's often true. The question is quite what to do about it. The existence of old features is sometimes an embarrassment and a bit of an awkward mess. But the long-term stability of Stata is a long-term feature that helps promote long-term commitment too, and in users who have been around say 3 years as well as those who have been around say 30 years.

sort and gsort: Yes, that is a small mess that the company might tidy up. I don't think it is a real problem.

Other examples include the distinction between gen and egen

Indeed. This is a classic. egen was written many years ago as a stop-gap. It's an arbitrary rag-bag. If egen didn't exist it wouldn't be reinvented now, and that's been true for many years. To declare a stake, I was first author of several of the functions in official egen and was first author of many of the functions for it in the community. I am proud of some of them and embarrassed by that work too. It's not a modern way to provide extra functionality. I don't write new egen functions any more. Neither typically do StataCorp.

But look at what the community thinks, including new users. See for example the little threads on Twitter on favourite commands. Very often, egen gets a highly positive rating. Look at this stuff! There are so many useful tools here that solve lots of problems. The typical user doesn't know or care how egen is implemented and whether it meets some programming criteria. (That's not meant disrespectfully: outside Stata I often neither know nor care how programs I use are written. I might notice if one is slow or awkward to use.) Also, egen is very often featured on Statalist as a simple, often excellent way to do things. It's not going to go away quickly.

one is unable to use the merge command to join two datasets together using variable columns that have different names, a feature present in many other alternatives to Stata.

That's an interesting one. I don't know enough (i.e. more than zero) to comment on other software that allows this. If enough people want this, it may come. I have not seen this request before. I fear that it could confuse new users quite as much as it would benefit people who want it.
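For context, the usual workaround today is to rename the key in one dataset before merging. A minimal sketch, with made-up variable names (person_id, pid, wage, city) and toy data:

```stata
* Using dataset: the key is called pid
clear
input pid wage
1 10
2 12
3 15
end
tempfile wages
save `wages'

* Master dataset: the same key is called person_id
clear
input person_id str5 city
1 "Oslo"
2 "Rome"
4 "Lima"
end

* Align the key names, then merge as usual
rename person_id pid
merge 1:1 pid using `wages'
tabulate _merge
```

The rename is one extra line, which may be part of why the request has not come up often.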

These examples may not seem like a big concern, but nevertheless their improvement would represent quality of life increases over the long-run. Indeed, in my experience, it is the little distinctions of things like this that frustrate new users of Stata the most. Refining the functionality of already existing commands in Stata to better and more natural levels of abstraction would certainly be an improvement to its long-term accessibility (even though I'm sure a few veterans might grumble about some changes to core functionalities on account of having to re-learn something they knew for years, you could suppose).

As an older person, I read that last remark on veterans as a little inappropriate. Excuse us for hanging around! Still, I hope to retain a sense of humour.

The balance is difficult. I am personally in favour of undocumenting more things than StataCorp seem willing to do. I would undocument date() rather than daily(), as under the first name too many users don't believe what is clearly documented, that it yields daily dates, and only daily dates. I would undocument Stata's weekly dates, which I have never heard of being used for real. But personal peeves don't always translate into what will really benefit the community.
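A small illustration of the date() and daily() point, using an arbitrary date string: the two names return the same daily date, a count of days that needs a %td format to be readable.

```stata
* date() and daily() are two names for the same function
display date("27sep2021", "DMY")
display daily("27sep2021", "DMY")

* The result is a daily date; show it with a daily display format
display %td daily("27sep2021", "DMY")
```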

That is because users are a community and there are silent majorities that need consideration. Tidy up too much, and all of a sudden many do files and programs and many course materials and textbooks become much harder to understand. One of the strengths of Stata is how rarely -- compared with the alternatives -- one has to explain old features that aren't current.

I don't think StataCorp needs to throw everything away, of course, but certainly some introspection on how core features might benefit from small changes or minor improvements would be very welcome.

Agreed.
6 likes
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35724
#125

28 Sep 2021, 06:26

Just as a sequel to the previous.

Stata is newer than say SPSS or SAS but it's been around long enough for any active user to notice all sorts of awkwardness, inconsistencies and redundancies. StataCorp in my experience are shifting increasingly to a three-way model of focusing on big changes -- especially those which users would be quite unable or unwilling to program (e.g. frames, the new table) -- letting user-programmers do what they do best, say adding specialised statistical or graphical functionality -- and taking a long view about existing official code in between.

The point is perhaps made best through little stories rather than a lengthy sermon.

Originally rename could rename one variable at a time. Users often wanted more and there were answers to that need, such as writing a loop. Some community-contributed commands emerged (which inside were just: a loop). The comment to StataCorp was that rename should be more versatile! and the reply was in effect We agree and it's on our list. Meanwhile have you looked at the community commands? It took a while but then Bill Gould saw how he wanted to do it and did it and there we are and the community-contributed commands are just history (they still work). This isn't that different from many of us having a desire to write a paper on some topic but waiting until the time and inclination to do it are coupled with getting some ideas on how to do it well.
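To make the rename story concrete, here is a sketch contrasting the old loop with the group syntax that rename now supports. The variable names (inc1, old_inc1, income1 and so on) are invented for the example.

```stata
clear
set obs 1
generate inc1 = 1
generate inc2 = 2
generate inc3 = 3

* The old answer: loop over the variables one at a time
foreach v of varlist inc1-inc3 {
    rename `v' old_`v'
}

* The current rename handles groups directly, with wildcards ...
rename old_inc* income*

* ... or with parallel lists of old and new names
rename (income1 income2) (first second)
describe
```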

For a long time (20 years or more) there have been grumbles about tables. So many commands, very confusing to know which to choose. StataCorp say: We are very interested in improving support in that area. (There are many troubles here. One is: I want a table or even I want a simple table sounds easy, but so do all these other guys and their tables are also simple in their view, but usually different.) Come Stata 17 and a new table command and a large fraction of the reaction here is Hey, you broke my table code! Naturally many people are also delighted to try out the new table.

I often (weekly?) see posts here using egen commands with calls to functions that went undocumented in Stata 9 (18 years ago!) such as sum() and rsum(). What is going on here? I have three guesses:

1. The users started with Stata in Stata 8 or earlier, so that's how they know the old code.

2. Someone guessed at what the syntax might be and they were right. This is in particular plausible for sum().

3. We are just seeing copying. The code is just based on code from somewhere else that worked, and so on. There could be a long chain of copying, and that's fine by me. This is where my guesses go for the most part.
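For anyone copying such code, the documented replacements are total() for the old sum() and rowtotal() for the old rsum(). A minimal sketch on toy data:

```stata
clear
input x y
1 2
3 .
5 6
end

* Documented name for the old egen sum(): total over observations
egen colsum = total(x)

* Documented name for the old egen rsum(): total across variables,
* with missing values treated as zero
egen rowsum = rowtotal(x y)
list
```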

Executive summary: The urge to get Stata to tidy up and make software more systematic is admirable and desirable. Lots of users, however, are not so keen on (the consequences of) your tinkering with the commands they know and use.

P.S. I do know about version control. https://www.stata.com/support/faqs/p...stata-version/
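A sketch of what version control looks like for the table example, assuming Stata 17 or later is installed: the version prefix asks Stata to interpret one command under the older rules, so the pre-17 table syntax should still run.

```stata
sysuse auto, clear

* Run the pre-17 table syntax under version control (assumes Stata 17+)
version 16: table rep78 foreign, contents(mean mpg)
```

Putting a version statement at the top of a do-file has the same effect for the whole file.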
4 likes
Comment
Fabrice Nzepang

Join Date: Sep 2021

Posts: 3
#126

28 Sep 2021, 08:42

How to conduct Panel Data PSTR and PSTAR in STATA?
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35724
#127

28 Sep 2021, 09:27

Fabrice Nzepang You asked that at https://www.statalist.org/forums/for...d-pstar-models and the suggestion to explain what these methods are remains good. Otherwise, no answer implies no one knows of any command.
Comment
Sonnen Blume

Join Date: Aug 2018

Posts: 342
#128

29 Sep 2021, 21:42

SAS has introduced a public web-based version, maybe Stata could try something similar...
1 like
Comment
Christopher Bratt

Join Date: May 2019

Posts: 144
#129

30 Sep 2021, 00:16

SAS has introduced a public web-based version, maybe Stata could try something similar...

Meaning you can run a version of Stata from any browser, for instance on an iPad? Yes, please.
2 likes
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#130

30 Sep 2021, 13:35

I would welcome the addition of an option in the new collect system to be able to export the displayed tabular data (such as from -collect preview-) to a Stata dataset. It doesn't need to have the formatting/stylistic details, but having a familiar export system to a dataset when one just wants to collect those values can be convenient.
2 likes
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#131

04 Oct 2021, 12:05

A (relatively) minor request. An -nmissing()- function to complement the -missing()- function. It would return the number of missing values in the specified variable list. Of course, there is also -egen, rowmiss()-, but doesn't handle string values by default and requires an intermediate variable generation step if the result is used for filtering.
3 likes
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1133
#132

04 Oct 2021, 15:49

I second Leonardo Guizzetti's suggestion in #131. SPSS has NMISS() and NVALID() functions, which I have found quite useful on many occasions.
https://www.ibm.com/docs/en/spss-sta...alue_functions

--
Bruce Weaver
Version: Stata/MP 19.5 (Windows)
1 like
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35724

#133

05 Oct 2021, 05:43

Leonardo Guizzetti Bruce Weaver I am agnostic about any nmissing() because egen, rowmiss() certainly includes string variables, and automatically.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float numvar str4 strvar
1 "frog"
2 ""    
3 "toad"
. ""    
5 "newt"
end

. egen nmiss = rowmiss(*var)

. list

     +-------------------------+
     | numvar   strvar   nmiss |
     |-------------------------|
  1. |      1     frog       0 |
  2. |      2                1 |
  3. |      3     toad       0 |
  4. |      .                2 |
  5. |      5     newt       0 |
     +-------------------------+

Comment

Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#134

05 Oct 2021, 06:12

You're right, of course, Nick. It is -egen, rownonmiss()- that has the -strok- switch. Either way, a complementary function would be just that, an alternative.
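To illustrate with the same toy data as in #133: rownonmiss() needs strok before it will accept string variables, and a count of missings can also be written inline when an intermediate variable is unwanted. This is only a sketch of existing tools, not the proposed nmissing().

```stata
clear
input float numvar str4 strvar
1 "frog"
2 ""
3 "toad"
. ""
5 "newt"
end

* rownonmiss() only accepts string variables with the strok option
egen nonmiss = rownonmiss(numvar strvar), strok

* Inline alternative for filtering: add up missing() results directly
list if missing(numvar) + missing(strvar) == 2
```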
Comment
Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#135

06 Oct 2021, 20:35

One additional thing I would like is an update to Stata's mapping commands. That is, I'd like to be able to put boundary boxes to denote a region of particular importance, or otherwise zoom in on specific polygons.

For example, I'm mapping the U.S. for a project, and I have 4 treatment regions (really 4 counties across the U.S.). I'd like to show imbalance in the outcome between my treated unit and local/regional untreated units on the day before the intervention. I know I can do this with bars/dots (in fact this is what I've done), but it would be really nice to zoom in on the polygons so the readers can see the complete picture.
2 likes
Comment
