Wishlist for Stata 18

Rich Goldstein

Join Date: Mar 2014

Posts: 4459
#211

15 Dec 2021, 11:41

re: #209 - to follow up on Leonardo Guizzetti's comment, please see

Code:

help datetime##s4

and, if this is not what you mean, please clarify

John Mullahy

Join Date: Dec 2016
Posts: 750

#212

15 Dec 2021, 12:09

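Might there be a way to add an option to the –centile– command that allows it to compute percentiles in precisely the same way as –summarize, d– and –estat bootstrap, percentile– (and probably other commands) do? In the output below, the 25th and 75th percentiles reported by –centile– differ slightly from those reported by the other two commands.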

Code:

. sysuse auto
(1978 automobile data)

. tempfile tboot

. qui bootstrap, reps(500) saving(`tboot'): reg price mpg, level(50)

. estat bootstrap, percentile

Linear regression                               Number of obs     =         74
                                                Replications      =        500

------------------------------------------------------------------------------
             |    Observed               Bootstrap
       price | coefficient       Bias    std. err.  [50% conf. interval]
-------------+----------------------------------------------------------------
         mpg |  -238.89435  -6.540619   56.911373   -283.3891  -206.8067   (P)
       _cons |   11253.061   133.0593   1356.8499    10512.26   12328.61   (P)
------------------------------------------------------------------------------
Key: P: Percentile

. use `tboot'
(bootstrap: regress)

. sum, d

                           _b[mpg]
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -380.3653      -436.6451
 5%    -344.9039      -410.6723
10%    -324.1469      -385.3353       Obs                 500
25%    -283.3891      -385.0602       Sum of wgt.         500

50%    -237.3672                      Mean           -245.435
                        Largest       Std. dev.      56.91137
75%    -206.8067      -121.5362
90%    -175.8918      -111.9515       Variance       3238.904
95%    -159.8261      -107.5473       Skewness      -.2622398
99%     -130.299      -100.4849       Kurtosis       2.846957

                          _b[_cons]
-------------------------------------------------------------
      Percentiles      Smallest
 1%     8331.229       7704.635
 5%     9282.614        7916.47
10%     9701.941        8226.62       Obs                 500
25%     10512.26       8312.646       Sum of wgt.         500

50%     11250.26                      Mean           11386.12
                        Largest       Std. dev.       1356.85
75%     12328.61       14624.41
90%     13168.51       14663.63       Variance        1841042
95%     13638.37          14861       Skewness       .1261091
99%     14560.25       15460.15       Kurtosis       2.709669

. centile *, c(25 75)

                                                          Binom. interp.   
    Variable |       Obs  Percentile    Centile        [95% conf. interval]
-------------+-------------------------------------------------------------
      _b_mpg |       500         25   -283.3945       -288.5923   -277.2365
             |                   75   -206.7666       -212.8225   -200.2001
     _b_cons |       500         25    10512.16        10293.56    10656.54
             |                   75    12332.11        12182.53    12567.85
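
For readers who want to see where the small differences come from, here is a minimal sketch. It assumes that the discrepancy is one of percentile definition: -_pctile- by default uses the same definition as -summarize, detail-, while its altdef option uses an interpolating definition that, as far as I can tell from the documentation, matches -centile-'s default. The scalar names are made up for illustration, and the auto data stand in for the bootstrap results.

```stata
* Sketch: compare the two percentile definitions on the auto data
sysuse auto, clear

* Default -_pctile- definition (assumed to be the one -summarize, detail- uses)
_pctile price, percentiles(25 75)
scalar p25_default = r(r1)
scalar p75_default = r(r2)

* Alternative, interpolating definition requested with altdef
_pctile price, percentiles(25 75) altdef
scalar p25_altdef = r(r1)
scalar p75_altdef = r(r2)

* Reference values from the two commands compared in this post
quietly summarize price, detail
scalar p25_summarize = r(p25)
quietly centile price, centile(25 75)
scalar p25_centile = r(c_1)

display "summarize p25 = " p25_summarize "   _pctile p25         = " p25_default
display "centile   p25 = " p25_centile   "   _pctile, altdef p25 = " p25_altdef
```

This does not give -centile-'s confidence intervals under the -summarize- definition, which is what the wish is really about; it only shows the two point-estimate definitions side by side.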


Asjad Naqvi

Join Date: Oct 2014

Posts: 91
#213

17 Dec 2021, 14:47

1. Just a brief response to the request for a joyplots command by Igor Paploski. I agree with what Maarten Buis and Nick Cox said. Joyplots are very context-specific and work under certain circumstances. The number of parameters required to provide full functionality is quite a lot. It is definitely quicker to create them manually. As Nick said, they are basically overlapping area graphs and the guide provides the code. If you have the code block in place, you can easily copy and modify it to fit different data sets. There is not so much programming involved here. If the creation of extra junk variables is the issue, then just replace them with tempvars. I intentionally don't go so far in the guides to accommodate more basic users, but there is definitely a lot of scope for code optimization.

2. My wish list for Stata:

a) dofile editor:
i) It would be great if syntax highlighting used different colors for the different comment styles (* vs // vs /* */). This would help distinguish actual comments from commented-out code.

b) Graph options
i) ability to scale marker weights based on a starting and ending value
ii) ability to scale colors, and line widths based on a starting and ending value
iii) Marker labels can be read from variables. Please also allow this for angles, colors, marker sizes. This will make grouping so much easier on graphs

c) A long-term wishlist
i) Ability to read pixels (to process images, geotiffs, edge detection, native machine learning routines, etc)
ii) color scaling within graph elements defined by starting and ending color. For example, higher values on the y-axis are darker than lower values for the same line.

Last edited by Asjad Naqvi; 17 Dec 2021, 15:00.
8 likes
Rasool Baloch

Join Date: Nov 2016

Posts: 59
#214

21 Dec 2021, 02:50

I wish Stata 18 had a horizontal scroll bar for the Results window rather than wrapping results. Or is there an option to turn on a horizontal scroll bar for the Results window, like the one in the Do-file Editor preferences? Thanks and best regards
1 like
Daniel Schneider

Join Date: Sep 2014

Posts: 29
#215

21 Dec 2021, 04:40

An option to suppress repeated categorical axes in a by-graph would be very helpful. The repetition of axis labels can be switched off using the by-option suboption noiylabel / noixlabel, but this only works for numerical axes, not for categorical ones. This issue has been discussed before, for example:
Suppressing Repeated Categorical Axis Labels in by-graphs
In by cat_var bar graph, how to show var names only on the first graph y-axis
Multiple graphs combined with the over option (and only one legend)
Bert Lloyd

Join Date: Apr 2014

Posts: 108
#216

22 Dec 2021, 07:10

An option for log to close when a do-file exits with error.
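
Until such an option exists, one common workaround is to wrap the body of the do-file in -capture noisily- and close the log afterwards. The following is only a sketch under that assumption; the log file name and the log name wishlog are hypothetical, and the two commands inside the block stand in for the real work.

```stata
* Sketch of a workaround: run the body inside -capture noisily- so that
* control returns to the do-file, and the log can be closed, even on error.
* The file name and log name below are hypothetical.
log using "wishlist_example.log", replace text name(wishlog)

capture noisily {
    sysuse auto, clear
    regress price mpg
}
local rc = _rc          // save the return code before another command resets it

log close wishlog       // runs whether or not the block above failed

* Re-raise the original error, if any, so the failure is not hidden
if `rc' != 0 {
    exit `rc'
}
```

The drawback, and presumably the reason for the wish, is that every do-file has to carry this wrapper itself.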
1 like
Dan Palmer

Join Date: Jul 2018

Posts: 71
#217

22 Dec 2021, 13:05

Following execution of a tabi command, ensure that an r(p_exact) stored value (i.e., 1.00) is saved even in the absence of table output. See discussion below.

https://www.statalist.org/forums/for...o-output-shown
Rich Goldstein

Join Date: Mar 2014

Posts: 4459
#218

22 Dec 2021, 14:03

I am opposed to #217 - I don't think that the result makes any sense
1 like
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2400
#219

22 Dec 2021, 16:02

Originally posted by Rich Goldstein

I am opposed to #217 - I don't think that the result makes any sense

My sentiments exactly. Just because other software produces a result doesn't mean that it is correct in doing so.
Bruce Weaver

Join Date: May 2014

Posts: 1130
#220

24 Dec 2021, 07:16

In 2019, Goncalo Cotovio asked if there is an immediate form of the paired t-test. The answer then, as now, is that there is not an immediate form of the paired t-test. But why not? The paired t-test can easily be computed from the following summary data:
n = the number of paired scores

r = the correlation between the paired scores

#mean1 = mean of 1st sample

#sd1 = SD of 1st sample

#mean2 = mean of 2nd sample

#sd2 = SD of 2nd sample

This is exactly the same number of arguments needed for the immediate form of the unpaired t-test:

Code:

Immediate form of two-sample t test

    ttesti #obs1 #mean1 #sd1 #obs2 #mean2 #sd2 [, options2]

For an immediate form of the paired t-test, the syntax might be something like this:

Code:

Immediate form of two-sample paired t test

    ttesti #obs #r #mean1 #sd1 #mean2 #sd2, paired [options]

The following code shows the needed computations and compares the results to those from -ttest- using the raw data.

Cheers,
Bruce

Code:

clear
webuse fuel
summarize

* Write the needed summary measures to the dataset
quietly summarize mpg1
generate byte n = r(N) in 1
generate m1 = r(mean) in 1
generate sd1 = r(sd) in 1
quietly summarize mpg2
generate m2 = r(mean) in 1
generate sd2 = r(sd) in 1
quietly pwcorr mpg1 mpg2
generate r = r(rho) in 1

* Now compute the paired t-test from the summary data
generate mdiff = m1-m2
generate sddiff = sqrt(sd1^2+sd2^2-2*r*sd1*sd2)
generate sediff = sddiff/sqrt(n)
generate tobs = mdiff/sediff
generate byte df = n-1
generate pval = ttail(df,abs(tobs))*2
list mdiff-pval in 1

* Compare results to those from -ttest-
ttest mpg1==mpg2
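
As a stopgap, the same summary data can be fed to the existing one-sample immediate form, because the paired t-test is a one-sample t-test on the differences. This is a sketch under that assumption, not a substitute for a proper paired option; the scalar names (s_n, s_mdiff, and so on) are made up for illustration.

```stata
* Sketch: paired t test from summary statistics via the one-sample -ttesti-
webuse fuel, clear

quietly summarize mpg1
scalar s_n   = r(N)
scalar s_m1  = r(mean)
scalar s_sd1 = r(sd)

quietly summarize mpg2
scalar s_m2  = r(mean)
scalar s_sd2 = r(sd)

quietly correlate mpg1 mpg2
scalar s_rho = r(rho)

* Mean and SD of the paired differences, from the summary data only
scalar s_mdiff  = s_m1 - s_m2
scalar s_sddiff = sqrt(s_sd1^2 + s_sd2^2 - 2*s_rho*s_sd1*s_sd2)

* One-sample immediate t test of H0: mean difference = 0
ttesti `=s_n' `=s_mdiff' `=s_sddiff' 0
scalar t_summary = r(t)
```

The t statistic and degrees of freedom should agree with those from -ttest mpg1==mpg2-, but the output is labelled as a one-sample test, which is part of why a dedicated paired form would still be welcome.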

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)
2 likes
Ali Atia

Join Date: May 2020

Posts: 737
#221

25 Dec 2021, 14:24

Currently, the shell command is ignored when running Stata in batch mode in Windows (leaving the notice "request ignored because of batch mode" in the log-file). It would be nice if this were rectified in a future update.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3007
#222

27 Dec 2021, 06:41

As discussed here, margins should not be available after nonlinear models with fixed effects are estimated. The explanation is simple: any interesting quantity that we may want to compute will depend on the values of the fixed effects, which are not estimated by these commands. Therefore, margins computes something that most of the time is meaningless. This could be done in a future update, but at least it would be good to have this looked into in the next version.
5 likes
Euslaner

Join Date: Apr 2014

Posts: 186
#223

27 Dec 2021, 10:51

This should be simple but I can't get it to work. I have a data set attached in which I want to create a variable for US states. The state variables in the World Values Study are X048ISO and X048WVS. I created a variable called region from X048WVS. I then wanted to create a state-level variable when X048WVS is not equal to a region value. So I typed and received:

gen str24 state = "."

. des state

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------------------------------------------------------
state           str24   %24s

. replace state = X048WVS if region1 == "."
type mismatch
r(109);

where region was created from the values of X048WVS:

. fre X048WVS if S003 == 840

X048WVS -- Region where the interview was conducted (WVS)
----------------------------------------------------------------------------------------
| Freq. Percent Valid Cum.
-------------------------------------------+--------------------------------------------
Valid -2 No answer | 55 0.62 0.62 0.62
840001 US: New England | 364 4.13 4.13 4.75
840002 US: Middle Atlantic States | 962 10.91 10.91 15.66
840003 US: South Atlantic | 971 11.01 11.01 26.67
840004 US: East South Central | 462 5.24 5.24 31.91
840005 US: West South Central | 670 7.60 7.60 39.51
840006 US: East North Central | 996 11.29 11.29 50.80
840007 US: West North Central | 418 4.74 4.74 55.54
840008 US: Rocky Mountain state | 369 4.18 4.18 59.72
840009 US: Northwest | 156 1.77 1.77 61.49
840010 US: California | 445 5.05 5.05 66.54
840011 US: Alaska | 3 0.03 0.03 66.57
840012 US: Hawai | 4 0.05 0.05 66.62
840013 US: Pacific | 348 3.95 3.95 70.56
840201 US: AL Alabama | 28 0.32 0.32 70.88
840202 US: AR Arkansas | 11 0.12 0.12 71.01
840203 US: AZ Arizona | 73 0.83 0.83 71.83
840204 US: CA California | 282 3.20 3.20 75.03
840205 US: CO Colorado | 73 0.83 0.83 75.86
840206 US: CT Connecticut | 26 0.29 0.29 76.15
: | : : : :
840231 US: NJ New Jersey | 56 0.63 0.63 89.36
840232 US: NM New Mexico | 27 0.31 0.31 89.67
840233 US: NV Nevada | 22 0.25 0.25 89.92
840234 US: NY New York | 118 1.34 1.34 91.26
840235 US: OH Ohio | 101 1.15 1.15 92.40
840236 US: OK Oklahoma | 28 0.32 0.32 92.72
840237 US: OR Oregon | 33 0.37 0.37 93.09
840238 US: PA Pennsylvania | 83 0.94 0.94 94.04
840239 US: RI Rhode Island | 8 0.09 0.09 94.13
840240 US: SC South Carolina | 12 0.14 0.14 94.26
840241 US: SD South Dakota | 24 0.27 0.27 94.53
840242 US: TN Tennessee | 55 0.62 0.62 95.16
840243 US: TX Texas | 177 2.01 2.01 97.17
840244 US: UT Utah | 22 0.25 0.25 97.41
840245 US: VA Virginia | 68 0.77 0.77 98.19
840246 US: VT Vermont | 6 0.07 0.07 98.25
840247 US: WA Washington | 65 0.74 0.74 98.99
840248 US: WI Wisconsin | 68 0.77 0.77 99.76
840249 US: WV West Virginia | 19 0.22 0.22 99.98
840250 US: WY Wyoming | 2 0.02 0.02 100.00
Total | 8819 100.00 100.00
--------------------------------------------------------------------------------------

and I wanted to attach values to state from X048WVS when X048WVS is a state, not a region. But my command didn't work (see above). I tried to upload the data set (very small) but it wouldn't load. But this should be obvious to people who can do this better than I can.

Thanks,

Ric Uslaner
William Lisowski

Join Date: Dec 2014

Posts: 10150
#224

27 Dec 2021, 10:57

Euslaner (post #223) -

This is off-topic here, which is the Wishlist for Stata 18.

Please post your question to a new topic, preferably using code delimiters [CODE] and [/CODE] to render the example output more readable.
3 likes
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#225

30 Dec 2021, 19:18

I know I usually say stuff like this, but I think it has to be said for those interested in treatment effects: another interesting algorithm we might find useful would be the Bayesian Structural Time Series approach. It's a method that's related to synthetic controls, but of course differs radically in the underlying theoretical framework. It's already been applied in a few interesting contexts, and I'd hate to see R beat Stata further (as much as it currently does, anyhow) in the treatment effects department.

I know StataCorp did quite a lot with teffects in 16 and 17, but I think we should want to see official StataCorp implementations of synthetic controls, RD, and other treatment effect estimators, just as we have with difference-in-differences in (I think?) Stata 17.
1 like