Wishlist for Stata 16

Belinda Foster started a topic Wishlist for Stata 16

06 Jun 2017, 08:59

As per Nick Cox's "request"!
Dave Airey replied

17 Jun 2019, 12:57
It would be nice to add tests for comparing Pearson correlations:

https://journals.plos.org/plosone/ar...l.pone.0121945

and Spearman correlations:

https://www.omicsonline.org/open-acc....php?aid=54592
daniel klein replied

17 Jun 2019, 09:04
Originally posted by wbuchanan

[...] I remember either for Stata 14 or 15 there was a counter up on the Stata homepage for a few weeks prior to the release.

Yes, and then the countdown stopped and the page would not update for a day or so ... I prefer the old-fashioned announcement (traditionally on Statalist): Stata 16 is shipping now.
wbuchanan replied

17 Jun 2019, 07:46
Is anyone else wondering when the release date for Stata 16 will be announced? I remember either for Stata 14 or 15 there was a counter up on the Stata homepage for a few weeks prior to the release.
Bruce Weaver replied

14 Jun 2019, 12:21
Please consider tweaking -ranksum- to make it report (at least optionally) the Mann-Whitney U statistic and Wilcoxon's W (as some authors call it). As noted in this thread, -ranksum- currently reports neither. Thanks.
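For readers who need the two statistics in the meantime, here is a minimal sketch of how they relate. It is written in Python rather than Stata, and the function name `rank_sum_statistics` is hypothetical. It assumes the usual convention that W is the sum of the pooled midranks in the first group and that U1 = W - n1(n1 + 1)/2; some authors report the smaller of U1 and U2 instead.

```python
def rank_sum_statistics(group1, group2):
    """Return Wilcoxon's rank sum W for group1 and both Mann-Whitney U values."""
    pooled = sorted(list(group1) + list(group2))

    # Assign midranks: tied values share the average of the ranks they occupy.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied block occupies 1-based ranks i+1 .. j; their mean is (i+1+j)/2.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j

    n1, n2 = len(group1), len(group2)
    w1 = sum(rank_of[value] for value in group1)
    u1 = w1 - n1 * (n1 + 1) / 2   # U for group1
    u2 = n1 * n2 - u1             # U for group2; u1 + u2 always equals n1*n2
    return {"W": w1, "U1": u1, "U2": u2}


result = rank_sum_statistics([1, 2], [2, 3])
print(result)
```

With the tied value 2 sharing midrank 2.5, the first group has W = 3.5, so U1 = 0.5 and U2 = 3.5.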
Rich Goldstein replied

12 Jun 2019, 02:57
I agree with daniel klein and have previously discussed this with Stata personnel, as noted in #11 in https://www.statalist.org/forums/for...ple-imputation
daniel klein replied

12 Jun 2019, 02:47
Probably too late for 16, but could we please have a persist option in mi impute chained, similar to the one in community-contributed ice (Royston; SSC or SJ)?

The problem: Multiple imputation via chained equations often fails because one of the models, usually mlogit, fails to converge. If this happens on the observed data or if it happens on every iteration, I do not mind Stata stopping with an error; probably there is something wrong with my model. However, it is terribly annoying to have your machine running for a day, only to find that mlogit did not converge in iteration 7 on m=42. The model converged 410 times before (10 iterations * 41 datasets, not counting the runs on observed data); chances are it will converge in iteration 8 on m=42. So, I really want to be able to tell Stata to just skip this one iteration for the respective variable, not terminate the entire process.

Ideally, I want a model-specific option, like

Code:

mi impute chained ... (mlogit, skipnonconvergence(#)) ...

that specifies the maximum number of iterations per imputed dataset that I am willing to skip if the model does not converge. This seems far less dangerous than giving us the already existing force option that just happily accepts missing imputed values.

Best
Daniel
Last edited by daniel klein; 12 Jun 2019, 02:51. Reason: formatting of option names
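The skip-budget behaviour requested above could look roughly like the following. This is only a control-flow sketch in Python, not Stata, and it does not reflect how mi impute chained or ice is actually implemented. The names `run_with_skips`, `fit_step`, and `max_skips` are hypothetical, and a non-converging fit is modelled as a `RuntimeError`.

```python
def run_with_skips(fit_step, n_iterations, max_skips):
    """Run the iterations for one imputed dataset, tolerating a few failed fits.

    fit_step(iteration) is assumed to raise RuntimeError on non-convergence.
    Returns the list of iteration numbers that were skipped.
    """
    skipped = []
    for iteration in range(1, n_iterations + 1):
        try:
            fit_step(iteration)
        except RuntimeError:
            # Give up only once the per-dataset skip budget is exhausted.
            if len(skipped) >= max_skips:
                raise
            skipped.append(iteration)
    return skipped


def flaky_fit(iteration):
    """Stand-in for a model fit that fails to converge only in iteration 7."""
    if iteration == 7:
        raise RuntimeError("convergence not achieved")


print(run_with_skips(flaky_fit, n_iterations=10, max_skips=1))  # prints [7]
```

With `max_skips=0` the same call re-raises the error at iteration 7, which corresponds to stopping on the first failure.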
Clyde Schechter replied

09 Jun 2019, 14:50
Thank you, that is very useful.

Bjarte Aagnes replied

09 Jun 2019, 14:42

#242: While waiting for a stripnonprintable() function

ustrregexra() can be used to strip off "non-printable" characters using Unicode categories:

Code:

scalar S2 = ustrregexra(S1,"[^\p{L}\p{M}\p{N}\p{P}\p{S}\p{Zs}]","")

If only single U+0020 SPACE characters are wanted, the remaining space separators can be replaced with U+0020 and runs of spaces collapsed by:

Code:

scalar S2 = itrim(ustrregexra(S2,"\p{Zs}",ustrunescape("\u0020")))

Code:

\p{L}  or \p{Letter}         : any kind of letter from any language
\p{M}  or \p{Mark}           : a character intended to be combined with another character
\p{N}  or \p{Number}         : any kind of numeric character in any script
\p{P}  or \p{Punctuation}    : any kind of punctuation character
\p{S}  or \p{Symbol}         : math symbols, currency signs, dingbats, box-drawing characters, etc.
\p{Zs} or \p{Space_Separator}: a whitespace character that is invisible, but does take up space

List of Unicode characters of category “Space Separator”: https://www.compart.com/en/unicode/category/Zs
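The same category-based filtering can be sketched outside Stata to check which characters survive. The Python below is an illustration only: the function names `strip_nonprintable` and `normalize_spaces` are hypothetical, and it uses the standard `unicodedata` module because Python's built-in `re` does not support `\p{...}` classes. The collapsing step only approximates what itrim() does in the Stata code above.

```python
import re
import unicodedata

# Major Unicode categories kept by the regex above: L, M, N, P, S (plus Zs).
KEEP_MAJOR = {"L", "M", "N", "P", "S"}


def strip_nonprintable(text):
    """Drop every character that is not a letter, mark, number,
    punctuation, symbol, or space separator (Zs)."""
    kept = []
    for ch in text:
        category = unicodedata.category(ch)  # e.g. "Lu", "Zs", "Cc"
        if category[0] in KEEP_MAJOR or category == "Zs":
            kept.append(ch)
    return "".join(kept)


def normalize_spaces(text):
    """Replace each space separator with U+0020, then collapse runs of spaces."""
    spaced = "".join(
        " " if unicodedata.category(ch) == "Zs" else ch for ch in text
    )
    return re.sub(" {2,}", " ", spaced)


raw = "a\u00a0 \u2003b\tc\u200bd"   # NBSP, space, EM SPACE, tab, zero-width space
clean = normalize_spaces(strip_nonprintable(raw))
print(repr(clean))  # prints 'a bcd'
```

Here the tab (category Cc) and the zero-width space (category Cf) are removed, while the no-break space and em space (both Zs) are kept and then folded into a single ordinary space.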


Clyde Schechter replied

08 Jun 2019, 16:46
Problems frequently arise from non-printing characters in strings. By definition, they can't be seen by the user, but Stata sees them and takes them seriously. The problem often crops up with string variables in data sets that have been imported from various sources, and these characters can create havoc when you are trying to merge two data sets from different sources with different kinds of non-printing characters contaminating them. Can we have a string function that eliminates all non-printing characters from a string? Perhaps it could even be built out from the existing -egen, sieve()- function in the egenmore package as a new class of characters "printable", but updated to cope with Unicode.

John Mullahy replied

15 May 2019, 12:56

Could the capabilities of cformat be extended to other types of displayed output? I'm thinking specifically of correlation matrixes from correlate, but I suspect there are others as well where such functionality might be useful.

Code:

. set cformat

. reg y x

      Source |       SS           df       MS      Number of obs   =     1,000
-------------+----------------------------------   F(1, 998)       =    288.46
       Model |  437.799556         1  437.799556   Prob > F        =    0.0000
    Residual |   1514.6571       998  1.51769249   R-squared       =    0.2242
-------------+----------------------------------   Adj R-squared   =    0.2235
       Total |  1952.45666       999  1.95441107   Root MSE        =    1.2319

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .4625608   .0272347    16.98    0.00     .4091169    .5160047
       _cons |   .0619798   .0389869     1.59    0.11    -.0145258    .1384855
------------------------------------------------------------------------------

. corr y x
(obs=1,000)

             |        y        x
-------------+------------------
           y |   1.0000
           x |   0.4735   1.0000


. set cformat %5.2f

. reg y x

      Source |       SS           df       MS      Number of obs   =     1,000
-------------+----------------------------------   F(1, 998)       =    288.46
       Model |  437.799556         1  437.799556   Prob > F        =    0.0000
    Residual |   1514.6571       998  1.51769249   R-squared       =    0.2242
-------------+----------------------------------   Adj R-squared   =    0.2235
       Total |  1952.45666       999  1.95441107   Root MSE        =    1.2319

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |       0.46       0.03    16.98    0.00         0.41        0.52
       _cons |       0.06       0.04     1.59    0.11        -0.01        0.14
------------------------------------------------------------------------------

. corr y x
(obs=1,000)

             |        y        x
-------------+------------------
           y |   1.0000
           x |   0.4735   1.0000
