  • Nick Cox
    replied
    #173 Commands that work differently can be written by users too. I find it striking that no such command seems to exist and my guess is that the more experienced you are, the more you just work with Stata's existing behaviour on this point.

    More warnings: I think that's a good idea.


  • Nick Cox
    replied
    I have heard two substantial talks on proposals for three-way logic in Stata. After each talk the discussion seemed to split three ways.

    1. Regardless of the merits of your solution, this is way too complicated to explain either to new users or to experienced users accustomed to Stata's existing logic, in some cases experience meaning decades and not just years. The time for three-way logic to be introduced to Stata was 1984, not now.

    2. We really don't need or want three-way logic. The present system works fine. It's not outrageous to expect people to read documentation or to look carefully at what Stata does. There are functions like missing() that do what people want.

    3. We need a three-way logic, but your version is quite illogical.

    Sure, that is only a sample of two talks and it's difficult to recall all of the discussion as often two or three people were speaking at the same time.


  • Daniel Feenberg
    replied
    Originally posted by Nick Cox
    daniel klein is right. This isn't a bug. It's a documented design choice. Responsibility for code remains with the coder.

    But I am curious: what would the easy fix be, precisely?
    Other packages have different ways of addressing this situation. My favorite is PSTAT, where logical results can have three values: true, false or missing.
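
    A three-valued result of this kind can be approximated in present-day Stata by restricting the comparison to nonmissing values. The following is only a minimal sketch with toy data; the variable names y, gt3_two and gt3_three are made up for illustration.

```stata
clear
input y
1
5
.
end
* Stata's own logic: the comparison is either true or false,
* and a missing y counts as larger than any number
generate byte gt3_two = y > 3
* Three-valued variant: leave the result missing wherever y is missing
generate byte gt3_three = y > 3 if !missing(y)
list
```

    Here gt3_two is 1 for the observation with missing y, whereas gt3_three is left missing for it.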


  • Sven-Kristjan Bormann
    replied
    One quick and dirty fix would be to add a new option to the replace command. The option could be called something like "nomiss", so that "replace x=7 if y>3, nomiss" would be expanded to "replace x=7 if y>3 & !missing(y)". Alternatively, create a new command like "replacenm" (replace no miss) which has this behaviour as its default. In this or a similar way, you could get the intended effect without breaking old code that relies on the current behaviour.
    Maybe it would even be enough to issue a suppressible warning that missing values have also been changed. Such a warning would at least provide a hint to the coder that something might have happened which was not intended.
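
    A user-written version of this idea might look roughly like the sketch below. It is not an existing Stata command: the program name replacenm is borrowed from the suggestion above, and the check() option (listing the numeric variables whose missing values should be protected) is an assumption made purely for illustration.

```stata
* Hypothetical wrapper: -replacenm- and its check() option are NOT part of Stata
capture program drop replacenm
program define replacenm
    version 14
    syntax varname =/exp [if] [in], CHeck(varlist)
    marksample touse, novarlist     // apply if/in without dropping missing values of the target
    markout `touse' `check'         // exclude observations with missing values in check()
    replace `varlist' = `exp' if `touse'
end

* Toy data: one observation has missing y
clear
input y x
1 0
5 0
. 0
end
replacenm x = 7 if y > 3, check(y)
list
```

    Because the wrapper is a separate command, existing code that relies on the behaviour of replace is not affected.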


  • Nick Cox
    replied
    daniel klein is right. This isn't a bug. It's a documented design choice. Responsibility for code remains with the coder.

    But I am curious: what would the easy fix be, precisely?


  • daniel klein
    replied
    Originally posted by Sonnen Blume
    5. Editing value labels/variable names right on the variable window, without having to type on console.
    Stata 11 introduced the variable manager and Stata 12 brought the properties window, where you can modify labels without typing commands.

    Originally posted by Leah Jones
    [...] using the replace command with greater than (i.e. replace x=7 if y>3) missing values are also replaced since Stata treats missing as a high value. [...] Seems like it would be easy to fix and ensure this doesn't lead to too many problems.
    This has been discussed several times and I assume it will be discussed again in the future. I believe there is nothing to "fix": this is not a bug, it is a design choice. More importantly, once you have understood the principle, you can and probably will actually use it as a feature in your code. Changing something this basic now is likely to cause more problems (especially for those who have been working with Stata for a long time) than it would solve (especially for those who are new to Stata).

    Best
    Daniel
    Last edited by daniel klein; 05 Jan 2019, 02:19.


  • Leah Jones
    replied
    Not sure if this has been mentioned, but when using the replace command with greater than (i.e. replace x=7 if y>3) missing values are also replaced since Stata treats missing as a high value. You can include code to exclude missing, but it can still be overlooked at times (see https://journals.sagepub.com/doi/pdf...03122418780369). Seems like it would be easy to fix and ensure this doesn't lead to too many problems.
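
    The behaviour described here can be reproduced with a few lines. This is a minimal sketch on toy data; the variable names x_plain and x_guarded are made up for illustration.

```stata
clear
input y x_plain x_guarded
1 0 0
5 0 0
. 0 0
end
* Numeric missing sorts above every number, so y > 3 is true when y is missing
replace x_plain = 7 if y > 3
* Excluding missing values explicitly leaves that observation untouched
replace x_guarded = 7 if y > 3 & !missing(y)
list
```

    The first replace changes two observations (y == 5 and y missing); the second changes only the observation with y == 5.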


  • Nick Cox
    replied
    #168 (2) Please give an example of a graph you want that you think Stata can't do. There are many kinds, no doubt, but StataCorp really need a more precise wish list than that.

    (5) Do you know about varm?

    (6) Statalist exists for this purpose, and others. If you can find a way to insist that people only post reproducible problems, that would be good. It would save a lot of time and effort.


  • Sonnen Blume
    replied
    How about:

    1. Option for calculating CIs for percentages in two way tables (SPSS can do that from complex sample module).
    2. Quality of the graphs, at least half as thrilling and sophisticated as ones that ggplot2 can already do.
    3. Modifiable example charts online (https://plot.ly/)
    4. Loading SPSS files directly.
    5. Editing value labels/variable names right on the variable window, without having to type on console.
    6. Making a help menu/forum exclusively for sharing the reproducible codes/examples of analysis.


  • Clyde Schechter
    replied
    That sounds interesting. I was not aware of :print: in Stata's regular expressions. Thank you!


  • Mauricio Caceres
    replied
    @Clyde Schechter: Are you looking for something like this?

    Code:
    replace x = ustrregexra(x, "[^[:print:]]", "")


  • Clyde Schechter
    replied
    We frequently get questions on the Forum about unexpected problems arising with string variables that turn out to be due to the presence of non-printing characters. Data imported from spreadsheets or word processing documents frequently contain such characters. They are not visible in -list-ings and other routine ways of viewing the data, but they often trip up the programmer. They are certainly a plague I deal with often. While it is not all that complicated to write a loop over these and use -subinstr()- to remove them one at a time, it would sure be convenient to have a single function call that purges them all in one fell swoop. Perhaps that could be accomplished by expanding the scope of "whitespace" removed by ustrtrim(), or perhaps there are Unicode-related reasons why that would not be a good idea. But it should be possible for StataCorp to write some function that does this, no?
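
    The loop described above might be sketched as follows. It assumes that the unwanted characters are the ASCII control characters (codes 1 to 31, plus 127); other non-printing Unicode characters are not covered. The variable name x and the toy string are made up for illustration.

```stata
clear
set obs 1
* Toy string containing a tab (code 9) and a carriage return (code 13)
generate str20 x = "ab" + char(9) + "cd" + char(13)

* Remove the ASCII control characters one code at a time
forvalues i = 1/31 {
    quietly replace x = subinstr(x, char(`i'), "", .)
}
* Code 127 (delete) is also a control character
quietly replace x = subinstr(x, char(127), "", .)
list
```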


  • Joro Kolev
    replied
    Originally posted by George Ford
    egen xres = resample(x)

    creates a new series xres that is a random sample of x. options would include:
    1. with or without replacement (, permute)
    2. resample in blocks of integer N (, block(N))
    3. include or exclude missing values (, dropna)
    4. weights options
    5. a similar command to resample an entire matrix


    -summ- for matrices



    Permuting a variable, and "bootstrapping" (sampling with replacement) a variable are super easy, see Ängquist, Lars. "Stata tip 92: Manual implementation of permutations and bootstraps." Stata Journal 10, no. 4 (2010): 686. It is not in the article, and I have not thought it through in detail, but doing it in blocks does not sound hard either.

    How do you mean -summ- for matrices? I believe I know how to do it with -matrix accum-, if that is what you have in mind.
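
    For readers without the article at hand, one common way to permute or bootstrap a single variable is sketched below. It is not necessarily the approach taken in the Stata tip, and the seed, sample size and variable names (x, xperm, xboot) are assumptions chosen for illustration.

```stata
clear
set seed 12345          // arbitrary seed, for reproducibility only
set obs 10
generate x = _n

* Permutation (sampling without replacement): shuffle the row positions
generate long id = _n
generate double u = runiform()
sort u
generate long pos = _n  // rank of u: a random permutation of 1.._N
sort id                 // restore the original row order
generate xperm = x[pos]

* Bootstrap (sampling with replacement): draw a random row index per observation
generate xboot = x[ceil(_N * runiform())]
list x xperm xboot
```

    Each value of x appears exactly once in xperm, while xboot may repeat some values and omit others.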


  • John Mullahy
    replied
    Cross-posting this William Lisowski post on "Promotion Problems" to the Wishlist since it concerns some issues that Stata 16 developers might wish to address.

    https://www.statalist.org/forums/for...otion-problems


  • Weiwen Ng
    replied
    Originally posted by Weiwen Ng
    In the absence of Stata buying MPlus, I would like to see some additional postestimation statistics added for latent class/profile methods. Normalized entropy and sample size-adjusted BIC would be convenience features. You can program SSBIC quite easily, discussed more below. There was some recent discussion on how to do entropy as well.

    I would particularly like to see an implementation of the Lo-Mendell-Rubin adjusted likelihood ratio test. This compares a model with K classes with a model with K-1 classes; the difference in -2*log likelihood does not follow the usual chi square distribution for reasons I don't understand but the linked article explains. Then, automating the bootstrapped LR test described in the paper linked would be very nice. It appears that the bootstrapped LR test is the current gold standard for model selection.

    ...
    Updating my prior request. I'd like to drop my request for the LMR likelihood ratio test. A simulation study revealed that in cases where classes are poorly separated and some indicators load on multiple classes - which is probably what a lot of real-life situations will involve - the LMR LR test had high false positive rates, and it was outperformed by plain BIC and the bootstrapped LR test.

    I'd like to reiterate my request for the bootstrapped LR test to be included in Stata 16, and for model entropy to be calculated (perhaps after estat lcprob, or as its own command).
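
    Until such features exist, both statistics can be computed by hand. The sketch below uses the usual definitions, normalized entropy = 1 - sum(-p*ln(p)) / (N*ln(K)) and SSBIC = -2*ll + k*ln((N+2)/24). The posterior class probabilities p1 and p2, and the values for the log likelihood, number of parameters and sample size, are made-up numbers for illustration; in practice they would come from predicted posterior probabilities and stored estimation results.

```stata
clear
* Toy posterior class probabilities for K = 2 classes (assumed values)
input double(p1 p2)
0.9 0.1
0.5 0.5
1.0 0.0
end
local K = 2

* Per-observation entropy contribution; treat 0*ln(0) as 0
generate double h = 0
forvalues k = 1/`K' {
    quietly replace h = h - cond(p`k' > 0, p`k' * ln(p`k'), 0)
}
quietly summarize h
scalar entropy = 1 - r(sum) / (r(N) * ln(`K'))
display "normalized entropy = " entropy

* Sample-size adjusted BIC from assumed log likelihood, parameter count and N
scalar s_ll = -1000
scalar s_k  = 10
scalar s_n  = 500
scalar ssbic = -2 * s_ll + s_k * ln((s_n + 2) / 24)
scalar bic   = -2 * s_ll + s_k * ln(s_n)
display "SSBIC = " ssbic "   BIC = " bic
```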

    Stata 15 can't currently estimate models with both categorical and continuous latent variables, as far as I know, and I'd like to make that a request. The categorical latent variable is, obviously, a latent class. A continuous latent variable could be something like a random intercept for a group. Do this, and you will enable multilevel latent class analysis, mixture item response theory models, mixture models involving random effects in general, and untold other wonders.
