  • Christoph Thewes
    replied
    Nothing you can't do with other commands/ados, but this would be nice and easy:
    An option for -tab- to display the numeric value before the value label. The nolabel option displays numeric codes rather than value labels, but what I normally want is both!

    So exactly what "numlabel VAR, add" does, but as a display option for tab.
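For reference, a minimal sketch of the current workaround the post alludes to. It assumes the auto dataset that ships with Stata (not part of the original post). Note that -numlabel- takes value-label names rather than variable names, and that it rewrites the stored labels, which is why a display-only option on -tab- would be more convenient.

```stata
* Workaround sketch, assuming the auto dataset shipped with Stata.
sysuse auto, clear

* The variable foreign carries the value label named "origin".
* Prefix each label with its numeric code: "0. Domestic", "1. Foreign".
numlabel origin, add
tabulate foreign

* Strip the prefixes again so the stored labels are left as they were.
numlabel origin, remove
tabulate foreign
```

Because the labels themselves are modified, the remove step is needed to restore them afterwards.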


  • Nigel Moore
    replied
    More options for post hoc analysis, especially with unbalanced data sets:
    • Dunnett's test (AFAIK, the two Dunnett's tests supported by Stata require balanced data sets; certainly Theresa Powell's does)
    • Steel's test
    • Dwass-Steel-Critchlow-Fligner
    • Shirley's test
    • Williams' test
    Oh, and a way of running Bartlett's without -oneway-, or appending -pwcompare- to -oneway-. Currently, we have to run ANOVA twice, -oneway- for Bartlett's and -anova- for -pwcompare-. How on earth can that make sense?
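The two-pass workflow described above can be sketched as follows. This is only an illustration: the auto dataset and the choice of price and rep78 are assumptions, not part of the original post, and Bonferroni is used simply as one adjustment that -pwcompare- accepts.

```stata
* Illustration of the duplicated work, assuming the auto dataset.
sysuse auto, clear

* Pass 1: -oneway- prints Bartlett's test for equal variances
* beneath its ANOVA table.
oneway price rep78

* Pass 2: fit the same one-way model again with -anova-, because
* -pwcompare- works from estimation results, which -oneway- does not leave.
anova price rep78
pwcompare rep78, mcompare(bonferroni) effects
```

Both passes fit the same model; the second exists only to make the pairwise comparisons available.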


  • Richard Williams
    replied
    Ability to directly read in more data files in other formats, e.g. SPSS.


  • Bruce Weaver
    replied
    I would like to see an option for the N-1 Chi-square added to tabulate. As Ian Campbell's simulation study has shown, it is a better choice than the Fisher-Irwin test (aka Fisher's exact test) for analyzing 2x2 tables when the marginal totals are not fixed in advance by design. Of course, it is possible to calculate the N-1 Chi-square oneself by various means (see below). But I think it would be better to have a built-in option.

    Cheers,
    Bruce


    Code:
    . * Three ways to compute the N-1 Chi-square with Stata.
    .
    . * First, generate a data set containing the 4 cell counts.
    . * I'll use the well-known 2x2 table showing the relationship
    . * between type of feeding (breast vs bottle) and malocclusion
    . * of the teeth in infants (see Yates, 1934; Kendall & Stuart,
    . * 1967; Campbell, 2007).
    .
    . clear all
    
    . input rowvar colvar N
    
            rowvar     colvar          N
      1. 0 0 4
      2. 0 1 16
      3. 1 0 1
      4. 1 1 21
      5. end
    
    . list
    
         +----------------------+
         | rowvar   colvar    N |
         |----------------------|
      1. |      0        0    4 |
      2. |      0        1   16 |
      3. |      1        0    1 |
      4. |      1        1   21 |
         +----------------------+
    
    .
    . * METHOD 1.
    .
    . * Write a small program to compute E.S. Pearson's N-1 Chi-square test
    . * using stored results from the 'tabulate' command.
    . * Program name:  ESPChiSq, short for Egon S. Pearson's N-1 Chi-Square.
    . capture program drop ESPChiSq
    
    . quietly program ESPChiSq
    
    .
    . * Use tabulate command to compute Pearson's Chi-square.
    . tabulate rowvar colvar [fweight = N], chi2
    
               |        colvar
        rowvar |         0          1 |     Total
    -----------+----------------------+----------
             0 |         4         16 |        20
             1 |         1         21 |        22
    -----------+----------------------+----------
         Total |         5         37 |        42
    
              Pearson chi2(1) =   2.3858   Pr = 0.122
    
    . ESPChiSq
    Egon S. Pearson's N-1 Chi-Square Test
    N-1 ChiSq   df     p-value
    ----------------------------
    2.3290418    1    .12698002
    ----------------------------
    
    .
    . * Could also be done using tabi (i.e., immediate form of tabulate).
    .
    . tabi 4 16 \ 1 21, chi2
    
               |          col
           row |         1          2 |     Total
    -----------+----------------------+----------
             1 |         4         16 |        20
             2 |         1         21 |        22
    -----------+----------------------+----------
         Total |         5         37 |        42
    
              Pearson chi2(1) =   2.3858   Pr = 0.122
    
    . ESPChiSq
    Egon S. Pearson's N-1 Chi-Square Test
    N-1 ChiSq   df     p-value
    ----------------------------
    2.3290418    1    .12698002
    ----------------------------
    
    .
    . * ----------------------------------------------
    .
    . * METHOD 2.
    .
    . * Compute a constant stratum variable
    . generate Stratum = 0   
    
    . list
    
         +--------------------------------+
         | rowvar   colvar    N   Stratum |
         |--------------------------------|
      1. |      0        0    4         0 |
      2. |      0        1   16         0 |
      3. |      1        0    1         0 |
      4. |      1        1   21         0 |
         +--------------------------------+
    
    .
    . * Use tab3way to display the contingency table.
    . tab3way rowvar colvar Stratum [fweight=N] , rowtot coltot
    
    
    Frequency weights are based on the expression: N
    Table entries are cell frequencies
    Missing categories ignored
    
    -------------------------------
              | Stratum and colvar
              | -------- 0 --------
       rowvar |     0      1  TOTAL
    ----------+--------------------
            0 |     4     16     20
            1 |     1     21     22
        TOTAL |     5     37     42
    -------------------------------
    
    . * Use the cc command to compute the Mantel-Haenszel statistic & p-value.
    . cc rowvar colvar [fweight=N], by(Stratum)
    
             Stratum |       OR       [95% Conf. Interval]   M-H Weight
    -----------------+-------------------------------------------------
                   0 |       5.25      .4440375    270.558     .3809524 (exact)
    -----------------+-------------------------------------------------
               Crude |       5.25      .4440375    270.558              (exact)
        M-H combined |       5.25      .5338913   51.62568              
    -------------------------------------------------------------------
    
                       Test that combined OR = 1:
                                    Mantel-Haenszel chi2(1) =      2.33
                                                    Pr>chi2 =    0.1270
    
    .
    . * The M-H test above is matching the Linear-by-linear association test
    . * from SPSS.
    .
    . * Now see what happens if the stratification variable is omitted.
    . cc row col [fweight=N]
                                                             Proportion
                     |   Exposed   Unexposed  |      Total     Exposed
    -----------------+------------------------+------------------------
               Cases |        21           1  |         22       0.9545
            Controls |        16           4  |         20       0.8000
    -----------------+------------------------+------------------------
               Total |        37           5  |         42       0.8810
                     |                        |
                     |      Point estimate    |    [95% Conf. Interval]
                     |------------------------+------------------------
          Odds ratio |             5.25       |    .4440375     270.558 (exact)
     Attr. frac. ex. |         .8095238       |   -1.252062    .9963039 (exact)
     Attr. frac. pop |         .7727273       |
                     +-------------------------------------------------
                                   chi2(1) =     2.39  Pr>chi2 = 0.1224
    
    . * If I omit the constant Stratum variable, Pearson's Chi-square is computed.
    .
    . * ----------------------------------------------
    .
    . * METHOD 3.
    .
    . * As Howell's notes below show, Mantel's Chi-square for linear trend
    . * (aka., the test of linear-by-linear association in SPSS) is equal
    . * to Pearson's r-squared * (N-1).
    . * https://www.uvm.edu/~dhowell/methods7/Supplements/OrdinalChiSq.html
    .
    . quietly correlate rowvar colvar [fweight = N]
    
    . * return list
    . local Linear = (r(N)-1)*r(rho)^2
    
    . local dfLinear = 1
    
    . display "N-1 Chi-square = " `Linear'
    N-1 Chi-square = 2.3290418
    
    . display "             p = " chi2tail(1,`Linear')
                 p = .12698002
    
    . * ----------------------------------------------
    Here is the ESPChiSq program code that did not appear in that output due to the use of quietly.

    Code:
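    capture program drop ESPChiSq
    quietly program ESPChiSq
     * Run right after -tabulate ..., chi2- or -tabi ..., chi2-, which
     * leave r(N), r(r), r(c), and r(chi2) behind.
     display "Egon S. Pearson's N-1 Chi-Square Test"
     display "N-1 ChiSq   df     p-value"
     display "----------------------------"
     * N-1 Chi-square = Pearson Chi-square * (N-1)/N.
     * The p-value uses df = 1, so this is meant for 2x2 tables.
     display (r(N)-1)/r(N)*r(chi2) "    " (r(r)-1)*(r(c)-1) ///
      "    " chi2tail(1,(r(N)-1)/r(N)*r(chi2))
     display "----------------------------"
    end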
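The relation this program relies on can be written out explicitly. The cell labels a, b, c, d are notation introduced here for the 2x2 counts (row-wise: 4, 16, 1, 21); they do not appear in the original post. The arithmetic reproduces the value shown in the output above.

```latex
\chi^2_{N-1} \;=\; \frac{N-1}{N}\,\chi^2_{\text{Pearson}}
            \;=\; \frac{(N-1)\,(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)}

% Worked example with a=4, b=16, c=1, d=21, N=42:
\chi^2_{N-1} \;=\; \frac{41\,(4\cdot 21 - 16\cdot 1)^2}{20\cdot 22\cdot 5\cdot 37}
            \;=\; \frac{41\cdot 4624}{81400}
            \;=\; \frac{189584}{81400} \;\approx\; 2.3290
```

Equivalently, scaling the reported Pearson value gives 2.3858 × 41/42 ≈ 2.329, up to rounding of the displayed statistic.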


  • Isaac Maddow-Zimet
    replied
    Richard Williams: I've had similar experiences with sem (and especially gsem) being very slow compared to Mplus. I've used R's lavaan package in the past, and I've found it much closer in speed to Mplus than to Stata for fitting structural equation models. But it would be great to have a speedy implementation in Stata as well.


  • Richard Williams
    replied
    "I also find it strange that StataCorp ignores people's advice for speeding up a number of built-in commands."
    In fairness, Stata 14.2 did make some nice speed improvements in -sem-, especially when fiml is used. As I said before, Mplus is amazingly fast, but I think it is faster than everyone. It would be helpful to know if other packages are faster than Stata. If I am not running something too monstrous, I find Stata quite zippy, seemingly faster than SPSS (but I haven't used SPSS in years).

    In short, I would love to see speed improvements in Stata, but users of other packages may want speed improvements too.

    Incidentally, my ongoing hope is that Stata buys out Mplus and steals all their algorithms! Or copies whatever the secret of Mplus's success is.


  • Cynthia Inglesias
    replied
    In terms of methods, Stata 15 is remarkable indeed. But Mata only got a handful of functions, while no effort at all was made to document the graphics language. With the latter, StataCorp can begin small (e.g. one or two chapters in every release). I really think that for power users this would be a great addition. The built-in editor got some love, but there is still no autosave function or syntax auto-complete.

    I also find it strange that StataCorp ignores people's advice for speeding up a number of built-in commands. Sure, increasing the number of variables in Stata MP is helpful, but it would be a lot more useful in my opinion to speed up data management without necessarily using parallelization. Collectively, the time required to run certain commands accumulates fast with big data.

    Stata is a great piece of software, and with relatively few changes it could become a lot better. Some of the decisions StataCorp makes are truly puzzling. I hope Stata 16 will be the version where I can do 99% of my work in a single environment.
    Last edited by Cynthia Inglesias; 07 Jun 2017, 03:35.


  • Richard Williams
    replied
    Maybe it is in there somewhere in the new docs, but I am not seeing anything about faster execution. I don't know how it does it, but Mplus is often much faster than Stata sem (for that matter, Mplus may be by far the fastest of all SEM programs). Stata has gotten better, but if I could get the speed of Mplus and the ease of use and integration of features that Stata has, I would be very happy.


  • Rich Goldstein
    replied
    Here are some issues especially related to the design aspect of studies:

    Randomization plans, including adaptive randomization (yes, I am familiar with the user-written routines -ralloc-, -rct_minim- and -randomize-).

    Programs for various kinds of matching. Introductions to various forms can be found in (a) Rosenbaum, P.R. (2010), Design of Observational Studies, Springer, or (b) Stuart, E.A. (2010), Matching Methods for Causal Inference: A review and a look forward, Statistical Science 25(1): 1-21. I do recognize that there are user-written routines (e.g., I have used -vmatch- several times), but these are quite limited.

    Re: MI: I have a categorical variable with dozens of categories (say, numbered 1-99), and there are two types of missing values: (a) standard missing and (b) "60", a special category that means the value is supposed to be one of 61-69 but we don't know which it is. I want to be able to impute both at the same time. I was told by StataCorp tech support that I could not do this; I then learned (Ian White) that -ice- can handle this situation.

    For power analysis: I would like to compute power for a certain width of a confidence interval (i.e., I want the power for an "accurate" estimate of, say, an important coefficient).
    Last edited by Rich Goldstein; 06 Jun 2017, 13:43.


  • Richard Williams
    replied
    Start updating web pages an hour earlier when Stata 16 is released so as not to prolong the suspense.
