  • Apologies if this has already been suggested.

    For ttest, it would be great if one could specify a nonzero difference between mu1 and mu2 for unpaired t-tests.

    Context. I teach my students that all t-tests have a common format, as follows: t = (statistic - parameter|H0) / SEstatistic. (See attached slides for a nicer view of that.) For an unpaired t-test, the statistic = the difference between the two sample means, the parameter|H0 = the specified difference between the two population means (which does not necessarily have to be zero), and SEstatistic = the standard error of the difference between two independent means. To illustrate a null specifying a nonzero difference between population means, I made up an example stating that in 1960, the population difference in height between men and women was 5 inches. Someone collecting data currently wishes to test the null hypothesis that the difference is still 5 inches. It would be great if statistical software allowed one to test hypotheses like this without having to resort to the trickery of subtracting a given amount (5 inches in this case) from the scores of one group.
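
    As a sketch of the common format described above, applied to the unpaired (equal-variance) case: the symbols \delta_0 (the difference between population means under H0), \bar{x}_i, s_i, n_i and the pooled SD s_p are notation assumed here for illustration, not taken from the slides.

    ```latex
    t = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
    \qquad
    s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
    \qquad
    df = n_1 + n_2 - 2
    ```

    In the height example, \delta_0 = 5 inches; the usual test of no difference is the special case \delta_0 = 0.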

    Cheers,
    Bruce


    [Attached images: IntroBiostats_08_slide39.png, IntroBiostats_08_slide40.png]
    --
    Bruce Weaver
    Email: [email protected]
    Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
    Version: Stata/MP 18.0 (Windows)


    • Bruce Weaver is this what you have in mind?

      Code:
      webuse nhanes2f, clear
      mean weight, over(sex)
      test Male = Female
      test Male = Female + 10
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      Stata Version: 17.0 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam


      • Originally posted by Richard Williams
        Bruce Weaver is this what you have in mind?

        Code:
        webuse nhanes2f, clear
        mean weight, over(sex)
        test Male = Female
        test Male = Female + 10

        Hi Richard. If I use -ttest- on the same data, I get a different result. It appears that -test- after -mean- is using N-1 as the df, not N-2. The difference won't matter much with really large samples, I suppose, but people often have relatively small samples when doing t-tests.

        Code:
        webuse nhanes2f, clear
        ttest weight, by(sex)
        display "t^2 = " r(t)^2
        mean weight, over(sex)
        test Male = Female
        * -test- is using df = N-1, not N-2.
        test Male = Female + 10
        * To get the right result with correct df:
        generate wt = weight
        replace wt = weight + 10 if sex==2
        ttest wt, by(sex)
        display "F = " r(t)^2
        * I would like to be able to do something like this:
        *   ttest weight, by(sex) delta(10)
        * where delta = mu1-mu2|H0
        --
        Bruce Weaver
        Email: [email protected]
        Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
        Version: Stata/MP 18.0 (Windows)


        • You pointed me in the right direction, Richard. Re #227 and #228, -regress- followed by -lincom- gives the right result.

          Code:
          webuse nhanes2f, clear
          * Test H0: mu1-mu2 = 10
          generate wt = weight
          replace wt = weight + 10 if sex==2
          ttest wt, by(sex)
          * Perhaps -regress- followed by -lincom- can give the result I want.
          quietly regress weight i.sex
          lincom 1.sex - 2.sex - 10
          display "t = " r(t)
          But I still think most users would find it easier if ttest had an option allowing one to specify the value of mu1-mu2|H0. Also, regress does allow various vce() options to deal with variance heterogeneity, but I don't think those options match exactly what you get with the unequal or welch options for ttest.

          Cheers,
          Bruce
          --
          Bruce Weaver
          Email: [email protected]
          Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
          Version: Stata/MP 18.0 (Windows)


          • there are multiple user-written modules to do variations of quantile regression. these have evolved to work around the many limitations of xtqreg, most notably the lack of support for factor variables or interactions.
            my suggestion/request: enhance xtqreg functionality to include more of the other xt-class commands.
            if this functionality already exists i apologize and would appreciate education.
            thank you
            George Hoffman


            • Would it be possible to raise the limit on "estimates store"? Currently one can only store 300 estimations, a limit quickly reached if you need to verify robustness to many different specification tweaks.


              • Would it be feasible, in the outcome of any regression model, to list the variable(s) a given predictor is collinear with?
                Last edited by Carlo Lazzaro; 02 May 2019, 09:12.
                Kind regards,
                Carlo
                (Stata 18.0 SE)


                • Carlo Lazzaro, I don't think so, because sometimes a given variable is collinear with a combination of variables rather than a single other variable.

                  The classic example would be year of interview, year of birth, and age. If we have interviews that were taken in different years (so year of interview is not a constant), then we might think that the current (at time of interview) situation may influence the outcome, the age of the respondent could influence the outcome, and the circumstances in which the respondent grew up (the year of birth) might influence the outcome. However, age = year of interview - year of birth, so if you know two, you also know the third. So it is the combination of variables that results in perfect collinearity.
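
                  A minimal sketch of that situation with simulated data. The variable names (yoi, yob, age, y) and the simulated values are made up for illustration only. No pair of predictors is perfectly collinear, yet Stata should omit one of the three because of the exact three-way relationship.

                  ```stata
                  clear
                  set seed 12345
                  set obs 200
                  * hypothetical year of birth and year of interview
                  generate yob = 1950 + floor(40*runiform())
                  generate yoi = 2000 + floor(10*runiform())
                  * age is an exact linear combination of the other two
                  generate age = yoi - yob
                  generate y = rnormal()
                  * pairwise correlations are all below 1 in absolute value
                  correlate yoi yob age
                  * one predictor should be omitted because of collinearity
                  regress y yoi yob age
                  ```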
                  ---------------------------------
                  Maarten L. Buis
                  University of Konstanz
                  Department of history and sociology
                  box 40
                  78457 Konstanz
                  Germany
                  http://www.maartenbuis.nl
                  ---------------------------------


                  • Maarten:
                    I do share your point.
                    Maybe the "collinearity list" can work for simple cases only; when things get messier, Stata could return something like "collinearity depends on a combination of variables. List unfeasible".
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)


                    • Perhaps this already is possible (but if so I haven't seen how to do it): Could twoway scatteri be modified to accept numlists for the coordinates? E.g. instead of
                      Code:
                      tw scatteri 4 5 4 6 4 7
                      one could use something like
                      Code:
                      tw scatteri 4 (5(1)7)
                      and instead of
                      Code:
                      tw scatteri 4 5 4 6 4 7 5 5 5 6 5 7
                      one could use something like
                      Code:
                      tw scatteri (4 5) (5(1)7)


                      • Originally posted by John Mullahy
                        Perhaps this already is possible (but if so I haven't seen how to do it): Could twoway scatteri be modified to accept numlists for the coordinates?
                        One workaround can be to build your immediate coordinate list in a macro and then call -scatteri-, as in the following example.

                        Code:
                        tw scatteri 4 5 4 6 4 7 5 5 5 6 5 7, name(have)
                        
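                        * Build the "y x" coordinate pairs in a local macro
                        local coordlist ""
                        foreach a of numlist 4 5 {
                          foreach b of numlist 5(1)7 {
                            * Plain macro copy (no "=") avoids evaluating a string expression
                            local coordlist "`coordlist' `a' `b'"
                          }
                        }
                        * Check the assembled list before plotting
                        di "`coordlist'"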
                        
                        tw scatteri `coordlist', name(want)


                        • a 'simple' suggestion: the bottom of the Stata main window should/could act as a more versatile status bar. it already shows the results of `pwd'. most typically, it could show number of obs (_N), the number of variables, memory consumption, sort order, last _rc. it could also indicate if data had changed since last `use'. perhaps the user could select what is displayed in the status bar.

                          this suggestion arose because i spent the last two days working with a dataset that had some observations dropped (because of an errant .ado that i was building). i'm not sure how many times that might have happened previously, but i came very close to 'save, replace' as i usually do, which would have led to a very bad situation. perhaps, if the obs and var count were readily visible, i would notice the dataset status without explicit query.
                          thank you for considering
                          george hoffman
                          Last edited by George Hoffman; 12 May 2019, 11:23.


                          • Originally posted by George Hoffman
                            a 'simple' suggestion: the bottom of the Stata main window should/could act as a more versatile status bar. it already shows the results of `pwd'. most typically, it could show number of obs (_N), the number of variables, memory consumption, sort order, last _rc. it could also indicate if data had changed since last `use'. perhaps the user could select what is displayed in the status bar.

                            this suggestion arose because i spent the last two days working with a dataset that had some observations dropped (because of an errant .ado that i was building). i'm not sure how many times that might have happened previously, but i came very close to 'save, replace' as i usually do, which would have led to a very bad situation. perhaps, if the obs and var count were readily visible, i would notice the dataset status without explicit query.
                            thank you for considering
                            george hoffman

                            What's wrong with how Stata already displays this information? For example, the last _rc code is displayed next to its command in the History window, and the Properties window displays _N, memory usage (for that dataset only, not active operations), number of variables, etc. The only thing I don't think it automatically shows is sort order, but that is found quickly enough by using a -describe- command.
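
                            As a rough sketch of the explicit query route, the following pulls the same items from -describe- and the c-class values. The auto dataset is used only as a stand-in for whatever data are in memory.

                            ```stata
                            sysuse auto, clear
                            * observations, variables, and sort order in one compact listing
                            describe, short
                            * the same items individually, from c-class values
                            display "obs = " c(N) ", vars = " c(k)
                            display "sorted by: `c(sortedby)'"
                            * 1 if the data in memory have changed since they were last saved
                            display "changed since last save: " c(changed)
                            ```

                            Running a few lines like these just before any -save- is one way to notice an unexpected drop in observations.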


                            • Re #237, with respect to the problem that would have ensued after -save, replace-, I would argue that it is just bad programming practice to ever overwrite a source data file with a derived data file. Even with all the information you ask for in the status bar, there is always the possibility that the code taking you from start to end contains errors, errors that don't happen to show up in the information shown in the status bar. To be prepared for that possibility, whenever you transform a data set you should save it as a new data set under a new name. Never overwrite the data you started with, and always save the do-file. If you do that, if an error is discovered later, you can always fix the error and re-run.


                              • dear Leonardo and Clyde -
                                thank you for your responses.
                                @ leonardo - yes, the information is available in other ways already. the properties window does display most of the fields that I referenced. i'm not sure how most people use Stata, but the more windows I have open, the less room I have for the results pane, which is where I'm usually focused.
                                @ clyde - yes, i acknowledge a bad habit. others have asked for version control to be built into the save function. i am aware of some user-written options. i will investigate.
                                i am a long-time user of this fantastic program and usergroup. i appreciate the help. thank you all again.
