
  • #31
    Originally posted by Jack Chau View Post
    Based on Clyde's previous response, it would seem the effect size should only come from experts.
    The guess or estimate for the denominator of effect size can come from prior experience, either personal or community (experts included). But the numerator—the delta or effect or whatever you want to call it—is usually chosen by the experimenter, often with reference to what's considered an important difference in the community. To paraphrase Stephen Senn, it's the difference that you'd hate to miss.
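    To make the numerator/denominator structure explicit (a standard formulation, not specific to this thread): the standardized-difference family has the general form $d = (\mu_1 - \mu_2)/\sigma$, where the numerator $\mu_1 - \mu_2$ is the difference you'd hate to miss and the denominator $\sigma$ is the variability guess that prior experience supplies.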

    Comment


    • #32
      Originally posted by Joseph Coveney View Post
      Do you mean nuisance parameters, or the effect? For the former, you'd simulate throughout a range of plausible values and see how sensitive the test performance (power, test size) or model performance (bias, efficiency) is to them. For the latter, that's just a conventional power analysis, done analogously with effect size and sample size varying and type 1 error rate fixed.
      That's the challenge. Running sensitivity analyses makes complete sense, but how would you identify what those plausible values are in the absence of literature estimates?
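      For concreteness, the kind of scan described in the quote might look like the sketch below; the linear model, the fixed effect of 0.5, and the candidate SD values are all hypothetical placeholders.

      Code:
      program define simpower, rclass
          args sd
          drop _all
          set obs 100
          generate x = rnormal()
          generate y = 0.5*x + rnormal(0, `sd')   // fixed effect, varying error SD
          regress y x
          test x
          return scalar reject = (r(p) < 0.05)    // 1 if the null is rejected at 5%
      end

      * scan a grid of plausible error SDs and estimate power at each
      foreach s of numlist 0.5 1 2 {
          quietly simulate reject=r(reject), reps(500) nodots: simpower `s'
          quietly summarize reject
          display "error SD = `s': estimated power = " %5.3f r(mean)
      }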

      Comment


      • #33
        Originally posted by Joseph Coveney View Post
        The guess or estimate for the denominator of effect size can come from prior experience, either personal or community (experts included). But the numerator—the delta or effect or whatever you want to call it—is usually chosen by the experimenter, often with reference to what's considered an important difference in the community. To paraphrase Stephen Senn, it's the difference that you'd hate to miss.
        I don't follow this thinking. The effect size is a single numeric value (e.g., suppose you are comparing an intervention to a control or reference case; it may be that the intervention coefficient/effect size is 1.3, which suggests that the intervention would increase a continuous outcome by 30% relative to the control). Why is there a denominator or numerator at all?

        Comment


        • #34
          Originally posted by Jack Chau View Post
          Why is there a denominator or numerator at all?
          It's the difference family of effect sizes (Cohen's d and the like). You can Google for it.

          You can also
          Code:
          help esize
          and then click on the View complete PDF manual entry hyperlink on the help file that pops up, and then click on the Methods and formulas hyperlink at the beginning of the entry in the user's manual.
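          As a quick illustration of the difference family (using Stata's bundled auto dataset, purely for demonstration):

          Code:
          sysuse auto, clear
          esize twosample mpg, by(foreign) cohensd   // Cohen's d: mean difference over pooled SD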

          Comment


          • #35
            Originally posted by Jack Chau View Post
            how would you identify what those plausible values are in the absence of literature estimates?
            I have yet to encounter a research problem so novel that there isn't even a hint from either prior experience (personal or community) or common sense.

            But, okay.

            Start with the real line. Is a negative value plausible for a given parameter? If not, then you've cut the problem essentially in half right off the bat.

            Is a value of absolutely zero plausible (say for a variance or similar nuisance parameter)? If not, you're down to a range of something like -epsdouble()- to -maxdouble()-. Scan that range if you can't do any better. (I would be surprised if you couldn't. For example, someone in the biological sciences can exclude ranges that "aren't physiological"—values that aren't compatible with life. It gets things down to what's manageable fairly quickly.)
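            Where the nuisance parameter is a standard deviation, that kind of scan can be a one-liner. A minimal sketch, with purely illustrative means and SD grid:

            Code:
            * power for a fixed difference in means across a range of plausible SDs
            power twomeans 0 5, sd(5(5)25) power(0.8) table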

            Comment


            • #36
              Originally posted by Joseph Coveney View Post
              I have yet to encounter a research problem so novel that there isn't even a hint from either prior experience (personal or community) or common sense.

              But, okay.

              Start with the real line. Is a negative value plausible for a given parameter? If not, then you've cut the problem essentially in half right off the bat.

              Is a value of absolutely zero plausible (say for a variance or similar nuisance parameter)? If not, you're down to a range of something like -epsdouble()- to -maxdouble()-. Scan that range if you can't do any better. (I would be surprised if you couldn't. For example, someone in the biological sciences can exclude ranges that "aren't physiological"—values that aren't compatible with life. It gets things down to what's manageable fairly quickly.)
              I do have a previous research study that I could use estimates from; however, I am skeptical about their internal validity. That is, the sample size in that study was so small that those estimates are, in my view, inaccurate and non-representative. Would I still be able to use them? My apologies if this is somewhat of a rhetorical question, as I am not a biostatistician.

              Comment


              • #37
                Originally posted by Jack Chau View Post
                I am skeptical about their internal validity. That is, the sample size in that study was so small that those estimates are, in my view, inaccurate and non-representative. Would I still be able to use them?
                If the data were honestly and competently gathered then they're valid and representative.

                If you're worried about precision of the parameter estimates from the data for use in power analysis, then use the worst-case (least favorable to your research hypothesis) 95% confidence bound as the plug-in estimate of the parameter.

                As always, be sure to explore how sensitive your power analysis is to your assumptions.
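                To sketch the plug-in idea with made-up pilot numbers (the n of 20, SD of 10, and difference of 5 below are all hypothetical): for a two-means comparison, a larger SD means lower power, so the worst case is the upper chi-squared confidence bound of the pilot SD.

                Code:
                local n  = 20
                local sd = 10
                * upper 95% confidence bound for the SD (chi-squared interval for a variance)
                local sd_hi = `sd' * sqrt((`n' - 1) / invchi2(`n' - 1, 0.025))
                display "worst-case SD: " %6.2f `sd_hi'
                * plug the worst-case SD into the power analysis
                power twomeans 0 5, sd(`sd_hi') power(0.8)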

                Comment


                • #38
                  Originally posted by Joseph Coveney View Post
                  If the data were honestly and competently gathered then they're valid and representative.

                  If you're worried about precision of the parameter estimates from the data for use in power analysis, then use the worst-case (least favorable to your research hypothesis) 95% confidence bound as the plug-in estimate of the parameter.

                  As always, be sure to explore how sensitive your power analysis is to your assumptions.
                  Is worst-case the lower or upper bound of the confidence interval?

                  Comment


                  • #39
                    If I wanted to plot power as a function of the number of clusters, would this code be written correctly?

                    Code:
                    * Graph estimates of power as a function of the number of clusters
                    power swcrt, num_clus(3 6 9 18 36) reps(1000) ///
                        graph(ydimension(power) xdimension(num_clus) ///
                        ytitle(Power) ///
                        xtitle(Number of clusters))
                    Would I need to specify "power" and "num_clus" as the macros that I defined above?

                    Comment


                    • #40
                      Hello Statalisters,

                      If I have the coefficients for each level of a categorical variable and I want to simulate a dependent variable from them, how would I go about doing so in a panel dataset? Should I take each coefficient and multiply it by the time variable I have created, using an if condition as follows (assuming a time variable with 3 levels in addition to baseline)?

                      Code:
                      qui gen y = `intercept' + `intrvcoeff'*intrv + u_3 + u_2 + error if time == 0   // reference level
                      qui replace y = `intercept' + `timecoeff1'*time + `intrvcoeff'*intrv + u_3 + u_2 + error if time == 1
                      qui replace y = `intercept' + `timecoeff2'*time + `intrvcoeff'*intrv + u_3 + u_2 + error if time == 2
                      qui replace y = `intercept' + `timecoeff3'*time + `intrvcoeff'*intrv + u_3 + u_2 + error if time == 3

                      Thanks in advance,
                      Last edited by CEdward; 29 Oct 2019, 18:20.

                      Comment


                      • #41
                        No. If time is a discrete variable taking on values 0, 1, 2, and 3, then it would be:

                        Code:
                        quietly gen y = `intercept' + `timecoeff1'*1.time + `timecoeff2'*2.time + `timecoeff3'*3.time + ///
                            `intrvcoeff'*intrv + u_3 + u_2 + error

                        Comment


                        • #42
                          Originally posted by Clyde Schechter View Post
                          No. If time is a discrete variable taking on values 0, 1, 2, and 3, then it would be:

                          Code:
                          quietly gen y = `intercept' + `timecoeff1'*1.time + `timecoeff2'*2.time + `timecoeff3'*3.time + ///
                              `intrvcoeff'*intrv + u_3 + u_2 + error
                          Many thanks, Clyde. I have edited my code above. Since the coefficients only apply when the time variable takes a certain value (because those coefficients are for a specific level in relation to the reference), would the code I used above be correct? I guess it would be equivalent to what you posted?

                          Comment


                          • #43
                            the coefficients only apply when the time variable takes a certain value (because those coefficients are for a specific level in relation to the reference)

                            I don't understand what this means.

                            Comment


                            • #44
                              Originally posted by Clyde Schechter View Post
                              I don't understand what this means.
                              Right - that was a convoluted way of explaining that
                              timecoeff1 only applies when time == 1,
                              timecoeff2 when time == 2,
                              etc.

                              Comment


                              • #45
                                Yes, that's what the code in #41 does. If time == 1, then 1.time is 1, 2.time is 0, and 3.time is 0. This is how factor-variable notation in Stata works. See -help fvvarlist-.
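                                A quick way to see that expansion in action (toy data, just for illustration):

                                Code:
                                clear
                                set obs 4
                                generate time = _n - 1   // time takes the values 0, 1, 2, 3
                                generate t1 = 1.time     // 1 when time == 1, 0 otherwise
                                generate t2 = 2.time     // 1 when time == 2, 0 otherwise
                                generate t3 = 3.time     // 1 when time == 3, 0 otherwise
                                list, noobs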

                                Comment
