  • pwcorr, star(.05) but with 3 star levels

    Good afternoon,
    Is there a way to do something analogous to pwcorr, star(.05), but with 3 star levels corresponding to 10%, 5%, and 1% significance?
    Thank you.

  • #2
    Code:
    ssc install estout, replace
    Code:
    sysuse auto, clear
    pwcorr price headroom mpg displacement
    
    * estout: post the correlation matrix, then tabulate it with three star levels
    quietly estpost corr price headroom mpg displacement, matrix
    esttab ., replace b(3) unstack nonum nomtitle not noobs ///
    label compress eqlabels((1) (2) (3) (4), lhs("Variables")) ///
    starlevels(* 0.1 ** 0.05 *** 0.01)
    Result:

    Code:
    . pwcorr price headroom mpg displacement
    
                 |    price headroom      mpg displa~t
    -------------+------------------------------------
           price |   1.0000 
        headroom |   0.1145   1.0000 
             mpg |  -0.4686  -0.4138   1.0000 
    displacement |   0.4949   0.4745  -0.7056   1.0000 
    
    . 
    
    . esttab ., replace b(3) unstack nonum nomtitle not noobs ///
    > label compress eqlabels((1) (2) (3) (4), lhs("Variables")) ///
    > starlevels(* 0.1 ** 0.05 *** 0.01)
    
    --------------------------------------------------------------------
    Variables              (1)          (2)          (3)          (4)   
    --------------------------------------------------------------------
    Price                1.000                                          
    Headroom (in.)       0.115        1.000                             
    Mileage (mpg)       -0.469***    -0.414***     1.000                
    Displacem.. in.)     0.495***     0.474***    -0.706***     1.000   
    --------------------------------------------------------------------
    * p<0.1, ** p<0.05, *** p<0.01


    • #3
      Thank you very much.
      I have 14 variables, and would prefer to have the output either replace the data in memory or write directly to excel. How would I do that?


      • #4
        Code:
        esttab . using myfile.csv, ...
        See

        Code:
        help esttab
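
        As a fuller sketch of the same idea, the starred table from #2 could be written to a CSV file that Excel opens directly. The file name corr_table.csv is only a placeholder, and the four auto variables stand in for the 14 variables mentioned in #3; this assumes estout is installed as in #2.

        ```stata
        * Sketch: write the three-star correlation table to a CSV file.
        * corr_table.csv is a placeholder name; swap in your own variable list.
        sysuse auto, clear
        quietly estpost corr price headroom mpg displacement, matrix
        esttab . using corr_table.csv, replace b(3) unstack nonum nomtitle not noobs ///
            label starlevels(* 0.1 ** 0.05 *** 0.01)
        ```

        The replace option overwrites an existing file of the same name, so choose the name with care.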
        Last edited by Andrew Musau; 04 Jan 2023, 17:45.


        • #5
          I wanted to mention corrci from the Stata Journal. (No stars, however. Stars are an abomination in my view.)


          Code:
          . sysuse auto, clear
          
          . corrci headroom trunk weight length displacement , saving(corrci_results) savepvalue
          
          (obs=74)
          
                                     correlations and 95% limits
          headroom     trunk             0.662    0.511    0.774
          headroom     weight            0.483    0.287    0.641
          headroom     length            0.516    0.326    0.666
          headroom     displacement      0.474    0.276    0.634
          trunk        weight            0.672    0.524    0.781
          trunk        length            0.727    0.597    0.819
          trunk        displacement      0.609    0.442    0.735
          weight       length            0.946    0.915    0.966
          weight       displacement      0.895    0.838    0.933
          length       displacement      0.835    0.750    0.893
          
          . u corrci_results
          
          . list
          
               +---------------------------------------------------------------------+
               |     var1           var2          r      lower      upper     pvalue |
               |---------------------------------------------------------------------|
            1. | headroom          trunk   .6620111   .5107769   .7735031   1.34e-10 |
            2. | headroom         weight   .4834558   .2866197   .6411296   .0000128 |
            3. | headroom         length   .5162955   .3262901   .6662006   2.50e-06 |
            4. | headroom   displacement   .4744915   .2759067   .6342269   .0000195 |
            5. |    trunk         weight   .6722057   .5242273   .7807783   5.46e-11 |
               |---------------------------------------------------------------------|
            6. |    trunk         length   .7265956   .5972572    .819102   2.34e-13 |
            7. |    trunk   displacement   .6086351   .4415427   .7349259   8.78e-09 |
            8. |   weight         length   .9460086   .9153801   .9657493   5.86e-37 |
            9. |   weight   displacement   .8948958   .8376901   .9326781   6.17e-27 |
           10. |   length   displacement     .83514   .7497066   .8931923   2.27e-20 |
               +---------------------------------------------------------------------+
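
          Since #3 asked about getting results into Excel, here is a minimal sketch that exports the saved corrci results. It assumes corrci_results.dta was created by the call above; the .xlsx file name is a placeholder.

          ```stata
          * Sketch: send the saved corrci results to an Excel workbook.
          * Assumes corrci_results.dta exists in the working directory.
          use corrci_results, clear
          export excel using "corrci_results.xlsx", firstrow(variables) replace
          ```

          Note that use ..., clear discards whatever data are in memory, so save any changes first.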
          Last edited by Nick Cox; 04 Jan 2023, 18:40.


          • #6
            Starring in tables

            Can someone please explain precisely what is the point of starring particular results according to P-values in lengthy tables of results?

            On a Neyman-Pearson view there could be a decision problem addressed by a significance test, in which case the advice (instruction!) is to work with a single critical significance level and find whether results are or are not significant at that level. This approach has been criticised in many different ways, as should be familiar. But if you are doing this repeatedly, then you may wish to flag (e.g.) results with P < 0.05 with stars (which here could mean any convenient symbol) -- although even that raises questions of multiplicity, i.e. many hypotheses are being tested at the same time, and the tests aren't usually independent.

            The use of multiple levels for starring in tables of results has long seemed puzzling to me, for all that it is very common.

            It seems closer to a Fisher view whereby the P-value is regarded as an indicator in its own right, perhaps as a measure of strength of association. But if the P-values can and should be taken seriously, then look at the P-values directly. Don't degrade them as if you were rating hotels or movies or website purchases.

            In the case of correlation,

            1. Correlations themselves are already measures of strength of association (linear, monotonic, whatever, depending on what flavour you are using).

            2. They can be interpreted directly by looking at scatter plots.

            3. P-values depend highly sensitively on issues such as how samples were taken, whether distributions are bivariate normal (if P-values are calculated that way), and whether there is cluster or serial dependence. At a minimum, they can't be more informative or more reliable than the correlations on which they are based.

            4. Stressing P-values rather than correlations seems often to be a way of talking up disappointing results. "Indeed, my correlations appear to be mostly very weak, but many of them are significant!" In fields where it is still important to keep track of weak relationships that may have slight predictive value in conjunction with others, that can make sense. In others the researcher is setting up a smokescreen.

            More generally: In many fields, there seems to be ritual display of lengthy correlation and other tables, say in Appendices or supplementary material. Quite why?

            a. An explanation that you should do this because you are expected to do this or because it is standard practice seems circular to me. I am still asking: quite why?

            b. A better argument is that someone might want to look in detail at your results and might complain if they are absent, whereas if people don't want to do this they need not, and won't usually complain about tables they don't need.

            Point b still raises an empirical question of how many people ever look at these tables carefully. The starring seems an admission that there is usually far too much information to process, in which case wouldn't graphical display be as or more effective (not to say more attractive)?
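
            As one possible illustration of the graphical alternative, a scatter plot matrix shows every pairwise relationship at once. This is only a sketch using the auto variables from #2; the half option shows just the lower triangle.

            ```stata
            * Sketch: scatter plot matrix as a graphical counterpart to a correlation table.
            sysuse auto, clear
            graph matrix price headroom mpg displacement, half
            ```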


            • #7
              Hi Nick,

              Sorry, didn't realize there was a new reply.
              I can share my view as a financial economist. Basically, the reason for stars is to draw the reader's attention. It's not an admission that there is too much information, because certain types of information are expected to appear in the table based on theory (for example, a theoretical model might be complex, and the reader might want to confirm that certain coefficients are NOT statistically significant).
              The reason for multiple levels of starring is so the reader can decide how much they trust the association. Of course, p-values can be provided instead of t-stats, etc., but starring would still be helpful for the attention reason I gave above.
              As far as "Correlations themselves are already measures of strength of association", strength yes, reliability no. One person might have a data set with 3 observations on each variable and a correlation that is very high (or highly negative), whereas another might have a data set with 3000 observations and a lower correlation. The former would be completely unreliable (and hence, low test statistic), while the latter would be reliable (with high test statistic).
              Scatter plots and other charts are often used in my field, but mainly for intuition. Ultimately, readers usually want numbers and significance levels before deciding whether they believe there is a genuine relationship among variables.
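
              The sample-size point above can be checked with a quick calculation. This is a sketch under the usual normal-theory test for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom; the values r = 0.90 and r = 0.10 are made up for illustration.

              ```stata
              * Hypothetical case 1: high correlation, n = 3
              local t1 = 0.90 * sqrt((3 - 2) / (1 - 0.90^2))
              display "n = 3:    t = " %6.3f `t1' "   two-sided p = " %6.4f 2*ttail(3 - 2, abs(`t1'))

              * Hypothetical case 2: low correlation, n = 3000
              local t2 = 0.10 * sqrt((3000 - 2) / (1 - 0.10^2))
              display "n = 3000: t = " %6.3f `t2' "   two-sided p = " %10.2e 2*ttail(3000 - 2, abs(`t2'))
              ```

              With these assumed values, the first P-value is roughly 0.29 despite the much higher correlation, while the second is far below 0.01.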


              • #8
                Thanks for your spirited reply.

                I agree with you: people looking at correlations should pay attention to sample size. I wouldn't pay any serious attention to results based on sample sizes of 3, but your point is taken.

                Your readers must differ a lot from my readers.


                • #9
                  To add to #8, written late where I was.

                  In the tables I see, the sample size is generally (almost) the same across a table of correlations. Thus some researchers want to show, and some readers demand to see, an indication of critical values for such a sample size, such as critical values for being significant at three significance levels. As said, that plays fast and loose with what P-values are, but there you go. If missing values are only occasional, that indication of three or so critical levels can be given just once as a header or footer in a table. It doesn't necessarily imply peppering a table with multiple stars.
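
                  As a rough sketch of that idea, the critical |r| for a given sample size follows from inverting the usual t test: r_crit = t / sqrt(t^2 + n - 2), where t is the two-sided critical value with n - 2 degrees of freedom. The value n = 74 below is the auto data's sample size, used here only as an example.

                  ```stata
                  * Sketch: critical |r| at three significance levels for a common sample size.
                  * n = 74 is an assumed example value (the size of the auto data).
                  local n = 74
                  foreach a in 0.10 0.05 0.01 {
                      local t = invttail(`n' - 2, `a'/2)
                      display "alpha = `a'   critical |r| = " %5.3f `t'/sqrt(`t'^2 + `n' - 2)
                  }
                  ```

                  The three values could then go once into a table header or footer.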

                  There are more problems if a table of correlations mixes results for quite different sample sizes.

                  Many, although not all, of my reservations are about the psychology implied. The implication seems to be that readers don't want to look at a table of bare correlations, because it is (perhaps a little boring and) too complicated to absorb. So, the way to make a complicated table more interesting or easier to absorb is to add more complication? Sometimes that can be right -- after all, sometimes a graph needs annotation too -- but I have to wonder.
