  • How can I bootstrap pwcorr or pcorr with Stata?

    I'm trying to use Pearson's correlation with two non-normal variables. I know that I could use a non-parametric measure such as Spearman's correlation, but that would make the results more difficult to interpret. Can someone help me with a simple way to apply bootstrapping to the pwcorr and pcorr functions in Stata?

  • #2
    Please read the FAQ Advice, all the way to #18. http://www.statalist.org/forums/help

    If you have just two variables, then the pwcorr and pcorr commands (not functions) are just irrelevant distractions.

    It's not clear what your problem is otherwise, as you don't show what code you tried. Here is one minimal example.

    However, it's often much more productive to consider whether a transformation is needed or correlation makes sense at all. Plotting the data tells you more.

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . gen gpm = 1000/mpg
    
    . su gpm weight
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             gpm |         74     50.1928    12.79856   24.39024   83.33334
          weight |         74    3019.459    777.1936       1760       4840
    
    . bootstrap r(rho), nodots nowarn reps(10000) seed(2803): corr gpm weight
    
    Bootstrap results                               Number of obs     =         74
                                                    Replications      =     10,000
    
          command:  correlate gpm weight
            _bs_1:  r(rho)
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _bs_1 |   .8544289   .0392602    21.76   0.000     .7774803    .9313775
    ------------------------------------------------------------------------------
    
    . estat bootstrap, all
    
    Bootstrap results                               Number of obs     =         74
                                                    Replications      =      10000
    
          command:  correlate gpm weight
            _bs_1:  r(rho)
    
    ------------------------------------------------------------------------------
                 |    Observed               Bootstrap
                 |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _bs_1 |   .85442889  -.0017009   .03926019    .7774803   .9313775   (N)
                 |                                        .766372   .9175806   (P)
                 |                                        .759826   .9142154  (BC)
    ------------------------------------------------------------------------------
    (N)    normal confidence interval
    (P)    percentile confidence interval
    (BC)   bias-corrected confidence interval
    Last edited by Nick Cox; 15 Dec 2015, 12:12.
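
    The advice in #2 to plot the data and consider a transformation before correlating can be sketched as follows. This is only an illustration using the same auto data and the same gpm variable as the example; the graph names raw and transformed are arbitrary labels chosen here, not part of the original post.

    ```stata
    sysuse auto, clear
    gen gpm = 1000/mpg

    * Look at both scales before deciding which correlation makes sense
    scatter mpg weight, name(raw, replace)
    scatter gpm weight, name(transformed, replace)

    * Compare the correlation on the raw and the transformed scale
    corr mpg weight
    corr gpm weight
    ```

    If the transformed scatter looks closer to linear, the Pearson correlation on that scale is easier to defend, and the bootstrap shown in #2 applies unchanged.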



    • #3
      Cross-posted at http://stats.stackexchange.com/quest...orr-with-stata

      Our cross-posting policy is also spelled out in the FAQ Advice.



      • #4
        Thank you, Nick Cox, and sorry for all the mistakes in my previous post.
        Can I modify the code to work with the pcorr command?
        Best regards,
        Last edited by Filipe Rodrigues; 17 Dec 2015, 08:34.



        • #5
          If you only have 2 variables, as stated in post #1, -pcorr- will not run: you need at least 3 variables to use it.



          • #6
            Filipe:
            this thread seems to be on point: http://www.statalist.org/forums/foru...l-correlations
            Kind regards,
            Carlo
            (Stata 18.0 SE)



            • #7
              Thank you all, especially Carlo Lazzaro.
              Can someone help me apply this small program (http://www.statalist.org/forums/foru...l-correlations) to a specific example?
              Bw
              Filipe



              • #8
                So, what you have to do is take the program in #2 of the linked post and make two substitutions. Replace <varname> with the name of the first variable you want to use in the partial correlations. Replace <varlist> with the second variable you want to partially correlate with the first one, followed by all the others that you want to adjust for. Then run it exactly as it was shown there and you will have it.

                I will reiterate what I said in #5 above--in this thread so far you have only spoken of two variables, and you cannot do -pcorr- without at least a third variable.
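
                A minimal sketch of those steps, assuming the linked program runs pcorr, copies r(p_corr) into a matrix, and returns one element of it. The auto data, the variables mpg, weight and length, the program name mypcorr, the scalar name pc, and the seed are all stand-ins chosen for illustration, not the poster's data or the linked author's exact code.

                ```stata
                sysuse auto, clear

                capture program drop mypcorr
                program mypcorr, rclass
                    * <varname> is mpg; <varlist> is weight (of interest), then length (adjusted for)
                    pcorr mpg weight length
                    matrix R = r(p_corr)
                    * Rows of R are named after the variables in <varlist>
                    return scalar pc = R[rownumb(R, "weight"), 1]
                end

                * Bootstrap the partial correlation of mpg with weight, adjusting for length
                bootstrap pc = r(pc), reps(1000) seed(2803) nodots nowarn: mypcorr
                estat bootstrap, all
                ```

                The name passed to rownumb() has to be one of the variables listed after the first one in the pcorr call; the first variable itself does not appear as a row.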



                • #9
                  Thank you, I have more than 2 variables.

                  I ran it exactly as you said, but an error occurs every time, even after changing variables and checking that everything is right with them.

                  Code:
                  . program myprogram, rclass
                    1.     pcorr ykl40_2 groupcode age 
                    2.     mat R = r(p_corr)
                    3.     return scalar foo = R[rownumb(R,"foo"),1]
                    4. end
                  
                  . bootstrap foo=r(foo), reps(1000): myprogram
                  (running myprogram on estimation sample)
                  'r(foo)' evaluated to missing in full sample
                  r(322);



                  • #10
                    Your problem is with
                    Code:
                    return scalar foo = R[rownumb(R,"foo"),1]
                    You are asking Stata to find the number of the row of R whose name is "foo". But there is no such row, so Stata returns a missing value for scalar foo. The names of the rows in R will be groupcode and age. Assuming it is the partial correlation with variable groupcode that you are interested in, you have to change this to:

                    Code:
                    return scalar foo = R[rownumb(R, "groupcode"), 1]
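
                    One way to check which row names are available before writing the return line is to list the matrix that pcorr leaves behind. The sketch below uses the auto data with mpg, weight and length as stand-ins for the poster's variables, which are not available here.

                    ```stata
                    sysuse auto, clear
                    pcorr mpg weight length
                    matrix R = r(p_corr)

                    * Show the matrix with its row names (here the variables after mpg)
                    matrix list R

                    * rownumb() gives the row position for a name that exists ...
                    display rownumb(R, "weight")
                    * ... and missing for a name that does not
                    display rownumb(R, "foo")
                    ```

                    If the second display shows a missing value for the name you planned to use, the bootstrap will fail with the same "evaluated to missing in full sample" message as in #9.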



                    • #11
                      Thank you so much, it worked perfectly.
                      Happy new year!
