  • correlation with p-value over groups

    Dear All, suppose that I have the following data:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(stkcd A B year)
    2 -.01321  .0649455 2001
    2 -.17586  .1972809 2002
    2 -.15283  .3979636 2003
    2 .103938  -.215786 2004
    2 .077891   .100931 2005
    2 -.05099 -.0849801 2006
    6 .080194  6.26e+08 2001
    6 .035823 -1.42e+08 2002
    6 -.06987 -5.34e+08 2003
    6 .041957 -4.57e+08 2004
    6 -.01533 -8.77e+07 2005
    6 .011217 -4.21e+08 2006
    end
    How can I obtain the correlation coefficients between A and B, along with their p-values, separately for each stkcd? Thanks.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    Hello River,

    I wonder whether this is what you want:

    Code:
    by stkcd, sort : pwcorr A B, sig
    Best regards,

    Marcos


    • #3
      Hi, Marcos: Thanks but not exactly. I'd like to generate two new variables, one for the correlation coefficients and the other for the corresponding p-values.

      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)


      • #4
        I tried the following code (rangestat is from SSC; install it with ssc install rangestat):
        Code:
        rangestat (corr) A B, interval(year . .) by(stkcd)
        but only obtained correlations, not their p-values.
        Ho-Chuan (River) Huang
        Stata 19.0, MP(4)


        • #5
          Try

          Code:
          program corr_p
              correlate A B
              generate rho = r(rho)
              generate p = 2*ttail(r(N)-2, abs(r(rho))*sqrt(r(N)-2)/sqrt(1-r(rho)^2))
          end
          
          rangerun corr_p , interval(year . .) by(stkcd)
          I hope I got the rangerun (SSC) call correct.

          Best
          Daniel
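For readers wondering where the expression in the generate p line comes from, it appears to be the usual t-test of a zero correlation. A sketch of the formula, with r standing for r(rho) and n for r(N) (the symbols t and T are introduced only for this note):

```latex
t = \frac{r\,\sqrt{n-2}}{\sqrt{1-r^{2}}},
\qquad
p = 2\,\Pr\!\left(T_{n-2} > |t|\right)
```

Here T_{n-2} follows Student's t distribution with n-2 degrees of freedom, and ttail(n-2, |t|) returns the upper-tail probability, so doubling it gives the two-sided p-value.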


          • #6
            Daniel gives excellent advice as always. It follows that this is another way to do it:


            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(stkcd A B year)
            2 -.01321  .0649455 2001
            2 -.17586  .1972809 2002
            2 -.15283  .3979636 2003
            2 .103938  -.215786 2004
            2 .077891   .100931 2005
            2 -.05099 -.0849801 2006
            6 .080194  6.26e+08 2001
            6 .035823 -1.42e+08 2002
            6 -.06987 -5.34e+08 2003
            6 .041957 -4.57e+08 2004
            6 -.01533 -8.77e+07 2005
            6 .011217 -4.21e+08 2006
            end
            
            rangestat (corr) A B, interval(year . .) by(stkcd) 
            
            generate p = 2*ttail(corr_nobs-2, abs(corr_x)*sqrt(corr_nobs-2)/sqrt(1-corr_x^2))
            
            list, sepby(stkcd) 
            
                 +---------------------------------------------------------------------+
                 | stkcd         A           B   year   corr_n~s     corr_x          p |
                 |---------------------------------------------------------------------|
              1. |     2   -.01321    .0649455   2001          6   -.730778   .0989641 |
              2. |     2   -.17586    .1972809   2002          6   -.730778   .0989641 |
              3. |     2   -.15283    .3979636   2003          6   -.730778   .0989641 |
              4. |     2   .103938    -.215786   2004          6   -.730778   .0989641 |
              5. |     2   .077891     .100931   2005          6   -.730778   .0989641 |
              6. |     2   -.05099   -.0849801   2006          6   -.730778   .0989641 |
                 |---------------------------------------------------------------------|
              7. |     6   .080194    6.26e+08   2001          6   .6641471   .1502541 |
              8. |     6   .035823   -1.42e+08   2002          6   .6641471   .1502541 |
              9. |     6   -.06987   -5.34e+08   2003          6   .6641471   .1502541 |
             10. |     6   .041957   -4.57e+08   2004          6   .6641471   .1502541 |
             11. |     6   -.01533   -8.77e+07   2005          6   .6641471   .1502541 |
             12. |     6   .011217   -4.21e+08   2006          6   .6641471   .1502541 |
                 +---------------------------------------------------------------------+
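As a quick sanity check on the listing above, the p-value for stkcd 2 can be recomputed by hand from the rounded correlation and the number of observations. This is only a sketch; the scalar names rho_2 and n_2 are made up for this check.

```stata
* Hand check of the first group's p-value, using the rounded values
* printed in the listing (rho_2 and n_2 are hypothetical scalar names).
scalar rho_2 = -.730778
scalar n_2   = 6
display 2*ttail(n_2 - 2, abs(rho_2)*sqrt(n_2 - 2)/sqrt(1 - rho_2^2))
```

The displayed value should be close to the .0989641 shown in the p column, up to rounding of the inputs.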


            • #7
              Hi all,

              I have a very similar question.

              In this example, how can I use rangestat to estimate correlation coefficients for three or more variables (e.g. A B C) instead of only two (A B)?

              Thank you very much for your help.

              Vinh


              • #8
                Vinh Ng: I don't know any short-cuts for that. You could run rangestat repeatedly or write your own program for rangerun.
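A minimal sketch of the "run rangestat repeatedly" route, assuming a third variable C sits alongside A and B in memory. The names n_AB, rho_AB, p_AB and so on are made up here; corr_nobs and corr_x are the names rangestat produced in #6.

```stata
* Sketch only: assumes rangestat (SSC) is installed and that variables
* stkcd, year, A, B and a third variable C are in memory.
foreach pair in "A B" "A C" "B C" {
    local stub = subinstr("`pair'", " ", "", .)   // "A B" -> "AB"
    rangestat (corr) `pair', interval(year . .) by(stkcd)
    * rangestat names its results corr_nobs and corr_x; rename them so
    * the next pass through the loop does not collide with them
    rename (corr_nobs corr_x) (n_`stub' rho_`stub')
    generate p_`stub' = 2*ttail(n_`stub' - 2, ///
        abs(rho_`stub')*sqrt(n_`stub' - 2)/sqrt(1 - rho_`stub'^2))
}
```

Each pass adds three variables per pair, so the number of passes grows quickly with the number of variables; beyond a handful, a program for rangerun may be tidier.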


                • #9
                  Hi Nick,

                  Thanks a lot for your advice. Sorry for lengthening your thread, River. I have tried using rangerun. Suppose I have the following dataset:

                  Code:
                  input float(stkcd A B C year)
                  2 -.01321  .0649455 0.003 2001
                  2 -.17586  .1972809 0.005 2001
                  2 -.15283  .3979636 0.001 2001
                  2 .103938  -.215786 0.007 2001
                  2 .077891   .100931 0.111 2002
                  2 -.05099 -.0849801 0.235 2002
                  6 .080194  6.26e+08 0.051 2002
                  6 .035823 -1.42e+08 0.068 2002
                  6 -.06987 -5.34e+08 0.868 2003
                  6 .041957 -4.57e+08 0.190 2003
                  6 -.01533 -8.77e+07 0.189 2003
                  6 .011217 -4.21e+08 0.345 2003
                  end
                  Here is the program I have written:

                  Code:
                  program trials
                      quietly spearman A B C
                      gen spearmancorr=r(rho)
                  end
                  
                  rangerun trials, interval(year . .) by(year)
                  However, it gives me the Spearman correlation coefficients between A and B only, without those between A and C, and B and C.

                  Would you mind advising me what else I would need to do?

                  Thanks a lot for your help,

                  Vinh


                  • #10
                    Clearly, spearman leaves in memory only the last correlation calculated, so you must run it as many times as you want correlations and save to a new variable each time.


                    • #11
                      Originally posted by Nick Cox
                      Clearly, spearman leaves in memory only the last correlation calculated, so you must run it as many times as you want correlations and save to a new variable each time.
                      It used to be that way. In Stata 15, spearman saves in r(Rho) the correlation matrix of all variables. Vinh can get the required elements from there, reducing running time.

                      Code:
                      help el()
                      is a convenient tool to extract entries from a matrix.

                      Best
                      Daniel


                      • #12
                        Daniel: Good point


                        • #13
                          Hi Daniel, Many thanks for your helpful suggestion.
                          Ho-Chuan (River) Huang
                          Stata 19.0, MP(4)


                          • #14
                            Hi Nick, Many thanks for your helpful suggestion.

                            Ho-Chuan (River) Huang
                            Stata 19.0, MP(4)


                            • #15
                              Originally posted by daniel klein

                              It used to be that way. In Stata 15, spearman saves in r(Rho) the correlation matrix of all variables. Vinh can get the required elements from there, reducing running time.

                              Code:
                              help el()
                              is a convenient tool to extract entries from a matrix.

                              Best
                              Daniel
                              Hi Daniel,

                              Thanks a lot for your suggestion. I'm still struggling with the el() function. I have tried the following:

                              Code:
                              program trialc
                                  quietly spearman A B C
                                  gen spearmancorr=el(r(rho),3,3)
                              end
                              
                              rangerun trialc, interval(year . .) by(year)
                              Unfortunately, it gives me nothing.

                              Could you please help me with this?

                              Thank you very much,

                              Vinh
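A possible reason the attempt above returns nothing: names in r() are case sensitive, so r(rho) is the scalar for the first pair of variables, while the matrix Daniel mentions is r(Rho). In addition, entry (3,3) would be the correlation of C with itself. A minimal sketch of a corrected program, assuming Stata 15 or later and the example data from #9 in memory; the matrix name R and the variable names rho_AB, rho_AC and rho_BC are made up for this example.

```stata
* Sketch only: assumes rangerun (SSC) is installed, Stata 15 or later,
* and Vinh's example data from #9 in memory.
capture program drop trialc
program trialc
    quietly spearman A B C
    matrix R = r(Rho)               // full matrix; r(rho) is only a scalar
    * rows and columns follow the order of the varlist: A, B, C
    generate rho_AB = el(R, 1, 2)
    generate rho_AC = el(R, 1, 3)
    generate rho_BC = el(R, 2, 3)
end

rangerun trialc, interval(year . .) by(year)
```

Copying r(Rho) into a named matrix first keeps the el() calls simple, and the same pattern should work for p-values if the matrix of p-values that spearman stores is copied as well.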
