  • Correlation between two vectors

    Dear Statalister,

    I am currently working with data that include patent counts, patent classes, and years. Here is a -dataex- example of my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year long number_of_patents int class
    
    2000 5   1
    2000 6   2
    2000 7   3
    2000 3   4
    2001 9   1
    2001 8   2
    2001 4   3
    2001 12  4
    2001 5   5
    
    end

    This means that in 2000 there were 5 patents in class 1, 6 patents in class 2, 7 patents in class 3 and 3 patents in class 4.

    What I intend is to find the Pearson correlation of the patent distribution between two consecutive years, namely between vector a (5 6 7 3) and vector b (9 8 4 12 5). The more similar the patent distributions of two consecutive years are, the less environmental turbulence is present.

    Thanks in advance for any help.

    Best regards, Farid
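
    One way to line the two years up by class before correlating is sketched below. It assumes the variable is named number_of_patents (no embedded space) and that a class absent in a given year, such as class 5 in 2000, is left missing, so -correlate- drops it and only classes 1 to 4 enter the calculation. Whether an absent class should instead count as zero patents is a substantive choice that the example data cannot settle.

    Code:
    clear
    input int year long number_of_patents int class
    2000 5   1
    2000 6   2
    2000 7   3
    2000 3   4
    2001 9   1
    2001 8   2
    2001 4   3
    2001 12  4
    2001 5   5
    end

    * one row per class, one column of counts per year
    reshape wide number_of_patents, i(class) j(year)

    * Pearson correlation across classes; rows with a missing year are dropped
    correlate number_of_patents2000 number_of_patents2001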


  • #2
    I assume you mean association when you talk about correlation, because a correlation does not make sense for a categorical variable such as patent class. In that case it is just a matter of a simple chi-squared test on a cross-tabulation: tab class year [fw=number_of_patents], col chi . The null hypothesis is that class and year are independent, that is, any patterns that seem to exist are only due to differences in the marginal distributions.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
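
    Run against the example data from the first post, the cross-tabulation suggested above might look like the sketch below. It assumes the variable name is written without a space (number_of_patents) and spells the options out in full; the frequency weight makes each row count as that many patents.

    Code:
    clear
    input int year long number_of_patents int class
    2000 5   1
    2000 6   2
    2000 7   3
    2000 3   4
    2001 9   1
    2001 8   2
    2001 4   3
    2001 12  4
    2001 5   5
    end

    * class by year table with column percentages and Pearson chi-squared
    tabulate class year [fweight=number_of_patents], column chi2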


    • #3
      Maarten, let me try to restructure my data to make this easier:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int year long number_of_patents int class
      2001 5 1
      2002 6 1
      2003 7 1
      2004 3 1
      2001 9 2
      2002 8 2
      2003 4 2
      2004 12 2
      2005 5 2
      end

      Here there are two patent classes (1 and 2). The second column gives the number of patents in a given year within a class. I therefore need the autocorrelation across years separately for class 1 and class 2, even though both series sit in the same column. Put differently, the code should calculate the autocorrelation over rows 1 to 4, then start again for rows 5 to 9.

      Is this possible?

      Best regards, Farid
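
      A possible sketch of a lag-1 autocorrelation computed separately for each class, assuming the restructured data above with the variable named number_of_patents. The name lag_patents is made up for illustration. Declaring the data as a panel lets the lag operator L. stop at class boundaries, so the first year of class 2 is not paired with the last year of class 1. The result is the plain Pearson correlation between each series and its own lag, which is not numerically identical to the autocorrelation reported by -corrgram-, and with only three or four usable pairs per class it will be very noisy.

      Code:
      clear
      input int year long number_of_patents int class
      2001 5 1
      2002 6 1
      2003 7 1
      2004 3 1
      2001 9 2
      2002 8 2
      2003 4 2
      2004 12 2
      2005 5 2
      end

      * declare class as the panel and year as the time variable
      xtset class year

      * previous year's count within the same class (missing in each class's first year)
      generate lag_patents = L.number_of_patents

      * correlation of each series with its own lag, one result per class
      bysort class: correlate number_of_patents lag_patents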
